I'm facing an issue where I have a massive amount of data that I need to cluster. As we know, clustering algorithms can have very high asymptotic (big-O) complexity, and I'm looking for ways to reduce my algorithm's running time.
I want to try a few different approaches, such as pre-clustering (canopy clustering), subspace clustering, correlation clustering, etc.
However, there's one thing I haven't heard much about, and I wonder why: is it viable to simply take a representative sample of my dataset, run the clustering on that sample, and then generalize the resulting model to the whole dataset? Why is, or isn't, this a viable approach?
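For concreteness, here is roughly what I have in mind (just a minimal sketch, assuming scikit-learn's KMeans as the clusterer and a plain uniform random sample; the dataset, sample size, and number of clusters are placeholders):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(1_000_000, 10))   # stand-in for my full (much larger) dataset

# Draw a "representative" sample -- here just a uniform random subset
sample_idx = rng.choice(len(X), size=10_000, replace=False)
X_sample = X[sample_idx]

# Cluster only the sample
model = KMeans(n_clusters=8, random_state=0).fit(X_sample)

# Generalize to the full dataset: assign every point to its nearest centroid
labels_full = model.predict(X)
```

Thank you!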