I have a data set of 50 students. I want to cluster them based on their sequential data (while doing a job, each student moved through a sequence of stages such as A, B, C; there are 7 stages in total). I am planning to apply k-means clustering to their first-order Markov chain transition probability matrices. That means I have 50 transition probability matrices, each 7x7. Each 7x7 matrix has 49 data points, so I can build a 50x49 matrix. If I apply k-means to this matrix, is that a proper approach for clustering sequential data?
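
To make the setup concrete, here is a minimal sketch of the pipeline described above, using only the Python standard library. It only illustrates the mechanics and does not say whether the approach is appropriate. Everything in it is an assumption for illustration: the stages are encoded as integers 0..6, the 50 sequences are randomly generated placeholders, `k=3` is arbitrary, and the helper names (`transition_matrix`, `flatten`, `kmeans`) are hypothetical.

```python
import random

N_STATES = 7  # the seven stages from the question, encoded here as 0..6 (an assumption)

def transition_matrix(seq, n_states=N_STATES):
    """Row-normalised first-order transition counts for one sequence."""
    counts = [[0.0] * n_states for _ in range(n_states)]
    for a, b in zip(seq, seq[1:]):
        counts[a][b] += 1.0
    for row in counts:
        total = sum(row)
        if total > 0:  # rows for states that are never left stay all zero
            for j in range(n_states):
                row[j] /= total
    return counts

def flatten(matrix):
    """Turn a 7x7 matrix into a 49-element feature vector."""
    return [p for row in matrix for p in row]

def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm with Euclidean distance; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid for every point.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Placeholder data: 50 students, each with a random sequence over the 7 stages.
rng = random.Random(42)
sequences = [[rng.randrange(N_STATES) for _ in range(30)] for _ in range(50)]
features = [flatten(transition_matrix(s)) for s in sequences]  # 50 rows, 49 columns
labels, centroids = kmeans(features, k=3)
```

Each entry of `labels` is the cluster index assigned to the corresponding student. Note that a state a student never leaves produces an all-zero row in that student's matrix, which is a choice made in this sketch rather than something stated in the question.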

  • Did you try straight K-Means using Euclidean distance on this data? Also, for clarity, could students revisit some states? Does the magnitude of states visited also make a difference to you? – user4446237 Jul 28 '19 at 00:34

0 Answers