If I have a matrix of co-occurring words in conversations of different lengths, is it appropriate to standardize / normalize the data prior to training?
My matrix is set up as follows: one row per two-person conversation, and columns are the words that co-occur between the speakers. I can't help but think that, since a longer conversation will likely contain more shared words than a shorter one, I should factor this in somehow. Below is a sketch of the kind of normalization I have in mind.
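To make this concrete, here is a minimal sketch of two options I'm considering (the matrix and token counts are hypothetical toy values, assuming a NumPy count matrix): row-wise L1 normalization, which turns raw counts into within-conversation proportions, or dividing by conversation length in tokens, which turns counts into per-token rates.

```python
import numpy as np

# Hypothetical co-occurrence counts: one row per conversation,
# one column per shared word (toy numbers for illustration).
X = np.array([
    [4, 0, 2, 1],    # short conversation
    [20, 5, 9, 6],   # long conversation: larger raw counts overall
], dtype=float)

# Option 1: row-wise L1 normalization -- each count becomes the
# proportion of that conversation's total shared-word count.
row_sums = X.sum(axis=1, keepdims=True)
X_prop = X / row_sums

# Option 2: divide by conversation length in tokens (if known),
# so counts become rates per token rather than proportions.
lengths = np.array([[150], [900]], dtype=float)  # hypothetical token counts
X_rate = X / lengths

print(X_prop)
print(X_rate)
```

Is something along these lines appropriate, or is there a standard approach (e.g., tf-idf-style weighting) for this situation?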