3

If I collect some posts from reddit.com, how can I predict whether a post will become trending/hot/popular in the future? I would like to use a hidden Markov model to predict it, but I don't know how to define the hidden states and the observation sequence. Can anyone give me any suggestions? Thanks!

I only have time series data, such as the comments.

OOO

2 Answers

2

An HMM doesn't really make sense (echoing what Dries said). If you want to use an HMM, you would have to justify it by asking "Can Reddit posts be represented by a Markov process?" I can't think of a way to make that sentence true and still take advantage of the features related to a popular post.
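To make that question concrete, here is what an HMM would assume, written as a sketch. The symbols $s_t$ (a hidden state of the post at time step $t$, for example "rising" or "fading") and $o_t$ (what is observed at that step, for example the number of new comments) are illustrative names, not something defined in the question:

$$
P(s_t \mid s_{t-1}, s_{t-2}, \dots, s_1) = P(s_t \mid s_{t-1}),
\qquad
P(o_t \mid s_1, \dots, s_t, o_1, \dots, o_{t-1}) = P(o_t \mid s_t)
$$

In words: the next state depends only on the current state, and each observation depends only on the state at that moment. Static properties of a post, such as its subreddit or its title, do not enter these two distributions at all, which is why it is hard to use them in this setup.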

Consider the possible feature set: the time it was posted, the user posting it, the type of post (link / image / text), the subreddit, the number of subscribers to that subreddit, a score of positivity / negativity, the number of words in the title, etc. Don't count out these features.
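As a rough sketch of how a few of those features could be turned into numbers, assuming each post is available as a plain dict: the field names `created_utc`, `title`, `post_type` and `subreddit_subscribers` are hypothetical and would have to be mapped to whatever your data actually contains. The example post is made up.

```python
from datetime import datetime, timezone

# Post types taken from the list above (link / image / text).
POST_TYPES = ("link", "image", "text")


def extract_features(post):
    """Turn one post (hypothetical dict layout) into a flat dict of numeric features."""
    posted = datetime.fromtimestamp(post["created_utc"], tz=timezone.utc)
    features = {
        "hour_of_day": posted.hour,            # time it was posted
        "day_of_week": posted.weekday(),       # Monday is 0
        "title_word_count": len(post["title"].split()),
        "subreddit_subscribers": post["subreddit_subscribers"],
    }
    # One-hot encode the post type so a model can use it numerically.
    for kind in POST_TYPES:
        features["type_" + kind] = 1 if post["post_type"] == kind else 0
    return features


# Made-up example post, for illustration only.
example_post = {
    "created_utc": 1456150680,
    "title": "Predicting popular posts with a simple model",
    "post_type": "text",
    "subreddit_subscribers": 50000,
}
example_features = extract_features(example_post)
```

Features such as the posting user or a positivity / negativity score need extra data or a sentiment tool, so they are left out of this sketch.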

1

I don't think it makes a lot of sense to use HMMs for this problem. What I would suggest is some kind of text-based classifier. If you want to use a cool technique, you could use a neural network to learn based on the text of successful posts.

On the other hand, if you want to use an easy technique, you could build a predictor for the popularity, such as a regression model (try to predict upvotes).
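A minimal sketch of such a regression, assuming a single input feature. Using the number of comments in the first hour as that feature is my own assumption for illustration, and the numbers below are toy data, not real Reddit measurements. It fits an ordinary least-squares line using only the standard library.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Predicted upvotes for a feature value x under the fitted line."""
    slope, intercept = model
    return slope * x + intercept


# Toy data (made up): comments in the first hour vs. final upvotes.
early_comments = [1, 2, 3, 4, 5]
final_upvotes = [12, 22, 32, 42, 52]

model = fit_line(early_comments, final_upvotes)
predicted = predict(model, 6)  # estimate for a post with 6 early comments
```

With real data you would fit on one set of posts and check the prediction error on posts the model has not seen. With more than one feature, a library implementation of linear regression is the more practical choice.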

  • Hi, can I use a regression model to predict the upvotes using only the comments (for example, the interarrival times in the time series data)? I'm not clear on the steps to build the model. Would you mind giving me some suggestions, please? Thanks a lot. – OOO Feb 22 '16 at 14:18