Is it possible to get recommendations for similar products using Mahout?

For example:

I have a data set of movies with the following attributes:

Movie_name, Actor_1, Actor_2, Actress_1, Actress_2, Director, Theme, Language

Now, given a Movie_name, the system should recommend the top 3 similar movies based on these attributes.

Can this be done using Mahout? If yes, how?

Sreejithc321

1 Answer

Generally, this is done with the spark-rowsimilarity algorithm, which is a form of content-based recommendation. The actual process is quite simple. Here are the steps:

  1. For each movie, convert your categorical variables into columns. For example, say Actor_1 takes the values Brad Pitt, Daniel Craig, and Vin Diesel across different movies. These become three columns, with a 1 marking the movies that feature each actor. Your movie matrix will look something like:

    Movie Name   , Has_Brad_Pitt, Has_Daniel_Craig, Has_Vin_Diesel, ...
    MI-6         ,      1       ,        0        ,       0       , ...
    Fast&Furious ,      0       ,        0        ,       1       , ...
    Casino Royale,      0       ,        1        ,       0       , ...
    
  2. Now, to find the similarity score of two movies, compute the dot product of their row vectors. With 0/1 columns this counts the attributes the two movies share, so the higher the value, the more similar they are.

The spark-rowsimilarity algorithm can do this in one shot, though you may have to do some work to encode the categorical variables first.

Ash