13

Could anyone please recommend a good frequent itemset package in Python? I only need to find frequent itemsets; I don't need to derive association rules.

Pluviophile
Edamame
  • In my personal experience, R's apriori and FP-growth implementations are much better than their Python alternatives. So, if you're open to considering R, you should try them :) – Dawny33 Mar 09 '17 at 06:09

3 Answers

11

I also recommend the MLxtend library for frequent itemsets.

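Usage example:

import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori

# Each inner list is one transaction (a basket of items).
dataset = [['Milk', 'Onion', 'Nutmeg', 'Kidney Beans', 'Eggs', 'Yogurt'],
           ['Dill', 'Onion', 'Nutmeg', 'Kidney Beans', 'Eggs', 'Yogurt'],
           ['Milk', 'Apple', 'Kidney Beans', 'Eggs'],
           ['Milk', 'Unicorn', 'Corn', 'Kidney Beans', 'Yogurt'],
           ['Corn', 'Onion', 'Onion', 'Kidney Beans', 'Ice cream', 'Eggs']]

# One-hot encode the transactions into a boolean item matrix.
te = TransactionEncoder()
te_ary = te.fit(dataset).transform(dataset)
df = pd.DataFrame(te_ary, columns=te.columns_)

# Keep every itemset that appears in at least 10% of the transactions.
frequent_itemsets = apriori(df, min_support=0.1, use_colnames=True)
print(frequent_itemsets)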
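
To make explicit what min_support controls, here is a minimal pure-Python sketch of level-wise (Apriori-style) mining. It is not MLxtend's implementation; the function name frequent_itemsets and its dictionary return shape are assumptions made for illustration only.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {frozenset(items): support} for itemsets with support >= min_support.

    Support is the fraction of transactions that contain every item of the set.
    Illustrative sketch only; the name and return shape are hypothetical.
    """
    # Convert to sets so repeated items within one transaction count once.
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)
    result = {}

    # Level 1 candidates: every distinct single item.
    candidates = {frozenset([item]) for basket in baskets for item in basket}
    k = 1
    while candidates:
        # Keep the candidates that occur in enough baskets.
        level = {}
        for cand in candidates:
            support = sum(cand <= basket for basket in baskets) / n
            if support >= min_support:
                level[cand] = support
        result.update(level)

        # Join frequent k-itemsets into (k+1)-candidates, then prune any
        # candidate with an infrequent k-subset (the Apriori property).
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return result


if __name__ == "__main__":
    demo = [["a", "b"], ["a", "c"], ["a", "b", "c"], ["b"]]
    for items, support in sorted(frequent_itemsets(demo, 0.5).items(),
                                 key=lambda kv: (len(kv[0]), sorted(kv[0]))):
        print(sorted(items), support)
```

This sketch rescans every transaction for each candidate, so it is only meant to show the idea on small inputs; for real data a library implementation is the better choice.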
MoAdel
  • This package runs into memory errors when there are too many distinct items. Not recommended for big data. – Snow Jul 07 '20 at 12:40
4

The Orange3-Associate package provides a frequent_itemsets() function based on the FP-growth algorithm.

K3---rnc
3

The MLxtend library has been really useful for me. Its documentation includes an Apriori implementation that outputs the frequent itemsets.

Please check the first example at http://rasbt.github.io/mlxtend/user_guide/frequent_patterns/apriori/.

tbnsilveira