The Data Mining Forum
This forum is about data mining, data science and big data: algorithms, source code, datasets, implementations, optimizations, etc. You are welcome to post calls for papers, data mining job ads, links to source code of data mining algorithms or anything else related to data mining. The forum is hosted by P. Fournier-Viger. No registration is required to use this forum!
Confused about FP Growth Results
Date: July 03, 2017 11:08AM
Forgive me if this is a silly question; I'm new to this type of analysis. I ran the SPMF algorithm called FPGrowth_association_rules_with_lift on a pretty large dataset with pretty low minimum thresholds. I thought support was supposed to come out as a decimal, or at least a pretty low percentage given the many items I was working with, but most of the results say #SUP: 32, and a lot look like #SUP: 32 #CONF: 1.0 #LIFT: Infinity. The smallest support is about 26. How do I interpret these numbers? I know these items weren't in 32% of baskets.
Re: Confused about FP Growth Results
Date: July 03, 2017 05:06PM
Actually, the support does not need to be a percentage. There are two ways to represent the support:
- the absolute support: the number of transactions that contain a pattern, such as 32 transactions
- the relative support: the fraction of transactions that contain a pattern, that is, the absolute support (such as 32 transactions) divided by the total number of transactions in the database, usually written as a percentage.
In SPMF, most algorithms will provide the support as an absolute support rather than a relative support. But these two ways of representing the support are equivalent. If you want to convert from absolute to relative support, you just have to divide the number by the total number of transactions. Or if you want to convert from relative to absolute support, you just have to multiply by the total number of transactions.
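To make the conversion concrete, here is a minimal sketch in Python. The function names and the total of 10,000 transactions are made up for illustration; use the real number of transactions (lines) in your own input file.

```python
def to_relative_support(absolute_support: int, total_transactions: int) -> float:
    """Convert a transaction count into a fraction of the database."""
    if total_transactions <= 0:
        raise ValueError("total_transactions must be positive")
    return absolute_support / total_transactions


def to_absolute_support(relative_support: float, total_transactions: int) -> int:
    """Convert a fraction of the database back into a transaction count."""
    return round(relative_support * total_transactions)


# Hypothetical database size, NOT taken from the original post.
total = 10_000

relative = to_relative_support(32, total)
print(f"{relative:.4%}")                      # 0.3200%
print(to_absolute_support(relative, total))   # 32
```

With that hypothetical database size, #SUP: 32 corresponds to 0.32% of the baskets, which is the kind of small percentage the original poster expected.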
So if you see #SUP: 32, it means that the rule appears in 32 baskets, not in 32% of the baskets.
Actually, this is also explained in the documentation ;-)
#CONF: 1.0 means that the confidence is 100% for that rule, i.e. every basket that contains the left side of the rule also contains the right side.
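If you want to post-process the output file, the #SUP, #CONF and #LIFT fields can be pulled out of each line with a few lines of Python. This is only a sketch: the sample line, the "==>" separator between the two sides of the rule, and the function name are assumptions based on the fragments quoted in this thread, so check them against your actual output file and the SPMF documentation.

```python
import re


def parse_rule_line(line: str) -> dict:
    """Split one output line into the rule text and its numeric measures."""
    # Everything before the first '#' is treated as the rule itself.
    rule_text, _, _ = line.partition("#")
    # Assumed separator between antecedent and consequent.
    antecedent, _, consequent = rule_text.partition("==>")
    # Fields look like "#NAME: value"; float() also accepts "Infinity".
    measures = {
        name: float(value)
        for name, value in re.findall(r"#(\w+):\s*(\S+)", line)
    }
    return {
        "antecedent": antecedent.split(),
        "consequent": consequent.split(),
        "measures": measures,
    }


# Hypothetical line assembled from the values quoted in the question.
sample = "1 ==> 2 #SUP: 32 #CONF: 1.0 #LIFT: Infinity"
rule = parse_rule_line(sample)

print(rule["measures"]["SUP"])            # absolute support (a count)
print(f"{rule['measures']['CONF']:.0%}")  # confidence as a percentage
print(rule["measures"]["LIFT"])           # parsed as a floating-point infinity
```

Keeping the support as a raw count here and converting it to a percentage only when needed mirrors how the output reports it.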