The Data Mining Forum
This forum is about data mining, data science and big data: algorithms, source code, datasets, implementations, optimizations, etc. You are welcome to post calls for papers, data mining job ads, links to source code of data mining algorithms or anything else related to data mining. The forum is hosted by P. Fournier-Viger. No registration is required to use this forum!
Re: Why we use multiple threshold instead of single threshold
Date: July 31, 2019 02:22AM
This is because the items in a given dataset differ in nature: some items appear frequently, whereas others appear rarely. Choosing a single suitable support threshold is therefore hard. With a high support threshold, the interesting rare itemsets are missed, while a small support threshold generates a huge number of itemsets, which makes them expensive to analyze.
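To make the trade-off concrete, here is a minimal sketch in Python. The transactions, the item names and the helper functions `support` and `frequent_itemsets` are all made up for illustration, and the brute-force enumeration is only meant for a tiny example, not as a real mining algorithm.

```python
from itertools import combinations

# Hypothetical toy data: bread, milk and butter are common,
# caviar and champagne are rare but always bought together.
transactions = [
    {"bread", "milk"},
    {"bread", "milk", "butter"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
    {"bread", "milk"},
    {"bread", "milk", "butter"},
    {"milk", "butter"},
    {"bread", "caviar", "champagne"},
    {"caviar", "champagne"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def frequent_itemsets(transactions, minsup):
    """Brute force: test every possible itemset against one threshold."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            s = support(set(combo), transactions)
            if s >= minsup:
                result[frozenset(combo)] = s
    return result


high = frequent_itemsets(transactions, 0.5)  # single high threshold
low = frequent_itemsets(transactions, 0.1)   # single low threshold

print(len(high), "itemsets with minsup = 0.5")
print(len(low), "itemsets with minsup = 0.1")
print("caviar found with high threshold:",
      any("caviar" in itemset for itemset in high))
```

On this toy data the high threshold keeps 5 itemsets and loses everything about caviar and champagne, while the low threshold keeps 13 itemsets, including weak combinations such as {bread, caviar} that occur only once.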
To overcome this issue, multiple support thresholds are assigned to reflect the nature of each item in the dataset. Itemsets with small support can then be generated with a small support threshold, and itemsets with high support can be generated with a high support threshold. In this way, we find both frequent and interesting rare itemsets without generating a huge number of itemsets.
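As a rough illustration of the idea, the sketch below gives each item its own minimum support. It assumes one common convention, which is not spelled out in this post: an itemset must reach the lowest minimum support among its items (as far as I know, this is the rule used by algorithms such as MSApriori). The data, the `mis` values and the function names are hypothetical, and the brute-force search is only for a tiny example.

```python
from itertools import combinations

# Hypothetical toy data: bread, milk and butter are common,
# caviar and champagne are rare but always bought together.
transactions = [
    {"bread", "milk"},
    {"bread", "milk", "butter"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
    {"bread", "milk"},
    {"bread", "milk", "butter"},
    {"milk", "butter"},
    {"bread", "caviar", "champagne"},
    {"caviar", "champagne"},
]

# Assumed per-item minimum supports: high for common items, low for rare ones.
mis = {
    "bread": 0.5,
    "milk": 0.5,
    "butter": 0.5,
    "caviar": 0.2,
    "champagne": 0.2,
}


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def frequent_itemsets_multi(transactions, mis):
    """Brute force: an itemset is kept if its support reaches the
    lowest minimum support among its items (assumed convention)."""
    items = sorted(mis)
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            threshold = min(mis[item] for item in combo)
            s = support(set(combo), transactions)
            if s >= threshold:
                result[frozenset(combo)] = s
    return result


frequent = frequent_itemsets_multi(transactions, mis)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

With these thresholds the toy data yields 8 itemsets. The rare pair {caviar, champagne} is kept, and weak combinations such as {bread, caviar} are not generated, unlike with a single low threshold.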