The Data Mining Forum
This forum is about data mining, data science and big data: algorithms, source code, datasets, implementations, optimizations, etc. You are welcome to post calls for papers, data mining job ads, links to source code of data mining algorithms or anything else related to data mining. The forum is hosted by P. Fournier-Viger. No registration is required to use this forum!
rules count of TopSeqRules algorithm is not K, why?
Date: April 07, 2020 05:52AM
I ran the TopSeqRules algorithm on my sequence database through the graphical interface, with the parameters k = 50 and minconf = 0.6. But the algorithm returned 4898 sequential rules, not 50. Why? I am confused.
Re: rules count of TopSeqRules algorithm is not K, why?
Date: April 08, 2020 05:11PM
Sorry to answer late. The last two days have been very busy, and I did not check the forum.
The number of rules can be more than k because sometimes many rules have exactly the same support.
For example, if you set k = 50 but there are 1000 rules that have exactly the same support, the algorithm should give you all of these rules because they are equal in terms of support.
It is also possible that the algorithm returns fewer than k rules if the database does not contain k rules. For example, if you set k = 1000 but the database does not contain 1000 rules, then the algorithm cannot return 1000 rules.
These are special cases, but they can happen.
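To illustrate both cases, here is a minimal Python sketch of top-k selection that keeps ties. It is not the actual TopSeqRules code; the function name `top_k_with_ties` and the toy rules with their support values are made up for this example, and the confidence filter (minconf) is left out for simplicity.

```python
def top_k_with_ties(rules, k):
    """Return the k highest-support rules, plus any rules tied with the k-th one.

    `rules` is a list of (rule, support) pairs (hypothetical representation).
    """
    if k <= 0:
        return []
    # Rank rules from highest to lowest support.
    ranked = sorted(rules, key=lambda r: r[1], reverse=True)
    if len(ranked) <= k:
        # Fewer than k rules exist, so everything is returned.
        return ranked
    # Support of the k-th rule: every rule with at least this support is kept.
    cutoff = ranked[k - 1][1]
    return [r for r in ranked if r[1] >= cutoff]


# Toy data (invented): three rules share the same support of 4.
rules = [("a => b", 5), ("a => c", 4), ("b => c", 4), ("c => d", 4), ("d => e", 2)]

print(len(top_k_with_ties(rules, 2)))   # 4 rules, because of the tie at support 4
print(len(top_k_with_ties(rules, 10)))  # 5 rules, because only 5 rules exist
```

With k = 2, the second-best support is 4, and three rules share it, so four rules come back instead of two. With k = 10, only five rules exist, so five come back.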