The Data Mining Forum
This forum is about data mining, data science and big data: algorithms, source code, datasets, implementations, optimizations, etc. You are welcome to post calls for papers, data mining job ads, links to source code of data mining algorithms, or anything else related to data mining. The forum is hosted by P. Fournier-Viger. No registration is required to use this forum!
the differences between RuleGrowth,TRuleGrowth, ERMiner
Posted by: xiaowei
Date: April 06, 2020 09:23PM

Hi Professor Philippe,

I have recently been studying sequential rule mining algorithms. Although I have read your papers on the topic, I still don't understand the differences between the RuleGrowth, TRuleGrowth, and ERMiner algorithms. Could you explain them briefly, especially the meaning of their respective parameters?


Kind regards,
Xiaowei

Re: the differences between RuleGrowth,TRuleGrowth, ERMiner
Date: April 08, 2020 05:26PM

Hi Xiaowei,

Yes, I will explain the main idea.

CMDeo, CMRules, RuleGrowth, ERMiner: These algorithms have exactly the same input and exactly the same output. The only difference between these algorithms is how they work internally to find the output. They use different search strategies, so some of them are faster or slower, or use more or less memory. In other words, the difference is about performance. Generally, RuleGrowth is faster than CMRules, and CMRules is usually faster than CMDeo. ERMiner is probably faster than RuleGrowth, but I think that it may use more memory. Of course, this depends on the data, so it may not always be the case. I usually recommend using RuleGrowth or ERMiner.

These four algorithms have two parameters: minimum support and minimum confidence. The goal is to find the rules that have a high support and a high confidence (a support no less than the minimum support and a confidence no less than the minimum confidence).
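To make the two parameters concrete, here is a minimal Python sketch of how the support and confidence of a single sequential rule could be computed on a toy database. It is only an illustration, not the code of any of the algorithms above. It assumes the usual definition where a rule X --> Y occurs in a sequence if all items of X appear before all items of Y, and where support is expressed as a fraction of the sequences. The function names and the toy database are made up for this example.

```python
def rule_occurs(antecedent, consequent, sequence):
    """Return True if all antecedent items appear before all consequent items.

    A sequence is a list of itemsets (Python sets). We try every cut point
    and check that the antecedent is covered by the itemsets before the cut
    and the consequent by the itemsets after it.
    """
    for split in range(1, len(sequence)):
        before = set().union(*sequence[:split])
        after = set().union(*sequence[split:])
        if antecedent <= before and consequent <= after:
            return True
    return False


def support_and_confidence(antecedent, consequent, database):
    """Relative support and confidence of the rule antecedent --> consequent."""
    both = sum(rule_occurs(antecedent, consequent, s) for s in database)
    with_antecedent = sum(antecedent <= set().union(*s) for s in database)
    support = both / len(database)
    confidence = both / with_antecedent if with_antecedent else 0.0
    return support, confidence


# Hypothetical toy database of four sequences.
database = [
    [{"A"}, {"B"}, {"C"}],
    [{"A"}, {"C"}],
    [{"C"}, {"A"}],
    [{"B"}, {"C"}],
]

support, confidence = support_and_confidence({"A"}, {"C"}, database)

# The rule is kept only if it passes both thresholds.
minsup, minconf = 0.5, 0.6
is_valid = support >= minsup and confidence >= minconf
print(support, confidence, is_valid)
```

In this toy database, A --> C occurs in two of the four sequences, and A appears in three of them, so the support is 2/4 and the confidence is 2/3.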

TopSeqRules: This algorithm is based on RuleGrowth. It works in the same way but the parameters are different. The user needs to set a number k and the minimum confidence. Then, the algorithm returns the top-k most frequent rules (the k rules with the highest support) that have a confidence no less than the minimum confidence.
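As a rough illustration of what the top-k output means, the Python sketch below selects the k rules with the highest support among those that meet the minimum confidence. It only describes the result, not how TopSeqRules searches for it: it assumes the support and confidence of every candidate rule are already known, which a real algorithm would avoid computing exhaustively. The rule list, its values, and the way ties are handled are made up for this example.

```python
def top_k_rules(rules, k, minconf):
    """Keep rules with confidence >= minconf, then return the k most frequent.

    Each rule is a (name, support, confidence) tuple. Ties in support are
    resolved by the original list order, which is an arbitrary choice here.
    """
    eligible = [r for r in rules if r[2] >= minconf]
    eligible.sort(key=lambda r: r[1], reverse=True)
    return eligible[:k]


# Hypothetical rules with made-up support and confidence values.
rules = [
    ("A --> B", 0.6, 0.90),
    ("A --> C", 0.8, 0.40),  # most frequent, but confidence is too low
    ("B --> C", 0.5, 0.70),
    ("C --> D", 0.3, 0.95),
]

result = top_k_rules(rules, k=2, minconf=0.6)
print([name for name, _, _ in result])
```

Note that A --> C is excluded even though it has the highest support, because its confidence is below the threshold.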

TNS: This is similar to TopSeqRules. The difference is that we also remove some rules that are said to be redundant.

TRuleGrowth: This is a modification of RuleGrowth where we let the user specify a new constraint, the maximum window length. The parameters are the same as for RuleGrowth, plus this one new parameter. The idea is the following.

If you look at a sequence like this:

(A) (B) (C) (D) (E) (F) (G) (H)

maybe you could find a rule like this:

A --> H

But as you can see above, A and H are far apart from each other.

So using the new parameter (the maximum window length), you can set a constraint on how far apart the antecedent and consequent of a rule can be. For example, if you set the maximum window length to 2, then the rule A --> H will not be considered as appearing in the sequence:

(A) (B) (C) (D) (E) (F) (G) (H)

because A and H will be too far from each other.
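The window constraint can be sketched in Python as follows. This is only an illustration under one assumption: the window length is counted in consecutive itemsets, so a rule occurs only if its antecedent and consequent both fit inside some run of that many itemsets. The function name is made up for this example and this is not the actual TRuleGrowth code.

```python
def occurs_within_window(antecedent, consequent, sequence, window):
    """Return True if the rule occurs inside some run of `window` itemsets.

    For every run of consecutive itemsets of length `window`, we try each
    cut point and check that the antecedent lies before the cut and the
    consequent after it.
    """
    for start in range(len(sequence)):
        chunk = sequence[start:start + window]
        for split in range(1, len(chunk)):
            before = set().union(*chunk[:split])
            after = set().union(*chunk[split:])
            if antecedent <= before and consequent <= after:
                return True
    return False


# The sequence (A) (B) (C) (D) (E) (F) (G) (H) from the example above.
sequence = [{item} for item in "ABCDEFGH"]

print(occurs_within_window({"A"}, {"H"}, sequence, window=2))  # False
print(occurs_within_window({"A"}, {"B"}, sequence, window=2))  # True
print(occurs_within_window({"A"}, {"H"}, sequence, window=8))  # True
```

With a window of 2, only rules between neighboring itemsets such as A --> B can occur in this sequence. A --> H occurs only when the window is large enough to cover all eight itemsets.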


That is the main idea.

Best regards,

Philippe

Re: the differences between RuleGrowth,TRuleGrowth, ERMiner
Posted by: xiaowei
Date: April 08, 2020 06:54PM

Thanks a lot, Professor Philippe. I got it.



This forum is powered by Phorum and provided by P. Fournier-Viger (© 2012).