The Data Mining Forum
This forum is about data mining, data science and big data: algorithms, source code, datasets, implementations, optimizations, etc. You are welcome to post calls for papers, data mining job ads, links to source code of data mining algorithms, or anything else related to data mining. The forum is hosted by P. Fournier-Viger. No registration is required to use this forum!
TopKClassRules
Posted by: Irfan
Date: July 28, 2020 11:42PM

Greetings,

I use TopKClassRules in my project and have found it very useful. However, it only provides top-k frequent class association rules.

Is there any way to modify it so that it provides top-k rare class association rules? Or is there another algorithm that works like TopKClassRules but provides rare class association rules?

Thank you

Re: TopKClassRules
Date: July 29, 2020 12:00AM

Hi Irfan,

Glad it is useful.

I think it could be modified for rare class association rules. However, there are several definitions of what is "rare".

If "rare" just means having a support lower than some threshold maxsup, then I think it would not be hard to do. But if you use another definition of what a rare rule is, it may be more complicated.

In any case, when changing an algorithm to do something else, problems that we did not think about can always arise ;-) I think it is possible, but maybe there is something I did not think about.

Best regards,

Philippe

Re: TopKClassRules
Posted by: Irfan
Date: July 29, 2020 05:35PM

Thank you, Prof, for your quick reply.

Yes, what I mean by rare is just a rule whose support is lower than some threshold maxsup.

So the algorithm would have an option for whether the rules should be rare or frequent.

The other inputs, such as k, minconf (%), and the fixed consequent items, should remain as they are in the current algorithm.

Re: TopKClassRules
Posted by: Irfan
Date: July 29, 2020 06:24PM

But the support of a rare rule found by TopKClassRules should not be 0%.
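
To make the request concrete, the constraints mentioned so far (k, minconf, fixed consequent items, a maxsup threshold, and a support above 0%) can be sketched as a brute-force filter. This is only an illustration in Python, not the SPMF implementation: the function name `rare_class_rules`, the `max_len` limit, the toy transactions, and the choice to rank the rare rules by highest support are all assumptions made for the example.

```python
from itertools import combinations

def rare_class_rules(transactions, consequents, k, minconf, maxsup, max_len=2):
    """Brute-force sketch: rules X -> c with 0 < sup <= maxsup and conf >= minconf."""
    n = len(transactions)
    # Candidate antecedent items are all items except the fixed consequent items.
    items = sorted({i for t in transactions for i in t} - set(consequents))
    rules = []
    for size in range(1, max_len + 1):
        for antecedent in combinations(items, size):
            a = set(antecedent)
            count_a = sum(1 for t in transactions if a <= t)
            if count_a == 0:
                continue  # the antecedent never occurs, so no rule can be built
            for c in consequents:
                count_rule = sum(1 for t in transactions if a <= t and c in t)
                sup = count_rule / n         # relative support of X -> c
                conf = count_rule / count_a  # confidence of X -> c
                if 0 < sup <= maxsup and conf >= minconf:
                    rules.append((antecedent, c, sup, conf))
    # Assumption: among the rare rules, keep the k with the highest support
    # (ties broken by higher confidence, then by antecedent).
    rules.sort(key=lambda r: (-r[2], -r[3], r[0]))
    return rules[:k]

# Hypothetical toy data: "yes" is the class item used as the fixed consequent.
transactions = [
    {"a", "b", "yes"}, {"a", "b", "yes"}, {"a", "yes"}, {"a"}, {"c"},
    {"c", "d"}, {"b"}, {"a", "c", "yes"}, {"d", "yes"}, {"a", "b"},
]
top = rare_class_rules(transactions, ["yes"], k=3, minconf=0.5, maxsup=0.2)
for antecedent, c, sup, conf in top:
    print(antecedent, "->", c, f"sup={sup:.2f}", f"conf={conf:.2f}")
```

With this toy data, the rule a -> yes (support 40%) is rejected by maxsup = 20%, and the rules kept have the antecedents (a, b), (b) and (a, c).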

Re: TopKClassRules
Posted by: Irfan
Date: July 31, 2020 04:37PM

Dear Prof,

Do you have any suggestion about which file, or which part of a file, I should edit to add this functionality?

Or perhaps someone else can help with this.

Sorry to ask, but I am not familiar with Java.


Thank you.

Re: TopKClassRules
Date: July 31, 2020 04:43PM

Hi,

I will try to do it for you in the next few days.

Best regards,

Philippe

Re: TopKClassRules
Posted by: Irfan
Date: July 31, 2020 05:46PM

Thank you Prof.

Re: TopKClassRules
Date: August 09, 2020 09:08AM

Hi Irfan,

I have added the "maxsup" parameter:



You can try it by downloading the spmf.jar file again from the website.

For the source code version of SPMF, I will upload spmf.zip in maybe a few hours, because I also want to update another algorithm.

Best,

Philippe



Edited 1 time(s). Last edit at 08/09/2020 09:10AM by webmasterphilfv.

Re: TopKClassRules
Posted by: Irfan
Date: August 09, 2020 04:43PM

Dear Prof,

Thank you very much.

Re: TopKClassRules
Posted by: Irfan
Date: August 10, 2020 04:24PM

Dear Prof,

I downloaded the new GUI version of SPMF and tested it with a max conf of 25% and a min conf of 2%, but the rules that are generated are still frequent rules rather than rare rules within that interval. For example, with the conf interval (2%-25%) I still found a rule with 90% conf, which is also generated when looking for frequent rules with a minconf of 60%.

Re: TopKClassRules
Date: August 10, 2020 06:11PM

Hi Irfan,

I would like to understand more clearly.

You said "maxconf", but a rare rule is not about confidence; it is about support. The parameter is "maxsup", not "maxconf".

I have implemented a parameter "maxsup" for TopKClassRules in the version that you have downloaded.

Do you mean that you would like to have a "maxconf" parameter?

Best regards,

Philippe

Re: TopKClassRules
Posted by: Irfan
Date: August 10, 2020 11:34PM

Yes, Prof,

Because after reading the paper on this algorithm, it seems that its advantage is that the user only has to enter the confidence.

So I think having minconf and maxconf would handle the issue of rare rules. But with maxsup and minconf, the result obtained is not rare, because the algorithm still gives the same output as when mining frequent rules.

Re: TopKClassRules
Date: August 11, 2020 04:38AM

Hi again,

But rare means infrequent... It means something that has a low frequency.

It depends on how you set the parameter. If you set maxsup very low, you will find infrequent (rare) rules. For example, if you set maxsup = 25%, the algorithm should not give you rules more frequent than 25%. If you set maxsup = 0.01%, the algorithm will not give you rules more frequent than 0.01%, and so on. Maybe you need to decrease it further if you still find frequent rules.

The confidence is not related to frequent or rare. You can have frequent rules with a low or high confidence and you can also have rare rules that have a high or a low confidence...
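
A small numeric illustration of this point, written as a Python sketch. The helper `support_and_confidence` and the toy transactions are hypothetical and are not part of SPMF; they only show that support and confidence are computed from different denominators, so one can be low while the other is high.

```python
def support_and_confidence(transactions, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    n = len(transactions)
    count_a = sum(1 for t in transactions if antecedent <= t)
    count_rule = sum(1 for t in transactions if antecedent <= t and consequent in t)
    support = count_rule / n                               # share of all transactions
    confidence = count_rule / count_a if count_a else 0.0  # share of those with the antecedent
    return support, confidence

# 100 hypothetical transactions; "c" plays the role of the class item.
transactions = (
    [{"x", "c"}] * 2     # x is very rare but always appears with c
    + [{"y", "c"}] * 30  # y is common ...
    + [{"y"}] * 50       # ... but usually appears without c
    + [{"z"}] * 18
)

rare_strong = support_and_confidence(transactions, {"x"}, "c")
frequent_weak = support_and_confidence(transactions, {"y"}, "c")
print("x -> c:", rare_strong)    # rare rule with high confidence
print("y -> c:", frequent_weak)  # frequent rule with low confidence
```

Here x -> c has a support of 2% with a confidence of 100%, while y -> c has a support of 30% with a confidence of 37.5%. A maxconf constraint would remove the first rule and keep the second, which is the opposite of looking for rare rules.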

For me it doesn't matter. I can add a maxconfidence constraint to the algorithm for you. But it will not help you much to find rare (infrequent) rules. To find infrequent rules, you should use maxsup and decrease it.

Using maxconfidence will just help you find rules that are less "strong", but not necessarily rare rules.

Yes, in the paper, the point is that we want to use k to avoid setting a constraint on the support. But internally, the algorithm will still use a minimum support. By using maxsup, you will force the algorithm to search for less frequent rules. But you need to set maxsup low enough; otherwise, you will indeed get the same rules!
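
The idea can be sketched in a few lines of Python. This is a simplified illustration and not the actual TopKClassRules code: it assumes that candidate rules arrive as a flat list of (name, support) pairs, whereas the real algorithm explores a search space and uses the internal minimum support to prune it. The names `top_k_by_support` and `internal_minsup` are made up for the example.

```python
import heapq

def top_k_by_support(candidates, k, maxsup=1.0):
    """Keep the k most frequent rules whose support does not exceed maxsup.

    Returns (rules sorted by decreasing support, final internal minimum support).
    """
    heap = []              # min-heap of (support, name); heap[0] is the weakest kept rule
    internal_minsup = 0.0  # raised automatically once k rules have been found
    for name, sup in candidates:
        if sup > maxsup or sup <= internal_minsup:
            continue       # too frequent for maxsup, or not better than the current top-k
        heapq.heappush(heap, (sup, name))
        if len(heap) > k:
            heapq.heappop(heap)           # drop the least frequent rule
        if len(heap) == k:
            internal_minsup = heap[0][0]  # the user never has to set this threshold
    return sorted(heap, reverse=True), internal_minsup

# Hypothetical candidate rules with their relative supports.
candidates = [("r1", 0.60), ("r2", 0.45), ("r3", 0.30),
              ("r4", 0.10), ("r5", 0.05), ("r6", 0.02)]

print(top_k_by_support(candidates, k=2))               # plain top-k: r1 and r2
print(top_k_by_support(candidates, k=2, maxsup=0.70))  # maxsup too high: same rules
print(top_k_by_support(candidates, k=2, maxsup=0.25))  # lower maxsup: r4 and r5
```

In this sketch, maxsup = 70% changes nothing because the two most frequent rules are already below it, while maxsup = 25% pushes the search toward r4 and r5.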

Best,

Philippe



Edited 2 time(s). Last edit at 08/11/2020 04:41AM by webmasterphilfv.

Re: TopKClassRules
Posted by: Irfan
Date: August 12, 2020 12:23AM

Thank you for the quick reply and the additional explanation.

Let me experiment more with datasets covering different scenarios, and then I will be back.

Thank you very much.



This forum is powered by Phorum and provided by P. Fournier-Viger (© 2012).