The Data Mining Forum
This forum is about data mining, data science and big data: algorithms, source code, datasets, implementations, optimizations, etc. You are welcome to post calls for papers, data mining job ads, links to source code of data mining algorithms or anything else related to data mining. The forum is hosted by P. Fournier-Viger. No registration is required to use this forum!
FP-Growth Ass.Rule using binary dataset
Posted by: Zastiex
Date: May 31, 2015 01:24AM

Hello forum,
I need help with my bachelor final year project.
I tried to use SPMF to test association rules mining using my sales transaction data.
Unfortunately, my dataset is in binary (0/1) form in an .xlsx file. What should I do with it? Thanks for the help.

Re: FP-Growth Ass.Rule using binary dataset
Posted by: Philippe
Date: May 31, 2015 01:59AM

You could export your Excel file as a CSV file (a text file). Then you could write some code to convert the CSV file to the correct input format.

Re: FP-Growth Ass.Rule using binary dataset
Posted by: Zastiex
Date: May 31, 2015 07:31PM

Oh, I see. So what do you mean by the correct input format? Is it .txt?

Re: FP-Growth Ass.Rule using binary dataset
Date: June 02, 2015 02:46AM

Yes, SPMF uses text files as input.

If you go to the download page, there are instructions for installing SPMF and it comes with example input files.

http://www.philippe-fournier-viger.com/spmf/index.php?link=download.php

Besides, the input format for each algorithm is described in the documentation page of SPMF. Just click on the example for the algorithm that you want to execute and you will see the description of the input format.

http://www.philippe-fournier-viger.com/spmf/index.php?link=documentation.php
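
As a rough illustration of the conversion step, here is a minimal Python sketch. It assumes the exported CSV has a header row of item names, no transaction ID column, and one row per transaction with 1/0 flags. The function name and the numbering of items by column position are made up for this example. The output follows the transaction database format described in the SPMF documentation: one transaction per line, items written as positive integers separated by single spaces.

```python
import csv
import io

def binary_csv_to_spmf(csv_text):
    """Convert a 0/1 matrix in CSV form to SPMF-style transaction lines.

    Assumptions (not from SPMF itself): the first row is a header, every
    later row is one transaction, and a cell equal to "1" means the item
    in that column is present. Items are numbered 1, 2, 3, ... by column.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Keep the ID -> name mapping so the mined rules can be read back later.
    item_names = {i + 1: name.strip() for i, name in enumerate(header)}
    lines = []
    for row in reader:
        # Column order is ascending, so item IDs come out already sorted.
        items = [str(i + 1) for i, cell in enumerate(row) if cell.strip() == "1"]
        if items:  # skip rows with no items at all
            lines.append(" ".join(items))
    return lines, item_names

sample = "bread,milk,eggs\n1,0,1\n0,1,1\n0,0,0\n1,1,1\n"
spmf_lines, names = binary_csv_to_spmf(sample)
print("\n".join(spmf_lines))
```

For a real file, the text could be read with `open(path, newline="").read()` and the returned lines written to a .txt file. If Excel exports the flags in another form (for example `TRUE`/`FALSE`), the comparison with `"1"` would need to be adjusted.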

Re: FP-Growth Ass.Rule using binary dataset
Posted by: Zastiex
Date: June 26, 2015 04:08AM

Thank you very much for your help. I was finally able to write the code to convert the xls file. But I'm confused about how #SUP is counted. I set minsup to 0.08, but the result shows a #SUP of at least 20 for each rule. Can you help me?

Re: FP-Growth Ass.Rule using binary dataset
Date: June 26, 2015 05:46AM

hi,

The support can be expressed either as a percentage or as an integer.

For example, suppose a pattern appears in 2 transactions of a database containing 5 transactions.

You can say that the support of the pattern is 2 / 5 = 0.4, or 40% (this is called the relative support).

Or you can say that the support is 2 transactions (this is called the absolute support).

In the results, the support is expressed as an absolute support. You can convert it to relative support (a percentage) by dividing by the number of transactions in your database.
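
To make the two definitions concrete, here is a small Python sketch. The toy database and the function names are invented for this example; they are not part of SPMF.

```python
def absolute_support(pattern, transactions):
    """Number of transactions that contain every item of the pattern."""
    wanted = set(pattern)
    return sum(1 for t in transactions if wanted <= set(t))

def relative_support(pattern, transactions):
    """Absolute support divided by the number of transactions."""
    return absolute_support(pattern, transactions) / len(transactions)

# Hypothetical database of 5 transactions; the pattern {1, 3} occurs in 2 of them.
database = [[1, 2, 3], [1, 3], [2, 4], [1, 2], [3, 4]]
pattern = [1, 3]

print(absolute_support(pattern, database))  # 2
print(relative_support(pattern, database))  # 0.4
```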

Hope this helps!

Re: FP-Growth Ass.Rule using binary dataset
Posted by: Zastiex
Date: June 26, 2015 06:31AM

That's very helpful. So in my case, the rule shows #SUP = 20, which means the relative support is 20/507 = 0.03945. Does that make sense, given that I set minsup to 0.08? In my understanding, the support of each rule should be at least minsup. CMIIW.

Re: FP-Growth Ass.Rule using binary dataset
Date: June 26, 2015 07:03AM

Yes, it makes sense.

Yes, it should be no lower than minsup.

If it is lower, then perhaps there is an error somewhere. Are there really 507 transactions? Did you really set the parameter to 0.08? Or maybe there is a bug somewhere.

If you cannot find the problem, you can always send the input file to me, and tell me the parameters and which algorithm you used so that I can replicate the problem. philippe.fv AT gmail.com

Re: FP-Growth Ass.Rule using binary dataset
Posted by: Zastiex
Date: June 26, 2015 07:46AM

OMG. You are correct, it's not 507, it's 231!

I forgot that I had already removed the infrequent itemsets!

LOL

Thank you so much!!

You rock!

Re: FP-Growth Ass.Rule using binary dataset
Date: June 26, 2015 07:49AM

OK, glad you found the problem.
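
For anyone who runs into the same mismatch, a quick sanity check along these lines may help. This is only a sketch: the function names are made up, the rule for skipping lines (blank lines and lines starting with `#`, `%` or `@`) is an assumption about the input file, and rounding the relative threshold up to a whole number of transactions is assumed rather than taken from the SPMF source.

```python
import math

def count_transactions(lines):
    """Count the lines that hold a transaction.

    Assumption: blank lines and lines starting with #, % or @ are
    comments or metadata, not transactions.
    """
    return sum(1 for line in lines
               if line.strip() and line.strip()[0] not in "#%@")

def min_absolute_support(minsup, n_transactions):
    """Smallest absolute support that satisfies a relative minsup.

    Assumption: the threshold is rounded up to a whole transaction count.
    """
    return math.ceil(minsup * n_transactions)

# Numbers from this thread: minsup = 0.08 and a reported #SUP of 20.
for n in (507, 231):
    print(n, min_absolute_support(0.08, n), round(20 / n, 4))
# 507 41 0.0394  -> a #SUP of 20 would be below the threshold
# 231 19 0.0866  -> a #SUP of 20 is consistent with minsup = 0.08
```

Passing the lines of the actual input file to `count_transactions` gives the denominator to use when converting #SUP to a relative support.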

Re: FP-Growth Ass.Rule using binary dataset
Posted by: aakanksha
Date: May 18, 2017 12:56AM

Zastiex Wrote:
-------------------------------------------------------
> Hello forum,
> I need help with my bachelor final year project.
> I tried to use SPMF to test association rules
> mining using my sales transaction data.
> Unfortunately my dataset is in binary with .xlsx
> format. what should I do with this? thanks for the
> help

Re: FP-Growth Ass.Rule using binary dataset
Date: May 18, 2017 02:55AM

Hi,

You can first export the data from Excel to a text file in the CSV format (comma separated values in a text file).

Then you could write a small program to convert your format to the appropriate SPMF format by using Java, C++ or your favorite programming language.

Best,

Philippe

This forum is powered by Phorum and provided by P. Fournier-Viger (© 2012).