The Data Mining Forum
This forum is about data mining, data science and big data: algorithms, source code, datasets, implementations, optimizations, etc. You are welcome to post calls for papers, data mining job ads, links to source code of data mining algorithms, or anything else related to data mining. The forum is hosted by P. Fournier-Viger. No registration is required to use this forum!
Support in Sequential Pattern Mining algorithms: another problem
Posted by: Mariana T
Date: September 27, 2017 10:26AM

Hello,

I asked a question here earlier about support in sequential pattern mining algorithms, and your answer helped me with that sequence. However, now I have some sequences like this:

10 98 -1 12 98 -1 15 105 -1 10 91 -1 10 105 -1 15 105 -1 12 105 -1 10 91 -1 15 91 -1 10 91 -1 15 91 -1 12 98 -1 12 98 -1 12 105 -1 10 91 -1 10 91 -1 15 105 -1 10 91 -1 15 105 -1 15 91 -1 15 105 -1 12 105 -1 10 105 -1 15 105 -1 10 98 -1 15 98 -1 15 105 -1 10 98 -1 15 91 -1 15 105 -1 15 91 -1 15 91 -1 10 91 -1 -2

12 91 -1 10 91 -1 10 105 -1 10 105 -1 15 98 -1 12 91 -1 10 105 -1 15 105 -1 10 105 -1 10 91 -1 10 105 -1 10 98 -1 15 91 -1 10 91 -1 10 105 -1 10 91 -1 15 105 -1 10 91 -1 12 105 -1 12 105 -1 12 98 -1 12 91 -1 10 91 -1 10 91 -1 10 91 -1 15 105 -1 12 105 -1 10 91 -1 12 91 -1 10 91 -1 12 98 -1 12 98 -1 10 105 -1 -2

12 91 -1 10 91 -1 15 98 -1 10 91 -1 12 91 -1 12 91 -1 10 91 -1 15 91 -1 15 91 -1 10 91 -1 15 105 -1 15 91 -1 10 91 -1 10 98 -1 12 105 -1 15 105 -1 15 91 -1 15 105 -1 12 91 -1 10 105 -1 10 91 -1 10 91 -1 12 91 -1 12 98 -1 10 91 -1 10 91 -1 12 91 -1 10 98 -1 10 91 -1 15 98 -1 12 91 -1 10 105 -1 10 98 -1 -2

15 91 -1 15 105 -1 15 105 -1 10 91 -1 10 105 -1 10 98 -1 10 91 -1 10 98 -1 15 105 -1 10 105 -1 12 105 -1 12 91 -1 10 105 -1 12 105 -1 15 98 -1 12 91 -1 10 91 -1 12 105 -1 12 91 -1 10 91 -1 12 91 -1 15 105 -1 12 91 -1 12 105 -1 12 91 -1 12 98 -1 10 91 -1 10 105 -1 10 105 -1 10 91 -1 10 91 -1 10 91 -1 12 91 -1 12 91 -1 -2

(and more sequences...)

I put these in a file and submitted it to SPMF. I chose CM-SPAM as the algorithm, with minsup 0.35, min pattern length 4 and max gap 1, and I got this output:

9 -1 9 -1 9 92 -1 #SUP: 34848
9 -1 9 92 -1 9 -1 #SUP: 34627
9 -1 9 92 -1 92 -1 #SUP: 32739
9 -1 92 -1 9 92 -1 #SUP: 34072
9 92 -1 9 -1 9 -1 #SUP: 34650
9 92 -1 9 -1 92 -1 #SUP: 34068
9 92 -1 9 92 -1 #SUP: 37931
9 92 -1 92 -1 9 -1 #SUP: 33992
92 -1 9 -1 9 92 -1 #SUP: 33837
92 -1 9 92 -1 9 -1 #SUP: 32607

However, again, I think it is strange that I got lots of patterns like 9 -1 9 -1 9 92 -1 #SUP: 34848, because this pattern is not "sequential". I would expect patterns like this one instead: 9 92 -1 9 92 -1 #SUP: 37931, because each of my itemsets has two items, not one as in my first problem. I also tried max gap = 0, but it didn't work.

Can you help me with this problem?

Thanks a lot :)

Re: Support in Sequential Pattern Mining algorithms: another problem
Date: September 28, 2017 07:23AM

Hello,

I am not sure that it is not working. Maybe you can send your whole dataset to my e-mail: philfv8 AT yahoo.com

In the example above, I cannot see anything wrong. Actually, the pattern 9 -1 9 -1 9 92 -1 #SUP: 34848 does not seem to appear in the sequences that you have shown me.
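
One quick way to confirm this kind of mismatch is to compare the items used by a pattern with the items that actually occur in the input. The following is only a minimal Python sketch, not part of SPMF: the helper names (parse_spmf_sequence, distinct_items, pattern_items) are made up for illustration, and the two input lines are shortened prefixes of the first two sequences posted above.

```python
def parse_spmf_sequence(line):
    """Parse one line in SPMF sequence format.

    -1 closes an itemset and -2 closes the sequence.
    Returns a list of frozensets, one per itemset.
    """
    sequence, itemset = [], []
    for token in line.split():
        value = int(token)
        if value == -2:
            break
        if value == -1:
            sequence.append(frozenset(itemset))
            itemset = []
        else:
            itemset.append(value)
    return sequence


def distinct_items(sequences):
    """Collect every item that occurs anywhere in the given sequences."""
    items = set()
    for sequence in sequences:
        for itemset in sequence:
            items |= itemset
    return items


def pattern_items(pattern_line):
    """Collect the items of one output line such as '9 -1 9 92 -1 #SUP: 10'."""
    body = pattern_line.split("#SUP:")[0]
    return {int(token) for token in body.split() if int(token) >= 0}


# Shortened prefixes of the first two sequences from the post (illustration only).
lines = [
    "10 98 -1 12 98 -1 15 105 -1 10 91 -1 -2",
    "12 91 -1 10 91 -1 10 105 -1 -2",
]
sequences = [parse_spmf_sequence(line) for line in lines]
items = distinct_items(sequences)
missing = pattern_items("9 -1 9 -1 9 92 -1 #SUP: 34848") - items

print(sorted(items))    # [10, 12, 15, 91, 98, 105]
print(sorted(missing))  # [9, 92]
```

In this excerpt, items 9 and 92 never occur in the input, so a pattern built from them cannot have come from these exact sequences. Running the same check on the full input file would show whether the output belongs to a different file or encoding.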

By the way, to check the results more easily, you can set the parameter "Show sequences ids (optional)" to true for CM-SPAM in the graphical interface of SPMF. Then, for each pattern found, you will get the list of sequences that contain the pattern, and you can check whether the results make sense. But since there are more than 30,000 sequences, this may produce a big file. Another possibility is to take a subset of your file, such as 100 or 500 sequences, and run some small tests first.
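
The same kind of check can also be done by hand on a small sample. The sketch below is an independent Python illustration, not SPMF code: it loads the first few sequences and lists which ones contain a given pattern. The function names are hypothetical, the ids are simply 0-based positions in the sample, and the gap rule is an assumption (max_gap=1 is taken to mean that consecutive itemsets of the pattern must match consecutive itemsets of the sequence), so results should be compared against SPMF's own output rather than trusted blindly.

```python
def parse_spmf_sequence(line):
    """Parse SPMF sequence format: -1 closes an itemset, -2 closes the sequence."""
    sequence, itemset = [], []
    for token in line.split():
        value = int(token)
        if value == -2:
            break
        if value == -1:
            sequence.append(frozenset(itemset))
            itemset = []
        else:
            itemset.append(value)
    return sequence


def load_sequences(lines, limit=None):
    """Parse at most `limit` sequences, skipping blank lines and lines that
    start with '#', '%' or '@' (assumed here to be comments or metadata)."""
    sequences = []
    for line in lines:
        if limit is not None and len(sequences) >= limit:
            break
        line = line.strip()
        if not line or line[0] in "#%@":
            continue
        sequences.append(parse_spmf_sequence(line))
    return sequences


def contains(sequence, pattern, max_gap=None):
    """True if every itemset of `pattern` is a subset of some itemset of
    `sequence`, in order. With max_gap set, two consecutive matches may be
    at most max_gap positions apart (assumed semantics)."""
    def search(p_idx, start, prev):
        if p_idx == len(pattern):
            return True
        end = len(sequence)
        if prev is not None and max_gap is not None:
            end = min(end, prev + max_gap + 1)
        for pos in range(start, end):
            if pattern[p_idx] <= sequence[pos] and search(p_idx + 1, pos + 1, pos):
                return True
        return False
    return search(0, 0, None)


def supporting_ids(sequences, pattern, max_gap=None):
    """Return the 0-based ids of the sequences that contain the pattern."""
    return [sid for sid, seq in enumerate(sequences)
            if contains(seq, pattern, max_gap)]


# Shortened prefixes of the first three posted sequences (illustration only).
sample = [
    "10 98 -1 12 98 -1 15 105 -1 10 91 -1 -2",
    "12 91 -1 10 91 -1 10 105 -1 10 105 -1 -2",
    "12 91 -1 10 91 -1 15 98 -1 10 91 -1 -2",
]
sequences = load_sequences(sample, limit=100)
pattern = parse_spmf_sequence("12 -1 10 -1")

print(supporting_ids(sequences, pattern))             # [0, 1, 2]
print(supporting_ids(sequences, pattern, max_gap=1))  # [1, 2]
```

The backtracking search is simple rather than fast, so it is meant for samples of a few hundred sequences, not for the full dataset.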

Best regards,

Philippe

Re: Support in Sequential Pattern Mining algorithms: another problem
Posted by: Mariana T
Date: September 28, 2017 11:31AM

Philippe, I just sent you an email!
