The Data Mining Forum
This forum is about data mining, data science and big data: algorithms, source code, datasets, implementations, optimizations, etc. You are welcome to post calls for papers, data mining job ads, links to source code of data mining algorithms, or anything else related to data mining. The forum is hosted by P. Fournier-Viger. No registration is required to use this forum!
Add length constraint to the output of sequential pattern algorithms
Posted by: rogelio andrade
Date: October 19, 2017 08:50AM

Hello,

I am using the fantastic SPMF library. Specifically, I am using the GSP/SPADE/SPAM algorithms.

I have a problem: my sequence database is very large, so performance is an issue. Since I am only interested in finding sequential patterns of length X for a given minsup, I am wondering whether someone with knowledge of the source code can point me in the right direction as to where in the source code I can efficiently add such a constraint.

Thanks!

Re: Add length constraint to the output of sequential pattern algorithms
Date: October 19, 2017 07:32PM

Hi,

The length constraint has not been implemented for these algorithms. However, it has been implemented for the CM-SPAM algorithm, which takes the same input and produces the same output as GSP/SPADE/SPAM. So the easiest solution would be to use CM-SPAM, which already has this feature and should be faster than those algorithms.

Best,

Philippe
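
For readers wondering where a length constraint usually goes in a depth-first sequential pattern miner, here is a minimal sketch. It is not SPMF code and not the CM-SPAM implementation: it is a simplified PrefixSpan-style search written for illustration, and it assumes sequences of single items (no itemsets) and an absolute minsup count. The names `mine`, `grow`, `min_len` and `max_len` are hypothetical. The point it shows is that a maximum length can prune the search itself, while a minimum length can only filter what is output, because short patterns still have to be explored to reach longer ones.

```python
from collections import defaultdict


def mine(sequences, minsup, min_len=1, max_len=None):
    """Depth-first mining of frequent sequential patterns (single items only).

    Returns {pattern_tuple: support} for patterns whose length lies between
    min_len and max_len (inclusive). minsup is an absolute sequence count.
    """
    results = {}

    def grow(prefix, projected):
        # projected: list of (sequence index, position to resume scanning from)
        if max_len is not None and len(prefix) >= max_len:
            return  # maximum length reached: the whole subtree is skipped

        # For each item, record the sequences in which it occurs after the
        # prefix, keeping only its first occurrence in each sequence.
        first_occurrence = defaultdict(dict)
        for seq_id, start in projected:
            seq = sequences[seq_id]
            for pos in range(start, len(seq)):
                first_occurrence[seq[pos]].setdefault(seq_id, pos + 1)

        for item in sorted(first_occurrence):
            occurrences = first_occurrence[item]
            support = len(occurrences)
            if support < minsup:
                continue  # infrequent, so none of its extensions are frequent
            pattern = prefix + (item,)
            if len(pattern) >= min_len:
                results[pattern] = support  # minimum length only filters output
            grow(pattern, sorted(occurrences.items()))

    grow((), [(i, 0) for i in range(len(sequences))])
    return results


if __name__ == "__main__":
    database = [["a", "b", "c"], ["a", "c"], ["a", "b", "c", "d"], ["b", "c"]]
    # Patterns of exactly length 2 that appear in at least 2 sequences.
    for pattern, support in mine(database, 2, min_len=2, max_len=2).items():
        print(" ".join(pattern), "#SUP:", support)
```

In this sketch the `max_len` check sits at the top of the recursive step, so no candidates are generated beyond the requested length. Where the equivalent check would go in GSP, SPADE or SPAM depends on how each implementation generates candidates, which is not covered here.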

Re: Add length constraint to the output of sequential pattern algorithms
Posted by: rogelio andrade
Date: October 20, 2017 06:22AM

@Philippe: Thanks for the advice! I'll take a look at it. I also saw the GoKrimp code, which has very appealing properties for my research.


