Add a length constraint to the output of sequential pattern algorithms

rogelio andrade
Date: October 19, 2017 08:50AM

Hello,

I am using the fantastic SPMF library. Specifically, I am using the GSP/SPADE/SPAM algorithms.

I have a problem: my sequence database is very large, so performance is an issue. Since I am only interested in finding sequential patterns of length X for a given minsup, I am wondering if someone who knows the source code could point me to where in the code I can efficiently add such a constraint.

Thanks!

webmasterphilfv
Date: October 19, 2017 07:32PM

Hi,

The length constraint has not been implemented for these algorithms. However, it has been implemented for the CM-SPAM algorithm, which takes the same input and produces the same output as GSP/SPADE/SPAM. So the easiest solution would be to use CM-SPAM, which already has this feature and should be faster than those algorithms.
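
To illustrate why a length constraint helps with performance, here is a minimal Python sketch of a depth-first sequential pattern search with a length bound. It is not SPMF code and not how CM-SPAM is implemented: the function name `mine_sequential_patterns` and its parameters are hypothetical, and it assumes each sequence holds one item per position rather than itemsets. The point is only that a maximum length can stop the search early, while a minimum length just filters what gets reported.

```python
from collections import defaultdict


def mine_sequential_patterns(sequences, minsup, min_len=1, max_len=None):
    """Return {pattern: support} for frequent patterns within the length bounds.

    sequences: list of lists of items (one item per position; a simplification).
    minsup:    absolute minimum support, i.e. a number of sequences.
    min_len / max_len: hypothetical length constraint (max_len=None means no cap).
    """
    results = {}

    def grow(prefix, projected):
        # projected holds (sequence index, position just after the last match).
        if max_len is not None and len(prefix) >= max_len:
            return  # pruning: longer patterns are never generated

        # Record the first occurrence of each item in every projected sequence.
        occurrences = defaultdict(list)
        for sid, start in projected:
            seen = set()
            seq = sequences[sid]
            for pos in range(start, len(seq)):
                item = seq[pos]
                if item not in seen:
                    seen.add(item)
                    occurrences[item].append((sid, pos + 1))

        for item, proj in sorted(occurrences.items()):
            if len(proj) < minsup:
                continue  # infrequent, so no extension can be frequent either
            pattern = prefix + (item,)
            if len(pattern) >= min_len:
                results[pattern] = len(proj)  # min_len only filters the output
            grow(pattern, proj)

    grow((), [(sid, 0) for sid in range(len(sequences))])
    return results


# Toy database, made up for this illustration.
db = [["a", "b", "c"], ["a", "c"], ["a", "b", "c"], ["b", "c"]]
patterns = mine_sequential_patterns(db, minsup=2, min_len=2, max_len=2)
for pattern, support in sorted(patterns.items()):
    print(pattern, support)
```

Note that shorter frequent patterns still have to be explored to reach length X, so only the upper bound reduces the search space; the lower bound reduces the output size.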

Best,

Philippe

rogelio andrade
Date: October 20, 2017 06:22AM

@Philippe: Thanks for the advice! I'll take a look at it. I also saw the GoKrimp code, which has very appealing properties for my research.