
Issue in My Dataset for HUSPM

Posted by: **P N RAMESH**

Date: March 29, 2020 06:12AM

I am working with a high-utility sequential pattern mining (HUSPM) algorithm. I got the following error for my dataset:

Index 5 out of bounds for length 5

at SPFM/ca.pfv.spmf.algorithms.sequentialpatterns.uspan.AlgoUSpan.runAlgorithm(AlgoUSpan.java:395)

My dataset is:

1[9] 2[7] -1 -2 SUtility:16

1[8] 3[8] 4[8] 5[8] -1 -2 SUtility:32

6[9] 7[8] 8[9] -1 -2 SUtility:26

9[8] -1 -2 SUtility:8

10[8] -1 -2 SUtility:8

**11[8] 3[8] -1 12[6] 4[8] 5[8] -1 -2 SUtility:38**

10[8] -1 -2 SUtility:8

6[6] 8[6] 13[8] -1 -2 SUtility:20

14[8] 15[8] 16[8] -1 -2 SUtility:24

If I remove the 6th sequence, it works.

Is there anything wrong with the 6th sequence?

Thanks in advance.

Posted by: **webmasterphilfv**

Date: March 29, 2020 09:39AM

Good evening!

Thanks for using SPMF! The problem is the following:

It may not be explained clearly in the documentation, but the algorithm assumes that the items within each itemset are sorted in ascending order (e.g. 1, 2, 3, 4...). If that order is not respected, the algorithm may produce incorrect results or fail, as it did here.

So this sequence:

**11[8]** 3[8] -1 **12[6]** 4[8] 5[8] -1 -2 SUtility:38

should be replaced by:

3[8] **11[8]** -1 4[8] 5[8] **12[6]** -1 -2 SUtility:38

so that items are in ascending order.

I will explain this more clearly in the documentation. Why this order? Because it makes some optimizations possible.

With the items reordered this way, it works.
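For anyone who hits the same error with a larger file, reordering by hand is tedious. Below is a minimal sketch of a preprocessing step in Python. It is not part of SPMF, and the function name `sort_itemsets` is hypothetical. It assumes the format shown in this thread: tokens of the form `item[utility]` separated by spaces, `-1` closing an itemset, `-2` closing the sequence, followed by `SUtility:N`. It also assumes plain text without the bold markers used above.

```python
import re

# Matches one item token such as "11[8]" (item id, then utility in brackets).
ITEM = re.compile(r"^(\d+)\[(\d+)\]$")

def sort_itemsets(line: str) -> str:
    """Return the line with the items of each itemset sorted by ascending item id."""
    out = []
    current = []  # (item_id, token) pairs of the itemset being read
    for token in line.split():
        match = ITEM.match(token)
        if match:
            current.append((int(match.group(1)), token))
        else:
            # Any other token (-1, -2, SUtility:...) closes the pending itemset.
            out.extend(tok for _, tok in sorted(current, key=lambda p: p[0]))
            current = []
            out.append(token)
    # Flush items left over if the line did not end with a separator.
    out.extend(tok for _, tok in sorted(current, key=lambda p: p[0]))
    return " ".join(out)

if __name__ == "__main__":
    print(sort_itemsets("11[8] 3[8] -1 12[6] 4[8] 5[8] -1 -2 SUtility:38"))
```

Only the order of items inside each itemset changes. The order of the itemsets, the utilities, and the `SUtility` value are left untouched, so the sequence utility stays valid.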

