The Data Mining Forum
This forum is about data mining, data science and big data: algorithms, source code, datasets, implementations, optimizations, etc. You are welcome to post calls for papers, data mining job ads, links to source code of data mining algorithms or anything else related to data mining. The forum is hosted by P. Fournier-Viger. No registration is required to use this forum!
Window constraint and TRuleGrowth
Posted by: Binayak
Date: April 04, 2017 02:57AM

Hi,

While reading the documentation at http://www.philippe-fournier-viger.com/spmf/index.php?link=documentation.php#trulegrowth
and referring to the sequence database below, I have a question about the window size.

ID Sequences
S1 (1), (1 2 3), (1 3), (4), (3 6)
S2 (1 4), (3), (2 3), (1 5)
S3 (5 6), (1 2), (4 6), (3), (2)
S4 (5), (7), (1 6), (3), (2), (3)

What does a window size of 2 or 3 mean here?

If the window size is 2, then in S1 it can consider (1), (1 2 3) or (1 2 3), (1 3), i.e. a window size of 2 basically refers to 2 consecutive itemsets? Is this understanding of the window size correct?

Are there any references that show a calculation by hand of the above algorithm with different window sizes?
Thanks
Binayak

Re: Window constraint and TRuleGrowth
Date: April 04, 2017 08:02AM

Hello,

I think that you understand the basic idea.

If you set window_size = 2, it means that a pattern (sequential rule) appears in a sequence only if it appears within 2 consecutive itemsets.

For example, if you consider the first sequence:

S1	(1), (1 2 3), (1 3), (4), (3 6)


Since window_size = 2, a pattern only appears in this sequence if it appears within two consecutive itemsets. That means that a pattern has to appear in one of the following four pairs of consecutive itemsets:
(1), (1 2 3),
          (1 2 3), (1 3),
                   (1 3), (4),
                          (4), (3 6)
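
To make the list of windows above concrete, here is a small Python sketch that enumerates them. It is only an illustration and not SPMF's code (SPMF is written in Java); the function name consecutive_windows and the representation of a sequence as a list of Python sets are assumptions made for this example.

```python
def consecutive_windows(sequence, window_size):
    """Return every run of `window_size` consecutive itemsets in `sequence`.

    `sequence` is a list of sets (one set per itemset). This is a hypothetical
    helper for illustration, not a function from SPMF.
    """
    if len(sequence) <= window_size:
        # The whole sequence fits in a single window.
        return [sequence]
    return [sequence[i:i + window_size]
            for i in range(len(sequence) - window_size + 1)]


# Sequence S1 from the question.
s1 = [{1}, {1, 2, 3}, {1, 3}, {4}, {3, 6}]

for window in consecutive_windows(s1, 2):
    print(window)
```

With window_size = 2 this yields the four pairs listed above; with window_size = 3 it yields three windows of three itemsets each.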


If a pattern does not appear in one of these four windows, then the pattern is considered as not appearing in sequence S1.

For example, the pattern {1} --> {4} appears in this sequence because it appears in the third window, (1 3), (4): item 1 is in the first itemset of that window and item 4 is in the next one.

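In contrast, the pattern {1} --> {6} does not appear in that sequence, because it does not appear in any of the windows: item 6 occurs only in the last itemset (3 6), and the only window containing that itemset is (4), (3 6), where no 1 comes before the 6.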

However, if we set window_size = 3, the pattern {1} --> {6} appears in that sequence, since the last three itemsets, (1 3), (4), (3 6), contain {1} --> {6}.
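
If you want to check such cases by hand, the reasoning above can be sketched in a few lines of Python. This is a simplified illustration and not the TRuleGrowth implementation from SPMF: it assumes a rule X --> Y occurs in a window when the window can be cut into an earlier part containing all items of X and a later part containing all items of Y. The function name rule_occurs is hypothetical; see the TKDE paper for the formal definition.

```python
def rule_occurs(sequence, antecedent, consequent, window_size):
    """Check whether the rule antecedent --> consequent occurs in `sequence`
    within some window of `window_size` consecutive itemsets.

    Assumption for this sketch: all items of the antecedent must appear in
    itemsets strictly before the itemsets that supply the consequent items.
    """
    n = len(sequence)
    # If the sequence is shorter than the window, use it as a single window.
    for start in range(max(1, n - window_size + 1)):
        window = sequence[start:start + window_size]
        # Try every cut point k: itemsets [0, k) come first, [k, end) after.
        for k in range(1, len(window)):
            before = set().union(*window[:k])
            after = set().union(*window[k:])
            if antecedent <= before and consequent <= after:
                return True
    return False


# Sequence S1 from the question.
s1 = [{1}, {1, 2, 3}, {1, 3}, {4}, {3, 6}]

print(rule_occurs(s1, {1}, {4}, 2))  # True: window (1 3), (4)
print(rule_occurs(s1, {1}, {6}, 2))  # False: no window of 2 itemsets has 1 before 6
print(rule_occurs(s1, {1}, {6}, 3))  # True: window (1 3), (4), (3 6)
```

Counting the sequences of the database for which this check succeeds gives the number of sequences in which the rule appears under the chosen window size.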

If you want more details, you can also read the TKDE paper about TRuleGrowth. It gives a formal definition of the window constraint and it also contains some examples:

http://www.philippe-fournier-viger.com/spmf/TKDE2015_sequential_rules.pdf



Edited 2 time(s). Last edit at 04/04/2017 08:03AM by webmasterphilfv.
