This forum is about

closure Jumping in EFIM Close Algorithm

Posted by:
**
sharare1370
**

Date: July 26, 2020 06:52AM

Hi

I read EFIM Close algorithm but i did not understand the concept of prune withe closure jumping would you please explain this prune strategy with example?

Thanks

I read EFIM Close algorithm but i did not understand the concept of prune withe closure jumping would you please explain this prune strategy with example?

Thanks

Posted by:
**
webmasterphilfv
**

Date: July 27, 2020 12:28AM

Hi,

I will try to explain this optimization with a simple example.

Let's say that we have a database containing only three transactions:

Transaction 1: A B C D E F

Transaction 2: A B C D E F

Transaction 3: A B C

Then, the algorithm will star to search for the closed itemsets. At some point the algorithm will find {A,B,C} and will then want to extend that itemset further to find larger itemsets starting with {A,B,C}.

The algorithm will count what items could extend {A,B,C}. The answer is:

"D" and it appears two times after {A,B,C}

"E" and it appears two times after {A,B,C}

"F" and it appears two times after {A,B,C}

Because all the items D, E, and F that could extend {A,B,C} have the same support (two times), then we can directly jump to the itemset {A,B,C,D,E,F} and we know that this itemset will be closed.

The other itemsets starting with ABC like ABCD, ABCE, ABCF, ABCDE, ABCDF etc cannot be closed because they will all have a support of 2, and {ABCDEF} has a support of 2.

This is the reason why we can jump directly to ABCDEF... This will save time because we dont need to explicitly check ABCD, ABCE etc...

So this is an optimization for closed itemset mining.

Hope that this is more clear.

Best regards,

Philippe

I will try to explain this optimization with a simple example.

Let's say that we have a database containing only three transactions:

Transaction 1: A B C D E F

Transaction 2: A B C D E F

Transaction 3: A B C

Then, the algorithm will star to search for the closed itemsets. At some point the algorithm will find {A,B,C} and will then want to extend that itemset further to find larger itemsets starting with {A,B,C}.

The algorithm will count what items could extend {A,B,C}. The answer is:

"D" and it appears two times after {A,B,C}

"E" and it appears two times after {A,B,C}

"F" and it appears two times after {A,B,C}

Because all the items D, E, and F that could extend {A,B,C} have the same support (two times), then we can directly jump to the itemset {A,B,C,D,E,F} and we know that this itemset will be closed.

The other itemsets starting with ABC like ABCD, ABCE, ABCF, ABCDE, ABCDF etc cannot be closed because they will all have a support of 2, and {ABCDEF} has a support of 2.

This is the reason why we can jump directly to ABCDEF... This will save time because we dont need to explicitly check ABCD, ABCE etc...

So this is an optimization for closed itemset mining.

Hope that this is more clear.

Best regards,

Philippe

Posted by:
**
sharare1370
**

Date: July 27, 2020 08:47AM

Hi

Thank you for your good explanations i understand it.

Best regards

Thank you for your good explanations i understand it.

Best regards

Posted by:
**
webmasterphilfv
**

Date: July 27, 2020 06:12PM

Great :-)