I will not explain the algorithm into details because it would be too long. But the major features of EFIM
- EFIM is a one phase algorithm for high-utility itemset mining
. Thus it does not generate candidates unlike UPGrowth, TwoPhase and such algorithms
- Unlike algorithms such as HUI-Miner and D2HUP, EFIM will calculate the remaining utility (called sub-tree utility) at the parent node rather that at child nodes in the depth-first search, thus it can prune more nodes in the search space. Moreover, the upper-bound in EFIM is better as locally unpromising items are removed from the sub-tree utility. In FHM and HUI-Miner, locally unpromising items cannot be removed from the remaining utility since these algorithms use a vertical database. Thus, sub-tree utility is a tighter upper bound than the remaining utility used in HUI-Miner and D2HUP and generally also better than algorithms using the TWU such as UPGrowth+, UPGrowth, IHUP and Two-Phase.
- EFIM utilizes the idea of transaction merging which is hard or very difficult to implement in HUI-Miner and D2HUP since they are respectively vertical algorithm and hyperlink-based algorithms. Transaction merging greatly improve the runtime.
- A key design decision in EFIM is to perform all operations for processing an itemset in the search space in linear time, while utility-lists in HUI-Miner can be constructed in O(n^3) time in some case.
By all these strategies and optimizations, as you can see in the experiments, EFIM can be up to 1000 times faster
than the state-of-the-art HUI-Miner, FHM, UPgrowth and D2HUP algorithms. Also, another key feature of EFIM is that it is very memory efficient. It can use up to eight times less memory
than the previous best algorithms such as HUI-Miner and FHM. This is due to the use of simple data structures.
So this is the main idea. You can get more information in the paper describing EFIM published at MICAI 2015:
Zida, S., Fournier-Viger, P., Lin, J. C.-W., Wu, C.-W., Tseng, V.S. (2015). EFIM: A Highly Efficient Algorithm for High-Utility Itemset Mining.
Proceedings of the 14th Mexican Intern. Conference on Artificial Intelligence (MICAI 2015), Springer LNAI, to appear.
Also, if you send me an e-mail, I can send you an extended version of the paper with more details about the optimizations used in EFIM, and also more experiments.
Hope that this answer your question.
Edited 9 time(s). Last edit at 12/14/2015 09:00PM by webmasterphilfv.