The Data Mining Forum
This forum is about data mining, data science and big data: algorithms, source code, datasets, implementations, optimizations, etc. You are welcome to post calls for papers, data mining job ads, links to source code of data mining algorithms, or anything else related to data mining. The forum is hosted by P. Fournier-Viger. No registration is required to use this forum!
Chess and mushroom datasets for sequential rule mining
Posted by: Ed
Date: November 07, 2017 04:23AM

Good morning. Can I run a sequential rule mining algorithm on the Chess/Mushroom data? The website says the data can be used directly in SPMF, but I don't see any separator (-1) or terminator (-2) in the files. Any help would be appreciated.



Edited 1 time(s). Last edit at 11/07/2017 06:39AM by webmasterphilfv.

Re: datasets
Date: November 07, 2017 06:38AM

Hello,

On the dataset page of the website, you should only use the datasets provided in the subsection "Datasets for Sequential Pattern Mining / Sequential Rule Mining / Sequence Prediction" for the sequential rule mining algorithms.

The Chess and Mushroom datasets are in the category for itemset mining and association rule mining. As you have observed, the files in that category do not contain the -1 and -2 markers. These datasets are not sequence databases; they are transaction databases. So it does not make sense to mine sequential rules in them.
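To make the difference concrete, here is a minimal Python sketch that checks whether a line looks like a sequence (with -1 closing each itemset and -2 closing the sequence) or like a plain transaction. The function name `looks_like_sequence_line` and the two sample lines are hypothetical and only for illustration; this is not part of SPMF.

```python
def looks_like_sequence_line(line: str) -> bool:
    """Return True if the line ends with the -2 terminator
    and contains at least one -1 itemset separator."""
    tokens = line.split()
    return len(tokens) >= 2 and tokens[-1] == "-2" and "-1" in tokens


# Hypothetical sample lines (not taken from the real datasets).
transaction_line = "1 3 4"          # transaction database: items only
sequence_line = "1 -1 3 4 -1 -2"    # sequence database: itemsets {1}, {3, 4}

print(looks_like_sequence_line(transaction_line))
print(looks_like_sequence_line(sequence_line))
```

A file whose lines all fail this check, as the Chess and Mushroom files do, is most likely a transaction database.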

However, if you really want to apply sequential rule mining to these datasets, you could still do it by first converting them to sequence databases. To do this, use the algorithm called "Convert_transaction_database_to_sequence_database" in the GUI of SPMF. The documentation explains how to use this tool:
http://www.philippe-fournier-viger.com/spmf/Converting_a_transaction_database_to_sequence_database.php
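
For readers who prefer a script, the idea behind such a conversion can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SPMF implementation: it assumes each item of a transaction becomes its own single-item itemset, in the order it appears on the line, and that lines starting with #, % or @ are comments or metadata to skip. The function names are hypothetical. Check the documentation page above for what the SPMF tool actually outputs.

```python
def transaction_to_sequence(line: str) -> str:
    """Turn one transaction line (items separated by spaces) into one
    sequence line: every item is followed by -1, and the line ends with -2.
    Assumption: each item becomes its own itemset."""
    items = line.split()
    return " ".join(f"{item} -1" for item in items) + " -2"


def convert_lines(lines):
    """Convert an iterable of transaction lines to a list of sequence lines.
    Assumption: empty lines and lines starting with #, % or @ are skipped."""
    converted = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped[0] in "#%@":
            continue
        converted.append(transaction_to_sequence(stripped))
    return converted


# Hypothetical transactions, not taken from the Chess or Mushroom files.
for sequence in convert_lines(["1 3 4", "2 3 5"]):
    print(sequence)
```

To convert a whole file, read its lines, pass them to `convert_lines`, and write the result to a new file, one sequence per line.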

However, since Chess and Mushroom are not sequence databases, this is not a very good idea. Chess is a dataset about the game of chess, and Mushroom is a dataset about mushrooms. The items in each record have no temporal order, so it does not make much sense to treat them as sequences.

Best regards,



Edited 1 time(s). Last edit at 11/07/2017 06:38AM by webmasterphilfv.



This forum is powered by Phorum and provided by P. Fournier-Viger (© 2012).