The Data Mining Forum
This forum is about data mining, data science and big data: algorithms, source code, datasets, implementations, optimizations, etc. You are welcome to post calls for papers, data mining job ads, links to source code of data mining algorithms or anything else related to data mining. The forum is hosted by P. Fournier-Viger. No registration is required to use this forum!
Dataset For VMSP Algorithm
Date: September 28, 2018 11:43AM
I want to use the VMSP algorithm for web clickstream analysis. From what I have read, VMSP accepts data in SPMF format (a sequence of numbers, with each number representing an event). How can I get a dataset in SPMF format that also gives me the names of the pages the user visits? I might be wrong about this, so please correct me if I am. Can someone guide me on this?
Re: Dataset For VMSP Algorithm
Date: September 28, 2018 02:41PM
Thanks for using SPMF. There are some datasets on the SPMF website, but you are right that they do not provide the names of the webpages.
The original FIFA dataset contained this information: http://ita.ee.lbl.gov/html/contrib/WorldCup.html
But it is not in SPMF format. So you could convert it to SPMF format yourself and keep the page labels. That would be one way to obtain click streams with the page names.
The FIFA dataset on the SPMF website was converted from this but my student did not keep the page labels and that was several years ago ;-)
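A conversion that keeps the page labels could look roughly like the sketch below. It assumes the raw log has already been parsed and grouped into one list of page names per user session (that parsing step is not shown and depends on the log format). The function name `sessions_to_spmf` and the sample page names are hypothetical. Each page gets a positive integer ID, each click is written as its own itemset ending with -1, and each sequence ends with -2, which is the usual SPMF sequence layout. The returned ID-to-page table lets you translate the patterns found by VMSP back to page names.

```python
def sessions_to_spmf(sessions):
    """Convert sessions of page names into SPMF sequence lines.

    Returns (lines, id_to_page): the SPMF-formatted lines and a table
    mapping each integer item ID back to its page name.
    """
    page_to_id = {}
    lines = []
    for session in sessions:
        if not session:
            continue  # skip empty sessions, they carry no events
        parts = []
        for page in session:
            # Assign the next positive integer the first time a page is seen.
            item_id = page_to_id.setdefault(page, len(page_to_id) + 1)
            # One click = one itemset, terminated by -1.
            parts.append(f"{item_id} -1")
        # -2 marks the end of the sequence.
        lines.append(" ".join(parts) + " -2")
    id_to_page = {item_id: page for page, item_id in page_to_id.items()}
    return lines, id_to_page


if __name__ == "__main__":
    # Hypothetical sessions, for illustration only.
    sample = [
        ["/index.html", "/teams.html", "/index.html"],
        ["/teams.html", "/tickets.html"],
    ]
    spmf_lines, labels = sessions_to_spmf(sample)
    print("\n".join(spmf_lines))
    print(labels)
```

Saving the lines to a text file gives the input for VMSP, and saving the label table alongside it avoids losing the page names as happened with the FIFA conversion.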
Otherwise, another way is to look for other web server logs using a search engine like Google or Bing. I think they are not hard to find.