The Data Mining Forum
This forum is about data mining, data science and big data: algorithms, source code, datasets, implementations, optimizations, etc. You are welcome to post calls for papers, data mining job ads, links to source code of data mining algorithms or anything else related to data mining. The forum is hosted by P. Fournier-Viger. No registration is required to use this forum!
Temp files generated while using some of the algorithms
Date: November 05, 2021 04:59AM
Firstly, allow me to say that the work put into SPMF is amazing. I am grateful to have such an open-source tool available for research.
I would like to describe an issue I am facing while running the SPADE and PrefixSpan algorithms for sequential pattern mining.
When running the algorithms on a database, a .tmp file is created after some time has elapsed (I have not determined how long), in the directory and with the file name specified in the OutputFilePath argument. I was wondering why this happens and at which step of the algorithm the .tmp file is generated.
The issue occurred repeatedly while running both algorithms on a database with 32K sequences in total and 3K distinct items, with a mean (SD) sequence length of 490 (459) and a median itemset size of 5. The .tmp file had grown to 3 TB by the time I had to stop the algorithm.
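In case it helps to reproduce the figures above, here is a minimal sketch of how such statistics can be computed. It assumes the database is in the SPMF sequence text format (one sequence per line, each itemset terminated by -1 and the sequence by -2) and that "sequence length" means the number of itemsets per sequence; the function name sequence_stats is only illustrative, and timestamped input is not handled.

```python
import statistics


def sequence_stats(lines):
    """Summarise sequences given in SPMF text format.

    Assumption: each line is one sequence, every itemset ends with -1
    and the sequence ends with -2, e.g. "1 -1 1 2 3 -1 -2".
    """
    lengths = []        # itemsets per sequence (assumed meaning of "length")
    itemset_sizes = []  # number of items in every itemset
    items = set()       # distinct items seen in the whole database

    for line in lines:
        line = line.strip()
        # Skip blank lines and metadata/comment lines.
        if not line or line[0] in "#%@":
            continue
        n_itemsets = 0
        current_size = 0
        for token in line.split():
            if token == "-1":      # end of the current itemset
                itemset_sizes.append(current_size)
                n_itemsets += 1
                current_size = 0
            elif token == "-2":    # end of the sequence
                break
            else:
                items.add(token)
                current_size += 1
        lengths.append(n_itemsets)

    return {
        "sequences": len(lengths),
        "distinct_items": len(items),
        "mean_length": statistics.mean(lengths) if lengths else 0.0,
        # The sample standard deviation needs at least two sequences.
        "sd_length": statistics.stdev(lengths) if len(lengths) > 1 else 0.0,
        "median_itemset_size": statistics.median(itemset_sizes) if itemset_sizes else 0,
    }


if __name__ == "__main__":
    sample = [
        "1 -1 1 2 3 -1 1 3 -1 4 -1 3 6 -1 -2",
        "1 4 -1 3 -1 2 3 -1 1 5 -1 -2",
    ]
    print(sequence_stats(sample))
```

For a real database, the lines of the input file can be passed in directly, for example sequence_stats(open(path)), where path is whatever file is given to SPMF as input.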
Your help to understand why this is happening would be valuable.
Kind regards, Solomon