The Data Mining Forum
This forum is about data mining, data science and big data: algorithms, source code, datasets, implementations, optimizations, etc. You are welcome to post calls for papers, data mining job ads, links to source code of data mining algorithms or anything else related to data mining. The forum is hosted by P. Fournier-Viger. No registration is required to use this forum!
Implemented algorithm for timestamped web logs?
Date: February 03, 2020 07:42AM
I would like to know whether there is an implemented algorithm that can handle timestamped data.
I have web logs in the format:
time; ip source; ip destination; protocol; source port; source destination; label
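For readers who want to experiment with this format, here is a minimal sketch of how one such line could be parsed. It rests on several assumptions that the post does not confirm: the time is a number of seconds, the sixth field (written "source destination" above) is the destination port, and the label is a plain string. The names LogRecord and parse_line are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LogRecord:
    time: float      # assumed: numeric timestamp in seconds
    src_ip: str
    dst_ip: str
    protocol: str
    src_port: int
    dst_port: int    # assumed: the sixth field is the destination port
    label: str       # assumed: free-text label such as "attack" or "normal"


def parse_line(line: str) -> LogRecord:
    """Split one semicolon-separated log line into a LogRecord."""
    fields = [field.strip() for field in line.split(";")]
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}: {line!r}")
    time, src_ip, dst_ip, protocol, src_port, dst_port, label = fields
    return LogRecord(
        time=float(time),
        src_ip=src_ip,
        dst_ip=dst_ip,
        protocol=protocol,
        src_port=int(src_port),
        dst_port=int(dst_port),
        label=label,
    )


# Made-up sample line, only to show the call.
record = parse_line("1.5; 10.0.0.1; 10.0.0.2; TCP; 5050; 80; attack")
```

If the real logs use a date string instead of a number for the time, the float conversion would have to be replaced by a proper date parser.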
My goal is to detect whether a DDoS attack is taking place. I have been able to process my data as sequences, but what I really want is to extract the most frequent TIMESTAMPED chronicles (not just ordered sequences).
For example, I want the result to show me that event B happened between 1.3 and 2.6 seconds after event A, and that this pattern is the most frequent in the database. The labels say whether a log entry is an attack or not (based on a short time interval between 2 logs coming from the same IP).
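To make the kind of output described here concrete, below is a naive sketch that collects the delays between every pair of event types occurring within a time window and reports the most frequent pair together with its observed delay range. This is not a chronicle mining algorithm and not part of SPMF; it is only a brute-force illustration. The function names, the max_gap parameter and the (time, event_type) representation are assumptions.

```python
from collections import defaultdict


def pairwise_delays(events, max_gap):
    """Collect delays between ordered pairs of event types.

    events: iterable of (time_in_seconds, event_type) tuples (assumed layout).
    max_gap: only pairs at most this many seconds apart are recorded.
    """
    events = sorted(events)
    delays = defaultdict(list)
    for i in range(len(events)):
        t_i, first = events[i]
        for j in range(i + 1, len(events)):
            t_j, second = events[j]
            gap = t_j - t_i
            if gap > max_gap:
                break  # events are sorted, so later ones are even further away
            delays[(first, second)].append(gap)
    return delays


def most_frequent_pair(delays):
    """Return (pair, count, min_delay, max_delay), or None if nothing was found."""
    if not delays:
        return None
    pair = max(delays, key=lambda p: len(delays[p]))
    gaps = delays[pair]
    return pair, len(gaps), min(gaps), max(gaps)


# Made-up events, only to show the call.
sample = [(0.0, "A"), (1.5, "B"), (10.0, "A"), (12.0, "B"), (20.0, "A"), (22.5, "B")]
result = most_frequent_pair(pairwise_delays(sample, max_gap=3.0))
```

On the sample above, the result says that B followed A three times, with delays between 1.5 and 2.5 seconds. The nested loop is quadratic in the worst case, so this is only practical for small logs or small windows.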
I would really appreciate your help!
Re: Implemented algorithm for timestamped web logs?
Date: February 07, 2020 08:55PM
That is an interesting problem. In SPMF, there are some algorithms that deal with timestamps, such as the Hirate and Yamana algorithm, but it is very strict in how it handles timestamps, so it is likely not what you want.
I will soon release a new version of SPMF (perhaps in one week) with episode mining algorithms that can deal with timestamps. Maybe they could be used.
Otherwise, SPMF may have a few other algorithms that could help, but not many of them handle timestamps.
Maybe there are some more appropriate algorithms in the literature.