No registration is required to post in this forum!

Results 1 - 30 of 1626

5 days ago

webmasterphilfv

You may want to check this:
"Comment spam detection by sequence mining"
R Kant, SH Sengamedu, KS Kumar (2012)
They have applied sequential pattern mining.

Forum: The Data Mining / Big Data Forum

9 days ago

webmasterphilfv

List has been updated again with a few more conferences.

Forum: The Data Mining / Big Data Forum

10 days ago

webmasterphilfv

The diagram looks quite nice. It seems like a good idea for path visualization. But I also don't know how to generate such a visualization. If you find something good, please share it. Maybe other people know?

Forum: The Data Mining / Big Data Forum

11 days ago

webmasterphilfv

You can also check Kaggle. It has a lot of data.

Forum: The Data Mining / Big Data Forum

11 days ago

webmasterphilfv

Hello,
Ruhallah Ahmadian Wrote:
-------------------------------------------------------
> I'm looking for these algorithm, are there these
> in SPMF ?
> Apriori with Transaction reduction
No.
> Apriori with Partitioning
No.
> Apriori with Sampling
No.
> Apriori with Dynamic itemset counting
No.
> Vertical Data Format
Yes, several algorithms.

Forum: The Data Mining / Big Data Forum

12 days ago

webmasterphilfv

tianikowa Wrote:
-------------------------------------------------------
> Thank you for your attention, Yes, I'll definitely
> be looking for spmf.
> But just one question of "get the initial
> population P with the proposed problem-specific
> initialize strategy" :
>
> What is the concept of the following sentence?
>
> "the child individuals

Forum: The Data Mining / Big Data Forum

12 days ago

webmasterphilfv

Hi,
Thanks for your interest in the software. The main reason why that algorithm is not implemented is that my time is limited and there are hundreds of new algorithms every year. Thus, I have to choose carefully which algorithms I will implement. I usually pick algorithms that I am interested in. But often, some people will also contribute some code to the software. Then, I don't need t

Forum: The Data Mining / Big Data Forum

13 days ago

webmasterphilfv

Hi,
I am not the author of this paper so I do not have examples for these algorithms, and I do not have time to read that paper, understand it and write an example for that. But if you are interested, I can tell you that in the next version of SPMF, we will release the code for another paper:
Wei Song, Chaomin Huang. Mining High Utility Itemsets Using Bio-Inspired Algorithms: A Diverse Optim

Forum: The Data Mining / Big Data Forum

13 days ago

webmasterphilfv

Hi, my e-mail is philfv8 AT yahoo.com. If you have a question about how to use SPMF or how the code of SPMF works, you can ask me. But I cannot do the programming for you or help you to debug your code. If you send me a lot of code, I probably won't have time to read it. But if you ask me a specific question about how to use SPMF, I can give you some advice.
Best regards,

Forum: The Data Mining / Big Data Forum

17 days ago

webmasterphilfv

10. Re: Density

I did not read that paper. But in general, density means the number of data points in some part of the space. If some part of the space has many data points, then it is said to be dense. If some part of the space does not have many data points, then it is not dense (it is sparse).
A change in density would mean that the number of data points in some part of the space is changing.
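
To make the general idea concrete, here is a minimal sketch in Python. It assumes 2D points and a fixed-size grid, which is only one possible way to define "some part of the space" and is not necessarily the definition used in the paper discussed. The function name `cell_counts` and the sample points are made up for illustration.

```python
from collections import Counter

def cell_counts(points, cell_size):
    """Count how many 2D points fall into each square grid cell.

    A cell is identified by its integer (column, row) index.
    This grid-based notion of density is an assumption made for illustration.
    """
    counts = Counter()
    for x, y in points:
        cell = (int(x // cell_size), int(y // cell_size))
        counts[cell] += 1
    return counts

# Hypothetical data: three points close together and one isolated point.
points = [(0.1, 0.2), (0.3, 0.4), (0.2, 0.9), (2.5, 2.5)]
counts = cell_counts(points, cell_size=1.0)
for cell, n in sorted(counts.items()):
    print(cell, n)
```

With this definition, a cell holding many points is dense and a cell holding few points is sparse. Computing the counts at two different times and comparing them cell by cell would show a change in density.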

Forum: The Data Mining / Big Data Forum

19 days ago

webmasterphilfv

Hi,
You can send me an e-mail at philfv8 AT yahoo DOT com and I will send the snake dataset to you.
Best,

Forum: The Data Mining / Big Data Forum

24 days ago

webmasterphilfv

Do you really mean low confidence or low support?
The problem with finding low-support rules is that there may be a very large number of rules if you decrease the support threshold. But if you want to make it faster and reduce the number of rules, you can also apply some constraints. For example, if you apply FPGrowth_Association_rules in SPMF, you can set a maximum antecedent length and a maximum cons

Forum: The Data Mining / Big Data Forum

24 days ago

webmasterphilfv

Rahul Wrote:
-------------------------------------------------------
> What we mean by the accuracy is 0.96 ± 0.14 .I
> have seen it in many papers and it is written 0.14
> is standard deviation.
> Even some time it is written in table delay in
> time like
> What is the relation between standard deviation
> and other evaluation metrics why we put plus minus
> th

Forum: The Data Mining / Big Data Forum

26 days ago

webmasterphilfv

So basically, you have items and transactions, so you just need to convert it to SPMF format.
For example, a database like this:
TID A B C D E
T1 1 1 1 0 0
T2 1 1 1 1 1
T3 1 0 1 1 0
T4 1 0 1 1 1
T5 1 1 1 1 0
would look like this in SPMF format:
1 2 3
1 2 3 4 5
1 3 4
1 3 4 5
1 2 3 4
where 1 = A, 2 = B, 3 = C, 4 = D, 5 = E and where each line is a transaction.
The
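
The conversion described in this post can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `binary_table_to_spmf` is hypothetical (it is not part of SPMF), and the table is assumed to be already loaded as a header list plus rows of 0/1 values.

```python
def binary_table_to_spmf(header, rows):
    """Convert a 0/1 table into transaction lines of integer item ids.

    header: list of item names, e.g. ["A", "B", "C", "D", "E"]
    rows:   list of lists of 0/1 flags, one list per transaction
    Returns the name-to-id mapping and the list of output lines.
    """
    # Map each item name to a positive integer id (A -> 1, B -> 2, ...).
    item_ids = {name: i + 1 for i, name in enumerate(header)}
    lines = []
    for flags in rows:
        # Keep only the items whose flag is 1 in this transaction.
        items = [str(item_ids[name]) for name, flag in zip(header, flags) if flag == 1]
        lines.append(" ".join(items))
    return item_ids, lines

# The example database from the post (T1 to T5).
header = ["A", "B", "C", "D", "E"]
rows = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [1, 0, 1, 1, 0],
    [1, 0, 1, 1, 1],
    [1, 1, 1, 1, 0],
]
item_ids, lines = binary_table_to_spmf(header, rows)
print("\n".join(lines))
```

Writing `lines` to a text file, one transaction per line, gives the input shown in the post.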

Forum: The Data Mining / Big Data Forum

26 days ago

webmasterphilfv

15. Re: big data

I see. Actually, the challenge of privacy is not just for large datasets. As long as you have to transfer data over a network or between different organizations, the challenge of privacy will occur. I am not sure about other ideas. Those are all the ideas that I have now ;-) Maybe others have some other ideas to share.

Forum: The Data Mining / Big Data Forum

27 days ago

webmasterphilfv

I am not sure what you mean by "headers included". But you can check the documentation of SPMF. All the features of SPMF are described there. If it is not there, then it is not implemented.

Forum: The Data Mining / Big Data Forum

27 days ago

webmasterphilfv

Hi,
Yes, the format used by SPMF is explained in the documentation of SPMF for each algorithm offered in SPMF. You can go to "Documentation" on the website, click on an algorithm, and then you will see how to run the algorithm and the input format.
Because so many formats exist, I unfortunately cannot support all of them. The simplest is to write a short pro

Forum: The Data Mining / Big Data Forum

27 days ago

webmasterphilfv

18. Re: big data

Hello again,
Specifically for itemset mining, there are several challenges related to big data:
- design parallel algorithms that run on big data architectures like Hadoop, Spark, etc. A few exist already, but perhaps they can be improved, or you can design algorithms for other pattern mining problems or variations of the itemset mining problem.
- you can design algorithms for min

Forum: The Data Mining / Big Data Forum

28 days ago

webmasterphilfv

19. Re: big data

Hi,
About Big Data, some researchers say that there are five Vs of Big Data that are important: Volume, Velocity, Variety, etc.
But besides that, I would like to point out that some problems are easy even if we have big data, and some problems are difficult even if you don't have a lot of data. Actually, the difficulty of a computing problem is sometimes more influenced by the parameters

Forum: The Data Mining / Big Data Forum

29 days ago

webmasterphilfv

The aim of this special issue on Advances on Managing and Mining Large-Scale Time Dependent Graphs (TD-LSG) is to bring together active scholars and practitioners of dynamic graphs. Graph models and algorithms are ubiquitous in a large number of application domains, ranging from transportation to social networks, semantic web, or data mining. However, many applications require graph models that ar

Forum: The Data Mining / Big Data Forum

30 days ago

webmasterphilfv

Hi David,
Glad it is helpful. That is a good question. I don't know much about bioinformatics so I cannot really say something about that. But maybe some people have used frequent itemsets in bioinformatics before... You could have a look at that perhaps. In my opinion, it could perhaps be used to find some correlations in the data that could provide insights for example about some related gene

Forum: The Data Mining / Big Data Forum

30 days ago

webmasterphilfv

Hello,
Yes that is a typo. Thanks for reporting it. I will fix it in a few minutes ;-)
Best regards,
Philippe

Forum: The Data Mining / Big Data Forum

4 weeks ago

webmasterphilfv

You can have a look at Efficient Java Matrix Library (EJML). I did not use it. But I had a quick look at the website. Maybe it can do what you want.

Forum: The Data Mining / Big Data Forum

4 weeks ago

webmasterphilfv

Hello,
There are quite a few algorithms for subgraph mining. I think you can find a good answer by searching on a website like Google Scholar and sorting by year to find the latest algorithms for this problem.

Forum: The Data Mining / Big Data Forum

4 weeks ago

webmasterphilfv

Hello,
A null pointer exception is quite general. Which algorithm are you running, and with which file? Can you give the details and also the full error?
One possible reason is that the files have not been installed properly into NetBeans. Another possibility is that you have tried to load some input file but you did not place the file in the correct directory. Thus the software cannot

Forum: The Data Mining / Big Data Forum

4 weeks ago

webmasterphilfv

Yes, most likely. The challenge is to develop a parallel version of an algorithm and adapt it for MapReduce or another big data framework like Spark. You could start from a non-parallel algorithm and try to transform it into a parallel algorithm. But not all algorithms are easy to transform into a parallel algorithm. Or you could design something new. But there are already some algorithms for big data.

Forum: The Data Mining / Big Data Forum

4 weeks ago

webmasterphilfv

Hi, thanks for following the forum. :-) I think you mean "unstructured data". For example, a text document or a tweet does not have a clear structure. In that case, yes, we could do some pattern mining.
For example, you can consider the sentences of a text as sequences of symbols (items), and then apply sequential pattern mining to find subsequences of words that appear frequently in tweets
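
As a rough illustration of this idea, the Python sketch below treats each sentence as a sequence of words and counts ordered word pairs (the second word appears somewhere after the first, not necessarily next to it). This covers only patterns of length two and is a simplification, not one of the sequential pattern mining algorithms offered in SPMF. The function name `frequent_word_pairs` and the sample sentences are made up.

```python
from collections import Counter
from itertools import combinations

def frequent_word_pairs(sentences, min_support):
    """Find ordered word pairs that occur in at least min_support sentences.

    A pair (a, b) occurs in a sentence if a appears before b,
    with any number of words in between.
    """
    support = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        # combinations() keeps the original order, so each pair is
        # (earlier word, later word). The set avoids counting a pair
        # twice within the same sentence.
        pairs = set(combinations(words, 2))
        support.update(pairs)
    return {pair: n for pair, n in support.items() if n >= min_support}

# Hypothetical short texts standing in for tweets.
sentences = [
    "big data is fun",
    "big data is hard",
    "data mining is fun",
]
result = frequent_word_pairs(sentences, min_support=2)
for pair, n in sorted(result.items()):
    print(pair, n)
```

A full sequential pattern mining algorithm would extend this to longer subsequences and avoid enumerating every pair explicitly.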

Forum: The Data Mining / Big Data Forum

4 weeks ago

webmasterphilfv

Hi David,
If you have high dimensional data (many attributes), then itemset mining can still be applied because each itemset will generally only involve a few attributes. Thus, even if you have many attributes, by applying frequent itemset mining, you will only find the small sets of values that appear often together using a subset of the attributes.
Now, if you have high dimensional data, a

Forum: The Data Mining / Big Data Forum

5 weeks ago

webmasterphilfv

13th International Conference on Practical Applications of Computational Biology & Bioinformatics
University of Salamanca
Ávila (Spain) | 26th - 28th June, 2019
SCOPE
The success of Bioinformatics in recent years has been prompted by research in Molecular Biology and Molecular Medicine in several initiatives. These initiatives gave rise to an exponential increase in the volum

Forum: The Data Mining / Big Data Forum

5 weeks ago

webmasterphilfv

BMDA 2019 Call for papers
-------------------------
2nd International Workshop on Big Mobility Data Analytics (BMDA)
co-located with EDBT/ICDT Conference, March 26, 2019 (Lisbon, Portugal)
http://www.datastories.org/bmda19/
*Submissions due December 15, 2018 (11:59pm PDT)*
Workshop Description
====================
From spatial to spatio-temporal and, then, to mobility data. The n

Forum: The Data Mining / Big Data Forum
