shenzhen forum
The Shenzhen Message Board is a forum for everything related to Shenzhen, China: jobs, housing, apartment rentals, shopping, universities, education, events, classified ads, and paperwork.
No registration is required to post in this forum!
 

Current Page: 1 of 53
Results 1 - 30 of 1568
2 days ago
webmasterphilfv
Dear all, Just to let you know that the PDFs of articles from the UDM 2018 workshop on utility-driven mining are online at: http://philippe-fournier-viger.com/utility_mining_workshop_2018/program.php Best regards, Philippe
Forum: The Data Mining / Big Data Forum
2 days ago
webmasterphilfv
Hi, Yes, I think you are right. There seems to be a bug in the implementation. Thanks for reporting it. I should release a new version of SPMF in about a week and a half because I will have a week of holiday. I will then fix the bug, and also add several new algorithms related to high utility itemset mining that some people have sent to me recently. By the way, I will also add your name to th
Forum: The Data Mining / Big Data Forum
9 days ago
webmasterphilfv
Thanks Dang, I see. An XML format. Personally, I don't like XML-based formats too much. They waste a lot of space with all these tags. The text-based format of SPMF already takes a lot of space because it is a text file. This format would maybe make the output files 10 times larger or more. Just my opinion. But I understand that it can be useful for interoperability with other software. Is it w
Forum: The Data Mining / Big Data Forum
9 days ago
webmasterphilfv
Looks like an interesting concept. Wish you good luck with your product. Philippe
Forum: The Data Mining / Big Data Forum
9 days ago
webmasterphilfv
IEEE Big Data 2018 Call for Workshop Papers & Posters 2018 IEEE International Conference on Big Data (BigData 2018) http://cci.drexel.edu/bigdata/bigdata2018/index.html Dec 10-13 2018, Seattle, WA, USA The IEEE Big Data 2018 has received more than 600 full papers in the main conference and industry and government program. If you miss the submission deadline, there are still chances f
Forum: The Data Mining / Big Data Forum
11 days ago
webmasterphilfv
Hi, Thanks for using SPMF. I do not know what the PMML format is. But if you are comfortable with Java, you could modify the code that writes the rules to the file. This should not be hard. In SPMF, each algorithm is in a separate package. So you could first find the code of the algorithm that you want to modify and then change the code. But what is PMML? Can you give me a link to a websit
Forum: The Data Mining / Big Data Forum
13 days ago
webmasterphilfv
Hi, There are a lot of possible topics. You can choose to work on something more fundamental like algorithm design or something more applied such as how to best solve a given applied problem. A good way of choosing a research problem is to look at some recent papers and find something that you are interested in. Personally, I am quite interested in pattern mining problems and algorithm desi
Forum: The Data Mining / Big Data Forum
23 days ago
webmasterphilfv
Hi all, It is my pleasure to announce that my data mining blog is now also available in Chinese: The data mining blog (Chinese). About every week some articles will be translated to Chinese and put on this Chinese version of the blog. I will not translate all the content of my English blog but the most important posts will be translated. Besides, some guest authors may also write blog p
Forum: The Data Mining / Big Data Forum
25 days ago
webmasterphilfv
You may export the data from the database to a text file in the proper format and then apply Apriori to the text file to obtain the result. But it depends on your implementation of Apriori. If you are using the implementation from the SPMF software, then you should read the documentation to see which format is required as input. Best
Forum: The Data Mining / Big Data Forum
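The export step suggested in the reply above could look roughly like the sketch below. Everything specific in it is an assumption made for illustration: the table `purchases` with columns `transaction_id` and `item_id` is hypothetical, and the output layout (one transaction per line, integer items sorted and separated by single spaces) is what a typical Apriori implementation is assumed to expect. As the reply says, check the documentation of the implementation you use before relying on it.

```python
import sqlite3
from collections import defaultdict


def export_transactions(conn, out_path):
    """Write one transaction per line as space-separated, sorted item ids.

    Assumes a hypothetical table purchases(transaction_id, item_id).
    Returns the number of transactions written.
    """
    rows = conn.execute("SELECT transaction_id, item_id FROM purchases")
    transactions = defaultdict(set)  # a set drops duplicate items
    for transaction_id, item_id in rows:
        transactions[transaction_id].add(item_id)

    with open(out_path, "w", encoding="utf-8") as out:
        for transaction_id in sorted(transactions):
            items = sorted(transactions[transaction_id])
            out.write(" ".join(str(item) for item in items) + "\n")
    return len(transactions)
```

The resulting text file can then be passed as the input file of the Apriori implementation.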
28 days ago
webmasterphilfv
If your data has time information then yes.
Forum: The Data Mining / Big Data Forum
8 weeks ago
webmasterphilfv
Hello, Thanks for reading our papers ;-) It is a little bit late, so I will answer the easy questions first, and answer other questions maybe tomorrow. Philippe
> after I read the article of EFIM and I'm lost at
> certain page.
> correct me if my understanding is not correct.
> The FHM(you created) is able to
> accelerate(improve) the performance of MUI-MINER,
>
Forum: The Data Mining / Big Data Forum
2 months ago
webmasterphilfv
Hello all, As many of you know, I have a data mining blog that talks about various topics related to data mining and research: http://data-mining.philippe-fournier-viger.com/ Recently, I have tried to update the blog once every week. I have actually already prepared weekly blog posts until the end of August. If you are interested, you can have a look ;-) Philippe
Forum: The Data Mining / Big Data Forum
2 months ago
webmasterphilfv
The most likely reason is that you need to adjust the parameters. It is possible that in your training data, there is no sequence that matches the sequence that you want to predict. Thus, no prediction is given. If you adjust the parameters of the prediction model, I think you will get some predictions ;-) By the way, sorry for the delay in answering. I have been a bit busy during the last few da
Forum: The Data Mining / Big Data Forum
2 months ago
webmasterphilfv
Hello,
Pradivana Wrote:
-------------------------------------------------------
> Hi, i did some research using spmf sequence
> pattern library and i would like to know is there
> anyway to show the accuracy from spmf library?
> especially for Markov, CPT and TDAG algorithm
>
> I've read the documentation and i think i miss
> something, please help, and i'm sorry f
Forum: The Data Mining / Big Data Forum
2 months ago
webmasterphilfv
Hi, Thanks for using SPMF! I checked the paper just now, and I think that it is not explained clearly what Checking_and_removing_item() is supposed to do. Thus, it would be hard to implement this function. If you want to implement it, I think that you should contact the authors of the paper to ask for more details about what this function is supposed to do. But with the current paper, I think we don
Forum: The Data Mining / Big Data Forum
2 months ago
webmasterphilfv
Hello, In algorithms like HUI-Miner, the items are sorted according to a total order. What is a total order? It means that there is some order between the items. For example, it could be the alphabetical order. According to the alphabetical order, an item "a" must be processed before an item "b", and "b" must be processed before an item "c". The al
Forum: The Data Mining / Big Data Forum
2 months ago
webmasterphilfv
Hello, Sure, if you want to discuss this in the forum, you can share details. Is it improving the performance by a great amount? If so, your improvement could be integrated in the SPMF library and you could become a contributor. Best regards, Philippe
Forum: The Data Mining / Big Data Forum
2 months ago
webmasterphilfv
Hi, If you want to use sequential patterns to generate rules, then you could check the RuleGen algorithm from SPMF, which allows you to do that. But it would need to be modified because this algorithm does not consider timestamps. I mean you could maybe draw inspiration from that... Another algorithm in SPMF that finds rules with a window constraint is TRuleGrowth. But it does not consider t
Forum: The Data Mining / Big Data Forum
2 months ago
webmasterphilfv
Hi, In HUI-Miner, you should only combine two itemsets if they are identical except for one item. Thus, you should not combine the itemset d,f,g with the itemset f,g,b because they have two different items (d and b). This is one part of the problem. Best regards
Forum: The Data Mining / Big Data Forum
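The join condition from the reply above can be sketched in a few lines. This is only an illustration, not code from SPMF: it assumes each itemset is a tuple whose items are already sorted by the total order used by the algorithm, and it reads "identical except for one item" as "same prefix, different last item". The function names `can_join` and `join` are made up for this example.

```python
def can_join(px, py):
    """True if both itemsets have the same prefix and differ only in their last item.

    Assumes px and py are tuples sorted by the algorithm's total order.
    """
    return (
        len(px) == len(py)
        and len(px) > 0
        and px[:-1] == py[:-1]
        and px[-1] != py[-1]
    )


def join(px, py):
    """Combine two joinable itemsets into the itemset prefix + last(px) + last(py)."""
    if not can_join(px, py):
        raise ValueError("itemsets do not share the same prefix")
    return px + (py[-1],)
```

With this check, `("d", "f", "g")` and `("f", "g", "b")` are rejected because their prefixes `("d", "f")` and `("f", "g")` differ, while `("d", "f", "g")` and `("d", "f", "h")` can be combined.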
2 months ago
webmasterphilfv
Hello, If you clicked on "GENERATE_DATASET.bat" to generate the dataset, it should generate a sequence database, where each line is a sequence. But I have not used that IBM generator for a long time, so I do not remember how it works. There are some database generators that are perhaps easier to use in my SPMF library. Best regards,
Forum: The Data Mining / Big Data Forum
3 months ago
webmasterphilfv
In frequent subgraph mining, you typically have edges and vertices that have names. For example, you could have a graph about a water molecule, and you would have two nodes that have the same label "Hydrogen" and one node with the label "Oxygen":
Hydrogen ---- Oxygen ---- Hydrogen
Now, when you check if two graphs are isomorphic, yes, you need to check that the structure
Forum: The Data Mining / Big Data Forum
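To make the water molecule example concrete, here is a brute-force sketch of an isomorphism check that compares vertex labels as well as structure. It is an illustration only, not the method used by any particular subgraph mining algorithm: it tries every vertex mapping, so it is only usable on very small graphs, and it assumes undirected graphs with labels on vertices only (edge labels are ignored). The function name and the data layout are made up for this example.

```python
from itertools import permutations


def are_isomorphic(labels_a, edges_a, labels_b, edges_b):
    """Check labeled graph isomorphism by trying every vertex mapping.

    labels_x: list of vertex labels, indexed by vertex id 0..n-1.
    edges_x: iterable of undirected (u, v) pairs.
    """
    n = len(labels_a)
    if n != len(labels_b):
        return False
    ea = {frozenset(edge) for edge in edges_a}
    eb = {frozenset(edge) for edge in edges_b}
    if len(ea) != len(eb) or sorted(labels_a) != sorted(labels_b):
        return False

    for mapping in permutations(range(n)):
        # The mapping must preserve vertex labels ...
        if any(labels_a[v] != labels_b[mapping[v]] for v in range(n)):
            continue
        # ... and must send the edge set of A exactly onto the edge set of B.
        mapped = {frozenset(mapping[v] for v in edge) for edge in ea}
        if mapped == eb:
            return True
    return False
```

For instance, the graph Hydrogen-Oxygen-Hydrogen matches any renumbering of itself, but not a path with the same three labels where a Hydrogen vertex sits in the middle, even though both are paths of three vertices.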
3 months ago
webmasterphilfv
Those are mostly the same thing. In simple words, two graphs are isomorphic if we can map the vertices and edges of one graph onto those of the other so that they are equivalent. Subgraph isomorphism checking is the same thing. But since you add the word "subgraph" it means that you are comparing subgraphs of a graph to check if these subgraphs are equivalent. Yes, the idea of graph isomorphism is
Forum: The Data Mining / Big Data Forum
3 months ago
webmasterphilfv
Hello, Yes, I call this a class sequential rule. SPMF has an algorithm for this: TopSeqClassRules, which mines the top-k class sequential rules. It lets you select {i} to find the k most frequent sequential rules of the form X --> {i}. This algorithm is similar to RuleGrowth but modified to do that. Best Philippe
Forum: The Data Mining / Big Data Forum
3 months ago
webmasterphilfv
Hello, Sorry for the delay in answering. I saw your e-mail but was actually too busy in the last few days. I will provide some answers/opinions/suggestions below. How to represent the data is always a good question because depending on how you represent the data, you may obtain different results using a data mining algorithm. A possibility could be that each sequence represents a sequence o
Forum: The Data Mining / Big Data Forum
3 months ago
webmasterphilfv
Hi Victor, I see. There is no implementation in SPMF that does exactly that. It could be done, I think, but it would require some programming to modify the algorithm and it can be more or less complicated. If one modifies it, then one would need to check that the algorithm remains correct, and sometimes combining two ideas results in an algorithm that cannot find all the patter
Forum: The Data Mining / Big Data Forum
3 months ago
webmasterphilfv
Hi,
1) The likely reason is that the input format is not correct. At the end of each itemset, there should be a -1 as a separator. For example, the first sequence should be in this format:
<10> 42 45 -1 <11> 31 42 45 -1 <20> 18 23 31 42 45 -1 <36> 48 -1 -2
It is the same for the other sequences.
2) Yes, if you have a pattern: <0> 1 2 <1> 3
Forum: The Data Mining / Big Data Forum
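The format described in the reply above (a `<timestamp>` marker, the items of the itemset, `-1` to close the itemset, and `-2` to close the sequence) can be checked with a small parser. This is a sketch written for illustration, not the SPMF parser (SPMF itself is written in Java); the function name `parse_timed_sequence` and the strict error handling are assumptions.

```python
import re

_TIMESTAMP = re.compile(r"^<(\d+)>$")


def parse_timed_sequence(line):
    """Parse one line into a list of (timestamp, [items]) pairs.

    Raises ValueError if an itemset is not closed by -1 or has no timestamp.
    """
    sequence = []
    timestamp = None
    items = []
    for token in line.split():
        match = _TIMESTAMP.match(token)
        if match:
            if timestamp is not None or items:
                raise ValueError("previous itemset is missing its -1 terminator")
            timestamp = int(match.group(1))
        elif token == "-1":
            if timestamp is None:
                raise ValueError("itemset without a <timestamp> marker")
            sequence.append((timestamp, items))
            timestamp, items = None, []
        elif token == "-2":
            break  # end of the sequence
        else:
            items.append(int(token))
    if timestamp is not None or items:
        raise ValueError("last itemset is missing its -1 terminator")
    return sequence
```

Running it on the corrected line from the reply yields four itemsets with timestamps 10, 11, 20 and 36, while a line where a `-1` is missing raises a `ValueError` instead of being silently misread.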
3 months ago
webmasterphilfv
Great. I will fix the error in the documentation. Thanks for reporting it.
Forum: The Data Mining / Big Data Forum
3 months ago
webmasterphilfv
Hi Victor, I will answer your question below.
> So how large this large should be set to make it
> exact not approximate algorithm? Does it depends
> on data size?
The problem with the maximal time interval constraint is that if you apply this constraint when doing closed sequential pattern mining, you may miss some patterns. If you don't care about missing a few patterns
Forum: The Data Mining / Big Data Forum
3 months ago
webmasterphilfv
Dear Victor,
> My current way of defining itemset is that for a
> sequence of events that a customer took in the
> history, an itemset is the events happened in the
> same day. So in the final frequent sequence, I am
> able to know this sequence covers how many days.
> But I also want to reserve the original order of
> events in an itemset. So what if I reserve the &
Forum: The Data Mining / Big Data Forum

This forum is powered by Phorum and provided by P. Fournier-Viger (© 2012).