The Data Mining Forum
This forum is about data mining, data science and big data: algorithms, source code, datasets, implementations, optimizations, etc. You are welcome to post calls for papers, data mining job ads, links to source code of data mining algorithms, or anything else related to data mining. The forum is hosted by P. Fournier-Viger. No registration is required to use this forum!
How to generate probabilities using data mining?
Posted by: some_math_guy
Date: July 13, 2012 07:42AM

Re: How to generate probabilities using data mining?
Date: July 13, 2012 07:00PM


Welcome to the forum!

Here is my answer:

First, decision trees can produce probabilities. Each leaf of a decision tree corresponds to the set of training instances that reach that leaf. If the decision tree can separate the data exactly, every leaf contains only instances of the same class (e.g. "buy" or "not buy"), which is equivalent to a probability of 0 or 1. But in some cases a decision tree cannot perfectly separate the data given the attributes that you have, and then the probability will be strictly between 0 and 1. For example, a leaf may contain 55 % "buy" and 45 % "not buy" instances. These proportions are estimates of the class probabilities for any new instance that ends up in that leaf, and you can use them as such.
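
To make the leaf idea concrete, here is a minimal sketch in Python. It does not build a tree; it only assumes that you already know which training instances ended up in a given leaf. The function name leaf_probabilities and the two example leaves are hypothetical, chosen just for illustration.

```python
from collections import Counter

def leaf_probabilities(labels):
    """Return the class proportions among the training instances in one leaf."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}

# Hypothetical leaf holding 20 training instances: 11 "buy" and 9 "not buy".
mixed_leaf = ["buy"] * 11 + ["not buy"] * 9
# Hypothetical leaf where the tree separated the data perfectly.
pure_leaf = ["buy"] * 5

mixed = leaf_probabilities(mixed_leaf)
pure = leaf_probabilities(pure_leaf)
print(mixed)
print(pure)
```

A new instance that reaches the mixed leaf would get the proportions of that leaf as its probabilities, while one that reaches the pure leaf would get a probability of 1 for "buy".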

Second, you could consider using the "Naive Bayes classifier". This classifier is built on Bayes' theorem from the field of statistics, so its result is a probability. But you have to be careful about its underlying assumption that the variables are independent of each other given the class. Wikipedia has an article with more information about this classifier.
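
As a rough sketch of how a Naive Bayes classifier turns counts into a probability, here is a small version in plain Python for categorical attributes. The function names, the toy data and the use of add-one (Laplace) smoothing are my own assumptions for illustration, not something from a particular library.

```python
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(labels)
    # value_counts[(attribute index, class)][value] = number of occurrences
    value_counts = defaultdict(Counter)
    # values_seen[attribute index] = set of distinct values of that attribute
    values_seen = defaultdict(set)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            values_seen[i].add(value)
    return class_counts, value_counts, values_seen

def predict_proba(model, row):
    """Return P(class | row), assuming attributes are independent given the class."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total  # prior P(class)
        for i, value in enumerate(row):
            # Add-one (Laplace) smoothing avoids zero probabilities (an assumption).
            score *= (value_counts[(i, label)][value] + 1) / (n + len(values_seen[i]))
        scores[label] = score
    norm = sum(scores.values())  # normalize so the probabilities sum to 1
    return {label: s / norm for label, s in scores.items()}

# Hypothetical toy data: (age group, income) -> purchase decision.
rows = [("young", "high"), ("young", "low"), ("old", "high"),
        ("old", "low"), ("young", "high"), ("old", "low")]
labels = ["buy", "not buy", "buy", "not buy", "buy", "not buy"]

model = train_naive_bayes(rows, labels)
probs = predict_proba(model, ("young", "high"))
print(probs)
```

The multiplication inside the loop is where the independence assumption comes in: each attribute contributes its own factor, as if the other attributes did not exist. If your attributes are strongly correlated, the resulting probabilities can be too extreme.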

Those are the two techniques that I can think of right now. There might be other techniques too.



Edited 1 time(s). Last edit at 07/13/2012 07:01PM by webmasterphilfv.


This forum is powered by Phorum and provided by P. Fournier-Viger (© 2012).