How to generate probabilities using data mining?
Posted by: some_math_guy
Date: July 13, 2012 07:42AM

Re: How to generate probabilities using data mining?
Date: July 13, 2012 07:00PM


Welcome to the forum!

Here is my answer:

First, decision trees produce probabilities. Each leaf of a decision tree corresponds to the set of training instances that the tree has routed to that leaf. If the decision tree can separate the data exactly, every leaf contains only instances belonging to the same class (e.g. "buy" or "not buy"). This is equivalent to a probability of 0 or 1. But there are also cases where a decision tree cannot perfectly separate the data given the attributes that you have. If this happens, the probability will be strictly between 0 and 1. For example, a leaf may contain 55 % "buy" and 45 % "not buy" instances. These proportions can be used directly as estimated class probabilities for any new instance that falls into that leaf.
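To make the leaf idea concrete, here is a minimal Python sketch. It does not build a tree; it only assumes you already know which training labels ended up in a given leaf. The function name leaf_probabilities and the example leaves are hypothetical, chosen just for illustration.

```python
from collections import Counter


def leaf_probabilities(leaf_labels):
    """Estimate class probabilities as class frequencies within one leaf."""
    if not leaf_labels:
        raise ValueError("a leaf must contain at least one training instance")
    counts = Counter(leaf_labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Hypothetical leaf that the tree could not separate perfectly:
# 55 "buy" and 45 "not buy" training instances.
mixed_leaf = ["buy"] * 55 + ["not buy"] * 45

# Hypothetical pure leaf: every instance belongs to the same class.
pure_leaf = ["buy"] * 20

mixed = leaf_probabilities(mixed_leaf)
pure = leaf_probabilities(pure_leaf)

print(mixed)
print(pure)
```

A new instance that reaches the mixed leaf would get an estimated probability of 0.55 for "buy", while one that reaches the pure leaf would get 1.0.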

Second, you could consider using the naive Bayes classifier. This classifier is built on Bayes' theorem from the field of statistics, so its result is a probability. But you have to be careful about its underlying assumption: the attributes are treated as conditionally independent of each other given the class. You can check Wikipedia for more information about this classifier.
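As a rough sketch of how naive Bayes turns counts into a probability, here is a small pure-Python version for categorical attributes. The function names, the toy customer data, and the use of add-one (Laplace) smoothing are my own assumptions for illustration; a real library implementation will differ in the details.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count classes and attribute values per class."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    attribute_values = defaultdict(set)  # attribute index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            attribute_values[i].add(value)
    return class_counts, value_counts, attribute_values


def predict_proba(model, row):
    """Return P(class | row), treating attributes as independent given the class."""
    class_counts, value_counts, attribute_values = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total  # prior P(class)
        for i, value in enumerate(row):
            # Add-one smoothed estimate of P(value | class)
            seen = value_counts[(i, label)][value]
            score *= (seen + 1) / (n + len(attribute_values[i]))
        scores[label] = score
    norm = sum(scores.values())  # normalise so the probabilities sum to 1
    return {label: s / norm for label, s in scores.items()}


# Hypothetical training data: (age group, income) -> decision
rows = [("young", "high"), ("young", "low"), ("old", "high"),
        ("old", "low"), ("young", "high"), ("old", "low")]
labels = ["buy", "not buy", "buy", "not buy", "buy", "not buy"]

model = train_naive_bayes(rows, labels)
proba = predict_proba(model, ("young", "high"))
print(proba)
```

The multiplication of the per-attribute terms is exactly where the independence assumption enters, so strongly correlated attributes can make the resulting probabilities too extreme.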

Those are the two techniques that I can think of right now. There may be other techniques too.



Edited 1 time(s). Last edit at 07/13/2012 07:01PM by webmasterphilfv.
