CPT+ Scoring

Posted by:
**
lv1984
**

Date: March 15, 2018 03:20AM

I'm testing the CPT+ but I can't understand how to interpret the scoring.

Is it already normalized?

What are the min and max values for the scoring?

Can I already interpret it as a probability, or must it be normalized first?

Posted by:
**
Luis
**

Date: March 20, 2018 07:04AM

Hey lv1984,

No, the scores are not normalized by default. If you want to normalize them yourself, you could do it in the CountTable::getBestSequence() method:

    // Filling a sequence with the best |count| items
    Sequence seq = new Sequence(-1);
    sd.normalize(); // Implement this method in the ScoreDistribution class
    List<Integer> bestItems = sd.getBest(1.002);
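The normalize() method is left unimplemented in the snippet. As a minimal sketch of what it might do, assuming the scores are non-negative and can be viewed as a map from item id to score, the example below divides every score by the total so that the results sum to 1. The class name ScoreNormalizer and the sample values are hypothetical and are not part of the CPT+ source; the real ScoreDistribution class may store its scores differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the CPT+ code base.
class ScoreNormalizer {

    /**
     * Returns a copy of the scores rescaled so that they sum to 1.
     * Assumes every score is non-negative. If the total is 0,
     * the scores are returned unchanged to avoid dividing by zero.
     */
    static Map<Integer, Double> normalize(Map<Integer, Double> scores) {
        double total = 0.0;
        for (double s : scores.values()) {
            total += s;
        }
        Map<Integer, Double> result = new LinkedHashMap<>();
        for (Map.Entry<Integer, Double> e : scores.entrySet()) {
            double value = total > 0.0 ? e.getValue() / total : e.getValue();
            result.put(e.getKey(), value);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, Double> raw = new LinkedHashMap<>();
        raw.put(7, 3.0); // hypothetical item id -> raw score
        raw.put(9, 1.0);
        System.out.println(normalize(raw)); // prints {7=0.75, 9=0.25}
    }
}
```

Rescaling like this only makes the scores comparable on a 0 to 1 scale; it does not by itself turn them into probabilities.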

However, the scores do not represent real proportions, because the individual subscores are multiplied together in the CountTable::push() method.

You would have to rewrite the scoring system if you want real proportional probabilities.

Disclaimer: I am just a student who worked with this algorithm for half a year, so I cannot guarantee correctness.

Best regards,

Luis

Posted by:
**
webmasterphilfv
**

Date: March 24, 2018 07:03AM

Thanks for answering, Luis :-)

Yes, the scores are not normalized in CPT+. The score for a prediction is the sum of its scores over all the sequences that are used to make that prediction. Thus, the sum can be greater than 1. Besides, it cannot be negative.
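To make the "greater than 1" point concrete, here is a minimal sketch, assuming each matching sequence adds a non-negative contribution to an item's running score. The class SummedScores, its methods, and the contribution values are hypothetical illustrations and are not taken from the CPT+ implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of an additive, unnormalized score table.
class SummedScores {
    private final Map<Integer, Double> table = new HashMap<>();

    /** Adds one sequence's non-negative contribution to the item's running score. */
    void add(int item, double contribution) {
        if (contribution < 0.0) {
            throw new IllegalArgumentException("contribution must be non-negative");
        }
        table.merge(item, contribution, Double::sum);
    }

    /** Returns the accumulated score, or 0 if no sequence supported the item. */
    double score(int item) {
        return table.getOrDefault(item, 0.0);
    }

    public static void main(String[] args) {
        SummedScores scores = new SummedScores();
        // Three hypothetical sequences each support item 5.
        scores.add(5, 0.5);
        scores.add(5, 0.5);
        scores.add(5, 0.5);
        System.out.println(scores.score(5)); // prints 1.5, already above 1
    }
}
```

Because every contribution is non-negative, the total can only grow with the number of supporting sequences, so it has no fixed upper bound and never drops below 0.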

Yes, the scoring system could be replaced by something else. When designing CPT/CPT+, my student Ted actually tried different scoring systems, and the one provided in CPT+ is the one that we found to work best on our datasets. But other scoring systems may be better or have other advantages. We found that it was simpler to have scores that are not normalized.

Best regards,

Philippe

Edited 1 time(s). Last edit at 03/24/2018 07:04AM by webmasterphilfv.
