The Data Mining Forum
This forum is about data mining, data science and big data: algorithms, source code, datasets, implementations, optimizations, etc. You are welcome to post calls for papers, data mining job ads, links to source code of data mining algorithms, or anything else related to data mining. The forum is hosted by P. Fournier-Viger. No registration is required to use this forum!
discovering patterns in sport data
Posted by: Arthur
Date: April 02, 2020 10:57PM

I want to work on sport analytics.

Has anyone tried to find patterns in sport data?

Does anyone have good references for papers on sport analytics?

Help greatly appreciated.

Re: discovering patterns in sport data
Date: April 03, 2020 08:40AM

Yes, there are a lot of papers on sport analytics. For example, you can check the PKDD workshop on sport analytics. It is held almost every year.

For example, a few years ago I published a paper there on predicting passes in a football match:

Fournier-Viger, P., Liu, T., Lin, J. C.-W. (2018). Football Pass Prediction using Player Locations. Proc. of the 5th Machine Learning and Data Mining for Sports Analytics (MLSA 2018), in conjunction with the PKDD 2018 conference, Springer LNAI 11330, pp. 1–7, 2019.

The source code and dataset

Re: discovering patterns in sport data
Posted by: tassieTom
Date: April 04, 2020 03:43PM

Information Efficiency in Financial and Betting Markets Hardcover – 23 Nov 2005
by Leighton Vaughan Williams.

This consists of various papers and has terrific information on tennis and horse racing, market efficiency, arbitrage, etc. About 380 pages, recommended.

If you want to know anything about the horse racing markets from around the world, various papers have been compiled into a whopper 580-page book that was hard to find and cost over $2000 about 15 years ago, but has now been reprinted and is more affordable. A classic:

https://www.amazon.com.au/Efficiency-Racetrack-Betting-Markets-Donald/dp/9812819185

Hope this helps.

Re: discovering patterns in sport data
Date: April 05, 2020 05:52AM

Hi Tom,

Very nice reference. Thanks for sharing it. My father likes horse racing, and we have sometimes discussed the possibility of predicting races, but we never designed any model. This is a very interesting topic. I will share the book with my father. Are you working on predicting horse races?

Best regards,

Philippe

Re: discovering patterns in sport data
Posted by: tassieTom
Date: April 05, 2020 05:33PM

Hi Philippe,

I did work on the horses and greyhounds in the 80's and 90's with fairly primitive software and computers compared with what is available today. The problem is that the market is SEMI efficient, meaning that not all the information is in the prices, but MOST is. This makes the public fairly accurate in predicting winners.

So to be more accurate than the public, a LOT of data is required. I lost interest in getting/updating data.

Another way to look at public prices is to look for market anomalies or mistakes.
This is what Dr Ziemba did in the 80's: he used the win odds to create probabilities, which went into the HARVILLE formula to look for anomalies in another pool, the place pool.
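
A minimal sketch of the Harville idea in Python, not Dr Ziemba's actual system. It assumes decimal win odds, assumes that "place" means finishing first or second (pool rules differ between countries), and uses hypothetical function names. Under Harville, the chance that runner i finishes second behind runner j is p_j * p_i / (1 - p_j), where p is the win probability.

```python
def win_probs_from_odds(decimal_odds):
    """Convert decimal win odds to probabilities that sum to 1.

    Dividing by the total removes the track take (assumed to be spread
    proportionally over all runners, which is a simplification).
    """
    raw = [1.0 / o for o in decimal_odds]
    total = sum(raw)
    return [r / total for r in raw]


def harville_place_probs(win_probs):
    """Harville probability that each runner finishes first or second.

    Assumes every win probability is strictly below 1.
    """
    place = []
    for i, p_i in enumerate(win_probs):
        # Runner i is second behind runner j with probability
        # p_j * p_i / (1 - p_j); sum this over every other runner j.
        second = sum(
            p_j * p_i / (1.0 - p_j)
            for j, p_j in enumerate(win_probs)
            if j != i
        )
        place.append(p_i + second)
    return place


if __name__ == "__main__":
    probs = win_probs_from_odds([2.0, 4.0, 4.0])
    for runner, p in enumerate(harville_place_probs(probs), start=1):
        print(f"runner {runner}: place probability {p:.3f}")
```

A runner whose Harville place probability is clearly higher than the probability implied by the place pool would be a candidate anomaly, subject to the take and to the known weaknesses of the Harville assumption.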

These approaches are what I find more interesting, particularly with betfair: vast amounts of time-stamped data are now available, showing prices shortening etc. This to me has big potential.

Also home track advantage: when a horse runs on its home track, bettors in another state don't have all the available knowledge, so prices can vary a lot. These situations have been studied to some extent in my links.

So to answer your question, I am only interested in market (price) anomalies: mistakes in the TAB compared to betfair, which is more accurate, or anomalies in different markets that can be compared. For example, the quinella is the first 2 horses in any order and the exacta is the first 2 horses in exact order. These markets can be directly compared, yet will have amazing pricing differences. Betfair data is also interesting, but I am not up to speed on time-stamped data yet.
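
One way to make the quinella/exacta comparison concrete: a quinella on horses A and B pays if the result is A-B or B-A, so its implied probability should roughly equal the sum of the two exacta probabilities. The Python sketch below is only an illustration under assumptions: dividends are decimal, a dividend is available for every combination in both pools, and the take is removed by normalising each pool to sum to 1. The function names and data layout are hypothetical.

```python
def normalise(dividends):
    """Turn decimal dividends into implied probabilities that sum to 1."""
    raw = {key: 1.0 / d for key, d in dividends.items()}
    total = sum(raw.values())
    return {key: value / total for key, value in raw.items()}


def quinella_vs_exacta(exacta_dividends, quinella_dividends):
    """Compare each quinella pair with the two matching exactas.

    exacta_dividends:   {(first, second): dividend}
    quinella_dividends: {(a, b): dividend}, order of a and b irrelevant
    Returns {(a, b): (quinella_prob, exacta_based_prob)} with a < b.
    """
    exacta = normalise(exacta_dividends)
    quinella = normalise(
        {tuple(sorted(pair)): d for pair, d in quinella_dividends.items()}
    )
    result = {}
    for (a, b), q_prob in quinella.items():
        # The same event priced in the exacta pool: A-B plus B-A.
        e_prob = exacta.get((a, b), 0.0) + exacta.get((b, a), 0.0)
        result[(a, b)] = (q_prob, e_prob)
    return result


if __name__ == "__main__":
    exacta = {(1, 2): 6.0, (2, 1): 6.0, (1, 3): 6.0,
              (3, 1): 6.0, (2, 3): 6.0, (3, 2): 6.0}
    quinella = {(1, 2): 2.0, (1, 3): 4.0, (2, 3): 4.0}
    for pair, (q, e) in quinella_vs_exacta(exacta, quinella).items():
        print(pair, f"quinella {q:.3f} vs exacta-based {e:.3f}")
```

Large gaps between the two numbers for the same pair are the kind of pricing difference described above. Whether a gap is exploitable depends on the take in each pool and on how much the dividends move before the race starts.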

I will make a separate post about the interesting data I have found!

PS--Philippe, I remember reading a book on racing in the 90's by a British statistician who said that there was SOME evidence of Markov Chains in horse racing results. I understand this to mean that you might pick up a sequence in the winning numbers, due to the fact that some stables run a couple of horses in a race, barrier effect, etc. So it might indeed be possible to find weak sequences in the winning numbers. A project for another day!
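
A rough sketch of how such a check could start, assuming the data is simply a list of winning numbers in race order (the function name and the sample list are made up for illustration). It estimates first-order Markov transition probabilities, that is, how often number y wins right after number x has won.

```python
from collections import Counter, defaultdict


def transition_probs(winners):
    """Estimate P(next winner = y | previous winner = x) from a sequence."""
    counts = defaultdict(Counter)
    # Count each consecutive (previous, next) pair of winning numbers.
    for prev, nxt in zip(winners, winners[1:]):
        counts[prev][nxt] += 1
    return {
        prev: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for prev, row in counts.items()
    }


if __name__ == "__main__":
    winners = [1, 2, 1, 2, 1, 3]  # made-up sequence, not real results
    for prev, row in sorted(transition_probs(winners).items()):
        print(prev, row)
```

If the results were independent, each row would look about the same as the overall frequency of each number. Rows that differ a lot would hint at a sequence effect, but with few races the differences are mostly noise, so a proper statistical test on a large sample would be needed.
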
Regards,
Tom Berger



Edited 1 time(s). Last edit at 04/06/2020 12:06AM by tassieTom.

Re: discovering patterns in sport data
Date: April 10, 2020 07:50AM

Hi Tom,

Very interesting and informative reply. I see that you know the topic very well, and that it is quite complicated indeed. It seems you have the right approach by looking at anomalies. Perhaps timestamps or sequences do indeed have some potential.

I don't know too much about horse races. I watched them with my father when I was a kid many years ago, and my father still watches the horse races and bets on them almost weekly. It used to be the races from Canada and/or the US. I know that my father used to buy some data and then try various formulas to get some results, but I think the challenge is indeed to get the relevant data to be able to make good predictions, as many factors seem to be involved.

I say that, but I don't know too much about the topic ;-)

But it is very interesting to hear about what you are doing and to learn a bit more about it.

Best regards,

Philippe

Re: discovering patterns in sport data
Posted by: Arthur
Date: April 05, 2020 04:32PM

Thanks Tom and Philippe for the feedback. I will read your suggestions.

Re: discovering patterns in sport data
Posted by: tassieTom
Date: April 05, 2020 05:37PM

Arthur,
Leighton Vaughan Williams is based in the UK and used to have a blog. I think he called himself the Betfair Professor. He had access to some studies.

Another tip is to go to Google Scholar and type in various search terms such as tennis market efficiency, favourite-longshot bias, and market bias.
Tennis has a few great papers on the favourite-longshot bias.

Good luck.
Tom Berger



Edited 1 time(s). Last edit at 04/05/2020 06:30PM by tassieTom.

Re: discovering patterns in sport data
Posted by: Arthur
Date: April 10, 2020 06:42PM

Thanks!

Arthur

Re: discovering patterns in sport data
Posted by: b_johns
Date: April 28, 2020 08:46PM

Although I don't have much experience, I recently ran some experiments on mining positional patterns in esports data at different timestamps.

What kind of patterns are you looking to discover?



This forum is powered by Phorum and provided by P. Fournier-Viger (© 2012).