
How I Used Machine Learning to Predict Soccer Games for 24 Months Straight

For better or worse, machines do not place bets with their hearts.

Cover photo: @jerrysilfwer

This is a guest post by Ola Lidmark Eriksson.

Can machine learning make you rich from sports betting?

Two years ago, I asked myself if it would be possible to use machine learning to better predict the outcomes of soccer games.

I decided to give it a serious try, and today, two years and contextual data from 30,000 soccer games later, I've gained many interesting insights.

Here we go:

The Big Data Challenge: Let the Data Mining Begin

Step 1: To begin with, I harvested as many data points as possible. I mined old game data from every source and API I could find. Some of the more important ones were Football-data, Everysport, and Betfair.

Step 2: I merged these data points with their corresponding results, quantified them, and put everything into one database.

Step 3: Finally, I used the data to train a machine learning model to serve as my software for predicting upcoming soccer games.
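To make steps 1 and 2 concrete, here is a minimal sketch of merging data points with their results and quantifying the outcome. It is only an illustration: the field names (`game_id`, `home_shots_avg`, `home_goals`, and so on) and the 1/X/2 labels are assumptions, not details from the actual project.

```python
# Hypothetical records: pre-game data points and, separately, final results.
features = [
    {"game_id": 1, "home_shots_avg": 14.2, "away_shots_avg": 9.8},
    {"game_id": 2, "home_shots_avg": 8.1, "away_shots_avg": 12.5},
    {"game_id": 3, "home_shots_avg": 11.0, "away_shots_avg": 11.3},
]
results = [
    {"game_id": 1, "home_goals": 2, "away_goals": 0},
    {"game_id": 2, "home_goals": 1, "away_goals": 1},
    # Game 3 has no result yet, so it cannot be used for training.
]


def outcome_label(home_goals, away_goals):
    """Quantify a result as "1" (home win), "X" (draw), or "2" (away win)."""
    if home_goals > away_goals:
        return "1"
    if home_goals < away_goals:
        return "2"
    return "X"


def merge_games(features, results):
    """Join data points with their results on game_id (inner join)."""
    results_by_id = {r["game_id"]: r for r in results}
    merged = []
    for row in features:
        result = results_by_id.get(row["game_id"])
        if result is None:
            continue  # skip games without a known result
        label = outcome_label(result["home_goals"], result["away_goals"])
        merged.append({**row, "outcome": label})
    return merged


training_rows = merge_games(features, results)
```

Each row in `training_rows` now holds the pre-game data points plus a labeled outcome, which is the shape a model in step 3 would train on.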

How To Measure Predictions of the Unpredictable

Of course, the nature of a soccer match is that it is unpredictable. I guess that's why we love the game, right?

Still, I was somewhat obsessed with the naïve notion that I, armed with a data-driven machine-learning model, could predict games better than I usually would. At that point, I based most of my sports bets on emotions ("gut feelings") rather than actual data.

The first challenge was to find out how to measure whether or not my model was succeeding. I quickly realized that measuring the actual percentage of correctly guessed games didn't add much value, at least not without some form of context.

I decided to compare the model's output with the best guesses of the actual market. The easiest way to assess such data was to harvest market-regulated odds. Therefore, I started comparing how my model would perform against the odds at Betfair, mainly because their odds are adjusted based on real people betting real money against each other.
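One common way to read odds as the market's "best guess" is to turn them into implied probabilities. The sketch below assumes decimal odds and simply normalizes the inverse odds so they sum to 1; the function names and the example odds are made up for illustration and are not taken from the project.

```python
def implied_probabilities(decimal_odds):
    """Convert decimal odds for all outcomes of one game into probabilities.

    The raw inverses rarely sum to exactly 1, so they are normalized.
    """
    inverses = {outcome: 1.0 / odds for outcome, odds in decimal_odds.items()}
    total = sum(inverses.values())
    return {outcome: inverse / total for outcome, inverse in inverses.items()}


def market_pick(decimal_odds):
    """Return the outcome the market considers most likely (lowest odds)."""
    probabilities = implied_probabilities(decimal_odds)
    return max(probabilities, key=probabilities.get)


# Hypothetical odds for one game: home win, draw, away win.
odds = {"1": 1.9, "X": 3.8, "2": 3.8}
market_probs = implied_probabilities(odds)
```

With probabilities on both sides, the model's pick and `market_pick(odds)` can be scored against the same final results.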

The Results: Did My Model Make Me Rich?

Fast-forward to today: two years have passed. Has the model made me a rich man?

Well, no.

I soon realized that my predictions, for the most part, were aligned with the market's best performance.

Since I used a regression-based model, I could predict the strength of the probability of a specific game outcome. At the highest probability grades, my model predicts roughly 70% of the games correctly. Since the market performs just as well, making serious money from my bets is difficult.
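A simple way to check a claim like "roughly 70% correct at the highest probability grades" is to group predictions by predicted probability and count the hits in each group. This is a sketch under assumptions: ten equal-width grades and made-up sample data, not the project's actual grading.

```python
from collections import defaultdict


def hit_rate_by_grade(predictions, n_grades=10):
    """Share of correct predictions per probability grade.

    predictions: iterable of (predicted_probability, was_correct) pairs.
    With 10 grades, grade 7 roughly covers probabilities from 0.7 up to 0.8.
    """
    tallies = defaultdict(lambda: [0, 0])  # grade -> [correct, total]
    for probability, was_correct in predictions:
        grade = min(int(probability * n_grades), n_grades - 1)
        tallies[grade][0] += int(was_correct)
        tallies[grade][1] += 1
    return {
        grade: correct / total
        for grade, (correct, total) in sorted(tallies.items())
    }


# Made-up predictions, for illustration only.
sample = [
    (0.72, True), (0.75, True), (0.78, True), (0.71, False),
    (0.55, False), (0.52, True),
    (0.45, True), (0.48, False),
]
rates = hit_rate_by_grade(sample)
```

Running the same function on the market's implied probabilities gives a like-for-like comparison, grade by grade.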

But, to be honest, I never thought I would create a "money machine," either. Instead, I came to several rather exciting insights about the possibilities (and limitations!) of big data and machine learning:

Learning 1: Machine Learning and Diminishing Gains

In theory, machine learning should be able to improve over time. As the amount of data the model has to learn from grows, the predictions should get better.

Well, this wasn't my experience at all.

Two years ago, I started with about 2,000 games in my database and relatively limited data sets attached to them. Today, I have almost 30,000 games in the database, with metadata covering everything from weather and distances between the teams' home grounds to shots and corners.

Despite all this added data, and even though the model has been able to "learn" over time, it still didn't improve its predictions. Big data and machine learning will only take you so far in predicting the unpredictable.
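Whether more data helps can be checked with a learning curve: train on growing slices of the data and score each version on the same held-out games. The sketch below uses a deliberately trivial stand-in "model" (always predict the most common outcome) and invented labels, so it only shows the mechanics, not the project's model or results.

```python
from collections import Counter


def majority_baseline(train_labels):
    """Stand-in model: always predicts the most common outcome seen so far."""
    return Counter(train_labels).most_common(1)[0][0]


def learning_curve(labels, train_sizes, test_size):
    """Accuracy on a fixed hold-out set for growing training sets."""
    test = labels[-test_size:]  # the last games are held out for scoring
    curve = []
    for size in train_sizes:
        prediction = majority_baseline(labels[:size])
        accuracy = sum(1 for outcome in test if outcome == prediction) / len(test)
        curve.append((size, accuracy))
    return curve


# Invented outcomes in chronological order: 12 training games, 4 test games.
labels = ["1", "1", "X", "2", "1", "X", "1", "2", "1", "1", "X", "1",
          "1", "2", "1", "X"]
curve = learning_curve(labels, train_sizes=[2, 6, 12], test_size=4)
```

A flat curve, where accuracy stays the same as the training set grows, is the pattern described above.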

Learning 2: The Power of Unbiased Generalizations

The power of machine learning seems closely tied to its ability to make unbiased generalizations.

For example, over the past two years, I was curious to see if my model could predict when winning or losing streaks would be broken. For instance, could it anticipate that Barcelona would finally lose after winning ten games straight? Could my model prove certain anomalies to be significant?

Well, it turned out not to be very good at that.

Instead, I found that the model was surprisingly good at betting against overvalued teams over time.

Last season, I saw how my soccer prediction machine often predicted against Borussia Dortmund while the market made another prediction. Dortmund had a lousy season, which gave my model an advantage over the market's predictions. I have seen the same with teams like Liverpool and Chelsea this season.

So the lesson learned is that some people tend to place sports bets based on emotions. Liverpool and Dortmund are teams liked by lots of people, and at times, you make predictions with your heart instead of your brain. My machine learning model, well, it does not.
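One way to spot a team the market may be overvaluing is to average the gap between the market's win probability and the model's win probability for that team. This is a minimal sketch with placeholder team names and invented numbers; the field names are assumptions.

```python
from collections import defaultdict


def average_gap_by_team(games):
    """Mean of (market probability - model probability) of a win, per team.

    A clearly positive value means the market rates the team higher
    than the model does.
    """
    gaps = defaultdict(list)
    for game in games:
        gaps[game["team"]].append(game["market_prob"] - game["model_prob"])
    return {team: sum(values) / len(values) for team, values in gaps.items()}


# Invented numbers, for illustration only.
games = [
    {"team": "Team A", "market_prob": 0.60, "model_prob": 0.50},
    {"team": "Team A", "market_prob": 0.70, "model_prob": 0.50},
    {"team": "Team B", "market_prob": 0.40, "model_prob": 0.40},
]
gap = average_gap_by_team(games)
```

Here "Team A" shows a persistent positive gap, which is what a popular but underperforming team would look like in the data.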

Learning 3: Machine Learning and Easy Gains

If nothing else, I learned that making predictions that outperform the market is difficult. Still, when I started looking at what I had achieved (instead of just obsessing over what I hadn't), I found one quite surprising fact:

With a simple Python program of less than 10,000 lines of code, I had still made something that performed just as well as the market. How many person-hours must be behind the bookies' odds models and predictions? The model can pick out attractive bets weekly, just as any newspaper or expert would. By making generalizations, you might not be able to find that one bet that will make you rich, but it may save you lots of time in the proper context.

Implementing Machine Learning to Wide Ideas

With these insights in mind, I started to look at another project I've been involved in for the last five years: Wide Ideas, a platform for companies to crowdsource ideas and creativity.

What I wanted to do was to look at the ideas companies gathered from their employees and try to predict whether they would implement each idea or not.

The team and I quantified the data, but instead of shots on goal and weather forecasts, we looked at how many people had interacted with an idea, and in what way. And lo and behold, the outcome was on par with the soccer predictions:

We can now make decent predictions on whether or not we will implement a creative idea. We can visualize this to encourage more great ideas through gamification.

Can we find a good idea that doesn't follow the general patterns of a good idea? No, not yet, at least.

Still, for the product, and given an organization that can harvest 10,000 ideas per year, finding ways to highlight and encourage particular ideas can save time and resources. Just going from 10,000 to (perhaps) 100 good ideas and visualizing the result saves lots of time.
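Going from 10,000 ideas to a shortlist of 100 is, mechanically, a top-k selection over predicted scores. The sketch below assumes each idea already has a `score` from some trained model; the scores here are generated by an arbitrary formula purely to have data to sort, and the field names are hypothetical.

```python
import heapq


def shortlist(ideas, k=100):
    """Return the k ideas with the highest predicted score, best first."""
    return heapq.nlargest(k, ideas, key=lambda idea: idea["score"])


# Placeholder scores; a real score would come from a trained model.
ideas = [{"id": i, "score": (i * 37 % 101) / 100} for i in range(10_000)]
top_ideas = shortlist(ideas, k=100)
```

`heapq.nlargest` avoids sorting the full list, which matters little at 10,000 ideas but keeps the intent clear: only the top of the ranking is needed.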

The gap between making machines just as good as humans and making them better than we are.

Big data and machine learning might do anything from predicting early-stage cancer to helping self-driving cars anticipate potential dangers. Models like this will probably prove most useful where generalizations save time.

Take medical implementations, for example. Sifting through thousands of birthmark pictures, a model could help pick the ones most likely to be cancer, thus saving doctors valuable time and resources.

However, human behaviour may prove to be tricky. In what way is human behaviour predictable? We're rationally irrational. We can generalize, placing people into different categories based on what they like to eat, watch or do, but there might be too many factors that set us apart as individuals.

Will big data and machine learning detect the anomalies, or will it just be superb at generalizations?

I hope we'll experience a future where companies focus on actual data analysis instead of thinking that "big data" by default equals "better data."

So, until someone proves me wrong (or Arnold Schwarzenegger returns from the future, whichever comes first!), we should put machine learning to use where generalizations can best save time for real humans.

Otherwise, the risk is that we'll end up with so many metrics that the sheer amount would suffocate any possibility of making sense of it.

About the writer: Ola Lidmark Eriksson is CTO at Wide Ideas.
