RE: Soccer Predictions using Python (part 4)
Hey bud, the only data it uses is the scores from previous games. We take the average home and away goals across the competition, then the same averages for each team we want to predict, and use these to calculate expected goals for each team. Those expected goals are then used to draw random samples from the Poisson distribution, which get averaged out to give us our prediction. Here's an excellent description of the process - it's the one I followed to get started.
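To make that a bit more concrete, here's a rough sketch of the idea rather than the actual code from the post. The attack/defence strength formula is an assumption on my part (it's the common way of turning those averages into expected goals), and the function names and sample numbers are made up for illustration. It draws Poisson samples by hand with the standard library only; numpy.random.poisson would do the same job.

```python
import math
import random


def poisson_sample(lam, rng):
    """Draw one Poisson-distributed integer using Knuth's method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def expected_goals(league_home_avg, league_away_avg,
                   home_scored_avg, home_conceded_avg,
                   away_scored_avg, away_conceded_avg):
    """Assumed formula: team strengths relative to the competition averages."""
    home_attack = home_scored_avg / league_home_avg
    home_defence = home_conceded_avg / league_away_avg
    away_attack = away_scored_avg / league_away_avg
    away_defence = away_conceded_avg / league_home_avg
    home_xg = home_attack * away_defence * league_home_avg
    away_xg = away_attack * home_defence * league_away_avg
    return home_xg, away_xg


def predict(home_xg, away_xg, n_samples=10000, seed=0):
    """Average many Poisson draws to get a predicted score for each side."""
    rng = random.Random(seed)
    home = sum(poisson_sample(home_xg, rng) for _ in range(n_samples)) / n_samples
    away = sum(poisson_sample(away_xg, rng) for _ in range(n_samples)) / n_samples
    return home, away


# Hypothetical averages, purely for illustration.
home_xg, away_xg = expected_goals(1.5, 1.2, 1.8, 1.0, 1.1, 1.6)
print(predict(home_xg, away_xg))
```

With enough samples the averaged draws land very close to the expected goals themselves, which is what you'd expect from a Poisson distribution.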
You're to blame for costing me a couple of hours already. Boy, I look dumb staring at all this stuff on my screen. We have a saying in Portuguese: "like an ox staring at a palace".
Wouldn't going so far back in games kind of ruin the prediction a bit when calculating averages for each team? (Because you go into past seasons, different players, managers etc.)
Anyway I tried to run your code and got this error: ModuleNotFoundError: No module named 'pandas'
I need the database right? Can I use my own database?
The possibilities with this are immense; unfortunately, my knowledge of this isn't.
It looks like going further back helps more than it hinders. The backtest tries everything from 50 to 500 games, and most competitions I've tested seem to do best when considering around 400 games. I believe there are 380 games in a Premier League season, so a full year's worth of history looks like a sensible default.
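If you want to see what that kind of test looks like, here's a minimal sketch and not the real backtest from the post. It assumes the results are just a chronological list of (home_goals, away_goals) tuples, steps the window in 50s (the step size is my guess), and scores each window by how far a plain average of the last N games is from the actual result. The function name best_window is made up for this example.

```python
def best_window(results, windows=range(50, 501, 50)):
    """Return the history length with the lowest average error, plus all scores.

    results: chronological list of (home_goals, away_goals) tuples.
    """
    largest = max(windows)
    if len(results) <= largest:
        raise ValueError("need more games than the largest window")

    scores = {}
    for w in windows:
        errors = []
        # Score every window on the same games so the comparison is fair.
        for i in range(largest, len(results)):
            history = results[i - w:i]
            pred_home = sum(h for h, _ in history) / w
            pred_away = sum(a for _, a in history) / w
            actual_home, actual_away = results[i]
            errors.append(abs(pred_home - actual_home)
                          + abs(pred_away - actual_away))
        scores[w] = sum(errors) / len(errors)

    return min(scores, key=scores.get), scores
```

The real thing would plug the Poisson prediction in where the plain average is, but the loop over window sizes is the same shape.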
No database required. For the pandas error, you can use your package manager (synaptic probably) to install "python3-pandas".
Or try
pip3 install pandas
from the command line. You'll likely need to install python3-bs4, python3-selenium and python3-numpy as well.
If you have any other issues, let me know which distro you're using and I'll help as best as I can.
Steven