CLASSIFICATION OF OUTCOMES OF ENGLISH PREMIER LEAGUE MATCHES USING MACHINE LEARNING MODELS
Date
2022
Abstract
Football is one of the most widely followed sports in the world. Researchers often
analyse the results of football matches in order to predict or classify match outcomes
from a set of variables. Most available prediction and classification models are based
on either a selected variable or a large number of variables. Too few variables cannot
predict accurately, while too many variables make the model hard to interpret (the
problem of parsimony). This work used feature selection methods to reduce sixteen
selected football-related independent variables to six for classifying the outcome
variable (home win, away win, or draw) across five seasons of English Premier League
matches. As expected, a home win was the modal outcome in all five seasons. The
Kruskal-Wallis test showed that the median outcome was not the same across the five
seasons. Four machine learning models then classified the outcome using the six best
variables recommended by the feature selection. Furthermore, the results of the first
half and the second half were used to classify the final outcome.
Five performance metrics indicate that the ML models classify the outcomes well.
Cross-validation was used to guard against over-fitting.
Bookmakers may find this research interesting as some variables were identified as
key to the classification of outcomes of football matches.
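
The Kruskal-Wallis comparison of outcomes across seasons mentioned above can be illustrated with a minimal sketch. It is not the authors' code: the numeric coding of outcomes (0 = away win, 1 = draw, 2 = home win), the function names, and the sample data are all assumptions made for illustration. The sketch computes the tie-corrected H statistic, which under the null hypothesis is compared against a chi-square distribution with k - 1 degrees of freedom for k groups.

```python
from collections import Counter
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of samples."""
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)

    # Sum of squared rank totals, each divided by its group size.
    total = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)

    # Tie correction: 1 - sum(t^3 - t) / (n^3 - n) over tied-value counts t.
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    return h / correction


# Hypothetical outcomes for three short "seasons" (assumed coding:
# 0 = away win, 1 = draw, 2 = home win). Not data from the study.
seasons = [
    [2, 2, 1, 0, 2, 1],
    [0, 1, 2, 2, 0, 0],
    [2, 2, 2, 1, 2, 0],
]
print(kruskal_wallis_h(seasons))
```

Because match outcomes take only three values, ties are heavy, which is why the sketch applies the tie correction rather than the uncorrected statistic.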