Data Science for Cricket

Authored by Raghunathan Rengaswamy, IIT Madras

Data Science, ML, and AI are the flavors of the season all over the world, and India is no exception. Interest in these techniques is at a fever pitch; academicians and industrial practitioners alike are exploring their fundamental underpinnings and application potential. As a result, data science is being evaluated for implementation in all imaginable fields and sub-fields. Sports is one such area with tremendous potential for data science applications. While sports analytics has been around for a long time, sabermetrics being a prime example, the scale of what can be achieved now with analytics is quite stupendous. Disparate data from various sources can be collected and analyzed to support decisions on almost all aspects of any sport. Take cricket, the focus of this blog, as an example. Ball-by-ball commentary, video feeds, and bats that are actual IoT devices producing fascinating information can all be merged together for analysis. This blog describes a data science tool for cricket that was created in India by ESPNcricinfo, IIT Madras, and Gyan Data Private Limited (GDPL). While one would usually associate sports analytics with tools that can be used to improve player performance or game outcomes, this effort targeted enriching the spectator experience. The aim was to go beyond traditional statistics by bringing in multiple layers of nuanced analysis that heighten the sport-lover’s engagement with the game.

This particular effort was a result of ESPNcricinfo’s interest in rationally addressing two talking-points that seize spectators the most. The first one is related to the impact of luck on the result of a game. The second is on deciding how important a particular performance is to the outcome of a game or equivalently, quantification of an inherent value of a performance that takes into account as much game context as possible. Sport lovers would surely have participated in arguments centered on these two themes. At the end of these discussions - usually highly energetic and contentious - one is more often than not left with a sense of dissatisfaction or non-closure. ESPNcricinfo wanted to arm the debaters with a little more data-based reasoning to bolster their arguments. In other words, the effort was to mathematically frame these questions, albeit not from a viewpoint of definitive answers (which obviously is not possible), but from a spectator engagement perspective. At this juncture, many of you reading the blog may probably chuckle at the foolhardiness of such an exercise. You might wonder how one would be able to quantify a fundamentally abstract concept such as luck, especially in a debate where the participants have pre-conceived notions and favorite teams. Let us pause here for a moment to enunciate the fundamental guiding principle that underlies this work. Clearly, there is no unique best approach to address this problem. Rather, it is desirable that any approach that one develops satisfy the following two requirements. One, consistent application of the same approach to all scenarios leading to “apple-to-apple” comparisons should be possible and two, it should make “cricketing sense”. At this point, there might be dismay that we are trading one abstraction, “quantify luck” for another, “cricketing sense”. If one were to think a little more carefully, it will become apparent that this is not as arbitrary a concept as may seem at first encounter. 
Assume that you make a statement about some cricketing situation to a room full of spectators. If a large majority of the room agrees with the statement, then the statement is deemed to make cricketing sense. Of course, how one practically realizes this (we cannot assemble a room full of spectators and poll them) is a tricky question; in our case, a panel of ESPNcricinfo experts painstakingly checked whether the algorithm results made cricketing sense for the large number of games that we tested on. Essentially, the underlying idea is some form of “majority is right”, however small the sample may be. We apply this principle to considerably more important things in life than cricket. The result of any presidential-style election, however important the country may be on the world stage, cannot be viewed as the right answer (in any mathematical or practical sense) but only as an answer that is delivered by a majority (in some cases, not even that).

Let us return to the first problem at hand. Given the scorecard of a completed game, how can we quantify the impact of luck on the game? In rather short order one realizes that a scorecard alone does not carry enough information. In this project we decided to work with information at a higher level of granularity, namely the easily available “ball-by-ball” commentary. Ball-by-ball commentary is, in general, unstructured data; however, some structuring of this data and a resultant database was already available with ESPNcricinfo. When the problem of quantifying luck is broken down further, there are several more questions that we had to contend with:

1. What are luck events?

2. Do these events affect the batsman and bowler in the same manner? In essence, is it zero-sum?

3. Is the impact of a luck event on the batting or bowling team the same as the impact on batsman or bowler?

4. How would one quantify the impact of disparate luck events in an apple-to-apple fashion?

5. What is the cumulative impact of all the luck events on the two teams? How does one account for the luck events in both innings together?

Of course, at this point, answering all these questions looks like a formidable endeavor, and a comprehensive solution might be elusive. In the search for a data science approach, the first decision that was made was to enumerate a reasonably comprehensive list of luck events. A list of such luck events is shown in the figure above.

It can be seen that we have a hierarchical arrangement of the luck events. At the highest level of hierarchy, we have dismissal and non-dismissal events. Dismissal events are further categorized into replacement and reinstatement events. There are multiple luck events under each of these final nodes of this classification. We report an illustrative list here and in the actual application many more events have been considered. The non-dismissal events have a reasonably simple logic for their run impact computations. Replacement events are ones where the alternate situation is where a batsman has to be replaced by another one. In contrast, reinstatement events are ones where a batsman has been given out unluckily and one has to imagine what would happen to the scorecard if the batsman was reinstated. This hierarchical arrangement and the identified luck events answer our first question in the enumerated list of five questions.

The table above is developed for all the luck events (the table shows a subset of events). This table allows us to answer questions 2 and 3. For each luck event, these tables (only one of the multiple tables is shown here) describe whether and how, with the correct interpretation of positive or negative luck, the event impacts the batsman, the bowler and their respective teams. Y stands for impact and N stands for no impact in luck computations. Using these tables, we also address differences between luck and skill to some extent. Consider the event description Catch dropped: for a regulation catch that is dropped, the bowling team is not unlucky; rather, they have not executed a basic skill properly (N entry). Armed with this formalism, one can proceed to answer question 4, which is the identification of a quantifying metric for these luck events that makes commensurate comparisons possible. The most obvious quantifying metric is the run impact of the luck event. This allows luck events to be compared on an equal footing. It necessitated the development of a core data science module that can predict the future runs that will be scored from any given situation in a game. This was named the forecaster.

The basic mathematical problem is: given the score at the end of the nth over (runs scored and wickets fallen), how does one predict the score at the end of the (n+k)th over? Initially, since it looks like a nice time series problem, we used a recurrent neural network architecture for this prediction. However, there were difficulties with this approach, largely related to data requirements and explainability. We also could not explore this solution fully given the incredibly short time that we had (3 months), starting from a blank page all the way to a deployed application on the ESPNcricinfo website. It would be interesting to revisit this with more data and deeper (figuratively and literally) architectures. Nevertheless, we abandoned this approach and moved on to a more operations-research-style approach, with machine learning models as required. Here, from a given situation, there are a certain number of balls remaining to be bowled (the resource), and these need to be allocated to the remaining batsmen (the allocation). We solve this resource allocation problem based on multiple statistical parameters derived from the data. Once this problem is solved, to predict the score after the (n+k)th over we need to predict the strike rates of the batsmen who will play out the allocated balls. Here, we use different machine learning models with self-correction abilities, trained on data for all the batsmen in the database. These models take several factors into account, and are also conceptually extendable to include other factors in the future. In our experience, the most accurate machine learning model depends on the format of the game (T20, ODI). This prediction module can then be used to compute the impact of a luck event. The score prediction algorithm is run on the actual situation and on the luck-removed alternate situation. The difference in the predicted scores quantifies the luck impact.
Though not used in luck computations, probability of a result (win or loss) for the teams was also developed based on the forecaster and historical data. There are also other nuances such as post game and live game luck computations and so on that are not discussed here, for reasons of brevity. Further, the computations were carefully designed so that these impact numbers could be cumulated to address question 5 in the list of questions.
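The run-impact computation described above can be sketched in a few lines. This is only an illustration under loud assumptions: `MatchState`, `toy_forecast`, the equal split of balls among batsmen, and the flat strike rates are all hypothetical stand-ins, not the deployed forecaster. The only parts taken from the text are that one forecaster is run on both the actual and the luck-removed situations, that the difference is the luck impact, and that impacts can be cumulated (a plain sum is assumed here).

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple


@dataclass(frozen=True)
class MatchState:
    """Hypothetical snapshot of an innings at the moment of a luck event."""
    runs: int                        # runs scored so far
    balls_remaining: int             # the resource still to be allocated
    strike_rates: Tuple[float, ...]  # assumed runs per 100 balls, one per remaining batsman


def toy_forecast(state: MatchState) -> float:
    """Stand-in forecaster (an assumption, not the real model).

    Splits the remaining balls equally among the remaining batsmen and
    lets each score at a fixed strike rate. The real system derives the
    allocation and the strike rates from data.
    """
    if not state.strike_rates or state.balls_remaining <= 0:
        return float(state.runs)
    balls_each = state.balls_remaining / len(state.strike_rates)
    future_runs = sum(sr / 100.0 * balls_each for sr in state.strike_rates)
    return state.runs + future_runs


def luck_impact(actual: MatchState, luck_removed: MatchState,
                forecast: Callable[[MatchState], float] = toy_forecast) -> float:
    """Run impact of one luck event: forecast(actual) - forecast(luck-removed)."""
    return forecast(actual) - forecast(luck_removed)


def cumulative_luck(impacts: Iterable[float]) -> float:
    """All impacts are in runs, so this sketch simply adds them up."""
    return sum(impacts)


# Made-up example: a dropped catch lets a batsman with strike rate 150 stay in.
actual = MatchState(runs=100, balls_remaining=60, strike_rates=(150.0, 120.0, 90.0))
luck_removed = MatchState(runs=100, balls_remaining=60, strike_rates=(120.0, 90.0))
print(luck_impact(actual, luck_removed))
```

Passing the forecaster in as a parameter keeps the luck computation independent of whichever prediction model suits the format of the game.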

Now that the luck events are enumerated, each delivery bowled can be annotated with a luck code. This necessitated that the database be altered to include as many columns in the table as there are luck events. As a commentator provides text commentary, he or she also scores the presence or absence of luck events for each delivery. The default value is zero, which signifies absence of the luck event; this ensures that online scoring of luck events is simple and efficient. Traditionally, ESPNcricinfo did not score these luck events, and hence considerable effort had to be undertaken to retrospectively score a selected set of matches for luck events through manual curation of the commentary. In some cases, the original match footage had to be revisited for this annotation exercise. A set of 50-odd games was annotated for luck events and then used to benchmark and evaluate the appropriateness of the algorithms that were developed.
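A per-delivery record with one zero-defaulted flag per luck event might look like the sketch below. The `Delivery` class and the event names in `LUCK_EVENTS` are hypothetical (only "catch dropped" is named in the text, and the real list is much longer); the sketch just shows why a zero default keeps live scoring cheap, since the scorer touches a flag only when something lucky happens.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical luck-event columns; the actual schema is not given in the text.
LUCK_EVENTS = ("catch_dropped", "missed_run_out", "wrong_umpiring_decision")


def blank_luck_flags() -> Dict[str, int]:
    """One column per luck event, defaulting to 0 (event absent)."""
    return {event: 0 for event in LUCK_EVENTS}


@dataclass
class Delivery:
    over: int
    ball: int
    commentary: str
    luck: Dict[str, int] = field(default_factory=blank_luck_flags)

    def flag(self, event: str) -> None:
        """Mark a luck event as present on this delivery."""
        if event not in self.luck:
            raise KeyError(f"unknown luck event: {event}")
        self.luck[event] = 1

    def has_luck(self) -> bool:
        """True if any luck event was scored on this delivery."""
        return any(self.luck.values())


# The scorer only acts when a luck event actually occurs.
delivery = Delivery(over=12, ball=3, commentary="Edged and put down at slip")
delivery.flag("catch_dropped")
```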

We also developed algorithms for identifying the inherent value of different performances, a suite of algorithms collectively called smartstats. Here, the key idea is to value performances based on a notional pressure felt by a batsman or bowler when they are performing. Performances in high-pressure situations are valued more than ones where the pressure is minimal. The pressure that we feel while watching the game (and presumably the players feel similar pressure) is directly related to the scoreboard pressure. To capture this, the difference between the predicted score and the target is mathematically transformed into a value for pressure. The first-innings pressure is calculated based on a notional target, akin to the par score that teams batting first usually aim for. This instantaneous pressure is used to appropriately increase or decrease the runs scored off every ball. Based on this, the algorithm produces an alternate scorecard from which smart strike rates and other smart statistics can be derived.
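To make the idea concrete, here is a minimal sketch of pressure-weighted runs. The actual transformation used in smartstats is not described in this blog, so the `tanh` form, the `scale` parameter, and the function names below are purely assumptions chosen for illustration: a shortfall against the target pushes the multiplier above 1, a surplus pushes it below 1.

```python
import math
from typing import Iterable, Tuple


def pressure(predicted_score: float, target: float, scale: float = 30.0) -> float:
    """Map (target - predicted score) to a multiplier in (0.5, 1.5).

    Assumed form: 1 + 0.5 * tanh(shortfall / scale). Equals 1.0 when the
    predicted score matches the target.
    """
    shortfall = target - predicted_score
    return 1.0 + 0.5 * math.tanh(shortfall / scale)


def smart_runs(balls: Iterable[Tuple[int, float, float]]) -> float:
    """Sum runs scaled by the pressure at each ball.

    Each item is (runs_off_ball, predicted_score_before_ball, target).
    """
    return sum(runs * pressure(predicted, target)
               for runs, predicted, target in balls)
```

With this shape, a boundary hit when the side is forecast to fall well short of the target counts for more than four smart runs, while the same shot in a comfortable position counts for fewer.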

We will now look at some of the results from our suite of algorithms for the IPL 2019 season and the recently concluded ODI World Cup. We sample some interesting results and describe them briefly. One of the first successes of the forecaster tool in the World Cup came in a game between South Africa and Bangladesh. The forecaster predicted a final score of 335 for Bangladesh after 25 overs, and they went on to make 330 by the end of the innings. This was one of the early scores greater than 300 predicted by the forecaster.

In another Bangladesh match, against West Indies, the forecaster gave a thumbs-up for Bangladesh by the half-way mark with a win percentage of about 63%. At this point in the game, Bangladesh still had about 160 runs to score with three top-order batsmen gone. It turned out that the forecaster was right and the game ended in Bangladesh’s favor. Of course, there are also cases where the forecaster’s predictions did not turn out to be as accurate.

In terms of the luck index, there were several interesting results throughout the IPL season. Here, we point out a consolidated result in terms of the overall impact of luck as judged by the algorithms. Below, you will see two tables. One shows the actual standings at the end of the league games, where MI, CSK, DC and SRH were the top four teams and moved on to the play-offs; the other shows the standings with luck removed. If we were to remove all luck events from all the games, our algorithms predict that RR would have replaced CSK and gone on to the play-offs. Whatever you make of this result, one thing is for sure: this table will not make us popular with the CSK fans (no lucky guesses needed here!!).

Let us look at some results from the smartstats algorithms that were developed. We describe two prototypical results here, one for batsmen and one for bowlers. Let us look at what smartstats says about the performances of KL Rahul and M Agarwal in a KXIP v MI match. The pressure was high (the required run rate was over 10) when Mayank came out to bat. Mayank scored 43 off 21 balls and turned the match in Punjab's favor. During his partnership with Rahul, Mayank scored the bulk of the runs at a high strike rate and reduced the pressure of the required rate on Rahul and the other batsmen to follow. Though Rahul scored 28 runs more than Mayank, Mayank scored 1 smart run more than Rahul in the innings as judged by the smartstats algorithms (shown below).

From a bowling viewpoint, let us look at the performances of Axar Patel and Sandeep Sharma in a KXIP vs RCB game (IPL 2017). Sandeep Sharma and Axar Patel took three wickets each for Punjab. However, while Axar took the wickets of Shane Watson, Pawan Negi and Samuel Badree, Sandeep was the bowler who derailed RCB's chase with the wickets of Chris Gayle, Virat Kohli and de Villiers inside the Powerplay. Sandeep's three wickets were worth 4.86 in smart wickets; Axar's three were worth 2.85.

One of the fun aspects of this work has been the feedback from fans who followed the IPL and the ODI World Cup on the ESPNcricinfo website. Here is a representative collection of comments from the website. The first commenter has words of encouragement for the forecaster, and another is impressed by the forecaster’s early, precise prediction.

In the comment “has Forecaster seen this Sri Lanka team even bat” the commenter seems to be skeptical of the forecaster’s prediction of a big score for Sri Lanka. It turned out that the forecaster’s prediction in this case was quite accurate in the end. Of course, the comment following that shows the interest of fans in wanting direct access to the forecaster tool.

In summary, it was an incredible experience working at the intersection of data science and cricket, both of which are exciting domains. Let me end this blog with an answer to an interesting question that we pondered over when we started to build these algorithms. At what point in the game will we get the best predictions from our algorithms? Based on the performance of our algorithms in IPL 2019 and the ODI world cup matches, we see that for T20 games, the 11th over predictions seem to be best and for ODI, the 25th over predictions seem to be the best in terms of accuracy of the final score predicted. As we can see, this is right about in the middle of the game. This might be so because predictions towards the end are generally plagued by random errors (with not enough overs to average them) and predictions at the beginning might not have enough information about the current game to work with.
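A comparison of this kind could be run along the lines sketched below. The blog does not say which accuracy measure was used, so mean absolute error of the predicted final score is an assumption here, as are the function name and the record layout.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def best_prediction_over(
    records: Iterable[Tuple[int, float, float]]
) -> Tuple[int, Dict[int, float]]:
    """Find the over whose final-score predictions were most accurate.

    Each record is (over_of_prediction, predicted_final, actual_final).
    Returns the over with the lowest mean absolute error, plus the
    per-over errors for inspection.
    """
    errors: Dict[int, List[float]] = defaultdict(list)
    for over, predicted, actual in records:
        errors[over].append(abs(predicted - actual))
    if not errors:
        raise ValueError("no prediction records supplied")
    mae = {over: sum(errs) / len(errs) for over, errs in errors.items()}
    best = min(mae, key=mae.get)
    return best, mae
```

Feeding in one record per (match, over) pair across a season would yield a per-over error curve; the observation above corresponds to that curve having its minimum near the middle of the innings.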

About the author -

Raghunathan Rengaswamy is an Institute Chair Professor at the Department of Chemical Engineering and a core member of the Robert Bosch Center for Data Science and AI (RBC-DSAI) at IIT Madras. He is also a co-Founder and Director of Gyan Data Pvt. Ltd. (GDPL), GITAA Pvt. Ltd., and Elicius Energy. He was elected a Fellow of the Indian National Academy of Engineering in 2017.

Comments

Lipika Dey, November 26, 2019 at 9:22 PM
Very interesting! For a nation full of cricket lovers this is like Mannah from the Heavens - bound to liven up conversations. But more importantly it shows the way to build applications that need to combine knowledge from multiple sources and predict outcomes in a complex situation. Enjoyed it.
