How to Get Ahead of Your Competition with Data

In our exclusive roundtable discussion, we spoke with some of the most innovative experts in sports analytics. Here we learn about their career experience and cutting-edge technical expertise.
1. Train Your Algorithms

When it comes to boosting the accuracy of your analysis, the first step is training your algorithms by integrating data. Lorenzo Malanga, Head of Data Science at sports betting and data company Mercurius, explained their process.

“What we do is find patterns in historical data in order to gain an edge over the betting markets,” said Malanga. “Hence, we make use of Hudl data in our predictive algorithms to measure the skills of every team at every point in time. Without it, it would be harder to spot market biases and misjudgments.”

Of course, more accurate data tends to translate into more profitable outcomes and an advantage over competitors. But with predictions comes risk. Paolo Cintia, CEO and co-founder of player performance evaluation company PlayeRank, shared his thoughts on the link between human and machine skills.

“The key is explanation, as machine learning techniques use data to automatically generate new algorithms, capable of performing a task they have been trained to,” said Cintia. “Such algorithms tend to be black boxes: they work, they produce predictions, but the underlying motivations leading to that prediction are not completely transparent. This process of knowledge extraction and explanation enforces a cooperation between human and machine skills, heading towards a continuous refinement for both.”

Having the most up-to-date and accurate data is the best way to keep the accuracy of predictions high. But what about the value of using data from past seasons?

“The more past data, the better,” said Malanga. “We make extensive use of past seasons’ data to train our models and to backtest trading strategies.”

“By having more data, we can make more experiments without increasing the risk of ‘overfitting’, i.e., the common curse of ‘remembering’ past data rather than ‘learning’ from it. There is also a clear alignment between technical and commercial values: the deeper the historical set on which we tuned our algorithms, the more credible and robust our strategies become to our clients’ eyes.”

2. Remove the Blind Spots with Geographical Data

Cintia made the point that if companies plan to scale globally, extensive geographical data is key, especially in terms of scouting.

“For performance monitoring and scouting, having a wide geographical coverage is fundamental,” said Cintia. “We discover new players from the beginning of their careers, and this is very valuable for clubs. Data has to be machine readable (it’s not obvious, unfortunately) and should cover all the aspects of player and club performances: from matches to training load, financial aspects and media coverage.”

Cintia explained that with international players from all continents and world leagues becoming more popular in football, it’s now extremely important to have data available from leagues across the globe.

“Historical data allows us to find new players, and to extend the possibility of performance analysis and comparison,” said Cintia. “As an instance, we found that the player with most goals scored in a single game is a Chinese female player. Now we know that it’s possible to score nine goals in a single match: it was in a game that finished 16-0 to China over Turkmenistan. This could seem kind of fun, but we can do the same for plenty of similar metrics. It’s really valuable.”

Extensive geographical data is also very useful in terms of influencing algorithms and comparing data between different leagues.

“We tend to model each league and country as a universe per se,” said Malanga. “Of course, some analysis may benefit by grouping together different leagues’ data. But in general, an extensive geographical data coverage would simply mean a wider set of leagues for which we train predictive algorithms.”

3. Use Large Data Sets to Stay Ahead

Wyscout data can be combined with multiple other data providers to provide added statistical benefits. Meza explained the value of working with multiple data providers from the perspective of Twenty3 Sport.

“For us it’s not about ‘combining’ the data to improve algorithms or models. In fact we keep data from data providers in completely different silos,” said Meza. “The reason we work with multiple data providers is that we want to provide technology to make use of the data and extract insight from it for clients, and different clients have licenses with different data providers. Therefore we must tender for multiple data providers ourselves to increase our potential pool of customers.

“Also, we endeavor to set up our tools in a way where the details of the data are still available to the end users through our tools. For example, Hudl’s [Wyscout] ‘tags’ on event data can be used in our tools, to also empower the provider to innovate in their data, and for that to make its way to the end users.”

In terms of taking advantage of multiple data providers to get ahead of competitors, Cintia said, “[I] would say it allows us to imagine and develop better products.”

Malanga concluded, “It allows us to mitigate the risk of corrupted data, plus it means we are able to capture a higher level of detail, as different companies have different data compiling procedures.”

In certain countries, football is now flourishing where it isn’t traditionally the primary national sport, and the level of play is rising along with its revenue. A key example is Japan, where the J1 League and DAZN have closed an exclusive deal for the next 10 years.

When asked how important it is to enter these emerging markets, Meza said, “It’s definitely important, as if coverage of J1 League opens up commercial opportunities to our clients, for example content providers like DAZN, then us having those leagues within our coverage means we can tap into those customers and share the revenue.”

Cintia also sees big potential for commercial benefit in emerging markets. “We actually attracted the interest of some players from the Asian continent, and there is no doubt that many countries are fascinated by football, aiming at developing their clubs. We could be an opportunity for them, providing tools to speed up the process of decreasing the gap. [...] competitions would become harder, but the whole sector [...] Both market and sport would gain from this, and consequently grow and expand.”

“Coverage and scale are good friends of data-driven processes, there’s no doubt about that,” said Meza. “Tapping into that is crucial in providing value to customers, and therefore finding revenue opportunities.”
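
For readers who want to see Malanga's point about overfitting and backtesting on past seasons in concrete terms, here is a minimal Python sketch. It is not Mercurius' method: the data is synthetic, and every name in it (`simulate_season`, `Memorizer`, `ThresholdModel`, `walk_forward`) is hypothetical. It compares a model that simply "remembers" every past result with one that "learns" a single rule, training each on earlier seasons and scoring it on the next one.

```python
import random


def simulate_season(season, n_matches, rng):
    """Synthetic fixtures (an assumption): a strength gap and a home-win flag."""
    matches = []
    for i in range(n_matches):
        gap = rng.uniform(-1.0, 1.0)  # home strength minus away strength
        home_win = rng.random() < 0.5 + 0.4 * gap  # win chance rises with the gap
        matches.append({"id": (season, i), "gap": gap, "home_win": home_win})
    return matches


def accuracy(predict, matches):
    """Share of matches where predict(match) equals the real outcome."""
    return sum(predict(m) == m["home_win"] for m in matches) / len(matches)


class Memorizer:
    """'Remembers' past data: stores every result it has seen, keyed by match id."""

    def fit(self, matches):
        self.seen = {m["id"]: m["home_win"] for m in matches}
        # For unseen matches, fall back to the most common past outcome.
        self.default = 2 * sum(self.seen.values()) >= len(self.seen)
        return self

    def predict(self, match):
        return self.seen.get(match["id"], self.default)


class ThresholdModel:
    """'Learns' one parameter: the gap above which it backs the home side."""

    def fit(self, matches):
        grid = [t / 10 for t in range(-5, 6)]
        self.threshold = max(
            grid, key=lambda t: accuracy(lambda m: m["gap"] > t, matches)
        )
        return self

    def predict(self, match):
        return match["gap"] > self.threshold


def walk_forward(seasons, model_cls):
    """Backtest: train on all earlier seasons, then score on the next one."""
    rows = []
    years = sorted(seasons)
    for k in range(1, len(years)):
        train = [m for y in years[:k] for m in seasons[y]]
        test = seasons[years[k]]
        model = model_cls().fit(train)
        rows.append(
            (years[k], accuracy(model.predict, train), accuracy(model.predict, test))
        )
    return rows


rng = random.Random(0)
seasons = {year: simulate_season(year, 1000, rng) for year in range(2017, 2021)}
memo = walk_forward(seasons, Memorizer)
learned = walk_forward(seasons, ThresholdModel)

for (year, m_in, m_out), (_, t_in, t_out) in zip(memo, learned):
    print(f"{year}: memorizer in={m_in:.2f} out={m_out:.2f} | "
          f"threshold in={t_in:.2f} out={t_out:.2f}")
```

The memorizer is perfect on the seasons it was trained on and close to a coin flip on the season it has not seen, while the one-parameter model scores about the same in both cases. That gap between in-sample and out-of-sample accuracy is what a backtest on held-out seasons is meant to expose, and a deeper history gives more seasons to run it on.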
Hudl Magazine
September 2021