<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Central Winger]]></title><description><![CDATA[Occasional thoughts on Sports Analytics.]]></description><link>https://www.centralwinger.com</link><image><url>https://substackcdn.com/image/fetch/$s_!IoRW!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2e1b381-ba01-41fd-a38f-c24add0c3916_300x300.png</url><title>Central Winger</title><link>https://www.centralwinger.com</link></image><generator>Substack</generator><lastBuildDate>Wed, 06 May 2026 08:27:21 GMT</lastBuildDate><atom:link href="https://www.centralwinger.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Devin Pleuler]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[devinpleuler@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[devinpleuler@substack.com]]></itunes:email><itunes:name><![CDATA[Devin Pleuler]]></itunes:name></itunes:owner><itunes:author><![CDATA[Devin Pleuler]]></itunes:author><googleplay:owner><![CDATA[devinpleuler@substack.com]]></googleplay:owner><googleplay:email><![CDATA[devinpleuler@substack.com]]></googleplay:email><googleplay:author><![CDATA[Devin Pleuler]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Getting Stuck Into Lawn Signs]]></title><description><![CDATA[I took a picture of (almost) every political lawn sign in Davenport.]]></description><link>https://www.centralwinger.com/p/getting-stuck-into-lawn-signs</link><guid isPermaLink="false">https://www.centralwinger.com/p/getting-stuck-into-lawn-signs</guid><dc:creator><![CDATA[Devin Pleuler]]></dc:creator><pubDate>Mon, 02 
Jun 2025 12:02:19 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/7edf16af-0c46-4633-8bd3-da37da45b471_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote><p>If this is your first time here, I'd like to assure you that the blog name is a reference to an oxymoronic soccer position and not some edgy political position. </p></blockquote><p>This blog typically covers ideas in sports analytics, but over the last few months I have been on parental leave that has included long walks around the <a href="https://en.wikipedia.org/wiki/Davenport_(federal_electoral_district)">Davenport federal electoral district</a>. And in the lead-up to the <a href="https://en.wikipedia.org/wiki/2025_Canadian_federal_election">2025 Canadian Federal Election</a>, I observed three things that didn't quite add up and led me down a bit of a rabbit hole.</p><ul><li><p>In the previous 2021 Canadian Federal Election, Davenport was decided by just 76 votes on a turnout of 47,736, with the NDP candidate <strong>Alejandra Bravo</strong> narrowly losing to Liberal incumbent <strong>Julie Dzerowicz</strong>.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a></p></li><li><p>There seemed to be an incredible number of lawn signs on display for the NDP candidate <strong>Sandra Sousa</strong>.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a></p></li><li><p>The <a href="https://338canada.com/35022e.htm">338Canada</a> election forecasting website offered a <em>greater-than-99%</em> chance of the Liberal incumbent retaining her seat in Davenport.</p></li></ul><p>With my anecdotal evidence of increased NDP enthusiasm in a district that was previously decided by a razor-thin margin, it felt pretty strange that the leading forecast website suggested such a landslide in favour of 
the Liberal candidate.</p><p>My hypothesis was that the forecasts were grafting too much of the country-wide shift toward the Liberals onto districts that had strong NDP bases of support.</p><p>And this might be a good time to remind the readers that I am not a political scientist, just a computer scientist with an unreasonable level of confidence<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a>.</p><p>There are two additional dynamics that led me to formulate this hypothesis:</p><ul><li><p>It is difficult to perform riding-level polling in Canada because, unlike in the United States, area codes don't reliably correspond to electoral districts, so you can't simply call random numbers in an area code and gather a representative sample.</p></li><li><p>The multi-party system, coupled with first-past-the-post voting, causes situations where informed voters can rationally be motivated to vote strategically. This effect seems really difficult to reliably forecast.</p></li></ul><p>Obviously, my hypothesis was eventually proven <strong>very wrong</strong>, but I set out to capture some additional data that I thought might help illuminate the political dynamics in the riding and could perhaps be useful when forecasting future elections.</p><p>Over the week leading up to the election, my son and I strolled <strong>65 kilometers</strong> up and down the Davenport federal electoral district taking pictures of every single lawn sign that we passed. 
We also visited 8 parks &#8211; he loves the swing.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a></p><p>After realizing that there wasn&#8217;t enough time to walk <em>every</em> single street in the riding<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a>, we decided to focus on all of the largely residential North-South streets<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-6" href="#footnote-6" target="_self">6</a>, hoping that there isn't much political bias between the populations that live on North-South versus East-West streets.</p><p>A <a href="https://www.toronto.ca/legdocs/mmis/2024/bu/bgrd/backgroundfile-242407.pdf">2024 budget document from the City of Toronto</a> estimated that there were about 5,600 km of roads in the entire city. Assuming the density of roads per square km is roughly consistent across the city, Davenport should have about 107 km of roads.</p><p>This back-of-the-napkin math suggests that, at 65 km walked out of roughly 107 km, I covered somewhere around 60 percent of all the roads in Davenport. 
Since I largely walked North-South, and given the unusual shape of the electoral district, that sounds about right to me.</p><p>Using these rough ratios, I extrapolated my observed signs to estimate the number of votes that each candidate earned per sign.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-7" href="#footnote-7" target="_self">7</a></p><div id="datawrapper-iframe" class="datawrapper-wrap outer" data-attrs="{&quot;url&quot;:&quot;https://datawrapper.dwcdn.net/D3XlO/1/&quot;,&quot;thumbnail_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5052dd58-01ca-472a-8791-6d2f8a54b47d_1260x660.png&quot;,&quot;thumbnail_url_full&quot;:&quot;&quot;,&quot;height&quot;:400,&quot;title&quot;:&quot;| Created with Datawrapper&quot;,&quot;description&quot;:&quot;Create interactive, responsive &amp; beautiful charts &#8212; no code required.&quot;}" data-component-name="DatawrapperToDOM"><iframe id="iframe-datawrapper" class="datawrapper-iframe" src="https://datawrapper.dwcdn.net/D3XlO/1/" width="730" height="400" frameborder="0" scrolling="no"></iframe><script type="text/javascript">!function(){"use strict";window.addEventListener("message",(function(e){if(void 0!==e.data["datawrapper-height"]){var t=document.querySelectorAll("iframe");for(var a in e.data["datawrapper-height"])for(var r=0;r<t.length;r++){if(t[r].contentWindow===e.source)t[r].style.height=e.data["datawrapper-height"][a]+"px"}}}))}();</script></div><p>Any statistical test that you throw at these numbers will easily and confidently assure you that the sample that I observed was a very poor representation of the actual voting population. I anticipated that there was going to be an enthusiasm gap between each of these parties, but not something so extreme!</p><p>There are some unlikely yet realistic explanations for this enthusiasm gap other than political peculiarities. Maybe the Liberal party ran out of signs? 
Perhaps the Conservative party decided that it wasn't worth printing many at all (like the Green Party)? Those constraints would certainly skew the observed numbers.</p><p>Each sign is internally traceable by the distributing political party because it is their responsibility to ensure that political signs are removed promptly following election day. I'd be curious how closely my numbers match up with their actuals.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-8" href="#footnote-8" target="_self">8</a></p><p>When you take a picture on a modern mobile device, the GPS location is logged alongside other metadata at the time of capture. After categorizing each of the signs<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-9" href="#footnote-9" target="_self">9</a>, I could plot them by location. Here's the distribution of Davenport lawn signs!<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-10" href="#footnote-10" target="_self">10</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zdr2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f951235-d550-4122-82ca-3aafc828a3f3_1463x2350.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zdr2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f951235-d550-4122-82ca-3aafc828a3f3_1463x2350.png 424w, https://substackcdn.com/image/fetch/$s_!zdr2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f951235-d550-4122-82ca-3aafc828a3f3_1463x2350.png 848w, 
https://substackcdn.com/image/fetch/$s_!zdr2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f951235-d550-4122-82ca-3aafc828a3f3_1463x2350.png 1272w, https://substackcdn.com/image/fetch/$s_!zdr2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f951235-d550-4122-82ca-3aafc828a3f3_1463x2350.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zdr2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f951235-d550-4122-82ca-3aafc828a3f3_1463x2350.png" width="1456" height="2339" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5f951235-d550-4122-82ca-3aafc828a3f3_1463x2350.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2339,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:934019,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.centralwinger.com/i/164755789?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f951235-d550-4122-82ca-3aafc828a3f3_1463x2350.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zdr2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f951235-d550-4122-82ca-3aafc828a3f3_1463x2350.png 424w, 
https://substackcdn.com/image/fetch/$s_!zdr2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f951235-d550-4122-82ca-3aafc828a3f3_1463x2350.png 848w, https://substackcdn.com/image/fetch/$s_!zdr2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f951235-d550-4122-82ca-3aafc828a3f3_1463x2350.png 1272w, https://substackcdn.com/image/fetch/$s_!zdr2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f951235-d550-4122-82ca-3aafc828a3f3_1463x2350.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>The spatial data paints a very clear picture &#8211; Davenport is far from a political monolith. There are pockets where political preference skews in different directions.</p><p>Some observations:</p><h3>Estimates are Very Rough.</h3><p>We&#8217;re a month past the election and Elections Canada still hasn&#8217;t released poll-by-poll data, nor have I been able to find updated boundaries for the polling divisions since the redistricting. With this data, you could repeat this riding-level analysis at the polling-division level and construct more precise estimates. Would anyone be interested in that?</p><h3>Davenport is Big.</h3><p>There are quite a few regions of the riding that I didn&#8217;t quite get to, particularly in the more hilly northern areas and western industrialized sectors. This probably biased the numbers a little bit. There were entire polling divisions that I didn&#8217;t set foot in.</p><p>Below is a map of the North-South streets that I covered, highlighted in purple.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Tsq3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff557a77f-3910-40f0-a200-1daa48e6841a_1463x2350.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Tsq3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff557a77f-3910-40f0-a200-1daa48e6841a_1463x2350.png 424w, https://substackcdn.com/image/fetch/$s_!Tsq3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff557a77f-3910-40f0-a200-1daa48e6841a_1463x2350.png 
848w, https://substackcdn.com/image/fetch/$s_!Tsq3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff557a77f-3910-40f0-a200-1daa48e6841a_1463x2350.png 1272w, https://substackcdn.com/image/fetch/$s_!Tsq3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff557a77f-3910-40f0-a200-1daa48e6841a_1463x2350.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Tsq3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff557a77f-3910-40f0-a200-1daa48e6841a_1463x2350.png" width="1456" height="2339" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f557a77f-3910-40f0-a200-1daa48e6841a_1463x2350.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2339,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:695803,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.centralwinger.com/i/164755789?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff557a77f-3910-40f0-a200-1daa48e6841a_1463x2350.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Tsq3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff557a77f-3910-40f0-a200-1daa48e6841a_1463x2350.png 424w, 
https://substackcdn.com/image/fetch/$s_!Tsq3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff557a77f-3910-40f0-a200-1daa48e6841a_1463x2350.png 848w, https://substackcdn.com/image/fetch/$s_!Tsq3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff557a77f-3910-40f0-a200-1daa48e6841a_1463x2350.png 1272w, https://substackcdn.com/image/fetch/$s_!Tsq3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff557a77f-3910-40f0-a200-1daa48e6841a_1463x2350.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h3>Putting a Sign on Your Lawn is a Social Act.</h3><p>Just north of Geary, I walked along two practically identical streets. One had a half dozen signs proudly stuck in the grass. On the other, absolutely none. We&#8217;re working with low sample sizes here, but I can&#8217;t imagine one street is statistically less inclined to be politically active. The houses are the same age. The sidewalks the same width. The yards no more or less sign-friendly.</p><p>What this really illustrates is how public displays of political support &#8212; like putting up a lawn sign &#8212; are not just expressions of individual belief, but also social acts influenced by what others around us are doing. It doesn&#8217;t take much imagination to picture the first sign going up and slowly emboldening the neighbours. Maybe someone sees it while getting the mail. Maybe they bring it up in a chat on the sidewalk. Whatever the path, there's a contagious quality to it.</p><p>This sort of <em>peer signalling</em> is well documented in social science: humans take cues from their environment to decide what&#8217;s acceptable or encouraged. Lawn signs aren&#8217;t just about supporting a candidate &#8212; they&#8217;re about being seen supporting a candidate. And that visibility becomes permission for others to do the same.</p><p>On the other street, perhaps no one wanted to be first. The absence of signs isn&#8217;t necessarily apathy &#8212; it may just be inertia. Without a visible cue, the feedback loop never gets started.</p><p>It&#8217;s a small example, but it hints at something powerful: much of what we think of as political engagement happens at the intersection of personal conviction and public perception. 
All it might take to tip a street from silent to orange is one neighbour and a garden stake.</p><h3>Redistricted Areas Seemed Different.</h3><p>Davenport went through a redistricting process in 2023, adding a few square kilometers in both the northeast and the northwest corners of the riding, plus a small chunk in the southeast.</p><p>These areas had demonstrably fewer signs per kilometer covered, and they even had a few signs on display for candidates in their old riding.</p><p>The new voters in the added regions also skewed against the NDP. As a part of the redistricting process, Elections Canada created a <a href="https://www.elections.ca/content.aspx?section=res&amp;dir=rep/tra/2023rep&amp;document=index&amp;lang=e">report</a> that &#8220;<em>summarizes the transposition of the results of the 44th general election in 2021 to the new boundaries established by the 2023 decennial Federal Electoral District (FED) boundary readjustment process.</em>&#8221;</p><p>Under the new boundaries, the transposed 2021 results turn that narrow 76-vote margin into a wider gap of 2,144 votes. And looking at the map above, you can see why. The signs in the northern extremities of the district skew more conservative.</p><h3>Wear Sunscreen!</h3><p>And reapply regularly. Even when it&#8217;s not too sunny out.</p><h3>What I&#8217;d do Next Time.</h3><p>With the exception of the more northern reaches of the district, signs were probably too dense to reliably catch them all unless you were walking. In theory you could do this in a car or on a bike, but you would be stopping too frequently and your visibility would be too constrained to be confident that you&#8217;re being exhaustive. </p><p>On one-way streets, it&#8217;s important to walk in the same direction as vehicular traffic. Signs are placed relatively strategically to make them more noticeable for drivers. When I was walking against the flow, I frequently had to back-track. 
</p><p>Because one-way streets are often interleaved in an alternating fashion, you can stay in the correct direction of flow with a little strategic mapping beforehand.</p><p>And this doesn&#8217;t need to be a one-dad-and-toddler exercise. If you can reliably de-dupe the data, I think this could be easily parallelized across a dozen canvassers in sneakers to capture the entire riding in a day.</p><h3>What&#8217;s Next?</h3><p>This exercise was never meant to predict the outcome of the 2025 election in Davenport &#8212; and it didn&#8217;t. But that doesn&#8217;t mean it wasn&#8217;t worthwhile. What I&#8217;ve collected is a snapshot: a record of visible political enthusiasm, gathered systematically and (somewhat obsessively) on foot. It&#8217;s not representative, and it&#8217;s certainly not predictive. But it <em>is</em> something to compare against in the future.</p><p>If I repeat this walk in a future election, these numbers &#8212; enthusiasm rates, spatial clustering, sign density &#8212; become a baseline. A kind of hyper-local prior. And while lawn signs alone won&#8217;t forecast an outcome, they might illuminate shifts in confidence, messaging, or engagement that broader models miss.</p><p>In districts like Davenport, where multi-party dynamics and strategic voting have genuine potential to influence outcomes, even small sources of local insight can matter. I suspect that uncertainty about how close a race truly is often leads voters to adopt a more cautious approach.</p><p>Having better tools for observing and interpreting neighbourhood-level political dynamics won&#8217;t eliminate that uncertainty. 
But it might shrink it, and that&#8217;s progress.</p><div><hr></div><h3>Sports Overlap.</h3><blockquote><p>A quick epilogue for the sports nerds that somehow made it this far &#8211;</p></blockquote><p>From my very limited vantage point, political campaigns have certain similarities to professional sport management. Political strategists, like team operations staff, are regularly making tactical decisions amid the fog of incomplete information while under some sort of time pressure. It is difficult to cultivate genuine competitive advantage in these <em>wicked learning environments</em><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-11" href="#footnote-11" target="_self">11</a>.</p><p>A framework that I find useful for thinking about competitive advantage in these environments categorizes it into three streams: <em>informational, analytical, and behavioural</em><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-12" href="#footnote-12" target="_self">12</a>.</p><p>The data in this post represents a potential <em>informational</em> advantage. Since lawn signs are only on display for a limited time, this information is ephemeral unless systematically captured. 
And in my experience, informational advantages are both the rarest and most valuable in applied settings.</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>It's also worth noting that the most recent Ontario General Election was held in February 2025 (i.e., 3 months ago!) and the NDP candidate <strong>Marit Stiles</strong> defeated the Liberal candidate by a whopping 37 percentage points with a slightly reduced turnout of 39k voters.</p><p>And <strong>Alejandra Bravo</strong> went on to win the 2022 Toronto Municipal Election for Davenport Ward 9 City Councillor with 70 percent of the vote!</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>We don&#8217;t have much of a lawn, but we did have a sign.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>I am an American who is relatively new to the Canadian political system. It's probably not a coincidence that some of this increased civic interest came during a period when I was studying for my Canadian citizenship test. 
I passed!</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p>Walking Log:</p><ul><li><p>Tuesday, April 22nd &#8211; 14.3km</p></li><li><p>Wednesday, April 23rd &#8211; 12.5km</p></li><li><p>Thursday, April 24th &#8211; 13km</p></li><li><p>Monday, April 28th (Election Day) &#8211; <strong>25km (!)</strong></p></li></ul></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-5" href="#footnote-anchor-5" class="footnote-number" contenteditable="false" target="_self">5</a><div class="footnote-content"><p><em>Riding</em> is an informal Canadian word for Electoral District.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-6" href="#footnote-anchor-6" class="footnote-number" contenteditable="false" target="_self">6</a><div class="footnote-content"><p>Of course, this entirely excludes populations that live in some of the higher-density condos in the district. While this is a minority of the population, I don&#8217;t think it&#8217;s a safe assumption that their political preferences are proportional to the rest of the riding. 
You can see this clearly in some of the historical poll-by-poll data.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-7" href="#footnote-anchor-7" class="footnote-number" contenteditable="false" target="_self">7</a><div class="footnote-content"><p>Estimated Signs = Observed Signs / 0.60</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-8" href="#footnote-anchor-8" class="footnote-number" contenteditable="false" target="_self">8</a><div class="footnote-content"><p>In Toronto, election signs must be removed within 72 hours after voting day, as required by <em><a href="https://www.toronto.ca/legdocs/municode/1184_693.pdf">Toronto Municipal Code Chapter 693</a></em>. If signs are not taken down in time, the City may remove them and charge the candidate or party <strong>$25 per sign</strong>, deducted from their election sign deposit. This enforcement mechanism, enabled by the <em>Municipal Elections Act, 1996</em>, gives parties a strong incentive to track their signs internally.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-9" href="#footnote-anchor-9" class="footnote-number" contenteditable="false" target="_self">9</a><div class="footnote-content"><p>As a small side-side-project, I trained a tiny image classifier on top of <a href="https://huggingface.co/docs/transformers/model_doc/mobilenet_v2">MobileNetV2</a> that got about 95% accuracy on out-of-sample images, but I ended up just labelling everything by hand since it only took an hour. 
It was only 743 images.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-10" href="#footnote-anchor-10" class="footnote-number" contenteditable="false" target="_self">10</a><div class="footnote-content"><p>These plots are constructed with <a href="https://www.openstreetmap.org/">OpenStreetMap</a> data, utilizing Python tools like <a href="https://github.com/gboeing/osmnx">osmnx</a>, <a href="https://github.com/shapely/shapely">shapely</a>, and <a href="https://github.com/geopandas/geopandas">geopandas</a>. Much of this code was written at the <a href="https://1rg.space/">1RG</a> Side Project Social. I used the <a href="https://apps.apple.com/us/app/arc-timeline-trips-places/id1063151918">Arc Timeline</a> iPhone app to record which streets I walked and <a href="https://github.com/myles/arc-to-sqlite">arc-to-sqlite</a> to extract the data.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-11" href="#footnote-anchor-11" class="footnote-number" contenteditable="false" target="_self">11</a><div class="footnote-content"><p>The term <em>wicked learning environment</em> was introduced by cognitive psychologist Robin Hogarth to describe settings where feedback is misleading, delayed, or incomplete&#8212;making it difficult to learn accurate patterns or develop reliable intuition. 
This contrasts with <em>kind learning environments</em>, where feedback is immediate and reliable, like in chess or golf.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-12" href="#footnote-anchor-12" class="footnote-number" contenteditable="false" target="_self">12</a><div class="footnote-content"><p>This framework was introduced by Davenport, Harris, and Morison in <em>Analytics at Work: Smarter Decisions, Better Results (Harvard Business Press, 2010)</em>, where they outline how organizations build competitive advantage through data-informed behaviour, superior information access, and advanced analytical capabilities.</p><p></p></div></div>]]></content:encoded></item><item><title><![CDATA[Tracking Data was a Red Herring]]></title><description><![CDATA[It's not anymore.]]></description><link>https://www.centralwinger.com/p/tracking-data-was-a-red-herring</link><guid isPermaLink="false">https://www.centralwinger.com/p/tracking-data-was-a-red-herring</guid><dc:creator><![CDATA[Devin Pleuler]]></dc:creator><pubDate>Mon, 31 Mar 2025 10:31:43 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad7af473-c742-4de9-8e16-1ba7398848da_889x605.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Tracking data has required an immense amount of research and development to yield dividends that were far from certain at the outset. For years, it served as a dangerous distraction from other more important topics that soccer analysts should have been attending to.</p><p>It&#8217;s not anymore. 
Broadcast tracking data has changed the calculus of resource allocation inside of soccer organizations and practical tracking data research is no longer exclusive to academic circles.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.centralwinger.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.centralwinger.com/subscribe?"><span>Subscribe now</span></a></p><p>It had potential in the High Performance and Sport Science discipline, but those practitioners typically prefer accelerometer and GPS-based player monitoring systems. This is for good reason &#8212; wearables are used identically in both game and practice settings, granting the luxury of reasonable apples-to-apples comparisons across environments.</p><p>In the Performance Analysis and Video discipline, it can absolutely give you a much deeper understanding of how your team and opponent behave tactically and quantify game-model relevant measures. But the dividends here have a low ceiling. And the floor is gradually being lifted by genuinely good in-game tooling offered across the industry.</p><p>The last team operations discipline is Player Recruitment. With a few notable exceptions, scouting departments could not acquire tracking data on a majority of players they were monitoring and therefore couldn&#8217;t take full advantage of the tracking data research that was burgeoning within the performance analysis or sport science domains. Without this economy of scale, truthfully, tracking data wasn&#8217;t really a compelling investment on an organizational level<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a>.</p><p>However, tracking data sings a siren song. 
The richness of the data can be irresistible to someone who is fascinated with how the game functions at an atomic level and has the technical skills to potentially unlock secrets. I&#8217;ve learned a lot from working with tracking data, but I have also paid an opportunity cost. </p><p>The primary value of tracking data research has been indirect. It allows you to demonstrate an understanding of the game and deep technical competency, which opens other more important doors. Because of this opinion, I&#8217;ve generally been very open with colleagues about what I&#8217;m working on with tracking data.</p><p>The world is changing rapidly with Broadcast Tracking data bursting upon the scene, widening the access to this spatial player data to any forward-thinking team. Suddenly, there is enough elbow-room to compete and potential motivation to speak less freely. But for the reasons discussed in a <a href="https://www.centralwinger.com/p/unexpected-origins-and-the-fermi">Previous Blog Post</a><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a>, I still believe this industry will benefit from a default of transparency.</p><div><hr></div><p>Hungarian soccer plays a foundational role in Jonathan Wilson&#8217;s frustratingly verbose <em>Inverting the Pyramid</em>. It&#8217;s one of the central threads in the book&#8217;s narrative about the evolution of football tactics &#8211; especially in how the game transitioned from fixed-to-fluid<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a> systems of play. 
Coincidentally, Hungarians also have invented some useful mathematics to demonstrate the concept.</p><p>I recently delivered a guest lecture on the <em>Foundational Methods for (Soccer) Tracking Data</em> and the concept that I concentrated on was the <a href="https://en.wikipedia.org/wiki/Hungarian_algorithm">Hungarian Algorithm</a> (i.e. Kuhn-Munkres). I think it&#8217;s one of the most under-utilized tools for evaluating player tracking data.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zXN5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad7af473-c742-4de9-8e16-1ba7398848da_889x605.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zXN5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad7af473-c742-4de9-8e16-1ba7398848da_889x605.png 424w, https://substackcdn.com/image/fetch/$s_!zXN5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad7af473-c742-4de9-8e16-1ba7398848da_889x605.png 848w, https://substackcdn.com/image/fetch/$s_!zXN5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad7af473-c742-4de9-8e16-1ba7398848da_889x605.png 1272w, https://substackcdn.com/image/fetch/$s_!zXN5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad7af473-c742-4de9-8e16-1ba7398848da_889x605.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!zXN5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad7af473-c742-4de9-8e16-1ba7398848da_889x605.png" width="889" height="605" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ad7af473-c742-4de9-8e16-1ba7398848da_889x605.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:605,&quot;width&quot;:889,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zXN5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad7af473-c742-4de9-8e16-1ba7398848da_889x605.png 424w, https://substackcdn.com/image/fetch/$s_!zXN5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad7af473-c742-4de9-8e16-1ba7398848da_889x605.png 848w, https://substackcdn.com/image/fetch/$s_!zXN5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad7af473-c742-4de9-8e16-1ba7398848da_889x605.png 1272w, https://substackcdn.com/image/fetch/$s_!zXN5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad7af473-c742-4de9-8e16-1ba7398848da_889x605.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container 
restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>For our purposes, it can be used to compare two tracking frames. In the above visual<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a>, we are comparing the location of home players in the 350th and 800th frame<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a> of <a href="https://github.com/metrica-sports/sample-data">Metrica&#8217;s sample data</a>. 
After performing a team centroid normalization, the algorithm solves the one-to-one assignment between the two sets of points by minimizing the total (in this case<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-6" href="#footnote-6" target="_self">6</a>) Euclidean distance. You can interpret this assignment cost as a crude measure of similarity<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-7" href="#footnote-7" target="_self">7</a>.</p><p>If this is at all unclear, I think the visual is very intuitive: we&#8217;re finding the combination of pairs which minimizes the combined length of the dotted lines.</p><p>As it turns out, this is also an accidental solution to the pesky tracking data preprocessing issue of player-order invariance!</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!THyE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e47f988-864d-4653-93f6-240d2fa833e3_2148x1396.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!THyE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e47f988-864d-4653-93f6-240d2fa833e3_2148x1396.png 424w, https://substackcdn.com/image/fetch/$s_!THyE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e47f988-864d-4653-93f6-240d2fa833e3_2148x1396.png 848w, https://substackcdn.com/image/fetch/$s_!THyE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e47f988-864d-4653-93f6-240d2fa833e3_2148x1396.png 1272w, 
https://substackcdn.com/image/fetch/$s_!THyE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e47f988-864d-4653-93f6-240d2fa833e3_2148x1396.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!THyE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e47f988-864d-4653-93f6-240d2fa833e3_2148x1396.png" width="1456" height="946" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6e47f988-864d-4653-93f6-240d2fa833e3_2148x1396.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:946,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:368354,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.centralwinger.com/i/160203922?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e47f988-864d-4653-93f6-240d2fa833e3_2148x1396.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!THyE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e47f988-864d-4653-93f6-240d2fa833e3_2148x1396.png 424w, https://substackcdn.com/image/fetch/$s_!THyE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e47f988-864d-4653-93f6-240d2fa833e3_2148x1396.png 848w, 
https://substackcdn.com/image/fetch/$s_!THyE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e47f988-864d-4653-93f6-240d2fa833e3_2148x1396.png 1272w, https://substackcdn.com/image/fetch/$s_!THyE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e47f988-864d-4653-93f6-240d2fa833e3_2148x1396.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Tracking data gives you a set of player coordinates per frame, but it doesn&#8217;t tell you which coordinate belongs to which role 
or position. The ordering is arbitrary (ish). This creates a major problem when trying to feed tracking data into a machine learning model.</p><p>Machine learning models expect consistent, structured input &#8211; feature vectors where each element has a stable semantic meaning. But with unordered player positions, you don&#8217;t know whether the <em>n</em>th coordinate pair represents a center back or a left winger.</p><p>This means you can&#8217;t just flatten the coordinates and plug them into a model &#8211; the same player could be in a different column from frame to frame. Any learned patterns would be unstable or misleading.</p><p>My proposed workaround is to compare each frame to a static positional template &#8211; a canonical formation, for example &#8211; using the same assignment, and extract the displacement vectors<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-8" href="#footnote-8" target="_self">8</a> for each role. This gives you a fixed-order, interpretable feature space: how each positional role deviates from the template at that moment in time.</p><p>There are a few established methods for addressing player order invariance, such as:</p><ul><li><p><a href="https://www.kaggle.com/c/nfl-big-data-bowl-2020/discussion/119400">Convolution with a 1x1 kernel on dense tensors of relative player positions</a><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-9" href="#footnote-9" target="_self">9</a>.</p></li><li><p><a href="https://arxiv.org/abs/2010.10202">Sparse matrices that represent a coarsened field</a>.</p></li><li><p><a href="https://github.com/UnravelSports/UnravelSports?tab=readme-ov-file">Graph neural networks with players as nodes and relative metrics as edges</a>, which I suspect is the way to go.</p></li></ul><p>Each of these is a genuine contribution to the field of sports analytics, and each is superior to my suggested approach above for the purposes of actual applied 
situations. But this quick and dirty approach can get you pretty far toward answering a variety of different questions.</p><p>Anyway, this was a bit of a meandering post across both industry commentary and tracking modelling minutiae. Subscribe if you want more occasional parental-leave sleep-deprived scrawls.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.centralwinger.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.centralwinger.com/subscribe?"><span>Subscribe now</span></a></p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>Ian Graham implies a similar insight in his recent book <em>How to Win the Premier League</em> when he mentions that his first exposures to Will Spearman&#8217;s presentations on Pitch Control were &#8220;<em>the first times I&#8217;d seen anyone doing anything sensible with tracking data</em>&#8221;.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>Which was actually referenced in a <a href="https://dtradke.github.io/aamas25_simtracking.html">Cool New Paper</a> by David Radke. 
I&#8217;m blushing.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p><em><a href="https://github.com/devinpleuler/research/blob/master/frame-by-frame-position.md">Fixed to Fluid: Frame-by-Frame Role Classification</a></em> &#8211; thematically similar prior research.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p>Per usual, I&#8217;m utilizing <a href="https://mplsoccer.readthedocs.io/en/latest/index.html">mplsoccer</a> and <a href="https://kloppy.pysport.org/">kloppy</a> to produce these visuals.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-5" href="#footnote-anchor-5" class="footnote-number" contenteditable="false" target="_self">5</a><div class="footnote-content"><p>At 25 fps, these frames are 18 seconds apart, so it&#8217;s unsurprising that they&#8217;re pretty similar &#8211; but you can easily find even closer matches across larger time spans.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-6" href="#footnote-anchor-6" class="footnote-number" contenteditable="false" target="_self">6</a><div class="footnote-content"><p>This doesn&#8217;t have to be Euclidean! In fact, in Shaw and Glickman&#8217;s <em><a href="https://www.sportperformanceanalysis.com/s/Dynamic-analysis-of-team-strategy-in-professional-football-By-Laurie-Shaw-And-Mark-Glickman.pdf">Dynamic analysis of team strategy in professional football</a></em>, they utilize the Hungarian Algorithm with <a href="https://en.wikipedia.org/wiki/Wasserstein_metric">Wasserstein distance</a> to compare distributions of player positions! 
Really super smart.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-7" href="#footnote-anchor-7" class="footnote-number" contenteditable="false" target="_self">7</a><div class="footnote-content"><p>This is a sneaky-useful way of detecting and filtering set piece situations (or goal celebrations) because the assignment cost explodes in these unusual moments.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-8" href="#footnote-anchor-8" class="footnote-number" contenteditable="false" target="_self">8</a><div class="footnote-content"><p>In this example, I&#8217;ve used &#916;X and &#916;Y, but you could just as easily use a magnitude and angle of displacement &#8211; which might actually be better.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-9" href="#footnote-anchor-9" class="footnote-number" contenteditable="false" target="_self">9</a><div class="footnote-content"><p>Honestly, I still don&#8217;t really understand how this works. 
Anyone want to write a guest post that explains it?</p></div></div>]]></content:encoded></item><item><title><![CDATA[Big Data, Tiny Teams]]></title><description><![CDATA[Introducing streamlit-soccer and some thoughts on team tools]]></description><link>https://www.centralwinger.com/p/big-data-tiny-teams</link><guid isPermaLink="false">https://www.centralwinger.com/p/big-data-tiny-teams</guid><dc:creator><![CDATA[Devin Pleuler]]></dc:creator><pubDate>Mon, 13 Jan 2025 13:03:22 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/8080aab7-e5f6-4bd9-9e61-1e21b53b95f6_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>One of my favourite soccer analytics blog posts over the last few years is Ben Torvaney&#8217;s <em><strong><a href="https://www.statsandsnakeoil.com/2021/05/28/tools-for-tiny-teams/">Tools for Tiny Teams</a></strong></em><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a>. Even though many soccer analytics departments  have outgrown the &#8220;tiny&#8221; label, his insights remain strikingly relevant.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.centralwinger.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.centralwinger.com/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><h4>Boring Web Apps</h4><p>In <em><strong>Tools for Tiny Teams</strong></em>, Ben highlights the value of &#8220;<em>Boring Web Apps</em>&#8221; &#8211; focusing on well-supported frameworks like Flask or Django, building on Postgres databases, and generally avoiding flashy but fragile new tech. 
He calls out how small technical teams can gain huge efficiencies by sticking to stable tools with excellent documentation and community support.</p><p>For tiny analytics squads, this approach is especially handy because you can rapidly prototype new tools while trusting that you won&#8217;t have to debug an obscure integration issue at 3 AM after a weekend game.</p><p>As your team grows (or even if it doesn&#8217;t), these practices continue to pay dividends. But it&#8217;s not just about &#8220;tiny teams&#8221; &#8211; it&#8217;s also about how we handle data in a context where the user base is often surprisingly small.</p><div><hr></div><h4>Big Data, Tiny Users</h4><p>A related observation I&#8217;ve made in modern sports analytics is the paradox of big data, tiny user base. Typically, in most industries, the scale of an application&#8217;s data volume increases proportionally with its user base. That assumption underpins most web technology.</p><p>However, in sports, you might juggle massive amounts of tracking data &#8211; say, 30 FPS for entire leagues &#8211; while only serving insights to a relatively tiny operations staff. This inverts many common engineering assumptions and creates a unique opportunity to rethink industry standards and prioritize data analysis over unnecessary bells and whistles.</p><p>That&#8217;s where <strong>Streamlit</strong> excels. It sacrifices features that are crucial for high-concurrency, many-user applications in favour of attributes that make rapid, data-driven prototyping much simpler.</p><p>While Ben doesn&#8217;t mention Streamlit in <em><strong>Tools for Tiny Teams</strong></em>, I suspect it would be included in a future update. 
The framework shares many of the same principles: it lets you build fully functional applications without stepping outside the comfortable data science stack of Pandas, NumPy, and Matplotlib.</p><div><hr></div><h4>Introducing <code>streamlit-soccer</code></h4><p>However, soccer is a spatial invasion game and tracking data has become increasingly ubiquitous inside team analytics departments. Visualizing this data has always been a major pain point.</p><p>Native Python libraries like <strong>mplsoccer</strong> have been great for static tracking data plots, but animations are painful. Writing a front-end in something like <strong>D3.js</strong> is the natural next step, but it adds a whole layer of complexity and probably a ton of technical debt.</p><p>That&#8217;s why I built <strong>streamlit-soccer</strong>. It&#8217;s a custom React component for Streamlit applications that visualizes soccer tracking data. It&#8217;s built on top of <strong><a href="https://bsky.app/profile/probberechts.bsky.social">Pieter Robberechts</a></strong>&#8217;s <strong><a href="https://github.com/probberechts/d3-soccer">d3-soccer</a></strong> package and Streamlit&#8217;s <strong><a href="https://github.com/streamlit/component-template">custom-component</a></strong> library.</p><p>You can find it on <strong><a href="https://github.com/devinpleuler/streamlit-soccer">GitHub</a></strong> and a deployed example on <strong><a href="https://st-soccer.streamlit.app/">Streamlit Cloud</a></strong>. 
And a video, below (Substack doesn&#8217;t allow me to embed it natively):</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;1c466ab8-1ef7-404a-a271-a460e892fc78&quot;,&quot;duration&quot;:null}"></div><p>I&#8217;ve also got it on <strong><a href="https://pypi.org/project/streamlit-soccer/">PyPI</a></strong> so you can install it like this:</p><pre><code>pip install streamlit-soccer</code></pre><p>Currently, <strong>streamlit-soccer</strong> is in version <code>0.0.1</code>, so it&#8217;s admittedly fragile and doesn&#8217;t do much beyond basic tracking animations. But if your workflow involves hefty tracking data and minimal external users, it could be precisely what you need to visualize tracking data without having to write a lick of JavaScript!</p><p>Right now, it only supports one-way communication from Streamlit (Python) to React (JavaScript), but it can support two-way message passing with minimal adjustments.</p><p>For example, you can attach event listeners to the player nodes to make them draggable. After dragging a player, you could feed the adjusted tracking frame back into Python and have your pickled pitch-control model calculate a new surface and re-render.</p><div><hr></div><h4>What&#8217;s Next?</h4><p>Since <strong>streamlit-soccer</strong> is so early in its lifecycle, I&#8217;m actively seeking contributors who want to build out new features or help stabilize it. Whether you&#8217;ve got ideas for advanced controls, event overlays, or performance optimizations, I&#8217;d love your input. 
Drop by the <a href="https://github.com/devinpleuler/streamlit-soccer">GitHub repo</a> and open an issue or pull request.</p><p>I&#8217;m planning on using <strong>streamlit-soccer</strong> to build small proof-of-concept applications for tracking data modelling and visualization and attaching it to <strong>centralwinger.com</strong>, but I&#8217;d love to see it used in other ways too!</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>I referenced Ben&#8217;s article in my <strong><a href="#">B&#233;ziers, Bivariates, and Beyond</a></strong> piece recently, which you should check out.</p></div></div>]]></content:encoded></item><item><title><![CDATA[Let's Bring Back Big Chances]]></title><description><![CDATA[Entropy and a Unified Model of Goal Probability]]></description><link>https://www.centralwinger.com/p/lets-bring-back-big-chances</link><guid isPermaLink="false">https://www.centralwinger.com/p/lets-bring-back-big-chances</guid><dc:creator><![CDATA[Devin Pleuler]]></dc:creator><pubDate>Mon, 23 Dec 2024 13:00:56 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/fe78cd15-973a-43b1-8fd9-68f08dec8987_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>The Big Chance is an intriguing artifact of soccer analytics history. In the early times, Opta defined Big Chances as shots where a player should reasonably be expected to score: a one-on-one with the goalkeeper, an open-net tap-in, or perhaps a free header in the six-yard-box with a prone goalkeeper. 
These tags were subjective, applied by analysts watching the game, and while they were inconsistent, they were undeniably powerful.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.centralwinger.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.centralwinger.com/subscribe?"><span>Subscribe now</span></a></p><p>Big Chances worked because they acted as proxies for many things that weren't systematically collected at the time: the pressure on the shooter, the positioning of defenders (particularly the goalkeeper), and various additional attributes of the buildup play. These were moments when the likelihood of scoring wasn&#8217;t just high &#8211; it felt inevitable. But as the event data specs improved and xG models started incorporating these elements directly, the need for the Big Chance qualifier faded. They lacked consistency, tied chance evaluation too closely to outcomes, and led to some very weird bi-modal distributions of chance quality.</p><p>But what if the original concept of Big Chances deserves a second act? Not as a subjective evaluation of shots, but as a tool for identifying repeatably dangerous moments within possessions where scoring becomes genuinely probable, regardless of whether a shot is even taken.</p><div><hr></div><h3>Inflection Detection</h3><p>Soccer is a game of uncertainty, and one way to understand that uncertainty is through the lens of entropy &#8211; which we will interpret as the variability in possible outcomes as a sequence unfolds. At the start of a possession, entropy might seem intuitively high because the ball could end up anywhere, but the likelihood of a goal is overwhelmingly low. 
Almost every possession ends without a goal, so the outcome entropy at the beginning of a possession is usually correspondingly low.</p><p>However, as a possession progresses and nears dangerous areas, entropy rises alongside the probability of a goal. Entropy peaks at the moment when the uncertainty about whether the possession will result in a goal is at its highest. This peak often corresponds to moments when a team is poised to make a decisive action, such as taking a shot or attempting a key pass.</p><p>Entropy &#8211; measured in bits between zero and one for binary outcomes &#8211; rises in step with the probability of a goal for values between 0.0 and 0.5. As xG values continue to rise toward 1.0, entropy falls symmetrically back down toward zero. It makes intuitive sense that entropy would peak at 0.5 xG, which is the inflection point this post keeps referring to, since that value represents the highest possible degree of uncertainty for the outcome of a possession.</p><div><hr></div><h3>Breakaway Entropy</h3><p>Consider a breakaway situation. An attacker races toward goal and the goalkeeper rushes out to close the angle. At this point, entropy is high: the possession might end in a 0.99 xG tap-in if the attacker rounds the keeper, or no shot at all if the goalkeeper manages to smother the ball. The entropy collapses only once the attacking player and the opposing goalkeeper collide and the play resolves with either a goal scored or a heroic save.</p><p>(Of course, there could also be a penalty or a rebound. For that &#8211; let&#8217;s get a refresher on <a href="https://www.centralwinger.com/p/penalties-and-conditional-probability">conditional probability</a>.)</p><p>Plotted over time, entropy will occasionally demonstrate abrupt, step-like behaviour. 
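</p><p>A minimal sketch of that relationship, assuming the entropy in question is the standard binary (Shannon) entropy of the goal probability. The function name <code>goal_entropy</code> and the sequence of xG values are hypothetical, chosen only to mimic a breakaway that ends with a struck shot.</p>

```python
import math


def goal_entropy(p_goal: float) -> float:
    """Binary (Shannon) entropy, in bits, of a goal / no-goal outcome."""
    if p_goal in (0.0, 1.0):
        return 0.0  # a certain outcome carries no uncertainty
    q = 1.0 - p_goal
    return -(p_goal * math.log2(p_goal) + q * math.log2(q))


# Hypothetical goal probabilities at successive moments of a breakaway,
# ending with a post-shot value once the ball is struck.
moments = [0.02, 0.10, 0.35, 0.50, 0.93]
for p in moments:
    print(f"xG={p:.2f}  entropy={goal_entropy(p):.3f} bits")
```

<p>Entropy is exactly one bit at 0.5 and symmetric around it, so the final step from 0.50 to 0.93 shows up as a sudden drop rather than a further rise.</p><p>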
In the split seconds before and after a ball is struck, entropy will jump instantly between the values implied by the pre-shot and post-shot xG.</p><p>In most cases, this entropy spike coincides with the moment the ball is struck for a shot. But there are instances, like the breakaway, where the moment of highest uncertainty can occur a few moments earlier. This brings the problem with cumulative xG into focus: it only counts the moments that end in a shot, which earns Expected Goals a reputation for outcome bias.</p><div><hr></div><h3><strong>Reframing Big Chances</strong></h3><p>Let&#8217;s start by detaching Big Chances from shots entirely. Instead of focusing on shots, we could reframe Big Chances as <strong>possessions where the goal probability crosses an arbitrary threshold along a relatively smooth entropy path</strong>. The arbitrary threshold to be determined later, of course, by someone else. Minimal math in this blog post.</p><p>This definition will filter out chaotic moments where a chance falls into a striker&#8217;s lap and emphasize moments of deliberate, repeatable opportunities created through controlled progression and decision-making.</p><p>Under this reframed concept, provided that it didn&#8217;t emerge from a sudden defensive blunder, a breakaway would qualify as a Big Chance even if a shot was never taken. This approach avoids rewarding unearned xG tied to chaotic or situational factors. Instead, it focuses on whether the team successfully created a repeatable, high-value opportunity.</p><p>A penalty, however, would not qualify as a Big Chance due to the dramatic entropy whiplash &#8211; a by-product of the possession&#8217;s xG suddenly exploding. I&#8217;d argue this is a formulation of the intuition behind why we&#8217;ve generally excluded penalties from cumulative xG totals.</p><div><hr></div><h3><strong>Unified Model</strong></h3><p>This also plays nicely with an important soccer analytics discovery from the early times &#8211; shooting ability seems to be deeply unstable. 
An entropy-based approach may imply a unified theory since post-shot xG values can reach far above the 0.5 entropy/xG inflection point, suggesting there shouldn&#8217;t be much signal to be detected in the xG stratosphere.</p><p>The motivation for this approach is the hypothesis that teams and players don&#8217;t have much control over these extraordinary xG values that are occasionally attributed to individual moments. And if they don&#8217;t have much control over these moments, they probably don&#8217;t reflect a pattern of behaviour that would lead to similar moments occurring in the future.</p><p>I suspect that counting Big Chances in this manner might serve as a better predictor of future goals than the established practice of using xG totals, as long as you carefully select the right xG threshold. This can probably be tested. On here, I&#8217;m just a theorist &#8211; but I&#8217;d gladly grant a guest <em>Central Winger</em> blog post to someone who constructs a compelling experiment.</p>]]></content:encoded></item><item><title><![CDATA[Béziers, Bivariates, and Beyond]]></title><description><![CDATA[Some Tricks I Wish I Knew Earlier]]></description><link>https://www.centralwinger.com/p/beziers-bivariates-and-beyond</link><guid 
isPermaLink="false">https://www.centralwinger.com/p/beziers-bivariates-and-beyond</guid><dc:creator><![CDATA[Devin Pleuler]]></dc:creator><pubDate>Mon, 09 Dec 2024 13:02:43 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/340145e5-b6e2-44a3-9702-eadec0ed5ced_1456x832.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This week, I spent time thinking about a few tricks that I've picked up since I stopped blogging regularly, which I have found particularly useful for soccer analytics, and where I found them. A few of them have sample code available in my <a href="https://github.com/devinpleuler/analytics-handbook">Soccer Analytics Handbook</a>, which probably needs an update at this point!</p><p><strong>Single-Pixel Selection</strong> &#8211; Javier Fernandez and Luke Bornn, <em><a href="https://arxiv.org/abs/2010.10202">SoccerMap: A Deep Learning Architecture for Visually-Interpretable Analysis in Soccer</a></em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rxnd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86b6bda6-dfb4-4c1b-ad2c-94fef3517627_1408x910.png" 
data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rxnd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86b6bda6-dfb4-4c1b-ad2c-94fef3517627_1408x910.png 424w, https://substackcdn.com/image/fetch/$s_!rxnd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86b6bda6-dfb4-4c1b-ad2c-94fef3517627_1408x910.png 848w, https://substackcdn.com/image/fetch/$s_!rxnd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86b6bda6-dfb4-4c1b-ad2c-94fef3517627_1408x910.png 1272w, https://substackcdn.com/image/fetch/$s_!rxnd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86b6bda6-dfb4-4c1b-ad2c-94fef3517627_1408x910.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rxnd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86b6bda6-dfb4-4c1b-ad2c-94fef3517627_1408x910.png" width="602" height="389.07670454545456" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/86b6bda6-dfb4-4c1b-ad2c-94fef3517627_1408x910.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:910,&quot;width&quot;:1408,&quot;resizeWidth&quot;:602,&quot;bytes&quot;:353095,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!rxnd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86b6bda6-dfb4-4c1b-ad2c-94fef3517627_1408x910.png 424w, https://substackcdn.com/image/fetch/$s_!rxnd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86b6bda6-dfb4-4c1b-ad2c-94fef3517627_1408x910.png 848w, https://substackcdn.com/image/fetch/$s_!rxnd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86b6bda6-dfb4-4c1b-ad2c-94fef3517627_1408x910.png 1272w, https://substackcdn.com/image/fetch/$s_!rxnd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86b6bda6-dfb4-4c1b-ad2c-94fef3517627_1408x910.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>This elegant technique blew my mind when I first dug a bit deeper into this paper, and opened my eyes to the surprising flexibility of deep learning. When building a pass difficulty model in a supervised manner &#8211; as opposed to Spearman's physical approach (which I adore!) &#8211; it certainly was not obvious to me how you would produce a continuous probability surface from a training set of passes.</p><p>Since only a single point in the true output surface is observed &#8211; the event&#8217;s location &#8211; the prediction problem becomes significantly more complex. SoccerMap &#8220;<em>provides a novel solution for learning a full prediction surface when there is only a single-pixel correspondence between ground-truth outcomes and the predicted probability map</em>.&#8221; Even evaluating loss at just one pixel, the model can infer continuous probabilities across the entire field. 
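</p><p>As a rough illustration of the single-pixel idea (not SoccerMap&#8217;s actual code), the loss for one pass can be read from a single cell of the predicted surface. The function name, grid size, and probabilities below are all hypothetical.</p>

```python
import math


def single_pixel_log_loss(surface, target_pixel, completed):
    """Log loss read off one cell of a predicted probability surface.

    surface      -- 2D list of predicted pass-success probabilities
    target_pixel -- (row, col) of the cell where the pass ended
    completed    -- True if the observed pass was successful
    """
    row, col = target_pixel
    eps = 1e-12  # keep log() away from zero
    p = min(max(surface[row][col], eps), 1.0 - eps)
    return -math.log(p) if completed else -math.log(1.0 - p)


# Hypothetical 3x4 surface produced by some model for one game state.
surface = [
    [0.90, 0.80, 0.55, 0.30],
    [0.95, 0.85, 0.60, 0.25],
    [0.90, 0.75, 0.40, 0.10],
]

# Only the destination cell contributes to the loss for this example.
loss = single_pixel_log_loss(surface, target_pixel=(1, 2), completed=True)
print(round(loss, 4))
```

<p>In a convolutional model the remaining cells are never scored directly for this pass; they are shaped only through the weights they share with the scored cell, accumulated over many training examples.</p><p>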
Really clever, and super useful.</p><p><strong>Monotonic Constraints</strong> &#8211; Dinesh Vatvani, <em><a href="https://statsbomb.com/articles/soccer/upgrading-expected-goals/">Upgrading Expected Goals</a></em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zW_W!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f33f544-d0e2-4da5-8c36-95ae68d81165_1200x400.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zW_W!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f33f544-d0e2-4da5-8c36-95ae68d81165_1200x400.gif 424w, https://substackcdn.com/image/fetch/$s_!zW_W!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f33f544-d0e2-4da5-8c36-95ae68d81165_1200x400.gif 848w, https://substackcdn.com/image/fetch/$s_!zW_W!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f33f544-d0e2-4da5-8c36-95ae68d81165_1200x400.gif 1272w, https://substackcdn.com/image/fetch/$s_!zW_W!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f33f544-d0e2-4da5-8c36-95ae68d81165_1200x400.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zW_W!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f33f544-d0e2-4da5-8c36-95ae68d81165_1200x400.gif" width="1200" height="400" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2f33f544-d0e2-4da5-8c36-95ae68d81165_1200x400.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:400,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zW_W!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f33f544-d0e2-4da5-8c36-95ae68d81165_1200x400.gif 424w, https://substackcdn.com/image/fetch/$s_!zW_W!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f33f544-d0e2-4da5-8c36-95ae68d81165_1200x400.gif 848w, https://substackcdn.com/image/fetch/$s_!zW_W!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f33f544-d0e2-4da5-8c36-95ae68d81165_1200x400.gif 1272w, https://substackcdn.com/image/fetch/$s_!zW_W!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f33f544-d0e2-4da5-8c36-95ae68d81165_1200x400.gif 1456w" sizes="100vw"></picture></div></a></figure></div><p>Sometimes you should be opinionated about your data, but it hasn't always been clear to me how to make my opinions known to a model. In this StatsBomb white paper, Dinesh demonstrated an easy-to-grasp concept for controlling your gradient boosted decision tree models by applying monotonic constraints on certain features.</p><p>In other words, this allows you to force certain attributes to always have a positive or a negative effect on the model&#8217;s prediction. For example, in a post-shot xG model, it might be reasonable to force goal probability to always increase alongside ball velocity. 
When you're confident about your assumptions, this is a great method to help prevent model overfitting.</p><p><strong>Clustering B&#233;zier Curves</strong> &#8211; Sam Gregory, <em><a href="https://static.capabiliaserver.com/frontend/clients/barca/wp_prod/wp-content/uploads/2020/01/40ba07f4-ready-player-run-barcelona.pdf">Ready Player Run: Off-ball run identification and classification</a></em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MrZK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe625443d-7fdd-436b-99d0-f26de5310336_874x686.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MrZK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe625443d-7fdd-436b-99d0-f26de5310336_874x686.png 424w, https://substackcdn.com/image/fetch/$s_!MrZK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe625443d-7fdd-436b-99d0-f26de5310336_874x686.png 848w, https://substackcdn.com/image/fetch/$s_!MrZK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe625443d-7fdd-436b-99d0-f26de5310336_874x686.png 1272w, https://substackcdn.com/image/fetch/$s_!MrZK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe625443d-7fdd-436b-99d0-f26de5310336_874x686.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MrZK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe625443d-7fdd-436b-99d0-f26de5310336_874x686.png" width="486" 
height="381.45995423340963" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e625443d-7fdd-436b-99d0-f26de5310336_874x686.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:686,&quot;width&quot;:874,&quot;resizeWidth&quot;:486,&quot;bytes&quot;:190889,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!MrZK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe625443d-7fdd-436b-99d0-f26de5310336_874x686.png 424w, https://substackcdn.com/image/fetch/$s_!MrZK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe625443d-7fdd-436b-99d0-f26de5310336_874x686.png 848w, https://substackcdn.com/image/fetch/$s_!MrZK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe625443d-7fdd-436b-99d0-f26de5310336_874x686.png 1272w, https://substackcdn.com/image/fetch/$s_!MrZK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe625443d-7fdd-436b-99d0-f26de5310336_874x686.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>The first time this concept (outside of Adobe Illustrator) came on my radar was actually via the <em><a href="http://www.lukebornn.com/papers/miller_ssac_2017.pdf">Miller and Bornn 2017</a></em> paper, but Sam did a great job bringing the techniques into soccer for the purposes of clustering run types.</p><p>B&#233;zier curves serve as an unintuitive yet remarkably robust method for dimensionality reduction on player tracks. 
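</p><p>A minimal sketch of that reduction, under assumptions of my own rather than the paper&#8217;s exact method: a cubic B&#233;zier curve is fitted by least squares, the two end control points are pinned to the first and last frames, and frames are spaced uniformly in the curve parameter. The function name and the example tracks are hypothetical.</p>

```python
def fit_cubic_bezier(track):
    """Least-squares cubic Bezier fit with both endpoints pinned.

    track -- list of (x, y) positions, one per tracking frame (4 or more)
    Returns a flat list of 8 numbers: the four control points.
    """
    n = len(track)
    if n < 4:
        raise ValueError("need at least four frames")
    p0, p3 = track[0], track[-1]
    saa = sab = sbb = 0.0
    sar = [0.0, 0.0]
    sbr = [0.0, 0.0]
    for i, point in enumerate(track):
        t = i / (n - 1)  # assumption: frames map uniformly onto [0, 1]
        a = 3 * (1 - t) ** 2 * t  # basis weight of inner control point 1
        b = 3 * (1 - t) * t ** 2  # basis weight of inner control point 2
        saa += a * a
        sab += a * b
        sbb += b * b
        for k in range(2):
            # residual once the pinned endpoints' contribution is removed
            r = point[k] - (1 - t) ** 3 * p0[k] - t ** 3 * p3[k]
            sar[k] += a * r
            sbr[k] += b * r
    # Solve the 2x2 normal equations for the inner control points.
    det = saa * sbb - sab * sab
    p1 = [(sar[k] * sbb - sab * sbr[k]) / det for k in range(2)]
    p2 = [(saa * sbr[k] - sab * sar[k]) / det for k in range(2)]
    return [*p0, *p1, *p2, *p3]


# Two hypothetical runs with different numbers of frames.
short_run = [(0.0, 0.0), (2.0, 1.0), (4.0, 3.0), (6.0, 6.0), (8.0, 10.0)]
long_run = [(x * 0.5, (x * 0.5) ** 2 / 10) for x in range(40)]

for run in (short_run, long_run):
    vector = fit_cubic_bezier(run)
    print(len(run), "frames ->", len(vector), "numbers")
```

<p>Both runs come back as eight numbers (four control points), whatever their frame count.</p><p>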
They can reduce complicated trajectories of varying frame lengths down to vectors of a consistent shape and size, allowing for all kinds of potential downstream analysis like clustering.</p><p><strong>Player Position Distributions</strong> &#8211; Laurie Shaw and Mark Glickman, <em><a href="https://static.capabiliaserver.com/frontend/clients/barca/wp_prod/wp-content/uploads/2020/01/56ce723e-barca-conference-paper-laurie-shaw.pdf">Dynamic analysis of team strategy in professional football</a></em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gqKs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9da22e-881e-479f-bd0a-56c4b5d8bffe_1666x420.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gqKs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9da22e-881e-479f-bd0a-56c4b5d8bffe_1666x420.png 424w, https://substackcdn.com/image/fetch/$s_!gqKs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9da22e-881e-479f-bd0a-56c4b5d8bffe_1666x420.png 848w, https://substackcdn.com/image/fetch/$s_!gqKs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9da22e-881e-479f-bd0a-56c4b5d8bffe_1666x420.png 1272w, https://substackcdn.com/image/fetch/$s_!gqKs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9da22e-881e-479f-bd0a-56c4b5d8bffe_1666x420.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!gqKs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9da22e-881e-479f-bd0a-56c4b5d8bffe_1666x420.png" width="1456" height="367" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8f9da22e-881e-479f-bd0a-56c4b5d8bffe_1666x420.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:367,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:802120,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gqKs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9da22e-881e-479f-bd0a-56c4b5d8bffe_1666x420.png 424w, https://substackcdn.com/image/fetch/$s_!gqKs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9da22e-881e-479f-bd0a-56c4b5d8bffe_1666x420.png 848w, https://substackcdn.com/image/fetch/$s_!gqKs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9da22e-881e-479f-bd0a-56c4b5d8bffe_1666x420.png 1272w, https://substackcdn.com/image/fetch/$s_!gqKs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9da22e-881e-479f-bd0a-56c4b5d8bffe_1666x420.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>This is one of my favourite soccer analytics papers of all time. Not necessarily because it's the best way to identify shifts in team formation &#8211; I think there are better ways to do that these days &#8211; but because of all the different tricks that Laurie uses in the paper.</p><p>The most impactful one (for me, at least) is the modelling of each player's position as a bivariate normal distribution with a covariance matrix estimating the extent of a player's positional deviation. 
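</p><p>A small sketch of the summary statistics behind that idea. It works on raw pitch coordinates for simplicity, which is an assumption on my part, and the paper&#8217;s own preprocessing may differ. The function name and the sample frames are hypothetical.</p>

```python
def position_distribution(positions):
    """Mean and 2x2 sample covariance of a player's (x, y) positions."""
    n = len(positions)
    mean_x = sum(x for x, _ in positions) / n
    mean_y = sum(y for _, y in positions) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in positions) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in positions) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in positions) / (n - 1)
    return (mean_x, mean_y), [[sxx, sxy], [sxy, syy]]


# Hypothetical positions (metres) for one player across a few frames.
frames = [(30.0, 20.0), (34.0, 22.0), (38.0, 21.0), (42.0, 25.0), (36.0, 27.0)]
mean, cov = position_distribution(frames)
print("mean position:", mean)
print("covariance:", cov)
```

<p>The diagonal entries describe how far the player spreads along each pitch axis, and the off-diagonal entry captures the tilt of the resulting ellipse.</p><p>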
It&#8217;s a clear upgrade over plotting average positions where you don&#8217;t have any understanding of positional variability.</p><p><strong>Player Order Invariance</strong> &#8211; The Zoo, <em><a href="https://www.kaggle.com/c/nfl-big-data-bowl-2020/discussion/119400">Big Data Bowl 2020 Winner</a></em></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!x0yt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51f3cb2a-8b4f-4387-9ff3-e66914cdeea0_973x228.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!x0yt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51f3cb2a-8b4f-4387-9ff3-e66914cdeea0_973x228.png 424w, https://substackcdn.com/image/fetch/$s_!x0yt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51f3cb2a-8b4f-4387-9ff3-e66914cdeea0_973x228.png 848w, https://substackcdn.com/image/fetch/$s_!x0yt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51f3cb2a-8b4f-4387-9ff3-e66914cdeea0_973x228.png 1272w, https://substackcdn.com/image/fetch/$s_!x0yt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51f3cb2a-8b4f-4387-9ff3-e66914cdeea0_973x228.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!x0yt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51f3cb2a-8b4f-4387-9ff3-e66914cdeea0_973x228.png" width="973" height="228" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/51f3cb2a-8b4f-4387-9ff3-e66914cdeea0_973x228.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:228,&quot;width&quot;:973,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!x0yt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51f3cb2a-8b4f-4387-9ff3-e66914cdeea0_973x228.png 424w, https://substackcdn.com/image/fetch/$s_!x0yt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51f3cb2a-8b4f-4387-9ff3-e66914cdeea0_973x228.png 848w, https://substackcdn.com/image/fetch/$s_!x0yt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51f3cb2a-8b4f-4387-9ff3-e66914cdeea0_973x228.png 1272w, https://substackcdn.com/image/fetch/$s_!x0yt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51f3cb2a-8b4f-4387-9ff3-e66914cdeea0_973x228.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Honestly, I'm not completely sure that I understand how this one works. When playing around with neural networks, I have struggled to find architectures that produce the same output no matter what order I load the players&#8217; features into a vector. 
In the past, I mostly side-stepped this issue and went the route of sparse matrices or team-level surfaces that played nicely with the convolutional layers that seemed to be doing the heavy lifting.</p><p>The winning entry of the 2020 Big Data Bowl uses a 1x1 CNN layer with pooling to process a dense tensor of relative player features. This technique makes the model invariant to player order and lets the state be represented much more economically. GNNs are probably the modern solution to this now, but this opened my eyes to a whole new world of possibility.</p><p><strong>Pareto Frontier</strong> &#8211; Ian Graham, <em><a href="https://www.penguin.co.uk/books/462193/how-to-win-the-premier-league-by-graham-ian/9781529934632">How to Win the Premier League</a></em></p><p>Unlike most concepts in soccer analytics, Ian didn&#8217;t invent this one. However, he did introduce it to the soccer analytics canon in the chapter <em>Stats and Snakeoil</em>, under the section <em>The Tyranny of Metrics</em>. His explanation gave me a fresh perspective on player profiles and highlighted the risks of relying on too many metrics for player evaluation.</p><p>The Pareto Frontier represents the line where individual players achieve the best possible trade-off between two or more competing metrics. Players on this frontier are those whom no other player matches or beats on every metric at once: to find someone better on one metric, you have to accept someone worse on another. Together they form the outer boundary of performance for the given metrics. In the book, Ian uses <em>Expected Assists</em> and <em>Pressure Regains</em> as an example. If you&#8217;re looking for a player with both, there will be many players who have the most of one given their level of the other.</p><p>The key takeaway is that as you extend this concept to include a large number of metrics, the likelihood increases that any player might sit on the Pareto Frontier for some combination of metrics. 
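</p><p>To make the frontier concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the player labels, the metric values, and the helper names <code>pareto_frontier</code> and <code>frontier_share</code> are made up, and it is not the method used in the book. It finds the non-dominated players for two metrics, then checks how the share of players on the frontier changes as more (random) metrics are added.</p>

```python
import random


def pareto_frontier(players):
    """Return the names of players that no other player dominates.

    `players` maps a name to a tuple of metric values, higher is better.
    """
    def dominates(a, b):
        # a dominates b: at least as good on every metric, strictly better on one
        at_least_as_good = all(x >= y for x, y in zip(a, b))
        strictly_better = any(x > y for x, y in zip(a, b))
        return at_least_as_good and strictly_better

    return [
        name
        for name, stats in players.items()
        if not any(dominates(other, stats) for other in players.values())
    ]


def frontier_share(table, n_metrics):
    """Fraction of players on the frontier using only the first n_metrics."""
    trimmed = {name: stats[:n_metrics] for name, stats in table.items()}
    return len(pareto_frontier(trimmed)) / len(trimmed)


# Hypothetical (expected assists, pressure regains) per 90 minutes
squad = {
    "creator": (0.30, 2.0),
    "presser": (0.10, 6.0),
    "hybrid": (0.20, 4.0),
    "bench": (0.15, 3.0),  # beaten on both metrics by "hybrid"
}
print(pareto_frontier(squad))

# Synthetic players with ten random metrics each (not real data)
rng = random.Random(0)
synthetic = {
    f"player_{i}": tuple(rng.random() for _ in range(10)) for i in range(200)
}
shares = {k: frontier_share(synthetic, k) for k in (2, 5, 10)}
print(shares)
```

<p>On the same set of synthetic players, the frontier can only grow as metrics are added (barring exact ties), which is the mechanism behind the warning above.</p><p>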
This enables you to find a context where any player can be framed as &#8220;the best&#8221; at something, based on certain trade-offs. While this flexibility might seem appealing, it can easily lead to misleading conclusions, underscoring the need for caution.</p><p><strong>Tools for Tiny Teams</strong> &#8211; Ben Torvaney, <em><a href="https://www.statsandsnakeoil.com/2021/05/28/tools-for-tiny-teams/">Stats and Snakeoil</a></em></p><p>To round out this list, and to keep our feet on the ground, I wanted to add my favourite blog post of the last few years. A lot of analytics teams are tiny and don't have the time or resources to chase some of the exciting ideas above. The principles discussed in this post will get you most of the way there.</p><p>I find myself going back to this blog post every couple of months and picking up something new. On my latest read, I learned about <code>unaccent</code> for Postgres &#8211; neat! While I've moved away from it recently, we also implemented <code>dbt</code> because I learned about it here. 
It's all good advice.</p>]]></content:encoded></item><item><title><![CDATA[Unexpected Origins and the Fermi Paradox]]></title><description><![CDATA[Can foundational soccer ideas achieve escape velocity?]]></description><link>https://www.centralwinger.com/p/unexpected-origins-and-the-fermi</link><guid isPermaLink="false">https://www.centralwinger.com/p/unexpected-origins-and-the-fermi</guid><dc:creator><![CDATA[Devin Pleuler]]></dc:creator><pubDate>Mon, 02 Dec 2024 13:03:24 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/ba91bb1b-5f70-4904-8653-c47143849965_420x300.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I recently finished Ian Graham's <em><a href="https://www.penguin.co.uk/books/462193/how-to-win-the-premier-league-by-graham-ian/9781529934632">How to Win the Premier League</a></em>, which recounts his time at Liverpool from an inside perspective. I found it very enjoyable, and you should pick it up.</p><p>My European football knowledge is severely limited, so a lot of the specific characters and transfer talk didn't really land with me. This is not a criticism &#8211; you need characters for a story. But what I found especially captivating was his retelling of the history of soccer analytics. 
It felt like I was reading an alternative history of a time period that I thought I knew quite well.</p><p>The book retcons the entire canon of provenance regarding soccer analytics. This includes the origin of foundational concepts such as Expected Goals and Possession Value, and probably others that I don't remember off the top of my head.</p><p>Personally, I credit Sam Green for the creation of Expected Goals via an <a href="https://www.statsperform.com/resource/assessing-the-performance-of-premier-league-goalscorers">OptaPro blog post in 2012</a>, though I concede that the concept wasn&#8217;t new. And <a href="https://bsky.app/profile/srudd-src.bsky.social">Sarah Rudd</a>'s Markov model from <a href="https://nessis.org/nessis11/rudd.pdf">NESSIS 2011</a> seems to be the progenitor of modern possession value models like Expected Threat. </p><p>According to the book, Ian and the teams surrounding him independently invented these concepts well before they were introduced to the public. And Ian suggests that his efforts, just like Sam&#8217;s, were carried out without knowledge of prior work, such as the paper by Richard Pollard and <a href="https://en.wikipedia.org/wiki/Charles_Reep">Charles Reep</a> titled <em><a href="https://www.researchgate.net/profile/Richard-Pollard-3/publication/227692321_Measuring_the_effectiveness_of_playing_strategies_at_soccer/links/59dd4e4caca272b698e15fbc/Measuring-the-effectiveness-of-playing-strategies-at-soccer.pdf">Effectiveness of Playing Strategies</a></em>. 
I&#8217;ve lifted a figure from that research below, which should look remarkably familiar.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4hjP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0be2060-9b2e-47d1-9de7-094dc71209e1_1074x692.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4hjP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0be2060-9b2e-47d1-9de7-094dc71209e1_1074x692.png 424w, https://substackcdn.com/image/fetch/$s_!4hjP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0be2060-9b2e-47d1-9de7-094dc71209e1_1074x692.png 848w, https://substackcdn.com/image/fetch/$s_!4hjP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0be2060-9b2e-47d1-9de7-094dc71209e1_1074x692.png 1272w, https://substackcdn.com/image/fetch/$s_!4hjP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0be2060-9b2e-47d1-9de7-094dc71209e1_1074x692.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4hjP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0be2060-9b2e-47d1-9de7-094dc71209e1_1074x692.png" width="610" height="393.03538175046555" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c0be2060-9b2e-47d1-9de7-094dc71209e1_1074x692.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:692,&quot;width&quot;:1074,&quot;resizeWidth&quot;:610,&quot;bytes&quot;:90857,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4hjP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0be2060-9b2e-47d1-9de7-094dc71209e1_1074x692.png 424w, https://substackcdn.com/image/fetch/$s_!4hjP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0be2060-9b2e-47d1-9de7-094dc71209e1_1074x692.png 848w, https://substackcdn.com/image/fetch/$s_!4hjP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0be2060-9b2e-47d1-9de7-094dc71209e1_1074x692.png 1272w, https://substackcdn.com/image/fetch/$s_!4hjP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0be2060-9b2e-47d1-9de7-094dc71209e1_1074x692.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><div><hr></div><h3>The Adjacent Possible</h3><p>The phenomenon of multiple discovery is deeply fascinating. The most famous example is probably Calculus, which was independently formulated by both Newton and Leibniz in the 17th century<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a>. There are plenty of other important historical examples, like evolution by natural selection and the light bulb &#8211; all similar in cultural magnitude to Expected Goals.</p><p>It seems like this occurs when the broader environmental, cultural, and technological conditions make an idea or discovery ripe for development. 
This is often attributed to the "zeitgeist," or spirit of the times, which fosters the necessary prerequisites for such innovation.</p><p>Theoretical biologist <a href="https://en.wikipedia.org/wiki/Stuart_Kauffman">Stuart Kauffman</a> coined a concept called <em>The Adjacent Possible</em>, which describes how innovation unfolds within the constraints of the current environment while expanding its boundaries. It refers to the set of possibilities that become accessible when a new innovation or discovery is made, opening doors to previously unattainable states.</p><p>The introduction of event data into the soccer analytics zeitgeist in the mid-2000s (and its popularization in the early 2010s) was the most obvious catalyst, which opened the door to many of these foundational models. <em>Moneyball</em>, published in 2003, was probably a motivating factor as well, though many executives probably waited for the movie in 2011.</p><div><hr></div><h3>Escape Velocity</h3><p>It is intriguing how few novel ideas have achieved escape velocity from the gravity of team environments. If teams are genuinely squirrelling away their analytics discoveries, I&#8217;d expect to have found a few more breadcrumbs as evidence!</p><p>Some hints have been exposed. One uncovered crouton is Liverpool's <a href="https://deepmind.google/discover/blog/tacticai-ai-assistant-for-football-tactics/">collaboration with DeepMind on set pieces</a>, which I think is a pretty clear statement of intent regarding their strategic research direction. And of course there is all the public work that Javier Fernandez <a href="https://arxiv.org/abs/2010.10202">published</a> while employed by FC Barcelona. But there aren't many additional examples to point to, and the exceptions prove the rule.</p><p>I want to briefly contrast this with professional baseball. Analytics is a considerably more established discipline and practice within MLB. 
Therefore, you see a lot more members of staff switching between organizations to fulfill various personal career motivations and aspirations. Stashed in their luggage are uncovered truths about the game of baseball and the state of the "adjacent possible", which causes a proliferation of ideas and best practices.</p><p><em>(There&#8217;s probably a missed opportunity to title this section Exit Velocity, as a cheeky nod to advanced baseball statistics, but I prefer the orbital mechanics metaphor.)</em></p><p>My perspective on sharing new research methods has always been a pretty liberal one. I believe that it is in the best interest of the top research teams to steer and cultivate the public ecosystem because they should be best positioned to take immediate advantage of novel innovations that percolate.</p><p>I'm unsure if other organizations or analytics leaders view it the same way that I do. I suspect that many embedded analysts fear arriving empty-handed when they come up for air.</p><div><hr></div><h3>Extraterrestrial Soccer</h3><p>I'd like to suggest a soccer analytics formulation of the <a href="https://en.wikipedia.org/wiki/Fermi_paradox">Fermi paradox</a>, which observes the lack of evidence for extraterrestrial life despite its seemingly overwhelming statistical likelihood.</p><p><em>If the growth of soccer analytics is truly intuitive, with its core principles being independently discovered time and again, why hasn&#8217;t more of this surfaced?</em></p><p>Here is a non-exhaustive list of plausible explanations.</p><ul><li><p>Perhaps the conditions are no longer fertile for multiple discovery. The distinct lack of tracking data at scale in the public domain really concerns me.</p></li><li><p>Teams have managed to remain remarkably tight-lipped. And it seems plausible that gambling companies would be particularly motivated to maintain an edge. 
</p></li><li><p>Not enough practitioners have bounced from club to club to incite a cascading cross-pollination event across a multitude of organizations.</p></li><li><p>Teams actually don't have much to share and the state of analytics inside of clubs isn't much more advanced than in the public sphere. Grace Robertson <a href="https://open.substack.com/pub/onfootball/p/how-do-we-tell-the-story-of-football?r=13mvc&amp;selection=0cc43f23-0388-4f0d-a2f2-386efbf1bf7d&amp;utm_campaign=post-share-selection&amp;utm_medium=web">comedically suggested otherwise in a related blog post from 2022</a>, which is definitely worth a read.</p></li><li><p>It's too expensive and there aren't many team environments that can support cutting-edge research.</p></li><li><p>There actually is a lot of novel research being produced publicly and I'm just not reading it.</p></li></ul><p>The truth probably includes a mixture of these possibilities, and probably a few others that you should leave in the comments. </p><p><em>How to Win the Premier League</em> suggests an inevitably bright future for soccer analytics, but the reality is likely constrained by the gravitational forces of secrecy, siloed environments, and resource limitations.</p><p>If we want to see soccer analytics flourish more openly, it will likely require shifts on multiple fronts: better access to public tracking data, more resource and idea sharing between teams, and a dramatic cultural shift toward valuing open science over guarded advantage.</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>After writing but before publishing, I discovered that <a href="https://bsky.app/profile/markposts.bsky.social">Mark Thompson</a> also made this comparison to Calculus in a 2022 blog post, which you can read <a href="https://www.getgoalsideanalytics.com/everything-need-know-expected-goals-xg/">on his website here</a>. Pretty meta, given the subject matter!</p></div></div>]]></content:encoded></item><item><title><![CDATA[Penalties and Conditional Probability]]></title><description><![CDATA[Expected Goals still need storytelling.]]></description><link>https://www.centralwinger.com/p/penalties-and-conditional-probability</link><guid isPermaLink="false">https://www.centralwinger.com/p/penalties-and-conditional-probability</guid><dc:creator><![CDATA[Devin Pleuler]]></dc:creator><pubDate>Mon, 25 Nov 2024 14:02:19 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/513245dd-2f7e-427e-ae7a-afc95cb90496_420x300.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Earlier this month, I was explaining to a friend how shots from rebounds are measured within the Expected Goals framework. After writing down my thoughts, I realized a couple of things. This is a deeply unintuitive concept to an xG skeptic and it illustrates a series of problems that I have with Expected Goals in general. 
This blog post doesn&#8217;t fix any of that, but it does aim to explain some of it.</p><div><hr></div><h3>Infinite Rebounds, Infinite xG</h3><p>Since an individual sequence of play can never be worth more than a single expected goal, the current best practice is to use conditional probability like this:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\text{Total } xG = xG_1 + (1 - xG_1) \\cdot xG_2&quot;,&quot;id&quot;:&quot;ELSWGLDXOT&quot;}" data-component-name="LatexBlockToDOM"></div><p>To get the total xG of a two-shot sequence, you take the goal probability of the first shot and add the product of the probability of missing the first shot and the goal probability of the second shot. 
This ensures that overlapping probabilities do not artificially inflate the total expected goals beyond the logical maximum of 1.00 xG for the entire sequence.</p><p>For example, let&#8217;s step through a hypothetical penalty kick sequence:</p><ul><li><p>The direct penalty itself is worth <code>0.75</code> xG, based on historical conversion rates.</p></li><li><p>The penalty is saved and a follow-up shot is attempted by an onrushing teammate.</p></li><li><p>The second shot is worth <code>0.60</code> xG, based on the conditions of the shot at the time it&#8217;s struck &#8211; perhaps the goalkeeper has closed down the angle slightly.</p></li><li><p>The probability that the initial penalty was missed is <code>0.25</code>, so you multiply that by the xG value of the second shot (<code>0.60</code>) to get an incremental probability of <code>0.15</code>.</p></li><li><p>You then combine these values (<code>0.75</code> and <code>0.15</code>) to arrive at a total goal probability of <code>0.90</code> for the entire sequence.</p></li></ul><p>It&#8217;s important to note that while the team accrues <code>0.90</code> xG for the sequence, the two attackers might be credited with individual xG values of <code>0.75</code> and <code>0.60</code>, reflecting their respective shot probabilities without adjustment for sequence overlap.</p><p><strong>The framework can be applied to multiple sequential rebounds via an infinite series of products</strong>. The sum of this series should converge on the true goalscoring probability of the particular penalty kick sequence. This is nothing new.</p><p>Not every saved penalty is going to have such a valuable rebound. In order to measure the true pre-shot goal probability of a penalty sequence, you have to look across the population of rebounds to determine an average and apply their conversion rates within the framework illustrated above. 
I think this lands somewhere around <code>0.78</code> xG for the entire sequence, but it depends on the competition.</p><div><hr></div><h3>From a Certain Point of View</h3><p>To an untrained or skeptical eye, it&#8217;s going to appear pretty strange when your postgame match report has player box scores with individual xG totals that don&#8217;t sum up to the overall team total at the top of the PDF. From their perspective, it will look like a software bug.</p><p>This is one of the distinct explanatory dangers of using Expected Goals as a metric as opposed to a unit of measure. The line is far too blurry between xG and the menagerie of different methods and acronyms for measuring <em>P(Goal)</em> at any particular timestamp.</p><p>This is painful to keep track of. It would be a whole lot easier if you could get away with a simple SQL view that looks something like this:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!f5H-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3752f97-488a-4bec-b953-3b0a6e59131f_1744x912.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!f5H-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3752f97-488a-4bec-b953-3b0a6e59131f_1744x912.png 424w, https://substackcdn.com/image/fetch/$s_!f5H-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3752f97-488a-4bec-b953-3b0a6e59131f_1744x912.png 848w, https://substackcdn.com/image/fetch/$s_!f5H-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3752f97-488a-4bec-b953-3b0a6e59131f_1744x912.png 1272w, 
https://substackcdn.com/image/fetch/$s_!f5H-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3752f97-488a-4bec-b953-3b0a6e59131f_1744x912.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!f5H-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3752f97-488a-4bec-b953-3b0a6e59131f_1744x912.png" width="600" height="313.5989010989011" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a3752f97-488a-4bec-b953-3b0a6e59131f_1744x912.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:761,&quot;width&quot;:1456,&quot;resizeWidth&quot;:600,&quot;bytes&quot;:127496,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!f5H-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3752f97-488a-4bec-b953-3b0a6e59131f_1744x912.png 424w, https://substackcdn.com/image/fetch/$s_!f5H-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3752f97-488a-4bec-b953-3b0a6e59131f_1744x912.png 848w, https://substackcdn.com/image/fetch/$s_!f5H-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3752f97-488a-4bec-b953-3b0a6e59131f_1744x912.png 1272w, 
https://substackcdn.com/image/fetch/$s_!f5H-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3752f97-488a-4bec-b953-3b0a6e59131f_1744x912.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">I made this via <strong>Snappify</strong>, which is pretty neat. Substack code blocks don&#8217;t support syntax highlighting.</figcaption></figure></div><p><strong>But you can&#8217;t accumulate xG totals like you accumulate passing totals!</strong> For the reasons outlined above, a team's total xG is not the sum of individual player xG totals. 
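</p><p>A minimal Python sketch of that mismatch, reusing the penalty example from earlier. The shot log, its field layout, and the helper name <code>sequence_xg</code> are hypothetical, and the sketch assumes each shot's xG is already conditional on the earlier shots in the sequence having missed.</p>

```python
from collections import defaultdict
from math import prod


def sequence_xg(shot_xgs):
    """Chance that at least one shot in a sequence scores.

    Each value is assumed to be the goal probability of a shot given
    that every earlier shot in the sequence missed.
    """
    return 1 - prod(1 - xg for xg in shot_xgs)


# Hypothetical shot log: (sequence id, player, shot xG)
shots = [
    ("penalty", "taker", 0.75),
    ("penalty", "teammate", 0.60),
]

player_xg = defaultdict(float)
sequence_shots = defaultdict(list)
for sequence_id, player, xg in shots:
    player_xg[player] += xg                 # individual credit, unadjusted
    sequence_shots[sequence_id].append(xg)  # shots grouped by sequence

# Team total: one value per sequence, combined with the rebound formula
team_xg = sum(sequence_xg(xgs) for xgs in sequence_shots.values())
# Naive total: add up every player's individual xG
player_total = sum(player_xg.values())

print(round(team_xg, 2), round(player_total, 2))
```

<p>The two players are credited with 1.35 xG between them, while the team is credited with 0.90 xG for the sequence.</p><p>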
Fundamentally, xG is neither a player nor a team metric &#8211; the measurement only retains conceptual integrity when it is observed as an individual shot metric.</p><div><hr></div><h3><strong>A Framework Under Pressure</strong></h3><p>There are a bunch of not-fun considerations and edge cases that put strain on the conceptual xG framework (which you can fortunately mostly ignore with limited peril). Such as:</p><ul><li><p>If a player has their <code>0.75</code> xG penalty saved but it becomes a subsequent <code>0.60</code> xG opportunity, should the shooter actually only be demerited <code>0.15</code> goal units for missing the penalty? This is what a Markov-based approach would advocate for. </p></li><li><p>If the shot hits the post (which has a <code>0.00</code> PSxG value) and bounces into a juicy rebound position for a teammate, should we consider the null PSxG value a true Markov state?</p></li><li><p>Do we need an additional conditional probability for the likelihood of the referee awarding a penalty in the first place? This would make sense if we&#8217;re measuring byproducts of team process as opposed to measuring outcomes governed by noise.</p></li></ul><p>How you ultimately untangle this credit-attribution and value-accumulation mess depends on what sort of question you&#8217;re answering.</p><p>If you were evaluating team performance, what xG value should you assign to the saved-penalty scenario discussed earlier?</p><ul><li><p>Should you simply use the raw <code>0.75</code> xG value for the penalty? 
(Probably not &#8211; this article&#8217;s whole premise suggests a more nuanced approach.)</p></li><li><p>Should you use a pre-shot, conditionally adjusted value like <code>0.78</code>, which accounts for the entire scoring probability at the start of the sequence?</p></li><li><p>Or should you use the <code>0.90</code> xG that combines the initial shot and the favorable rebound?</p></li></ul><p>Postgame, if a coach asks how many goals the team deserved to score in the match, I&#8217;d lean toward using the methodology that incorporates the specifics of the rebound. However, if you&#8217;re calculating an xG total to model future performance, you&#8217;d be better off using an approach that estimates repeatable processes.</p><p>Anyway, the extreme goal expectation values attached to penalty situations certainly exacerbate some of these conceptual problems with how xG is typically used, but they&#8217;re quite useful for demonstration.</p><div><hr></div><h3><strong>Closing the Barn Door</strong></h3><p>Another friend of mine suggested that this blog post was a great example of <em>closing the barn door after the horse has bolted</em>. I was unfamiliar with this idiom, but it&#8217;s an appropriate one. Expected Goals aren&#8217;t going anywhere, and for good reason &#8211;&nbsp;it&#8217;s an incredibly useful tool. But as analysts, we should think critically about how we use it and what it means in different contexts.</p><p>The takeaway? xG isn&#8217;t just a metric or a unit of measurement; it&#8217;s a story about probability, process, and context. 
And like any story, it&#8217;s only as good as the way we tell it.</p>]]></content:encoded></item><item><title><![CDATA[Substituting Similarity]]></title><description><![CDATA[Summary statistics are problematic for comparing players]]></description><link>https://www.centralwinger.com/p/substituting-similarity</link><guid isPermaLink="false">https://www.centralwinger.com/p/substituting-similarity</guid><dc:creator><![CDATA[Devin Pleuler]]></dc:creator><pubDate>Mon, 18 Nov 2024 13:32:18 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/4ca63ffe-4f9e-4201-9f1f-c7854e98eb4e_420x300.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Over the last few years, I&#8217;ve done a lot of thinking around the notion of similarity in sport and I&#8217;ve come to believe this is the most important problem in soccer analytics. 
Whether you&#8217;re scouting players, analyzing opponents, or designing a recruitment strategy, you&#8217;re repeatedly grappling with the same fundamental question: <em>Is Player A similar to Player B?</em> This might seem straightforward, but as with many things in analytics, the simplicity is deceptive.</p><p>Take recruitment, for example. If you&#8217;re replacing a player, your immediate question is likely: <em>Who can replicate their role?</em> For central positions where output metrics are less obvious, you&#8217;re less focused on raw production and more concerned with understanding how a player might <em>play</em> in a specific tactical context. This isn&#8217;t an easy question to answer, and current approaches often fall short.</p><p>Here&#8217;s why.</p><div><hr></div><h3>Limits of Summary Statistics</h3><p>In his book <em>Thinking, Fast and Slow</em>, Daniel Kahneman introduced (to me, at least) the concept of <strong>substitution bias</strong>, where people unconsciously replace a difficult problem with an easier one. This happens all the time in sports analysis.</p><p>The hard problem is understanding player similarity in terms of their decision-making tendencies&#8212;how they think and act in specific situations. But because that&#8217;s challenging to measure, we often substitute it with an easier question: <em>How similar are their statistics?</em></p><p>This substitution creates significant and cascading issues. 
Advanced process-oriented metrics like expected goals, progressive passes, or on-ball value are helpful, but they&#8217;re still outputs shaped by a host of external factors:</p><ul><li><p><strong>Team Context</strong>: A player&#8217;s statistics are influenced by their team&#8217;s style, coaching preferences, and tactical configuration.</p></li><li><p><strong>Game State</strong>: Variance introduced by scorelines, opposition tactics, and randomness adds noise.</p></li><li><p><strong>Ambiguous Causality</strong>: Summary metrics are often the result of team interactions, making it difficult to isolate the player&#8217;s individual contribution.</p></li></ul><p>While comparing summary vectors of advanced stats can sometimes approximate similarity, it&#8217;s fundamentally a substituted problem&#8212;and it often leads to misleading conclusions.</p><div><hr></div><h3>Stimulus and Response</h3><p>Instead of focusing on outputs, I think we should frame this as a stimulus-response problem: when a player is presented with a specific situation, how do they respond? And are there patterns or tendencies that can be uncovered in how individual players react to certain categories of situations?</p><p>Imagine a scenario where a midfielder has the ball just outside the penalty area:</p><ol><li><p>Do they attempt a safe pass back to the center-back?</p></li><li><p>Do they risk a through-ball to split the opposing back line?</p></li><li><p>Do they switch play with a long cross-field pass?</p></li></ol><p>Each decision tells us something about the player&#8217;s tendencies. But here&#8217;s the catch: we&#8217;ve only recently gained the tools to study this. In the event data era, we were not privy to the full decision space because we lacked the positional context to know what other options were available. 
Tracking data, however, is changing the game.</p><div><hr></div><h3>Player Embeddings</h3><p>With tracking data, we can start building <strong>player embeddings</strong>&#8212;neural network-generated representations of a player&#8217;s decision-making tendencies. These embeddings allow us to simulate how a player might act in hypothetical scenarios.</p><p>For example, let&#8217;s say Player A and Player B are central midfielders. We can use their embeddings to compare how they respond to similar situations&#8212;like breaking defensive lines or retaining possession under pressure. This lets us evaluate their decision-making tendencies more directly, sidestepping some of the biases baked into summary stats.</p><p>Below are some example surfaces that were generated for my 2021 talk at <a href="https://www.nessis.org/">NESSIS</a> titled <em>Player Masks: Encoding Soccer Decision-Making Tendencies</em>, which demonstrates this concept using <strong>StatsBomb 360</strong> data.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vgWm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64ff53ff-2ccb-49ec-b3a3-fbc8b2106af1_1838x912.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vgWm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64ff53ff-2ccb-49ec-b3a3-fbc8b2106af1_1838x912.png 424w, https://substackcdn.com/image/fetch/$s_!vgWm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64ff53ff-2ccb-49ec-b3a3-fbc8b2106af1_1838x912.png 848w, 
https://substackcdn.com/image/fetch/$s_!vgWm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64ff53ff-2ccb-49ec-b3a3-fbc8b2106af1_1838x912.png 1272w, https://substackcdn.com/image/fetch/$s_!vgWm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64ff53ff-2ccb-49ec-b3a3-fbc8b2106af1_1838x912.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vgWm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64ff53ff-2ccb-49ec-b3a3-fbc8b2106af1_1838x912.png" width="1456" height="722" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/64ff53ff-2ccb-49ec-b3a3-fbc8b2106af1_1838x912.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:722,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:402995,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vgWm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64ff53ff-2ccb-49ec-b3a3-fbc8b2106af1_1838x912.png 424w, https://substackcdn.com/image/fetch/$s_!vgWm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64ff53ff-2ccb-49ec-b3a3-fbc8b2106af1_1838x912.png 848w, 
https://substackcdn.com/image/fetch/$s_!vgWm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64ff53ff-2ccb-49ec-b3a3-fbc8b2106af1_1838x912.png 1272w, https://substackcdn.com/image/fetch/$s_!vgWm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64ff53ff-2ccb-49ec-b3a3-fbc8b2106af1_1838x912.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">These surfaces show the probability distribution of pass destinations for three different situations for a generic 
(i.e. average) player embedding.</figcaption></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kmMQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F327490ff-c990-4b22-9012-5eb2183622d8_1762x1022.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kmMQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F327490ff-c990-4b22-9012-5eb2183622d8_1762x1022.png 424w, https://substackcdn.com/image/fetch/$s_!kmMQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F327490ff-c990-4b22-9012-5eb2183622d8_1762x1022.png 848w, https://substackcdn.com/image/fetch/$s_!kmMQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F327490ff-c990-4b22-9012-5eb2183622d8_1762x1022.png 1272w, https://substackcdn.com/image/fetch/$s_!kmMQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F327490ff-c990-4b22-9012-5eb2183622d8_1762x1022.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kmMQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F327490ff-c990-4b22-9012-5eb2183622d8_1762x1022.png" width="1456" height="845" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/327490ff-c990-4b22-9012-5eb2183622d8_1762x1022.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:845,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:294107,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kmMQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F327490ff-c990-4b22-9012-5eb2183622d8_1762x1022.png 424w, https://substackcdn.com/image/fetch/$s_!kmMQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F327490ff-c990-4b22-9012-5eb2183622d8_1762x1022.png 848w, https://substackcdn.com/image/fetch/$s_!kmMQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F327490ff-c990-4b22-9012-5eb2183622d8_1762x1022.png 1272w, https://substackcdn.com/image/fetch/$s_!kmMQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F327490ff-c990-4b22-9012-5eb2183622d8_1762x1022.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">This surface demonstrates the difference between likely passing destination surfaces generated by two different player embeddings.</figcaption></figure></div><div><hr></div><h3>Strengths, Limitations, and Future Directions</h3><p>This approach has clear advantages:</p><ul><li><p><strong>Contextual Comparison</strong>: Simulated scenarios reduce the noise of team and game-state effects.</p></li><li><p><strong>Decision Focus</strong>: Embeddings capture how players think, not just what they produce.</p></li></ul><p>But it&#8217;s not perfect. Even the best embeddings will still reflect some influence of team style, coaching, and the opportunities presented by specific systems. 
And while embeddings offer a more nuanced view, they&#8217;re not always easy to interpret, which can make them tricky to communicate to non-technical audiences.</p><p>I imagine that it might be a bit difficult to sell a scouting executive on decision-making tendencies calculated from synthetic data that you can&#8217;t tie directly back to video!</p><p>Still, as tracking data becomes more widespread, these methods will only improve. We&#8217;re not quite at the point where player similarity is a solved problem, but we&#8217;re getting closer. Additionally, we haven&#8217;t said anything about a player&#8217;s physical capacity. A player might have elite decision-making, but it doesn&#8217;t matter if they can&#8217;t get themselves where they need to go.</p><div><hr></div><h3>Validation?</h3><p>It is not obvious how you might validate a model like this. But in theory, you would expect a player to be assigned approximately the same embedding when an identical training process is run separately on a training data set and a validation data set.</p><p>Practically, you will want to selectively freeze certain portions of the model after training to generate embeddings for future players who were not part of the initial training data set. And you will probably need to retrain something like this on a regular basis to account for drift.</p><p>Of course, you would see some player-level variation from season to season, or perhaps when playing for different teams, but hopefully some signal is retained across all of these iterations.</p><p>My toy experiments with this concept almost certainly don&#8217;t have enough data to truly construct robust embeddings and are probably just overfitting. But the output surfaces look believable, and I think the approach is conceptually strong.</p><div><hr></div><h3>So, What&#8217;s Next?</h3><p>For now, the challenge is twofold: refining these models and integrating them into practical workflows. 
As a field, we need to consciously resist substitution bias and think deeply about what "similarity" really means and how it applies to the questions we&#8217;re trying to answer.</p><p>Ultimately, it&#8217;s about understanding the game on a deeper level&#8212;breaking down the decisions players make and the factors that influence them. If we get this right, the implications go far beyond player recruitment.</p>]]></content:encoded></item></channel></rss>