Clint Dempsey: Average Position vs. Result


I have been experimenting with some different positional visualization ideas and this is hopefully the first of a handful of related posts. Once I stop being technically inept (and/or lazy), and figure out how to properly plug MySQL into Java/Processing, I can mass produce these for every player in the league. I picked Dempsey because he played very regularly for Fulham (35 starts) last season and was relatively integral to their success.

What you’re looking at is the average position of Clint Dempsey during the 37 EPL games that he appeared in for Fulham during the 2010-2011 season. Light green circles represent his position when Fulham won, the dark green circles represent when Fulham drew, and the red circles are when Fulham lost.

All circles are connected to the season’s average position via a line to show how different the position was from the “norm”. The concentric opaque circles represent one and two standard deviations from the average.

We can make a couple of interesting observations from this visualization. First is the obvious tendency for Dempsey to drift forward during Fulham wins.

Astute readers will point out that Dempsey was deployed as both a striker and an outside winger during the season. I recognize this, but it’s tough to discount that Fulham didn’t lose any of the 10 games in which Dempsey was deployed furthest up the pitch. This correlation does not necessarily imply causation: a player shifting backwards could be a result of his team losing rather than the cause of it.

The other interesting observation is that the further Clint’s average position in a game is from his season average, the more likely Fulham is to win. For positions beyond one standard deviation, Fulham seems to be about three times as likely to win.
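As a rough illustration of the comparison above, here is a minimal Python sketch. The post does not say how the standard deviation of a 2-D position was computed, so this assumes it is the root-mean-square distance from the season's average position. The `games` list, the coordinate scale, and all function names are hypothetical and are not the real Dempsey data.

```python
import math

# Hypothetical per-game records: (average x, average y, result).
# Coordinates are made-up pitch units, NOT the real data behind the chart.
games = [
    (55.0, 40.0, "W"),
    (60.0, 45.0, "W"),
    (48.0, 38.0, "L"),
    (50.0, 42.0, "D"),
    (70.0, 50.0, "W"),
    (47.0, 41.0, "L"),
]


def season_mean(games):
    """Mean of the per-game average positions."""
    n = len(games)
    return (sum(g[0] for g in games) / n, sum(g[1] for g in games) / n)


def radial_sigma(games, center):
    """Root-mean-square distance from the season mean.

    Assumption: this stands in for 'one standard deviation' in the chart.
    """
    n = len(games)
    squared = sum((g[0] - center[0]) ** 2 + (g[1] - center[1]) ** 2 for g in games)
    return math.sqrt(squared / n)


def win_rate_by_band(games):
    """Win rate for games inside vs. outside one sigma of the season mean."""
    center = season_mean(games)
    sigma = radial_sigma(games, center)
    bands = {"inside": [], "outside": []}
    for x, y, result in games:
        dist = math.hypot(x - center[0], y - center[1])
        bands["outside" if dist > sigma else "inside"].append(result)
    return {
        band: (results.count("W") / len(results) if results else None)
        for band, results in bands.items()
    }


rates = win_rate_by_band(games)
```

Dividing `rates["outside"]` by `rates["inside"]` gives the "times as likely to win" figure for whatever data is plugged in.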

Further ideas for this kind of visualization include adding some extra dimensionality. For example, if I weight the size of each circle by the positional standard deviation during that game, it would add some meaningful context to some of the outlying data points.

Clutch Goal-scoring in the English Premier League 2010

Extending the work done by Ford Bohrmann (Twitter: @SoccerStatistic) at SoccerStatistically for his Outcome Probability Calculator, I put together a method for weighting the relative importance of a particular goal.

Using Ford’s formulas, the percentage chance of victory can be calculated from the current score and the current minute. For example, a home team up by 1 goal in the 80th minute has a 90.5% chance of winning and an 8.5% chance of drawing. The away team has only a 1% chance of pulling out a victory.

However, if the away team manages to score a goal in the 80th minute, these statistics change dramatically. Suddenly, the home team has only a 17.5% chance of winning, a 70.7% chance of drawing, and an 11.8% chance of losing. This goal increased the away team's chance of winning by 10.8 percentage points (from 1% to 11.8%). The goal also increased the chance of a draw by 62.2 percentage points (from 8.5% to 70.7%).

Now, we compare the total expected points before and after the goal by weighting each outcome by its chance of happening. Since a victory is worth 3 points and a draw is worth 1 point, we sum the products of the two outcome point values and their probabilities.

For example, before the goal, the away team is expected to walk off with: (0.085)(1 point) + (0.010)(3 points) = 0.115 points.

After the goal, the away team has an 11.8% chance of winning and a 70.7% chance of drawing, so it is expected to get: (0.707)(1 point) + (0.118)(3 points) = 1.061 points.

Therefore, the worth (or weight) of this goal is the difference between the two expected values: (1.061 points) – (0.115 points) = 0.946 points
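The weighting step can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Ford's Outcome Probability Calculator itself: the win and draw probabilities are typed in by hand from the 80th-minute example (away team: 1% to win and 8.5% to draw before the goal, 11.8% to win and 70.7% to draw after). The function names are my own.

```python
WIN_POINTS = 3
DRAW_POINTS = 1


def expected_points(p_win, p_draw):
    """Expected league points given win and draw probabilities."""
    return WIN_POINTS * p_win + DRAW_POINTS * p_draw


def goal_value(before, after):
    """Change in the scoring team's expected points caused by a goal.

    `before` and `after` are (p_win, p_draw) tuples for the scoring team.
    """
    return expected_points(*after) - expected_points(*before)


# Away team probabilities from the 80th-minute example in the text.
away_before = (0.010, 0.085)
away_after = (0.118, 0.707)

value = goal_value(away_before, away_after)
print(f"Goal value for the away team: {value:.3f} points")
```

The same function applied to the home team's probabilities gives the cost of conceding that goal, which is what the "goals scored against" figures are built from.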

After weighting each goal by its expected point value during the 2010 English Premier League season, these are the average expected point values for each goal scored (or goal scored against):

| Team | Average Goal Value |
| --- | --- |
| Bolton Wanderers | 0.9979 |
| Tottenham Hotspur | 0.9882 |
| Wigan Athletic | 0.9744 |
| Birmingham City | 0.9735 |
| Aston Villa | 0.9727 |
| West Bromwich Albion | 0.9723 |
| Everton | 0.9161 |
| Fulham | 0.8796 |
| Manchester United | 0.8741 |
| Sunderland | 0.8544 |
| Liverpool | 0.8479 |
| Manchester City | 0.8322 |
| Wolverhampton Wanderers | 0.8260 |
| Blackpool | 0.8060 |
| Blackburn Rovers | 0.7824 |
| Arsenal | 0.7779 |
| Chelsea | 0.7479 |
| West Ham United | 0.7233 |
| Stoke City | 0.7212 |
| Newcastle United | 0.6658 |

Bolton, it seems, is the most clutch goal-scoring team during the 2010 EPL season – closely followed by Tottenham. Newcastle’s goals had the lowest average impact on the game.

Average expected value for goals scored against:

| Team | Average Goal Against Value |
| --- | --- |
| Everton | 1.0433 |
| Stoke City | 0.9963 |
| Bolton Wanderers | 0.9610 |
| Liverpool | 0.9493 |
| Fulham | 0.9101 |
| Blackpool | 0.8864 |
| Newcastle United | 0.8785 |
| Manchester United | 0.8774 |
| Chelsea | 0.8515 |
| Aston Villa | 0.8486 |
| Wolverhampton Wanderers | 0.8457 |
| Manchester City | 0.8357 |
| Birmingham City | 0.8341 |
| Blackburn Rovers | 0.8151 |
| Tottenham Hotspur | 0.8085 |
| Sunderland | 0.7974 |
| West Ham United | 0.7802 |
| Arsenal | 0.7755 |
| West Bromwich Albion | 0.7416 |
| Wigan Athletic | 0.7062 |

In 2010, Everton gave up the most clutch goals – followed by Stoke City and Bolton. On the other hand, Wigan and West Brom were the stingiest, giving up the least value for each goal conceded.


Drafting for Value in the MLS SuperDraft

There are approximately 530 currently active players in the MLS, of which about 200 initially entered the league via the MLS SuperDraft.

Using guaranteed compensation, draft selection number, and year drafted, a second-degree polynomial regression provides a formula that predicts the compensation a player can expect based on the pick number at which he was selected in the MLS SuperDraft and how many years it has been since he entered the MLS. This is a meaningful gain over a standard linear regression, which yields a coefficient of determination of only 29%. The polynomial (non-linear) regression improves that to 39%.
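For readers unfamiliar with the coefficient of determination quoted above, it is one minus the ratio of the model's squared error to the total variance of the actual values. The sketch below shows that calculation in plain Python. The salary figures in it are invented for illustration and are not the data behind the 29% and 39% figures.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1.0 - ss_res / ss_tot


# Invented salaries (in thousands of dollars), for illustration only.
actual_salaries = [100, 150, 200, 250]
model_salaries = [110, 140, 210, 240]

score = r_squared(actual_salaries, model_salaries)
```

A value of 0.39 therefore means the model accounts for 39% of the variation in compensation, with the rest left unexplained.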

The base salary for a player who entered the MLS via the SuperDraft, according to this statistical model, is $158,962.21. Depending on the player’s selection number in the draft and how many years the player has been in the league, this expected compensation value moves up or down. For each pick that passes before the player is selected, he loses $6,627.34 off the base salary but picks up $88.43 multiplied by his pick number squared. In other words, a pick’s expected value decreases with each selection, but the size of the decrease shrinks as the pick number grows.

For example, the salary for a rookie player selected with the third pick will have the expected initial salary of:

$139,876.06 = $158,962.21 – $6,627.34*(3) + $88.43*(3^2)

As the player ages, his salary is expected to increase $1,014.40 per year squared, lose $1,552.70 per year, and gain $106.18 per year multiplied by the player’s initial draft pick number.

For example, after this player has been in the league for two years, his expected salary grows to:

$141,465.34 = $139,876.06 + $1,014.40*(2^2) – $1,552.70*(2) + $106.18*(2)(3)
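The two formulas above combine into a single function of pick number and years in the league. Here is a minimal Python sketch using the coefficients quoted in the text. The function and constant names are my own, and the model is only meant for the range of picks and years it was fitted on.

```python
# Coefficients as quoted in the text (US dollars).
BASE = 158_962.21
PICK_LINEAR = -6_627.34      # per pick number
PICK_SQUARED = 88.43         # per pick number squared
YEARS_SQUARED = 1_014.40     # per year squared
YEARS_LINEAR = -1_552.70     # per year
YEARS_TIMES_PICK = 106.18    # per year, multiplied by pick number


def expected_salary(pick, years=0):
    """Expected compensation for a given draft pick after `years` in the league."""
    rookie = BASE + PICK_LINEAR * pick + PICK_SQUARED * pick ** 2
    growth = (
        YEARS_SQUARED * years ** 2
        + YEARS_LINEAR * years
        + YEARS_TIMES_PICK * years * pick
    )
    return rookie + growth


for years in (0, 2, 4):
    print(f"Pick 3 after {years} years: ${expected_salary(3, years):,.2f}")
```

Looping over picks and years with this function produces the kind of pick-by-year value table discussed next.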

Using these same formulas, we can develop a table of relative draft pick values, as well as their expected value after multiple years.

Full table is available at: http://dl.dropbox.com/u/380945/mlsSuperdraft.xls

This chart shows that the value of top picks, while initially high, tends not to increase as dramatically as that of lower draft picks. For example, the compensation of a player selected with a number three pick is expected to rise only $11,293.76 after four years of being in the league. On the other hand, the 38th draft pick’s compensation is expected to rise $26,158.96 over the same period.

According to the chart, exchanging a number three draft pick for any other two draft picks in the first round (given 18 picks per round) would be an upgrade in terms of expected growth. If this hypothetical team were to exchange its number three draft pick for the 17th and 18th draft picks, the combined expected salary of the two players would be slightly more than that of the 3rd pick alone. More importantly, their combined value is expected to increase by $34,904.40 over four years. In comparison, the 3rd pick in the same draft would have been expected to increase in value by only $11,293.76.
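The trade claim can be checked by brute force. The sketch below is an illustration under the model's quoted growth coefficients only. It enumerates every pair of other first-round picks and keeps those whose combined four-year growth exceeds that of the single pick. It compares growth, not starting salary, and the function names are hypothetical.

```python
from itertools import combinations

# Growth coefficients as quoted in the text (US dollars).
YEARS_SQUARED = 1_014.40
YEARS_LINEAR = -1_552.70
YEARS_TIMES_PICK = 106.18


def expected_growth(pick, years=4):
    """Expected rise in compensation for a pick after `years` in the league."""
    return (
        YEARS_SQUARED * years ** 2
        + YEARS_LINEAR * years
        + YEARS_TIMES_PICK * years * pick
    )


def pairs_that_outgrow(single_pick, round_size=18, years=4):
    """All pairs of other first-round picks whose combined growth beats `single_pick`."""
    target = expected_growth(single_pick, years)
    others = [p for p in range(1, round_size + 1) if p != single_pick]
    return [
        pair
        for pair in combinations(others, 2)
        if expected_growth(pair[0], years) + expected_growth(pair[1], years) > target
    ]


better_pairs = pairs_that_outgrow(3)
print(f"{len(better_pairs)} pairs of first-round picks out-grow the 3rd pick")
```

Because the growth formula gives every pick a positive four-year rise, any two picks together out-grow a single one under this model. That is a property of the fitted formula rather than evidence about individual players.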

Because of the MLS’s single entity structure, maximizing the growth of player value is perhaps even more important than maximizing the total value of the team. The player market in the MLS is very similar to playing the stock market, but only worrying about stock value fluctuations – not current stock value. According to this model, it may be in a team’s best interest to invest in “penny stocks”. Essentially, what this chart is suggesting is that it is much harder for a good player to double his value than it is for a lower-rated player.

However, there are certainly ramifications of taking this “penny stock” approach. A player’s fluctuation in value correlates very strongly with the number of minutes that he plays during a season. Also, it is much easier for a team to give one top draft pick playing minutes than to give two lower draft picks a significant share of time. This methodology doesn’t work if these investments ride the bench all season.

Also, there are clear salary cap and roster size-limit complications with taking this approach. With a top draft pick you can expect their salary to remain relatively static. With a lower draft pick (who manages to remain rostered), their value is expected to increase by about $10,000 in the first two years. For teams already pushing the salary cap, lower pick investments may not be the best avenue of growth. For teams with confidence in their ability to maximize a young player’s potential and have salary cap room to spare for long-term investments, this avenue is most certainly worth exploring.

Now, by calculating the expected compensation for every drafted player in the league, we quickly learn which players were good draft picks versus players that were not good draft picks. We will classify every player that has a lower actual compensation total than the expected compensation total as a bad pick. Conversely, we will classify every player that has a higher actual compensation total than the expected compensation total as a good pick. Notice, this classification does not imply that a particular player is a good (or bad) investment at this current point in his career.
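The classification rule is simple enough to sketch directly. In this Python illustration, the first two rows use figures from the top-ten table in this post, and the third row is an invented player added only to show the "bad pick" branch. The post does not say how to treat a player whose actual and expected compensation are equal, so the "even" label is my own assumption.

```python
def classify_pick(actual, expected):
    """Label a pick by comparing actual and expected compensation."""
    if actual > expected:
        return "good"
    if actual < expected:
        return "bad"
    return "even"  # assumption: the post does not define this case


# (name, actual compensation, expected compensation)
picks = [
    ("Brian Ching", 412_500, 178_464),
    ("Geoff Cameron", 245_000, 54_454),
    ("Hypothetical Player", 60_000, 95_000),  # invented row, not real data
]

# Rank by the difference between actual and expected compensation.
ranked = sorted(
    (
        (name, actual - expected, classify_pick(actual, expected))
        for name, actual, expected in picks
    ),
    key=lambda row: row[1],
    reverse=True,
)

for name, difference, label in ranked:
    print(f"{name}: {difference:+,} ({label})")
```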

Using this methodology (determining the difference between the player’s current salary and their algorithmically calculated expected salary), the ten best MLS SuperDraft picks (that are still currently active in the MLS) of all time are:

| Year | Pick | Name | Current Salary | Expected Salary | Difference |
| --- | --- | --- | --- | --- | --- |
| 2004 | 1 | *Freddy Adu | $594,884 | $192,003 | $402,881 |
| 2001 | 16 | Brian Ching | $412,500 | $178,464 | $234,036 |
| 2008 | 42 | Geoff Cameron | $245,000 | $54,454 | $190,546 |
| 2004 | 2 | Chad Marshall | $320,000 | $186,384 | $133,616 |
| 2006 | 1 | Marvell Wynne | $301,667 | $170,550 | $131,117 |
| 2010 | 8 | Dilly Duka | $223,000 | $111,914 | $111,086 |
| 2002 | 50 | Davy Arnaud | $258,750 | $164,643 | $94,197 |
| 2005 | 35 | Gonzalo Segares | $167,750 | $84,832 | $82,918 |
| 2009 | 41 | Danny Cruz | $123,000 | $45,551 | $77,449 |
| 2004 | 28 | Khari Stephenson | $178,333 | $102,373 | $75,960 |

*Freddy Adu is a special case because he has spent a lot of time outside of the MLS before returning. He was also on loan as a designated player and therefore only $415,000 of the player’s salary counted against the salary cap. Even at the league maximum, he is the best draft pick of all time with a positive differential of over $300,000.

By breaking down players based upon position, we can begin to determine what positions tend to do better than others in the draft.

| Metric | Striker | Midfielder | Defender | Goalkeeper |
| --- | --- | --- | --- | --- |
| Average Value Change | $5,805 | $9,523 | -$13,023 | $5,307 |
| Standard Deviation of Value Change | $91,058 | $49,612 | $42,919 | $40,609 |

According to these results, you are expected to make, on average, about 64% more money drafting a Midfielder than a Striker ($9,523 versus $5,805). With the standard deviation of the Striker difference being roughly twice that of any other position, it suggests that Strikers are risky picks, but have a greater potential for large payoff.

The model we have constructed suggests that a team’s total salary fluctuation, year to year, is much more heavily related to players that were selected late in the draft than to players that were selected early. Because of this result, the careful selection and development of late-round picks appears to be more closely related to a team’s financial growth than that of early-round picks.

It is important to remember that these conclusions are merely a guideline for drafting with potential value in mind. With a sample of only a decade of MLS SuperDraft results, it remains difficult to consider this guideline complete. As with any guideline, there will always be exceptions to these rules. Hopefully, with this mathematical model, franchises can better understand the risks that they are taking.