
Relative Importance Values 2016/17


JEP


Although this is the first post on this account, many years back I was registered on here as @JoeCityFan, and I visit the forum pretty much daily as a reader. I'm a City fan and a maths analyst, so as it's the off season and I was bored at the weekend (and inspired by the book Soccermatics by David Sumpter), I thought I'd take a quick look over our results in the hope of finding out something reasonably interesting. The result is what I've called a player's Relative Importance Value (RIV), i.e. a measure of whether we have performed better when starting with a particular player or not. To do this I used the ratio between the points gained per start of a player and the points per missed game*.

You could compare two players by comparing their points per game values (PPG), but there is no way to understand how the team performed when those players weren't playing. If Player A achieves a PPG of 1.2 and Player B achieves 1.1 PPG, then we would say that the team did better when Player A played. Though, if the team achieved 1.4 PPG when Player A did not play and 0.8 PPG when Player B did not play, we could assume that Player B was a more integral part of the side than Player A.
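As a rough sketch of that comparison (the function name `riv_from_ppg` is just an illustrative label, not something from the original data file), the ratio for the two hypothetical players can be worked out like this:

```python
def riv_from_ppg(ppg_started: float, ppg_missed: float) -> float:
    """Team points per game when the player starts, divided by
    team points per game when the player does not start."""
    return ppg_started / ppg_missed


# Figures taken from the Player A / Player B example above.
player_a = riv_from_ppg(1.2, 1.4)  # below 1: team did better without A
player_b = riv_from_ppg(1.1, 0.8)  # above 1: team did better with B

print(f"Player A RIV: {player_a:.3f}")
print(f"Player B RIV: {player_b:.3f}")
```

Player A has the higher PPG, but Player B comes out with the higher ratio, which is the point of comparing against the games each player missed.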

The most basic interpretation of the RIV score is that if a player's score = 1, the team performs equally well whether or not the player starts. A value greater than 1 indicates that the team achieves more points when this player starts, and a value less than 1 indicates the opposite: the team performs worse when the player starts. My interpretation is that when the score is high, a player fits into the system that the manager picks and the team benefits. A low score does not necessarily indicate that the player is bad (though they may be in some cases), but that they do not fit, or work well in, the system that has been picked.

Some caveats though. This formula is very basic and I really do appreciate that there are multiple levels of complexity that can be added to give a better overall picture of the squad. An assumption I've made is that substitutions do not affect the outcome of the match and essentially we live and die by the team that kicks off. Of course, in reality this isn't the case and substitutions do occasionally have an effect (Rotherham at home?) and it's something that I'd like to investigate further. Something else you'll notice from the graphs is the lack of any bar for Aden Flint. This is simply because he played every game, which means there isn't anything to divide his points per game by.

Results

On to the results. I've uploaded a couple of graphs to the link below. The first is a graph ranking the players by their RIV score and in the second I've tried to group them by position and then rank them.

http://imgur.com/a/6uEd3

Note: Engvall and McCoulsky are there because their squad numbers were between some of the other players' numbers. Djuric's bar is blank because we lost the three games that he started in.

Are the results what you might have thought? Generally most of the graph is as expected, especially with Giefer and Adam Matthews down the bottom! Interestingly, our most effective performers are Matty Taylor and Jamie Paterson. If you remember, in the short amount of time that Tammy didn't play we didn't fare too badly, which is reflected by his score of 1.2. A few other things to note:

  • Pack and Smith score highly. Smith started 14/21 games in the same side as Pack, so I believe there is a relationship between their scores. Smith's score could have been brought down by him not starting those 7 games alongside Pack.
  • Fielding may be shaky, but he's no Giefer!
  • The general improvement with Wright in defence over Magnusson.
  • Bobby Reid, Callum O'Dowda and Zak Vyner are not starters at the moment and Wilbs might be more suited to off the bench appearances!
  • The performance of the side is very slightly better when Tomlin does not start.
  • Hopefully we'll find a place for Hegeler to fit next year.
  • Joe Bryan with a score less than 0.8??

 

The Curious Case of Joe Bryan

On the back of that bullet point, I tried to work out why Joe Bryan's score was so low, since this is a player who has started the majority of games this season (39) and allegedly been scouted by Premier League clubs. Surely he must be more important to us than that? It occurred to me that he is probably the only player in the side whose position is split between two: left back and left midfield. I've gone back through all of the starting lineups to find the games where I've assumed him to have started at left back or left midfield. The result is the following graph:

http://imgur.com/Xhk7MmS

The difference between the performance of the team in Joe's two positions is pretty clear. We seem to struggle when he plays in midfield but his importance at left back puts him third in the overall squad ranking (ignoring Flint!).

Summary

The RIV is a measure of the effectiveness of starting particular players. It isn't affected by the strength of the opposition squad, the subs made or the overall context of the match. In that sense it is very basic. Though it is not a coincidence that the top players in the squad are those that finished the season so strongly, it is interesting to see the positive effect that Matty Taylor and Bailey Wright had on the club and the performances of the team.

Hopefully this little bit of analysis has been interesting. I look forward to reading, replying and learning from your thoughts. I'm happy to send the data file I've created to anyone who wants it, since it's all sourced from the BBC website. If there's anything else of interest that you'd like me to post, just ask and I'll let you know if I can do it!

 

Joe

 

*The Method (if you're interested in how I calculated each player's RIV)

I've been through the team selection (from the BBC match reports) for every league match this season and noted the number of points the team earned in the games each player started, then divided this by the number of games they started to find their points gained per start. Then I've taken the difference between the total number of points City achieved and the points earned in that player's starts, and divided that by the number of games they didn't start. The ratio of these two numbers is the RIV.

For example: the team earned 35 points in the 30 games that Lee Tomlin started, so 35/30 = ~1.17 points per start. City finished with 54 points from 46 games, which means we got 19 points in the 16 games he didn't start, so 19/16 = ~1.19 points per missed game. RIV = 1.17/1.19 = ~0.98.
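For anyone who wants to reproduce the method, here is a minimal sketch of it in Python. The function and argument names are my own labels rather than anything in the data file, and returning `None` for the undefined cases (a player who started every game, such as Flint, or a team that took no points without the player) is an assumption about how to handle them.

```python
from typing import Optional


def relative_importance_value(
    points_when_started: int,
    games_started: int,
    total_points: int,
    total_games: int,
) -> Optional[float]:
    """Points per start divided by points per missed game.

    Returns None when the ratio is undefined: the player never started,
    started every game, or the team took no points without them.
    """
    games_missed = total_games - games_started
    if games_started == 0 or games_missed == 0:
        return None

    points_per_start = points_when_started / games_started
    points_per_missed = (total_points - points_when_started) / games_missed
    if points_per_missed == 0:
        return None

    return points_per_start / points_per_missed


# Lee Tomlin: 35 points from 30 starts, in a 54-point, 46-game season.
tomlin = relative_importance_value(35, 30, 54, 46)
print(round(tomlin, 2))  # prints 0.98

# A player who started all 46 games has no missed games to divide by.
print(relative_importance_value(54, 46, 54, 46))  # prints None
```

A player whose starts all ended in defeat, like Djuric with three losses from three starts, comes out at exactly 0, which is why his bar is blank.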

Link to comment
Share on other sites

Great work, interesting to see Little in the above 1 category, no surprise to see Matthews as the worst, and amazed at Matty Taylor, but considering he was involved very heavily at the end of the season, when results went our way...

Maybe people can start to see the benefit he has to the team, even though he is not banging in the goals.

Link to comment
Share on other sites

17 minutes ago, whale said:

Interesting to see Lee Tomlin coming in at around 1. 

 

If I'm understanding this correctly then that means it is of no benefit or detriment whether or not we play him? 

Essentially there is a minuscule drop in results when he started compared to when he didn't, is my understanding.

Although not my graph, equations, etc (and fair play to the OP for doing it), I would personally say that 0.02 under the base level is marginal enough to say it makes no difference.

5 minutes ago, asfred said:

Great work, interesting to see Little in the above 1 category, no surprise to see Matthews as the worst, and amazed at Matty Taylor, but considering he was involved very heavily at the end of the season, when results went our way...

Maybe people can start to see the benefit he has to the team, even though he is not banging in the goals.

I'm not that surprised to see Little get above 1.

He was by no means a star player, and there was usually a mistake or two in him in each game, but he had 2 spells where he played fantastically well, and at both points we produced extremely good form.

Not saying we should have kept him, and agree with the club not renewing his deal (though wouldn't have objected to him being here as backup), but he was very unfairly singled out on here.

Link to comment
Share on other sites

Nice stuff.

If you're a bit bored, why not take the score at the time the starting player was taken off, and then weight it as a percentage of 90 minutes. You could also do the same for players subbed on.

e.g. Tomlin starts and we are 1-1 for 60 minutes, equals 1 point at 60/90ths, Wilbs comes on and we win 2-1, he gets 2 points for 30/90ths. You could look at it as if Wilbs got a 1-0 win for 30/90ths.

There is also something in the game that all starters start at 0-0, with a point. Does that need some reflection in the data, especially if you start to integrate sub data? You could end up with a sub at 2-1 up, turning into a 2-3 defeat, a change of -3 points for 30/90ths.

Food for thought?
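One possible reading of this suggestion, sketched in Python. Everything here is an assumption for illustration: the function names are made up, a starter is credited with the points the scoreline was worth when they went off, a sub is credited with the change in points between coming on and full time, and the weight is simply minutes played over 90. How the points and weights would then be combined into a single score is left open, as it is in the post.

```python
from typing import Tuple

Score = Tuple[int, int]  # (goals for, goals against)


def points_for(score: Score) -> int:
    """League points a scoreline is worth: 3 win, 1 draw, 0 defeat."""
    goals_for, goals_against = score
    if goals_for > goals_against:
        return 3
    if goals_for == goals_against:
        return 1
    return 0


def starter_share(score_when_off: Score, minutes_played: int) -> Tuple[int, float]:
    """Points held when the starter left, and the fraction of the match played."""
    return points_for(score_when_off), minutes_played / 90


def sub_share(score_when_on: Score, final_score: Score,
              minutes_played: int) -> Tuple[int, float]:
    """Change in points while the sub was on, and the fraction of the match played."""
    change = points_for(final_score) - points_for(score_when_on)
    return change, minutes_played / 90


# Starter goes off at 1-1 after 60 minutes: 1 point at 60/90ths.
print(starter_share((1, 1), 60))

# Sub comes on at 1-1 and we win 2-1: +2 points for 30/90ths.
print(sub_share((1, 1), (2, 1), 30))

# Sub comes on at 2-1 up and we lose 2-3: -3 points for 30/90ths.
print(sub_share((2, 1), (2, 3), 30))
```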

Link to comment
Share on other sites

Thanks everyone for the feedback - I'm glad you've found it interesting!

22 hours ago, asfred said:

Great work, interesting to see Little in the above 1 category, no surprise to see Matthews as the worst, and amazed at Matty Taylor, but considering he was involved very heavily at the end of the season, when results went our way...

Maybe people can start to see the benefit he has to the team, even though he is not banging in the goals.

I think Taylor's score is inflated slightly due to him only starting 9 games since arriving, but we picked up more points per game in those 9 games than we did over the rest of the season, so his presence really did have a positive effect on the team. With that in mind, I think he could be an important player for us next season in the 10 role, replacing Tomlin. I wouldn't be surprised to see him take over that shirt number too.

22 hours ago, JamesBCFC said:

Essentially there is a minuscule drop in results when he started compared to when he didn't, is my understanding.

Although not my graph, equations, etc (and fair play to the OP for doing it), I would personally say that 0.02 under the base level is marginal enough to say it makes no difference.

That's my understanding too. Tomlin's score of just under 1 means that we performed pretty much as well in the games that he started as in the games that he didn't. It will be interesting to see whether, should Tomlin leave this summer, he has a similarly indifferent impact on his new club next season.

On the topic of Little I agree, he did excellently for us last season in the role he was given. The graph also highlights how much better he did than our other right back options. What I imagine has happened is that Little has left for first team football (and he deserves it), because he would be a more than capable back-up right back for us and I'm sure the management would feel that too.

22 hours ago, Davefevs said:

Nice stuff.

If you're a bit bored, why not take the score at the time the starting player was taken off, and then weight it as a percentage of 90 minutes. You could also do the same for players subbed on.

e.g. Tomlin starts and we are 1-1 for 60 minutes, equals 1 point at 60/90ths, Wilbs comes on and we win 2-1, he gets 2 points for 30/90ths. You could look at it as if Wilbs got a 1-0 win for 30/90ths.

There is also something in the game that all starters start at 0-0, with a point. Does that need some reflection in the data, especially if you start to integrate sub data? You could end up with a sub at 2-1 up, turning into a 2-3 defeat, a change of -3 points for 30/90ths.

Food for thought?

Thanks Dave. It occurred to me about halfway through collecting the data that I should have included the times that players were subbed, but was too lazy to go back and start again! I might go back and take another look when I'm bored next.

Similar to that, I've been creating some squads representing our most common starting lineups at various points in the season. I'll post them up when I've got them in a format I'm happy presenting!

Link to comment
Share on other sites

42 minutes ago, JEP said:

Thanks everyone for the feedback - I'm glad you've found it interesting!

I think Taylor's score is inflated slightly due to him only starting 9 games since arriving, but we picked up more points per game in those 9 games than we did over the rest of the season, so his presence really did have a positive effect on the team. With that in mind, I think he could be an important player for us next season in the 10 role, replacing Tomlin. I wouldn't be surprised to see him take over that shirt number too.

That's my understanding too. Tomlin's score of just under 1 means that we performed pretty much as well in the games that he started as in the games that he didn't. It will be interesting to see whether, should Tomlin leave this summer, he has a similarly indifferent impact on his new club next season.

On the topic of Little I agree, he did excellently for us last season in the role he was given. The graph also highlights how much better he did than our other right back options. What I imagine has happened is that Little has left for first team football (and he deserves it), because he would be a more than capable back-up right back for us and I'm sure the management would feel that too.

Thanks Dave. It occurred to me about halfway through collecting the data that I should have included the times that players were subbed, but was too lazy to go back and start again! I might go back and take another look when I'm bored next.

Similar to that, I've been creating some squads representing our most common starting lineups at various points in the season. I'll post them up when I've got them in a format I'm happy presenting!

Do you reckon the club do this sort of analysis? I expect the data can all be extracted from Prozone or something, but I wonder if it is a set of data that is used widely.

Link to comment
Share on other sites

10 hours ago, Red Right Hand said:

Do you reckon the club do this sort of analysis? I expect the data can all be extracted from Prozone or something, but I wonder if it is a set of data that is used widely.

They must do. Just had a search on the club's website for "Analyst" and they were recruiting for a first team analyst last summer. Link here.

A couple of interesting things from the link:

  • The first team analyst is part of the analysis department and reports to the head of analysis - meaning they should have a group of people doing this.
    • To add to this, I'd be very surprised if they didn't already know what I posted at the top of the page
  • A quick google of Scout7 looks like it is some massive player database, which must make data collection and analysis a whole lot easier. The candidate also needs to be able to use Prozone, so there's that too.
  • 4 of the 5 requirements are directly sports/football related rather than statistics (with the 5th being a driving licence), which rules me out of ever getting the job!
Link to comment
Share on other sites

Archived

This topic is now archived and is closed to further replies.
