
Lies, Damn Lies and Statistics


Silvio Dante


47 minutes ago, 054123 said:

I listen to the Not The Top 20 Podcast and they live by XG to the point where it clouds their judgment IMO.

Apparently our low XG means we’re about to slide away. 

I agree that taking it that far and to such an absolute conclusion is to take it too seriously.

However, having an "actual goals against/for" ratio that is far higher than your xG does suggest that there will be a regression to the mean at some point in the future. That is, it would not be surprising to see our points per game drop. Now you can say "Yeah, obviously - we're on an unbeaten run but quite a few games have been draws and really we've not been playing scintillating football". I'd agree, whether you look at it anecdotally or statistically, that if we keep going as we are our luck will run out, and so it is likely that we'll drop off soon. What the xG does is provide some form of statistical support to that argument.

xG cannot account for players returning from injury or new signings. When Kalas and Nagy are back in the team we may well see our defence tighten up and thus we could actually continue the form we are in, or even improve. That doesn't mean the xG is wrong, it means that taking it to the extreme conclusion that "we're [definitely] about to slide away" was wrong.
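
For what it's worth, the regression-to-the-mean point can be sketched with a quick simulation. This is only an illustration under loud assumptions that do not come from any real xG model: every team has the same underlying scoring rate (a made-up 1.3 goals per game), goals follow a Poisson distribution, and the names `simulate_goals` and `regression_demo` are invented for the example. It picks out the teams that scored most in their first 11 games and looks at what they do in the remaining 35.

```python
import math
import random


def simulate_goals(rate, games, rng):
    """Total goals over `games` matches, each drawn from Poisson(rate).

    Uses Knuth's multiplication method so only the standard library is needed.
    """
    threshold = math.exp(-rate)
    total = 0
    for _ in range(games):
        count, product = 0, rng.random()
        while product > threshold:
            count += 1
            product *= rng.random()
        total += count
    return total


def regression_demo(teams=10000, rate=1.3, early=11, late=35, seed=1):
    """Average goals per game, early and late, for the hottest 10% of starters.

    Assumption: all teams share the same true `rate`, so any early
    over-performance is pure luck.
    """
    rng = random.Random(seed)
    records = [
        (simulate_goals(rate, early, rng), simulate_goals(rate, late, rng))
        for _ in range(teams)
    ]
    # Keep the top 10% of teams by goals scored in the early games.
    records.sort(key=lambda record: record[0], reverse=True)
    hot = records[: teams // 10]
    early_avg = sum(e for e, _ in hot) / (len(hot) * early)
    late_avg = sum(l for _, l in hot) / (len(hot) * late)
    return early_avg, late_avg


early_avg, late_avg = regression_demo()
print(f"hot starters, first 11 games: {early_avg:.2f} goals per game")
print(f"same teams, next 35 games:    {late_avg:.2f} goals per game")
```

The hot starters here are no better than anyone else by construction, so their later scoring drifts back towards the underlying rate. Real teams do differ in quality, which is why the real-world effect is only partial and why it is a "likely", not a "definitely".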


1 hour ago, 054123 said:

I listen to the Not The Top 20 Podcast and they live by XG to the point where it clouds their judgment IMO.

Apparently our low XG means we’re about to slide away. 

If people simply look at xG it has to cloud judgement. Teams with high xG can finish anywhere in the league because it does not include defending, passing etc. Combine xG with defending and you have what's called the e rating. A low xG and a miserly defence?

Afobe was outperforming the xG. He was scoring above expectation from the chances created. Would he have continued in the same form? xG here could suggest caution. The expectation has to be no. He is not Messi, so he would start missing more chances; looking at his history elsewhere, he was outperforming his norms.


2 hours ago, 054123 said:

I listen to the Not The Top 20 Podcast and they live by XG to the point where it clouds their judgment IMO.

Apparently our low XG means we’re about to slide away. 

Couldn’t agree more. Very good podcast but that part is annoying. I’ve given up on The Second Tier Podcast as they don’t even seem to watch games!

Tifo podcast is excellent, albeit not Championship-focused.


On 09/10/2019 at 12:32, Silvio Dante said:

My point here is that 25% is a very decent sample.

Actually, it isn't, at all. Statistics Graduate here.

Each team has played 11 games. For statistical analysis that is a minuscule sample. (38 or 46 at the end of a season is too.)

Even considering the total games played so far in the Championship, 132, I can't imagine that being enough to prove statistically significant.

Anything like this must be taken with a huge pinch of salt, as even in a whole season there are not nearly enough matches for any team to revert to their xG mean (this will be proven by the final standings at season end being miles off what xG would predict).

In short, it is not unlikely that a team might get lucky, even over a whole season. (Leicester and Chelsea did it, consecutively, in 15/16 and 16/17 if you believe fully in xG. Chelsea overachieved their xG by 23 goals that season. But 36 of their total goals were scored by Diego Costa or Eden Hazard. Did these players get lucky, or are they just far superior to the ranking level used to calculate xG? I know what I think the answer is.)

I'm not saying xG is wholly nonsense, but it's not something to cry ourselves to sleep over either.
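
To put rough numbers on the sample-size point, here is a small sketch. It assumes goals behave like a Poisson process, so that the standard deviation of a goal total is the square root of its expected value; that is a common simplification rather than how any particular xG provider works, and the 1.3 xG per game figure and the `goal_spread` helper are made up for illustration.

```python
import math


def goal_spread(xg_per_game, games):
    """Expected goals and one standard deviation of luck over `games` matches."""
    expected = xg_per_game * games
    spread = math.sqrt(expected)  # Poisson assumption: variance equals the mean
    return expected, spread


for games in (11, 46):
    expected, spread = goal_spread(1.3, games)
    print(f"{games} games: expect {expected:.1f} goals, "
          f"give or take {spread:.1f} ({spread / expected:.0%} of the total)")
```

Under those assumptions the luck component is roughly a quarter of the expected total after 11 games and still about an eighth after 46, so a sizeable gap between goals and xG over a single season is not in itself remarkable.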


21 hours ago, spudski said:

You might want to read through this for a fuller explanation mate..

https://www.theguardian.com/football/2017/mar/30/expected-goals-big-football-data-leicester-city-norwich

Thanks for nothing @spudski. I read this in the hope that it may all become clear, but now I’m more confused than ever. I can understand the basics - e.g. the position of a shot; a shot from a rebound or a set piece etc. - but how are the other variables factored in? For example, the data used to calculate XG must be historical, but historical data can only accurately predict the future if nothing changes. This doesn’t happen with football, as team selection, form etc. change on a match by match basis.

Another thing that puzzles me is that nobody seems to mention XGC - expected goals conceded. The chances of winning a match must be a combination of XG and XGC.
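
One common way to combine the two numbers, shown here purely as a sketch, is to treat each side's goals as an independent Poisson variable whose mean is its expected goals, then add up the probability of every scoreline. Nothing in the thread says any xG provider does it this way; the independence and Poisson assumptions, the cut-off of 15 goals, and the function names `poisson_pmf` and `match_odds` are all choices made for the illustration.

```python
import math


def poisson_pmf(k, rate):
    """Probability of exactly k goals when the expected number is `rate`."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def match_odds(xg_for, xg_against, max_goals=15):
    """Win, draw and loss probabilities from expected goals for and against.

    Assumes the two teams' goal counts are independent Poisson variables and
    ignores scorelines above `max_goals`, which are vanishingly rare.
    """
    win = draw = loss = 0.0
    for ours in range(max_goals + 1):
        for theirs in range(max_goals + 1):
            p = poisson_pmf(ours, xg_for) * poisson_pmf(theirs, xg_against)
            if ours > theirs:
                win += p
            elif ours == theirs:
                draw += p
            else:
                loss += p
    return win, draw, loss


win, draw, loss = match_odds(xg_for=1.5, xg_against=1.0)
print(f"win {win:.0%}, draw {draw:.0%}, loss {loss:.0%}")
```

The point of the sketch is only that neither number means much on its own: the same 1.5 xG gives very different odds depending on what is expected at the other end.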


4 hours ago, 054123 said:

I listen to the Not The Top 20 Podcast and they live by XG to the point where it clouds their judgment IMO.

Apparently our low XG means we’re about to slide away. 

Agree. Started to listen at the start of the season, but I found them to be all-knowing (in their own minds!!!) and as if nobody else’s opinion mattered.

I’m sure they watch lots of football and look at loads of stats, but I think they want everyone to think they are the gospels.

Unsubscribed!


37 minutes ago, pongo88 said:

Thanks for nothing @spudski. I read this in the hope that it may all become clear, but now I’m more confused than ever. I can understand the basics - e.g. the position of a shot; a shot from a rebound or a set piece etc. - but how are the other variables factored in? For example, the data used to calculate XG must be historical, but historical data can only accurately predict the future if nothing changes. This doesn’t happen with football, as team selection, form etc. change on a match by match basis.

Another thing that puzzles me is that nobody seems to mention XGC - expected goals conceded. The chances of winning a match must be a combination of XG and XGC.

I think that’s really the problem - football isn’t a static dataset as the teams constantly evolve. This is why I say a 25% sample is reasonable - this year’s Championship won’t be the same as next year’s. So I take the point made that 11 games isn’t a big number, and nor in stats is 46, but if xG is a reliable predictor of a team’s quality, which changes year on year, then 11 games (OK, just below 25%) is a decent sample - the team won’t be the same or create the same xG the following season, so your maximum sample is 46.

Putting that aside, however.

The point about Maupay’s chance above struck a chord - and maybe someone more familiar can answer me here. I get that xG is based on the quality of chance etc., but is that then overlaid, if not with the quality of the individual, with the level the game was played at? I can just about live with a logic that says of 100 chances of that nature in the Premier League, stats prove 60% will be taken, as players will be broadly of similar ability. However, if the xG of the chance covers teams from Liverpool to Leamington Spa, the difference in the base data set is huge - and would be another undermining factor.

Essentially, not “apples with apples”

Anyone know?

 


47 minutes ago, Silvio Dante said:

I think that’s really the problem - football isn’t a static dataset as the teams constantly evolve. This is why I say a 25% sample is reasonable - this year’s Championship won’t be the same as next year’s. So I take the point made that 11 games isn’t a big number, and nor in stats is 46, but if xG is a reliable predictor of a team’s quality, which changes year on year, then 11 games (OK, just below 25%) is a decent sample - the team won’t be the same or create the same xG the following season, so your maximum sample is 46.

Putting that aside, however.

The point about Maupay’s chance above struck a chord - and maybe someone more familiar can answer me here. I get that xG is based on the quality of chance etc., but is that then overlaid, if not with the quality of the individual, with the level the game was played at? I can just about live with a logic that says of 100 chances of that nature in the Premier League, stats prove 60% will be taken, as players will be broadly of similar ability. However, if the xG of the chance covers teams from Liverpool to Leamington Spa, the difference in the base data set is huge - and would be another undermining factor.

Essentially, not “apples with apples”

Anyone know?

 

xG values alter in line with divisions.


54 minutes ago, Silvio Dante said:

The point about Maupay’s chance above struck a chord - and maybe someone more familiar can answer me here. I get that xG is based on the quality of chance etc., but is that then overlaid, if not with the quality of the individual, with the level the game was played at? I can just about live with a logic that says of 100 chances of that nature in the Premier League, stats prove 60% will be taken, as players will be broadly of similar ability. However, if the xG of the chance covers teams from Liverpool to Leamington Spa, the difference in the base data set is huge - and would be another undermining factor.

 

This is a great point and I don't know the answer exactly. However, this is where I like to look at a chart like this one from experimental361 https://experimental361.com/2019/10/08/attack-breakdowns-championship-8-oct-2019/. This kind of helps you to compare a specific player with the "average" Championship player.

The key bit of explanation for this is as follows:

There’s a shaded “stripe” which indicates the long-term shot conversion rate of all finishers [in the division] except the top and bottom 10%, so we can identify those whose performance may be unsustainable (i.e. unlikely to be repeated next season). If a player is above the stripe, they’re converting chances at a rate consistent with someone in the top 10% of finishers, and likewise a player below the line is in the worst 10%. Based on what we know about the specific player, we can therefore take a view on whether we expect their scoring rate to continue.

So the below is, I think, quite encouraging for us. It tells us that our goalscorers are scoring at about the rate you would expect them to given the quality of the shots they are taking. Afobe might have been over-performing until injured, and Weimann is slightly over-performing. Essentially I look at this and think that the way we improve our performances is either by i) tightening up the defence or ii) increasing the number of quality chances we take. Fortunately we have our number one CB and a very promising CDM coming back from injury, so hopefully we can tighten up that defence without compromising our attack - and so increase the chance that we won't drop back.

[Image: Bristol City attack breakdown chart from experimental361, 8 October 2019]


3 hours ago, Silvio Dante said:

Thanks.

So that’s saying, using Maupay as the mark, that 4 out of 10 chances of that nature (i.e. top level striker, open goal) are missed - or am I missing something?

No, that's the expectation. The rating would be 0.6. A Harry Kane may outperform that figure, but the value is based upon a top level average.


9 hours ago, 054123 said:

I listen to the Not The Top 20 Podcast and they live by XG to the point where it clouds their judgment IMO.

Apparently our low XG means we’re about to slide away. 

Take solace from the fact that they'll have been furious at full time of that Spurs-Bayern game that I mentioned further up, as once again the actual football refused to bend to their Dungeons and Dragons calculations.


31 minutes ago, Cowshed said:

No, that's the expectation. The rating would be 0.6. A Harry Kane may outperform that figure, but the value is based upon a top level average.

Sorry, might be being dense here but isn’t that the same thing? The expectation is 40% of chances of that ilk missed, which leads to a rating of 0.6 (i.e. 60% conversion). Harry Kane may hit 8/10 or 9/10, but that in turn means other top division strikers hit 2/10 to get to the mean average.

So we’re saying that a top division striker may convert an open goal (a la Maupay’s chance) 2-3 times out of 10?

Again, apologies if missing something but that appears to be what you’re saying xG shows...
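
The arithmetic behind that question can be checked with a short sketch. It assumes, purely for illustration, that every striker in the pool gets the same number of chances of this type, that the pool average is the 0.6 quoted above, and that one stand-out finisher converts 9 in 10; the helper name `rest_of_field_rate` and the pool sizes are made up.

```python
def rest_of_field_rate(average, star_rate, strikers):
    """Conversion rate everyone else needs so the pool mean stays at `average`.

    Assumes each of the `strikers` takes the same number of such chances,
    and that exactly one of them converts at `star_rate`.
    """
    return (average * strikers - star_rate) / (strikers - 1)


for strikers in (2, 5, 20, 100):
    rate = rest_of_field_rate(0.6, 0.9, strikers)
    print(f"{strikers} strikers in the pool: the rest need to convert {rate:.1%}")
```

With only two strikers in the pool the other one would indeed have to be down around 3 in 10, but once the pool is large, a single elite finisher barely moves what everyone else has to convert to keep the average at 0.6.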


Archived

This topic is now archived and is closed to further replies.
