Monday, January 5, 2009

Numbers, Boring Numbers

Rather than try to edit, yet again, what should have been an abortive attempt to draw conclusions from first half totals, I'll just take a few minutes to run through some of the numbers that jumped out while I was grinding them, as well as some of the problems with trying to find predictive power in first half totals.

The main problem can be summed up by 20 and 24. These numbers appear so frequently as half totals, and they are so close to the mean half total, that trying to include them in broader groups merely skews the group. 21 and 23 are also common totals, which is not surprising as they sit very near the mean.

Across 256 games, 20 appeared 25 times as a first half total and 23 more times as a second half total, 48 in all or 9.4% of the 512 datapoints. A total of 24 appeared another 35 times. It should be clear that moving this many datapoints from one group to another will significantly alter the results. In fact, I did move them and ended up with a nearly perfect bell curve for all three groups.
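As a quick check on that arithmetic, here is a minimal sketch that uses only the counts quoted above. It assumes the 35 occurrences of 24 are counted across both halves, which the post does not spell out.

```python
# Figures quoted in the post; nothing here is new data.
games = 256
half_totals = games * 2      # 512 datapoints: one first half and one second half per game

count_20 = 25 + 23           # first half + second half occurrences of a total of 20
count_24 = 35                # assumption: this count covers both halves

share_20 = count_20 / half_totals
share_24 = count_24 / half_totals

print(f"20:       {share_20:.1%}")
print(f"24:       {share_24:.1%}")
print(f"20 or 24: {share_20 + share_24:.1%}")
```

Under that assumption the two totals together account for roughly one datapoint in six, which is why shifting them between groups changes the picture so much.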



Just a few observations. The first half median was 21 and the second half median 20. This is consistent with overall scoring: about 6% more points were scored in the first halves of games, which averaged 22.4 points against 21.2 for second halves. That the median is lower than the mean is no surprise, since there is a firm floor (zero) to the number of points that can be scored but no ceiling. For what it's worth, the highest scoring half was a 57 point second half. There were 7 half totals greater than 50.
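A small sketch of both points, for anyone who wants to check them. The two averages are the ones quoted above; the `sample` list is made up purely to show how a floor at zero and a long upper tail pull the mean above the median, and is not real game data.

```python
from statistics import mean, median

# Averages quoted in the post.
first_half_mean = 22.4
second_half_mean = 21.2
excess = first_half_mean / second_half_mean - 1
print(f"First halves outscored second halves by {excess:.1%}")

# Made-up half totals (NOT real data): bounded below by zero,
# with one very high-scoring half dragging the mean upward.
sample = [3, 10, 13, 17, 20, 21, 24, 27, 31, 58]
print(f"mean = {mean(sample):.1f}, median = {median(sample):.1f}")
```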

Moving back to breakdowns, it really becomes clear that first half scores are not predictive of second half totals. In this second stab I broke totals down into much smaller segments between 17 and 25 to try to capture a better picture of where range breaks were most frequent. Oddly, there were no first half totals of exactly 25 (although there were 7 second half totals at this number), but even so this worked out to reasonable groups. Roughly one-third of all first half totals were under 18 and one-third were over 25.
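For anyone who wants to repeat the breakdown, here is a rough sketch of the bucketing and cross-tabulation. The bucket labels follow the table's row labels; the "<17" bucket, the function names `bucket` and `cross_tab`, and the sample games are all assumptions for illustration, not the data or method actually used here.

```python
from collections import Counter

def bucket(total: int) -> str:
    """Map a half total to a group label.

    Labels mirror the table's rows; "<17" is an assumed catch-all
    so that every total lands in some bucket.
    """
    if total < 17:
        return "<17"
    if total == 17:
        return "17"
    if total <= 19:
        return "18-19"
    if total == 20:
        return "20"
    if total == 21:
        return "21"
    if total <= 23:
        return "22-23"
    if total == 24:
        return "24"
    if total == 25:
        return "25"
    return ">25"

def cross_tab(games):
    """Count games by (first half bucket, second half bucket)."""
    return Counter((bucket(first), bucket(second)) for first, second in games)

# Made-up (first half total, second half total) pairs, for illustration only.
sample_games = [(20, 24), (20, 27), (13, 17), (31, 20), (24, 24), (20, 24)]

table = cross_tab(sample_games)
for (first, second), n in sorted(table.items()):
    print(f"first half {first:>5} / second half {second:>5}: {n}")
```

Feeding in the real list of 256 games instead of `sample_games` would give the counts for each pairing of first and second half groups.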







17       8   1   0   3   1   3   1   0   9
18-19    2   1   0   0   1   1   1   0   3
20       4   3   3   3   1   1   2   0   6
21       2   3   0   1   1   0   1   0   7
22-23    3   1   0   3   0   1   0   0   4
24       3   5   1   2   0   1   2   0   4
25       3   0   0   1   0   1   1   0   0
>25     17   8   3   4   3   7   6   0  27

So what do we learn from this? The main thing we learn is that manually creating tables in html is an incredible pain. But aside from that? Not much. The rows are second half totals and the columns first half totals. In a football sense, the main lesson is that unfiltered scoring is not predictive of future scoring. Lower scoring first halves do shade toward low scoring second halves. I suspect, though, that if this were normalized for game participants the discrepancy would disappear. In other words, two tough defensive teams are typically going to have low scoring games. We don't need to see a chart to believe this would probably be true.

Up later, a look at special situations including games with strong offensive teams, strong defensive teams and first half blowouts.


Twitter: oblong_spheroid
