Oct 02

Cubs 2006 season record predictions:
May 9: 64-98
August 30: 66-96
Prediction using regression against final team statistics (.319 OBP, 4.74 ERA): 67-95
Actual record: 66-96
Not bad, if I do say so myself.
Since the regression is highly sensitive to team OBP (each .010 of OBP is worth about six wins), it may not be particularly useful over long time frames. Fortunately for me, the Cubs stank pretty consistently throughout the year.
May 09
Based on a regression analysis of the 2005 season, the number of wins of a baseball team can be predicted by the equation W = -55.574 - 15.0869 × ERA + 609.516 × OBP. (F-stat = 64.19, R² = 0.826, standard error of W = 4.68)
Unfortunately for Chicago Cubs fans, if the team maintains its current ERA of 4.57 and OBP of 0.310, I predict it will win 64 ± 9.36 games this year (a 95% confidence interval, taken as roughly two standard errors either side of the estimate).
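For anyone who wants to check the arithmetic, here is a minimal Python sketch of the prediction. The coefficients and standard error are the ones quoted above; the function names are my own, and treating the 95% band as plus or minus two standard errors is an approximation, not a full prediction-interval calculation.

```python
# Coefficients as quoted in the post (fit to 2005 season data).
INTERCEPT = -55.574
ERA_COEF = -15.0869
OBP_COEF = 609.516
STD_ERR = 4.68  # standard error of the win estimate


def predict_wins(era, obp):
    """Predicted season wins from team ERA and OBP."""
    return INTERCEPT + ERA_COEF * era + OBP_COEF * obp


def approx_interval(wins, n_se=2.0):
    """Rough 95% band: the estimate plus or minus n_se standard errors."""
    return wins - n_se * STD_ERR, wins + n_se * STD_ERR


# Cubs statistics as of the May 9 post.
may_wins = predict_wins(era=4.57, obp=0.310)
low, high = approx_interval(may_wins)
print(f"{may_wins:.1f} wins, roughly {low:.1f} to {high:.1f}")
```

Rounding the estimate to a whole number of wins gives the 64 quoted above.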
Mar 10
One of my favorite columns in The Wall Street Journal is “Ahead of the Tape”, which is typically in the left column of page C1 of the print edition. Unfortunately, Mr. Whitehouse made a rather silly mistake today. Consider the following passage:
Yet from a statistician’s point of view, the market’s reaction to the [Labor Department's monthly jobs report] is hard to fathom. Over the past six years, the economists’ consensus has missed the reported number, on average, by about 82,000 jobs, according to Bianco Research. That might look like a big difference. But as a percentage of total payroll employment — 135 million — it’s actually very small, less than one tenth of 1%.
Measuring the miss in the employment change forecast as a percentage of total employment isn’t appropriate, because the total payroll employment figure changes relatively little from month to month: a smashing increase of 400,000 jobs in a month is only a 0.3% change.
To illustrate, let’s say the average temperature of the core of the sun is 15,000,000 K and the standard deviation of temperature is 1,000 K. If the temperature is normally distributed, then about 99.7% of the time it will be between 14,997,000 K and 15,003,000 K. A forecast that is off by 0.1% of the average temperature (15,000 K) misses by 15 standard deviations, which is absurd!
A better way to gauge prediction accuracy is to compare the prediction error (probably the root mean squared error) to the standard deviation of the quantity being forecast. No doubt there are better methods still.
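Here is a rough Python sketch of that comparison. The monthly figures are made up purely for illustration (they are not the Bianco Research or Labor Department numbers), and the function names are my own. A ratio near or above 1 would suggest the forecast does little better than guessing the average change every month.

```python
import math
import statistics


def rmse(forecasts, actuals):
    """Root mean squared error of the forecasts against the actual values."""
    squared_errors = [(f - a) ** 2 for f, a in zip(forecasts, actuals)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def error_to_spread_ratio(forecasts, actuals):
    """Forecast RMSE divided by the standard deviation of the actual values."""
    return rmse(forecasts, actuals) / statistics.pstdev(actuals)


# Hypothetical monthly job changes, in thousands. Illustrative only.
actual = [150, 50, 250, 100, 200]
consensus = [175, 125, 175, 150, 150]

ratio = error_to_spread_ratio(consensus, actual)
print(f"RMSE is {ratio:.2f} standard deviations of the monthly change")
```

Measured this way, the sun forecast above misses by 15,000 / 1,000 = 15 standard deviations, no matter how small it looks as a percentage of the average.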