My regression, which “predicts” a baseball team’s win-loss record, has the following problems:
- The regression is extremely sensitive to the input parameters, many of which are quite unstable throughout the year. For example, one extra point (0.001) of team OBP results in 0.6 extra predicted wins.
- The standard error of the prediction is quite high (4.68).
- The regression doesn’t seem to be much better than linearly extrapolating a team’s current win-loss record to the entire season.
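To make the first and last points concrete, here is a rough Python sketch of the two calculations involved. It is not the regression itself. The only figure taken from this post is the slope of 0.6 predicted wins per 0.001 of team OBP. The function names and the sample inputs are hypothetical, chosen purely for illustration.

```python
# Slope taken from the post: 0.001 of team OBP is worth about 0.6 predicted wins.
WINS_PER_OBP_POINT = 0.6
SEASON_GAMES = 162


def obp_win_swing(obp_before: float, obp_after: float) -> float:
    """Change in predicted wins caused by a change in team OBP alone."""
    points = (obp_after - obp_before) / 0.001
    return points * WINS_PER_OBP_POINT


def extrapolate_wins(wins: int, losses: int, season_games: int = SEASON_GAMES) -> float:
    """Naive baseline: scale the current winning percentage to a full season."""
    games_played = wins + losses
    if games_played == 0:
        raise ValueError("no games played yet")
    return season_games * wins / games_played


if __name__ == "__main__":
    # Hypothetical inputs, not the Cubs' actual numbers.
    print(round(obp_win_swing(0.360, 0.370), 1))   # 10 points of OBP
    print(round(extrapolate_wins(30, 20), 1))      # a 30-20 start over 162 games
```

With that slope, a ten-point move in team OBP, which is not unusual over a few weeks, shifts the prediction by about six wins before ERA is even considered.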
To demonstrate my point, running the regression against the Cubs’ current team statistics (0.371 team OBP, 3.81 team ERA) now predicts a record of 113-49, a swing of 23 wins in just 15 days. That record would leave them 3 wins shy of the regular-season record of 116, first set by the 1906 Cubs (in a 152-game season) and matched by Lou Piniella’s 2001 Seattle Mariners.
It’s clear that this new prediction is the consequence of the Cubs’ hot streak. Let’s hope they can keep it going.

May 23rd, 2008 at 9:11 am
[...] prediction: 112-50 (0.368 team OBP, 3.73 team ERA, underlying equation). (Previous predictions: 113-49, [...]