Baseball Primer Newsblog - The Best News Links from the Baseball Newsstand
Tuesday, December 11, 2007
The Biz of Baseball: Zimbalist: Review: Diamond Dollars & The Baseball Economist
An Andrew Zimbalist production…as he reviews Vince Gennaro’s “Diamond Dollars: The Economics of Winning in Baseball” and J.C. Bradbury’s “The Baseball Economist.” On Gennaro’s book…
UPDATE!...J.C. Bradbury responds to Zimbalist.
Reader Comments and Retorts
Statements posted here are those of our readers and do not represent the BaseballThinkFactory. Names are provided by the poster and are not verified. We ask that posters follow our submission policy. Please report any inappropriate comments.
1. mr. man Posted: December 11, 2007 at 09:00 PM (#2641693)I'm eager to know just how reliable Gennaro's estimates of individual team win curves are...can anyone help me on this? I'm in the process of writing my undergrad hon. econ thesis on whether teams consider their own marginal revenue functions in the free agent market, and whether teams on the playoff 'sweet spot' are paying more per marginal win. Nate Silver, are you out there?
I'm wondering at this point whether to test against Silver's model or Gennaro's update to it...
Only skimmed the review but if Zimbalist is going to offer counter figures, he should reveal his sources as well.
can i ask you if you are also an economist? are you a lawyer like doug pappas was?
i don't mean that to be rude in any way. i am wondering what kind of viewpoint you are looking at things with
thanx
Link is now broken.
-- MWE
I am probably going to get kicked out of some club for this but I'm tempted to start my own This Week In Quotes feature:
Zimbalist’s critique is lacking and borderline dishonest in several places. - JC Bradbury
There also is the possiblity that Keith Law is not being honest in his post. Having been there and witnessed the events, I can assure you that there are several issues which Law alleges that are flat out lies, but if he didn't use the lies he wouldn't be able to make his point. - Tracy Ringolsby
I may be comparing apples and oranges here, but Ringolsby's comment started a 417 post thread while there’s nary a peep about Bradbury calling a colleague a liar. Now, I’ll grant you that you actually have to RTFA to find Bradbury’s quote, but it smells like there’s a double standard here.
Second, Bradbury replied under his own name and substantiated his claims with a detailed response to the review.
Ringolsby posted a one line firebomb under a handle, followed by several posts in which similarly wild accusations were made.
He arrives at this outcome by running a multiple regression of runs on batting average, OBP and SLG. The coefficient on OBP is almost three times that on SLG.
This is one very odd regression and rather difficult to interpret.
The multicollinearity is an issue not so much because of its effect on standard errors (I agree with the thrust of Bradbury's response there though, depending on the size of those se's Zimbalist may have a point as well) but because of its effect on how the results need to be interpreted.
That "three times" larger coefficient is the impact of OBP, holding BA and SLG constant. That is, essentially, what is the impact of one extra point worth of walks. The SLG coeff is the impact of an increase in SLG, holding BA and OBP constant -- i.e. one extra point's worth of an extra base (e.g. a single becoming a double).
The BA coefficient has the oddest interpretation. Think about what it means to increase BA while holding OBP and SLG constant. To increase BA while holding OBP constant, you are essentially trading a walk for a single; to increase BA while holding SLG constant, you are essentially trading 1 double for 2 singles.
Whatever these coefficients might tell you, they don't tell you the relative impact of BA/OBP/SLG on scoring. And given what the coefficients are telling you, I'm not that surprised that OBP is three times as important -- it's the only coefficient where you're clearly trading an out for a non-out.
The "problem" is that BA (or hit-rate) is the primary component in both OBP and SLG. The estimated equation:
R = b0 + b1*BA + b2*OBP + b3*SLG + u
can be rewritten as:
R = b0 + b1*BA + b2*(BA+isoOBP) + b3*(BA+ISO) + u
R = b0 + (b1+b2+b3)*BA + b2*isoOBP + b3*ISO + u
(where isoOBP is a nonsensical term equalling OBP-BA)
and this clearly shows that one point of BA is way, way, way more important than the other two (unless one of those coeff's is largely negative).
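A quick numeric check of the rewrite above, as a minimal sketch. The coefficient values and the batting line are made up for illustration (they are not Bradbury's estimates); the only point is that both forms give the same prediction, and that the total coefficient on BA is b1+b2+b3.

```python
# Hypothetical coefficients, chosen only to illustrate the algebra.
B0, B1, B2, B3 = -900.0, 100.0, 3000.0, 1000.0


def runs_original(ba, obp, slg):
    """R = b0 + b1*BA + b2*OBP + b3*SLG"""
    return B0 + B1 * ba + B2 * obp + B3 * slg


def runs_rewritten(ba, obp, slg):
    """Same model with OBP and SLG each split into BA plus an 'isolated' part."""
    iso_obp = obp - ba  # OBP - BA, the "nonsensical" isoOBP term
    iso = slg - ba      # isolated power, SLG - BA
    return B0 + (B1 + B2 + B3) * ba + B2 * iso_obp + B3 * iso


# Hypothetical team batting line of 270/330/440.
original = runs_original(0.270, 0.330, 0.440)
rewritten = runs_rewritten(0.270, 0.330, 0.440)
total_ba_coefficient = B1 + B2 + B3

print(original, rewritten, total_ba_coefficient)
```

Whatever values are plugged in for b1, b2 and b3, the two predictions agree, so the rewrite changes only how the coefficients are read, not the fit.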
(Alternatively you could think of it as a system of equations with OBP and SLG also endogenous (and a function of BA) and BA the only exogenous. In that case "b1" is the direct effect only of BA but the question is whether you want that direct effect or the total effect. I'd vote for the total effect. But that seems unnecessarily complicated.)
A cleaner formulation would surely be:
R = c0 + c1*(h/PA) + c2*((bb+hbp)/PA) + c3*(xb/PA) + e
Those variables will still be correlated (though much less so) but their interpretation is much cleaner. Increasing h/PA while keeping bb/PA and xb/PA constant is quite easy to comprehend (it's a single), even if it doesn't exactly give you the impact of BA. I assume the coefficients for the first two will be much larger since you are trading an out for a time on-base whereas the third is not. [Note, you assume the number of PAs stays constant since otherwise all 3 rates change.]
(Note, there are other issues -- you might want to standardise per 27 PA or possibly 27 outs; you'd want to control for baserunning; etc.)
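For anyone wanting to build the three per-PA regressors from counting stats, here is a rough sketch. It assumes "xb" means extra bases on hits (total bases minus hits), which is my reading of the formula above; the function name and the team totals are hypothetical.

```python
def per_pa_rates(hits, walks, hbp, total_bases, plate_appearances):
    """Return (h/PA, (bb+hbp)/PA, xb/PA) for the 'cleaner' regression.

    Assumes xb = total_bases - hits, i.e. bases beyond first on hits.
    """
    extra_bases = total_bases - hits
    return (
        hits / plate_appearances,
        (walks + hbp) / plate_appearances,
        extra_bases / plate_appearances,
    )


# Hypothetical team season totals, for illustration only.
h_rate, bb_rate, xb_rate = per_pa_rates(
    hits=1485, walks=493, hbp=50, total_bases=2420, plate_appearances=6100
)
print(round(h_rate, 3), round(bb_rate, 3), round(xb_rate, 3))
```

All three rates share the same denominator, so adding a single while holding PA fixed moves only the first one.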
And on the discussion about DIPS, ERA, etc. I've always felt that the "randomness" of ERA isn't so much due to variations in BABIP (though this obviously has an impact) but due to the "random" ordering of events. That is, whether you give up the walk before or after the HR or the double before/after the first out has huge impact on ERA even if they produce the same rate stats. Maybe that evens out more over a season than I think it does.
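The ordering point can be seen with a toy inning scorer. This is a deliberately crude sketch with only walks, home runs and outs (no singles, steals, double plays, etc.), and the function name is made up; it just shows that the same set of events can score differently depending on sequence.

```python
def runs_in_inning(events):
    """Score one inning from a list of 'BB', 'HR' and 'OUT' events."""
    bases = [False, False, False]  # first, second, third
    outs = 0
    runs = 0
    for event in events:
        if event == "OUT":
            outs += 1
            if outs == 3:
                break
        elif event == "BB":
            # Only forced runners advance on a walk.
            if bases[0]:
                if bases[1]:
                    if bases[2]:
                        runs += 1  # bases loaded: runner on third scores
                    bases[2] = True
                bases[1] = True
            bases[0] = True
        elif event == "HR":
            runs += 1 + sum(bases)  # batter plus everyone on base
            bases = [False, False, False]
        else:
            raise ValueError(f"unknown event: {event}")
    return runs


walk_then_homer = runs_in_inning(["BB", "HR", "OUT", "OUT", "OUT"])
homer_then_walk = runs_in_inning(["HR", "BB", "OUT", "OUT", "OUT"])
print(walk_then_homer, homer_then_walk)
```

Both innings have one walk, one homer and three outs, so the rate stats are identical, but the first ordering costs the pitcher two runs and the second only one.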
I saw it - my eyes widened. I'd want to read Bradbury's book first before commenting here, though - my econometric chops are too rusty to go barging in just yet.
Yes, there's one standard for subjects people give a #### about -- witness the large number of posts in BBWAA threads not involving Hat Guy, Jr. -- and another standard for subjects no one could give a tinker's damn about -- witness the dearth of activity in the Larry Fritsch Remembered thread.
But feel free to pretend you've uncovered damning evidence of some agenda.
As for the OBP vs SLG comparison, I've got to side with bradbury...He makes it pretty clear that the effect is 3x more valuable for a -point- of OBP, not for a standard deviation's worth. I suppose one SD's worth would make a more useful comparison, but could be too complicated to explain.
That said, I've read bradbury's book and found it was in most chapters just proving things most of us here already accepted to be true...or oversimplifying a more complex issue. I wouldn't have a problem recommending it to a new fan, but anyone who's been here awhile will feel like they've already read it.
I've been astonished in my readings of baseball economics how naive the academic community is about modern statistics. Even current journal articles still use AVG, RBI, wins etc for modelling purposes, totally oblivious to the fact that more modern metrics are available. I recall there being a bit of an uproar here over the fact that Hakes & Sauer used OBP and SLG to test the 'Moneyball Hypothesis' instead of the far more specific ISO and BB/PA.
Although I haven't read Bradbury's book, I do think Zimbalist's review read like he just went looking for things to slam. And not infrequently it appears that Bradbury has the better of the argument. That said, on this particular issue Bradbury is mistaken. Tango and others have shown quite definitively that a one point improvement in OBP is equal to a 1.8 point improvement in SLG. The three-to-one ratio is incorrect.
Interestingly, though, this makes surprisingly little difference when you actually evaluate players. Straight OPS, SLG+1.8*OBP, and SLG+3*OBP are all very highly correlated, because OBP and SLG are highly correlated. And that's one reason the whole Moneyball thesis is nonsense: there aren't enough hi-OBP/lo-SLG players to build a team around, even if OBP was "undervalued." (And of course the A's weren't a distinctly high-OBP team anyway.) You might save a few bucks by avoiding the occasional high-SLG/lo-OBP player, but we're talking marginal gains, not a fundamental team-building strategy.
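To put a rough number on "very highly correlated": the correlation between two weighted sums of OBP and SLG follows directly from their standard deviations and their correlation with each other. The sketch below uses assumed values (SDs of .030 and .050 and an OBP/SLG correlation of 0.7), which are placeholders, not measured figures; the function name is hypothetical.

```python
import math


def corr_of_weighted_sums(w1, w2, sd_obp, sd_slg, rho):
    """Correlation between (SLG + w1*OBP) and (SLG + w2*OBP).

    rho is the assumed correlation between OBP and SLG.
    """
    var_obp = sd_obp ** 2
    var_slg = sd_slg ** 2
    cov = rho * sd_obp * sd_slg
    var_1 = w1 * w1 * var_obp + var_slg + 2 * w1 * cov
    var_2 = w2 * w2 * var_obp + var_slg + 2 * w2 * cov
    cov_12 = w1 * w2 * var_obp + var_slg + (w1 + w2) * cov
    return cov_12 / math.sqrt(var_1 * var_2)


# Assumed (not measured) spread and correlation, for illustration only.
SD_OBP, SD_SLG, RHO = 0.030, 0.050, 0.7

ops_vs_1_8 = corr_of_weighted_sums(1.0, 1.8, SD_OBP, SD_SLG, RHO)
ops_vs_3 = corr_of_weighted_sums(1.0, 3.0, SD_OBP, SD_SLG, RHO)
print(round(ops_vs_1_8, 3), round(ops_vs_3, 3))
```

Under these assumptions both correlations come out above 0.95, so the choice of weight reorders very few players; the real figures depend on the actual spread and correlation in whatever sample is used.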
Well that certainly fits in well with a "publish or perish" culture, doesn't it?
I'd say 'publish or perish' necessitates the aggressive criticism characteristic of journal referees, which is the root of zimbalist's need to go looking for faults. That much of what he criticizes in Bradbury's work is defensible isn't his problem; Zim's obligation is to raise all questions that could possibly be asked.
Not that he wasn't a little harsh about doing it!
Yeah, I guess he sort of is even though he's not actually serving as a reviewer for a peer review journal. He's writing for the bizofbaseball website thus it would have been nice to see a brief overview of baseball economics, some contextualization of these books within that overview, a bit more of a summary of the books, and then an explanation of the significance of these books, if any, within the field. Instead, the review is pretty much what a journal editor would get, a cut to the chase analysis of why the reviewer believes a particular piece of scholarly work is unpublishable in its current form.
Of course, I may be wrong. Zimbalist writes, "If elasticity is used instead of the estimated coefficient, OBP is 1.8 times greater than SLG." I don't know what "elasticity" means in this context, but maybe he's actually fixing the flaw in Bradbury's work. Can someone clarify?
If that's actually what Bradbury did (I can't find it in the book), it's quite possible. You'd expect 1.8 if you regressed on *only* OBP and SLG. By also including batting average, I bet you could easily wind up with the ratio of coefficients being 3.
For instance, suppose isolated power didn't matter at all. SLG would still be important, but exactly as important as BA. If you included BA in the regression, the coefficient of SLG would be zero, and the ratio wouldn't be 3, it would be *infinity*. But if you included only SLG and OBP, SLG would substitute for BA and you'd get the "real" answer.
If that makes sense.
(Note: this may be a variation of what Walt Davis said in comment 11.)
That is, raising your OBP from .300 to .303 is 1.8x as valuable as raising your SLG from .400 to .404.
OTOH, he may be referring to the standard deviations...a standard deviation of SLG is much larger than a standard deviation of OBP and thus the two are of more comparable value in that regard.
In other words, I have to back Zimbalist on this:
At one point I looked at team runs for a few seasons (like 6 or so, nothing special) and looked at standardized coefficients (i.e. looking at the impact of standard deviation changes, not unit changes) and OBP and SLG came out essentially equal. I broke it up and did it season by season and that was kinda odd -- in about 1/3 of the seasons, SLG was much more important; in 1/3 OBP was much more important and in the others they were about equal. But it was a very small sample of seasons.
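For reference, a standardized coefficient is just the raw coefficient rescaled by the standard deviations involved. A minimal sketch follows; the 3:1 raw coefficients echo the ratio discussed in this thread, but the standard deviations are invented, so the output is illustrative only.

```python
def standardized(coef, sd_x, sd_y):
    """Effect on y, in SDs of y, of a one-SD change in x."""
    return coef * sd_x / sd_y


# Hypothetical raw coefficients with a 3:1 ratio, and made-up SDs.
B_OBP, B_SLG = 3.0, 1.0
SD_OBP, SD_SLG, SD_RUNS = 0.015, 0.030, 1.0

raw_ratio = B_OBP / B_SLG
std_ratio = standardized(B_OBP, SD_OBP, SD_RUNS) / standardized(B_SLG, SD_SLG, SD_RUNS)
print(raw_ratio, std_ratio)
```

Because SLG varies more than OBP, the per-SD ratio is smaller than the per-point ratio; how much smaller depends entirely on the real SDs.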
The whole idea of using BA, SLG and OBP in a regression is kinda preposterous anyway because of the multicollinearity issues.
Kinda. Multicollinearity is rarely a problem -- only when it's pretty extreme such as in cases where about 90% of the variance in one indep var can be explained by others. And that's not the case here. Even so, assuming the assumptions of linear regression are met, the coefficient estimates are unbiased and the standard error estimates are correct.
The bigger problem is the interpretation. Most of the explained variation in runs is being explained by the "collective" contribution of BA/OBP/SLG -- which is to say BA.
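One common way to quantify the "90% of the variance" rule of thumb, though not one named in the comment above, is the variance inflation factor: VIF = 1/(1 - R²), where R² comes from regressing one independent variable on the others. A small sketch, with a hypothetical helper name:

```python
import math


def variance_inflation_factor(r_squared):
    """VIF for a predictor whose R^2 on the other predictors is r_squared."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must be in [0, 1)")
    return 1.0 / (1.0 - r_squared)


vif_at_90 = variance_inflation_factor(0.90)
vif_at_50 = variance_inflation_factor(0.50)
# The standard error of the coefficient grows by the square root of the VIF.
se_multiplier_at_90 = math.sqrt(vif_at_90)
print(vif_at_90, vif_at_50, se_multiplier_at_90)
```

At R² = 0.90 the VIF is about 10, meaning the standard error is roughly three times what it would be with uncorrelated predictors; at R² = 0.50 it is only 2.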
(Note: this may be a variation of what Walt Davis said in comment 11.)
In the same ballpark. The point I'm making is that the interpretation of any regression coefficient is "this is the impact of a 1-unit change in X1 holding all other X's constant." In the presence of multicollinearity, and especially in this specific case, the idea of changing X1 (BA) while holding the others (OBP and SLG) constant means that your coefficient is measuring the impact of a particularly odd sort of change.
That is, let's say your team hits 270/330/440. You might wonder "what happens if we add 5 points of BA?" Well, adding 5 points of BA would have your team hitting roughly 275/335/445 -- all three have changed. Therefore, the coefficient from this regression doesn't answer your question. This coefficient tells you how many more/less runs you'd score if your team hit 275/330/440. Basically if your team, say, traded 15 walks and 15 doubles for 30 singles.
Now that difference is close to zero, leading to the impression that BA doesn't matter. And at one level that's fine -- after controlling for OBP and SLG, BA matters little in terms of predicting team runs. But of course BA is the primary component of OBP and SLG.
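The walks-and-doubles-for-singles trade can be sanity-checked with counting stats. This is a sketch on made-up team totals that roughly match 270/330/440 and ignore HBP and sacrifice flies; the size of the BA change depends on the totals assumed, so treat the magnitudes as rough.

```python
def slash_line(ab, hits, walks, total_bases):
    """Return (BA, OBP, SLG), ignoring HBP and sacrifice flies."""
    return (hits / ab, (hits + walks) / (ab + walks), total_bases / ab)


# Hypothetical team totals, roughly 270/330/440.
AB, H, BB, TB = 5500, 1485, 493, 2420
before = slash_line(AB, H, BB, TB)

# Trade 15 walks for 15 singles (15 more AB, hits and total bases),
# and turn 15 doubles into 15 singles (15 fewer total bases).
after = slash_line(AB + 15, H + 15, BB - 15, TB + 15 - 15)

print([round(x, 4) for x in before])
print([round(x, 4) for x in after])
```

BA goes up while OBP is unchanged and SLG barely moves (it drifts slightly because the at-bat total rises), which is the odd kind of change the BA coefficient is measuring.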
Here's an analogous, more extreme situation. Let's say you ran a regression of team runs on OBP, SLG, other indicators of offensive quality, and the total salary paid to hitters. Now after controlling for the quality of the offense, why would paying extra money help? It probably doesn't. Does that mean money can't buy runs? Of course not -- money is how you buy OBP, SLG, etc. It does tell you that increasing offensive payroll without improving your offense is not likely to help but you already knew that. :-) (yes, obviously you could be buying better defenders)
And god forbid I should raise the spectre of the challenge of working with ratios again, but the fact that OBP is per PA and SLG is per AB means they should never be in the same regression. Yes, it probably doesn't matter a lot (and I did it myself in the regressions mentioned above). I support a constitutional amendment to measure them all per PA.
And using linear weights, it's easy to see that the proper weight for OBP vs. SLG is about 1.8. Adding one walk is worth about .5 runs (.33 for the walk, .17 for not making an out). Adding one base is worth about .3 runs. Account for the fact that OBP is based on PAs while SLG is based on ABs (increasing the value of the walk by about another 10%), and you get .55/.3 = 1.8.
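The arithmetic in the comment above, spelled out. The run values are the ones quoted there; the variable names are mine.

```python
# Run values as quoted in the comment above.
WALK_DIRECT = 0.33         # runs added by the walk itself
OUT_AVOIDED = 0.17         # runs added by not making an out
EXTRA_BASE = 0.30          # runs added by one extra base
PA_VS_AB_ADJUSTMENT = 1.10 # OBP is per PA, SLG per AB: about +10% for the walk

walk_value = WALK_DIRECT + OUT_AVOIDED               # about 0.50 runs
adjusted_walk_value = walk_value * PA_VS_AB_ADJUSTMENT  # about 0.55 runs
obp_to_slg_ratio = adjusted_walk_value / EXTRA_BASE

print(round(obp_to_slg_ratio, 2))
```

The ratio comes out to roughly 1.8, matching the figure cited throughout the thread.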
Zimbalist has responded to J.C.'s blog posting. As I told Andy when I first read it, prepare for the castle to be stormed by the sabermetric community. Here is an excerpt: The response could make for a nice thread on its own... will see what happens.