MLB board


http://tinyurl.com/3h6ou9

This is the first part of a multi-part series on how to estimate player value. I've been doing an awful lot of reading, thinking, and discussing these issues over the past several weeks, which is part of the reason that it's been relatively quiet around here. Because writing things out is the best way I know to master a complicated topic like this, my hope is that this series will help me crystallize my thinking on player valuation and get up to speed on the most significant research to date. It will also serve as a nice set of papers to which I can refer to justify my methods moving forward... and who knows, maybe it'll be useful to others who are working through similar issues as well.

To be clear, few of the big ideas that follow are based on my own work, though I may supplement them with a small study here and there. Because this is supposed to be a Reds blog, I will often use the Reds in case studies. In general, though, you can think of this as a popular science review article... or maybe a college term paper, given that I'm still new to much of this info. :)

(The above is filler and can be skipped.)

General Principles of Player Valuation

Wins vs. Runs

What do we mean by value? The answer can be fairly nuanced, as evidenced by essays like this (http://tinyurl.com/4fy3r7) or this (http://tinyurl.com/4a5kwz). I'm going to be a bit generic about it, however: I want to know how much players did to help their team win.

Based on that definition, the ultimate goal would be to have statistics that quantify player value in terms of wins. There have been several efforts on that front, including Win Shares, WARP, and WPA. At this point, however, I'm not satisfied with how any of those stats handle fielding (among other things), so I'm not ready to make the leap to them. The alternative, then, is to use stats that give their units in runs, for which we have many "good" stats that can quantify hitting, pitching, and fielding. How much are we losing by going with runs instead of wins?
To get some handle on this, I ran a quick regression of all teams from 1996-2004, with data pulled from the Lahman database, and looked at one-variable models that predict wins:

Predictor of Wins    R-Square    MSE
Run Differential     0.90        14.78
Runs Allowed         0.43        86.32
Runs Scored          0.35        97.92

R-Square, in this case, indicates the proportion of variation in wins explained by each predictor. Therefore, this quick 'n dirty analysis indicates that we can explain 90% of the variation in team wins just by knowing a team's run differential (the difference between its runs scored and its runs allowed). The remaining 10% is presumably due to the timing of when those runs are scored, or to variation in run environments (e.g. a run at Coors Field is worth less in terms of wins than a run at PETCO Park, simply because more runs are scored at Coors than at PETCO, so each one contributes less to wins).

Research to date indicates that most, though not all, timing-based effects are associated with events that involve very little unique player skill. Clutch hitting and pitching, for example, have very low repeatability in and of themselves, meaning that clutch performances are best predicted by a player's overall stats. Other "timing" events, like those having to do with baserunning (SBs and CSs happen more often in close games than in blowouts), tend to result in relatively few net runs per year. Finally, we can adjust for variation in the run environment of games via park factors and other techniques.

Therefore, I'm ok with using runs instead of wins, at least for the time being, because of the gains in precision that we get from using the available runs-based statistics.
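As a rough illustration of the one-variable models described above, here is a minimal Python sketch of a least-squares fit that reports R-Square and MSE. The function name `fit_one_variable` and the toy seasons are invented for illustration; they are not the Lahman data, and the MSE definition (residual sum of squares divided by n - 2) is an assumption, since the post does not say which one was used.

```python
def fit_one_variable(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns (intercept, slope, r_squared, mse).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual and total sums of squares give R-Square.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot

    # Assumption: MSE is the residual mean square, SS_res / (n - 2).
    mse = ss_res / (n - 2)
    return intercept, slope, r_squared, mse


# Invented toy seasons for illustration only; these are NOT Lahman data.
toy_run_diff = [-150, -80, -20, 0, 40, 90, 160]
toy_wins = [66, 72, 80, 81, 84, 91, 97]

intercept, slope, r2, mse = fit_one_variable(toy_run_diff, toy_wins)
print(f"wins ~ {intercept:.2f} + {slope:.4f} * run_diff "
      f"(R^2 = {r2:.3f}, MSE = {mse:.2f})")
```

To approximate the table, one would pass each team-season's run differential, runs allowed, or runs scored as `x` and its win total as `y`.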
Translation:

What is value? The answer can be somewhat subtle, as these two essays try to show (http://tinyurl.com/4fy3r7 http://tinyurl.com/4a5kwz). I take a more generic view: I want to know how many wins a player brought to his team.

Starting from that definition, the ultimate goal is to find statistics that express a player's value directly in wins. There has already been some progress on that front, such as WS, WARP, and WPA. However, I'm not satisfied with how those stats handle fielding, so I won't be using them here for now.

The other approach is to express value in runs. Here we already have many well-developed stats for measuring hitting, pitching, and fielding. The question is how much is lost by going from wins to runs. I ran the numbers for 1996-2004:

Predictor of Wins    R-Square    MSE
Run Differential     0.90        14.78
Runs Allowed         0.43        86.32
Runs Scored          0.35        97.92

This rough analysis shows that run differential explains 90% of the variation in wins (R-Square = 0.90). The remaining 10% is probably due to the timing of runs or to differences in run environment. (The same run is worth a different amount at Coors Field than at PETCO, because Coors is the easier place to score.)

Research so far shows that clutch ability is mostly, though not entirely, not a distinct player skill. For clutch pitching and hitting, the data show very low repeatability, which means clutch performance is essentially an expression of the player's overall ability. Other timing-related factors, such as baserunning, produce only a small difference in runs each year.

As for run environment, we can adjust for it with park factors and other methods. So I consider expressing player value in runs a workable approach.

Offense vs. Defensive Contributions

You'll note that in the table above, runs allowed alone predicts wins better than runs scored alone. This is interesting: it indicates that winning teams are slightly more likely to have good run prevention than good run scoring. This could mean two things: a) runs prevented are more important than runs scored, or b) it's "easier" to build an offense-oriented team than a defense-oriented team.

One way to get at this question is to use a two-variable model that includes both runs scored and runs allowed by teams, instead of run differential alone. Doing so on this same dataset results in a model that predicts wins just as well as the run differential model (model R2 = 0.90), and assigns coefficients that tell you roughly how many wins you get from a run scored vs. a run allowed. It turns out that these coefficients are virtually identical: +0.099 wins per run scored, and -0.101 wins per run allowed. This indicates that the result in the table is most likely attributable to the "easier to build a good offensive team" hypothesis. Furthermore, it means that preventing a run from scoring on defense is worth just as much as scoring a run on offense.
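The two-variable model described above (wins as a function of runs scored and runs allowed) can be sketched in the same spirit. This is a minimal pure-Python version that solves the centered 2x2 normal equations; the function name `fit_two_predictors` and the toy seasons are invented for illustration and are not the dataset used in the post.

```python
def fit_two_predictors(x1, x2, y):
    """Least-squares fit of y = a + b1 * x1 + b2 * x2.

    Solves the 2x2 normal equations on mean-centered data and
    returns (a, b1, b2).
    """
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]

    # Sums of squares and cross-products of the centered variables.
    s11 = sum(p * p for p in d1)
    s22 = sum(q * q for q in d2)
    s12 = sum(p * q for p, q in zip(d1, d2))
    s1y = sum(p * r for p, r in zip(d1, dy))
    s2y = sum(q * r for q, r in zip(d2, dy))

    # Cramer's rule; det is zero only if x1 and x2 are perfectly collinear.
    det = s11 * s22 - s12 * s12
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    a = my - b1 * m1 - b2 * m2
    return a, b1, b2


# Invented toy seasons for illustration only; these are NOT real team data.
toy_runs_scored = [700, 750, 800, 850, 720]
toy_runs_allowed = [760, 700, 810, 690, 805]
toy_wins = [75, 86, 80, 97, 73]

a, per_run_scored, per_run_allowed = fit_two_predictors(
    toy_runs_scored, toy_runs_allowed, toy_wins
)
print(f"wins ~ {a:.1f} + {per_run_scored:.3f} * RS + {per_run_allowed:.3f} * RA")
```

Read this way, the coefficients reported in the post (+0.099 and -0.101 wins per run) work out to roughly ten runs per win on either side of the ball, since 1 / 0.099 is about 10.1 and 1 / 0.101 is about 9.9.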
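What does this mean for how we evaluate players? Well, clearly, we need to consider both aspects of a player's performance: offense and defense. For position players, this means that we need to know both how many runs they generated on offense and how many runs they saved on defense, both relative to some baseline. If you only consider offense (and let's face it, that's what just about everyone does; at best, defense is used as a tiebreaker), you're likely to severely overvalue players who are offensive standouts but defensive disasters.

With pitchers, at least in the National League, we probably should consider offense and defense as well. However, the offensive contributions of pitchers these days are generally so meager, and involve such a small number of plate appearances, that I tend to just ignore them. Still, I recognize that for pitchers like Micah Owings in '07, this might miss a substantial amount of value. Consider it something to look at in the future.

In future articles, we'll go over more specifically how to go about evaluating players.

Translation:

You may have noticed that runs allowed predicts wins more accurately than runs scored, which is interesting. It suggests that winning teams are slightly more likely to be good at preventing runs than at scoring them. There are two possible reasons: 1. Defense matters more than offense. 2. It is easier to build a strong-hitting team than a strong-defending team.

The way to sort this out is to use runs scored and runs allowed as two separate variables, rather than run differential as a single variable. The resulting model predicts wins just as well as run differential alone (R-Square = 0.90), and it also tells us how many wins each run scored and each run allowed represents: each run scored = 0.099 wins, each run allowed = -0.101 wins. This suggests that the "easier to build a strong-hitting team" hypothesis is the more likely one. It further means that scoring a run and saving a run are worth the same.

What does this have to do with our topic? Clearly, to measure a player's value we have to consider both hitting and fielding. If you only consider hitting, which is what nearly everyone does, with defense as a tiebreaker at best (Beane: heh heh), you are very likely to overrate a player who hits well but fields poorly.

For pitchers, at least in the NL, we should also consider their hitting. Usually their offensive contribution is small enough to be ignored. For a pitcher who can both pitch and hit like Micah Owings, however, doing so may undervalue him. I may look into this further in the future.

In the rest of the series, I will explain in more detail how to evaluate players.

--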



※ Posted from: PTT (ptt.cc)
◆ From: 140.112.5.3
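1F: push appshjkli: Owings seems to have lost the strong pitching this year, but the bat is still there 10/05 15:03
2F: push dashboy: push 10/05 15:20
3F: push Poleaxe: push 10/05 15:31
4F: push eaquson: 10/05 16:26
5F: push Geel: recommended 10/05 16:30
6F: push NPLNT: push 10/05 16:34
7F: push bbbruce: push 10/05 17:03
8F: push abing75907: push 10/05 17:48
9F: push Paparra: I read through the first two paragraphs before I saw the line.. "The above is filler" 10/05 21:52
10F: push jayin07: XD 10/05 21:54
11F: push gaga19900329: push 10/05 22:05
12F: push airmike: Thanks to the moderator for the hard work translating 10/16 19:10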






