Runs created
From Wikipedia, the free encyclopedia
Runs created (RC) is a baseball statistic invented by Bill James to estimate the number of runs a hitter contributes to his team.
Purpose
James explains in his Bill James Historical Baseball Abstract why runs created is an essential thing to measure:
With regard to an offensive player, the first key question is how many runs have resulted from what he has done with the bat and on the basepaths. Willie McCovey hit .270 in his career, with 353 doubles, 46 triples, 521 home runs and 1,345 walks -- but his job was not to hit doubles, nor to hit singles, nor to hit triples, nor to draw walks or even hit home runs, but rather to put runs on the scoreboard. How many runs resulted from all of these things? [Note 1]
Runs created attempts to answer this bedrock question. The conceptual framework of the "runs created" stat is:
- RC = (A × B) / C
where
- A = On-base factor
- B = Advancement factor
- C = Opportunity factor
Formulae
Basic runs created
In the most basic runs created formula:
- RC = (H + BB) × TB / (AB + BB)
where H is hits, BB is base on balls, TB is total bases and AB is at-bats.
This can also be expressed as:
- OBP × SLG × AB
- or,
- OBP × TB
where OBP is on-base percentage, taken here in its simplified form (H + BB) / (AB + BB), and SLG is slugging average (TB / AB).
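The equivalence between the two expressions can be checked numerically. The following is a minimal sketch, assuming the basic formula is RC = (H + BB) × TB / (AB + BB) and that OBP is simplified to (H + BB) / (AB + BB); the function names and the season line are hypothetical, chosen only for illustration.

```python
def basic_runs_created(h, bb, tb, ab):
    """Basic runs created: (H + BB) * TB / (AB + BB)."""
    return (h + bb) * tb / (ab + bb)


def basic_rc_from_rates(h, bb, tb, ab):
    """Same quantity written as OBP * SLG * AB."""
    obp = (h + bb) / (ab + bb)  # simplified OBP: ignores HBP and SF
    slg = tb / ab               # slugging average
    return obp * slg * ab


# Hypothetical season line, not taken from any real player.
h, bb, tb, ab = 150, 60, 250, 500
rc_direct = basic_runs_created(h, bb, tb, ab)
rc_rates = basic_rc_from_rates(h, bb, tb, ab)
print(rc_direct, rc_rates)
```

Both functions return the same value up to floating-point rounding, because SLG × AB is simply TB.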
"Stolen base" version of runs created
This formula expands on the basic formula by accounting for a player's basestealing ability.
- RC = (H + BB − CS) × (TB + 0.55 × SB) / (AB + BB)
where H is hits, BB is base on balls, CS is caught stealing, TB is total bases, SB is stolen bases, and AB is at bats.
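As an illustration, the stolen-base adjustment can be sketched in a few lines. This assumes the commonly published form of the formula, RC = (H + BB − CS) × (TB + 0.55 × SB) / (AB + BB); the function name and the sample numbers are hypothetical.

```python
def stolen_base_runs_created(h, bb, cs, tb, sb, ab):
    """'Stolen base' runs created, assuming the commonly published weights."""
    a = h + bb - cs       # on-base factor: times caught stealing are removed
    b = tb + 0.55 * sb    # advancement factor: each steal is worth 0.55 bases
    c = ab + bb           # opportunity factor
    return a * b / c


# Hypothetical season line: 20 steals, 5 times caught.
rc_sb = stolen_base_runs_created(h=150, bb=60, cs=5, tb=250, sb=20, ab=500)

# With no steals and no times caught, the result is the basic formula.
rc_no_running = stolen_base_runs_created(h=150, bb=60, cs=0, tb=250, sb=0, ab=500)
print(rc_sb, rc_no_running)
```

Setting SB and CS to zero reduces the expression to the basic version, which is a quick sanity check on any implementation.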
"Technical" version of runs created
This formula accounts for all basic, easily available offensive statistics.
- RC = ((H + BB − CS + HBP − GIDP) × (TB + 0.26 × (BB − IBB + HBP) + 0.52 × (SH + SF + SB))) / (AB + BB + HBP + SH + SF)
where H is hits, BB is base on balls, CS is caught stealing, HBP is hit by pitch, GIDP is grounded into double play, TB is total bases, IBB is intentional base on balls, SH is sacrifice hit, SF is sacrifice fly, SB is stolen bases, and AB is at bats.
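A short sketch may help show how the three factors are assembled. It assumes the commonly published weights for this version (0.26 for unintentional walks and hit-by-pitches, 0.52 for sacrifices and stolen bases); the function name and the stat line are hypothetical.

```python
def technical_runs_created(h, bb, cs, hbp, gidp, tb, ibb, sh, sf, sb, ab):
    """'Technical' runs created, assuming the commonly published weights."""
    a = h + bb - cs + hbp - gidp                                 # on-base factor
    b = tb + 0.26 * (bb - ibb + hbp) + 0.52 * (sh + sf + sb)     # advancement factor
    c = ab + bb + hbp + sh + sf                                  # opportunity factor
    return a * b / c


# Hypothetical season line, for illustration only.
line = dict(h=150, bb=60, cs=5, hbp=4, gidp=10, tb=250,
            ibb=5, sh=2, sf=4, sb=20, ab=500)
rc_technical = technical_runs_created(**line)
print(round(rc_technical, 1))
```

Keeping A, B and C as separate intermediate values mirrors the (A × B) / C framework and makes each term easy to audit.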
2002 version of runs created
Earlier versions of runs created overestimated the number of runs created by players with extremely high A and B factors (on-base and slugging), such as Babe Ruth, Ted Williams and Barry Bonds. This is because these formulae placed a player in an offensive context of players equal to himself; it is as if the player is assumed to be on base for himself when he hits home runs. Of course, this is impossible, and in reality, a great player is interacting with offensive players whose contributions are inferior to his. The 2002 version corrects this by placing the player in the context of his real-life team. This 2002 version also takes into account performance in "clutch" situations.
- A: H + BB − CS + HBP − GIDP
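- B: (1.125 × 1B) + (1.69 × 2B) + (3.02 × 3B) + (3.73 × HR) + 0.29 × (BB − IBB + HBP) + 0.492 × (SH + SF + SB) − (0.04 × K)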
- C: AB + BB + HBP + SH + SF
where K is strikeout.
The initial individual runs created estimate is then:
- RC = ((2.4 × C + A) × (3 × C + B)) / (9 × C) − 0.9 × C
If situational hitting information is available, the following should be added to the above total:
- H_RISP − (BA × AB_RISP) + HR_ROB − ((HR / AB) × AB_ROB)
where RISP is runners in scoring position, BA is batting average, HR is home run, and ROB is runners on base. The subscripts indicate the required condition for the formula. For example, H_RISP means "hits while runners are in scoring position."
This is then figured for every member of the team, and an estimate of total team runs scored is added up. The actual total of team runs scored is then divided by the estimated total team runs scored, yielding a ratio of real to estimated team runs scored. The above individual runs created estimate is then multiplied by this ratio, to yield a runs created estimate for the individual.
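The team-scaling step described above can be sketched as follows. This assumes the commonly published form of the initial estimate, ((2.4C + A)(3C + B)) / (9C) − 0.9C, and takes the A, B and C factors as already computed; the function names and the three sets of factors are hypothetical.

```python
def initial_rc_2002(a, b, c):
    """Initial individual estimate, assuming the commonly published form."""
    return ((2.4 * c + a) * (3 * c + b)) / (9 * c) - 0.9 * c


def scale_to_team(estimates, actual_team_runs):
    """Multiply each estimate by the ratio of actual to estimated team runs."""
    ratio = actual_team_runs / sum(estimates)
    return [estimate * ratio for estimate in estimates]


# Hypothetical (A, B, C) factors for three hitters on one team.
factors = [(200, 300, 600), (150, 200, 550), (120, 150, 500)]
estimates = [initial_rc_2002(a, b, c) for a, b, c in factors]

# Hypothetical actual run total for these three hitters' team.
scaled = scale_to_team(estimates, actual_team_runs=250)
print([round(x, 1) for x in scaled])
```

After scaling, the individual figures sum to the team's actual runs by construction, which is the point of the final adjustment.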
Other expressions of runs created
The same information provided by runs created can be expressed as a rate stat, rather than a raw number of runs contributed. This is usually expressed as runs created per some number of outs, e.g. RC/27 (27 being the number of outs per team in a standard nine-inning baseball game).
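A rate version is a one-line conversion. The sketch below takes the number of outs a hitter has consumed as an input, since the text does not specify how outs are counted; the function name and the sample figures are hypothetical.

```python
def rc_per_27_outs(runs_created, outs):
    """Runs created per 27 outs, i.e. per one team-game's worth of outs."""
    return 27 * runs_created / outs


# Hypothetical hitter: 100 runs created while making 400 outs.
rate = rc_per_27_outs(runs_created=100, outs=400)
print(rate)
```

The result can be read as the runs per game a lineup made up entirely of that hitter would be expected to score.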
In 2003 Neil Bonner published a series of articles on Ron Shandler’s BaseballHQ website on an alternative method of calculating Runs Created per Game (RC/G). Bonner claimed that his analysis showed there were only three significant elements that could be used to forecast RC/G.
His model is based on the following premise: the rate of runs created depends on a hitter's power, contact rate and walk rate. From only these three metrics, runs could be forecast for a whole team as well as for an individual player. Contact Rate (ct%) and Walk Rate (bb%) are easily understood, but he used a new metric to model power, which he called Swing Speed (SS).
SS = ((1B × 0.5) + (2B × 0.8) + (3B × 1.1) + (HR × 1.2)) / (AB − K)
Based on Swing Speed his model of RC/G is:
RC/G = (SS × 37.96) + (ct% × 10.38) + (bb% × 14.81) − 13.04
Using a common formula for games played, it is straightforward to calculate overall Runs Created (RC) as RC/G multiplied by games:
Games (G) = (AB − H) / 25.2
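The three formulas above chain together naturally. The following is a minimal sketch using the coefficients as quoted; it assumes ct% and bb% are passed as fractions (0.73 rather than 73) and that overall RC is RC/G multiplied by games. The function names and the hit breakdown are hypothetical.

```python
def swing_speed(singles, doubles, triples, hr, ab, k):
    """Swing Speed (SS): weighted hits per at-bat in which contact was made."""
    return (0.5 * singles + 0.8 * doubles + 1.1 * triples + 1.2 * hr) / (ab - k)


def rc_per_game(ss, ct, bb):
    """RC/G from Swing Speed, contact rate and walk rate (rates as fractions)."""
    return 37.96 * ss + 10.38 * ct + 14.81 * bb - 13.04


def games(ab, h):
    """Games equivalent: outs made on at-bats divided by 25.2."""
    return (ab - h) / 25.2


def runs_created(ss, ct, bb, ab, h):
    """Overall RC, read here as RC/G multiplied by games."""
    return rc_per_game(ss, ct, bb) * games(ab, h)


# Hypothetical hit breakdown, used only to exercise swing_speed.
ss_example = swing_speed(singles=90, doubles=30, triples=2, hr=28, ab=500, k=100)

# Rates in the same form as the player table (SS 0.269, ct% 73%, bb% 12%).
rate = rc_per_game(ss=0.269, ct=0.73, bb=0.12)
total = runs_created(ss=0.269, ct=0.73, bb=0.12, ab=500, h=150)
print(round(ss_example, 3), round(rate, 1), round(total, 1))
```

Because the intercept is −13.04, very weak stat lines produce a negative RC/G, which explains the negative entries for seldom-used players.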
In 2005 this method was published in Ron Shandler’s Baseball Forecaster and has been the official method used for RC/G forecasts in both the Forecaster and on the BaseballHQ website.
As Bonner’s method is very different from the Bill James model, Bonner has had his share of detractors. Yet in practice it models the real world very well for both teams and players as this example from 2005 shows:
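2005 AL team data (actual vs. forecast)

| Tm | R/G | R | RC/G | RC | %off |
|---|---|---|---|---|---|
| BOS | 5.62 | 910 | 5.60 | 899 | 1.17% |
| NYY | 5.47 | 886 | 5.41 | 874 | 1.35% |
| TEX | 5.34 | 865 | 5.22 | 867 | -0.25% |
| CLE | 4.88 | 790 | 5.16 | 836 | -5.86% |
| TOR | 4.78 | 775 | 4.56 | 741 | 4.34% |
| OAK | 4.77 | 772 | 4.64 | 764 | 1.09% |
| LAA | 4.70 | 761 | 4.53 | 738 | 2.98% |
| TBD | 4.63 | 750 | 4.67 | 748 | 0.31% |
| CHW | 4.57 | 741 | 4.55 | 736 | 0.64% |
| BAL | 4.50 | 729 | 4.79 | 772 | -5.84% |
| DET | 4.46 | 723 | 4.61 | 746 | -3.15% |
| KCR | 4.33 | 701 | 4.25 | 685 | 2.30% |
| SEA | 4.31 | 699 | 4.22 | 686 | 1.91% |
| MIN | 4.25 | 688 | 4.27 | 699 | -1.64% |
| TOTAL | | 10790 | 4.75 | 10792 | -0.02% |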
Generally speaking, this method varies by only one or two percent on a team level. In 2005 the total runs scored in the AL was 10,790 and Bonner's rate-based model predicted 10,792 runs. Examining the prior year data for the 2004 Red Sox:
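| Pos | Player | R | RBI | SS (Swing Speed) | CT% | BB% | RC/G | RC |
|---|---|---|---|---|---|---|---|---|
| C | Jason Varitek | 67 | 73 | 0.269 | 73% | 12% | 6.5 | 84 |
| 1B | Kevin Millar | 74 | 74 | 0.237 | 82% | 10% | 6.0 | 85 |
| 2B | Mark Bellhorn | 93 | 82 | 0.271 | 66% | 14% | 6.3 | 96 |
| 3B | Bill Mueller | 75 | 57 | 0.215 | 86% | 11% | 5.7 | 65 |
| SS | Pokey Reese | 32 | 29 | 0.176 | 75% | 7% | 2.4 | 18 |
| LF | Manny Ramirez | 108 | 130 | 0.295 | 78% | 13% | 8.1 | 127 |
| CF | Johnny Damon | 123 | 94 | 0.223 | 89% | 11% | 6.2 | 107 |
| RF | Gabe Kapler | 51 | 33 | 0.201 | 83% | 5% | 4.0 | 33 |
| DH | David Ortiz | 94 | 139 | 0.294 | 77% | 11% | 7.8 | 126 |
| | Orlando Cabrera | 33 | 31 | 0.215 | 90% | 5% | 5.1 | 33 |
| | Kevin Youkilis | 38 | 35 | 0.216 | 78% | 14% | 5.3 | 33 |
| | Doug Mirabelli | 27 | 32 | 0.284 | 71% | 11% | 6.7 | 31 |
| | Nomar Garciaparra | 24 | 21 | 0.231 | 90% | 5% | 5.8 | 24 |
| | David McCarty | 24 | 17 | 0.228 | 74% | 8% | 4.5 | 20 |
| | Trot Nixon | 24 | 23 | 0.248 | 84% | 9% | 6.4 | 26 |
| | Doug Mientkiewicz | 13 | 10 | 0.164 | 83% | 9% | 3.1 | 10 |
| | Dave Roberts | 19 | 14 | 0.223 | 80% | 10% | 5.3 | 13 |
| | Cesar Crespo | 6 | 2 | 0.131 | 75% | 0% | -0.3 | -1 |
| | Brian Daubach | 9 | 8 | 0.228 | 72% | 12% | 4.8 | 11 |
| | Ricky Gutierrez | 6 | 3 | 0.171 | 85% | 5% | 3.0 | 3 |
| | Ellis Burks | 6 | 1 | 0.148 | 76% | 8% | 1.7 | 2 |
| | Andy Dominique | 0 | 1 | 0.125 | 73% | 0% | -0.7 | 0 |
| | Adam Hyzdu | 3 | 2 | 0.350 | 80% | 9% | 9.9 | 3 |
| | Sandy Martinez | 0 | 0 | 0.000 | 50% | 0% | -7.9 | -1 |
| | Earl Snyder | 0 | 0 | 0.167 | 75% | 0% | 1.1 | 0 |
| | Total | 949 | 911 | | | | | 947 |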
The 2004 Red Sox scored 949 runs, and Bonner's rate-based method predicted 947 runs by totaling the Runs Created (RC) of each of the team's hitters.
Accuracy
Runs created is believed to be an accurate measure of an individual's offensive contribution because, when used on whole teams, the formula normally closely approximates how many runs the team actually scores. Even the basic version of runs created usually predicts a team's run total within a 5% margin of error. [Note 2] Other, more advanced versions are even more accurate.
Problems with runs created
While even the simplest version of Runs Created estimates team runs with reasonable accuracy, the multiplicative (A × B) / C structure of the formula is fundamentally flawed when estimating the runs produced by an individual hitter, particularly a hitter with an extremely high on-base percentage and slugging average. The reason is that it is impossible for a player to get on base and then drive himself in; players' on-base and slugging averages must interact with those of their teammates. Yet RC's simple OBP × TB form assumes that a player's own slugging is interacting with his own on-base percentage, which artificially inflates RC for players who score well in both categories.
Take an example: in isolation, Ryan Howard's on-base percentage and slugging average each have a real, discrete effect on the Philadelphia Phillies' offense, but when combined they overstate Howard's contribution. The formula treats him as though he were driving in players with the same on-base ability as himself while simultaneously being driven in by players with the same slugging ability as himself. This model would be appropriate for a theoretical lineup of nine Ryan Howards, whose on-base and slugging abilities would interact in precisely this manner. Howard, however, is in a lineup with players of lesser on-base and slugging abilities. His actual contribution to the Phillies in terms of runs is reduced because some of his on-base skills are wasted by teammates who lack his slugging ability, and some of his slugging skills are wasted by teammates who lack his on-base ability. Therefore, Howard's RC production must be adjusted downward to reflect this reality.
This is generally not a major issue for most players, as their OBPs and SLGs are not high enough to significantly distort their Runs Created; however, superstars who put up impressive OBPs and SLGs will frequently see their RC artificially inflated by this phenomenon. In recent years, James has modified the Runs Created formula to correct this error, effectively placing a player in a lineup of average players rather than assuming that a player's own slugging is interacting with his own on-base percentage.
Runs created does not take into account the stadiums in which a player hits. Certain stadiums, such as Denver's Coors Field prior to the introduction of the baseball humidor, generally increase offensive production in games played there. Since each run scored in such stadiums is less valuable, the same number of runs created will translate into fewer wins in a stadium like Coors than it would elsewhere.
Runs created also does not take into account the era in which a player played. Due to various factors, some eras of baseball history have had lower or higher average levels of offensive production.
Related statistics
- OPS (On-base Plus Slugging) is similar conceptually to runs created, except that it adds the A (on-base) and B (advancement) factors together, rather than multiplying them. This makes the statistic less accurate than runs created. However, OPS is easier for many fans to accept and embrace because they are already familiar with the individual OBP and SLG statistics that comprise it, and because it is simple to figure out.
- Win Shares is James' attempt to summarize, in one stat, a player's contributions on both offense and defense.
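The difference between adding and multiplying the on-base and slugging factors, noted in the OPS entry above, can be seen with two hypothetical hitters who have the same OPS. This is a small sketch using the basic OBP × SLG × AB form of runs created; the rates and at-bat total are invented for illustration.

```python
def ops(obp, slg):
    """On-base plus slugging: the two factors are added."""
    return obp + slg


def basic_rc(obp, slg, ab):
    """Basic runs created: the two factors are multiplied."""
    return obp * slg * ab


# Two hypothetical hitters, each with 500 at-bats and an .800 OPS.
balanced = dict(obp=0.400, slg=0.400)
slugger = dict(obp=0.300, slg=0.500)

ops_balanced, ops_slugger = ops(**balanced), ops(**slugger)
rc_balanced = basic_rc(ab=500, **balanced)
rc_slugger = basic_rc(ab=500, **slugger)
print(ops_balanced, ops_slugger, rc_balanced, rc_slugger)
```

OPS rates the two hitters as equal, while the multiplicative form credits the balanced hitter with more runs, since a product is largest when its two factors are close together.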
References
Note 1: James, Bill (1985). The Bill James Historical Baseball Abstract (1st ed.), pp. 273-4. Villard. ISBN 0-394-53713-0.
Note 2: James, Bill (2002). Win Shares, p. 90. STATS, Inc. Publishing. ISBN 1-931584-03-6.