Retrosheet


Evaluating Traditional Lineups

Introduction

People care about batting orders. When I hear fans discussing their favorite teams, two of the most common types of comments I hear are "Why are they playing that bum?" and "Why are they batting that bum first (or second or third... )?" In the 1988 Baseball Abstract, Bill James summed up the view of the baseball research community on this subject:

"Several people, maybe a dozen, have done simulation studies of lineups, and have all (as far as I know) reported that it really doesn't make any difference, that one lineup is as good as another. I still don't buy it."

More studies have been done since then, but what Bill James described still seems to be the most common approach: take a team, pick out several potential good or bad batting orders, simulate a bunch of games and compare the number of runs scored using each one. One notable exception was an article by Mark Pankin in a 1991 issue of SABR's By The Numbers, which used a Markov process model to evaluate lineups.

In this article, I will attempt to follow in Mark's footsteps. I will be analyzing two lineups: the composite NL and AL lineups used from 1993 to 2004. I will use a form of the Markov process model to evaluate lineup strength, and will attempt to evaluate, not several, or a couple, or 100, or a few thousand, but all possible lineup combinations - all 362880 of them. I will be looking at a few things. First, what alterations should be made to our current thinking about lineup construction? How much variation do we see in the expected runs from different batting orders? I'll also look at one variation of the lineup that takes into account the handedness of the batters and see how that affects optimal lineup construction. Finally, I'll look at where the Giants should have been batting Barry Bonds these last few years.

The Lineups

As mentioned in the introduction, I will be analyzing the composite lineups used in each league from 1993 to 2004. I picked those years because they gave us a large sample size and because the offense has been at a relative high point for the entire period.

The composite starting lineup for the NL during those years:

NL     AB     H   2B   3B   HR    BB  IBB    SO  HBP   SH   SF   SB   CS   OBP   SLG
 1 120334 32827 5831 1189 2075 11617  363 18400 1279 1018  662 5531 2210  .341  .393
 2 117432 32312 5989  897 2605 10563  157 18568 1129 1822  750 2920 1151  .339  .408
 3 111311 32687 6556  605 5253 14733 1510 19151 1142  193 1188 2485  902  .378  .505
 4 109833 31322 6355  493 5474 13479 1693 21226 1149   70 1187 1714  803  .366  .502
 5 109160 29844 6111  558 4379 11339 1024 20738 1003  262 1016 1680  918  .344  .460
 6 107152 28567 5661  600 3714  9992  877 21022 1073  421  976 1683  917  .333  .435
 7 104685 26559 5200  583 2826  9134  791 20719 1033  662  881 1244  713  .317  .396
 8 100280 25044 4772  621 1948 10042 2217 19004 1000  957  842 1034  509  .322  .368
 9  95204 17187 3076  247 1096  6276  262 30783  508 7312  489  504  273  .234  .253

The American League lineup:

AL     AB     H   2B   3B   HR    BB  IBB    SO  HBP   SH   SF   SB   CS   OBP   SLG
 1 111052 30708 5562 1018 2232 11210  426 16929 1102  991  792 4798 1776  .346  .405
 2 108386 29886 5685  762 2538 10461  203 17142 1019 1447  924 2958 1087  .342  .412
 3 104126 30201 5958  495 4569 12681 1147 17799 1072  191 1242 1856  651  .369  .488
 4 101653 28796 5880  329 5156 12735 1419 19293 1035   60 1119  980  506  .365  .500
 5 100621 27514 5683  372 4125 11176 1015 18808  924  172  987 1073  616  .348  .460
 6  99132 26561 5327  440 3607  9849  759 19320  947  358  792 1184  698  .337  .440
 7  96691 25119 4939  483 2944  9004  630 18378  980  562  885 1225  717  .326  .412
 8  94541 24157 4779  436 2358  7878  413 17757  939  982  795 1148  752  .317  .390
 9  91926 23098 4398  539 1580  6809  106 16505  888 1682  753 1628  925  .307  .362

One pattern is clear here. The top three hitters in the lineup, both in terms of on-base and slugging percentage, hit 3rd, 4th and 5th, respectively. The players with the next two highest on-base percentages hit 1st and 2nd, followed by the rest of the hitters, in declining order of their on-base plus slugging percentages in the 6th through 9th slots.
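The OBP and SLG columns in the tables above can be reproduced from the counting columns. The sketch below assumes the standard definitions (OBP = (H + BB + HBP) / (AB + BB + HBP + SF), SLG = total bases / AB); the function and variable names are illustrative only, and the inputs are the NL lead-off row.

```python
def on_base_pct(h, bb, hbp, ab, sf):
    """Times on base divided by the plate appearances that count toward OBP."""
    return (h + bb + hbp) / (ab + bb + hbp + sf)

def slugging_pct(h, doubles, triples, hr, ab):
    """Total bases per at-bat; H already counts one base for every hit."""
    total_bases = h + doubles + 2 * triples + 3 * hr
    return total_bases / ab

# NL lead-off row, copied from the composite table.
nl1 = dict(ab=120334, h=32827, doubles=5831, triples=1189, hr=2075,
           bb=11617, hbp=1279, sf=662)

obp = on_base_pct(nl1["h"], nl1["bb"], nl1["hbp"], nl1["ab"], nl1["sf"])
slg = slugging_pct(nl1["h"], nl1["doubles"], nl1["triples"], nl1["hr"], nl1["ab"])
print(f"OBP {obp:.3f}  SLG {slg:.3f}")
```

Under these definitions the BB column is taken to include intentional walks, which is what makes the rounded figures agree with the table.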

Note that there are more plate appearances in the National League because they have had two more teams than the American League since 1998.

The Markov Model

In general, Markovian approaches to baseball research analyze changes in game states. Readers of my Value Added article will already have seen something similar to this, as will readers familiar with Mark Pankin's work. At the beginning of each play, we are in one of 24 game states (with outs going from 0 to 2 and men on going from bases empty to full). At the end of each play we are in one of 25 states (the 24 mentioned above as well as the three-outs state). Each player's offensive contribution can be seen, then, as a 24 x 25 matrix of state transitions caused by his at-bats. This might be best seen by example.
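To make the state space concrete, here is a small sketch that enumerates it. The base-occupancy labels follow the column headings used in the tables in this article; representing a state as an (outs, men-on) pair is an assumption made for illustration.

```python
from itertools import product

# Base-occupancy labels: F = first, S = second, T = third.
BASES = ["---", "F--", "-S-", "FS-", "--T", "F-T", "-ST", "FST"]

# 24 starting states: 0, 1 or 2 outs combined with 8 base situations.
start_states = list(product(range(3), BASES))

# 25 ending states: the same 24 plus a single three-outs state.
end_states = start_states + [(3, None)]

print(len(start_states), len(end_states))
```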

Here is the row of the transition matrix for the NL lead-off hitter corresponding to the no outs and no one on starting state:

Out    ---   F--   -S-   FS-   --T   F-T   -ST   FST
  0    927 15407  2798     0   561     0     0     0
  1  36316     0     0     0     0     0     0     0
  2      0     0     0     0     0     0     0     0
  3      0     0     0     0     0     0     0     0

Lead-off hitters got up in these situations 56009 times (the sum of all the values in the table). Since there are only 5 transitions possible from this state, the table is rather sparse. 927 times, he left the state the same as he found it. In other words, he hit a home run or found some other way to circle the bases. He ended up on first, second and third, 15407, 2798 and 561 times, respectively. All the other times, 36316 of them, he made an out.

Note that we can determine the number of runs scored during each transition except ones that end in the final (three outs) state. In general, all runners (including the batter) that are not still on the bases or accounted for by an out have scored. In order to eliminate the exception with the final state, I'm going to expand that state to include the men left on base at the end of the inning.
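The accounting rule in the paragraph above can be written down directly: every runner (batter included) who is neither still on base nor part of an out must have scored. The helper below is a sketch with hypothetical names, and it assumes the final state has already been expanded to record the men left on base.

```python
def runners(men_on):
    """Count occupied bases in a label such as 'F-T'."""
    return sum(ch != "-" for ch in men_on)

def runs_scored(outs_before, men_before, outs_after, men_after):
    """Runs on one transition: runners before, plus the batter,
    minus runners still on base, minus outs made on the play."""
    outs_made = outs_after - outs_before
    return runners(men_before) + 1 - runners(men_after) - outs_made

print(runs_scored(0, "---", 0, "---"))  # bases empty before and after: batter scored
print(runs_scored(0, "F--", 2, "---"))  # double play: nobody scored
```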

Starting at a specific spot in the batting order, we will determine the number of runs scored in that inning by successively evaluating each state transition, pruning the list of transitions whenever it reaches a final state or drops below some probability threshold. To see how this works in action, here's the table above expressed in percentages:

Out    ---   F--   -S-   FS-   --T   F-T   -ST   FST
  0   1.66 27.51  5.00  0.00  1.00  0.00  0.00  0.00
  1  64.84  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  2   0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  3   0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
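The percentages above are simply each count divided by the row total. A minimal sketch, using the lead-off hitter's counts from the earlier table (the dictionary layout is an assumption made for illustration):

```python
# Counts for the NL lead-off hitter starting from no outs, bases empty.
counts = {
    (0, "---"): 927,
    (0, "F--"): 15407,
    (0, "-S-"): 2798,
    (0, "--T"): 561,
    (1, "---"): 36316,
}

total = sum(counts.values())
percentages = {state: round(100 * n / total, 2) for state, n in counts.items()}

print(total)
for state, pct in percentages.items():
    print(state, pct)
```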

As we analyze each inning, we need to determine two things: the number of runs scored in the inning and the probability of each hitter leading off the next inning. To do this we need to maintain a list (with probabilities) of our next states, a running count of runs scored, and a list (with probabilities) of the lead-off batters in the next inning.

To demonstrate this, let's look at the first inning. We start with a very small list of possible states: the lead-off batter at the plate with no one on and none out. The probability of that state is 1.00, no runs have been scored and we have an empty list of next inning lead-off hitters. We run these states through the transition matrix for the batter hitting and update our data. Doing this for the first batter yields the following list of next states:

Out MenOn   Prob
 0   ---    1.66
 0   F--   27.51
 0   -S-    5.00
 0   --T    1.00
 1   ---   64.84

We now have scored .0166 runs in the inning (the probability of staying in the same state, since that yielded a single run), and we still have an empty list of the batters due up next inning.
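The single step just described can be sketched as follows. This is an illustration under stated assumptions, not the code used for the study: the transition row is the rounded percentage table from above, and the names are hypothetical.

```python
def runners(men_on):
    return sum(ch != "-" for ch in men_on)

# Lead-off hitter's row for (0 outs, bases empty), as probabilities.
leadoff_row = {
    (0, "---"): 0.0166,
    (0, "F--"): 0.2751,
    (0, "-S-"): 0.0500,
    (0, "--T"): 0.0100,
    (1, "---"): 0.6484,
}

current = {(0, "---"): 1.0}   # the inning starts here with certainty
next_states = {}
expected_runs = 0.0

for (outs, men), p in current.items():
    for (outs2, men2), q in leadoff_row.items():
        # Runners before, plus the batter, minus runners left, minus outs made.
        runs = runners(men) + 1 - runners(men2) - (outs2 - outs)
        expected_runs += p * q * runs
        next_states[(outs2, men2)] = next_states.get((outs2, men2), 0.0) + p * q

print(round(expected_runs, 4))
print(next_states)
```

Only the home-run transition scores here, so the expected runs after one batter equal the probability of staying in the starting state.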

This information is then used in evaluating the next batter's matrix. Here is the second hitter's matrix (expressed in percentages) for the five possible states:

Out=0 MenOn=---
Out    ---   F--   -S-   FS-   --T   F-T   -ST   FST
  0   2.19 26.41  5.28  0.00  0.77  0.00  0.00  0.00
  1  65.35  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  2   0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  3   0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
 
Out=0 MenOn=F--
Out    ---   F--   -S-   FS-   --T   F-T   -ST   FST
  0   1.93  0.03  1.94 19.89  0.90  6.35  3.30  0.00
  1   0.08 38.12 15.43  0.00  0.34  0.00  0.00  0.00
  2  11.70  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  3   0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
 
Out=0 MenOn=-S-
Out    ---   F--   -S-   FS-   --T   F-T   -ST   FST
  0   1.22  5.17  4.06 11.20  0.77 10.81  0.36  0.00
  1   0.28  1.28 30.75  0.00 33.34  0.00  0.00  0.00
  2   0.75  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  3   0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
 
Out=0 MenOn=--T
Out    ---   F--   -S-   FS-   --T   F-T   -ST   FST
  0   1.65 17.42  4.52  0.00  0.77 13.56  0.00  0.00
  1  26.90  0.44  0.00  0.00 34.40  0.00  0.00  0.00
  2   0.33  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  3   0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
 
Out=1 MenOn=---
Out    ---   F--   -S-   FS-   --T   F-T   -ST   FST
  0   0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  1   1.99 26.32  4.76  0.00  0.74  0.00  0.00  0.00
  2  66.20  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  3   0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00

So for each of these five states, we compute the probability of reaching the next state, the number of runs scored during this at-bat and (although it's a little too soon for that) a list of the lead-off batters in the next inning along with the probabilities.

As you can imagine, this approach becomes pretty unwieldy for anything but a computer rather quickly, but to follow only one sequence in the inning, let's expand what happens when we evaluate the one out, no one on situation with the second batter at the plate. Remember, we had a probability of reaching this state of .6484. So the probability of reaching the two outs and no one on state at the conclusion of his at-bat would be .4292 (.6484 * .6620).

The third hitter's matrix for that situation:

Out    ---   F--   -S-   FS-   --T   F-T   -ST   FST
  0   0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  1   0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  2   3.92 27.25  5.09  0.00  0.47  0.00  0.00  0.00
  3  63.27  0.00  0.00  0.00  0.00  0.00  0.00  0.00

This is the first time we have encountered a transition to the inning-ending state. Since we had a probability of .4292 of reaching this state to begin with, the probability of a 1-2-3 inning resulting in the clean-up hitter batting first the next inning would be .2716 (.4292 * .6327). So we add that information to our list of ending states.
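The arithmetic for this one thread is just a product of the three out probabilities read from the tables above:

```python
p_first_out = 0.6484    # lead-off hitter: 0 outs, bases empty -> 1 out, bases empty
p_second_out = 0.6620   # second hitter:   1 out,  bases empty -> 2 outs, bases empty
p_third_out = 0.6327    # third hitter:    2 outs, bases empty -> inning over

p_two_outs_empty = p_first_out * p_second_out
p_one_two_three = p_two_outs_empty * p_third_out

print(round(p_two_outs_empty, 4), round(p_one_two_three, 4))
```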

We continue this analysis until all the threads have either reached a final state or dropped below a probability threshold. We need such a threshold because innings never have to end. It is possible for the first nine hitters in the inning to hit home runs, or for an inning to begin with 27 straight hits. So we will weed out any sequence with less than a one in ten million chance of occurring.
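The whole procedure can be sketched in a few dozen lines. To keep the example self-contained, it does not use the real 24 x 25 matrices: each batter is a hypothetical toy hitter who can only walk, homer or make an out, and `toy_transitions` and `evaluate_inning` are illustrative names. Only the control flow (propagate, accumulate runs, record the next lead-off hitter, prune below the threshold) mirrors the description above.

```python
from collections import defaultdict

THRESHOLD = 1e-7  # prune anything rarer than one in ten million

def toy_transitions(outs, bases, p_walk, p_hr):
    """Hypothetical batter who only walks, homers, or makes an out.

    Returns a list of (probability, outs_after, bases_after, runs)."""
    first, second, third = bases
    # On a walk, runners advance only when forced.
    walk_bases = (1, 1 if first else second, 1 if (first and second) else third)
    walk_runs = 1 if (first and second and third) else 0
    return [
        (p_walk, outs, walk_bases, walk_runs),
        (p_hr, outs, (0, 0, 0), 1 + first + second + third),
        (1.0 - p_walk - p_hr, outs + 1, bases, 0),
    ]

def evaluate_inning(lineup, leadoff):
    """lineup: list of (p_walk, p_hr) pairs; leadoff: index of the first batter.

    Returns (expected runs, {index of next inning's lead-off hitter: probability})."""
    live = {(leadoff, 0, (0, 0, 0)): 1.0}
    expected_runs = 0.0
    next_leadoff = defaultdict(float)
    while live:
        merged = defaultdict(float)
        for (batter, outs, bases), p in live.items():
            p_walk, p_hr = lineup[batter]
            on_deck = (batter + 1) % len(lineup)
            for q, outs2, bases2, runs in toy_transitions(outs, bases, p_walk, p_hr):
                expected_runs += p * q * runs
                if outs2 == 3:
                    next_leadoff[on_deck] += p * q   # inning over
                else:
                    merged[(on_deck, outs2, bases2)] += p * q
        # Drop any state that has become too unlikely to matter.
        live = {s: p for s, p in merged.items() if p >= THRESHOLD}
    return expected_runs, dict(next_leadoff)

toy_lineup = [(0.08, 0.03)] * 9   # made-up walk and home-run rates
runs, next_up = evaluate_inning(toy_lineup, leadoff=0)
print(round(runs, 4), {slot + 1: round(p, 3) for slot, p in sorted(next_up.items())})
```

Identical states reached by different paths are merged before pruning, so the threshold applies to the combined probability of a state rather than to each individual sequence of plays. That is a design choice of this sketch and may differ from the original implementation.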

Completing this analysis for the inning yields the following results:

RUNS       1     2     3     4     5     6     7     8     9
.6046    1.4   0.4   0.2  32.8  26.9  18.9  11.1   5.7   2.6

So this lineup would be expected to score .6046 runs an inning and have the cleanup hitter leading off the next inning 32.8% of the time, the 5th place hitter leading off 26.9% of the time, and so on. Notice that earlier we had determined the probability of a 1-2-3 inning with the cleanup hitter due up first in the following inning was 27.16%. This doesn't match the 32.8% of the time that the fourth hitter leads off the next frame because the 1-2-3 innings do not include cases where a batter reaches base but is removed as part of a double play.

So in order to analyze a batting order, we need to do this analysis for each of the nine batting order positions leading off an inning. Here are the results for the National League:

 ST     RUNS       1     2     3     4     5     6     7     8     9
  1    .6046     1.4   0.4   0.2  32.8  26.9  18.9  11.1   5.7   2.6
  2    .6107     3.1   1.0   0.4   0.2  31.8  27.7  19.4  11.1   5.3
  3    .5806     6.3   2.1   0.9   0.4   0.2  32.2  28.7  19.0  10.2
  4    .5128    11.9   4.3   1.9   0.8   0.4   0.2  34.5  28.5  17.6
  5    .4518    20.7   8.5   4.0   1.6   0.8   0.4   0.2  36.6  27.3
  6    .4146    31.4  16.3   8.2   3.5   1.7   0.8   0.4   0.2  37.5
  7    .4110    40.4  28.9  16.2   7.4   3.8   1.9   0.9   0.4   0.1
  8    .4499     0.2  38.6  29.3  15.5   8.6   4.4   2.1   0.9   0.4
  9    .5148     0.5   0.1  37.8  25.8  17.3   9.9   5.1   2.4   1.0
 
Where: ST - the lead-off batter in the inning

Hopefully, this chart makes sense. Expected runs are actually highest with the second batter leading off and lowest when the bottom of the order (7-8-9) is due up. Notice that the highest probability in the chart is the 40.4% chance that the first batter in the lineup will lead off the inning after one in which the 7th-place batter leads off.

The chart for the American League:

 ST     RUNS       1     2     3     4     5     6     7     8     9
  1    .6104     1.2   0.5   0.2  32.9  27.0  18.8  11.0   5.7   2.8
  2    .6109     2.6   1.1   0.5   0.2  32.1  27.8  19.1  10.9   5.7
  3    .5932     5.3   2.3   1.1   0.5   0.2  32.8  28.2  18.8  10.8
  4    .5468    10.0   4.5   2.2   1.0   0.5   0.2  34.1  28.8  18.7
  5    .5198    18.4   9.0   4.6   2.1   1.0   0.5   0.2  35.1  29.2
  6    .5041    29.4  16.7   9.2   4.3   2.2   1.1   0.5   0.2  36.5
  7    .4994    37.6  27.5  17.4   8.8   4.6   2.3   1.1   0.5   0.2
  8    .5316     0.2  35.4  28.5  17.1   9.6   5.1   2.5   1.1   0.5
  9    .5766     0.5   0.2  34.6  26.9  17.8  10.7   5.5   2.6   1.2

How does this match up with what we see in real life? Here are the real life charts for the National League:

 ST     RUNS       1     2     3     4     5     6     7     8     9
  1    .6115     1.3   0.4   0.2  33.3  27.3  18.8  10.9   5.3   2.4
  2    .6195     3.1   1.1   0.5   0.3  31.9  27.8  19.4  10.8   5.1
  3    .5861     6.4   2.1   0.9   0.4   0.3  32.6  28.5  18.9   9.9
  4    .5067    12.0   3.9   1.8   0.7   0.3   0.1  34.9  28.8  17.4
  5    .4607    20.6   8.5   4.0   1.5   0.8   0.4   0.2  37.1  27.0
  6    .4158    31.7  16.1   7.9   3.5   1.5   0.8   0.4   0.2  37.9
  7    .4218    40.9  28.8  16.5   7.1   3.6   1.9   0.8   0.4   0.2
  8    .4610     0.2  39.1  29.5  15.3   8.5   4.1   2.0   1.0   0.4
  9    .5161     0.5   0.2  38.4  25.9  17.2   9.6   4.8   2.3   1.0

And the AL:

 ST     RUNS       1     2     3     4     5     6     7     8     9
  1    .6170     1.2   0.5   0.2  33.3  27.4  18.6  10.9   5.2   2.7
  2    .6198     2.4   1.0   0.5   0.3  32.2  28.1  19.1  10.8   5.7
  3    .6011     5.3   2.2   1.1   0.5   0.3  33.0  28.4  19.1  10.2
  4    .5527     9.9   4.3   2.0   0.9   0.5   0.3  34.2  29.0  18.8
  5    .5294    18.2   8.6   4.5   2.0   1.0   0.6   0.2  35.5  29.4
  6    .5070    29.6  16.7   8.9   4.0   2.0   1.0   0.5   0.3  37.0
  7    .4963    38.2  28.1  17.2   8.3   4.4   2.1   1.0   0.5   0.3
  8    .5248     0.2  35.8  29.1  16.9   9.1   5.0   2.4   1.0   0.5
  9    .5845     0.6   0.3  35.1  27.0  17.6  10.3   5.3   2.6   1.2

Both leagues slightly out-performed the model. Summing the RUNS column over the nine lead-off slots, National League teams scored 1.06% more runs than predicted (4.5992 to 4.5508) while the American League bettered the model by 0.8% (5.0326 to 4.9928).
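The totals quoted here are the sums of the RUNS columns in the four charts above, which the following sketch reproduces (variable and function names are illustrative):

```python
nl_model = [.6046, .6107, .5806, .5128, .4518, .4146, .4110, .4499, .5148]
nl_actual = [.6115, .6195, .5861, .5067, .4607, .4158, .4218, .4610, .5161]
al_model = [.6104, .6109, .5932, .5468, .5198, .5041, .4994, .5316, .5766]
al_actual = [.6170, .6198, .6011, .5527, .5294, .5070, .4963, .5248, .5845]

def pct_over_model(actual, model):
    """Percentage by which the real-life total exceeds the modeled total."""
    return 100 * (sum(actual) / sum(model) - 1)

print(round(sum(nl_actual), 4), round(sum(nl_model), 4),
      round(pct_over_model(nl_actual, nl_model), 2))
print(round(sum(al_actual), 4), round(sum(al_model), 4),
      round(pct_over_model(al_actual, al_model), 2))
```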

Before going much further, we should point out two problems with the model as well as two (possibly faulty) assumptions. The first problem is that we are not including stolen bases, caught stealing, wild pitches, passed balls and so on. My assumption is that the benefit or cost of these events is not overly dependent upon batting order. In other words, batting orders do not materially affect the number of these events. There are some lineups where this would not be true. For example, if we bat our best base-stealer eighth in the NL, this could decrease his stolen base attempts as managers opt to sacrifice the runner over rather than risk a caught stealing.

The second problem is that the transitions also contain information about how the runners on ahead of each batter did during his at-bats. Since a third-place hitter generally has faster runners on base during his at-bats than a leadoff hitter, it is not realistic to assume that a batter's transition matrix will be the same if we moved him in the batting order. One way of dealing with this is to credit each batter with a generic set of transitions based upon what he did in each situation. In other words, when he hits a single with a man on first and one out, we won't credit him with what actually happened on the play (since the chance of the runner reaching second or third is at least partly due to the speed of the runner), but we will credit him with what typically happens in these situations. There are problems with this approach, however, since you'd almost have to take into account hit locations (since singles to right produce more first-to-third transitions than singles to left) and that data is not available for all of our hits. We will live with the problem for now, but this is something we'll need to keep in mind.

As for the assumptions, the first is that we are assuming batters don't care where they hit in the lineup, that if we took (for example) Barry Bonds and hit him leadoff, he would hit as well as he did hitting cleanup, and wouldn't go into a sulk and have his performance suffer as a result. Baseball players are not simply numbers in a transition matrix and a theoretically great batting order is not going to work if it causes a player revolt. This is probably more of a problem when proposing novel relief pitching strategies, but it's also a potential problem here.

And finally, we are ignoring "batter protection." In other words, when Barry Bonds is up in a particular situation, our model doesn't care who is in the on-deck circle. In general, this might make sense, but there are cases (in particular when dealing in the NL with the batter hitting in front of the pitcher) when this assumption will clearly not be correct.

Despite these problems and assumptions, I think this method could give us some insight into some excellent as well as some poor lineup choices, as well as information on how beneficial or costly some of these choices are.

So for each of our traditional lineups, here are the predicted runs scored in the first eight innings. First the NL:

INN   RUNS      1     2     3     4     5     6     7     8     9
  1   .605    1.4   0.4   0.2  32.8  26.9  18.9  11.1   5.7   2.6
  2   .465   20.0  12.2   7.7   4.2   2.5   1.5  12.1  19.5  20.3
  3   .521    7.7  12.1  15.8  15.9  15.0  12.8   9.8   6.8   4.2
  4   .507   14.5  10.0   7.3   6.3   8.0  11.0  13.7  15.0  14.2
  5   .502   12.5  12.8  13.5  12.4  11.6  10.3   9.0   8.6   9.3
  6   .514   12.3   9.6   9.2   9.2  10.5  11.9  12.8  12.8  11.7
  7   .501   13.3  12.2  12.0  10.7  10.3  10.1  10.1  10.4  10.9
  8   .512   12.1  10.4  10.4  10.2  11.0  11.6  11.9  11.6  10.8
TOT  4.127

The values under 1 through 9 are the percentage of the time that each lineup spot led off the next inning.
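The per-inning figures can be rebuilt from the NL lead-off chart alone: the expected runs in an inning are the lead-off probabilities weighted by each slot's RUNS value, and the next inning's lead-off probabilities come from multiplying through the chart. The sketch below uses the rounded published values, so it only matches the table to about the third decimal; the names are illustrative, and the renormalization of each row is an assumption added to absorb rounding.

```python
NL_RUNS = [.6046, .6107, .5806, .5128, .4518, .4146, .4110, .4499, .5148]
NL_NEXT = [  # row = slot leading off, column = slot leading off the next inning (%)
    [1.4, 0.4, 0.2, 32.8, 26.9, 18.9, 11.1, 5.7, 2.6],
    [3.1, 1.0, 0.4, 0.2, 31.8, 27.7, 19.4, 11.1, 5.3],
    [6.3, 2.1, 0.9, 0.4, 0.2, 32.2, 28.7, 19.0, 10.2],
    [11.9, 4.3, 1.9, 0.8, 0.4, 0.2, 34.5, 28.5, 17.6],
    [20.7, 8.5, 4.0, 1.6, 0.8, 0.4, 0.2, 36.6, 27.3],
    [31.4, 16.3, 8.2, 3.5, 1.7, 0.8, 0.4, 0.2, 37.5],
    [40.4, 28.9, 16.2, 7.4, 3.8, 1.9, 0.9, 0.4, 0.1],
    [0.2, 38.6, 29.3, 15.5, 8.6, 4.4, 2.1, 0.9, 0.4],
    [0.5, 0.1, 37.8, 25.8, 17.3, 9.9, 5.1, 2.4, 1.0],
]

def expected_runs_by_inning(runs, next_pct, innings=8):
    n = len(runs)
    # Renormalize each row so rounding in the chart cannot leak probability.
    trans = [[v / sum(row) for v in row] for row in next_pct]
    dist = [1.0] + [0.0] * (n - 1)   # slot 1 leads off the first inning
    per_inning = []
    for _ in range(innings):
        per_inning.append(sum(d * r for d, r in zip(dist, runs)))
        dist = [sum(dist[i] * trans[i][j] for i in range(n)) for j in range(n)]
    return per_inning

nl_by_inning = expected_runs_by_inning(NL_RUNS, NL_NEXT)
print([round(x, 3) for x in nl_by_inning], round(sum(nl_by_inning), 3))
```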

And now the AL:

INN   RUNS      1     2     3     4     5     6     7     8     9
  1   .610    1.2   0.5   0.2  32.9  27.0  18.8  11.0   5.7   2.8
  2   .527   18.0  12.1   8.2   4.8   2.9   1.7  12.1  19.3  21.1
  3   .566    7.2  11.4  15.5  16.2  15.0  13.0  10.0   7.1   4.8
  4   .552   13.2  10.0   7.9   6.8   8.1  10.9  13.5  14.7  14.8
  5   .554   11.4  12.2  13.3  12.8  11.8  10.7   9.3   8.8   9.7
  6   .557   11.3   9.6   9.5   9.6  10.4  11.8  12.6  12.7  12.4
  7   .552   12.1  11.7  12.0  11.2  10.6  10.4  10.3  10.4  11.3
  8   .557   11.2  10.2  10.6  10.5  10.9  11.6  11.9  11.6  11.5
TOT  4.477

The Results

As I mentioned in the introduction, I wanted to look at all possible lineups for this study. My feeling was that, since all the likely combinations have been tried in simulation, a lineup that was significantly better than our traditional one would almost have to be one we wouldn't ordinarily consider.
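Enumerating every batting order is straightforward; the expensive part is scoring each one. The sketch below shows only the enumeration and ranking. `toy_score` is a made-up placeholder (slot weights times OBP plus SLG from the NL table) and is NOT the Markov model used in this article; in the real study the score would be the expected runs over eight innings. All names here are hypothetical.

```python
import heapq
from itertools import permutations
from math import factorial

def all_lineups(n=9):
    """Yield every ordering of hitters 1..n as a tuple."""
    return permutations(range(1, n + 1))

def best_lineups(score, n=9, keep=10):
    """Return the `keep` highest-scoring lineups under `score`."""
    return heapq.nlargest(keep, all_lineups(n), key=score)

# Placeholder inputs: OBP + SLG for each traditional NL slot, from the table.
NL_OPS = {1: .734, 2: .747, 3: .883, 4: .868, 5: .804,
          6: .768, 7: .713, 8: .690, 9: .487}
SLOT_WEIGHT = [9, 8, 7, 6, 5, 4, 3, 2, 1]   # arbitrary: earlier slots count more

def toy_score(lineup):
    return sum(w * NL_OPS[h] for w, h in zip(SLOT_WEIGHT, lineup))

lineup_count = sum(1 for _ in all_lineups())
top = best_lineups(toy_score, keep=3)
print(lineup_count, top[0])
```

Because the placeholder score is a simple weighted sum, it always prefers hitters sorted by OBP plus SLG, which is exactly the kind of shortcut the full model is meant to test rather than assume.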

So I ran the tests on the lineups mentioned above for the NL and AL from 1993 to 2004. Let's look at the NL first. As you will recall from the previous section, the traditional lineup, when evaluated using our method, produced 4.127 runs over 8 innings. When I looked at all possible combinations, the lineups went from a low of 3.967 runs to a high of 4.143. So the range was rather small. Here is how some of our likely and unlikely candidates ranked:

  RUNS  ----- LINEUP ----    RANK
 4.127  1 2 3 4 5 6 7 8 9     216   the traditional one
 4.125  3 4 1 5 2 6 8 7 9     324   sorted by OBP, highest to lowest
 4.109  3 4 5 6 2 1 7 8 9    4757   sorted by OPS, highest to lowest
 3.999  9 8 7 6 5 4 3 2 1  339298   reverse traditional
 3.983  9 7 8 6 2 5 1 4 3  360764   sorted by OBP, lowest to highest
 3.984  9 8 7 1 2 6 5 4 3  360269   sorted by OPS, lowest to highest

Nothing too surprising here. The most obviously good ones are pretty near the top and the perversely bad ones are near the bottom. So what were the best and worst lineups in the National League? Here are the 10 best:

  RUNS  ----- LINEUP ----
 4.143  1 3 2 5 4 6 7 9 8
 4.142  1 3 2 5 4 6 8 7 9
 4.142  1 3 4 2 5 6 7 9 8
 4.142  1 3 4 5 2 6 7 9 8
 4.140  1 3 4 2 5 6 8 7 9
 4.140  1 3 4 5 2 6 8 7 9
 4.139  1 3 2 5 4 6 7 8 9
 4.138  1 2 4 3 5 6 8 7 9
 4.138  1 2 5 3 4 6 8 7 9
 4.138  1 3 4 5 6 2 7 9 8

Notice that these lineups are very similar to each other, and very similar to the typical ordering. The top one has no hitter more than one position removed from his "normal" place.

The ten worst:

  RUNS  ----- LINEUP ----
 3.967  8 9 2 6 1 7 5 4 3
 3.967  7 8 1 9 6 5 2 4 3
 3.967  7 8 1 9 2 5 6 4 3
 3.967  2 8 1 9 6 7 5 4 3
 3.968  8 9 2 6 1 7 4 5 3
 3.968  8 9 2 1 7 5 6 4 3
 3.968  8 9 2 1 6 7 5 4 3
 3.968  8 7 1 9 6 5 2 4 3
 3.968  8 7 1 9 2 6 5 4 3
 3.968  8 7 1 9 2 5 6 4 3

What about the AL? Let's start once more with some likely candidates:

  RUNS  ----- LINEUP ----    RANK
 4.477  1 2 3 4 5 6 7 8 9     516   the traditional one
 4.466  3 4 5 1 2 6 7 8 9    8653   sorted by OBP, highest to lowest
 4.476  4 3 5 6 2 1 7 8 9     730   sorted by OPS, highest to lowest
 4.408  9 8 7 6 5 4 3 2 1  340055   reverse traditional
 4.398  9 8 7 6 2 1 5 4 3  359066   sorted by OBP, lowest to highest
 4.390  9 8 7 1 2 6 5 3 4  362546   sorted by OPS, lowest to highest

Very similar results here. The best lineups:

  RUNS  ----- LINEUP ----
 4.488  5 2 4 3 1 6 7 8 9
 4.488  5 2 4 3 1 7 6 8 9
 4.487  5 2 4 3 1 6 8 7 9
 4.486  1 2 4 3 5 7 6 8 9
 4.486  5 1 4 3 6 2 7 8 9
 4.485  1 2 4 3 5 6 7 8 9
 4.485  1 2 4 3 5 6 8 7 9
 4.485  5 1 4 3 2 7 6 8 9
 4.485  5 1 4 3 6 2 8 7 9
 4.485  5 1 4 3 6 7 2 8 9

There are some weird ones here, but also some that are extremely similar to the traditional one.

And the worst:

  RUNS  ----- LINEUP ----
 4.380  2 9 1 7 8 6 5 4 3
 4.380  2 9 1 7 8 6 5 3 4
 4.381  9 8 1 7 2 6 5 4 3
 4.381  2 9 1 7 8 6 3 5 4
 4.381  2 9 1 7 8 5 6 4 3
 4.381  2 9 1 7 8 5 6 3 4
 4.382  9 8 1 7 2 6 5 3 4
 4.382  2 9 1 7 8 6 4 5 3
 4.382  2 8 1 9 7 6 5 4 3
 4.382  2 8 1 7 9 6 5 4 3

Note that most of the "worst" lineups feature hitters 3 through 5 at the bottom of the order. This makes sense since they are the best hitters.

I must confess that this wasn't what I hoped to find when I began this investigation. I suppose the best-case scenario would have been to find a handful of counter-intuitive lineups that were significantly better than the traditional ones. As it was, the best lineup in the NL scored only 4.4% more runs than the worst, and in the AL the range was even narrower, as the best lineup scored only 2.4% more runs than the worst. And the difference between the best and the traditional lineup is negligible: in the NL it amounted to 0.38% more runs (or about 3 runs a season) and in the AL it was 0.24% more runs. These results seem to agree with the long-held belief that the ordering makes little difference.

In addition to the traditional model, one of the things I wanted to look at was the impact of right- and left-handed pitchers on our lineup. Let's start by taking another look at the composite lineups in the NL and AL from 1993 to 2004, this time broken down by the handedness of the hitters.

NL right-handed hitters:

       AB     H   2B   3B   HR    BB  IBB    SO  HBP   SH   SF   SB   CS   OBP   SLG
 1  46840 12652 2407  373  955  4319  122  7278  645  390  259 3167 1240  .338  .399
 2  64394 17773 3393  429 1630  5560   72 10458  752 1010  437 1616  584  .339  .418
 3  50931 14881 2873  240 2565  6241  593  9282  603   81  539 1460  549  .373  .509
 4  64521 18433 3627  282 3245  6801  734 12377  751   38  691 1008  465  .357  .502
 5  59166 16199 3214  289 2441  5334  379 11043  653  168  571 1128  588  .338  .462
 6  65773 17499 3492  360 2369  5550  363 12922  745  254  623 1205  643  .327  .438
 7  72331 18420 3644  361 2030  5813  384 14297  756  451  626  866  520  .314  .399
 8  68274 16978 3309  367 1411  6314 1414 13121  734  674  554  577  294  .317  .370
 9  60365 10101 1765  124  574  3268   89 20974  343 5429  284  244  123  .213  .229

NL left-handed hitters:

       AB     H   2B   3B   HR    BB  IBB    SO  HBP   SH   SF   SB   CS   OBP   SLG
 1  42937 11891 1997  517  688  4147  161  6205  394  349  232 1305  527  .344  .396
 2  26394  7212 1340  245  554  2536   47  4245  199  288  148  960  419  .340  .406
 3  46306 13773 2903  306 2161  6781  772  7648  475   70  510  706  249  .389  .513
 4  32010  9166 1974  156 1606  4796  735  6234  301   23  351  507  231  .381  .508
 5  37247 10185 2174  218 1516  4567  491  7241  239   63  309  377  206  .354  .466
 6  28371  7601 1470  155  974  2991  371  5578  205   97  229  279  154  .340  .434
 7  19773  5022  917  145  501  1993  297  3992  151  118  144  226  102  .325  .391
 8  14814  3714  698  112  292  1734  406  2858  129  120  150  310  147  .331  .372
 9  26402  5224  958   85  367  2143  130  7631  102 1651  149  150   93  .259  .282

NL switch-hitters:

       AB     H   2B   3B   HR    BB  IBB    SO  HBP   SH   SF   SB   CS   OBP   SLG
 1  30557  8284 1427  299  432  3151   80  4917  240  279  171 1059  443  .342  .380
 2  26644  7327 1256  223  421  2467   38  3865  178  524  165  344  148  .339  .386
 3  14074  4033  780   59  527  1711  145  2221   64   42  139  319  104  .363  .463
 4  13302  3723  754   55  623  1882  224  2615   97    9  145  199  107  .370  .485
 5  12747  3460  723   51  422  1438  154  2454  111   31  136  175  124  .347  .435
 6  13008  3467  699   85  371  1451  143  2522  123   70  124  199  120  .343  .419
 7  12581  3117  639   77  295  1328  110  2430  126   93  111  152   91  .323  .381
 8  17192  4352  765  142  245  1994  397  3025  137  163  138  147   68  .333  .357
 9   8437  1862  353   38  155   865   43  2178   63  232   56  110   57  .296  .327

AL right-handed hitters:

       AB     H   2B   3B   HR    BB  IBB    SO  HBP   SH   SF   SB   CS   OBP   SLG
 1  39789 10867 2082  282  862  3864   86  6278  560  349  270 2312  800  .344  .405
 2  49623 13668 2646  287 1254  4420   68  8086  583  661  407 1536  525  .339  .416
 3  46666 13550 2658  203 2187  5480  475  8209  529   72  564 1064  336  .367  .497
 4  57844 16456 3360  184 3046  6923  657 11089  660   32  677  493  259  .364  .507
 5  46617 12607 2570  160 1905  4601  285  9057  501   86  475  550  339  .339  .455
 6  52233 13855 2814  203 1909  4618  248 10343  574  188  428  698  411  .329  .437
 7  52517 13526 2676  218 1636  4342  200 10095  604  323  479  775  464  .319  .410
 8  59530 15058 2993  246 1477  4397  133 11072  665  681  496  676  431  .309  .386
 9  61231 15382 2997  318 1108  4195   32 11012  637 1163  494  675  369  .304  .365

AL left-handed hitters:

       AB     H   2B   3B   HR    BB  IBB    SO  HBP   SH   SF   SB   CS   OBP   SLG
 1  38944 11040 1892  407  817  3761  219  5426  351  314  294 1163  481  .350  .416
 2  25900  7264 1408  193  675  2659   84  3962  215  237  203 1034  407  .350  .428
 3  43843 12696 2552  180 1936  5738  561  7363  438   54  507  604  230  .374  .488
 4  31711  8903 1849   93 1588  4281  561  5889  293   16  313  367  170  .368  .495
 5  40219 11143 2377  151 1708  4898  564  7058  300   57  394  356  172  .357  .471
 6  32035  8760 1755  157 1176  3450  345  5995  252  102  251  273  172  .346  .448
 7  27065  7249 1348  175  853  2822  301  5015  243  133  244  218  122  .340  .425
 8  19319  5083 1020   96  525  1849  184  3786  159  143  163  227  138  .330  .407
 9  13986  3561  654   97  238  1131   47  2368  118  207  125  472  285  .313  .366

AL switch-hitters:

       AB     H   2B   3B   HR    BB  IBB    SO  HBP   SH   SF   SB   CS   OBP   SLG
 1  32319  8801 1588  329  553  3585  121  5225  191  328  228 1323  495  .346  .393
 2  32863  8954 1631  282  609  3382   51  5094  221  549  314  388  155  .341  .395
 3  13617  3955  748  112  446  1463  111  2227  105   65  171  188   85  .360  .460
 4  12098  3437  671   52  522  1531  201  2315   82   12  129  120   77  .365  .478
 5  13785  3764  736   61  512  1677  166  2693  123   29  118  167  105  .354  .447
 6  14864  3946  758   80  522  1781  166  2982  121   68  113  213  115  .346  .433
 7  17109  4344  915   90  455  1840  129  3268  133  106  162  232  131  .328  .398
 8  15692  4016  766   94  356  1632   96  2899  115  158  136  245  183  .328  .385
 9  16709  4155  747  124  234  1483   27  3125  133  312  134  481  271  .313  .350

The pattern in both leagues is similar. Lefties appear most frequently in the first, third and fifth slots, while switch-hitters are most likely to bat first or second. One of the reasons for alternating righty and lefty hitters is to minimize the effectiveness of short relievers in late innings, since these pitchers will often lose the platoon advantage after one batter.
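The slot pattern can be checked against the AB columns of the handedness tables above. This sketch assumes "appear most frequently" is measured by at-bats in each slot; the helper name is illustrative.

```python
# At-bats by lineup slot (1 through 9), copied from the tables above.
nl_left_ab = [42937, 26394, 46306, 32010, 37247, 28371, 19773, 14814, 26402]
al_left_ab = [38944, 25900, 43843, 31711, 40219, 32035, 27065, 19319, 13986]
nl_switch_ab = [30557, 26644, 14074, 13302, 12747, 13008, 12581, 17192, 8437]
al_switch_ab = [32319, 32863, 13617, 12098, 13785, 14864, 17109, 15692, 16709]

def top_slots(ab_by_slot, k):
    """Return the k lineup slots with the most at-bats, in slot order."""
    ranked = sorted(range(1, len(ab_by_slot) + 1),
                    key=lambda slot: ab_by_slot[slot - 1], reverse=True)
    return sorted(ranked[:k])

print(top_slots(nl_left_ab, 3), top_slots(al_left_ab, 3))
print(top_slots(nl_switch_ab, 2), top_slots(al_switch_ab, 2))
```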

What I would like to model is the traditional lineup in both leagues, with lefties batting first, third and fifth and right-handed hitters in the other slots. I want to look at the best and worst lineups against both right- and left-handed pitching. But before I do that, here are a few more charts:

NL right-handed hitters vs right-handed pitchers:

       AB     H   2B   3B   HR    BB  IBB    SO  HBP   SH   SF   SB   CS   OBP   SLG
 1  33460  8980 1671  280  617  2916   52  5261  513  298  192 2364  845  .335  .390
 2  45990 12527 2338  300 1106  3704   24  7667  633  753  312 1288  412  .333  .408
 3  37770 10962 2118  174 1910  4326  365  7078  501   59  395 1126  391  .367  .507
 4  46943 13231 2556  196 2300  4487  363  9315  636   31  519  751  322  .349  .492
 5  41441 11210 2216  205 1688  3442  174  8023  558  144  404  866  410  .332  .456
 6  47974 12630 2473  263 1674  3706  206  9704  626  204  472  944  483  .321  .430
 7  54249 13622 2630  255 1497  4102  222 10976  642  359  483  708  389  .309  .392
 8  51283 12607 2383  284 1027  4498  997 10056  627  541  400  449  218  .312  .363
 9  43936  7180 1234   82  371  2140   19 15367  249 4021  178  205  102  .206  .221

NL right-handed hitters vs left-handed pitchers:

       AB     H   2B   3B   HR    BB  IBB    SO  HBP   SH   SF   SB   CS   OBP   SLG
 1  13380  3672  736   93  338  1403   70  2017  132   92   67  803  395  .348  .419
 2  18404  5246 1055  129  524  1856   48  2791  119  257  125  328  172  .352  .442
 3  13161  3919  755   66  655  1915  228  2204  102   22  144  334  158  .387  .514
 4  17578  5202 1071   86  945  2314  371  3062  115    7  172  257  143  .378  .528
 5  17725  4989  998   84  753  1892  205  3020   95   24  167  262  178  .351  .475
 6  17799  4869 1019   97  695  1844  157  3218  119   50  151  261  160  .343  .459
 7  18082  4798 1014  106  533  1711  162  3321  114   92  143  158  131  .330  .422
 8  16991  4371  926   83  384  1816  417  3065  107  133  154  128   76  .330  .389
 9  16429  2921  531   42  203  1128   70  5607   94 1408  106   39   21  .233  .252

NL left-handed hitter vs right-handed pitchers:

       AB     H   2B   3B   HR    BB  IBB    SO  HBP   SH   SF   SB   CS   OBP   SLG
 1  33686  9398 1622  419  587  3304  154  4754  243  235  171 1027  425  .346  .404
 2  21158  5873 1107  199  470  2048   45  3301  134  200  118  742  322  .343  .415
 3  33467 10152 2175  234 1673  5227  703  5296  263   36  355  563  185  .398  .532
 4  24415  7147 1594  118 1283  3833  674  4509  183   10  268  401  181  .389  .525
 5  30041  8349 1820  176 1267  3732  474  5606  153   35  240  308  157  .358  .477
 6  22945  6293 1224  133  828  2501  369  4336  136   68  175  227  124  .347  .447
 7  16077  4116  759  116  427  1714  290  3131  106   71  112  186   81  .330  .397
 8  12096  3102  585   89  258  1510  372  2282   97   92  131  260  126  .340  .384
 9  21506  4419  840   76  330  1843  127  5954   82 1226  125  104   72  .269  .298

NL left-handed hitter vs left-handed pitchers:

       AB     H   2B   3B   HR    BB  IBB    SO  HBP   SH   SF   SB   CS   OBP   SLG
 1   9251  2493  375   98  101   843    7  1451  151  114   61  278  102  .338  .364
 2   5236  1339  233   46   84   488    2   944   65   88   30  218   97  .325  .366
 3  12839  3621  728   72  488  1554   69  2352  212   34  155  143   64  .365  .464
 4   7595  2019  380   38  323   963   61  1725  118   13   83  106   50  .354  .453
 5   7206  1836  354   42  249   835   17  1635   86   28   69   69   49  .336  .419
 6   5426  1308  246   22  146   490    2  1242   69   29   54   52   30  .309  .375
 7   3696   906  158   29   74   279    7   861   45   47   32   40   21  .304  .364
 8   2718   612  113   23   34   224   34   576   32   28   19   50   21  .290  .321
 9   4896   805  118    9   37   300    3  1677   20  425   24   46   21  .215  .215

NL switch-hitters vs right-handed pitchers:

       AB     H   2B   3B   HR    BB  IBB    SO  HBP   SH   SF   SB   CS   OBP   SLG
 1  22316  6018  980  225  292  2369   67  3650  184  203  136  850  333  .343  .373
 2  20040  5497  927  182  309  1859   30  2872  151  367  122  257  108  .339  .385
 3  10255  2963  575   44  408  1300  119  1608   49   31  103  242   72  .368  .473
 4   9854  2754  575   36  481  1438  164  1999   69    5  112  151   68  .371  .492
 5   9618  2599  544   40  317  1138  128  1882   75   19  120  146   90  .348  .434
 6   9938  2661  528   74  290  1138  118  1924   95   39   88  169   90  .346  .423
 7   9460  2368  487   64  223  1018   89  1816   97   69   78  119   67  .327  .386
 8  13097  3331  572  121  184  1536  318  2310  101  120  115  115   45  .335  .359
 9   6461  1429  279   30  125   690   35  1662   52  159   45   86   42  .300  .332

NL switch-hitters vs left-handed pitchers:

       AB     H   2B   3B   HR    BB  IBB    SO  HBP   SH   SF   SB   CS   OBP   SLG
 1   8241  2266  447   74  140   782   13  1267   56   76   35  209  110  .341  .398
 2   6604  1830  329   41  112   608    8   993   27  157   43   87   40  .339  .390
 3   3819  1070  205   15  119   411   26   613   15   11   36   77   32  .349  .435
 4   3448   969  179   19  142   444   60   616   28    4   33   48   39  .365  .468
 5   3129   861  179   11  105   300   26   572   36   12   16   29   34  .344  .440
 6   3070   806  171   11   81   313   25   598   28   31   36   30   30  .333  .405
 7   3121   749  152   13   72   310   21   614   29   24   33   33   24  .311  .366
 8   4095  1021  193   21   61   458   79   715   36   43   23   32   23  .328  .351
 9   1976   433   74    8   30   175    8   516   11   73   11   24   15  .285  .310

AL right-handed hitter vs right-handed pitchers:

       AB     H   2B   3B   HR    BB  IBB    SO  HBP   SH   SF   SB   CS   OBP   SLG
 1  27339  7441 1371  198  567  2594   48  4382  441  273  204 1711  546  .343  .399
 2  34110  9289 1781  195  831  2879   17  5776  473  465  293 1150  348  .335  .409
 3  33523  9694 1872  145 1520  3658  255  5998  441   55  414  796  218  .363  .490
 4  41605 11606 2372  143 2146  4566  299  8180  543   27  482  342  166  .354  .498
 5  31383  8383 1682  107 1240  2852   99  6223  395   61  341  419  241  .333  .446
 6  36552  9555 1932  142 1322  3057  102  7475  470  142  318  509  283  .324  .431
 7  37251  9409 1817  159 1114  2871   90  7389  501  250  358  591  322  .312  .400
 8  43071 10736 2113  165 1024  3069   67  8138  555  537  350  534  298  .305  .377
 9  44837 11168 2146  220  782  2945   11  8285  535  916  346  508  240  .301  .359

AL right-handed hitter vs left-handed pitchers:

       AB     H   2B   3B   HR    BB  IBB    SO  HBP   SH   SF   SB   CS   OBP   SLG
 1  12450  3426  711   84  295  1270   38  1896  119   76   66  601  254  .346  .417
 2  15513  4379  865   92  423  1541   51  2310  110  196  114  386  177  .349  .432
 3  13143  3856  786   58  667  1822  220  2211   88   17  150  268  118  .379  .514
 4  16239  4850  988   41  900  2357  358  2909  117    5  195  151   93  .387  .531
 5  15234  4224  888   53  665  1749  186  2834  106   25  134  131   98  .353  .473
 6  15681  4300  882   61  587  1561  146  2868  104   46  110  189  128  .342  .451
 7  15266  4117  859   59  522  1471  110  2706  103   73  121  184  142  .336  .436
 8  16459  4322  880   81  453  1328   66  2934  110  144  146  142  133  .319  .408
 9  16394  4214  851   98  326  1250   21  2727  102  247  148  167  129  .311  .381

AL left-handed hitter vs right-handed pitchers:

       AB     H   2B   3B   HR    BB  IBB    SO  HBP   SH   SF   SB   CS   OBP   SLG
 1  29762  8553 1493  338  682  2889  205  3885  242  195  214  925  343  .353  .429
 2  20295  5817 1153  158  578  2143   81  2952  140  154  155  774  293  .356  .444
 3  31387  9343 1932  146 1494  4398  539  4890  229   25  353  464  168  .384  .511
 4  23468  6711 1384   75 1248  3392  528  4116  183    4  228  275  126  .377  .511
 5  31119  8740 1916  124 1388  4032  551  5151  180   30  293  280  121  .364  .484
 6  24744  6842 1380  128  957  2758  336  4392  142   57  186  207  129  .350  .459
 7  21333  5827 1089  144  702  2302  297  3786  149   77  193  164   94  .345  .436
 8  15447  4145  851   77  457  1504  184  2877   98   94  126  165   99  .335  .422
 9  11090  2864  545   81  196   932   44  1796   78  133  106  342  188  .317  .375

AL left-handed hitter vs left-handed pitchers:

       AB     H   2B   3B   HR    BB  IBB    SO  HBP   SH   SF   SB   CS   OBP   SLG
 1   9182  2487  399   69  135   872   14  1541  109  119   80  238  138  .339  .373
 2   5605  1447  255   35   97   516    3  1010   75   83   48  260  114  .326  .368
 3  12456  3353  620   34  442  1340   22  2473  209   29  154  140   62  .346  .431
 4   8243  2192  465   18  340   889   33  1773  110   12   85   92   44  .342  .450
 5   9100  2403  461   27  320   866   13  1907  120   27  101   76   51  .333  .426
 6   7291  1918  375   29  219   692    9  1603  110   45   65   66   43  .333  .413
 7   5732  1422  259   31  151   520    4  1229   94   56   51   54   28  .318  .383
 8   3872   938  169   19   68   345    0   909   61   49   37   62   39  .311  .348
 9   2896   697  109   16   42   199    3   572   40   74   19  130   97  .297  .333

AL switch-hitters vs right-handed pitchers:

       AB     H   2B   3B   HR    BB  IBB    SO  HBP   SH   SF   SB   CS   OBP   SLG
 1  23600  6444 1146  240  409  2672   97  3896  135  226  173 1027  375  .348  .394
 2  23964  6611 1161  223  460  2578   40  3679  158  370  223  277  104  .347  .401
 3   9672  2816  541   89  319  1036   78  1542   70   40  135  139   63  .359  .464
 4   8519  2406  470   43  359  1107  131  1634   54    6   85   94   54  .365  .474
 5   9900  2708  531   47  393  1231  118  1970   89   21   86  122   77  .356  .456
 6  10925  2937  543   69  410  1333  123  2248   97   50   86  174   87  .351  .444
 7  12622  3202  681   67  350  1421  111  2431  102   55  120  186   93  .331  .401
 8  11704  2970  565   80  259  1291   78  2185   91  111  101  187  118  .330  .382
 9  12389  3089  545  107  172  1126   22  2376  102  214   97  357  194  .315  .352

AL switch-hitters vs left-handed pitchers:

       AB     H   2B   3B   HR    BB  IBB    SO  HBP   SH   SF   SB   CS   OBP   SLG
 1   8719  2357  442   89  144   913   24  1329   56  102   55  296  120  .341  .391
 2   8899  2343  470   59  149   804   11  1415   63  179   91  111   51  .326  .380
 3   3945  1139  207   23  127   427   33   685   35   25   36   49   22  .360  .449
 4   3579  1031  201    9  163   424   70   681   28    6   44   26   23  .364  .486
 5   3885  1056  205   14  119   446   48   723   34    8   32   45   28  .349  .424
 6   3939  1009  215   11  112   448   43   734   24   18   27   39   28  .334  .402
 7   4487  1142  234   23  105   419   18   837   31   51   42   46   38  .320  .387
 8   3988  1046  201   14   97   341   18   714   24   47   35   58   65  .322  .393
 9   4320  1066  202   17   62   357    5   749   31   98   37  124   77  .306  .344

The switch-hitting charts aren't really needed for what I'll be doing but I wanted to include them for completeness.

So given the handedness of my lineup (lefties batting first, third and fifth, righties everywhere else), here's how my lineup will hit against right- and left-handed pitchers.

NL versus righties:

       AB     H   2B   3B   HR    BB  IBB    SO  HBP   SH   SF   SB   CS   OBP   SLG
 1  33686  9398 1622  419  587  3304  154  4754  243  235  171 1027  425  .346  .404
 2  45990 12527 2338  300 1106  3704   24  7667  633  753  312 1288  412  .333  .408
 3  33467 10152 2175  234 1673  5227  703  5296  263   36  355  563  185  .398  .532
 4  46943 13231 2556  196 2300  4487  363  9315  636   31  519  751  322  .349  .492
 5  30041  8349 1820  176 1267  3732  474  5606  153   35  240  308  157  .358  .477
 6  47974 12630 2473  263 1674  3706  206  9704  626  204  472  944  483  .321  .430
 7  54249 13622 2630  255 1497  4102  222 10976  642  359  483  708  389  .309  .392
 8  51283 12607 2383  284 1027  4498  997 10056  627  541  400  449  218  .312  .363
 9  43936  7180 1234   82  371  2140   19 15367  249 4021  178  205  102  .206  .221

NL versus lefties:

       AB     H   2B   3B   HR    BB  IBB    SO  HBP   SH   SF   SB   CS   OBP   SLG
 1   9251  2493  375   98  101   843    7  1451  151  114   61  278  102  .338  .364
 2  18404  5246 1055  129  524  1856   48  2791  119  257  125  328  172  .352  .442
 3  12839  3621  728   72  488  1554   69  2352  212   34  155  143   64  .365  .464
 4  17578  5202 1071   86  945  2314  371  3062  115    7  172  257  143  .378  .528
 5   7206  1836  354   42  249   835   17  1635   86   28   69   69   49  .336  .419
 6  17799  4869 1019   97  695  1844  157  3218  119   50  151  261  160  .343  .459
 7  18082  4798 1014  106  533  1711  162  3321  114   92  143  158  131  .330  .422
 8  16991  4371  926   83  384  1816  417  3065  107  133  154  128   76  .330  .389
 9  16429  2921  531   42  203  1128   70  5607   94 1408  106   39   21  .233  .252

AL versus righties:

       AB     H   2B   3B   HR    BB  IBB    SO  HBP   SH   SF   SB   CS   OBP   SLG
 1  29762  8553 1493  338  682  2889  205  3885  242  195  214  925  343  .353  .429
 2  34110  9289 1781  195  831  2879   17  5776  473  465  293 1150  348  .335  .409
 3  31387  9343 1932  146 1494  4398  539  4890  229   25  353  464  168  .384  .511
 4  41605 11606 2372  143 2146  4566  299  8180  543   27  482  342  166  .354  .498
 5  31119  8740 1916  124 1388  4032  551  5151  180   30  293  280  121  .364  .484
 6  36552  9555 1932  142 1322  3057  102  7475  470  142  318  509  283  .324  .431
 7  37251  9409 1817  159 1114  2871   90  7389  501  250  358  591  322  .312  .400
 8  43071 10736 2113  165 1024  3069   67  8138  555  537  350  534  298  .305  .377
 9  44837 11168 2146  220  782  2945   11  8285  535  916  346  508  240  .301  .359

AL versus lefties:

       AB     H   2B   3B   HR    BB  IBB    SO  HBP   SH   SF   SB   CS   OBP   SLG
 1   9182  2487  399   69  135   872   14  1541  109  119   80  238  138  .339  .373
 2  15513  4379  865   92  423  1541   51  2310  110  196  114  386  177  .349  .432
 3  12456  3353  620   34  442  1340   22  2473  209   29  154  140   62  .346  .431
 4  16239  4850  988   41  900  2357  358  2909  117    5  195  151   93  .387  .531
 5   9100  2403  461   27  320   866   13  1907  120   27  101   76   51  .333  .426
 6  15681  4300  882   61  587  1561  146  2868  104   46  110  189  128  .342  .451
 7  15266  4117  859   59  522  1471  110  2706  103   73  121  184  142  .336  .436
 8  16459  4322  880   81  453  1328   66  2934  110  144  146  142  133  .319  .408
 9  16394  4214  851   98  326  1250   21  2727  102  247  148  167  129  .311  .381
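
As a rough illustration of how the four composite tables above are assembled, the sketch below takes the left-handed row for slots 1, 3 and 5 and the right-handed row for every other slot, then recomputes OBP and SLG from the counting stats. Only the first three NL rows against right-handed pitchers are typed in. The dictionary layout and the names `composite`, `obp` and `slg` are assumptions made for this example, and the OBP formula (times on base over at-bats plus walks, hit-by-pitches and sacrifice flies) is the standard definition, which appears to match the tables.

```python
# Rows keyed by batting slot: (AB, H, 2B, 3B, HR, BB, HBP, SF), copied from the
# NL tables above (hitters vs right-handed pitchers). Only slots 1-3 are shown.
RHH_VS_RHP = {
    1: (33460, 8980, 1671, 280, 617, 2916, 513, 192),
    2: (45990, 12527, 2338, 300, 1106, 3704, 633, 312),
    3: (37770, 10962, 2118, 174, 1910, 4326, 501, 395),
}
LHH_VS_RHP = {
    1: (33686, 9398, 1622, 419, 587, 3304, 243, 171),
    2: (21158, 5873, 1107, 199, 470, 2048, 134, 118),
    3: (33467, 10152, 2175, 234, 1673, 5227, 263, 355),
}
LEFTY_SLOTS = {1, 3, 5}  # the "traditional" handedness pattern described above


def composite(lefty_table, righty_table):
    """Pick the lefty row for slots 1, 3, 5 and the righty row otherwise."""
    return {
        slot: (lefty_table if slot in LEFTY_SLOTS else righty_table)[slot]
        for slot in sorted(righty_table)
    }


def obp(row):
    ab, h, _2b, _3b, _hr, bb, hbp, sf = row
    return (h + bb + hbp) / (ab + bb + hbp + sf)


def slg(row):
    ab, h, doubles, triples, homers, _bb, _hbp, _sf = row
    total_bases = h + doubles + 2 * triples + 3 * homers
    return total_bases / ab


nl_vs_righties = composite(LHH_VS_RHP, RHH_VS_RHP)
for slot, row in nl_vs_righties.items():
    print(slot, f"{obp(row):.3f}", f"{slg(row):.3f}")
```

Running the same selection against the left-handed-pitcher tables would give the "versus lefties" composites.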

Now, my model is designed to predict scoring over an eight-inning stretch and, while it won't give us much insight into how each lineup will perform against a parade of short relievers in the late innings of a game, it could give us an idea whether or not such a lineup is designed to work well during the earlier innings of a game.

The best lineups against right-handed pitchers in the National League (I've added the traditional one at the end):

  RUNS  ----- LINEUP ----      V/L
 4.103  1 3 5 4 2 6 7 9 8    4.198
 4.100  1 3 5 2 4 6 7 9 8    4.228
 4.099  1 3 5 4 2 6 8 7 9    4.194
 4.098  1 3 5 4 2 6 9 8 7    4.193
 4.097  1 2 5 3 4 6 7 9 8    4.228
 4.097  1 3 5 4 2 7 6 9 8    4.197
 4.096  1 2 5 3 4 6 8 7 9    4.227
 4.096  1 3 4 5 2 6 7 9 8    4.205
 4.096  2 3 1 5 4 6 7 9 8    4.210
 4.095  1 3 2 5 4 6 7 9 8    4.230
 
 4.077  1 2 3 4 5 6 7 8 9    4.195
 
Where: V/L - how the same lineup did against left-handed pitchers.

Notice that the very best lineups bunch all the lefties at the top of the order. Note also that all of these lineups actually hit lefties better than righties. That's because we have more right-handed hitters in the lineup.

The worst in the NL:

  RUNS  ----- LINEUP ----
 3.861  8 7 2 1 9 5 6 4 3
 3.863  8 7 2 1 9 5 4 6 3
 3.863  8 7 2 1 9 4 5 6 3
 3.863  2 8 7 1 9 5 6 4 3
 3.864  8 9 5 1 7 2 6 4 3
 3.865  8 9 5 1 7 6 2 4 3
 3.865  8 9 4 1 7 2 6 5 3
 3.865  8 7 2 1 9 6 5 4 3
 3.865  8 7 2 1 9 4 6 5 3
 3.865  2 9 4 1 8 7 6 5 3

The best in the AL (again, with the traditional one at the end):

  RUNS  ----- LINEUP ----      V/L
 4.463  4 3 2 5 1 6 8 7 9    4.506
 4.462  4 3 2 5 1 6 7 8 9    4.505
 4.461  4 1 2 5 3 6 7 8 9    4.516
 4.461  4 1 2 5 3 6 8 7 9    4.517
 4.461  4 2 5 3 1 6 8 7 9    4.503
 4.460  4 2 5 1 3 6 7 8 9    4.512
 4.460  4 2 5 1 3 6 8 7 9    4.514
 4.460  4 2 5 3 1 6 7 8 9    4.501
 4.460  4 3 2 5 1 6 7 9 8    4.503
 4.460  4 3 2 5 1 6 8 9 7    4.505
 
 4.441  1 2 3 4 5 6 7 8 9    4.499

I found it strange that all of these lineups had the cleanup hitter leading off and several of them bunched the lefties together, but this time in the middle of the order.

And the worst in the AL:

  RUNS  ----- LINEUP ----
 4.323  9 8 1 7 6 5 4 2 3
 4.323  9 8 1 7 6 2 3 4 5
 4.323  9 8 1 6 2 7 5 4 3
 4.323  8 9 1 7 6 5 4 2 3
 4.323  6 9 1 7 8 5 4 2 3
 4.323  2 9 7 6 8 1 4 3 5
 4.324  9 8 1 7 6 4 2 3 5
 4.324  9 8 1 7 6 2 5 4 3
 4.324  9 8 1 7 6 2 4 3 5
 4.324  9 8 1 6 2 7 4 3 5

There is more of a difference between the very best and worst lineups this time around, but these values are still in a rather narrow band, and the best lineup against right-handed pitchers is still not noticeably better than the traditional one.

The best lineups in the NL against lefties (with the traditional one at the end):

  RUNS  ----- LINEUP ----      V/R
 4.252  8 4 6 2 3 5 1 7 9    4.037
 4.251  6 1 2 3 4 5 8 7 9    4.062
 4.251  6 1 2 3 4 5 8 9 7    4.050
 4.251  7 1 6 3 4 2 5 8 9    4.037
 4.251  8 4 6 1 3 2 5 7 9    4.030
 4.249  6 1 2 3 4 8 5 9 7    4.038
 4.249  6 1 3 2 4 5 8 9 7    4.039
 4.249  7 1 6 2 3 4 5 8 9    4.040
 4.248  6 1 3 2 4 5 8 7 9    4.051
 4.248  7 1 2 3 4 6 5 8 9    4.052
 
 4.195  1 2 3 4 5 6 7 8 9    4.077
 
Where: V/R - how the same lineup did against right-handed pitchers.

Notice how the best lineup bunches all the lefty hitters together toward the bottom of the order, although this is not true of the others on the list.

The worst:

  RUNS  ----- LINEUP ----
 4.016  5 8 1 9 2 3 7 4 6
 4.019  5 8 1 9 2 3 7 6 4
 4.020  5 8 1 9 2 7 4 6 3
 4.024  5 8 1 9 2 7 3 6 4
 4.025  5 8 1 9 2 7 4 3 6
 4.025  5 8 1 9 2 6 7 4 3
 4.026  5 8 1 9 2 3 4 7 6
 4.027  3 8 1 9 2 7 4 6 5
 4.028  5 8 1 9 2 7 3 4 6
 4.028  5 8 1 9 2 3 4 6 7

And the same for the AL. The best:

  RUNS  ----- LINEUP ----      V/R
 4.546  6 7 1 4 3 2 5 9 8    4.415
 4.545  6 7 1 4 3 2 5 8 9    4.416
 4.545  6 7 2 4 3 1 5 9 8    4.420
 4.544  6 7 2 4 3 1 5 8 9    4.421
 4.543  6 7 1 4 2 3 5 8 9    4.400
 4.543  6 7 1 4 2 3 5 9 8    4.398
 4.541  6 7 5 4 3 1 2 9 8    4.428
 4.540  6 7 1 4 3 2 9 5 8    4.407
 4.540  6 7 2 4 3 1 9 5 8    4.410
 4.539  5 7 1 4 3 2 6 8 9    4.435
 
 4.499  1 2 3 4 5 6 7 8 9    4.441

Again, we see a few of the lineups clustering the lefties toward the bottom of the order. Notice that in both the NL and AL, the difference between the best and the normal lineup is about .05 runs, which is much greater than the gaps we've seen before. Still, since that amounts to about a run every three weeks, it's certainly not a dramatic difference.
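
The "run every three weeks" figure can be checked with a few lines of arithmetic. The sketch below uses the best and traditional run values against left-handed pitching from the lists above. The schedule density and season length are assumptions for illustration, and the season total pretends every game is started by a left-hander, so it overstates the real-world gain.

```python
# Expected runs per game against left-handed pitching, from the lists above.
best = {"NL": 4.252, "AL": 4.546}
traditional = {"NL": 4.195, "AL": 4.499}

GAMES_PER_WEEK = 6.5  # assumption: rough schedule density, not from the article
SEASON_GAMES = 162    # assumption: full modern schedule


def games_per_run(gap):
    """Number of games needed for a per-game gap to add up to one run."""
    return 1.0 / gap


for league in best:
    gap = best[league] - traditional[league]
    games = games_per_run(gap)
    print(
        league,
        f"gap={gap:.3f}",
        f"games per run={games:.1f}",
        f"weeks per run={games / GAMES_PER_WEEK:.1f}",
        f"runs per season={gap * SEASON_GAMES:.1f}",
    )
```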

The worst lineups:

  RUNS  ----- LINEUP ----
 4.404  9 1 8 7 5 6 3 2 4
 4.405  3 2 8 9 1 6 7 5 4
 4.408  9 1 8 7 6 5 3 2 4
 4.408  7 5 8 9 1 6 3 2 4
 4.408  3 1 8 9 5 2 7 4 6
 4.409  9 2 8 3 1 6 7 5 4
 4.409  3 2 8 9 1 7 5 4 6
 4.409  3 2 8 9 1 5 7 4 6
 4.410  9 1 8 7 5 3 2 6 4
 4.410  7 1 8 9 5 6 3 2 4

I wanted to look at one last thing in this article, and that has to do with Barry Bonds. During the last few seasons, there has been quite a bit of discussion on how best to utilize him in a lineup. Do you bat him third or fourth or do something even more creative? I've heard suggestions that he should lead off, since opposing teams would be less likely to intentionally walk him. So I thought it might be interesting to analyze the composite lineup for the San Francisco Giants from 1999 to 2004. Instead of looking at their composite batting order, however, I looked at how they hit by position (and when doing this, I credited pinch-hitters to the position of the player they were hitting for). So here's how the Giants hit by position from 1999 to 2004:

       AB     H   2B   3B   HR    BB  IBB    SO  HBP   SH   SF   SB   CS   OBP   SLG
 P   2882   536  105    5   36   176   12   787   20  282   20   10    7  .236  .263
 C   3440   919  202   25   86   304   40   629   40   27   28   17    9  .331  .415
1B   3487   941  189   16  121   462   33   749   52    4   45   14   18  .360  .437
2B   3717  1078  239   34  139   386   24   604   36   25   45   62   42  .359  .485
3B   3637   964  190   14   92   349   17   480   34   32   36   20   13  .332  .401
SS   3706  1041  196   21  122   280   21   528   16   30   31   10   18  .332  .444
LF   3099   984  201   20  292   968  290   488   53   10   29   84   18  .483  .678
CF   3834  1030  200   29  106   347   11   632   36   19   17  105   44  .334  .419
RF   3551   956  180   31  149   475   28   766   28   17   32   67   35  .357  .463

Note that not all of the left-fielder's batting line is the responsibility of Barry Bonds, but enough of it is to make the line instantly recognizable as his. So what should the Giants have done with Barry Bonds? Here are the top lineups:

  RUNS  --------- LINEUP ---------
 4.699  CF RF LF 1B 2B  C 3B SS  P
 4.698  CF RF LF 1B 2B 3B SS  P  C
 4.694  CF RF LF 1B 2B  C  P 3B SS
 4.694  CF RF LF 1B 2B  C SS 3B  P
 4.692  3B RF LF 1B 2B CF SS  P  C
 4.690  CF RF LF 1B 2B  C SS  P 3B
 4.690  CF RF LF 1B 2B SS  C 3B  P
 4.689  CF RF LF 1B 2B  C 3B  P SS
 4.688  3B RF LF 1B 2B CF SS  C  P
 4.688  CF RF LF 1B 2B SS  C  P 3B

The worst:

  RUNS  --------- LINEUP ---------
 4.332   P CF 1B  C SS 3B RF 2B LF
 4.332   C SS 1B  P 2B CF 3B LF RF
 4.332  3B CF 1B  P 2B RF  C LF SS
 4.331   P SS  C 2B CF 3B RF 1B LF
 4.331   P SS CF 2B RF 3B  C 1B LF
 4.327   P SS  C 2B CF 1B 3B RF LF
 4.327   P SS CF 2B 3B 1B  C RF LF
 4.325   P SS CF 2B 1B 3B  C RF LF
 4.324   P SS 3B 2B CF 1B  C RF LF
 4.322   P SS 1B 2B CF 3B  C RF LF

Oh well, it turns out that my method thinks the best position for Bonds was batting third, which is where he batted most of those years (he hit cleanup the last two and a half years of the period). Once again, I was hoping for something a little more creative. A lineup featuring him leading off would have been nice, but it didn't happen. Actually, at this point in the article, I would have settled for a lineup with him in the second slot. The most representative lineup they used during those years?

  RUNS  --------- LINEUP ---------
 4.600  CF 3B LF 2B 1B RF SS  C  P

Which turns out not to be such a bad choice.

I did want to point out a potential problem with this method, and that is: beware of small sample sizes. Notice that all of these studies made use of a rather large number of games. The smallest was the six seasons that went into the Giants study. I also ran my method against the 2004 Giants (since Bonds' performance seemed most extreme that season) and found a great deal of variation. Here are the best, traditional and worst lineups I found.

  RUNS  ----- LINEUP ----
 5.145  1 4 2 8 6 5 7 3 9
 4.696  1 2 3 4 5 6 7 8 9
 4.088  7 1 6 8 3 2 4 9 5

Now, these are the sorts of differences I was looking for!

In order to explain these results, I generated the statistical totals (scaled to be about a season's worth of games) for each of the three lineups. Let's start with the normal one.

  RUNS  ----- LINEUP ----
 4.696  1 2 3 4 5 6 7 8 9
 
PL  AB   H  2B  3B  HR  BB  IB  SO HBP  SH  SF   OBP   SLG
 1 679 189  39  12  26  83   3 102   8   5   7  .360  .486
 2 661 169  29   3  15  75   1 111   5  17   6  .333  .377
 3 690 208  52   5  24  41   2  91   5   1   9  .341  .496
 4 487 175  40   3  47 226 107  58   9   0   4  .565  .743
 5 648 166  27   2  18  51   2  84   8   3   3  .317  .387
 6 608 166  41   1  16  57   6  83  15   2   8  .346  .423
 7 603 162  29   3  20  48   4  78   6   5   7  .325  .426
 8 565 150  33   3  10  63  14  97   6  10   4  .343  .388
 9 528  99  20   2   4  36   2 151   9  51   3  .250  .256

Not too surprisingly, this looks a lot like the 2004 Giants batting splits by batting order position. There are small differences. For example, the 4th spot in the order had 233 walks (and 115 of them were intentional) in real life. These are explained by differences in the situational mix due to the removal of base-running events. Many of these events cause a runner to move from first to second, a transition that dramatically increases the chance of Barry Bonds getting a free pass. So this is our baseline.

Next up: the best lineup:

  RUNS  ----- LINEUP ----
 5.145  1 4 2 8 6 5 7 3 9
 
PL  AB   H  2B  3B  HR  BB  IB  SO HBP  SH  SF   OBP   SLG
 1 689 191  39  11  27  82   1 103   9   6   5  .359  .483
 4 531 204  49   2  58 228  82  66   9   0   3  .572  .812
 2 647 176  35   5  15  77   1 122   5  19  12  .348  .411
 8 632 180  36   3  11  85  24 102   8   8   7  .373  .403
 6 627 176  43   1  19  59   8  83  18   2  12  .353  .443
 5 633 164  27   2  21  55   1  84   5   2   2  .322  .408
 7 610 169  28   3  22  54   6  81   7   4   4  .341  .441
 3 601 183  46   3  23  34   2  83   3   1  18  .335  .506
 9 537 103  23   2   4  37   2 150   8  50   3  .253  .264

Moving the cleanup hitter (Bonds) to the second spot increased his plate appearances, dramatically decreased his number of intentional walks, and caused his slugging percentage to jump 69 points. It caused the other hitters to improve as well. Apart from the leadoff hitter, who stayed where he was in the lineup, every regular improved his OPS in the new lineup. After Bonds, the largest jump in performance belongs to the second-place hitter (most often Michael Tucker and Deivi Cruz), who raised his on-base percentage 15 points and his slugging percentage 34 points.
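
The OPS comparison can be reproduced directly from the two tables above. The sketch below copies the OBP and SLG columns for each hitter (keyed by his slot in the normal lineup) and reports the change in points. The helper name `ops_change_points` is just a label chosen for this example.

```python
# (OBP, SLG) keyed by the hitter's slot in the normal lineup, copied from the
# "normal" and "best" 2004 Giants tables above.
normal = {
    1: (.360, .486), 2: (.333, .377), 3: (.341, .496),
    4: (.565, .743), 5: (.317, .387), 6: (.346, .423),
    7: (.325, .426), 8: (.343, .388), 9: (.250, .256),
}
best = {
    1: (.359, .483), 2: (.348, .411), 3: (.335, .506),
    4: (.572, .812), 5: (.322, .408), 6: (.353, .443),
    7: (.341, .441), 8: (.373, .403), 9: (.253, .264),
}


def ops_change_points(hitter):
    """OPS change from the normal to the best lineup, in points (thousandths)."""
    return round((sum(best[hitter]) - sum(normal[hitter])) * 1000)


for hitter in sorted(normal):
    print(hitter, ops_change_points(hitter))
```

Only hitter 1, the leadoff man who did not move, comes out with a negative change.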

We'll look at reasons for these in a moment, but first, the worst lineup:

  RUNS  ----- LINEUP ----
 4.088  7 1 6 8 3 2 4 9 5
 
PL  AB   H  2B  3B  HR  BB  IB  SO HBP  SH  SF   OBP   SLG
 7 700 192  34   5  25  53   5  84   5   3   4  .328  .444
 1 652 163  34   9  19  76   1 110   5  11   4  .331  .417
 6 647 165  44   1  12  61   5  90  13   2   6  .329  .382
 8 613 165  33   3  11  74  19 102   7   9   3  .353  .387
 3 640 192  49   4  24  32   2  85   3   1  12  .330  .502
 2 577 141  24   3  11  70   1  97   4  11   5  .328  .354
 4 445 161  36   2  45 188  78  54  10   0   3  .556  .755
 9 516  91  17   2   3  33   2 152   9  72   3  .237  .234
 5 543 132  21   2  16  57   3  78   7   1   3  .321  .378

Earlier in the article, I talked about how this method ignores batter protection, the fact that pitchers may alter the way certain batters are pitched to (or not pitched to, for that matter) depending upon the on-deck hitter. Well, here is an example where that assumption clearly causes problems. Note that Bonds is now hitting immediately in front of the pitcher, and that his walks and intentional walks are the lowest of the three scenarios. I'm pretty confident that this isn't even remotely realistic. Fortunately, you're probably not going to see these types of lineups except when dealing with the pathologically bad ones.

One other thing to note with the last lineup is how poorly the leadoff hitter (usually Ray Durham) did. His on-base plus slugging percentage was down 98 points over how he did with the normal lineup.

So why did we see the results we did? Let's start with Bonds - why was his slugging percentage so much higher in the best lineup than it was in the normal one? This is primarily caused by how he did in two situations:

OUT FST   AB  H 2B 3B HR BB IB SO HP SH SF   OBP   SLG   Norm  Best
  0 ---  143 41 10  1 14 34  2 19  3  0  0  .433  .664  1.026  .793
  1 ---   64 30  6  0  7 23  6  5  0  0  0  .609  .891   .488 1.192
 
Where: OUT FST - the situation (outs, men on)
       AB ... SLG - The statistical line for that situation
       Norm - the number of times per game the situation occurred in the "normal" run
       Best - the number of times per game the situation occurred in the "best" run

These situations (bases empty with zero and one out) are very similar. Despite this, Bonds did much better when there was one out, due primarily (I believe) to small sample sizes. Moving Bonds up in the lineup to the second spot increased the occurrences of the one-out situation, while decreasing the chances of seeing the no-out one. So one of the reasons the Giants scored more runs with this lineup is that it was designed to take advantage of Bonds' ability to hit with no one on and one out, while minimizing the liability of his poorer showing when leading off an inning. This is fine, except that I don't for a moment believe these are really abilities. Instead, I think this lineup took advantage of an illusion created by a relatively small sample size.
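
To see how much a shift in situation frequency alone can move the numbers, the sketch below blends Bonds' slugging percentage in the two bases-empty situations, weighting each by how often it came up per game in the normal and best runs. This is only a rough illustration: the frequencies count every occurrence of the situation while slugging percentage is per at-bat, and the other situations he batted in are left out. The name `blended_slg` is hypothetical.

```python
# Bonds' two bases-empty situations from the table above:
# (SLG in the situation, times per game in "normal" run, times per game in "best" run)
situations = {
    "0 out, bases empty": (.664, 1.026, .793),
    "1 out, bases empty": (.891, .488, 1.192),
}
NORM, BEST = 1, 2  # column positions inside each tuple


def blended_slg(column):
    """Frequency-weighted SLG over the two situations for the given run."""
    weight = sum(s[column] for s in situations.values())
    return sum(s[0] * s[column] for s in situations.values()) / weight


print(f"normal: {blended_slg(NORM):.3f}")
print(f"best:   {blended_slg(BEST):.3f}")
```

Under these simplifications the blend moves from about .737 to about .800 without Bonds hitting any differently in either situation, which is the kind of effect described above.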

Why did the second-place hitter improve? This was primarily due to the shift in frequency of the following two situations:

OUT FST   AB  H 2B 3B HR BB IB SO HP SH SF   OBP   SLG   Norm  Best
  1 ---  193 45  9  0  4 22  0 29  1  0  0  .315  .342  1.195  .592
  1 F--   38 11  1  0  2  3  0 13  0  0  0  .341  .474   .233  .671

It turns out that dropping him down a spot allowed him to exploit an above-average performance with a man on first and one out. Again, this was achieved in a small sample size (41 plate appearances).

Finally, why did the leadoff hitter do so poorly in the worst-case lineup? Here are the situations causing the drop-off:

OUT FST   AB  H 2B 3B HR BB IB SO HP SH SF   OBP   SLG   Norm Worst
  0 ---  297 86 16  4 16 27  0 40  4  0  0  .357  .532  1.848  .724
  1 ---   91 17  3  0  2 13  0 15  0  0  0  .288  .286   .551 1.324

His excellent performance leading off an inning was minimized once he was moved to the second spot. This caused an increase in the frequency of the one-out bases empty situation, a poor one for the 2004 Giants' leadoff hitter. Again, I'm not sure why Tucker and company hit so well with no one out and yet so poorly with one out, but it's probably due more to luck than to talent.

Unfortunately, these small-scale subjects are usually what people care about. They don't want to know about a composite Giants team from 1999 to 2004; they want to know about a single season. People aren't concerned about generic teams; they want specific answers. How much did it cost the 1961 Yankees having Bobby Richardson leading off? (See Note 1.) Or having Horace Clarke in the same spot nine years later?

One way around this problem might be to combine simulation with this method. Take the team you're interested in and simulate 10000 or so games, collecting situational statistics as you go. Then feed this large sample of games into the Markov model and see which of the 362880 lineups works best for that team. Of course, this is probably easier said than done.
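
To give a feel for the evaluation step, here is a stripped-down Markov-style sketch that tracks outs, baserunners and the batter due up, and returns expected runs for a batting order. It is not the model used in this article: it has no situational statistics, intentional walks, double plays or sacrifices, every runner simply advances as many bases as the batter, the event rates are invented, and all the names (`advance`, `half_inning`, `expected_runs`, `hitters`) and the nine-inning default are assumptions made for this example.

```python
from itertools import permutations

# Per-plate-appearance probabilities of (walk, single, double, triple, homer);
# whatever is left over is an out. These rates are invented for illustration.
GOOD = (0.12, 0.17, 0.06, 0.01, 0.06)
AVERAGE = (0.09, 0.16, 0.05, 0.005, 0.03)
WEAK = (0.04, 0.10, 0.02, 0.0, 0.005)


def advance(bases, event):
    """Return (new_bases, runs). bases is a bitmask: 1=first, 2=second, 4=third."""
    if event == 0:                          # walk: only forced runners move
        for bit in (1, 2, 4):
            if not bases & bit:
                return bases | bit, 0
        return 7, 1                         # bases loaded: a run is forced in
    if event == 4:                          # home run clears the bases
        return 0, bin(bases).count("1") + 1
    shifted = bases << event                # runners advance as far as the batter
    return (shifted & 7) | (1 << (event - 1)), bin(shifted >> 3).count("1")


def half_inning(probs, start, max_pa=40):
    """Expected runs in a half-inning led off by slot `start`, plus the
    probability that each slot leads off the following inning."""
    states = {(0, 0, start): 1.0}           # (outs, bases, slot due up) -> prob
    runs = 0.0
    next_lead = [0.0] * 9
    for _ in range(max_pa):                 # longer innings are ignored (negligible)
        new_states = {}
        for (outs, bases, slot), p in states.items():
            nxt = (slot + 1) % 9
            p_out = p * (1.0 - sum(probs[slot]))
            if outs == 2:
                next_lead[nxt] += p_out     # third out ends the inning
            else:
                key = (outs + 1, bases, nxt)
                new_states[key] = new_states.get(key, 0.0) + p_out
            for event, p_event in enumerate(probs[slot]):
                new_bases, scored = advance(bases, event)
                runs += p * p_event * scored
                key = (outs, new_bases, nxt)
                new_states[key] = new_states.get(key, 0.0) + p * p_event
        states = new_states
    return runs, next_lead


def expected_runs(lineup, hitters, innings=9):
    """Expected runs per game for a batting order (a tuple of hitter indexes)."""
    probs = [hitters[i] for i in lineup]
    per_start = [half_inning(probs, s) for s in range(9)]
    lead = [1.0] + [0.0] * 8                # slot 0 leads off the first inning
    total = 0.0
    for _ in range(innings):
        total += sum(lead[s] * per_start[s][0] for s in range(9))
        lead = [sum(lead[s] * per_start[s][1][j] for s in range(9))
                for j in range(9)]
    return total


hitters = [AVERAGE, AVERAGE, GOOD, GOOD, AVERAGE, AVERAGE, AVERAGE, AVERAGE, WEAK]
print(sum(1 for _ in permutations(range(9))))   # 362880 possible batting orders
for lineup in (tuple(range(9)), tuple(reversed(range(9)))):
    print(lineup, round(expected_runs(lineup, hitters), 3))
```

Scoring every ordering would mean calling `expected_runs` once for each tuple produced by `permutations(range(9))`. In pure Python that takes a while, which is why the example scores only two orders.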

Conclusion

If anything, my approach shows that batting orders matter even less than people have believed. You would think that with such complicated forces at work here, some truly bizarre lineups might have been more efficient than the obvious ones used throughout the years, but if they exist, the methods described in this article didn't find them. That doesn't mean that specific teams haven't used illogical lineups in the past, only that one of those teams wasn't the San Francisco Giants over the past six years, and that it probably didn't cost these hypothetical teams a lot of runs anyway.

I did notice that there are a lot of lineups as productive as the traditional ones that would look very odd to players, fans and the sporting press. A lot of the lineups near the top of many of these lists feature pitchers (in the NL) hitting other than last, as well as other weird orderings. There are lots of instances of very different lineups producing almost identical results. But if the normal lineups do almost as well as these creative ones, is there any percentage in straying from conventional wisdom? I don't think so. And I guess that's the real conclusion of this article: since all but the most pathologically weird lineups produce just about the same number of runs, I might be inclined to select the lineup that makes the most intuitive sense to the players and fans. Simply put, it's not worth all the fuss you'd cause trying to be clever with lineups.

Notes

Note 1:

After I wrote this, I got to wondering what this method would do with the 1961 Yankees. Of course, these results shouldn't be taken too seriously, since the same sample size problems we discussed with the 2004 Giants also apply here, but here are the best (followed by the normal) lineups for that team:

  RUNS  ----- LINEUP ----
 5.096  3 6 7 5 4 8 9 1 2
 5.062  7 5 4 3 6 8 9 1 2
 5.059  6 7 2 3 4 8 5 9 1
 5.052  7 3 1 2 4 8 5 9 6
 5.047  2 4 8 5 6 1 7 3 9
 5.047  3 5 1 2 4 8 9 6 7
 5.045  7 5 1 2 3 4 8 9 6
 5.041  3 6 8 5 4 9 1 7 2
 5.034  1 7 2 3 4 8 5 6 9
 5.032  7 4 8 5 6 1 2 3 9
 
 4.575  1 2 3 4 5 6 7 8 9

And the worst:

  RUNS  ----- LINEUP ----
 3.871  2 1 6 9 4 7 8 3 5
 3.889  2 1 5 9 7 6 8 3 4
 3.913  8 7 6 1 9 5 3 4 2
 3.914  9 7 4 1 5 2 6 8 3
 3.918  8 7 1 9 5 3 6 4 2
 3.921  2 1 9 5 7 6 8 3 4
 3.921  2 1 5 6 9 7 8 3 4
 3.929  2 1 5 9 4 7 8 3 6
 3.934  2 1 9 6 4 7 8 3 5
 3.934  2 1 5 9 7 4 6 8 3

Once again, we see a much larger spread with a single-team sample. Here are the yearly stats for the normal lineup:

PL  AB   H  2B  3B  HR  BB  IB  SO HBP  SH  SF   OBP   SLG
 1 709 179  21   4   5  39   1  46   2  10   4  .292  .315 Richardson
 2 700 178  36   5   6  27   0  83   2  12   3  .283  .346 Kubek
 3 619 178  17   4  60  92   0  66   6   0   6  .382  .619 Maris
 4 568 178  18   6  57 132   9 116   0   1   5  .440  .667 Mantle
 5 633 182  21   4  34  50   7  64   4   0   5  .341  .494 Berra
 6 614 173  23   3  29  40   4  94   8   1   5  .331  .471 Skowron/Howard
 7 571 157  25   5  24  64  11  95   6   1   4  .352  .462 Howard/Skowron
 8 566 132  21   7  15  48   8  86   4   2   7  .294  .375 Boyer
 9 531  90   9   2   6  37   2 127   3  32   5  .226  .228 Pitchers

And the "best" lineup (PL is each player's slot in the normal lineup):

PL  AB   H  2B  3B  HR  BB  IB  SO HBP  SH  SF   OBP   SLG
 3 656 192  17   6  65 103   0  66  10   0   4  .395  .634
 6 686 207  24   4  38  56   6 106  10   3   2  .362  .515
 7 644 172  24   5  27  78  11 116   8   3   4  .351  .446
 5 652 183  20   3  36  56  11  67   3   0   6  .338  .486
 4 549 168  17   5  58 136   6 108   0   0  12  .436  .672
 8 613 152  24   6  22  53  10  88   4   5   9  .308  .414
 9 575 103  10   2   7  41   2 140   4  33   8  .236  .240
 1 589 150  18   3   4  32   1  36   1  10   5  .292  .316
 2 584 150  33   4   6  22   0  68   2   7   2  .285  .358
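The OBP and SLG columns in the two tables above can be checked against the counting stats. The sketch below assumes the standard definitions (OBP as times on base over AB + BB + HBP + SF, SLG as total bases over AB); the article doesn't state its formulas, but the printed values are consistent with these. The function names are illustrative only, and the inputs are the slot 4 (Mantle) rows copied from each table.

```python
def obp(ab, h, bb, hbp, sf):
    """On-base percentage, assuming the standard definition."""
    return (h + bb + hbp) / (ab + bb + hbp + sf)


def slg(ab, h, doubles, triples, hr):
    """Slugging percentage: total bases divided by at-bats."""
    total_bases = h + doubles + 2 * triples + 3 * hr
    return total_bases / ab


# Slot 4 (Mantle) rows copied from the "normal" and "best" tables.
rows = {
    "normal": dict(ab=568, h=178, doubles=18, triples=6, hr=57, bb=132, hbp=0, sf=5),
    "best": dict(ab=549, h=168, doubles=17, triples=5, hr=58, bb=136, hbp=0, sf=12),
}

for label, r in rows.items():
    on_base = obp(r["ab"], r["h"], r["bb"], r["hbp"], r["sf"])
    slugging = slg(r["ab"], r["h"], r["doubles"], r["triples"], r["hr"])
    print(f"{label:6s} OBP {on_base:.3f}  SLG {slugging:.3f}")
```

Both rows reproduce the table values (.440/.667 and .436/.672), which suggests the BB column already includes intentional walks.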

I thought it was interesting that this lineup had Maris batting leadoff and Mantle hitting fifth. Such a lineup could have given Roger the extra at-bats needed to break Ruth's record in 154 games. Also notice that apart from his first at-bat of the game, Maris would be preceded by Richardson and Kubek. Having the pitcher bat seventh is an odd touch, but typical of some of the quirky features of many of these lineups. I'm not convinced that this lineup would have been 80 runs better than the one they used over the course of a season, but it wouldn't surprise me if it was somewhat better. Considering that Ralph Houk typically batted his two worst-hitting regulars first and second all season, it shouldn't be too hard to improve upon what he did.
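The season-level figure comes from scaling the per-game gap in the RUNS column. Here is a rough sketch of that arithmetic, assuming the RUNS values are expected runs per game and a 162-game schedule (the length of the 1961 AL season; neither assumption is spelled out in the tables). The `season_runs` helper is hypothetical.

```python
GAMES = 162  # assumption: length of the 1961 AL schedule

# Per-game figures copied from the 1961 Yankees lists above.
best, normal, worst = 5.096, 4.575, 3.871


def season_runs(per_game_gap, games=GAMES):
    """Scale a per-game run difference to a full season."""
    return per_game_gap * games


print(f"best vs. normal: {season_runs(best - normal):.0f} runs")
print(f"best vs. worst:  {season_runs(best - worst):.0f} runs")
```

Under these assumptions the best lineup comes out roughly 84 runs ahead of the normal one over a season, in line with the "80 runs" mentioned above, and roughly 198 runs ahead of the worst.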