
What truly moves the needle and drives success for baseball teams? At all levels, it’s fairly common to turn to pitching, with the idea that hitting a baseball is so fundamentally hard that it can be nearly impossible against a well-executed pitch.
Stemming from this idea is the research question many baseball teams are asking right now: what type of pitch is the best to throw in each individual count?
There is too much situational data, and there are too many confounding variables, to answer that question as asked, so it has to be a little more specific: for each pitcher, which pitch had the best results in each individual count? (Even this doesn’t eliminate all confounding variables.)
Here is how I went about answering this question:
Data Collection
All data for this project was obtained from baseballsavant.com. I used a custom search filter on the site to download the following pieces of information as a CSV file:
- Team (I chose the Houston Astros, the defending champions, for their dominant pitching staff)
- Pitcher name
- Pitch type
- Count
- Spin rate
- Velocity
- wOBA allowed
The first file used for this research contains 332 rows of data for Houston Astros pitchers, with variables identifying the count and the pitch type.
The second file has 3,423 rows covering a large sample of MLB pitchers (not yet representative). It includes the same information as the first file, but for every qualified pitcher across the league, plus more detailed data points such as velocity, spin rate, and whiffs that could be used for further research or analysis.
In some cases the wOBA allowed across every pitch in the sample was 0, specifically for 2-0 changeups and 2-0 curveballs. The following count and pitch combinations also come with sample-size caveats (n = number of pitches tracked, p = number of unique pitchers who threw the pitch at least once):
- 3-0 changeup: n = 5, p = 5
- 3-0 curve: n = 3, p = 2
- 3-0 slider: n = 6, p = 4
- 3-1 curve: n = 10, p = 4
- 3-1 changeup: n = 21, p = 7
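Caveats like these can be flagged automatically. The following is a minimal Python sketch, not the workflow used for this project: the field names (`pitcher`, `count`, `pitch_type`, `pitches`), the placeholder pitchers, the toy rows, and the `min_pitches` cutoff of 25 are all assumptions made for illustration.

```python
from collections import defaultdict

# Toy rows with hypothetical field names and placeholder pitchers.
rows = [
    {"pitcher": "Pitcher A", "count": "3-0", "pitch_type": "Curveball", "pitches": 2},
    {"pitcher": "Pitcher B", "count": "3-0", "pitch_type": "Curveball", "pitches": 1},
    {"pitcher": "Pitcher A", "count": "0-0", "pitch_type": "Fastball", "pitches": 40},
    {"pitcher": "Pitcher B", "count": "0-0", "pitch_type": "Fastball", "pitches": 35},
]


def sample_sizes(rows):
    """Return {(count, pitch_type): (n, p)} for every combination in rows.

    n = total pitches tracked, p = unique pitchers who threw it at least once.
    """
    totals = defaultdict(int)
    pitchers = defaultdict(set)
    for row in rows:
        if row["pitches"] < 1:
            continue  # a pitcher with zero pitches does not count toward p
        key = (row["count"], row["pitch_type"])
        totals[key] += row["pitches"]
        pitchers[key].add(row["pitcher"])
    return {key: (totals[key], len(pitchers[key])) for key in totals}


def small_samples(rows, min_pitches=25):
    """Keep only the combinations with fewer than min_pitches pitches (arbitrary cutoff)."""
    return {k: v for k, v in sample_sizes(rows).items() if v[0] < min_pitches}


for (count, pitch_type), (n, p) in small_samples(rows).items():
    print(f"{count} {pitch_type}: n = {n}, p = {p}")
```

Any combination that falls under the cutoff is better read as an anecdote than as evidence about which pitch works in that count.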
Define ‘Success’
Of the selected variables, I have designated wOBA allowed as the ‘success’ metric for the purpose of this research.
wOBA is used because it has the strongest correlation with runs scored for an offense. The formula itself is easy to calculate, but the inputs are not easy to obtain, because the coefficients/multipliers used in the calculation change each season based on which outcomes lead to the most runs. Essentially, it assigns a different weight to each outcome of a plate appearance, using the intuitive knowledge that a walk is not worth the same to a baseball team as a home run, even though a traditional stat like OBP considers them equal.
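To make the weighting concrete, here is a minimal Python sketch of the standard wOBA formula next to OBP. The weights in `ILLUSTRATIVE_WEIGHTS` are rounded, illustrative values and not the coefficients for any particular season; the function names and dictionary keys are placeholders of my own.

```python
# Illustrative linear weights only. The real coefficients are recalculated
# every season, so substitute the published values for the season you study.
ILLUSTRATIVE_WEIGHTS = {"uBB": 0.69, "HBP": 0.72, "1B": 0.89,
                        "2B": 1.27, "3B": 1.62, "HR": 2.10}


def woba(stats, weights=ILLUSTRATIVE_WEIGHTS):
    """Weighted on-base average from a dict of counting stats.

    Returns None when there are no qualifying plate appearances.
    """
    unintentional_bb = stats.get("BB", 0) - stats.get("IBB", 0)
    numerator = weights["uBB"] * unintentional_bb
    for outcome in ("HBP", "1B", "2B", "3B", "HR"):
        numerator += weights[outcome] * stats.get(outcome, 0)
    denominator = (stats.get("AB", 0) + unintentional_bb
                   + stats.get("SF", 0) + stats.get("HBP", 0))
    return numerator / denominator if denominator else None


def obp(stats):
    """On-base percentage: every way of reaching base counts the same."""
    hits = sum(stats.get(k, 0) for k in ("1B", "2B", "3B", "HR"))
    reached = hits + stats.get("BB", 0) + stats.get("HBP", 0)
    chances = (stats.get("AB", 0) + stats.get("BB", 0)
               + stats.get("HBP", 0) + stats.get("SF", 0))
    return reached / chances if chances else None


walk = {"BB": 1}
home_run = {"AB": 1, "HR": 1}
print(obp(walk), obp(home_run))    # OBP is identical for both outcomes
print(woba(walk), woba(home_run))  # wOBA credits the home run far more
```

The same calculation applies to wOBA allowed; the counting stats are simply the outcomes a pitcher gave up rather than the ones a hitter produced.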
I have to remind myself often that when we measure and analyze these types of statistics for pitchers, lower is better. With that in mind, this is how FanGraphs classifies wOBA from the hitter’s perspective:
Rating | wOBA
Excellent | .400
Great | .370
Above Average | .340
Average | .320
Below Average | .310
Poor | .300
Awful | .290
However, with the available data we can arrive at a more conclusive categorization.
Findings + Interpretation
Below is the set of data visualizations created in IBM SPSS Statistics from the pitch data described above. These scatter plots illustrate which pitch type had the most and least success for 8 different Houston pitchers in the 2022 season.
For a more detailed view, the tabular representation of the data is below:
Luis Garcia
Best pitch on 2-2: Changeup
Justin Verlander
Best pitch on 0-0: Fastball
Jose Urquidy
Best pitch on 1-2: Fastball
Ryan Pressly
Best pitch on 0-0: Curveball
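For anyone who wants to reproduce this kind of "best pitch per count" lookup outside of SPSS, here is a minimal Python sketch. It is not the workflow used for the plots above. The field names, the placeholder pitcher, the toy numbers, and the `min_pitches` cutoff are all hypothetical; "best" simply means the lowest wOBA allowed.

```python
def best_pitch_by_count(rows, min_pitches=1):
    """Map (pitcher, count) -> (pitch_type, wOBA allowed) with the lowest wOBA.

    Rows with fewer than min_pitches pitches are ignored so that tiny
    samples cannot win by default. Ties keep the first row encountered.
    """
    best = {}
    for row in rows:
        if row["pitches"] < min_pitches:
            continue
        key = (row["pitcher"], row["count"])
        if key not in best or row["woba"] < best[key][1]:
            best[key] = (row["pitch_type"], row["woba"])
    return best


# Toy rows for a placeholder pitcher; these are not real results.
rows = [
    {"pitcher": "Pitcher A", "count": "0-0", "pitch_type": "Fastball",
     "pitches": 120, "woba": 0.310},
    {"pitcher": "Pitcher A", "count": "0-0", "pitch_type": "Curveball",
     "pitches": 30, "woba": 0.280},
    {"pitcher": "Pitcher A", "count": "3-0", "pitch_type": "Changeup",
     "pitches": 1, "woba": 0.000},
    {"pitcher": "Pitcher A", "count": "3-0", "pitch_type": "Fastball",
     "pitches": 25, "woba": 0.450},
]

result = best_pitch_by_count(rows, min_pitches=10)
for (pitcher, count), (pitch, value) in sorted(result.items()):
    print(f"{pitcher} | Best pitch on {count}: {pitch} ({value:.3f} wOBA allowed)")
```

With the cutoff in place, the single 3-0 changeup with a .000 wOBA allowed is ignored; without it, that one pitch would be reported as the best option on 3-0.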
Future Research
Although not included in this project, I can envision two areas where future research can be directed:
- Compare more metrics such as whiff rate or velocity
- Obtain multiple seasons of data for individual pitchers