If you've been following Advanced Sports Analytics through the NBA season, you are likely familiar with the importance of player correlation when selecting a DFS lineup. For MLB DFS, we provide two correlation tools to help with lineup strategy. Lineup stacking is particularly common in baseball DFS, so a good understanding of which players make for an optimal stack is extremely useful.
The first tool is a correlation matrix for the players on each team. Below is an example of the Toronto Blue Jays player correlation matrix.
Optimal lineup stacks are usually made up of the players with the highest correlation. Highly correlated players tend to do well in tandem, while players with low or negative correlation tend to perform independently of each other or in opposition to each other. Unlike NBA lineups, where you see a lot of negative correlation, most baseball teams have positive correlation among their players. Even so, identifying which players have the strongest positive correlation can be advantageous. For example, if you were planning to build a team around Josh Donaldson, a more optimal stack might be Donaldson/Tulowitzki/Martin as opposed to Donaldson/Encarnacion/Bautista. One of the driving forces behind players' correlations is their spot in the batting order. For example, the 1 or 2 hitters might have a higher correlation with the 3-5 hitters, because the success of the top 2 hitters gives the 3-5 hitters more opportunities to create fantasy value through RBIs.
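To make the idea concrete, here is a minimal sketch of how a player correlation matrix could be computed and used to pick stack partners. It is not our production method. It assumes Pearson correlation on per-game fantasy points, and the player labels, point totals, and function names (pearson, correlation_matrix, best_stack_partners) are all hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant series
    return cov / sqrt(var_x * var_y)

def correlation_matrix(points_by_player):
    """Nested dict: matrix[a][b] is the correlation between players a and b."""
    names = list(points_by_player)
    return {
        a: {b: pearson(points_by_player[a], points_by_player[b]) for b in names}
        for a in names
    }

def best_stack_partners(matrix, anchor, n=2):
    """Return the n teammates most positively correlated with the anchor."""
    others = [(name, r) for name, r in matrix[anchor].items() if name != anchor]
    others.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in others[:n]]

# Hypothetical fantasy points, one entry per game, same games for every player.
points_by_player = {
    "Hitter A": [10.0, 4.0, 12.0, 6.0],
    "Hitter B": [8.0, 3.0, 11.0, 5.0],
    "Hitter C": [2.0, 9.0, 1.0, 7.0],
}

matrix = correlation_matrix(points_by_player)
partners = best_stack_partners(matrix, "Hitter A", n=1)
print(partners)
```

In this made-up sample, Hitter B rises and falls with Hitter A while Hitter C moves the opposite way, so Hitter B is the stronger stack partner. The sketch assumes every player appeared in the same games; real game logs would need to be aligned first.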
Batting Order Correlation
Because batting order drives so much of the correlation between players, we feel that relying solely on player correlations for stacking is an incomplete approach. Players sometimes correlate because of where they typically bat in the lineup, and they may correlate less on a given day if they are batting out of their usual spot. Conversely, the batting order correlation matrices can help target spot starters who, although they may have less pure talent than everyday starters, have favorable conditions when batting at a particular spot in the order.
Batting order correlations can also be useful early in the season, as many teams feature lineups with new faces who are not featured in the current player correlation matrices (as the data is 2016 data). Lineups are much more fluid in baseball than in basketball or football, so there is a lot of value in removing player names from correlation consideration, looking more closely at the batting order correlations, and then going back and filling in the players who are batting in each slot in a given game.
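The slot-first workflow described above can be sketched as follows. This is an illustration under stated assumptions, not our actual tool: it scores each group of batting-order slots by its average pairwise Pearson correlation, and the slot data, the lineup card, and the function names (pearson, best_slot_stack) are hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def best_slot_stack(points_by_slot, size=3):
    """Pick the group of slots with the highest average pairwise correlation."""
    def avg_corr(slots):
        pairs = list(combinations(slots, 2))
        total = sum(pearson(points_by_slot[a], points_by_slot[b]) for a, b in pairs)
        return total / len(pairs)
    return max(combinations(sorted(points_by_slot), size), key=avg_corr)

# Hypothetical fantasy points by batting-order slot, one entry per game.
# Only four slots are shown to keep the example short.
points_by_slot = {
    1: [9.0, 2.0, 11.0, 5.0, 7.0],
    2: [7.0, 3.0, 10.0, 4.0, 8.0],
    3: [8.0, 1.0, 12.0, 6.0, 6.0],
    8: [2.0, 6.0, 3.0, 7.0, 1.0],
}

# Hypothetical lineup card for today's game: slot -> player.
todays_lineup = {1: "Player W", 2: "Player X", 3: "Player Y", 8: "Player Z"}

# Step 1: choose slots without looking at names.
slots = best_slot_stack(points_by_slot, size=3)
# Step 2: fill in whoever is batting in those slots today.
stack = [todays_lineup[slot] for slot in slots]
print(slots, stack)
```

Because the names are attached only in the last step, a new signing or a spot starter batting second is treated like any other 2 hitter. Averaging pairwise correlations is just one simple way to score a stack; other scoring rules would fit the same workflow.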