If you have any questions about the code here, feel free to reach out to me on Twitter or on Reddit.

Shameless Plug Section

If you like Fantasy Football and have an interest in learning how to code, check out our Ultimate Guide on Learning Python with Fantasy Football Online Course. Here is a link to purchase for 15% off. The course includes 16 chapters of material, 14 hours of video, hundreds of data sets, lifetime updates, and a Slack channel invite to join the Fantasy Football with Python community.

When evaluating end-of-season performance look at points scored, not rank

In this part of the intermediate series, we're gonna recap the 2020 season by learning a bit more about stacked bar plots, visualizing where each of the top 50 RBs on the season got the majority of their fantasy points, and talking about how to evaluate end-of-year performance for players.

This post is partly inspired by a Reddit post made to the fantasyfootball sub linked here. The TLDR of the post is basically that to properly evaluate player performance within a position, you should look at total points scored, not rankings. Here's an example that shows why this is probably wise.

Alvin Kamara, Dalvin Cook, and Derrick Henry were the RB1, RB2, and RB3 this season (in PPR). They scored 378, 338, and 333 points, respectively. The RB4 was David Montgomery, and he scored 265 points. That's 68 points behind Henry (4.25 points per game), 73 behind Cook (4.56/game), and 113 points behind Kamara (7.06/game!). Obviously it isn't fair to compare Kamara to Montgomery the way you could compare almost any other pair of players whose rank differential was 3 (e.g. Kareem Hunt to Antonio Gibson: 3 rank differential, 1.06 points/game differential).
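The arithmetic above can be reproduced with a few lines of plain Python. This is just a sketch: the point totals come from the paragraph, while the helper name `gap_per_game` and the choice to average over 16 games are my own assumptions for illustration.

```python
# PPR season totals quoted in the paragraph above.
ppr_points = {
    "Alvin Kamara": 378,
    "Dalvin Cook": 338,
    "Derrick Henry": 333,
    "David Montgomery": 265,
}

GAMES = 16  # assumption: gaps are averaged over a 16-game season


def gap_per_game(better, worse, games=GAMES):
    """Return (total point gap, point gap per game) between two players."""
    gap = ppr_points[better] - ppr_points[worse]
    return gap, round(gap / games, 2)


for name in ("Derrick Henry", "Dalvin Cook", "Alvin Kamara"):
    gap, per_game = gap_per_game(name, "David Montgomery")
    print(f"{name} over Montgomery: {gap} points ({per_game}/game)")
```

The same three-rank gap is worth 113 points at the top of the board and far less further down, which is the whole argument for looking at points instead of rank.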

As we'll see below, a horizontal bar plot is a good way to visualize how much better a player was than the next-best ranked player. We'll be doing this analysis for RBs, and we'll see that the top three running backs (Alvin Kamara, Dalvin Cook, and Derrick Henry) are in their own tier, while everyone else lags behind. The rest of the RBs tend to cluster together in terms of their performances. This means that the Reddit post was right, and most right when comparing top-tier players to mid-tier players.

In short, it was super valuable to own any of Kamara, Cook, or Henry this season, while you could have owned any of the RBs in the 5 - 15 range and it wouldn't have made a huge difference (exception below).

Of course, we're always thinking about player value when it comes to fantasy football. You paid a price for Kamara, Cook, and Henry - specifically, a first round pick in redraft. /r/fantasyfootball poster boy James Robinson cost a week one waiver claim in most leagues, or 10% FAAB. So even though Robinson finished RB7 on the season, and he didn't provide a huge marginal benefit over other RBs like Kamara/Cook/Henry did, you still were able to pick up an RB with second round value for basically free. That's extremely valuable in its own right, for obvious reasons.

The post caught some flak for looking at total points scored and not points per game (not accounting for injuries, basically). Sure, there were players who missed some time and thus finished the season much lower in rank and total points than they would have if they had played a full season. When McCaffrey played this season, he put up definitive RB1 numbers, but the fact remains that he hardly played. I would rather have had David Montgomery this season than McCaffrey, even though CMC would certainly have been in the same tier as Cook/Kamara/Henry had he stayed healthy.
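To make the total-versus-per-game distinction concrete, here is a small sketch that ranks the top three RBs both ways. The PPR totals and games played are taken from the tables in this post; the variable names are mine, and no numbers are included for players whose stats aren't shown here.

```python
# (PPR points, games played) for the top three RBs, from the tables in this post.
season = {
    "Alvin Kamara": (378, 15),
    "Dalvin Cook": (338, 14),
    "Derrick Henry": (333, 16),
}

# Points per game rewards players who missed time; totals reward availability.
points_per_game = {
    name: round(points / games, 2) for name, (points, games) in season.items()
}

by_total = sorted(season, key=lambda name: season[name][0], reverse=True)
by_ppg = sorted(points_per_game, key=points_per_game.get, reverse=True)

print("By total points:   ", by_total)
print("By points per game:", by_ppg)
print(points_per_game)
```

For these three the order happens to be the same either way, but the per-game gap between Cook and Henry is noticeably wider than the gap in totals because Cook played two fewer games.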

Finally, be aware that these are end-of-season rankings, including week 17. I'll try to put together a data set with weekly data for 2020 so you can run this analysis up to week 16, if you'd like. There are some obvious problems with doing this sort of analysis after week 17 and not week 16, namely that certain players don't play week 17 because teams rest their starters, and a lot of fantasy seasons end in week 16.

With all the disclaimers out of the way, let's write some code.

Into writing the code

We're going to be using a CSV file I've compiled with 2020 fantasy football stats. Data like this is available going back to 1970 through a GitHub repository I maintain here.

We'll need two libraries for this project, pandas and matplotlib. We import them in the first cell of our notebook environment.

Next, we load our data into pandas using the read_csv function.
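A minimal sketch of those first two cells might look like the following. The file name in the comment is hypothetical, and to keep the example self-contained I'm feeding read_csv a handful of rows copied from the output below instead of the full CSV.

```python
import io

import pandas as pd
from matplotlib import pyplot as plt  # used later for the plot

# Assumption: in the real notebook you would point read_csv at the 2020 CSV,
# e.g. df = pd.read_csv("2020.csv")  # hypothetical file name
# Here a few rows from the post stand in for the file.
sample_csv = """Player,Tm,Pos,Age,G,GS,Tgt,Rec,PassingYds,PassingAtt,RushingYds,RushingTD,RushingAtt,ReceivingYds,ReceivingTD,FantasyPoints,Fumbles,FumblesLost
Derrick Henry,TEN,RB,26.0,16,16,31,19,0,0,2027,17,378,114,0,314.0,3,2
Alvin Kamara,NOR,RB,25.0,15,10,107,83,0,0,932,16,187,756,5,295.0,1,0
Dalvin Cook,MIN,RB,25.0,14,14,54,44,0,0,1557,16,312,361,1,294.0,5,3
Davante Adams,GNB,WR,28.0,14,14,149,115,0,0,0,0,0,1374,18,243.0,1,1
Travis Kelce,KAN,TE,31.0,15,15,145,105,4,2,0,0,0,1416,11,208.0,1,1
"""

df = pd.read_csv(io.StringIO(sample_csv))

# Peek at the first rows to confirm the columns loaded as expected.
print(df.head())
```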

	Player	Tm	Pos	Age	G	GS	Tgt	Rec	PassingYds	PassingAtt	RushingYds	RushingTD	RushingAtt	ReceivingYds	ReceivingTD	FantasyPoints	Fumbles	FumblesLost
0	Derrick Henry	TEN	RB	26.0	16	16	31	19	0	0	2027	17	378	114	0	314.0	3	2
1	Alvin Kamara	NOR	RB	25.0	15	10	107	83	0	0	932	16	187	756	5	295.0	1	0
2	Dalvin Cook	MIN	RB	25.0	14	14	54	44	0	0	1557	16	312	361	1	294.0	5	3
3	Davante Adams	GNB	WR	28.0	14	14	149	115	0	0	0	0	0	1374	18	243.0	1	1
4	Travis Kelce	KAN	TE	31.0	15	15	145	105	4	2	0	0	0	1416	11	208.0	1	1

Here, we're doing some basic data manipulation. We filter by position to grab only RBs, then keep just the columns we need.
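One way to write that filter is with a boolean mask and a column list passed to .loc. This is a sketch, not necessarily the exact code from the notebook: the small DataFrame below is a stand-in built from the rows shown above, and the column list is my reading of which columns the later steps use.

```python
import pandas as pd

# Stand-in for the loaded data: five rows from the table above, trimmed to
# the columns this step touches plus one (Tm) that gets dropped.
df = pd.DataFrame({
    "Player": ["Derrick Henry", "Alvin Kamara", "Dalvin Cook", "Davante Adams", "Travis Kelce"],
    "Tm": ["TEN", "NOR", "MIN", "GNB", "KAN"],
    "Pos": ["RB", "RB", "RB", "WR", "TE"],
    "RushingYds": [2027, 932, 1557, 0, 0],
    "RushingTD": [17, 16, 16, 0, 0],
    "Rec": [19, 83, 44, 115, 105],
    "ReceivingYds": [114, 756, 361, 1374, 1416],
    "ReceivingTD": [0, 5, 1, 18, 11],
    "FantasyPoints": [314.0, 295.0, 294.0, 243.0, 208.0],
})

# Columns assumed to be needed for the stacked bar plot.
columns = ["Player", "RushingYds", "RushingTD", "Rec",
           "ReceivingYds", "ReceivingTD", "FantasyPoints"]

# Keep only running backs, and only the columns listed above.
rb_df = df.loc[df["Pos"] == "RB", columns].reset_index(drop=True)
print(rb_df)
```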

I mentioned in the beginning of the post that we would not only be addressing the point/rank problem, but also visualizing where each RB got their points from. When I say where, I mean from which stat category (rushing yards, receiving yards, receptions, etc.). Luckily, we can visualize both these concepts at once with a stacked bar plot. We'll take each fantasy-relevant column and multiply it by its fantasy weight. The sum (or stack) of each of these fantasy-relevant columns multiplied by their respective fantasy weights is a player's fantasy points scored (receptions + total yards * 0.1 + total TDs * 6 - fumbles lost * 2). The stacked bar plot for each player will show us the proportion of fantasy points a player accrued from each fantasy-relevant stat category.
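Here is a sketch of that weighting step, using the three RBs from the earlier table. The weights follow the formula in the paragraph above. Computing PPRPoints as FantasyPoints plus receptions is my assumption, though it does match the numbers in the table that follows.

```python
import pandas as pd

# Raw season stats for three RBs, copied from the first table in this post.
rb_df = pd.DataFrame({
    "Player": ["Derrick Henry", "Alvin Kamara", "Dalvin Cook"],
    "RushingYds": [2027, 932, 1557],
    "RushingTD": [17, 16, 16],
    "Rec": [19, 83, 44],
    "ReceivingYds": [114, 756, 361],
    "ReceivingTD": [0, 5, 1],
    "FantasyPoints": [314.0, 295.0, 294.0],
})

# PPR scoring weights per stat category.
weights = {
    "RushingYds": 0.1,
    "RushingTD": 6,
    "Rec": 1,
    "ReceivingYds": 0.1,
    "ReceivingTD": 6,
}

# Assumption: FantasyPoints is standard scoring, so PPR adds one point per catch.
rb_df["PPRPoints"] = rb_df["FantasyPoints"] + rb_df["Rec"]

# Convert each raw stat into the fantasy points it was worth.
for column, weight in weights.items():
    rb_df[column] = (rb_df[column] * weight).round(1)

# Rank by PPR points and keep (up to) the top 50.
rb_df = rb_df.sort_values("PPRPoints", ascending=False).head(50).reset_index(drop=True)
print(rb_df)
```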

	Player	RushingYds	RushingTD	Rec	ReceivingYds	ReceivingTD	FantasyPoints	PPRPoints
0	Alvin Kamara	93.2	96	83	75.6	30	295.0	378.0
1	Dalvin Cook	155.7	96	44	36.1	6	294.0	338.0
2	Derrick Henry	202.7	102	19	11.4	0	314.0	333.0
3	David Montgomery	107.0	48	54	43.8	12	211.0	265.0
4	Aaron Jones	110.4	54	47	35.5	12	212.0	259.0
5	Jonathan Taylor	116.9	66	36	29.9	6	217.0	253.0
6	James Robinson	107.0	42	49	34.4	18	201.0	250.0
7	Josh Jacobs	106.5	72	33	23.8	0	198.0	231.0
8	Ezekiel Elliott	97.9	36	52	33.8	12	172.0	224.0
9	Kareem Hunt	84.1	36	38	30.4	30	181.0	219.0
10	Nick Chubb	106.7	72	16	15.0	0	192.0	208.0
11	Mike Davis	64.2	36	59	37.3	12	148.0	207.0
12	Antonio Gibson	79.5	66	36	24.7	0	166.0	202.0
13	Melvin Gordon	98.6	54	32	15.8	6	166.0	198.0
14	Nyheim Hines	38.0	18	63	48.2	24	130.0	193.0

Now, all that's left to do is plot the data. The rb_df we're left with has the top 50 RBs in terms of PPR fantasy points scored (you're welcome to do this sort of analysis for half PPR or standard). Moreover, we're left with data on how many fantasy points each player recorded in each relevant fantasy stat category. Adding all of these columns up for each player gives us their fantasy points scored, which is also (sort of, since fumbles lost aren't included) the length of that player's stacked bar.
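A stacked horizontal bar plot can be built by calling barh once per stat category and passing the running total as the left offset. The sketch below uses only the top five rows of the table above to stay short. The figure size, labels, and the Agg backend are my own choices, not necessarily what the original notebook used.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Weighted fantasy points per category for the top five RBs (from the table above).
rb_df = pd.DataFrame({
    "Player": ["Alvin Kamara", "Dalvin Cook", "Derrick Henry", "David Montgomery", "Aaron Jones"],
    "RushingYds": [93.2, 155.7, 202.7, 107.0, 110.4],
    "RushingTD": [96, 96, 102, 48, 54],
    "Rec": [83, 44, 19, 54, 47],
    "ReceivingYds": [75.6, 36.1, 11.4, 43.8, 35.5],
    "ReceivingTD": [30, 6, 0, 12, 12],
})
categories = ["RushingYds", "RushingTD", "Rec", "ReceivingYds", "ReceivingTD"]

fig, ax = plt.subplots(figsize=(8, 4))

# Each category starts where the previous ones ended, which creates the stack.
left = pd.Series(0.0, index=rb_df.index)
for category in categories:
    ax.barh(rb_df["Player"], rb_df[category], left=left, label=category)
    left = left + rb_df[category]

ax.invert_yaxis()  # put the top-ranked RB at the top of the chart
ax.set_xlabel("PPR fantasy points")
ax.set_title("Where the top RBs got their fantasy points (2020)")
ax.legend()
fig.tight_layout()
```

In a notebook, finishing the cell with plt.show() displays the figure. With the full rb_df you would want a much taller figure so that all 50 player names fit.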

And that's our visualization!

As we talked about earlier in the post, Kamara, Cook, and Henry all appear to be in their own tier. There's a huge gap between these three players and the next best RB, Montgomery.
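You can also put a number on that tier break by computing the gap between each RB and the next one down. A quick sketch using the PPR totals from the table above (top eight only, variable names mine):

```python
# PPR totals for the top eight RBs, in rank order, from the table above.
ppr = [
    ("Alvin Kamara", 378),
    ("Dalvin Cook", 338),
    ("Derrick Henry", 333),
    ("David Montgomery", 265),
    ("Aaron Jones", 259),
    ("Jonathan Taylor", 253),
    ("James Robinson", 250),
    ("Josh Jacobs", 231),
]

# Gap between each player and the next-best ranked player.
gaps = [
    (ppr[i][0], ppr[i + 1][0], ppr[i][1] - ppr[i + 1][1])
    for i in range(len(ppr) - 1)
]

for better, worse, gap in gaps:
    print(f"{better} -> {worse}: {gap} points")

# The largest drop marks the most obvious tier break.
largest = max(gaps, key=lambda g: g[2])
print("Biggest drop:", largest)
```

The drop from Henry to Montgomery is by far the largest in this group, while most of the gaps between RB4 and RB7 are in the single digits.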

There are a ton of interesting insights you can glean from this, and it kind of tells the story of each player's statistical performance for the season. For example, it's easy to see that Henry and Chubb accrued relatively few of their fantasy points via receptions, compared to Cook or Kamara.
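If you'd rather see that as numbers than as bar segments, the share of PPR points that came directly from receptions is easy to compute. A sketch using the Rec and PPRPoints columns from the table above (yardage and touchdowns on those catches are not counted here):

```python
# (points from receptions, total PPR points) from the table above.
players = {
    "Derrick Henry": (19, 333),
    "Nick Chubb": (16, 208),
    "Dalvin Cook": (44, 338),
    "Alvin Kamara": (83, 378),
}

# Percentage of each player's PPR points that came from the reception bonus.
reception_share = {
    name: round(100 * rec_points / total, 1)
    for name, (rec_points, total) in players.items()
}

for name, share in reception_share.items():
    print(f"{name}: {share}% of PPR points from receptions")
```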

Thanks for reading!

Learn Python with Fantasy Football: Stacked Bar Plots
