NFL Data Science Project Ideas

As I conclude my data science bootcamp with General Assembly, I wanted to take a moment to share some project ideas that I hope to undertake in the coming weeks.

As a lifelong football player and fan, I am particularly interested in football analytics, so I narrowed the scope of my ideas to this subject. Over the last 10 years, the amount of data available for NFL teams to analyze has grown enormously. Amazon Web Services now powers the NFL’s “Next Gen Stats.”

By placing radio frequency identification tags in each player’s shoulder pads, as well as inside the balls used during games, a tracking system can measure data such as “location, speed, distance traveled and acceleration at a rate of 10 times per second” and chart individual movements within inches. According to the NFL, upwards of 200 new data points are created for every single play of every game.

Below are three ideas that I think would be interesting to study from the perspective of a data scientist:

1. Which defensive coordinators are best at disguising their coverages? Coordinators try to confuse opposing quarterbacks by disguising their coverages. Next Gen Stats can be used to create classification models that identify the type of coverage that a defense is playing (single coverage, cover-2, quarters, deep 3rds, etc.). I think it would be interesting to look at which coordinators succeed the most at fooling a machine learning classification algorithm (i.e., for which coordinators do these models perform the worst?).

2. Route identifier. Can I build a model that takes in player location data and accurately tells me which route that player ran on a passing play? This would be a relatively straightforward multiclass classification project that I could work to refine and make as accurate as possible.

3. Run/pass offensive play-call predictor. Here, I would use player location data to identify the offensive formation and personnel grouping, then combine that with down-and-distance information and past play-call history to build a binary classification model that could hopefully predict, with some degree of accuracy, whether the offense will call a pass or a run.
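
To make the first idea a bit more concrete, here is a minimal sketch of how I might rank coordinators once a coverage classifier already exists. It assumes each play has been reduced to a (coordinator, actual coverage, predicted coverage) tuple; the function names, the coordinator names, and the sample plays are all made up for illustration, and nothing here comes from real Next Gen Stats data.

```python
from collections import defaultdict


def accuracy_by_coordinator(plays):
    """Return the classifier's accuracy for each coordinator.

    `plays` is an iterable of (coordinator, actual, predicted) tuples.
    This input shape is an assumption made for this sketch.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for coordinator, actual, predicted in plays:
        total[coordinator] += 1
        if actual == predicted:
            correct[coordinator] += 1
    return {name: correct[name] / total[name] for name in total}


def rank_by_disguise(plays):
    """Sort coordinators from lowest to highest classifier accuracy.

    The coordinator the model handles worst comes first.
    """
    accuracy = accuracy_by_coordinator(plays)
    return sorted(accuracy.items(), key=lambda item: item[1])


# Placeholder plays, invented purely to show the calculation.
plays = [
    ("Coordinator A", "cover-2", "cover-2"),
    ("Coordinator A", "quarters", "quarters"),
    ("Coordinator A", "cover-2", "quarters"),
    ("Coordinator A", "deep 3rds", "deep 3rds"),
    ("Coordinator B", "cover-2", "quarters"),
    ("Coordinator B", "quarters", "cover-2"),
    ("Coordinator B", "deep 3rds", "deep 3rds"),
    ("Coordinator B", "cover-2", "deep 3rds"),
]

ranking = rank_by_disguise(plays)
for coordinator, accuracy in ranking:
    print(f"{coordinator}: {accuracy:.2f}")
```

In practice I would only score plays the model never saw during training, and I would be careful about reading too much into the ranking: a low accuracy could also reflect a small sample of plays or noisy coverage labels rather than good disguise.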

I am a Data Scientist at General Assembly. I hope to help others entering this field by sharing the wisdom, tips, and best practices I learned along the way.
