
January 29, 2015

Proceedings of the Natural Institute of Science | Volume 2 | HARD 2

A quest to find some absurd yet accurate predictors of Super Bowl success

PNIS Editorial Staff1
1 - Editor, PNIS


Two days ago, we determined the probability of a team winning the Super Bowl based on its ranking in 24 criteria. For example, the team with the better regular season record won the Super Bowl 62.5% of the time. See full article here.

In that article, we focused exclusively on criteria that were relevant to a team’s performance, such as Total Points Allowed, Total Points Scored, and Total Yards Allowed on Defense (see Table 1 here). According to our numbers, the two criteria that best predicted a team’s chances of winning the Super Bowl were 1st Downs Allowed on Defense and Defensive Simple Rating System (DSRS). The team ranked higher in either criterion went on to win the Super Bowl 66.7% of the time.

This exercise got us thinking about the predictive success of non-football-relevant criteria. You should be able to compare two teams in just about anything and then determine how that comparison predicted Super Bowl success.

Thus, the goal of this paper is to find absurd and irrelevant comparisons of Super Bowl teams. Could any of them actually be accurate predictors of Super Bowl success? And, more importantly, could we find criteria that performed better than the 66.7% benchmark we set with the relevant data? The hunt was on.

The Predictors
Just like in our prior article, we looked for 24 criteria. The stipulations for these criteria were that: 1) the data must exist, and 2) they should ideally have nothing to do with playing football. We should also point out that we did not determine the accuracy of these predictors until after we collected all the data, and we did not get rid of any predictors because they turned out to be of poor quality. The 24 predictors are, ranked in order of absurdity from least to most:

• Average team height[1]

• Length of starting quarterback’s full name – middle initials not included

• Sum of the retired jersey numbers of all players who retired before the particular Super Bowl game was played

• Distance of the team’s stadium from the Super Bowl Venue – measured as straight line distance

• Date on which the team’s city was incorporated

• Number of sister cities of the team’s city – from: Sister Cities International

• Racial diversity for the team’s city

• Average area code for the team’s city

• Bible-mindedness of the team’s city - based on the rankings given by the American Bible Society

• Percent of vote that the team’s US State gave to the victor in the most recent US presidential election

• Average of the Red Green Blue (RGB) number codes of the team’s main colors

• Opinion of a person that we can attest knows nothing about football

• Color that matches the color from this random color generator from Random.org

• Number of Facebook likes given to the team’s home page on pro-football-reference.com

• Cost to sponsor the team’s home page on pro-football-reference.com

Okay, these next few are a bit involved, so stay with us here:

• In real life, would the team name (e.g., the “Cowboys”) own or use a gun? – After much discussion, the team names that could conceivably have used a gun are: 49ers, Patriots, Buccaneers, Raiders, Cowboys, Bills, Redskins, Chiefs, and Jets.

• Current population size of the team name – For some teams, we made slight substitutions: Seahawks (we used the current population size of ospreys), Broncos (feral horses), Patriots (Tea Party members), Colts (domesticated horses), Buccaneers (pirates), Rams (bighorn sheep), and Packers (meatpackers). We considered 49ers, Giants, Raiders and Titans as extinct.

• Genome size of the team name – measured in c-value, obtained from the Animal Genome Size Database. Substitutions here included: Broncos & Colts (horse), Seahawks (sharp-shinned hawk), Panthers (cougar) and Rams (sheep).

• Normality of the points scored by the team in each regular season game – Normality was assessed using the Shapiro-Wilk test, and we used exact P-values to decide which team was more/less normal (the sample sizes are equal, so don’t freak out about us using P-values for comparison purposes)

• Alphabetical distance of the Coach’s last name to the last name of the Nobel Literature Prize winner of that particular year – for example, how did the Coaches of the 2013 Super Bowl teams match up with Alice Munro, alphabetically?

• Weight of the player with the jersey number that was closest to the absolute value of the temperature anomaly (in deg C) for that particular year – climate anomalies available here

• Alphabetical distance of the team name to the element whose atomic number corresponded to whatever number Super Bowl it was – for example, Super Bowl I would match with Hydrogen.

• Atomic mass of the element whose symbol is closest alphabetically to the Coach’s initials

If we didn’t provide a URL, then the information was gathered from a combination of pro-football-reference.com and Wikipedia.
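As an illustration of how one of the more involved criteria might be computed, here is a minimal sketch of the team-name-to-element comparison. The exact definition of "alphabetical distance" is not spelled out in the list above, so this sketch assumes it is the gap between first letters; the function names and the truncated element table are hypothetical.

```python
# Truncated lookup: atomic number -> element name (extend as needed).
ELEMENTS = {1: "Hydrogen", 2: "Helium", 3: "Lithium"}


def first_letter_distance(a, b):
    """ASSUMED metric: number of alphabet positions between first letters."""
    return abs(ord(a[0].upper()) - ord(b[0].upper()))


def closer_team(super_bowl_number, team_a, team_b):
    """Return the team name alphabetically closer to the matching element.

    Returns None on a tie, mirroring the rule that ties are omitted.
    """
    element = ELEMENTS[super_bowl_number]
    dist_a = first_letter_distance(team_a, element)
    dist_b = first_letter_distance(team_b, element)
    if dist_a == dist_b:
        return None
    return team_a if dist_a < dist_b else team_b


# Super Bowl I matches Hydrogen; P is 8 letters from H, C is 5.
print(closer_team(1, "Packers", "Chiefs"))  # -> Chiefs
```

Under this assumed metric the criterion would favor the Chiefs in Super Bowl I, a game the Packers won. A finer-grained metric (for example, comparing whole words rather than first letters) could be substituted without changing the structure.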

We then determined whether being greater or lesser in each criterion led to more Super Bowl wins (for example, did taller teams or shorter teams win the highest percentage of Super Bowls?). Ties were omitted. Table 1 shows the resulting predictive ability of all 24 criteria.

Table 1. Quality of the 24 criteria used for rankings. Percentage refers to the percent of the time the higher-ranked team won the Super Bowl. Numbers in parentheses are the number of times the higher-ranked team won and the number of instances after accounting for ties.
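The scoring procedure behind these numbers (compare the two teams, drop ties, then keep whichever direction wins more often) can be sketched roughly as follows. This is a minimal illustration rather than the code actually used; the function name and the sample heights are made up.

```python
def predictor_accuracy(games):
    """Score one criterion.

    games: list of (winner_value, loser_value) pairs, one per Super Bowl.
    Returns (direction, hits, decided), where direction is "greater" or
    "lesser", hits is how often that direction picked the winner, and
    decided is the number of games left after omitting ties.
    """
    decided = [(w, l) for w, l in games if w != l]  # ties are omitted
    greater_hits = sum(1 for w, l in decided if w > l)
    lesser_hits = len(decided) - greater_hits
    if greater_hits >= lesser_hits:
        return "greater", greater_hits, len(decided)
    return "lesser", lesser_hits, len(decided)


# Hypothetical average heights (inches) for winner and loser in four games.
sample = [(74.1, 73.8), (73.5, 74.0), (74.2, 74.2), (74.6, 73.9)]
print(predictor_accuracy(sample))  # -> ('greater', 2, 3)
```

Because the direction is chosen after looking at the results, a criterion scored this way can never come out below 50%.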

The Results
The main conclusion is that we did it!! We were able to find 4 criteria that were better than the 66.7% standard set by the football-relevant criteria. However, two of these criteria (Number of Facebook Likes and Cost to Sponsor Homepage) are extremely suspect. It’s clear that people were liking a particular team’s homepage on pro-football-reference.com because that team had won the Super Bowl. We’re proud to say we figured this out when we realized Facebook didn’t exist until 2004. Likewise, the cost to sponsor a page on pro-football-reference.com is probably dependent on the number of views that page gets, which again is probably dependent on whether that team won the Super Bowl. So, we can throw out our top 2 predictors.

But there’s nothing suspect about our next two: genome size and alphabetical distance of team name to element name. It turns out that the team name with the larger genome wins almost 70% of the Super Bowls. Thus, we highly recommend that the next NFL expansion city choose the ‘Marbled Lungfish’ as its team name.

Also, if your team name is closer alphabetically to the element whose atomic number matches the Super Bowl number, then you have a 68.8% chance of winning the Super Bowl. Tennessee Titans fans just got a glimmer of hope for next year.

Most of the other predictors, though, fell between 50 and 60%. The average accuracy of these absurd predictors (without the Facebook likes and Cost to Sponsor criteria) was 57.2%, while our average for the football-relevant predictors was 59.2%, so overall we did a bit worse.

The larger message, though, is that you can find an accurate predictor of Super Bowl success in just about anything. In fact, we wouldn’t be surprised if there were a criterion that has predicted all 48 Super Bowls correctly. In a way, our paper is somewhat conceptually similar to this Sasquatch niche modelling paper, which cautions researchers against using just any old data to build ecological niche models.
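To put a rough number on this, the sketch below computes how likely it is that a pure coin-flip criterion clears a given hit count when its direction (greater or lesser) is picked after the fact, and how likely it is that at least one of several such criteria does so. It assumes no ties, a fair coin per game, and independent criteria, none of which holds exactly for the real predictors; the function names are hypothetical.

```python
from math import comb


def tail_prob_either_direction(n, k):
    """P(max(X, n - X) >= k) for X ~ Binomial(n, 1/2).

    This is the chance that a coin-flip criterion gets at least k of n
    games right once the better of its two directions is chosen.
    """
    if 2 * k <= n:
        return 1.0  # the better direction always gets at least half
    upper = sum(comb(n, i) for i in range(k, n + 1))
    return 2 * upper / 2 ** n  # the two tails are disjoint and symmetric


def prob_at_least_one(n, k, num_predictors):
    """Chance that at least one of several independent coin-flip criteria
    reaches k hits out of n."""
    p = tail_prob_either_direction(n, k)
    return 1 - (1 - p) ** num_predictors


# 33 of 48 games is the smallest count that beats the 66.7% benchmark.
single = tail_prob_either_direction(48, 33)
any_of_24 = prob_at_least_one(48, 33, 24)
print(f"one random criterion: {single:.4f}")
print(f"at least one of 24:   {any_of_24:.4f}")
```

The gap between the two printed values is the multiple-comparisons effect: the more criteria you try, the more likely it is that something meaningless looks impressive.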

The even larger takeaway from this study, however, is that we have got to get our Person-who-Knows-Nothing-about-Football out to Vegas. We may or may not have uncovered the next Balki.

[1] Okay, this might have something to do with playing football, but no one ever bets on games with average height in mind.

Proceedings of the Natural Institute of Science (PNIS) by http://pnis.co is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.