I’ve always been fascinated by the idea of making money on sports betting markets. In this post, I’ll analyze historical NBA betting data to see how profitable a **value betting** strategy can be. By the end, I’ll simulate a few betting strategies and show you how profitable they could be, all without referring to ANY basketball knowledge!

Even though you don’t need to be an NBA expert, this post assumes you have some knowledge of sports betting concepts and terminology. If you aren’t familiar with them, check out this short guide for a primer.

## How accurate are sportsbooks’ odds?

In theory, the implied probability of winning your bet should be identical to the probability of your team winning the game. In reality, the sportsbooks don’t always set their odds that way. Instead, they invite action on both sides so the amount of money at stake is balanced, thus reducing their risk and maximizing their profit. If sportsbooks have too much money on one team, they may move the odds away from the game’s “true” win probabilities.

For example, let’s say the Milwaukee Bucks (the team with the best record in 2019-20) are playing against the New York Knicks (a team with… *not* the best record). If bettors rush to bet on the Bucks, the odds could swing very heavily in their favor. The sportsbooks may choose to shift the moneyline odds so that a Bucks win would pay out less. At the same time, the payout for a Knicks win would increase.

These payouts could swing such that the implied probability of a Bucks win is 90%, when in reality it might only be 80%. In this case we have a **value bet**: the Knicks are priced as if they win 10% of the time, but their real chance is closer to 20%, so betting on the Knicks would be profitable in the long run.
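To make the arithmetic concrete, here is a minimal sketch of the expected profit on a $100 bet in the Bucks/Knicks example. It assumes the odds carry no sportsbook margin, and the function name `expected_profit` is just my own illustration.

```python
def expected_profit(stake, implied_prob, true_prob):
    """Expected profit of one bet, assuming the payout is priced exactly
    at implied_prob (no sportsbook margin) and the real win chance is true_prob."""
    profit_if_win = stake * (1 / implied_prob - 1)  # net winnings on a win
    return true_prob * profit_if_win - (1 - true_prob) * stake


# Knicks priced at 10% but really winning 20% of the time
knicks_ev = expected_profit(100, implied_prob=0.10, true_prob=0.20)

# Bucks priced at 90% but really winning 80% of the time
bucks_ev = expected_profit(100, implied_prob=0.90, true_prob=0.80)

print(round(knicks_ev, 2), round(bucks_ev, 2))
```

Under these assumptions, the Knicks bet has a positive expected profit and the Bucks bet has a negative one, even though the Bucks win far more often.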

Of course, the big challenge that all bettors face is finding the true win probability of each bet AND figuring it out before everyone else does. It’s impossible to know these win probabilities for certain, but can we find games where the **implied odds are more likely to be inefficient**? This could help us spot opportunities for value bets.

## Step 1: Getting data

First things first, we need a large dataset for our analysis. We’ll need data for the following:

- The betting odds of each game. We use this info to determine which team to bet on, and if we win the bet, the amount of our winnings.
- The winner of each game. We need to know whether we won our bets or not.

I ended up using Sportsbook Review, a site that aggregates historical betting odds from many different sportsbooks. From there, I found an open-source repository with a script that can effectively scrape betting data from any historical NBA game. I modified the script to also scrape the final score of each game and to run for an entire NBA season, which I did for three regular seasons: 2017-18, 2018-19, and 2019-20 (only including games before the pandemic suspended the season). Here’s a snapshot of what the dataset looks like:

There are a lot of different ways to place bets on NBA games, like moneylines, point spreads, point totals, etc. For the sake of this post, I’ll focus exclusively on the profitability of moneylines. Similarly, there are many different sportsbooks you can use to bet. For the sake of simplicity, I will focus exclusively on Pinnacle, which is regarded as having some of the most accurate odds in the industry.

## Step 2: Finding differences in implied probability

The goal here is to examine the implied odds of past NBA games and determine if they’ve been historically accurate. If there are big discrepancies, then there could be an opportunity to make money.

I calculated the implied win probability for each bet based on their moneyline odds. You might notice that the sum of the win probabilities in each game is greater than one, which shouldn’t be possible! However, sportsbooks do this on purpose to profit from the total betting action. To adjust for this, I normalized the win probabilities to add up to one, which results in the REAL implied win probability for each bet.
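Here is a rough sketch of that calculation. The conversion from American moneylines is the standard one; dividing each side by the sum is the simplest way to normalize and is my assumption about the adjustment. The -250/+220 game is made up for illustration.

```python
def implied_probability(moneyline):
    """Convert American moneyline odds to the raw implied win probability."""
    if moneyline < 0:
        # Favorite: risk |moneyline| to win 100
        return -moneyline / (-moneyline + 100)
    # Underdog: risk 100 to win moneyline
    return 100 / (moneyline + 100)


def normalize(prob_a, prob_b):
    """Rescale two raw implied probabilities so they sum to one."""
    total = prob_a + prob_b
    return prob_a / total, prob_b / total


# Hypothetical game: favorite at -250, underdog at +220
raw_fav = implied_probability(-250)
raw_dog = implied_probability(220)
overround = raw_fav + raw_dog - 1  # the excess above 100% is the book's margin
fair_fav, fair_dog = normalize(raw_fav, raw_dog)

print(round(raw_fav + raw_dog, 4), round(fair_fav, 4), round(fair_dog, 4))
```

The raw probabilities add up to a bit more than one, and after normalizing they add up to exactly one.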

Next, I created “bins” so that all bets with similar implied win probabilities are grouped together. Why do this?

Suppose we have a bet with an implied win probability of 11.7% (or +755 moneyline). It’s hard to find many other bets that have this exact moneyline. But if we include it in a bin of all bets from 10% to 15%, then we have quite a few data points to look at in each bin. We then calculate each bin’s *actual win rate* (number of real-life wins divided by total number of games) and *expected win rate* (average implied win probability of all bets in the bin).

From there, we can take the difference between the actual win rates and expected win rates – which I’ll call the residual – and see if there are any large discrepancies. Here are the results when dividing all bets into 20 bins. Each bin covers an implied win probability interval of about 5 percentage points.
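The binning step can be sketched in plain Python like this. The function name, the `(implied_prob, won)` input format, and the eight sample bets are all made up for illustration; this is not the real dataset or the notebook's actual code.

```python
from collections import defaultdict


def bin_residuals(bets, n_bins=20):
    """Group (implied_prob, won) pairs into equal-width bins and compare
    each bin's actual win rate with its expected win rate."""
    grouped = defaultdict(list)
    for prob, won in bets:
        # A probability of exactly 1.0 is placed in the last bin
        idx = min(int(prob * n_bins), n_bins - 1)
        grouped[idx].append((prob, won))

    results = {}
    for idx in sorted(grouped):
        rows = grouped[idx]
        expected = sum(p for p, _ in rows) / len(rows)
        actual = sum(1 for _, w in rows if w) / len(rows)
        results[idx] = {
            "count": len(rows),
            "expected": expected,
            "actual": actual,
            "residual": actual - expected,
        }
    return results


# Made-up sample: four big underdogs and four big favorites
sample = [
    (0.117, True), (0.12, False), (0.14, False), (0.13, False),
    (0.88, True), (0.86, True), (0.87, False), (0.883, True),
]
residuals = bin_residuals(sample)
for idx, row in residuals.items():
    print(idx, row["count"], round(row["expected"], 3), round(row["residual"], 3))
```

A positive residual means the teams in that bin won more often than their odds implied, and a negative residual means they won less often.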

It turns out that the implied win probabilities (and therefore the moneylines) are pretty accurate! In general, the actual and expected win rates don’t differ by more than 5 percentage points. However, there is a slight negative correlation between residual and expected win rate. It appears that huge underdogs have been slightly underrated, while huge favorites have been slightly overrated.

## Step 3: Simulating the strategy

Now, it’s time to put my (imaginary) money where my mouth is. In the last section, we found that huge underdogs might actually be slightly undervalued. What happens if we simulate betting on underdogs over the last three NBA seasons? We’ll backtest with actual game results and Pinnacle moneylines from the 2017-18 to 2019-20 seasons.

I wrote a function that simulates a betting strategy and tracks our winnings over time. We first must set a bet amount, which will be $100 every time for the sake of simplicity. We must also set a “win probability threshold,” which determines the underdog teams we’ll bet on. If we set it to 0.5, then we bet on any team with a win probability less than 50% (aka the underdog of every game). If we set it to 0.2, then we only bet on the big underdogs of lopsided games, where the win probability is less than 20%.
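A stripped-down version of such a simulator might look like the sketch below. It assumes each bet is a `(win_prob, moneyline, won)` tuple in chronological order; the names and the five sample bets are hypothetical, and the function in my notebook may differ in its details.

```python
def win_profit(moneyline, stake):
    """Net winnings on a successful bet at American moneyline odds."""
    if moneyline > 0:
        return stake * moneyline / 100
    return stake * 100 / -moneyline


def simulate_underdog_bets(bets, threshold, stake=100.0):
    """Bet a flat stake on every team whose implied win probability is
    below threshold, and return the running profit after each bet placed."""
    profit = 0.0
    history = []
    for win_prob, moneyline, won in bets:
        if win_prob >= threshold:
            continue  # not enough of an underdog, so skip this bet
        profit += win_profit(moneyline, stake) if won else -stake
        history.append(profit)
    return history


# Made-up bets: (implied win probability, moneyline, did the team win?)
sample_bets = [
    (0.12, 755, False),
    (0.85, -800, True),
    (0.18, 450, True),
    (0.45, 120, False),
    (0.15, 550, False),
]
big_underdogs = simulate_underdog_bets(sample_bets, threshold=0.2)
all_underdogs = simulate_underdog_bets(sample_bets, threshold=0.5)
print(big_underdogs)
print(all_underdogs)
```

The last value of the returned list is the total profit, and plotting the whole list gives the winnings-over-time curve.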

We’re ready to go now. What happens if we run the simulation with a threshold of 0.5, which places a $100 bet on the underdog of every game?

Oh no, we lost money! We **lost $3,903** after three seasons, with an especially brutal stretch from bets 600 to 1,000.

It’s worth noting that if the odds are accurate, the expected value of every bet is negative. As I mentioned earlier, the sportsbook takes a cut of every bet through the vigorish or “vig.” Pinnacle has a vig of about 2-3%, which is actually quite low. Still, a strategy only turns a profit if its edge over the implied odds is bigger than that cut!

Next, let’s try the *exact opposite* strategy and bet on the *favorite* of every game. After a tiny tweak to my simulator function, we get the following results over time:

What an absolute nightmare! With this strategy, we **lost $10,352**. Comparing this graph with the previous one, we can see that the trends move in opposite directions (as they should), but the winnings are completely outweighed by the magnitude of the losses.

Finally, let’s test our hyped-up strategy of betting on huge underdogs. What happens if we run the simulation with a threshold of 0.2?

We make a **total profit of $7,182**!! Not bad at all!

One really important caveat is that we were down nearly $3,000 before the strategy turned a profit. In order to survive with this strategy, you will need to have a large bankroll and/or make small-sized bets. Otherwise, it’d be easy to go on a long losing streak and completely run out of cash.
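One way to put a number on that risk is the maximum drawdown, meaning the largest drop from a previous high point in the running profit. This is a minimal sketch with a made-up profit history, not the real simulation output.

```python
def max_drawdown(history):
    """Largest peak-to-trough drop in a running profit series.
    The peak starts at zero, the profit before any bet is placed."""
    peak = 0.0
    worst = 0.0
    for value in history:
        peak = max(peak, value)
        worst = max(worst, peak - value)
    return worst


# Made-up running profit: three straight losses, one big win, two more losses
sample_history = [-100.0, -200.0, -300.0, 455.0, 355.0, 255.0]
drawdown = max_drawdown(sample_history)
print(drawdown)
```

Your bankroll needs to comfortably exceed this figure, since a future losing streak could easily be worse than anything in the backtest.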

If you’d like to check out my complete Jupyter notebook, you can find that here.

## Conclusion

Based on this analysis, there may be a profitable strategy by betting on big-time underdogs. You could think of every bet as a lottery ticket with a high likelihood of losing but a large upside.

I do think it’s feasible that these underdogs are relatively “underpriced” while the heavy favorites are “overpriced.” There could be a psychological explanation: a lot of bettors might prefer to win more *often*, even at the cost of long-term monetary returns.

In summary, this underdog strategy requires patiently enduring long losing streaks, placing small-sized bets, and having a big enough bankroll. However, if the bottom-feeders of the NBA can pull off enough rare Ws, you just might be able to make cash.

And by the way, the Knicks already beat the Bucks this season, despite having a 12% implied win probability! Perhaps a sign of things to come.

Thank you for taking the time to read my article! Please keep in mind that there are disclaimers that come with my findings:

- This post is meant for cultivating gambling and data science knowledge. Use this info at your own risk!
- I have never wagered any of my own, real-life money using this strategy (yet).
- Despite hearing very positive reviews, I have never used Pinnacle.
- Even though it’s useful for simulations, backtesting is by no means a guarantee for future outcomes.
- The SBR website is missing odds data entirely from 1/4/18 to 1/18/18, so there are about two weeks of games that aren’t considered in my analysis.