Key Takeaways
- A good model produces calibrated probabilities, not just point predictions
- Handling uncertainty (variance in outcomes) matters as much as getting the average right
- The best models account for minutes, matchups, pace, role changes, and usage context
- Backtesting against historical data is the only honest way to evaluate a model
- Overfitting to recent results is the most common failure mode
Point Predictions Are Not Enough
Most casual analysis boils down to: "This player averages 22 points, so I'll bet the Over on 21.5." That's a point prediction, a single number.
The problem is that player performance isn't deterministic. A player projected for 22 points might score anywhere from 8 to 38 on a given night. The shape of that distribution (how wide it is, where the mass sits) determines whether a prop is actually worth betting.
A model that says "22 points" without telling you the probability of going over 21.5 is missing the most important part.
Calibration: The Gold Standard
Calibration is the single most important property of a prediction model. When a calibrated model says there's a 65% chance of the Over, the Over actually hits about 65% of the time.
This sounds obvious, but most models fail here. Common failure modes:
- Overconfident models assign 70-80% probabilities to outcomes that hit only 55% of the time
- Underconfident models cluster everything near 50%, providing no actionable signal
- Biased models systematically favor one side (always lean Over, or always lean Under)
How to Check Calibration
Group your model's predictions into probability buckets (50-55%, 55-60%, 60-65%, etc.) and compare predicted rates to actual hit rates. If the model says 60% and the actual rate is 58-62%, it's well-calibrated. If it says 60% but the actual rate is 50%, you have a problem.
This isn't something you can eyeball from a few days of results. Calibration requires hundreds of predictions per bucket to be meaningful. Small samples produce misleading conclusions.
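As a sketch, the bucketing check takes only a few lines of Python. The graded predictions here are fabricated for illustration (an intentionally overconfident model); with real data you would load your own (predicted probability, hit) pairs:

```python
import random
from collections import defaultdict

random.seed(7)

# Fabricated graded predictions: (predicted probability, hit?).
# The "true" probability is deliberately closer to 50% than the model
# claims, simulating an overconfident model.
predictions = []
for _ in range(5000):
    p_model = random.uniform(0.50, 0.80)
    p_true = 0.50 + 0.6 * (p_model - 0.50)   # real edge is smaller than claimed
    predictions.append((p_model, random.random() < p_true))

# Bucket by predicted probability in 5-point bands, then compare
# each band's claimed probability to its actual hit rate.
buckets = defaultdict(list)
for p, hit in predictions:
    buckets[int(p * 100) // 5 * 5].append(hit)

for lo in sorted(buckets):
    hits = buckets[lo]
    actual = 100 * sum(hits) / len(hits)
    print(f"{lo}-{lo + 5}%: actual hit rate {actual:.1f}% (n={len(hits)})")
```

With this fabricated data, the higher buckets hit well below their claimed rates, which is exactly the overconfidence pattern described above.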
Modeling Uncertainty Properly
Player stat outcomes follow probability distributions. A model needs to capture both the central tendency (the expected value) and the spread (how variable the outcome is).
Why Variance Matters
Consider two players both projected for 20 points:
- Player A is a high-volume scorer on a consistent team. His distribution is tight: most nights he scores between 16 and 24.
- Player B is a streaky bench player who gets inconsistent minutes. His distribution is wide: he might score 6 or 35.
If the line is 19.5, both players have roughly the same average projection. But the probability of going Over 19.5 could be very different depending on the shape of their distributions.
A model that only outputs "20 points" treats these two players identically. A model that outputs a full distribution captures the difference.
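To make the difference concrete, here is a minimal sketch using normal approximations. The means and standard deviations are invented for illustration, and real scoring distributions are discrete and right-skewed, so treat this as a toy model:

```python
from statistics import NormalDist

line = 19.5

# Two hypothetical players, both projected for 20 points.
player_a = NormalDist(mu=20, sigma=4)    # consistent starter, tight distribution
player_b = NormalDist(mu=20, sigma=9)    # streaky bench player, wide distribution

p_over_a = 1 - player_a.cdf(line)
p_over_b = 1 - player_b.cdf(line)
print(f"P(Over {line}) -- Player A: {p_over_a:.1%}, Player B: {p_over_b:.1%}")
```

With the same 20-point mean, the tighter distribution clears the 19.5 line slightly more often (about 55% versus 52%), because more of its mass sits near the mean, which is above the line. Move the line to 22.5 and the relationship flips: the wide distribution goes Over more often.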
Monte Carlo Simulation
One effective approach is Monte Carlo simulation: run thousands of simulated games, each with randomized inputs drawn from estimated distributions. The result is a simulated distribution of outcomes that naturally captures uncertainty.
This approach handles correlated variables well. A player's points depend on minutes, and minutes depend on game script, foul trouble, and blowout risk. Simulating these together preserves the dependency structure that point estimates ignore.
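A minimal sketch of the idea in Python, with every distribution and parameter invented for illustration:

```python
import random

random.seed(42)

def simulate_points(n_sims=50_000):
    """Toy Monte Carlo: points depend on minutes, so simulating them
    jointly preserves the dependency a point estimate ignores."""
    outcomes = []
    for _ in range(n_sims):
        minutes = random.gauss(32, 4)            # assumed minutes distribution
        if random.random() < 0.10:               # assumed 10% blowout risk
            minutes -= random.uniform(4, 10)     # minutes lost to garbage time
        minutes = max(minutes, 0)
        pts_per_min = random.gauss(0.62, 0.15)   # assumed scoring-rate distribution
        outcomes.append(max(minutes * pts_per_min, 0))
    return outcomes

sims = simulate_points()
line = 19.5
p_over = sum(x > line for x in sims) / len(sims)
print(f"Mean projection: {sum(sims) / len(sims):.1f} pts, P(Over {line}) = {p_over:.1%}")
```

The output is a full distribution, so the same simulation answers the Over/Under question for any line, not just 19.5.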
Features That Actually Matter
A model is only as good as its inputs. Here are the features that consistently drive prediction quality for player props:
Minutes Projection
Minutes played is the single strongest predictor of counting stats. A player who plays 36 minutes will almost always produce more than one who plays 24. Accurate minutes modeling, including variance from blowout risk, foul trouble, and rotation changes, is foundational.
Matchup Context
Not all opponents are equal. A guard facing a top-5 perimeter defense will score differently than one facing a bottom-5 defense. Matchup-specific adjustments at the position level improve accuracy for points, assists, and other stat types.
Pace and Game Environment
A game with a projected total of 235 will produce more counting stats than one projected at 205. Pace (possessions per game) directly affects how many opportunities a player has.
Role and Usage
A player's role can shift mid-season due to trades, injuries to teammates, or coaching decisions. Models that incorporate recent usage rates and detect role changes outperform those that rely solely on season-long averages.
Rest and Schedule
Back-to-back games, extended road trips, and rest days all affect performance. Fatigue effects are real, especially for minutes-dependent props.
What to Avoid
Recency Bias
A player who scored 35 last night isn't suddenly a 35-point scorer. Small samples are misleading. Strong models weight recent data appropriately without overreacting to one or two outlier games.
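One simple way to weight recent games without overreacting is an exponentially weighted average. The half-life and the game log below are illustrative choices, not a recommendation:

```python
def ewma_points(game_log, halflife=8):
    """Exponentially weighted average of a game log (most recent first).
    A game `halflife` games ago counts half as much as the latest one."""
    decay = 0.5 ** (1 / halflife)
    weights = [decay ** i for i in range(len(game_log))]
    return sum(w * p for w, p in zip(weights, game_log)) / sum(weights)

# Hypothetical last 10 games, most recent first: one 35-point outlier.
recent = [35, 18, 22, 21, 19, 24, 20, 17, 23, 22]
print(f"Weighted: {ewma_points(recent):.1f} pts, simple mean: {sum(recent) / len(recent):.1f}")
```

Here the 35-point outburst nudges the projection up by roughly half a point relative to the simple mean, instead of dominating it.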
Overfitting
A model tuned to perfectly match historical data will fail on new data. If a model has hundreds of features and rules but was trained on a few thousand games, it's probably memorizing noise rather than learning signal. Simpler models with fewer, more meaningful features tend to generalize better.
Ignoring the Line
A model that predicts outcomes but doesn't compare those predictions to the actual sportsbook line is incomplete. The question isn't "will this player go Over?" but "is the line set correctly?" Those are different questions.
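Converting the posted odds into an implied probability makes the comparison explicit. The model probability below is a made-up example:

```python
def implied_prob(american_odds: float) -> float:
    """Convert American odds to the sportsbook's implied probability (vig included)."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100)
    return 100 / (american_odds + 100)

# Hypothetical comparison: the model says 58% Over, the book offers -110.
model_p = 0.58
book_p = implied_prob(-110)
edge = model_p - book_p
print(f"Implied: {book_p:.1%}, model: {model_p:.1%}, edge: {edge:+.1%}")
```

At -110 the implied probability is about 52.4%, so the hypothetical model sees a positive edge. If the model instead said 52%, there would be no bet, even if the player is more likely than not to go Over.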
How Propboard Builds Its Model
Propboard uses a two-stage Monte Carlo simulation: first projecting a minutes distribution, then simulating stat rates conditional on minutes played. This produces full probability distributions for every prop market, which are then calibrated against historical outcomes using isotonic regression.
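The calibration step can be illustrated with the pool-adjacent-violators algorithm (PAVA), which underlies isotonic regression. This is a generic sketch of the technique, not Propboard's actual code; production pipelines typically use a library implementation such as scikit-learn's IsotonicRegression:

```python
def pava(rates):
    """Pool adjacent violators: force a sequence of raw hit rates
    (ordered by predicted probability) to be non-decreasing."""
    blocks = []  # each block: [mean, count]
    for r in rates:
        blocks.append([r, 1])
        # Merge backwards while an earlier block exceeds a later one.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m1, n1 = blocks.pop()
            m0, n0 = blocks.pop()
            blocks.append([(m0 * n0 + m1 * n1) / (n0 + n1), n0 + n1])
    return [m for m, n in blocks for _ in range(n)]

# Raw hit rates dip at the third bucket; PAVA pools the violation away.
raw = [0.50, 0.55, 0.53, 0.60]
print([round(v, 4) for v in pava(raw)])  # → [0.5, 0.54, 0.54, 0.6]
```

The fitted step function maps raw model probabilities to calibrated ones while preserving their ordering.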
The result is a probability for each side of every line: not a gut feeling, but a calibrated estimate with a measurable track record. Start your free trial to see how the model grades today's player props.
Related Reading
- What Is Expected Value? for why calibrated probabilities translate to profitable bets
- What Probability Distributions Mean for Player Props for a deeper look at distribution shapes and why they matter
Frequently Asked Questions
What does "calibrated" mean for a betting model?
A calibrated model's probability estimates match observed outcomes. When it says 60% chance of the Over, the Over should hit roughly 60% of the time across a large sample. Calibration is tested by bucketing predictions and comparing predicted vs. actual hit rates. Most models fail this test because they're either overconfident or underconfident.
Can a model be accurate but not profitable?
Yes. A model might correctly rank outcomes (identifying which props are more likely to go Over) but assign probabilities that don't exceed implied odds. If the model says 53% and the implied probability is also 53%, there's no edge. Profitability requires the model's probability to exceed implied probability after accounting for the vig.
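Removing the vig makes the comparison fair: normalize both sides' implied probabilities so they sum to 1. A minimal sketch:

```python
def no_vig_probs(over_odds: int, under_odds: int) -> tuple[float, float]:
    """Strip the vig by normalizing both sides' implied probabilities."""
    def implied(o):
        return -o / (-o + 100) if o < 0 else 100 / (o + 100)
    p_over, p_under = implied(over_odds), implied(under_odds)
    total = p_over + p_under          # sums above 1.0; the excess is the vig
    return p_over / total, p_under / total

fair_over, fair_under = no_vig_probs(-110, -110)
print(f"Fair Over probability at -110/-110: {fair_over:.1%}")  # → 50.0%
```

A model probability of 53% beats the fair 50% at -110/-110, but still falls short of the 52.4% implied break-even once the vig is added back, which is why accuracy alone doesn't guarantee profit.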
How many predictions does a model need to prove itself?
At minimum, several hundred per probability bucket. If a model assigns 60% to 20 bets and 15 hit, that's a 75% hit rate on 20 samples, which is noisy and unreliable. You need closer to 200-500 observations per bucket before calibration results stabilize. Across the full model, 1,000+ graded predictions is a reasonable threshold.
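The binomial standard error makes this concrete: for a true 60% hit rate, the uncertainty on the observed rate shrinks with the square root of the sample size.

```python
from math import sqrt

# Standard error of an observed hit rate: sqrt(p * (1 - p) / n).
p = 0.60
for n in (20, 100, 500):
    se = sqrt(p * (1 - p) / n)
    lo, hi = p - 1.96 * se, p + 1.96 * se
    print(f"n={n:>3}: 95% interval roughly {lo:.1%} to {hi:.1%}")
```

At n=20 the interval spans roughly 38% to 81%, so a 75% observed hit rate proves nothing; at n=500 it tightens to roughly 56% to 64%, which is narrow enough to distinguish a calibrated 60% from a miscalibrated 50%.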
What's the difference between a point estimate and a distribution?
A point estimate says "this player will score 22 points." A distribution says "there's a 15% chance of 18 or fewer, a 30% chance of 19-22, a 35% chance of 23-27, and a 20% chance of 28 or more." The distribution tells you the probability of going over or under any line, which is what actually determines whether a bet has value.
For entertainment purposes only. Must be 21+. Please gamble responsibly.