MLB Win Predictor

Starting-pitcher emphasis dial. One knob that scales how hard both the Elo line and the logistic model lean on the probable starter (vs. overall team strength). 1.0× is the tuned baseline (Elo starter weight 0.55). Move it and re-run to see the effect, or sweep several levels at once and compare log loss and Brier score, which are the honest scorecards. Accuracy alone swings about ±1.5 percentage points on noise at this sample size, so trust the lowest log loss / Brier, not the highest accuracy.
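
For readers who want to see what those scorecards measure, here is a minimal sketch of log loss, Brier score and accuracy. The function names are illustrative and are not taken from the site's code; `probs` is assumed to hold predicted home-win probabilities and `outcomes` holds 1 for a home win, 0 otherwise.

```javascript
// Mean negative log-likelihood of the observed results (lower is better).
function logLoss(probs, outcomes, eps = 1e-15) {
  let sum = 0;
  for (let i = 0; i < probs.length; i++) {
    const p = Math.min(Math.max(probs[i], eps), 1 - eps); // clip to avoid log(0)
    sum += outcomes[i] === 1 ? -Math.log(p) : -Math.log(1 - p);
  }
  return sum / probs.length;
}

// Mean squared gap between the probability and the 0/1 result (lower is better).
function brier(probs, outcomes) {
  let sum = 0;
  for (let i = 0; i < probs.length; i++) sum += (probs[i] - outcomes[i]) ** 2;
  return sum / probs.length;
}

// Share of games where the side given at least 50% actually won.
function accuracy(probs, outcomes) {
  let hits = 0;
  for (let i = 0; i < probs.length; i++) {
    const pick = probs[i] >= 0.5 ? 1 : 0;
    if (pick === outcomes[i]) hits++;
  }
  return hits / probs.length;
}
```

Two models that pick the same winners have identical accuracy, yet the one that assigns more confident probabilities to the results that happened scores a lower log loss and Brier. That is why the sweep is best read on those two numbers.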

Emphasis: 1.0× · Elo pw 0.55

Model coefficients — the logistic-regression weights from the most recent walk-forward refit (the model used for today's predictions). Inputs are standardized, so each weight is the change in log-odds of a home win per one standard deviation of that factor — directly comparable across factors. Positive favors the home team. These are fit from data and shift at every weekly refit.
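
As a rough illustration of how standardized weights turn into a win probability, the sketch below sums weight × standardized input into log-odds and applies the logistic function. The function names are hypothetical and any numbers you pass in are your own; they are not the site's fitted coefficients.

```javascript
// Convert a raw input into standard-deviation units.
function standardize(x, mean, sd) {
  return (x - mean) / sd;
}

// weights: { factorName: coefficient }, z: { factorName: standardized value }.
// A factor missing from z is treated as league-average (z = 0); that is an assumption.
function homeWinProb(intercept, weights, z) {
  let logOdds = intercept;
  for (const name of Object.keys(weights)) {
    logOdds += weights[name] * (z[name] ?? 0);
  }
  return 1 / (1 + Math.exp(-logOdds)); // logistic function
}
```

Because every input is in standard-deviation units, a weight of 0.30 on one factor moves the log-odds exactly as much as a weight of 0.30 on any other, which is what makes the weights directly comparable.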

Run an analysis to populate.

Automated model search. Trains many variants — forward-selecting which factors to include, then grid-searching the ensemble weighting — and scores each on a held-out validation slice. The winner is picked on validation; the test slice is scored once, untouched, so the result is honest (no tuning against the test set). Realistic ceiling for single-game MLB is ~55–57%.

Calibration — when the model says X%, do teams actually win about X% of the time? Closer to the diagonal is better.
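
A calibration chart like this one can be built by binning predictions and comparing each bin's average stated probability with its observed win rate. The sketch below assumes ten equal-width bins; the site's actual binning is not stated, and the names are illustrative.

```javascript
// Returns one row per non-empty bin: how many games, the average predicted
// probability, and the share of those games the predicted side actually won.
function calibrationBins(probs, outcomes, nBins = 10) {
  const bins = Array.from({ length: nBins }, () => ({ n: 0, sumP: 0, wins: 0 }));
  for (let i = 0; i < probs.length; i++) {
    const b = Math.min(nBins - 1, Math.floor(probs[i] * nBins)); // p = 1.0 goes in the last bin
    bins[b].n += 1;
    bins[b].sumP += probs[i];
    bins[b].wins += outcomes[i];
  }
  return bins
    .filter((b) => b.n > 0)
    .map((b) => ({
      n: b.n,
      meanPredicted: b.sumP / b.n,
      observedWinRate: b.wins / b.n,
    }));
}
```

Plotting `meanPredicted` against `observedWinRate` gives the points; a perfectly calibrated model puts every point on the diagonal.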

Accuracy vs baselines on the held-out test set.

Build your own model. Choose which seasons to train on, which factors the logistic model is allowed to use, and how hard to weight each one. Then train and read the result against the held-out test season. The honest scorecards are log loss and Brier score (lower is better for both); accuracy alone swings on noise, so don't chase a 0.5-point accuracy bump. Removing weak or noisy factors often helps: more inputs are not automatically better.

1 · Training seasons — which past years to learn from. The test season (set in the top bar) is always excluded from training and predicted walk-forward.

More seasons = more data but older, less-relevant rosters. Fewer, recent seasons track the current game better but are noisier.

2 · Train & compare

Train to see metrics here.

3 · Build your own factor — type a formula using the variable names listed below (and Math for functions like Math.abs, Math.max, Math.sqrt). Examples: opsDiff*2 - whipDiff · eloDiff + 0.5*runDiff · Math.max(spLevH-spLevA,0). Save it and it becomes a real model input — it trains in the walk-forward, earns a coefficient, and can be Auto-pruned. Your factors are saved in this browser.
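
One way a typed formula could be turned into a model input is sketched below. This is an assumption about the mechanism, not the site's actual code: it compiles the formula with the `Function` constructor, exposing only the listed variable names as parameters (`Math` is available as a global). Treating a non-finite result as 0 is also an assumption.

```javascript
// Compile a formula string into a function of one game's feature row.
// The Function constructor runs arbitrary code, so it is only appropriate
// for formulas the user typed into their own browser.
function compileFactor(formula, variableNames) {
  const fn = new Function(...variableNames, `"use strict"; return (${formula});`);
  return (row) => {
    const value = fn(...variableNames.map((name) => row[name]));
    return Number.isFinite(value) ? value : 0; // guard against NaN / Infinity
  };
}

// Usage with variable names taken from the examples above.
const names = ["opsDiff", "whipDiff", "eloDiff", "runDiff", "spLevH", "spLevA"];
const spEdge = compileFactor("Math.max(spLevH-spLevA,0)", names);
```

Calling `spEdge(row)` on each game's feature row yields a new column that can be standardized and fitted like any built-in factor.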

Available variables — click to expand (hover a name for its meaning)

Idea scratchpad — jot factor ideas here in plain words before you turn them into a formula. Saved in this browser.

4 · Factors & weights — tick a factor to include it; set its weight (1.0 = normal, 0 = effectively off, >1 = lean on it harder). Weight multiplies the factor's post-fit emphasis, then the model is re-calibrated. The starting-pitcher dial on the Model Performance tab still applies on top of this.

Live Elo ratings after the most recent games. 1500 = league average; ~+100 ≈ a clearly above-average team.

Elo trend — top 6 teams over the season.

Pick a team to see every player's Win Shares: each player's slice of the team's actual season wins, allocated by his runs above replacement (batting + pitching). The column adds up to exactly the number of games the team has won. WCS is shown alongside for comparison: it measures wins above an average player, so it nets to ~0 leaguewide and a team's total approximates its wins above .500.

Team: Player: Window: Park-context adjust altitude —

Player WCS — click a column header to sort. Bat = batting wins, Pit = pitching wins.

Top contributors by WCS.

Does player value predict next year's wins? The honest out-of-sample test: for each team, take its roster in year t+1, score every player by the WCS he earned the prior year (t), sum it, add it to a .500 baseline, and compare that prediction to the team's actual t+1 wins. Runs across every available season pair and pools all 30 teams. R² tells you how much of next-year win variation prior-year player value captures. (Heads-up: this pulls a roster per team per season — it takes a minute and makes a lot of API calls.)
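
The test described above reduces to two small calculations, sketched here with hypothetical function names. R² is computed as 1 − SS_res / SS_tot, which is one common definition; the site may instead report a squared correlation.

```javascript
// Predicted wins = .500 baseline + summed prior-year WCS of the year t+1 roster.
function predictWins(priorYearWcs, games = 162) {
  return games / 2 + priorYearWcs.reduce((a, b) => a + b, 0);
}

// Share of variance in actual wins explained by the predictions.
function rSquared(predicted, actual) {
  const mean = actual.reduce((a, b) => a + b, 0) / actual.length;
  let ssRes = 0;
  let ssTot = 0;
  for (let i = 0; i < actual.length; i++) {
    ssRes += (actual[i] - predicted[i]) ** 2;
    ssTot += (actual[i] - mean) ** 2;
  }
  return 1 - ssRes / ssTot;
}
```

With this definition R² can go negative when the predictions are worse than simply guessing the league-average win total for every team.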

Method: Aging curve Trend —

Biggest misses (|predicted − actual|).

One season, every team. Pick a base year — its players' WCS predicts the next year's wins for all 30 teams. Default is 2024 → 2025: score each team's 2025 roster by the WCS those players earned in 2024, add a .500 baseline, and line the prediction up against actual 2025 wins, team by team.

Base year: Method: Aging curve Trend Blend w/ current pace —

All teams, predicted vs actual (sorted by actual wins).

Every player gets sorted into style categories from their live stats — handedness, role, arm slot, velocity tier, arsenal depth, and a synthesized archetype (power pitcher, command artist, three-true-outcomes slugger, table-setter, and more). Pick a player to auto-classify, or browse the full taxonomy below.

Team: Player: —

Reference taxonomy — the full menu of pitcher and hitter categories, how each is defined, and example players.

Values each hitter by how much they contribute to runs, RBI and winning, split by game context. Every game is bucketed into Tight (close margin), Low‑run and High‑run. Raw contribution per game is R + RBI − HR: a hitter's own home run is credited as both a run and an RBI, so subtracting HR counts it only once. Each game is standardized to a Contribution Index where 100 = the league‑average hitter in that same context, so values are comparable across games and seasons. The team panel sums these and matches them against who actually reached the playoffs, won their division, the pennant and the World Series.
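
A small sketch of the per-game arithmetic follows. The threshold values are hypothetical (the page exposes them as inputs without stating defaults), and the index is assumed to be a simple ratio to the league average in the same context; the site may standardize differently.

```javascript
// Hypothetical thresholds for the three game contexts.
const THRESHOLDS = { tightMargin: 1, lowRunTotal: 6, highRunTotal: 11 };

// R + RBI - HR: removes the double count of a hitter's own home run.
function rawContribution(line) {
  return line.R + line.RBI - line.HR;
}

// A game can fall into more than one context (e.g. tight and low-run).
function gameContexts(homeRuns, awayRuns, t = THRESHOLDS) {
  const contexts = [];
  const total = homeRuns + awayRuns;
  if (Math.abs(homeRuns - awayRuns) <= t.tightMargin) contexts.push("tight");
  if (total <= t.lowRunTotal) contexts.push("low-run");
  if (total >= t.highRunTotal) contexts.push("high-run");
  return contexts;
}

// 100 = league-average hitter in the same context (ratio form, an assumption).
function contributionIndex(raw, leagueAvgRaw) {
  return (100 * raw) / leagueAvgRaw;
}
```

For example, a hitter who goes deep once, scores twice and drives in three has a raw contribution of 2 + 3 − 1 = 4.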

Season: Tight margin ≤ Low‑run total ≤ High‑run total ≥ —

Player value — pick a hitter to see their per‑game contribution and Contribution Index in each game context, plus their share of the team's runs in those games.

Team: Hitter:

How team run‑contribution matches success. Each dot is a team in the chosen season — x = overall Contribution Index, y = regular‑season wins. Color marks how far the team went in October.

Today's player value — each starting hitter's projected contribution for today's game. Each hitter's true‑talent wOBA is built from this season plus the past three years (recency‑weighted, regressed to league), matched by odds‑ratio against today's opposing starter (also a multi‑year blend), then tilted for today's conditions: ballpark altitude, day/night, the weather forecast (temperature, rain, wind) and the team's travel & rest. This is an expectation, not a prediction of the actual box score — single‑game noise dwarfs these tilts, and every context signal is heavily regressed.

Combined Total Player Value — one role‑aware number per player, anchored on the Winning Contribution Score: each game's run value (wRAA for hitters, runs saved vs. league ERA for pitchers) is weighted by whether the team won or lost — strong performance in a win counts more than the same in a loss. The WCS is then blended with fielding (fielding % vs. league) and chemistry (tenure, lineup‑network cohesion, any pasted contract). Every component is an index where 100 = league/team average; the total is a weighted average that drops components a player doesn't have. Adjust the weights below — player and team views update live.
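
The "weighted average that drops components a player doesn't have" can be sketched as below. The component keys mirror the weight labels on this page, but the key names themselves and any weights passed in are illustrative.

```javascript
// components: index values where 100 = average; a missing component is
// undefined, null or NaN. weights: relative importance of each component.
function totalPlayerValue(components, weights) {
  let num = 0;
  let den = 0;
  for (const [name, w] of Object.entries(weights)) {
    const v = components[name];
    if (v == null || !Number.isFinite(v)) continue; // drop it and renormalize
    num += w * v;
    den += w;
  }
  return den > 0 ? num / den : null; // null when nothing is available
}
```

Dividing by the sum of the weights actually used keeps a player without, say, a pasted contract on the same 100-centered scale as everyone else.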

Win Contrib. Field Tenure Network Contract Show: —

WCS game weights — multipliers on each game's run value: Good in WIN Bad in WIN Good in LOSS Bad in LOSS

Team contribution vs. outcome. Contribution Index by context (100 = league average that season), win % in tight games, and the team's actual finish. Click a header to sort.

Measures team chemistry from how long the roster has stayed together. For each season we pull the team's roster for that year and several prior years, then compute each player's tenure (consecutive seasons with the club), the lineup network of years played together (how many shared seasons connect each pair of teammates), roster continuity vs. last year, and the newcomer load. A Chemistry Index combines average tenure and network cohesion, and the league panel matches it against who actually reached the playoffs and won. Note: player contracts, salaries and free-agent terms are not in the public MLB Stats API, so "years left on contract" and "big FA signing" can't be pulled live — use the optional box below to paste any contracts you want reflected.
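
The roster-based measures above can be sketched from a simple map of season to roster. The data shape and function names are assumptions for illustration; how the site combines them into the Chemistry Index is not shown here.

```javascript
// rostersByYear: { 2023: ["Player A", ...], 2024: [...] }

// Consecutive seasons with the club, counting back from `season`.
function tenure(rostersByYear, player, season) {
  let years = 0;
  for (let y = season; (rostersByYear[y] || []).includes(player); y--) years++;
  return years;
}

// Edge weight in the lineup network: seasons in which both were on the roster.
function sharedSeasons(rostersByYear, a, b) {
  return Object.values(rostersByYear).filter(
    (roster) => roster.includes(a) && roster.includes(b)
  ).length;
}

// Share of this year's roster that was also on last year's roster.
function continuity(rostersByYear, season) {
  const now = rostersByYear[season] || [];
  const prev = new Set(rostersByYear[season - 1] || []);
  return now.length ? now.filter((p) => prev.has(p)).length / now.length : 0;
}
```

A player who left and came back restarts at a tenure of 1, while the shared-seasons count still credits the earlier years he spent alongside a teammate.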

Season: Team: Look‑back: —

Lineup network — each node is a position player; a line means they have shared seasons on this club, thicker & brighter = more years together.

Roster tenure — consecutive seasons with the club, and whether the player is new this year. Contract notes appear if you supply them below.

Optional: paste contracts to include FA / years-left

One player per line: Name, yearsLeft, AAV($M), newFA(0/1), signedYear. Example: Aaron Judge, 7, 40, 0, 2023. signedYear is optional; when given, the valuation credits the years remaining on the deal and lightly discounts the seasons elapsed since signing. Players are matched by name and the notes appear in the roster table. Age is pulled automatically from the MLB API; only the contract timeline needs manual entry.
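
A minimal parser for that line format might look like the sketch below. The output field names are illustrative, and it assumes names contain no commas.

```javascript
// Parse "Name, yearsLeft, AAV($M), newFA(0/1), signedYear" lines.
// Blank lines are skipped; signedYear is null when omitted.
function parseContracts(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter(Boolean)
    .map((line) => {
      const [name, yearsLeft, aav, newFA, signedYear] = line
        .split(",")
        .map((part) => part.trim());
      return {
        name,
        yearsLeft: Number(yearsLeft),
        aavMillions: Number(aav),
        newFA: newFA === "1",
        signedYear: signedYear ? Number(signedYear) : null,
      };
    });
}
```

Because matching is by name, the text before the first comma has to match the roster spelling exactly.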

League chemistry vs. October success. Computes every team's Chemistry Index for the chosen season and plots it against regular-season wins, colored by how far they advanced. This fetches many rosters, so it runs on demand.

Replays a full season day by day: each game is predicted using only the results of earlier games. The model is an online Elo + run-expectancy ensemble that is updated after every game and never peeks at the game's own outcome. This is a true walk-forward backtest. A complete past season (2024 or 2025, about 2,430 regular-season games) gives the most stable accuracy read; baseball's realistic ceiling is ~57–60%.
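
The predict-then-update order is the whole point of a walk-forward backtest, so here is a minimal Elo-only sketch of it. The K-factor, home-field bonus and 400-point scale are illustrative assumptions, and the run-expectancy half of the ensemble is omitted.

```javascript
// games: chronologically sorted [{ home, away, homeWon }].
function walkForwardElo(games, { k = 4, homeAdv = 24, start = 1500 } = {}) {
  const rating = new Map();
  const get = (team) => rating.get(team) ?? start;
  const preds = [];
  let correct = 0;

  for (const g of games) {
    const rh = get(g.home);
    const ra = get(g.away);

    // 1) Predict from ratings built only on earlier games.
    const pHome = 1 / (1 + 10 ** ((ra - rh - homeAdv) / 400));
    preds.push(pHome);

    // 2) Score the pick against the result.
    const y = g.homeWon ? 1 : 0;
    if ((pHome >= 0.5) === (y === 1)) correct++;

    // 3) Only now update the ratings with this game's outcome.
    const delta = k * (y - pHome);
    rating.set(g.home, rh + delta);
    rating.set(g.away, ra - delta);
  }
  return { preds, accuracy: games.length ? correct / games.length : 0, rating };
}
```

Swapping steps 1 and 3 would leak each game's result into its own prediction, which is exactly the peeking this backtest is designed to rule out.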

Backtest season: Ensemble vs Elo-only vs Run-only shown below

Cumulative prediction accuracy as the season progresses (vs the 50% coin-flip and the home-team baseline).

Today & upcoming — predictions for games not yet played (no result to peek at).

Recent days — the model's daily record on completed games.

Most recent completed games — pre-game pick vs what actually happened.

What this site specializes in

Most baseball sites give you box scores and standings. We specialize in prediction — turning official MLB data into honest, out-of-sample forecasts at three levels: a whole season, an individual player, and a single game. Everything runs live in your browser from the MLB Stats API; nothing is pre-baked.

🎯 Win Prediction

For a season, we project each team's full-year win total from the value of the players on its roster, then — for a season already in progress — blend that projection with how the team is actually playing (weight shifts toward real results as games pile up). For a game, an Elo + logistic-regression ensemble estimates each team's win probability from pre-game signals only (team strength, starting pitcher, recent form, park, travel, weather), updated after every game and never peeking at the result it's predicting.

⚾ Player Value Prediction (WCS)

Our core player metric is WCS (Winning Contribution Score): a player's estimated wins above an average player, built from batting (wOBA converted to runs above average) and pitching (ERA vs. league converted to runs saved), divided by runs-per-win, with small base-running and leverage adjustments. A whole league nets to ~0, so a team's summed WCS ≈ its wins above .500. We also show Win Shares (absolute, summing to a team's actual wins) and project each player forward with an empirical-Bayes (Marcel-style) model that weights recent seasons, regresses small samples toward the mean, and applies an aging curve.
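
The core of that calculation can be sketched as follows. The wOBA scale (about 1.2) and runs-per-win (about 10) are typical sabermetric values used here as assumptions, not the site's constants, and the base-running and leverage adjustments are left out.

```javascript
// Batting runs above average from wOBA (the usual wRAA form).
function battingRunsAboveAvg(woba, leagueWoba, pa, wobaScale = 1.2) {
  return ((woba - leagueWoba) / wobaScale) * pa;
}

// Runs saved relative to a league-average ERA over the same innings.
function pitchingRunsSaved(era, leagueEra, ip) {
  return ((leagueEra - era) / 9) * ip;
}

// Wins above an average player.
function wcs({ battingRuns = 0, pitchingRuns = 0 }, runsPerWin = 10) {
  return (battingRuns + pitchingRuns) / runsPerWin;
}
```

Under these assumptions a hitter with a .380 wOBA in a .320 league over 600 PA is worth about 30 runs, or roughly 3 wins above average.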

How the models work

Season forecast: sum the projected WCS of a team's roster, add a .500 baseline, and (for live seasons) blend with current pace.
Player projection: empirical-Bayes weighting of the last three years (5/4/3), reliability-regressed toward the mean and age-adjusted, with an optional per-player trend factor and a replacement-level (WAR) option that widens the spread so weak and elite teams aren't squeezed toward the middle.
Game model: online Elo (with starting-pitcher ratings and priors) ensembled with a logistic regression over engineered features, probability-recalibrated, and validated by a true walk-forward backtest that only ever uses earlier games.
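
The player projection and the pace blend can be sketched in a few lines. Only the 5/4/3 weights come from the description above; the regression amount (1,200 PA of league-average performance, the classic Marcel choice) and the linear blend weight are assumptions, and the age and trend adjustments are omitted.

```javascript
const MARCEL_WEIGHTS = [5, 4, 3]; // most recent season first

// seasons: [{ rate, pa }] with rate = value per plate appearance.
function projectRate(seasons, leagueRate = 0, regressionPA = 1200) {
  let num = 0;
  let den = 0;
  seasons.slice(0, 3).forEach((s, i) => {
    num += MARCEL_WEIGHTS[i] * s.rate * s.pa;
    den += MARCEL_WEIGHTS[i] * s.pa;
  });
  // Small samples are pulled harder toward the league rate.
  return (num + regressionPA * leagueRate) / (den + regressionPA);
}

// Shift weight from the talent projection to actual pace as games pile up.
function blendWithPace(talentWins, currentWinPct, gamesPlayed, totalGames = 162) {
  const w = Math.min(1, gamesPlayed / totalGames);
  return (1 - w) * talentWins + w * currentWinPct * totalGames;
}
```

With this blend, a team projected for 90 wins that is playing .500 ball at the halfway mark comes out at 85.5.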

Honesty note. Every accuracy number on this site is out-of-sample — the model is scored on seasons/games it never trained on. Baseball is high-variance: even strong pre-game models top out near 56–60% on single games, and season win totals carry an irreducible ±6–9 win error. These are estimates for analysis and entertainment, not betting advice.

How to use this site

The top menu groups everything into four areas. Pick a category, then a page. Most pages have a Method dropdown and toggles so you can customize the analysis.

📅 Season Prediction

Teams & WCS — pick a team to see every player's win contribution for the year.
Power Rankings — live team-strength order from the Elo model (already reflects 2026 results).
Team Chemistry — how a roster's mix of players fits together over multiple seasons.
Live forecast tip: open the validation card, set Base year to “2025→2026 (live forecast)”, and keep Blend w/ current pace on to see the talent projection updated with this season's actual results.

⚾ Player Performance

Player Value — rank players by WCS / Win Shares; test whether last year's value predicted this year's wins.
Player Types — classify each player's style and role from their stat profile.

⚔ Game Prediction

Today's Predictions — win probabilities for today's matchups (pre-game data only).
Model Performance — accuracy, log-loss, Brier and calibration.
Game Explorer — search any game and see the pre-game pick vs the actual result.
Walk-Forward 2026 — replay a season day by day; the truest accuracy read.
Customize Model — toggle features, set training years, and tune weights yourself.

📱 Install as an app

iPhone / iPad (Safari): tap Share → Add to Home Screen.
Android (Chrome): tap the ⋮ menu → Install app / Add to Home screen.
It then opens full-screen like a native app, with its own icon.