This interactive tutorial visualizes Bayesian inference using the Beta-Binomial model: how a Prior belief is updated by Evidence (Likelihood) to produce a Posterior distribution. You can watch the probability shift as you add successes or failures (e.g. coin flips).

The chart shows three curves:
• Prior (blue): your initial belief about the probability of success.
• Likelihood (orange): the data you observed (scaled for visibility).
• Posterior (purple, thicker): the updated belief after combining prior and data.

Sliders set the prior strength (Prior Successes α and Prior Failures β). Buttons add one success or one failure. The posterior updates via the conjugate rule: $\alpha_{\text{post}} = \alpha_0 + \text{successes}$, $\beta_{\text{post}} = \beta_0 + \text{failures}$. A strong prior (large $\alpha_0$, $\beta_0$) resists new data; a weak prior moves quickly toward the data.

Mathematical foundation

1. Bayes' rule

The posterior is proportional to the prior times the likelihood. For a probability p of success (e.g. coin fairness), we start with a prior over p, observe data, and compute the posterior that combines both:

P(p | data) ∝ P(data | p) · P(p)
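As a rough illustration of the proportionality above, the posterior can be approximated on a grid of candidate values of p: multiply prior by likelihood at each point, then divide by the sum. This is only a sketch; the flat prior, the counts (7 successes, 3 failures), and the function name `grid_posterior` are assumptions made for the example, not part of the tutorial.

```python
# Grid approximation of Bayes' rule for a success probability p.
# Assumptions (not from the tutorial): a flat prior and 7 successes / 3 failures.

def grid_posterior(prior, successes, failures, grid):
    """Return normalized posterior weights on `grid` (prior times likelihood)."""
    unnormalized = [
        pr * (p ** successes) * ((1 - p) ** failures)
        for p, pr in zip(grid, prior)
    ]
    total = sum(unnormalized)  # plays the role of P(data)
    return [u / total for u in unnormalized]

n_points = 1001
grid = [i / (n_points - 1) for i in range(n_points)]
flat_prior = [1.0] * n_points  # every value of p equally plausible beforehand

posterior = grid_posterior(flat_prior, successes=7, failures=3, grid=grid)
posterior_mean = sum(p * w for p, w in zip(grid, posterior))
print(f"posterior mean ~ {posterior_mean:.3f}")
```

Dividing by `total` is what turns "proportional to" into an equality; the grid stands in for the integral that defines P(data).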
2. Beta distribution (Prior and Posterior)

The Beta(α, β) distribution describes a belief over p in [0, 1], normalized by the Beta function B(α, β):

$$f(p; \alpha, \beta) \propto p^{\alpha-1}(1-p)^{\beta-1}$$

$$\text{mean} = \frac{\alpha}{\alpha+\beta}, \qquad \text{mode} = \frac{\alpha-1}{\alpha+\beta-2}$$
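(The mode formula is valid when α, β > 1.) We use a log-Gamma implementation so the PDF stays numerically stable for large α, β.

3. Conjugate update (Beta-Binomial)

For Bernoulli trials, the Binomial likelihood has the form $p^{s}(1-p)^{f}$ (s = successes, f = failures). Multiplying the Beta prior kernel $p^{\alpha_0-1}(1-p)^{\beta_0-1}$ by this likelihood gives $p^{\alpha_0+s-1}(1-p)^{\beta_0+f-1}$, which is again a Beta kernel. The Beta distribution is therefore the conjugate prior, and the update is exact; just add the counts:

$$\text{Beta}(\alpha_0, \beta_0) + (s, f) \to \text{Beta}(\alpha_0 + s, \beta_0 + f)$$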
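The log-Gamma evaluation and the count-adding update described above might look like the following. This is a minimal sketch using Python's standard `math.lgamma`; the function names and the example parameters are assumptions for illustration and are not taken from the tutorial's own implementation.

```python
import math

def log_beta_fn(a, b):
    """log B(a, b) via log-Gamma, which avoids overflow for large a, b."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_pdf(p, a, b):
    """Beta(a, b) density at p, computed in log space."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # simplification: ignores endpoint behaviour when a or b <= 1
    log_pdf = (a - 1) * math.log(p) + (b - 1) * math.log(1 - p) - log_beta_fn(a, b)
    return math.exp(log_pdf)

def conjugate_update(a0, b0, successes, failures):
    """Beta(a0, b0) prior plus counts gives a Beta(a0 + s, b0 + f) posterior."""
    return a0 + successes, b0 + failures

# Hypothetical example: prior Beta(2, 2), then 3 successes and 1 failure.
a_post, b_post = conjugate_update(2, 2, successes=3, failures=1)
print(a_post, b_post)              # 5 3
print(beta_pdf(0.5, 2, 2))         # density of Beta(2, 2) at p = 0.5
print(beta_pdf(0.6, 6000, 4000))   # large parameters still give a finite value
```

Working in log space matters because Γ(α + β) overflows a double long before the density itself becomes extreme.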
No numerical integration is needed; the update is exact and fast.

4. Likelihood curve

The Likelihood (orange) is the Binomial likelihood of the observed data as a function of p. It is scaled so its peak is visible alongside the Prior and Posterior. Its maximum is at the sample proportion s/(s+f). As you add data, you see the Prior "pulled" toward the Likelihood to form the Posterior.

The Bayesian Cycle

Bayesian inference is a continuous "learning loop". The Likelihood isn’t what gets updated: the Prior is what gets updated to become the Posterior. Then, in the next round of learning, that Posterior becomes your new Prior.
The hand-off: for the next experiment, your old Posterior becomes your new Prior, i.e. $\text{Posterior}_n \to \text{Prior}_{n+1}$. This is what makes Bayesian inference a continuous learning loop.

5. The formula as a "correction"

You can think of Bayes’ rule as a way to "correct" your initial guess using new data:

P(H | D) = P(H) × [ P(D | H) / P(D) ]
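The hand-off from Posterior to next Prior can be sketched as a loop in which each observation corrects the current belief. The starting prior Beta(2, 2) and the observation sequence below are hypothetical; the point is only that updating one observation at a time ends at the same posterior as a single batch update.

```python
def update(prior, outcome):
    """One turn of the loop: a Beta prior (a, b) plus one Bernoulli outcome."""
    a, b = prior
    return (a + 1, b) if outcome else (a, b + 1)

observations = [True, False, True, True, False]  # hypothetical data
belief = (2, 2)                                  # hypothetical prior Beta(2, 2)

for outcome in observations:
    belief = update(belief, outcome)  # Posterior_n becomes Prior_{n+1}

# Batch update with the same data: add all counts at once.
successes = sum(observations)
failures = len(observations) - successes
batch = (2 + successes, 2 + failures)

print(belief, batch)  # both are (5, 4)
```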
6. Why the Likelihood isn’t "updated"

In Bayesian terms, the Likelihood is a fixed mathematical function that describes the data-gathering process (e.g. “if the defect rate is 12%, how likely is it that I saw 3 defects?”). It doesn’t change; it is the lens through which you view the data to update your belief. Only the Prior → Posterior transition represents "learning." The Likelihood is always determined by the experiment design and the data you actually observed.

Simulation

The interactive simulator is below. Use the controls to explore the concepts described above.
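[Interactive simulator: chart with three curves (Prior, Likelihood (data), Posterior); two prior sliders with initial values 2 and 2; 0 successes and 0 failures observed; summary panel reporting posterior mean, posterior mode, and 95% credible interval.]

P(p | data) = P(data | p) · P(p) / P(data)

Prior, P(p): $f(p; \alpha, \beta) = p^{\alpha-1}(1-p)^{\beta-1} / B(\alpha, \beta)$
Likelihood, P(data | p): $L(p \mid \text{data}) = p^{s}(1-p)^{f}$
Posterior, P(p | data): Posterior = Beta(α + s, β + f)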
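The three summary quantities (posterior mean, posterior mode, 95% credible interval) can be computed from the posterior's α and β. The sketch below is one possible approach, not the simulator's actual code: it assumes an equal-tailed interval and finds the quantiles with a simple midpoint-rule CDF. The name `beta_summary` and the grid size are illustrative choices.

```python
import math

def beta_pdf(p, a, b):
    """Beta(a, b) density at p, evaluated in log space for stability."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    log_b = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return math.exp((a - 1) * math.log(p) + (b - 1) * math.log(1 - p) - log_b)

def beta_summary(a, b, level=0.95, n=20000):
    """Mean, mode (None unless a, b > 1), and equal-tailed credible interval."""
    mean = a / (a + b)
    mode = (a - 1) / (a + b - 2) if a > 1 and b > 1 else None
    # Midpoint-rule CDF on a uniform grid: a sketch, not a precise quantile routine.
    h = 1.0 / n
    mids = [(i + 0.5) * h for i in range(n)]
    mass = [beta_pdf(p, a, b) * h for p in mids]
    total = sum(mass)
    lo_target = (1 - level) / 2
    hi_target = 1 - lo_target
    lower = upper = None
    cum = 0.0
    for p, m in zip(mids, mass):
        cum += m / total
        if lower is None and cum >= lo_target:
            lower = p
        if cum >= hi_target:
            upper = p
            break
    return mean, mode, (lower, upper)

mean, mode, (lo, hi) = beta_summary(2, 2)
print(f"mean={mean:.4f} mode={mode:.4f} interval=[{lo:.3f}, {hi:.3f}]")
```

Returning `None` for the mode mirrors the validity condition on the mode formula (α, β > 1); for a flat Beta(1, 1) the density has no single peak.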
Scenario: Factory Quality Inspection
A new supplier ships you products. You don’t know the defect rate — it could be 1% or 30%. Inspect items one-by-one and watch the Bayesian posterior zero in on the true defect rate.
[Interactive simulator: controls with initial values 8%, 2, and 10; chart with Prior, Posterior, and True defect rate; initial posterior Beta(2, 10); Observed Frequency Histogram; summary panel reporting Defective, Good, Total inspected, observed defect rate, posterior mean, posterior mode, and 95% credible interval.]
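The inspection scenario can be imitated offline with a short simulation. This sketch assumes that the 8% shown in the controls is the true defect rate and that the prior is Beta(2, 10), as the initial posterior suggests; the function name, the seed, and the sample size of 200 are illustrative choices, not values from the tutorial.

```python
import random

def simulate_inspections(true_rate, a0, b0, n_items, seed=0):
    """Inspect n_items one by one, updating a Beta(a0, b0) prior after each."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    a, b = a0, b0
    defective = 0
    for _ in range(n_items):
        is_defective = rng.random() < true_rate
        if is_defective:
            a += 1          # a defect counts as a "success" for the defect rate
            defective += 1
        else:
            b += 1
    return defective, a, b

# Assumed settings: true defect rate 8%, prior Beta(2, 10), 200 items.
defective, a, b = simulate_inspections(true_rate=0.08, a0=2, b0=10, n_items=200)
print(f"defective: {defective} of 200")
print(f"posterior: Beta({a}, {b}), mean {a / (a + b):.4f}")
```

As more items are inspected, the posterior mean a / (a + b) is dominated by the observed counts rather than by the prior, which is the "zeroing in" the scenario describes.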
Bayes’ Theorem — Step-by-Step Calculation
P(D | obs) = P(obs | D) · P(D) / P(obs)
where D = defect rate, obs = inspection outcome (def or good)
Bayesian logic applies to any Hypothesis (H) given Data (D).
Discrete case: the hypothesis is one of a finite set (e.g. “which of 5 suspects?”).
Continuous case: the hypothesis is a real number (e.g. “what is the true weight?”).

In the Theory and Example 1 tabs we used the Beta distribution because we were estimating a continuous probability θ ∈ [0, 1]. Here we show both cases side-by-side.

Discrete Case: Crime Investigation

5 suspects (A–E), each starts at 20% probability. Add evidence to see how Bayes’ rule redistributes the probabilities.

$$P(H_i \mid D) = \frac{P(D \mid H_i) \cdot P(H_i)}{P(D)}$$
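A minimal sketch of this discrete update follows. The five equal priors come from the text; the likelihood values for the piece of evidence are invented purely for illustration, and `bayes_update` is a hypothetical helper name.

```python
def bayes_update(priors, likelihoods):
    """Discrete Bayes: posterior_i = likelihood_i * prior_i / P(D)."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(joint.values())  # P(D), the normalizing constant
    return {h: j / evidence for h, j in joint.items()}

suspects = "ABCDE"
priors = {s: 0.20 for s in suspects}  # each suspect starts at 20%

# Hypothetical P(evidence | suspect i did it); these numbers are invented.
likelihoods = {"A": 0.9, "B": 0.3, "C": 0.3, "D": 0.1, "E": 0.1}

posterior = bayes_update(priors, likelihoods)
for s in suspects:
    print(f"{s}: {posterior[s]:.3f}")
```

Feeding `posterior` back in as `priors` for the next piece of evidence repeats the same step, so probabilities are redistributed but always sum to 1.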
Continuous Case: Weighing an Object
Estimate the true weight of an object. Prior: N(50, 15²). Each measurement adds noise (σ=3 kg). The posterior narrows and shifts toward the true weight.
[Interactive simulator: control with initial value 73.2 kg; chart with Prior N(50, 15²), Posterior, True weight, and Measurements. Initial state: Posterior N(50.0, 15.0²); Measurements: 0; Posterior mean: 50.00 kg; Uncertainty (σ): ±15.00 kg.]
P(W | measurements) = P(measurements | W) · P(W) / P(measurements)
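For a Normal prior and Normal measurement noise with known σ, the posterior is again Normal and can be written in closed form using precisions (inverse variances). The sketch below uses that standard conjugate result with the prior N(50, 15²) and σ = 3 kg from the text; the measurement value 73.0 kg and the function name are assumptions for the example, and the tutorial does not state that its simulator is implemented this way.

```python
import math

def normal_update(prior_mean, prior_sd, measurements, noise_sd):
    """Normal prior + Normal noise (known sd) -> Normal posterior (mean, sd)."""
    prior_prec = 1.0 / prior_sd ** 2                 # precision of the prior
    data_prec = len(measurements) / noise_sd ** 2    # precision added by the data
    post_prec = prior_prec + data_prec
    post_mean = (
        prior_prec * prior_mean + sum(measurements) / noise_sd ** 2
    ) / post_prec
    return post_mean, math.sqrt(1.0 / post_prec)

# No data: the posterior is just the prior N(50, 15^2).
print(normal_update(50.0, 15.0, [], 3.0))

# One hypothetical reading of 73.0 kg pulls the mean most of the way to the data.
mean, sd = normal_update(50.0, 15.0, [73.0], 3.0)
print(f"posterior mean {mean:.2f} kg, sd {sd:.2f} kg")
```

Because precisions add, every measurement narrows the posterior, and a wide prior (sd 15 kg) is quickly outweighed by accurate measurements (sd 3 kg).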
How does this connect? All three cases use the same Bayes’ rule: P(H|D) = P(D|H) · P(H) / P(D).
• Discrete (suspects): H = “suspect i did it”; probabilities sum to 1 across 5 options.
• Continuous (weight): H = “true weight is W”; described by a Normal PDF, updated with a Normal likelihood.
• Beta-Binomial (Theory / Example 1): H = “true defect rate is θ”; described by a Beta PDF, updated with a Bernoulli likelihood.

We aren’t asking “will the next item be defective?” We are asking “what is the true underlying probability θ?” where θ can be any value from 0.00 to 1.00.
Tab 1: Theory
Use the controls to explore how the prior is updated by evidence.

Tab 2: Example 1 (Factory Quality Inspection)
A realistic simulation that demonstrates Bayesian inference in action. You receive products from a new supplier and need to estimate the defect rate, a real-world problem with no strong "50/50" prior intuition.

Tab 3: Example 2 (Discrete + Continuous)
Demonstrates that Bayes' rule applies to any hypothesis given data, not just the Beta-Binomial case. It covers a discrete case (Crime Investigation) and a continuous case (Weighing an Object).

The connection: All three tabs use the same Bayes' rule P(H|D) = P(D|H)·P(H)/P(D). The only thing that changes is the type of distribution used (discrete probabilities, Normal PDF, or Beta PDF).

Visualizations
Summary panel
Limitations