This interactive tutorial visualizes Bayesian inference using the Beta-Binomial model: how a Prior belief is updated by Evidence (Likelihood) to produce a Posterior distribution. You literally "see" the probability shift as you add successes or failures (e.g. coin flips).

The chart shows three curves:
• Prior (blue) — your initial belief about the probability of success.
• Likelihood (orange) — the data you observed (scaled for visibility).
• Posterior (purple, thicker) — the updated belief after combining prior and data.

Sliders set the prior strength (Prior Successes α and Prior Failures β). Buttons add one success or one failure. The posterior updates via the conjugate rule: α_post = α₀ + successes, β_post = β₀ + failures. A strong prior (large α₀, β₀) resists new data; a weak prior moves quickly toward the data.

Mathematical foundation

1. Bayes' rule
Posterior ∝ Prior × Likelihood. For a probability p of success (e.g. coin fairness), we start with a prior distribution over p, observe data (successes and failures), and compute the posterior distribution that combines both.

2. Beta distribution (Prior and Posterior)
The Beta(α, β) distribution describes a belief over p in [0, 1]. Its PDF is f(p; α, β) ∝ p^(α−1) (1−p)^(β−1), normalized by the Beta function B(α, β). Mean = α/(α+β); Mode = (α−1)/(α+β−2) when α, β > 1. We use a log-Gamma implementation so the PDF remains numerically stable for large α, β.

3. Conjugate update (Beta-Binomial)
For Bernoulli trials (e.g. coin flips), the Binomial likelihood has the form p^s (1−p)^f (s = successes, f = failures). The Beta distribution is the conjugate prior: if the Prior is Beta(α₀, β₀) and you observe s successes and f failures, the Posterior is Beta(α₀ + s, β₀ + f). No numerical integration is needed — the update is exact and fast.

4. Likelihood curve
The Likelihood (orange) is the Binomial likelihood of the observed data as a function of p. It is scaled so its peak is visible alongside the Prior and Posterior. Its maximum is at the sample proportion s/(s+f). As you add data, you see the Prior "pulled" toward the Likelihood to form the Posterior.

The Bayesian Cycle
Bayesian inference is a continuous "learning loop". The Likelihood isn’t what gets updated — the Prior is what gets updated to become the Posterior. Then, in the next round of learning, that Posterior becomes your new Prior.
The Hand-off: For the next experiment, your old Posterior becomes your new Prior (Posteriorₙ → Priorₙ₊₁).

5. The formula as a "correction"
You can think of Bayes’ rule as a way to "correct" your initial guess using new data: Posterior = Prior × Likelihood / Evidence, i.e. P(p | data) = P(data | p) · P(p) / P(data).
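The conjugate update and the posterior-to-prior hand-off fit in a few lines. A minimal sketch using plain (α, β) tuples, not the app's actual code:

```python
def update(prior, successes, failures):
    """Beta-Binomial conjugate update: Beta(a0 + s, b0 + f)."""
    a0, b0 = prior
    return (a0 + successes, b0 + failures)

# Round 1: weak Beta(2, 2) prior, then 7 successes and 3 failures.
posterior = update((2, 2), 7, 3)       # (9, 5)
# Hand-off: the old posterior becomes the new prior for round 2.
posterior2 = update(posterior, 1, 4)   # (10, 9)
```

Because the update is just addition, feeding the data in one batch or one observation at a time yields the same posterior.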
6. Why the Likelihood isn’t "updated"
In Bayesian terms, the Likelihood is a fixed mathematical function that describes the data-gathering process (e.g. “if the defect rate is 12%, how likely is it that I saw 3 defects?”). It doesn’t change — it is the lens through which you view the data to update your belief. Only the Prior → Posterior transition represents "learning." The Likelihood is always determined by the experiment design and the data you actually observed.
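To see that the Likelihood stays fixed while only the belief moves, you can run Bayes' rule by brute force on a grid of p values and compare with the exact conjugate answer. A sketch, not the app's implementation:

```python
def grid_posterior_mean(a0, b0, s, f, n=20001):
    """Posterior mean of p via numerical Bayes: prior × likelihood, normalized."""
    num = den = 0.0
    for i in range(1, n):
        p = i / n
        prior = p ** (a0 - 1) * (1 - p) ** (b0 - 1)  # Beta(a0, b0), unnormalized
        lik = p ** s * (1 - p) ** f                  # fixed likelihood of the data
        w = prior * lik                              # posterior weight at this p
        num += p * w
        den += w
    return num / den

# Conjugate answer: Beta(2 + 7, 2 + 3) has mean 9/14 ≈ 0.6429.
print(grid_posterior_mean(2, 2, 7, 3))
```

The likelihood function itself is never modified; it only reweights the prior before normalization.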
P(p | data) = P(data | p) · P(p) / P(data)
Prior — P(p): f(p; α, β) = p^(α−1) (1−p)^(β−1) / B(α, β)
Likelihood — P(data | p): L(p | data) = p^s (1−p)^f
Posterior — P(p | data): Beta(α + s, β + f)
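The densities above can be evaluated directly. A minimal stand-alone sketch in Python (helper names are mine, not the app's), using math.lgamma for the log-Gamma stability trick mentioned in the Theory text:

```python
from math import exp, lgamma, log

def beta_pdf(p, a, b):
    """Beta(a, b) density at p, computed in log space for numerical stability."""
    if not 0.0 < p < 1.0:
        return 0.0
    log_norm = lgamma(a + b) - lgamma(a) - lgamma(b)  # = -log B(a, b)
    return exp(log_norm + (a - 1.0) * log(p) + (b - 1.0) * log(1.0 - p))

def beta_mean(a, b):
    return a / (a + b)

def beta_mode(a, b):
    # Mode is only defined when a, b > 1.
    return (a - 1) / (a + b - 2) if a > 1 and b > 1 else None
```

Working in log space matters: for large α and β the individual Gamma factors overflow a float long before the density itself does.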
The controls default to α = 2, β = 2 with 0 successes and 0 failures; the summary panel reports the posterior mean, posterior mode, and 95% credible interval.
Scenario: Factory Quality Inspection
A new supplier ships you products. You don’t know the defect rate — it could be 1% or 30%. Inspect items one-by-one and watch the Bayesian posterior zero in on the true defect rate.
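The inspection loop can be simulated in a few lines. A sketch assuming the scenario defaults (true rate 8%, prior Beta(2, 10)); the function name is made up for illustration:

```python
import random

def simulate_inspections(true_rate, n, a0, b0, seed=1):
    """Inspect n items; each defect bumps alpha, each good item bumps beta."""
    rng = random.Random(seed)
    a, b = a0, b0
    for _ in range(n):
        if rng.random() < true_rate:  # item is defective
            a += 1
        else:                         # item is good
            b += 1
    return a, b

a, b = simulate_inspections(0.08, 200, 2, 10)
# Posterior mean a / (a + b) should land near the true 8% defect rate.
```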
Simulation defaults: true defect rate 8%; prior Beta(2, 10). The chart plots the Prior, the Posterior (initially Beta(2, 10)), and a marker at the true defect rate. An Observed Frequency Histogram counts Defective vs. Good items, and a summary panel reports the total inspected, the observed defect rate, the posterior mean, the posterior mode, and the 95% credible interval.
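The summary panel's numbers can be reproduced with a short script. A sketch that approximates the 95% credible interval by Monte Carlo with random.betavariate (the app may well compute the interval differently):

```python
import random

def summarize(a, b, n_samples=200_000, seed=0):
    """Posterior mean, mode, and approximate 95% credible interval for Beta(a, b)."""
    mean = a / (a + b)
    mode = (a - 1) / (a + b - 2) if a > 1 and b > 1 else None
    rng = random.Random(seed)
    draws = sorted(rng.betavariate(a, b) for _ in range(n_samples))
    lo = draws[int(0.025 * n_samples)]  # 2.5% sample quantile
    hi = draws[int(0.975 * n_samples)]  # 97.5% sample quantile
    return mean, mode, (lo, hi)

# Prior Beta(2, 10) after observing 3 defective and 37 good items:
mean, mode, ci = summarize(2 + 3, 10 + 37)
```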
Bayes’ Theorem — Step-by-Step Calculation
P(D | obs) = P(obs | D) · P(D) / P(obs)
where D = defect rate, obs = inspection outcome (defective or good)
No inspections yet — inspect an item to see the step-by-step Bayes calculation.
Full Step History
Inspection Log
No inspections yet
Bayesian logic applies to any Hypothesis (H) given Data (D).
Discrete case — the hypothesis is one of a finite set (e.g. “which of 5 suspects?”). Continuous case — the hypothesis is a real number (e.g. “what is the true weight?”). In the Theory and Example 1 tabs we used the Beta distribution because we were estimating a continuous probability θ ∈ [0, 1]. Here we show both cases side by side.

Discrete Case: Crime Investigation
5 suspects (A–E), each starting at 20% probability. Add evidence to see how Bayes’ rule redistributes the probabilities.
P(Hᵢ | D) = P(D | Hᵢ) · P(Hᵢ) / P(D)
No evidence yet — add evidence to see the Bayes calculation.
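One round of the discrete update can be sketched as below; the evidence likelihoods are made-up numbers for illustration, not values from the app:

```python
def bayes_discrete(priors, likelihoods):
    """P(Hi | D) = P(D | Hi) · P(Hi) / P(D), where P(D) sums the numerators."""
    numer = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(numer)  # P(D): total probability of the evidence
    return [x / evidence for x in numer]

priors = [0.2] * 5                       # suspects A-E, uniform prior
likelihoods = [0.9, 0.5, 0.5, 0.1, 0.1]  # hypothetical P(evidence | suspect)
post = bayes_discrete(priors, likelihoods)
# The posterior still sums to 1; suspect A now carries the most probability.
```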
Continuous Case: Weighing an Object
Estimate the true weight of an object. Prior: N(50, 15²). Each measurement adds noise (σ=3 kg). The posterior narrows and shifts toward the true weight.
Defaults: true weight 73.2 kg; prior N(50, 15²); initial posterior N(50.0, 15.0²). The chart plots the Prior, the Posterior, the true weight, and the individual measurements. A summary panel reports the number of measurements (initially 0), the posterior mean (initially 50.00 kg), and the uncertainty σ (initially ±15.00 kg).
P(W | measurements) = P(measurements | W) · P(W) / P(measurements)
No measurements yet — click Measure to start.
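The weighing example uses the Normal-Normal conjugate pair; the standard precision-weighting formulas make a one-measurement update a few lines (a sketch, not the app's code):

```python
def normal_update(mu0, sigma0, x, sigma):
    """Combine prior N(mu0, sigma0^2) with one reading x ~ N(W, sigma^2)."""
    prec0 = 1.0 / sigma0 ** 2   # prior precision
    prec_x = 1.0 / sigma ** 2   # measurement precision
    prec_post = prec0 + prec_x  # precisions add
    mu_post = (prec0 * mu0 + prec_x * x) / prec_post  # precision-weighted mean
    return mu_post, prec_post ** -0.5

# Prior N(50, 15^2), one 70.0 kg reading with sigma = 3 kg:
mu, sd = normal_update(50.0, 15.0, 70.0, 3.0)
# The mean jumps most of the way to the data, and sd drops below 3 kg.
```

Calling the function again with each new reading implements the same posterior-becomes-prior cycle as the Beta-Binomial case.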
How does this connect? All three cases use the same Bayes’ rule: P(H|D) = P(D|H) · P(H) / P(D).
• Discrete (suspects): H = “suspect i did it” — probabilities sum to 1 across 5 options.
• Continuous (weight): H = “true weight is W” — described by a Normal PDF, updated with a Normal likelihood.
• Beta-Binomial (Theory / Example 1): H = “true defect rate is θ” — described by a Beta PDF, updated with a Bernoulli likelihood.
We aren’t asking “will the next item be defective?” We are asking “what is the true underlying probability θ?”, where θ can be any value from 0.00 to 1.00.
Tab 1 — Theory
Use the controls to explore how the prior is updated by evidence.
Tab 2 — Example 1 (Factory Quality Inspection)
A realistic simulation that demonstrates Bayesian inference in action. You receive products from a new supplier and need to estimate the defect rate — a real-world problem with no strong "50/50" prior intuition.
Tab 3 — Example 2 (Discrete + Continuous)
Demonstrates that Bayes' rule applies to any hypothesis given data — not just the Beta-Binomial case. It pairs a Discrete Case (Crime Investigation) with a Continuous Case (Weighing an Object).
The connection: All three tabs use the same Bayes' rule P(H|D) = P(D|H)·P(H)/P(D). The only thing that changes is the type of distribution used (discrete probabilities, Normal PDF, or Beta PDF).