Web Simulation

Bayesian Inference - Beta-Binomial Updating Tutorial 

This interactive tutorial visualizes Bayesian inference using the Beta-Binomial model: how a Prior belief is updated by Evidence (the Likelihood) to produce a Posterior distribution. You watch the probability mass shift as you add successes or failures (e.g. coin flips).

The chart shows three curves: Prior (blue) — your initial belief about the probability of success; Likelihood (orange) — the data you observed (scaled for visibility); Posterior (purple, thicker) — the updated belief after combining prior and data. Sliders set the prior strength (Prior Successes α and Prior Failures β). Buttons add one success or one failure. The posterior updates via the conjugate rule: α_post = α0 + successes, β_post = β0 + failures. A strong prior (large α0, β0) resists new data; a weak prior moves quickly toward the data.

Mathematical foundation

1. Bayes' rule

Posterior ∝ Prior × Likelihood. For a probability p of success (e.g. coin fairness), we start with a prior distribution over p, observe data (successes and failures), and compute the posterior distribution that combines both.

2. Beta distribution (Prior and Posterior)

The Beta(α, β) distribution describes a belief over p in [0, 1]. Its PDF is f(p; α, β) ∝ p^(α−1) (1−p)^(β−1), normalized by the Beta function B(α, β). Mean = α/(α+β); Mode = (α−1)/(α+β−2) when α, β > 1. We use a log-Gamma implementation so the PDF remains numerically stable for large α, β.
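To illustrate the log-Gamma point, here is a minimal Python sketch (not the tutorial's own code) of a numerically stable Beta PDF built on the standard library's `lgamma`:

```python
import math

def beta_pdf(p, a, b):
    """Beta(a, b) density at p, evaluated in log space so large a, b don't overflow."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    # log B(a, b) = lgamma(a) + lgamma(b) - lgamma(a + b)
    log_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return math.exp((a - 1.0) * math.log(p) + (b - 1.0) * math.log(1.0 - p) - log_norm)
```

Working in log space avoids overflowing Γ(α) for large α: `beta_pdf(0.5, 2, 2)` returns 1.5, and `beta_pdf(0.5, 5000, 5000)` still evaluates cleanly where a naive Γ-based formula would overflow.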

3. Conjugate update (Beta-Binomial)

For Bernoulli trials (e.g. coin flips), the Binomial likelihood has the form p^s (1−p)^f (s = successes, f = failures). The Beta distribution is the conjugate prior: if the Prior is Beta(α0, β0) and you observe s successes and f failures, the Posterior is Beta(α0 + s, β0 + f). No numerical integration is needed — the update is exact and fast.
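The conjugate update is literally one addition per parameter; a minimal sketch:

```python
def beta_update(alpha0, beta0, s, f):
    """Conjugate Beta-Binomial update: prior Beta(alpha0, beta0) plus
    s successes and f failures gives the exact posterior Beta parameters."""
    return alpha0 + s, beta0 + f

# Prior Beta(2, 2), then 8 successes and 2 failures:
a, b = beta_update(2, 2, 8, 2)      # posterior Beta(10, 4)
posterior_mean = a / (a + b)        # 10/14, pulled toward the observed 8/10
```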

4. Likelihood curve

The Likelihood (orange) is the Binomial likelihood of the observed data as a function of p. It is scaled so its peak is visible alongside the Prior and Posterior. Its maximum is at the sample proportion s/(s+f). As you add data, you see the Prior "pulled" toward the Likelihood to form the Posterior.
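A quick way to check the s/(s+f) claim is to scan the likelihood over a grid (a small Python sketch with made-up counts s = 3, f = 1):

```python
def likelihood(p, s, f):
    """Unnormalized Binomial likelihood of s successes and f failures at parameter p."""
    return p**s * (1.0 - p)**f

# The grid maximizer matches the sample proportion s/(s+f) = 3/4:
grid = [i / 1000 for i in range(1001)]
p_hat = max(grid, key=lambda p: likelihood(p, 3, 1))   # 0.75
```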

The Bayesian Cycle

Bayesian inference is a continuous "learning loop". The Likelihood isn’t what gets updated — the Prior is what gets updated to become the Posterior. Then, in the next round of learning, that Posterior becomes your new Prior.

  1. Prior — the “Old You”: what you believe before the experiment.
  2. Likelihood — the “Lens”: the new evidence / data you just observed.
  3. Posterior — the “New You”: the updated belief after combining the two.

The Hand-off: For the next experiment, your old Posterior becomes your new Prior. (Posterior_n → Prior_(n+1))

5. The formula as a "correction"

You can think of Bayes’ rule as a way to "correct" your initial guess using new data:

  • P(H) — the “Old You” (a priori).
  • P(D | H) / P(D) — the “Learning Factor” (how much the new data should change your mind).
  • P(H | D) — the “New You” (a posteriori).
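With made-up numbers (all three probabilities below are purely illustrative), the "correction" reads:

```python
# Hypothetical illustration: prior P(H) = 0.30, P(D | H) = 0.60, P(D) = 0.45.
prior = 0.30
learning_factor = 0.60 / 0.45          # P(D | H) / P(D) = 4/3 > 1: data favors H
posterior = prior * learning_factor    # P(H | D) = 0.40: belief revised upward
```

A learning factor above 1 pushes the belief up, below 1 pushes it down, and exactly 1 (data equally likely under all hypotheses) leaves it unchanged.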

6. Why the Likelihood isn’t "updated"

In Bayesian terms, the Likelihood is a fixed mathematical function that describes the data-gathering process (e.g. “if the defect rate is 12%, how likely is it that I saw 3 defects?”). It doesn’t change — it is the lens through which you view the data to update your belief. Only the Prior → Posterior transition represents "learning." The Likelihood is always determined by the experiment design and the data you actually observed.

The chart panel summarizes the three distributions and their formulas:

  • Bayes' rule: P(p | data) = P(data | p) · P(p) / P(data)
  • Prior — P(p): f(p; α, β) = p^(α−1) (1−p)^(β−1) / B(α, β)
  • Likelihood — P(data | p): L(p | data) = p^s (1−p)^f
  • Posterior — P(p | data): Beta(α + s, β + f)

A summary panel beneath the chart reports the posterior mean, posterior mode, and 95% credible interval for the current α, β, and evidence counts.
Scenario: Factory Quality Inspection

A new supplier ships you products. You don’t know the defect rate — it could be 1% or 30%. Inspect items one-by-one and watch the Bayesian posterior zero in on the true defect rate.

Controls: the true defect rate (default 8%) and the prior parameters α (default 2) and β (default 10). The chart plots the Prior and Posterior Beta PDFs with the true defect rate marked, alongside an observed-frequency histogram of Defective vs Good items. A summary panel tracks the total inspected, observed defect rate, posterior mean, posterior mode, and 95% credible interval.
Bayes’ Theorem — Step-by-Step Calculation

  P(D | obs) = P(obs | D) · P(D) / P(obs)
  where D = defect rate and obs = the inspection outcome (defective or good)

After each inspection, the panel shows the step-by-step Bayes calculation, and a full step history plus an inspection log record every update.
Bayesian logic applies to any Hypothesis (H) given Data (D).
Discrete case — the hypothesis is one of a finite set (e.g. “which of 5 suspects?”).
Continuous case — the hypothesis is a real number (e.g. “what is the true weight?”).
In the Theory and Example 1 tabs we used the Beta distribution because we were estimating a continuous probability θ ∈ [0, 1]. Here we show both cases side-by-side.
Discrete Case: Crime Investigation

5 suspects (A–E), each starts at 20% probability. Add evidence to see how Bayes’ rule redistributes the probabilities.

P(Hi | D) = P(D | Hi) · P(Hi) / P(D)
Continuous Case: Weighing an Object

Estimate the true weight of an object. Prior: N(50, 15²). Each measurement adds noise (σ=3 kg). The posterior narrows and shifts toward the true weight.

Controls: the true weight (e.g. 73.2 kg). The chart plots the Prior N(50, 15²), the Posterior, the true weight, and the individual measurements; a summary panel tracks the number of measurements, the posterior mean, and the uncertainty (σ).

  P(W | measurements) = P(measurements | W) · P(W) / P(measurements)
How does this connect? All three cases use the same Bayes’ rule: P(H|D) = P(D|H) · P(H) / P(D).

  • Discrete (suspects): H = “suspect i did it” — probabilities sum to 1 across 5 options.
  • Continuous (weight): H = “true weight is W” — described by a Normal PDF, updated with a Normal likelihood.
  • Beta-Binomial (Theory / Example 1): H = “true defect rate is θ” — described by a Beta PDF, updated with a Bernoulli likelihood. We aren’t asking “will the next item be defective?” but “what is the true underlying probability θ?”, where θ can be any value from 0.00 to 1.00.

 

Tab 1 — Theory

Use the controls to explore how the prior is updated by evidence:

  1. Set the Prior: Use the sliders "Prior Successes (α)" and "Prior Failures (β)" to choose your initial belief. Equal values (e.g. 2, 2) give a symmetric prior centered at 0.5. Larger values make the prior "stronger" (narrower curve that resists new data).
  2. Add Data: Click "Add Success" or "Add Failure" to simulate new trials (e.g. coin flips). The Likelihood (orange) appears and the Posterior (purple) updates: it is pulled toward the data. With many successes the posterior shifts right; with many failures it shifts left.
  3. Strong vs Weak Prior: Try a strong prior (e.g. α=20, β=20) and add a few successes — the curve moves only slightly. Then reset evidence, set a weak prior (α=2, β=2), add the same number of successes — the posterior moves much more. This illustrates how prior strength affects the update.
  4. Summary Panel: Posterior mean, mode, and 95% credible interval update in real time. The "Live Insight" text explains the current belief in plain language.
  5. Reset evidence: Clears successes and failures; the prior is unchanged. Use this to try different data sequences with the same prior.
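The statistics in step 4 can be reproduced with a simple grid approximation (a sketch; the app may compute its credible interval differently):

```python
import math

def beta_summary(a, b, n=20000):
    """Mean, mode (defined for a, b > 1), and equal-tailed 95% credible
    interval of Beta(a, b), via a Riemann-sum approximation of the CDF."""
    mean = a / (a + b)
    mode = (a - 1) / (a + b - 2) if a > 1 and b > 1 else None
    log_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    dp = 1.0 / n
    cdf, lo, hi = 0.0, 0.0, 1.0
    for i in range(1, n):
        p = i * dp
        cdf += math.exp((a - 1) * math.log(p) + (b - 1) * math.log(1 - p) - log_norm) * dp
        if cdf < 0.025:   # last p below the 2.5% quantile
            lo = p
        if cdf < 0.975:   # last p below the 97.5% quantile
            hi = p
    return mean, mode, (lo, hi)
```

For example, `beta_summary(1, 1)` (a flat prior) gives mean 0.5 and an interval near [0.025, 0.975], while `beta_summary(2, 2)` gives mean and mode 0.5 with a narrower, symmetric interval.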

Tab 2 — Example 1 (Factory Quality Inspection)

A realistic simulation that demonstrates Bayesian inference in action. You receive products from a new supplier and need to estimate the defect rate — a real-world problem with no strong "50/50" prior intuition.

  1. Set True Defect Rate: The actual defect rate of the supplier (hidden from the inspector). Default is 8%. Range 1%–50%. This is the value the posterior should converge to.
  2. Set Prior α, β: The starting prior. Default is Beta(2, 10) — a mildly informed prior meaning "based on past experience, defect rates are usually around 10–20%" (prior mean ≈ 17%). The posterior will be pulled down toward the true 8% as evidence accumulates. Try Beta(1, 1) for a completely flat prior, or Beta(5, 5) for a strong belief that the rate is near 50%.
  3. Inspect items: Click "Inspect 1", "Inspect 10", or "Inspect 100" to draw random items from the supplier. Or click "Auto" to watch continuous inspection (stops at 500 items).
  4. Posterior PDF chart: Shows the Prior (blue) and Posterior (purple, shaded) Beta distributions. A red dashed vertical line marks the true defect rate. Watch the posterior narrow and converge as evidence accumulates.
  5. Histogram: Shows the observed relative frequencies of Defective vs Good items — the frequentist estimate of the defect rate.
  6. Inspection log: A scrollable record of every inspection: D = Defective, G = Good.
  7. Reset: Clears all inspections and resets to the prior.
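Steps 1-3 can be simulated end to end; a sketch with an arbitrary seed (the app draws its own random items):

```python
import random

def run_inspections(true_rate, n, a0=2, b0=10, seed=42):
    """Draw n items, each defective with probability true_rate, and return
    the posterior Beta parameters starting from the Beta(a0, b0) prior."""
    rng = random.Random(seed)
    defects = sum(1 for _ in range(n) if rng.random() < true_rate)
    return a0 + defects, b0 + (n - defects)

a, b = run_inspections(0.08, 500)      # 500 inspections at a true 8% rate
posterior_mean = a / (a + b)           # lands near 0.08 as evidence accumulates
```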

Tab 3 — Example 2 (Discrete + Continuous)

Demonstrates that Bayes' rule applies to any hypothesis given data — not just the Beta-Binomial case.

Discrete Case — Crime Investigation:

  • 5 suspects (A–E), each starting with a uniform 20% prior probability.
  • Select a suspect and an evidence type (Witness ID, Forensic match, or Alibi confirmed), then click "Add Evidence".
  • Witness ID for suspect X: P(evidence | X) = 0.80, P(evidence | other) = 0.05 — strongly points to X.
  • Forensic match for suspect X: P(evidence | X) = 0.90, P(evidence | other) = 0.025 — very strong evidence for X.
  • Alibi confirmed for suspect X: P(evidence | X) = 0.02, P(evidence | other) ≈ 0.245 — nearly eliminates X.
  • The equation panel shows the Bayes calculation for all 5 suspects at each step.
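The evidence rules above plug directly into the discrete Bayes formula; a minimal sketch using the Witness ID numbers:

```python
def bayes_discrete(priors, likelihoods):
    """One discrete Bayes step: posterior_i is proportional to
    P(D | H_i) * P(H_i), renormalized so the posteriors sum to 1."""
    unnorm = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(unnorm)                 # P(D), the normalizing constant
    return [u / total for u in unnorm]

# Witness ID for suspect A, with the likelihoods from the bullet list:
posterior = bayes_discrete([0.2] * 5, [0.80, 0.05, 0.05, 0.05, 0.05])
# Suspect A jumps from 20% to 80%; the other four drop to 5% each.
```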

Continuous Case — Weighing an Object:

  • Estimate the true weight of an object using noisy measurements.
  • Prior: N(50, 15²) — broad initial guess. Measurement noise: σ = 3 kg.
  • Uses the Normal-Normal conjugate update: Posterior is also Normal, getting narrower with each measurement.
  • Orange dots at the bottom of the chart show individual measurements.
  • The red dashed line marks the true weight.
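The Normal-Normal update in the third bullet is a precision-weighted average (precision = 1/variance); a minimal sketch:

```python
def normal_update(mu0, sigma0, x, noise_sigma):
    """One Normal-Normal conjugate update: precisions add, and the posterior
    mean is the precision-weighted average of the prior mean and the datum."""
    w0, wx = 1.0 / sigma0**2, 1.0 / noise_sigma**2
    mu = (w0 * mu0 + wx * x) / (w0 + wx)
    return mu, (1.0 / (w0 + wx)) ** 0.5

# Prior N(50, 15^2), one measurement of 73.2 kg with 3 kg noise:
mu, sigma = normal_update(50.0, 15.0, 73.2, 3.0)
# The mean moves most of the way to the data (noise sd << prior sd),
# and the posterior sd shrinks below the measurement noise.
```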

The connection: All three tabs use the same Bayes' rule P(H|D) = P(D|H)·P(H)/P(D). The only thing that changes is the type of distribution used (discrete probabilities, Normal PDF, or Beta PDF).

Visualizations

  • Prior (blue): Initial belief before data (Beta, Normal, or uniform bars).
  • Likelihood (orange): (Theory tab) Binomial likelihood of the observed data.
  • Posterior (purple): Updated belief after combining prior and data.
  • True value (red dashed): Ground-truth for comparison (Example 1 and 2).

Summary panel

  • Posterior mean: Point estimate for the unknown parameter.
  • Posterior mode: Peak of the posterior distribution.
  • 95% credible interval / σ: Uncertainty measure (Bayesian analogue of a confidence interval).