This interactive tutorial visualizes Bayesian inference using the Beta-Binomial model: how a Prior belief is updated by Evidence (Likelihood) to produce a Posterior distribution. You literally "see" the probability shift as you add successes or failures (e.g. coin flips).

The chart shows three curves:
• Prior (blue) — your initial belief about the probability of success.
• Likelihood (orange) — the data you observed (scaled for visibility).
• Posterior (purple, thicker) — the updated belief after combining prior and data.

Sliders set the prior strength (Prior Successes α and Prior Failures β). Buttons add one success or one failure. The posterior updates via the conjugate rule: α_post = α₀ + successes, β_post = β₀ + failures. A strong prior (large α₀, β₀) resists new data; a weak prior moves quickly toward the data.

Mathematical foundation

1. Bayes' rule
Posterior ∝ Prior × Likelihood. For a probability p of success (e.g. coin fairness), we start with a prior distribution over p, observe data (successes and failures), and compute the posterior distribution that combines both.

2. Beta distribution (Prior and Posterior)
The Beta(α, β) distribution describes a belief over p in [0, 1]. Its PDF is f(p; α, β) ∝ p^(α−1) (1−p)^(β−1), normalized by the Beta function B(α, β). Mean = α/(α+β); Mode = (α−1)/(α+β−2) when α, β > 1. We use a log-Gamma implementation so the PDF remains numerically stable for large α, β.

3. Conjugate update (Beta-Binomial)
For Bernoulli trials (e.g. coin flips), the Binomial likelihood has the form p^s (1−p)^f (s = successes, f = failures). The Beta distribution is the conjugate prior: if the Prior is Beta(α₀, β₀) and you observe s successes and f failures, the Posterior is Beta(α₀ + s, β₀ + f). No numerical integration is needed — the update is exact and fast.

4. Likelihood curve
The Likelihood (orange) is the Binomial likelihood of the observed data as a function of p. It is scaled so its peak is visible alongside the Prior and Posterior. Its maximum is at the sample proportion s/(s+f). As you add data, you see the Prior "pulled" toward the Likelihood to form the Posterior.

The Bayesian Cycle
Bayesian inference is a continuous "learning loop". The Likelihood isn’t what gets updated — the Prior is what gets updated to become the Posterior. Then, in the next round of learning, that Posterior becomes your new Prior.
The Hand-off: For the next experiment, your old Posterior becomes your new Prior (Posteriorₙ → Priorₙ₊₁).

5. The formula as a "correction"
You can think of Bayes’ rule as a way to "correct" your initial guess using new data: Posterior = Prior × Likelihood / Evidence, i.e. P(p | data) = P(data | p) · P(p) / P(data).
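The conjugate update and the posterior-to-prior hand-off fit in a few lines. A minimal sketch using plain (α, β) tuples, not the app's actual code:

```python
def update(prior, successes, failures):
    """Beta-Binomial conjugate update: Beta(a0 + s, b0 + f)."""
    a0, b0 = prior
    return (a0 + successes, b0 + failures)

# Round 1: weak Beta(2, 2) prior, then 7 successes and 3 failures.
posterior = update((2, 2), 7, 3)       # (9, 5)
# Hand-off: the old posterior becomes the new prior for round 2.
posterior2 = update(posterior, 1, 4)   # (10, 9)
```

Because the update is just addition, feeding the data in one batch or one observation at a time yields the same posterior.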
6. Why the Likelihood isn’t "updated"
In Bayesian terms, the Likelihood is a fixed mathematical function that describes the data-gathering process (e.g. “if the defect rate is 12%, how likely is it that I saw 3 defects?”). It doesn’t change — it is the lens through which you view the data to update your belief. Only the Prior → Posterior transition represents "learning." The Likelihood is always determined by the experiment design and the data you actually observed.
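To see that the Likelihood stays fixed while only the belief moves, you can run Bayes' rule by brute force on a grid of p values and compare with the exact conjugate answer. A sketch, not the app's implementation:

```python
def grid_posterior_mean(a0, b0, s, f, n=20001):
    """Posterior mean of p via numerical Bayes: prior × likelihood, normalized."""
    num = den = 0.0
    for i in range(1, n):
        p = i / n
        prior = p ** (a0 - 1) * (1 - p) ** (b0 - 1)  # Beta(a0, b0), unnormalized
        lik = p ** s * (1 - p) ** f                  # fixed likelihood of the data
        w = prior * lik                              # posterior weight at this p
        num += p * w
        den += w
    return num / den

# Conjugate answer: Beta(2 + 7, 2 + 3) has mean 9/14 ≈ 0.6429.
print(grid_posterior_mean(2, 2, 7, 3))
```

The likelihood function itself is never modified; it only reweights the prior before normalization.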
P(p | data) = P(data | p) · P(p) / P(data)
Prior — P(p): f(p; α, β) = p^(α−1) (1−p)^(β−1) / B(α, β)
Likelihood — P(data | p): L(p | data) = p^s (1−p)^f
Posterior — P(p | data): Beta(α + s, β + f)
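The densities above can be evaluated directly. A minimal stand-alone sketch in Python (helper names are mine, not the app's), using math.lgamma for the log-Gamma stability trick mentioned in the Theory text:

```python
from math import exp, lgamma, log

def beta_pdf(p, a, b):
    """Beta(a, b) density at p, computed in log space for numerical stability."""
    if not 0.0 < p < 1.0:
        return 0.0
    log_norm = lgamma(a + b) - lgamma(a) - lgamma(b)  # = -log B(a, b)
    return exp(log_norm + (a - 1.0) * log(p) + (b - 1.0) * log(1.0 - p))

def beta_mean(a, b):
    return a / (a + b)

def beta_mode(a, b):
    # Mode is only defined when a, b > 1.
    return (a - 1) / (a + b - 2) if a > 1 and b > 1 else None
```

Working in log space matters: for large α and β the individual Gamma factors overflow a float long before the density itself does.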
The controls default to α = 2, β = 2 with 0 successes and 0 failures; the summary panel reports the posterior mean, posterior mode, and 95% credible interval.
Scenario: Factory Quality Inspection
A new supplier ships you products. You don’t know the defect rate — it could be 1% or 30%. Inspect items one-by-one and watch the Bayesian posterior zero in on the true defect rate.
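The inspection loop can be simulated in a few lines. A sketch assuming the scenario defaults (true rate 8%, prior Beta(2, 10)); the function name is made up for illustration:

```python
import random

def simulate_inspections(true_rate, n, a0, b0, seed=1):
    """Inspect n items; each defect bumps alpha, each good item bumps beta."""
    rng = random.Random(seed)
    a, b = a0, b0
    for _ in range(n):
        if rng.random() < true_rate:  # item is defective
            a += 1
        else:                         # item is good
            b += 1
    return a, b

a, b = simulate_inspections(0.08, 200, 2, 10)
# Posterior mean a / (a + b) should land near the true 8% defect rate.
```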
Simulation defaults: true defect rate 8%; prior Beta(2, 10). The chart plots the Prior, the Posterior (initially Beta(2, 10)), and a marker at the true defect rate. An Observed Frequency Histogram counts Defective vs. Good items, and a summary panel reports the total inspected, the observed defect rate, the posterior mean, the posterior mode, and the 95% credible interval.
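The summary panel's numbers can be reproduced with a short script. A sketch that approximates the 95% credible interval by Monte Carlo with random.betavariate (the app may well compute the interval differently):

```python
import random

def summarize(a, b, n_samples=200_000, seed=0):
    """Posterior mean, mode, and approximate 95% credible interval for Beta(a, b)."""
    mean = a / (a + b)
    mode = (a - 1) / (a + b - 2) if a > 1 and b > 1 else None
    rng = random.Random(seed)
    draws = sorted(rng.betavariate(a, b) for _ in range(n_samples))
    lo = draws[int(0.025 * n_samples)]  # 2.5% sample quantile
    hi = draws[int(0.975 * n_samples)]  # 97.5% sample quantile
    return mean, mode, (lo, hi)

# Prior Beta(2, 10) after observing 3 defective and 37 good items:
mean, mode, ci = summarize(2 + 3, 10 + 37)
```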
Bayes’ Theorem — Step-by-Step Calculation
P(D | obs) = P(obs | D) · P(D) / P(obs)
where D = defect rate, obs = inspection outcome (defective or good)
No inspections yet — inspect an item to see the step-by-step Bayes calculation.
Full Step History
Inspection Log
No inspections yet
Bayesian logic applies to any Hypothesis (H) given Data (D).
Discrete case — the hypothesis is one of a finite set (e.g. “which of 5 suspects?”). Continuous case — the hypothesis is a real number (e.g. “what is the true weight?”). In the Theory and Example 1 tabs we used the Beta distribution because we were estimating a continuous probability θ ∈ [0, 1]. Here we show both cases side by side.

Discrete Case: Crime Investigation
5 suspects (A–E), each starting at 20% probability. Add evidence to see how Bayes’ rule redistributes the probabilities.
P(Hᵢ | D) = P(D | Hᵢ) · P(Hᵢ) / P(D)
No evidence yet — add evidence to see the Bayes calculation.
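One round of the discrete update can be sketched as below; the evidence likelihoods are made-up numbers for illustration, not values from the app:

```python
def bayes_discrete(priors, likelihoods):
    """P(Hi | D) = P(D | Hi) · P(Hi) / P(D), where P(D) sums the numerators."""
    numer = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(numer)  # P(D): total probability of the evidence
    return [x / evidence for x in numer]

priors = [0.2] * 5                       # suspects A-E, uniform prior
likelihoods = [0.9, 0.5, 0.5, 0.1, 0.1]  # hypothetical P(evidence | suspect)
post = bayes_discrete(priors, likelihoods)
# The posterior still sums to 1; suspect A now carries the most probability.
```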
Continuous Case: Weighing an Object
Estimate the true weight of an object. Prior: N(50, 15²). Each measurement adds noise (σ=3 kg). The posterior narrows and shifts toward the true weight.
Defaults: true weight 73.2 kg; prior N(50, 15²); initial posterior N(50.0, 15.0²). The chart plots the Prior, the Posterior, the true weight, and the individual measurements. A summary panel reports the number of measurements (initially 0), the posterior mean (initially 50.00 kg), and the uncertainty σ (initially ±15.00 kg).
P(W | measurements) = P(measurements | W) · P(W) / P(measurements)
No measurements yet — click Measure to start.
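The weighing example uses the Normal-Normal conjugate pair; the standard precision-weighting formulas make a one-measurement update a few lines (a sketch, not the app's code):

```python
def normal_update(mu0, sigma0, x, sigma):
    """Combine prior N(mu0, sigma0^2) with one reading x ~ N(W, sigma^2)."""
    prec0 = 1.0 / sigma0 ** 2   # prior precision
    prec_x = 1.0 / sigma ** 2   # measurement precision
    prec_post = prec0 + prec_x  # precisions add
    mu_post = (prec0 * mu0 + prec_x * x) / prec_post  # precision-weighted mean
    return mu_post, prec_post ** -0.5

# Prior N(50, 15^2), one 70.0 kg reading with sigma = 3 kg:
mu, sd = normal_update(50.0, 15.0, 70.0, 3.0)
# The mean jumps most of the way to the data, and sd drops below 3 kg.
```

Calling the function again with each new reading implements the same posterior-becomes-prior cycle as the Beta-Binomial case.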
How does this connect? All three cases use the same Bayes’ rule: P(H|D) = P(D|H) · P(H) / P(D).
• Discrete (suspects): H = “suspect i did it” — probabilities sum to 1 across 5 options.
• Continuous (weight): H = “true weight is W” — described by a Normal PDF, updated with a Normal likelihood.
• Beta-Binomial (Theory / Example 1): H = “true defect rate is θ” — described by a Beta PDF, updated with a Bernoulli likelihood.
We aren’t asking “will the next item be defective?” We are asking “what is the true underlying probability θ?”, where θ can be any value from 0.00 to 1.00.
Tab 1 — Theory
Use the controls to explore how the prior is updated by evidence.
Tab 2 — Example 1 (Factory Quality Inspection)
A realistic simulation that demonstrates Bayesian inference in action. You receive products from a new supplier and need to estimate the defect rate — a real-world problem with no strong "50/50" prior intuition.
Tab 3 — Example 2 (Discrete + Continuous)
Demonstrates that Bayes' rule applies to any hypothesis given data — not just the Beta-Binomial case. It pairs a Discrete Case (Crime Investigation) with a Continuous Case (Weighing an Object).
The connection: All three tabs use the same Bayes' rule P(H|D) = P(D|H)·P(H)/P(D). The only thing that changes is the type of distribution used (discrete probabilities, Normal PDF, or Beta PDF).