Web Simulation

ANOVA (Analysis of Variance) Interactive Visualizer 

This interactive tutorial visualizes One-Way ANOVA (Analysis of Variance). The main purpose of ANOVA is to determine whether there is a statistically significant difference between the means of three or more independent groups. While a t-test works well for comparing two groups (e.g. “Treatment vs. Control”), running multiple t-tests on three or more groups inflates the overall false-positive rate. ANOVA avoids this by testing the entire dataset at once.

1. The Core Purpose: Signal vs. Noise

ANOVA asks a simple question: Is the variation between the groups larger than the variation within the groups?

  • Between-Group Variance (The Signal): How different are the group averages from each other?
  • Within-Group Variance (The Noise): How much do individual data points spread out inside their own groups?

If the “Signal” is significantly stronger than the “Noise,” ANOVA concludes that at least one group mean is likely different from the others. The null hypothesis H0 states that all group means are equal: μ1 = μ2 = … = μk.

2. Primary Usages

ANOVA is a staple in research and engineering because it handles complex experimental designs. Each use case below has a fully interactive example in the simulator:

  • Clinical Trials (Example 1): Comparing the effectiveness of three different drug dosages (5 mg vs. 10 mg vs. 20 mg).
  • Agriculture (Example 2): Testing if four different fertilizers produce different crop yields.
  • Manufacturing (Example 3): Determining if three CNC machines produce parts with the same average thickness.
  • User Experience (UX) (Example 4): Comparing the time it takes users to complete a checkout task using three different website layouts.

3. Why Not Just Use Multiple t-Tests?

If you have 3 groups (A, B, C) and use t-tests to compare A–B, B–C, and A–C, you run into the Problem of Multiple Comparisons. At α = 0.05, each test carries a 5% chance of a “False Positive” (Type I Error). Across three independent tests, the chance of at least one false positive rises to about 14% (1 − 0.95³ ≈ 0.143). ANOVA keeps the overall risk at 5% by performing one single “Omnibus” test.
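The arithmetic behind that ~14% figure is easy to reproduce. A minimal Python sketch (the test counts are illustrative):

```python
# Familywise error rate: probability of at least one false positive
# across m independent tests, each run at alpha = 0.05.
alpha = 0.05

for m in (1, 3, 6):  # 1 test; 3 pairwise tests (3 groups); 6 pairwise tests (4 groups)
    fwer = 1 - (1 - alpha) ** m
    print(f"{m} tests -> familywise error rate {fwer:.1%}")
```

For three pairwise tests the familywise rate is 1 − 0.95³ ≈ 14.3%, matching the figure quoted above; with four groups (six pairs) it climbs past 26%.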

4. Partitioning Variance

Total variation is split into two components:

SSTotal = Σᵢ Σⱼ (xᵢⱼ − x̄)²   (total deviation of every point from the grand mean)
SSBetween = Σᵢ nᵢ (x̄ᵢ − x̄)²   (deviation of each group mean from the grand mean, the “signal”)
SSWithin = Σᵢ Σⱼ (xᵢⱼ − x̄ᵢ)²   (deviation of each point from its own group mean, the “noise”)

SSTotal = SSBetween + SSWithin
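The partition identity can be checked numerically. A minimal sketch in plain Python, using three small hypothetical groups:

```python
# Three hypothetical groups of three observations each.
groups = [[23, 25, 27], [30, 32, 34], [38, 40, 42]]

all_values = [x for g in groups for x in g]
grand_mean = sum(all_values) / len(all_values)
group_means = [sum(g) / len(g) for g in groups]

# SSTotal: every point vs. the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_values)
# SSBetween: each group mean vs. the grand mean, weighted by group size.
ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
# SSWithin: every point vs. its own group mean.
ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

# The two quantities agree up to floating-point rounding.
print(ss_total, ss_between + ss_within)
```

With these numbers SSBetween = 338, SSWithin = 24, and SSTotal = 362 = 338 + 24.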

5. Mean Squares and the F-Statistic

We normalize the sums of squares by their degrees of freedom to get Mean Squares:

MSBetween = SSBetween / (k − 1)
MSWithin = SSWithin / (N − k)

The F-statistic is the ratio:

F = MSBetween / MSWithin = Signal / Noise

A large F means the between-group differences are much bigger than the within-group scatter — evidence that the group means are truly different.
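Continuing the same arithmetic, a minimal sketch (the sums of squares are hypothetical, chosen for easy mental checking):

```python
# Hypothetical design: k = 3 groups, N = 9 observations total.
k, n_total = 3, 9
ss_between, ss_within = 338.0, 24.0     # made-up sums of squares

ms_between = ss_between / (k - 1)       # 338 / 2 = 169.0
ms_within = ss_within / (n_total - k)   # 24 / 6  = 4.0
f_stat = ms_between / ms_within         # 169 / 4 = 42.25, a very strong signal
```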

6. Significance Testing

Under H0, the F-statistic follows an F-distribution with degrees of freedom (k − 1, N − k). If the computed F exceeds the critical value (e.g. Fcrit ≈ 3.89 for df(2, 12) at α = 0.05), we reject H0 and conclude that at least one group mean differs from the others.
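The critical value and p-value can be computed programmatically, assuming SciPy is available (the observed F of 5.0 below is hypothetical):

```python
from scipy.stats import f  # SciPy's F-distribution; assumed installed

df1, df2 = 2, 12                  # df between = k - 1, df within = N - k
f_crit = f.ppf(0.95, df1, df2)    # critical value at alpha = 0.05, about 3.89
p_value = f.sf(5.0, df1, df2)     # p-value for a hypothetical observed F of 5.0

print(f"F_crit = {f_crit:.2f}, p = {p_value:.4f}")
```

Since 5.0 exceeds ≈3.89, the p-value falls below 0.05 and H0 would be rejected.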

7. Effect Size (η²)

Eta-squared measures what fraction of total variance is explained by group membership:

η² = SSBetween / SSTotal

η² near 1 means the groups explain almost all variation; near 0 means the groups explain very little.
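In code this is a one-liner. A sketch with hypothetical sums of squares:

```python
# Hypothetical sums of squares for illustration.
ss_between, ss_within = 338.0, 24.0
ss_total = ss_between + ss_within

eta_squared = ss_between / ss_total   # 338 / 362, about 0.93
print(f"eta^2 = {eta_squared:.3f}")   # group membership explains ~93% of the variance
```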

8. What ANOVA Does Not Tell You

ANOVA is an “omnibus” test. If the result is significant, it tells you: “At least one of these groups is different!” However, it does not tell you which specific group is the outlier. To find that out, you must follow up with Post-Hoc Tests (such as Tukey’s HSD or Bonferroni correction), which are specifically designed to safely “drill down” into pairwise comparisons while controlling the overall error rate.

9. The “Squares” Visualization

In the simulation below, each “Sum of Squares” term is shown as a literal square on the canvas. The side length equals the distance from a point to its reference line. Colored squares (within) connect each point to its group mean. Dashed blue squares (between) connect each group mean to the grand mean. ANOVA compares the total area of the between-squares to the total area of the within-squares.

[Interactive controls and live readouts: points-per-group slider, F-statistic gauge with p-value and significance verdict, ANOVA summary table (Source, SS, df, MS, F for Between/Within/Total), legend for the grand mean, group means A–C, and η² effect size.]
How to read this: Each column is a group (A, B, C). Drag the colored circles up/down to change data values. The horizontal colored lines show each group’s mean; the dashed gray line is the grand mean.

Variance Squares: Colored filled squares show within-group variance (noise) — the distance from each point to its group mean. Blue dashed squares show between-group variance (signal) — the distance from each group mean to the grand mean.

F = Signal / Noise: When groups are far apart (large blue squares) and points are tightly clustered (small colored squares), F is large and the result is significant. When groups overlap, F drops and the result is not significant.

Yellow line on the F-bar marks Fcrit ≈ 3.89 (for α=0.05 with df(2,12)). The p-value adjusts dynamically based on actual degrees of freedom.
Scenario — Clinical Trial

A pharmaceutical company is testing a new pain-relief drug. They randomly assign 15 patients to three dosage groups (5 mg, 10 mg, and 20 mg) and measure each patient’s symptom improvement score on a 0–50 scale after 4 weeks of treatment.

Research question: Does the dosage level significantly affect symptom improvement?

[Interactive panel: editable raw-data table (Patient × dosage: 5 mg, 10 mg, 20 mg), live F-statistic and p-value readout, ANOVA summary table, step-by-step calculation, and interpretation.]
Scenario — Agriculture Field Trial

An agricultural research station wants to know whether four different fertilizers (Organic, Nitrogen-Rich, Phosphorus-Rich, and a balanced NPK blend) produce different crop yields. They randomly assign 24 test plots (6 per fertilizer) and measure the yield in bushels per acre after one growing season.

Research question: Does the type of fertilizer significantly affect crop yield?

[Interactive panel: editable raw-data table (Plot × fertilizer: Organic, Nitrogen, Phosphorus, NPK Blend), live F-statistic and p-value readout, ANOVA summary table, step-by-step calculation, and interpretation.]
Scenario — Manufacturing Quality Control

A factory operates three CNC machines (Machine 1, Machine 2, Machine 3) that stamp metal parts to a target thickness of 5.00 mm. Quality engineers randomly sample 8 parts per machine and measure the actual thickness to determine whether the machines are producing parts with the same average thickness.

Research question: Do the three machines produce parts with significantly different mean thicknesses?

[Interactive panel: editable raw-data table of thickness in mm (Part × machine: Machine 1, 2, 3), live F-statistic and p-value readout, ANOVA summary table, step-by-step calculation, and interpretation.]
Scenario — UX Task-Completion Time

A UX team is evaluating three website layouts (Layout A — Classic, Layout B — Modern, Layout C — Minimal) by asking 10 users per layout to complete an identical checkout task. The dependent variable is time in seconds to complete the task.

Research question: Does the website layout significantly affect the time users need to finish the checkout task?

[Interactive panel: editable raw-data table of completion times in seconds (User × layout: Classic, Modern, Minimal), live F-statistic and p-value readout, ANOVA summary table, step-by-step calculation, and interpretation.]

Application Tabs

The simulator is organized into five tabs. Each tab is fully interactive — every change updates all statistics, charts, and interpretations in real time.

Theory A sandbox with 3 groups of adjustable data points. Includes presets, a points-per-group slider, randomize/reset buttons, and a variance-squares toggle. Use this to build intuition about how ANOVA works.
Example 1: Clinical Trial Three drug dosages (5 mg, 10 mg, 20 mg) and symptom improvement scores. Demonstrates a typical biomedical comparison.
Example 2: Agriculture Four fertilizers and crop yields. Shows ANOVA with more than three groups (4 groups, 4 colors).
Example 3: Manufacturing Three CNC machines stamping parts to a 5.00 mm target thickness. Values are decimal (mm) with high precision — demonstrates ANOVA on measurement data.
Example 4: UX Three website layouts and checkout task-completion times (seconds). Demonstrates how ANOVA guides design decisions.

Interaction Methods

Every example tab supports three ways to modify data, all synced in real time:

  1. Drag data points: Click and drag any colored circle vertically to change its value. The ANOVA table, F-statistic, breakdown, and canvas update instantly.
  2. Drag mean lines: Click and drag any group’s mean line (the horizontal colored line) to shift the entire group up or down. This is the fastest way to switch between “reject H0” and “fail to reject H0” scenarios.
  3. Edit table cells: In Example tabs, click any data cell in the raw-data table to type a new value directly.

Theory Tab Controls

  • Points per group: Slider from 3 to 10. More points → more degrees of freedom → more stable F-statistic.
  • Presets: Quick-load configurations:
    • Well Separated — groups far apart → large F (significant).
    • Overlapping — groups stacked together → small F (not significant).
    • One Outlier Group — two similar groups plus one extreme → moderate F.
    • All Equal — every point is 50 → F = 0, zero variance.
  • Show Variance Squares: Toggle the geometric visualization of SSWithin (colored squares) and SSBetween (blue dashed squares). Each square’s side equals the distance from a point to its reference mean.
  • Randomize / Reset: Generates random data or returns to the default configuration.

ANOVA Table

  • SS (Sum of Squares): Total squared deviation. Between = how far group means are from the grand mean. Within = how far points scatter around their own group mean.
  • df (Degrees of Freedom): Between = k−1. Within = Nk.
  • MS (Mean Square): SS divided by df. Normalizes for group count and sample size.
  • F: The ratio MSBetween / MSWithin. Large F → group means are significantly different.

Step-by-Step Calculation Breakdown

Below the ANOVA table, each tab shows how every result is computed, step by step, with the actual data values plugged into the formulas. The seven steps are: Group Means → Grand Mean → SSBetween → SSWithin → SSTotal → Mean Squares → F-Statistic and η².
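The seven steps map directly onto a short function. A minimal sketch in plain Python (an illustration, not the simulator's actual code):

```python
def one_way_anova(groups):
    """One-way ANOVA following the seven steps:
    group means -> grand mean -> SSBetween -> SSWithin -> SSTotal
    -> mean squares -> F-statistic and eta-squared."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)

    group_means = [sum(g) / len(g) for g in groups]                   # step 1
    grand_mean = sum(x for g in groups for x in g) / n_total          # step 2
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))            # step 3
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)  # step 4
    ss_total = ss_between + ss_within                                 # step 5
    ms_between = ss_between / (k - 1)                                 # step 6
    ms_within = ss_within / (n_total - k)
    f_stat = ms_between / ms_within                                   # step 7
    eta_squared = ss_between / ss_total if ss_total > 0 else 0.0
    return f_stat, eta_squared
```

For example, `one_way_anova([[23, 25, 27], [30, 32, 34], [38, 40, 42]])` returns F = 42.25 and η² ≈ 0.93.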

F-Statistic Gauge & F-Distribution Curve

  • Gauge bar: Green when p < 0.05 (significant); red when p ≥ 0.05 (not significant). The yellow line marks the critical F-value.
  • F-distribution curve: Shown below the gauge in every tab. Displays the F-distribution PDF for the current degrees of freedom, with:
    • A shaded rejection region beyond the critical value.
    • A vertical line marking the current F-value.
    • Axis labels and distribution parameters (d1, d2).

Interpretation (Example Tabs)

Each example tab includes a plain-language interpretation section that explains:

  • Whether the result is statistically significant and what that means in everyday terms.
  • The effect size (η²) and how much of the total variance is explained by the grouping factor.
  • A domain-specific recommendation (e.g. which drug dosage is most effective, which machine needs recalibration, which layout is fastest).

Key Insights

  • Signal vs Noise: ANOVA is fundamentally a signal-to-noise ratio. The “signal” is how different the group averages are; the “noise” is how much the points scatter within each group.
  • Outlier sensitivity: Drag a single point far from its group — within-group variance explodes and F drops, even if groups are otherwise well separated.
  • Sample size effect: Increase points per group (Theory tab) and observe that the same group separation produces a larger F (more evidence).
  • Mean-line dragging: The fastest way to explore significance boundaries. Slowly drag one group’s mean toward the grand mean and watch F shrink from significant to non-significant.
  • η² interpretation: Values above 0.14 are considered large effect sizes in social sciences. Values near 0 mean group membership explains almost none of the total variance.