This interactive tutorial helps you build intuition for how the step size (learning rate η) affects gradient descent: whether it converges, oscillates, or diverges. The simulation tracks a single variable x and updates it with x ← x − η·f'(x) at each step. You can choose from six functions, each illustrating a different challenge: a smooth bowl (ideal convex), local minima (the trap), high-frequency ripples, a flat plateau (vanishing gradient), steep walls (exploding gradient), and a non-differentiable V-shape (endless bouncing). Adjust the learning rate (logarithmic scale, 0.001–1.5), optionally enable Momentum or an Auto RL method (AdaGrad, RMSprop, Adam), then use Step Fwd, Step Bwd, or Run to step through or run the descent. The main canvas shows the function curve, the tangent at the current x, the trajectory, and an update arrow (green if the loss decreased, red if the step overshot). Click on the plot to set a new starting x and reset the path. The Theory and Parameters sections below spell out the update rules and all controls.
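The three regimes can be reproduced by hand. The sketch below assumes a bowl of the form f(x) = x², which is an illustrative choice; the tutorial's exact function definitions are not given here. For this function the update reduces to x ← (1 − 2η)·x, so the iterates shrink steadily for small η, flip sign while shrinking for 0.5 < η < 1, and flip sign while growing for η > 1. The helper name `gd_path` is hypothetical.

```python
def gd_path(eta, x0=1.0, steps=5):
    """Run plain gradient descent on the assumed bowl f(x) = x**2."""
    xs = [x0]
    for _ in range(steps):
        grad = 2.0 * xs[-1]             # f'(x) = 2x for the assumed bowl
        xs.append(xs[-1] - eta * grad)  # x <- x - eta * f'(x)
    return xs

# Small, large, and too-large learning rates on the same starting point.
for eta in (0.1, 0.9, 1.1):
    print(eta, [round(x, 3) for x in gd_path(eta)])
```

With η = 0.1 every iterate stays on the same side of the minimum; with η = 0.9 the iterates alternate sides but still close in; with η = 1.1 each step lands farther away than the last.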
Theory
Plain gradient descent. We minimize f(x) by repeatedly moving opposite to the gradient: x ← x − η·f'(x). The learning rate η controls step size. Too small → slow convergence; too large → oscillation or divergence.
Momentum (Polyak). A velocity term smooths updates and can escape shallow local minima: v ← μv − η·f'(x), then x ← x + v. The coefficient μ ∈ [0, 1] damps past velocity. μ = 0 reduces to plain GD; μ close to 1 carries more history and can help on “ripply” landscapes.
Adaptive learning rate (Auto RL). These methods scale the effective step size using past gradient information, so different “directions” (here, just sign and magnitude of f') can have different step sizes. When an adaptive method is selected, its parameters appear as sliders; Momentum is disabled. All controls, including Auto RL and its parameters, apply in real time (even during Run).
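The update rules above can be sketched in a few lines. The momentum step follows the formula as stated; the Adam step uses the standard textbook form with commonly used defaults (β₁ = 0.9, β₂ = 0.999, ε = 1e-8), which are assumptions here since the tutorial's own slider defaults and internal implementation are not shown. The function names are hypothetical.

```python
import math

def momentum_step(x, v, grad, eta, mu):
    """One Polyak momentum update: v <- mu*v - eta*grad, then x <- x + v."""
    v = mu * v - eta * grad
    return x + v, v

def adam_step(x, m, s, t, grad, eta, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update in 1D (standard form; defaults are assumptions)."""
    t += 1
    m = beta1 * m + (1 - beta1) * grad         # running mean of gradients
    s = beta2 * s + (1 - beta2) * grad * grad  # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)               # bias correction for early steps
    s_hat = s / (1 - beta2 ** t)
    x = x - eta * m_hat / (math.sqrt(s_hat) + eps)
    return x, m, s, t

# With mu = 0 the momentum step is exactly plain gradient descent.
x, v = momentum_step(1.0, 0.0, grad=2.0, eta=0.1, mu=0.0)
print(x)
```

On its first step Adam moves by roughly η regardless of the gradient's magnitude, because the bias-corrected ratio is close to the sign of f'(x). That is one way to see why adaptive methods behave differently from plain GD on the plateau and steep-wall functions.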
Convergence & divergence. The run stops when |f'(x)| < 0.001 (converged). If x or f(x) becomes non-finite or |x| > 50, the simulation reports “Diverged” and halts. Reset clears the trajectory and any adaptive/momentum state.
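A minimal sketch of a run loop with these stopping rules might look as follows. The thresholds 0.001 and 50 come from the text above; the `max_steps` cap, the function name `run`, and the returned status strings are assumptions added so the example terminates and is easy to inspect.

```python
import math

def run(f, df, x0, eta, max_steps=1000, tol=1e-3, x_limit=50.0):
    """Plain GD with the stopping rules described above (max_steps is assumed)."""
    x = x0
    path = [x]
    for _ in range(max_steps):
        g = df(x)
        if abs(g) < tol:                 # |f'(x)| < 0.001: converged
            return "converged", path
        x = x - eta * g
        path.append(x)
        # Non-finite x or f(x), or |x| > 50: report divergence and halt.
        if not math.isfinite(x) or abs(x) > x_limit or not math.isfinite(f(x)):
            return "diverged", path
    return "running", path

# Assumed bowl f(x) = x*x, with a safe and an unsafe learning rate.
bowl = lambda x: x * x
d_bowl = lambda x: 2.0 * x
print(run(bowl, d_bowl, 1.0, 0.1)[0])
print(run(bowl, d_bowl, 1.0, 1.1)[0])
```

Checking the gradient before taking a step means a start point that already sits at a stationary point reports convergence immediately, without moving.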
Parameters
Buttons
Visualization