Web Simulation

Gradient Descent Visualizer I

This note provides an interactive, visual simulation of the gradient descent optimization algorithm with momentum. It helps you build intuition about how gradient descent navigates different optimization landscapes to find the minima of functions.

The simulation visualizes gradient descent on six optimization functions: Sphere, Rosenbrock, Rastrigin, Saddle Point, Bi-modal Gaussian, and Tri-modal Gaussian. Each function presents a different challenge, from simple convex landscapes to complex non-convex surfaces with multiple local minima and saddle points. The Bi-modal and Tri-modal Gaussian functions are particularly useful for demonstrating how the starting position affects which minimum the algorithm converges to.

The visualization shows a 2D contour plot and a 3D surface plot side by side at the top, with controls below. When you run the descent, a red path traces the optimization trajectory, following the negative gradient direction. A yellow marker shows the initial starting point and updates in real time as you adjust the Start X and Start Y parameters. The trajectory is overlaid on both plots, letting you watch the algorithm navigate the landscape in real time.

You can adjust the learning rate (step size) and the momentum (default 0.9), which helps the algorithm escape flat regions and local minima; set the starting position; control the variable range used for visualization; and set the maximum number of iterations. This interactive exploration helps you understand how these parameters affect the convergence behavior of gradient descent. The plots automatically clear previous trajectories when you change any parameter, ensuring a clean visualization for each experiment.

NOTE: This visualization uses analytical gradients (exact derivatives) for accurate and efficient computation. The functions are implemented with their mathematical formulas and gradient expressions. The plots use a square aspect ratio (1:1) and display without tick marks, labels, or legends, keeping the visualization focused on the optimization landscape.
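As a sketch of what such analytical gradients look like, here are the four closed-form functions with their exact partial derivatives, written in plain Python (the function names are illustrative, not the visualizer's actual code; the two Gaussian-based functions are omitted because their exact widths and depths are not specified here):

```python
import math

# Each function returns (f(x, y), grad f(x, y)).
# The gradients are exact partial derivatives of the formulas.

def sphere(x, y):
    # f(x,y) = x^2 + y^2, grad = (2x, 2y)
    return x**2 + y**2, (2*x, 2*y)

def rosenbrock(x, y):
    # f(x,y) = (1-x)^2 + 100(y - x^2)^2
    f = (1 - x)**2 + 100 * (y - x**2)**2
    dfdx = -2 * (1 - x) - 400 * x * (y - x**2)
    dfdy = 200 * (y - x**2)
    return f, (dfdx, dfdy)

def rastrigin(x, y):
    # f(x,y) = 20 + (x^2 - 10 cos(2 pi x)) + (y^2 - 10 cos(2 pi y))
    f = (20 + (x**2 - 10 * math.cos(2 * math.pi * x))
            + (y**2 - 10 * math.cos(2 * math.pi * y)))
    dfdx = 2*x + 20 * math.pi * math.sin(2 * math.pi * x)
    dfdy = 2*y + 20 * math.pi * math.sin(2 * math.pi * y)
    return f, (dfdx, dfdy)

def saddle(x, y):
    # f(x,y) = x^2 - y^2, grad = (2x, -2y)
    return x**2 - y**2, (2*x, -2*y)
```

Using exact derivatives like these avoids the extra function evaluations and truncation error that finite-difference approximations would introduce.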

 

Parameters

The following are short descriptions of each parameter:
  • Function: Selects the optimization function to visualize. Six functions are available:
    • Sphere: f(x,y) = x² + y². A simple convex function with a single global minimum at (0,0).
    • Rosenbrock: f(x,y) = (1-x)² + 100(y-x²)². A classic test function with a narrow curved valley. The global minimum is at (1,1).
    • Rastrigin: f(x,y) = 20 + (x² - 10cos(2πx)) + (y² - 10cos(2πy)). A highly multimodal function with many local minima. The global minimum is at (0,0).
    • Saddle Point: f(x,y) = x² - y². A function with a saddle point at (0,0), demonstrating how gradient descent can get stuck at non-optimal points.
    • Bi-modal Gaussian: A function with two local minima created by two Gaussian valleys. Demonstrates how starting position affects which minimum the algorithm finds.
    • Tri-modal Gaussian: A function with three local minima (one global and two local) created by three Gaussian valleys. The deepest valley is at (-2, 0), with medium and shallow valleys at (2, 2) and (2, -2) respectively. Perfect for testing initialization strategies.
  • Learning Rate (α): Controls the step size for each gradient descent update. Larger values (closer to 1.0) take bigger steps, which can lead to faster convergence but may overshoot or oscillate. Smaller values (0.001-0.1) are more stable but slower. Default is 0.1.
  • Momentum: Adds momentum to the gradient descent updates, helping the algorithm escape flat regions and local minima. The momentum value (0 to 0.99) determines how much of the previous velocity is retained. Higher values (0.9-0.99) maintain more velocity, which is especially helpful for functions like Rosenbrock and helps the algorithm converge faster. Default is 0.9. The algorithm uses velocity-based updates: v = momentum * v_old - learningRate * gradient, then position += v.
  • Max Iterations: Maximum number of gradient descent steps to perform. The algorithm will stop after this many iterations even if it hasn't converged. Default is 100.
  • Start X, Start Y: Initial position (x₀, y₀) for the gradient descent algorithm. The starting point significantly affects the convergence path and whether the algorithm finds the global minimum. You can adjust these sliders to explore different starting positions. A yellow marker on both plots shows the current starting position in real-time as you adjust these values. The range automatically adjusts based on the Variable Range parameter.
  • Variable Range (vr): Controls the visualization range for both x and y axes. The plots display the function over the range [-vr, vr] for both variables. This parameter also controls the range of the Start X and Start Y sliders. Adjusting this parameter updates the plots and clamps the starting position to the new range. Default is 3.0.
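Putting those parameters together, the velocity-based update rule quoted above (v = momentum * v_old - learningRate * gradient, then position += v) can be sketched as a plain Python loop. The names here are illustrative, not the visualizer's actual code, and `grad_f` stands for any of the analytical gradient functions:

```python
def gradient_descent(grad_f, start_x, start_y,
                     learning_rate=0.1, momentum=0.9, max_iterations=100):
    """Momentum gradient descent; returns the trajectory of visited points."""
    x, y = start_x, start_y
    vx = vy = 0.0                      # velocity starts at rest
    path = [(x, y)]
    for _ in range(max_iterations):
        gx, gy = grad_f(x, y)
        # v = momentum * v_old - learningRate * gradient
        vx = momentum * vx - learning_rate * gx
        vy = momentum * vy - learning_rate * gy
        x, y = x + vx, y + vy          # position += v
        path.append((x, y))
    return path

# Example: descend the sphere f(x,y) = x^2 + y^2 from (2, -1.5).
sphere_grad = lambda x, y: (2*x, 2*y)
path = gradient_descent(sphere_grad, 2.0, -1.5)
```

With the defaults (learning rate 0.1, momentum 0.9), the iterate spirals toward the minimum in a damped oscillation rather than walking straight down the gradient, which is exactly the overshoot-and-settle behavior you can watch in the red trajectory.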

Buttons

The following are short descriptions of each button:
  • Run Descent: Starts the gradient descent algorithm from the current starting position. The red marker will move along the optimization path, updating in real-time. The button changes to "Stop" while running, allowing you to pause the algorithm at any time. The trajectory is overlaid on both the contour and 3D surface plots.
  • Reset: Clears the current trajectory and resets the visualization. The plots return to showing only the function surface without the optimization path. The iteration counter and current position display are also reset.

Visualization Features

  • Layout: The two plots (contour and 3D surface) are displayed side by side at the top in a square format (1:1 aspect ratio). The control panel with all parameters is located below the plots for easy access.
  • Contour Plot: A 2D top-down view showing contour lines (level sets) of the function. The color gradient represents function values. The red path shows the gradient descent trajectory, and a yellow marker indicates the initial starting point.
  • 3D Surface Plot: A three-dimensional visualization of the function surface. You can rotate, zoom, and pan the view to explore the landscape. The red path shows the optimization trajectory in 3D space, and a yellow marker shows the starting point on the surface.
  • Real-time Updates: During descent, the iteration count, current position (x, y), and current function value are displayed in real-time. The plots update periodically to show the progress. The initial starting point marker updates immediately as you adjust Start X and Start Y sliders.
  • Clean Visualization: The plots are designed for clarity - no tick marks, tick labels, color bars, legends, or titles. The focus is entirely on the function landscape and optimization trajectory.
  • Automatic Trajectory Clearing: When you change any parameter (function, learning rate, momentum, max iterations, start position, or variable range) or click any button, any previous trajectory path is automatically cleared, ensuring a clean visualization for each experiment.
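A static approximation of the contour view can be reproduced offline. The following matplotlib sketch (names and styling are illustrative, not the visualizer's code, and it uses plain gradient descent without momentum for brevity) draws the contour plot with a red trajectory, a yellow start marker, a 1:1 aspect ratio, and no ticks or labels:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt

# Sphere function on a [-vr, vr] grid (Variable Range vr = 3.0).
vr = 3.0
xs = np.linspace(-vr, vr, 200)
X, Y = np.meshgrid(xs, xs)
Z = X**2 + Y**2

# Toy trajectory: plain gradient descent from (2.5, -2.0), step 0.1.
path = [(2.5, -2.0)]
for _ in range(50):
    x, y = path[-1]
    path.append((x - 0.1 * 2*x, y - 0.1 * 2*y))
px, py = zip(*path)

fig, ax = plt.subplots(figsize=(5, 5))
ax.contour(X, Y, Z, levels=20)                  # level sets of f
ax.plot(px, py, color="red")                    # optimization path
ax.plot(px[0], py[0], "o", color="yellow")      # starting point
ax.set_aspect("equal")                          # square 1:1 aspect ratio
ax.set_xticks([]); ax.set_yticks([])            # clean: no ticks or labels
fig.savefig("contour_descent.png")
```

The interactive version adds the side-by-side 3D surface, live parameter controls, and real-time trajectory updates on top of this basic layout.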