Web Simulation

Recurrent Neural Network (RNN) - Elman Network Tutorial 

This interactive tutorial demonstrates the Recurrent Neural Network (RNN), specifically the Elman Network, a fundamental architecture for processing sequential data. RNNs maintain an internal "memory" (hidden state) that allows them to process sequences by remembering information from previous time steps, making them ideal for tasks involving temporal dependencies, such as language modeling, time series prediction, and sequence classification. The tutorial visualizes the simplest RNN architecture (Elman Network) with a single input neuron, a single hidden neuron with recurrent connections, and a single output neuron, making it easy to understand how RNNs work at a fundamental level.

The visualization displays two main components: (1) Network Diagram (top) - shows the 3 neurons (Input, Hidden with recurrent loop, Output) with connections and weight values, where the Hidden neuron changes color intensity/opacity based on its activation value to visualize the "memory" state, (2) Time Series Graph (bottom) - shows three lines (Input, Hidden State, Output) plotted against time steps, demonstrating how the hidden state "remembers" previous inputs and how the output evolves over time. The graphs are rendered using HTML5 Canvas for real-time visualization with a dark theme (black background) and bright colors for optimal visibility. Real-time statistics display the current values of the hidden state h(t), output y(t), and the mathematical calculation for the current step.

The simulator implements the standard RNN equations. Hidden state: h(t) = tanh(W_in × x(t) + W_hidden × h(t-1) + b), where W_in is the input weight, W_hidden is the recurrent/feedback weight, and b is the bias. Output: y(t) = W_out × h(t), where W_out is the output weight. You can define an input sequence (e.g., "0, 1, 0, 0, 0") and adjust the weights (W_in, W_hidden, W_out) and bias (b) using sliders. Control buttons allow you to Step Forward (advance one time step), Play/Auto (run the sequence automatically), and Reset (return to the initial state). An educational math panel displays the exact calculation for the current step (e.g., "h(1) = tanh(0.5 × 1 + 0.8 × 0 + 0.0) = tanh(0.5) ≈ 0.46"), making the mathematical process transparent.

NOTE: This simulation demonstrates RNN memory in a "glass box" architecture where every weight, activation, and calculation is visible. The Elman Network (1 input, 1 hidden, 1 output) with only 4 parameters (W_in, W_hidden, W_out, b) is simple enough to visualize every connection and understand how the recurrent loop creates memory, yet complex enough to demonstrate the fundamental concepts of RNNs. The key insight is the recurrent connection (W_hidden): when W_hidden is high, the hidden neuron "remembers" previous inputs longer (memory decays slowly); when W_hidden is low, memory fades quickly. This demonstrates how RNNs maintain state and process sequences with temporal dependencies.

Mathematical Model

The Recurrent Neural Network (Elman Network) is a fundamental architecture for processing sequential data by maintaining an internal hidden state that acts as "memory" for previous inputs. The network processes sequences one element at a time, updating its hidden state at each time step based on both the current input and the previous hidden state.

RNN Equations:

Hidden State: h(t) = tanh(W_in × x(t) + W_hidden × h(t-1) + b) (Recurrent update)
Output: y(t) = W_out × h(t) (Linear output)
Initial State: h(0) = 0 (Zero initialization)

where:

  • h(t): Hidden state at time t (the "memory" of the network) - a neuron with internal state and activation
  • x(t): Input at time t (from the input sequence) - an input value (scalar, not a neuron), typically 0 or 1
  • y(t): Output at time t (the network's prediction/response) - a neuron that computes y(t) = W_out × h(t)
  • W_in: Input weight (scales the current input)
  • W_hidden: Recurrent/feedback weight (scales the previous hidden state, controls memory retention)
  • W_out: Output weight (scales the hidden state to produce output)
  • b: Bias term (adds constant offset to hidden state)
  • tanh: Hyperbolic tangent activation function (squashes values to [-1, 1])
  • t: Time step (integer: 0, 1, 2, ...)
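The equations above translate directly into code. A minimal Python sketch, using the simulator's default weights (W_in = 0.5, W_hidden = 0.8, W_out = 1.0, b = 0.0; the function name is illustrative, not part of the simulator):

```python
import math

def rnn_step(x_t, h_prev, w_in=0.5, w_hidden=0.8, w_out=1.0, b=0.0):
    """One Elman update: returns (h(t), y(t)) from input x(t) and h(t-1)."""
    h_t = math.tanh(w_in * x_t + w_hidden * h_prev + b)  # recurrent update
    y_t = w_out * h_t                                    # linear output
    return h_t, y_t

# First step of the sequence "1, 0, 0, 0, 0", starting from h(0) = 0:
h1, y1 = rnn_step(1, 0.0)
print(round(h1, 2))  # tanh(0.5) ≈ 0.46, matching the Math Panel example
```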

Understanding the Terms:

Hidden State (h(t)): The hidden state is the "memory" of the network. It captures information from previous inputs and maintains it across time steps. The hidden state is updated at each time step by combining: (1) the current input x(t) weighted by W_in, (2) the previous hidden state h(t-1) weighted by W_hidden (the recurrent connection), and (3) the bias b. The tanh activation function ensures the hidden state stays within [-1, 1], preventing unbounded growth. The recurrent connection (W_hidden × h(t-1)) is what gives RNNs their memory: when W_hidden is large (close to 1), the network "remembers" previous inputs for many time steps (memory decays slowly); when W_hidden is small (close to 0), memory fades quickly.

Recurrent Connection (W_hidden): The key feature of RNNs is the recurrent connection that feeds the hidden state back into itself. This creates a feedback loop that allows the network to maintain information across time steps. The weight W_hidden controls how much of the previous hidden state is retained: higher values (e.g., 0.9) create strong memory (hidden state persists for many steps), lower values (e.g., 0.1) create weak memory (hidden state fades quickly). This is visualized in the network diagram as a loop from the Hidden neuron back to itself, and in the time series graph as the hidden state "decaying" when the input is zero.
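The decay behavior described here can be replayed in a few lines of Python (a sketch with the simulator's default W_in = 0.5 and b = 0.0; the helper name is illustrative):

```python
import math

def run(sequence, w_hidden, w_in=0.5, b=0.0):
    """Replay a sequence and return the hidden-state trajectory."""
    h, history = 0.0, []
    for x in sequence:
        h = math.tanh(w_in * x + w_hidden * h + b)
        history.append(h)
    return history

pulse = [1, 0, 0, 0, 0]            # a single pulse, then silence
strong = run(pulse, w_hidden=0.9)  # strong memory: slow decay
weak = run(pulse, w_hidden=0.1)    # weak memory: fast decay
print([round(h, 3) for h in strong])
print([round(h, 3) for h in weak])
```

Both trajectories start at the same value (tanh(0.5) ≈ 0.462), but after four zero inputs the strong-memory state is still well above 0.2 while the weak-memory state has collapsed to nearly 0.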

Output (y(t)): The output is a linear transformation of the hidden state: y(t) = W_out × h(t). The output weight W_out scales the hidden state to produce the final output. Since the hidden state contains the network's "memory" of previous inputs, the output reflects both the current input and information from past inputs. The output can represent a prediction, classification, or any other task-specific value.

Input (x(t)): The input x(t) is an input value (a scalar, typically 0 or 1), not a neuron. It represents external data fed into the network at each time step. The input has no internal state or computation - it's simply a number that gets multiplied by W_in and added to the hidden state calculation. In contrast, h(t) and y(t) are neurons: h(t) is a neuron with internal state, activation function (tanh), and recurrent connections, while y(t) is a neuron (output layer) that performs computation (linear transformation of h(t)). The visualization shows all three as circles for simplicity, but conceptually x(t) is external input data, while h(t) and y(t) are internal network components with computation.

Memory and Temporal Dependencies:

The recurrent connection creates memory by allowing the hidden state to depend on both the current input and previous hidden states. This enables RNNs to process sequences with temporal dependencies - patterns that depend on previous elements in the sequence. For example, in language modeling, the network can learn that after seeing "The cat sat on the...", the next word is likely "mat" because it "remembers" the context from previous words. The memory decay rate is controlled by W_hidden: large values create long-term memory (information persists), small values create short-term memory (information fades quickly).
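This history-dependence is easy to demonstrate numerically: with the simulator's default weights, two sequences that end in the same input yield different outputs, because their histories leave different amounts of "memory" in the hidden state (the helper name is illustrative):

```python
import math

def final_output(sequence, w_in=0.5, w_hidden=0.8, w_out=1.0, b=0.0):
    """Process a whole sequence from h(0) = 0 and return the last output."""
    h = 0.0
    for x in sequence:
        h = math.tanh(w_in * x + w_hidden * h + b)
    return w_out * h

# Both sequences end with the same current input (0):
y_after_pulse = final_output([1, 0])    # a 1 was seen one step earlier
y_after_silence = final_output([0, 0])  # no input was ever seen
print(y_after_pulse, y_after_silence)
```

The pulse leaves a nonzero hidden state that persists into the next step, while the all-zero history produces exactly 0.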

Network Diagram Visualization: The network diagram shows the 3 neurons (Input, Hidden, Output) with connections and weight labels. The Hidden neuron has a recurrent loop (arrow from itself back to itself) labeled with W_hidden, visually representing the feedback connection. The Hidden neuron changes color intensity/opacity based on its activation value: brighter colors indicate higher activation (stronger memory), dimmer colors indicate lower activation (weaker memory). This visual feedback shows how the memory state evolves over time, with the Hidden neuron "glowing" when it contains information and "fading" when memory decays.

 

Usage Example

Follow these steps to explore how the RNN maintains memory and processes sequences:

  1. Initial State: When you first load the simulation, you'll see: (1) Network Diagram (top) - shows the 3 neurons (Input, Hidden with recurrent loop, Output) with default weights, (2) Time Series Graph (bottom) - empty, ready to display the sequence, (3) Math Panel - displays the calculation for the current step. The input sequence field is empty. The hidden state h(0) is initialized to 0. Notice the recurrent loop on the Hidden neuron (arrow from itself back to itself) - this is what creates the memory.
  2. Define Input Sequence: Enter an input sequence in the "Input Sequence" field (e.g., "0, 1, 0, 0, 0" or "1, 1, 0, 1"). The sequence is comma-separated values (0 or 1). This sequence will be processed step by step. Try a simple pattern first: "1, 0, 0, 0, 0" (a single pulse) to observe how memory decays.
  3. Observe Memory Decay: Use the "Step Forward" button to advance one time step at a time. Watch: (1) The Input neuron shows the current input value (0 or 1), (2) The Hidden neuron changes color intensity based on its activation (bright when h(t) is large, dim when h(t) is small), (3) The Output shows the scaled hidden state, (4) The Math Panel displays the exact calculation (e.g., "h(1) = tanh(0.5 × 1 + 0.8 × 0 + 0.0) ≈ 0.46"), (5) The Time Series Graph plots the Input, Hidden State, and Output values. Try the sequence "1, 0, 0, 0, 0" with default weights - notice how the hidden state starts at a high value when input is 1, then decays gradually when input is 0, demonstrating memory retention.
  4. Adjust Memory Weight (W_hidden): Experiment with the "W_hidden (Memory)" slider to control memory retention. Try:
    • W_hidden = 0.9 - Strong memory: hidden state decays very slowly (memory persists for many steps)
    • W_hidden = 0.5 - Moderate memory: hidden state decays at a moderate rate
    • W_hidden = 0.1 - Weak memory: hidden state decays quickly (memory fades fast)
    Reset and replay the sequence "1, 0, 0, 0, 0" with different W_hidden values. Observe how the Hidden neuron stays "lit" longer with higher W_hidden values, and the Time Series Graph shows slower decay in the Hidden State line. This demonstrates how W_hidden controls memory retention.
  5. Use Play/Auto Mode: Click "Play" to automatically advance through the sequence. The simulation will process each input step automatically with a small delay between steps, making it easy to observe how the hidden state evolves. Click "Pause" to stop and examine the current state. Use Play to see the dynamic behavior: watch the Hidden neuron "glow" when inputs are present, then "fade" as memory decays when inputs are zero.
  6. Experiment with Input Weight (W_in): Adjust the "W_in (Input)" slider to control how strongly the current input affects the hidden state. Higher values (e.g., 1.0) make the network more responsive to inputs, lower values (e.g., 0.1) make it less responsive. Try the sequence "1, 1, 0, 0, 0" with different W_in values - observe how the hidden state increases more when W_in is higher.
  7. Observe Output Weight (W_out): Adjust the "W_out (Output)" slider to scale the output. The output is simply W_out × h(t), so W_out controls the magnitude of the output signal. Try different values while watching the Output line in the Time Series Graph - it scales proportionally to the hidden state.
  8. Understand the Math Panel: The Math Panel displays the exact calculation for the current step, showing: (1) The hidden state equation with actual values, (2) The intermediate calculations (W_in × x(t), W_hidden × h(t-1), sum + b), (3) The final result after tanh activation. Use Step Forward mode to observe how the calculation changes at each step, especially when the input changes from 1 to 0 - you'll see W_hidden × h(t-1) become the dominant term, demonstrating memory retention.
  9. Try Complex Sequences: Experiment with different input sequences to observe various behaviors:
    • "1, 1, 1, 0, 0" - Multiple consecutive inputs: hidden state accumulates, then decays
    • "1, 0, 1, 0, 1" - Alternating pattern: hidden state oscillates
    • "0, 0, 0, 1, 0" - Late pulse: observe how memory works even after many zeros
    Each sequence demonstrates different aspects of memory and temporal processing.
  10. Reset and Explore: Click "Reset" to clear the simulation and return to the initial state (h(0) = 0). Try different combinations of weights and sequences to fully understand how RNNs maintain memory and process sequences. The key insight is the recurrent connection (W_hidden): it creates a feedback loop that allows the network to "remember" previous inputs.

Tip: The key insight to look for is how the recurrent connection (W_hidden) creates memory. When W_hidden is high (e.g., 0.9), the Hidden neuron "remembers" previous inputs for many time steps (you'll see it stay "lit" and the Hidden State line decays slowly). When W_hidden is low (e.g., 0.1), memory fades quickly. Try a simple sequence like "1, 0, 0, 0, 0" and adjust W_hidden to see the memory effect clearly - the Hidden neuron's color intensity and the Time Series Graph will show the difference. The Math Panel helps you understand the exact calculations happening at each step.
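The experiment recommended above (the pulse "1, 0, 0, 0, 0" with default weights) can also be traced offline. This hypothetical helper prints one row per time step, mirroring what the statistics display shows:

```python
import math

def trace(sequence, w_in=0.5, w_hidden=0.8, w_out=1.0, b=0.0):
    """Return (t, x, h, y) rows for a sequence, starting from h(0) = 0."""
    h, rows = 0.0, []
    for t, x in enumerate(sequence, start=1):
        h = math.tanh(w_in * x + w_hidden * h + b)
        rows.append((t, x, round(h, 3), round(w_out * h, 3)))
    return rows

for t, x, h, y in trace([1, 0, 0, 0, 0]):
    print(f"t={t}  x={x}  h={h:.3f}  y={y:.3f}")
```

The printed trace shows the hidden state jumping to tanh(0.5) ≈ 0.462 at the pulse, then shrinking by roughly the factor W_hidden at each zero-input step.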

Parameters

The following are short descriptions of each parameter:
  • Input Sequence: A comma-separated sequence of input values (0 or 1) that defines the sequence to be processed by the RNN. For example, "0, 1, 0, 0, 0" or "1, 1, 0, 1". The sequence is processed step by step, with each value representing the input x(t) at time step t. You can define any sequence length. Default: empty. The sequence is entered in a text field in the control panel. The RNN processes the sequence from left to right, maintaining its hidden state (memory) across time steps.
  • W_in (Input Weight): The weight that scales the current input x(t) before adding it to the hidden state. Range: typically -2.0 to 2.0. Default: 0.5. Higher values make the network more responsive to the current input, lower values make it less responsive. The input weight controls how strongly each input affects the hidden state update. Adjust using a slider in the control panel.
  • W_hidden (Recurrent/Memory Weight): The weight that scales the previous hidden state h(t-1) in the recurrent connection. This is the key parameter that controls memory retention. Range: typically 0.0 to 1.0 (values above 1.0 can cause instability). Default: 0.8. Higher values (e.g., 0.9) create strong memory (hidden state persists for many steps), lower values (e.g., 0.1) create weak memory (hidden state fades quickly). This weight controls the "memory decay rate" - how long the network remembers previous inputs. Adjust using a slider in the control panel.
  • W_out (Output Weight): The weight that scales the hidden state h(t) to produce the output y(t). Range: typically -2.0 to 2.0. Default: 1.0. The output is simply y(t) = W_out × h(t), so W_out controls the magnitude of the output signal. Adjust using a slider in the control panel.
  • b (Bias): A constant offset added to the hidden state update. Range: typically -1.0 to 1.0. Default: 0.0. The bias term adds a constant value to the weighted sum before the tanh activation. It can shift the operating point of the hidden state. Adjust using a slider in the control panel.
  • Hidden State h(t): The internal "memory" of the network, updated at each time step according to h(t) = tanh(W_in × x(t) + W_hidden × h(t-1) + b). Range: [-1, 1] (due to tanh activation). Initial value: h(0) = 0. The hidden state captures information from previous inputs and maintains it across time steps. It is visualized in the network diagram (Hidden neuron color intensity) and in the Time Series Graph (Hidden State line). The hidden state is the key to RNN memory - it carries information forward through the sequence.
  • Output y(t): The network's output at time step t, calculated as y(t) = W_out × h(t). Range: depends on W_out and h(t). The output is a linear transformation of the hidden state, so it reflects the network's "memory" of previous inputs as well as the current input. Visualized in the Time Series Graph (Output line).
  • Network Architecture: The RNN architecture consists of: (1) Input x(t) - the input value (not a neuron, just a scalar value 0 or 1) from the sequence, (2) Hidden Neuron h(t) - a neuron that maintains the hidden state h(t) and has a recurrent connection (loop from itself back to itself) with weight W_hidden, (3) Output Neuron y(t) - a neuron that produces the output y(t) = W_out × h(t) from the hidden state. The network diagram visualizes all three components: Input (shown as a green circle representing the input value), Hidden Neuron (shown as a purple/blue circle that changes color intensity based on activation), and Output Neuron (shown as an orange circle). Note that while x(t) is shown as a circle in the diagram for visualization purposes, it represents an input value rather than a neuron with internal computation.

Controls and Visualizations

The following are short descriptions of each control:
  • Input Sequence Field: A text input field where you enter the input sequence as comma-separated values (0 or 1). For example, "0, 1, 0, 0, 0" or "1, 1, 0, 1". The sequence defines the inputs x(t) that will be processed step by step. Located in the control panel. The sequence can be any length. Default: empty. When you enter a sequence and click Step Forward or Play, the RNN processes each value in order, maintaining its hidden state across steps.
  • W_in (Input Weight) Slider: Controls the input weight W_in (range: typically -2.0 to 2.0, default: 0.5). Located in the control panel with label and value display. Higher values make the network more responsive to the current input, lower values make it less responsive. The slider updates in real-time, immediately affecting the hidden state calculation when you step through the sequence. Adjust to control how strongly inputs affect the hidden state.
  • W_hidden (Memory Weight) Slider: Controls the recurrent/memory weight W_hidden (range: typically 0.0 to 1.0, default: 0.8). Located in the control panel with label and value display. This is the most important parameter - it controls memory retention. Higher values (e.g., 0.9) create strong memory (hidden state persists for many steps), lower values (e.g., 0.1) create weak memory (hidden state fades quickly). The slider updates in real-time. Adjust to observe how memory decay changes - try resetting and replaying a sequence with different W_hidden values to see the effect on memory retention.
  • W_out (Output Weight) Slider: Controls the output weight W_out (range: typically -2.0 to 2.0, default: 1.0). Located in the control panel with label and value display. The output is y(t) = W_out × h(t), so W_out scales the hidden state to produce the output. The slider updates in real-time, immediately affecting the output values. Adjust to scale the output signal.
  • b (Bias) Slider: Controls the bias term b (range: typically -1.0 to 1.0, default: 0.0). Located in the control panel with label and value display. The bias adds a constant offset to the hidden state update. The slider updates in real-time. Adjust to shift the operating point of the hidden state.
  • Step Forward Button: Advances the simulation one time step forward, processing the next input in the sequence. Located in the control panel. When clicked, the RNN: (1) Reads the next input x(t) from the sequence, (2) Updates the hidden state h(t) using the RNN equation, (3) Calculates the output y(t) = W_out × h(t), (4) Updates the network diagram (Hidden neuron color intensity, connection values), (5) Plots the new point on the Time Series Graph, (6) Updates the Math Panel with the exact calculation for this step. Use Step Forward to observe the simulation step by step, carefully watching how the hidden state evolves and how the Math Panel shows the calculations.
  • Play/Auto Button: Automatically advances through the sequence with a small delay between steps. Located in the control panel. When clicked, the simulation processes each input in the sequence automatically, making it easy to observe the overall behavior. Click "Pause" to stop. The Play mode helps visualize the dynamic behavior: watch the Hidden neuron "glow" when inputs are present, then "fade" as memory decays. Use Play to see how memory evolves over time without manually clicking Step Forward.
  • Pause Button: Pauses the automatic playback. Located in the control panel. When clicked during Play mode, the simulation stops at the current step, allowing you to examine the current state. Click Play again to resume.
  • Reset Button: Resets the simulation to the initial state. Located in the control panel. When clicked: (1) The hidden state resets to h(0) = 0, (2) The time step resets to t = 0, (3) The Time Series Graph is cleared, (4) The network diagram shows the initial state (Hidden neuron dim). The input sequence and weights are preserved. Use Reset to start over with the same sequence and weights, or to clear the graph and observe the sequence from the beginning.
  • Network Diagram Canvas: Canvas displaying the RNN architecture visualization showing the 3 neurons (Input, Hidden with recurrent loop, Output) with connections and weight labels. The Hidden neuron has a recurrent loop (arrow from itself back to itself) labeled with W_hidden, visually representing the feedback connection. The Hidden neuron changes color intensity/opacity based on its activation value: brighter colors indicate higher activation (stronger memory), dimmer colors indicate lower activation (weaker memory). Connection lines show the weights W_in, W_hidden, and W_out with their current values. This visual feedback shows how the memory state evolves over time, with the Hidden neuron "glowing" when it contains information and "fading" when memory decays.
  • Time Series Graph Canvas: Canvas displaying the time series visualization showing lines plotted against time steps: (1) Input (green solid) - the input sequence x(t), (2) Hidden State (blue dashed) - the hidden state h(t) showing how memory evolves, (3) Output (orange dotted) - the output y(t) = W_out × h(t), (4) Target (white dashed, training only) - the target sequence (desired output), only visible during training mode. The X-axis represents time steps (0, 1, 2, ...), and the Y-axis represents values. The graph demonstrates how the hidden state "remembers" previous inputs (decays slowly when W_hidden is high) and how the output reflects both current and past inputs. When training is active, a Training Status Overlay appears at the bottom-left of the graph displaying: Epoch count, Iteration count, and Loss (MSE) value. When you step through a sequence, each new point is plotted, building up the time series over time.
  • Math Panel: A text display panel that shows the exact mathematical calculation for the current step. Displays: (1) The hidden state equation with actual values (e.g., "h(1) = tanh(W_in × x(1) + W_hidden × h(0) + b)"), (2) The intermediate calculations (e.g., "= tanh(0.5 × 1 + 0.8 × 0 + 0.0)"), (3) The final result (e.g., "= 0.46"). Located above or below the graphs. This educational feature makes the mathematical process transparent, helping users understand exactly how the RNN calculates the hidden state at each step. The Math Panel updates whenever you click Step Forward or when Play mode advances to a new step.
  • Real-Time Statistics Display: Text overlay displaying current values of h(t) (hidden state), y(t) (output), and other relevant statistics in real-time. Located above or below the graphs. The statistics update continuously as you step through the sequence, showing the current state of the system. Uses Courier New font with bright text on dark background for visibility.
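The Math Panel's step-by-step display can be mimicked with plain string formatting. A sketch using the simulator's default weights (the helper name and exact layout are illustrative; the simulator's panel may format lines differently):

```python
import math

def math_panel(t, x_t, h_prev, w_in=0.5, w_hidden=0.8, b=0.0):
    """Build the three-line calculation string for one hidden-state update."""
    pre = w_in * x_t + w_hidden * h_prev + b  # pre-activation sum
    h_t = math.tanh(pre)
    return (f"h({t}) = tanh(W_in × x({t}) + W_hidden × h({t - 1}) + b)\n"
            f"     = tanh({w_in} × {x_t} + {w_hidden} × {h_prev} + {b})\n"
            f"     = tanh({pre:.2f}) = {h_t:.2f}")

print(math_panel(1, 1, 0.0))
```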

Key Concepts

  • Recurrent Neural Network (RNN): A type of neural network architecture designed to process sequential data by maintaining an internal hidden state (memory) that allows the network to remember information from previous time steps. Unlike feedforward networks, RNNs have recurrent connections that feed the hidden state back into itself, creating a feedback loop. This enables RNNs to process sequences with temporal dependencies - patterns that depend on previous elements in the sequence. RNNs are used for tasks like language modeling, time series prediction, speech recognition, and sequence classification.
  • Elman Network: The simplest RNN architecture, named after Jeffrey Elman, consisting of an input layer, a hidden layer with recurrent connections, and an output layer. In this simulation, we use the minimal Elman Network: 1 input neuron, 1 hidden neuron with a recurrent connection, and 1 output neuron. This simple architecture makes it easy to understand the fundamental concepts of RNNs: how the recurrent connection creates memory, how the hidden state evolves over time, and how the network processes sequences. Despite its simplicity, the Elman Network demonstrates all the key principles of RNNs.
  • Recurrent Connection (Feedback Loop): The key feature that distinguishes RNNs from feedforward networks. The recurrent connection feeds the hidden state h(t) back into itself at the next time step, creating a feedback loop: h(t) depends on both the current input x(t) and the previous hidden state h(t-1). This feedback loop is what gives RNNs their memory: the hidden state carries information forward through the sequence, allowing the network to "remember" previous inputs. In the network diagram, the recurrent connection is visualized as a loop from the Hidden neuron back to itself, labeled with W_hidden.
  • Hidden State (Memory): The internal state h(t) of the network that acts as "memory" for previous inputs. The hidden state is updated at each time step according to h(t) = tanh(W_in × x(t) + W_hidden × h(t-1) + b). The hidden state captures information from previous inputs and maintains it across time steps, enabling the network to process sequences with temporal dependencies. The hidden state is visualized in the network diagram (Hidden neuron color intensity) and in the Time Series Graph (Hidden State line). The memory decay rate is controlled by W_hidden: large values create long-term memory (information persists), small values create short-term memory (information fades quickly).
  • Memory Retention (W_hidden): The recurrent weight W_hidden controls how much of the previous hidden state is retained at each time step. When W_hidden is large (e.g., 0.9), the network retains most of the previous hidden state, creating strong memory that persists for many time steps. When W_hidden is small (e.g., 0.1), the network retains little of the previous hidden state, creating weak memory that fades quickly. This parameter controls the "memory decay rate" - how long the network remembers previous inputs. Experiment with different W_hidden values to observe how memory retention changes: try a sequence like "1, 0, 0, 0, 0" with W_hidden = 0.9 (strong memory, slow decay) vs. W_hidden = 0.1 (weak memory, fast decay).
  • Tanh Activation: The hyperbolic tangent activation function, used in the hidden state update: h(t) = tanh(...). Tanh squashes values to the range [-1, 1], preventing unbounded growth of the hidden state. This is important for RNN stability - without activation, the hidden state could grow indefinitely through the recurrent connection. Tanh is commonly used in RNNs because it: (1) Keeps values bounded, (2) Is differentiable everywhere, (3) Has a symmetric output range. The tanh function ensures that the hidden state stays within reasonable bounds, making the network stable and predictable.
  • Temporal Dependencies: Patterns in sequential data where the value at time t depends on values at previous time steps. For example, in language modeling, the word "mat" is more likely after "The cat sat on the..." because it depends on the previous context. RNNs are designed to handle temporal dependencies by maintaining a hidden state that carries information from previous inputs. The recurrent connection allows the network to "remember" previous inputs and use that information when processing the current input. This makes RNNs ideal for tasks involving sequences with dependencies, such as language modeling, time series prediction, and sequence classification.
  • Sequence Processing: RNNs process sequences one element at a time, updating their hidden state at each time step. Starting from h(0) = 0, the network reads the first input x(1) and computes h(1), then reads x(2) and computes h(2) (which depends on both x(2) and h(1)), and so on. This sequential processing allows RNNs to handle variable-length sequences and maintain context across time steps. The simulation demonstrates this by processing an input sequence step by step, showing how the hidden state evolves as each input is processed. The Step Forward button advances one time step at a time, making the sequential processing explicit and observable.
  • Memory Decay: When the input is zero, the hidden state decays according to h(t) = tanh(W_hidden × h(t-1) + b). The decay rate depends on W_hidden: large values cause slow decay (memory persists), small values cause fast decay (memory fades quickly). This is visualized in the Time Series Graph: after an input pulse (e.g., "1, 0, 0, 0, 0"), the Hidden State line decays gradually when W_hidden is high, or quickly when W_hidden is low. The Hidden neuron's color intensity also reflects this: it stays "lit" longer with high W_hidden, and "fades" quickly with low W_hidden. Memory decay is the mechanism by which RNNs forget old information and make room for new information.
  • Initial State: The hidden state is initialized to h(0) = 0 at the beginning of the sequence. This is a common initialization for RNNs, representing "no memory" at the start. As the sequence is processed, the hidden state accumulates information from the inputs. When you click Reset, the hidden state returns to h(0) = 0, and the simulation starts over. The initial state affects how the network processes the first few inputs, but its influence fades as more inputs are processed (depending on W_hidden).
  • What to Look For: When exploring the simulation, observe: (1) How the Hidden neuron's color intensity reflects the hidden state value (bright when h(t) is large, dim when h(t) is small), (2) How the Hidden State line in the Time Series Graph shows memory decay when inputs are zero, (3) How W_hidden controls memory retention (try resetting and replaying "1, 0, 0, 0, 0" with different W_hidden values), (4) How the Math Panel shows the exact calculations at each step, making the mathematical process transparent, (5) How the recurrent loop in the network diagram visually represents the feedback connection that creates memory. This is the moment "math" becomes "memory" - the recurrent connection creates a feedback loop that allows the network to remember and process sequences with temporal dependencies.
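The bounding property noted in the Tanh Activation bullet above is easy to verify numerically: even with weights well outside the suggested slider ranges, repeated recurrent updates never push the hidden state out of [-1, 1] (a quick standalone check, not part of the simulator):

```python
import math

# Deliberately large weights: W_in = 2.0, W_hidden = 1.5, b = 0.5.
h = 0.0
for x in [1] * 50:  # keep feeding 1s through the recurrent loop
    h = math.tanh(2.0 * x + 1.5 * h + 0.5)
    assert -1.0 <= h <= 1.0  # tanh keeps the state bounded at every step
print(round(h, 4))  # the state saturates near (but never reaches) 1
```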

NOTE: This simulation demonstrates RNN memory in a completely transparent "glass box" architecture. The Elman Network (1 input, 1 hidden, 1 output) with only 4 parameters (W_in, W_hidden, W_out, b) is simple enough to visualize every connection and understand how the recurrent loop creates memory, yet complex enough to demonstrate the fundamental concepts of RNNs. The key insight is the recurrent connection (W_hidden): when W_hidden is high (e.g., 0.9), the Hidden neuron "remembers" previous inputs for many time steps (memory decays slowly, hidden state persists); when W_hidden is low (e.g., 0.1), memory fades quickly (hidden state decays rapidly). This is visualized in the network diagram (Hidden neuron color intensity) and in the Time Series Graph (Hidden State line decay rate). Try a simple sequence like "1, 0, 0, 0, 0" (a single pulse) and adjust W_hidden to see the memory effect clearly - the Hidden neuron stays "lit" longer with high W_hidden, and the Hidden State line decays slowly. The Math Panel shows the exact calculations at each step, making the mathematical process transparent. This demonstrates the fundamental concept behind Recurrent Neural Networks (RNNs) used in modern sequence processing: the recurrent connection creates a feedback loop that allows the network to maintain memory and process sequences with temporal dependencies. The network automatically maintains state and processes sequences without any explicit programming of memory - this is the power of RNNs. In practice, real-world sequence processing systems use more complex RNN architectures (LSTM, GRU) and larger networks with many hidden units, but the core principle remains the same: recurrent connections create memory that allows networks to process sequences with temporal dependencies.