This interactive tutorial visualizes word embeddings in 2D or 3D using Principal Component Analysis (PCA). Word embeddings are high-dimensional vectors (512D from the Universal Sentence Encoder, USE) that capture semantic meaning: words with similar meanings sit close together in vector space. PCA reduces the 512 dimensions to 2 or 3 so that semantic clusters become visible and vector analogies (e.g., King - Man + Woman ≈ Queen) can be explored.

Pipeline: enter words (comma-separated) and USE produces one 512D vector per word. You can then either run the PCA projection (2D or 3D) or apply a manual projection onto chosen axes (0–511). A grid canvas shows the points with hover tooltips. The Vector Analogy panel computes A - B + C in 512D, projects the result into the same space (PCA or manual), and highlights the nearest word by cosine similarity.

Same basis: the projection (PCA or manual axes) is stored, so the analogy result uses the same basis as the plotted words.

Press Enter in the input box to run PCA; changing the Manual axis spin boxes applies the manual projection automatically.
Mathematical model

Embeddings: each word is mapped to a 512D vector by the Universal Sentence Encoder.

PCA via SVD: the embedding matrix $X$ is centered to $X_c$, decomposed, and projected onto the first $k$ right singular vectors (the same mean and $V$ are reused for the analogy vector):

$X_c = U S V^{T} \;\rightarrow\; Y = X_c \, V_{[:,\,0:k]}$
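The PCA step above can be sketched in NumPy as follows. This is a minimal illustration, not the page's own implementation: the helper names `fit_pca` and `project` are hypothetical, and random vectors stand in for real USE embeddings.

```python
import numpy as np

def fit_pca(X, k=2):
    """Return (mean, V_k) so later vectors can be projected into the same basis."""
    mean = X.mean(axis=0)
    Xc = X - mean                                   # center the embedding matrix
    # full_matrices=False returns Vt with shape (min(n, d), d)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_k = Vt[:k].T                                  # first k right singular vectors, shape (d, k)
    return mean, V_k

def project(vectors, mean, V_k):
    """Apply Y = (X - mean) @ V[:, 0:k] to one vector or a matrix of vectors."""
    return (np.asarray(vectors) - mean) @ V_k

# Assumption: random 512D vectors stand in for the embeddings of 6 words.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 512))

mean, V_k = fit_pca(X, k=2)
Y = project(X, mean, V_k)                           # one 2D point per word
```

Returning `mean` and `V_k` separately, rather than only `Y`, is what makes it possible to place a new vector (such as an analogy result) in the same plot later.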
Manual projection: pick axis indices (0–511); each point is simply [vec[axis0], vec[axis1]] (or three axes for 3D).

Vector analogy: compute the result in 512D, project it with the stored basis, and highlight the nearest word by cosine similarity:

result_vec = embed(A) − embed(B) + embed(C)
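A small sketch of the manual projection and the analogy lookup described above. The function names are hypothetical, the 4D toy vectors are hand-built stand-ins for 512D USE embeddings, and excluding A, B, and C from the candidate list is an assumption (the tutorial does not say whether the input words can be returned as the nearest word).

```python
import numpy as np

def manual_project(vectors, axes):
    """Pick raw embedding coordinates, e.g. axes=(0, 1) for a 2D view."""
    return np.asarray(vectors)[:, list(axes)]

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def analogy(emb, a, b, c):
    """Return (result_vec, nearest_word) for embed(a) - embed(b) + embed(c)."""
    result_vec = emb[a] - emb[b] + emb[c]
    # Assumption: the three input words are not eligible as the answer.
    candidates = [w for w in emb if w not in (a, b, c)]
    nearest = max(candidates, key=lambda w: cosine_similarity(result_vec, emb[w]))
    return result_vec, nearest

# Toy 4D embeddings (axes: royal, male, female, fruit), chosen for illustration only.
emb = {
    "king":  np.array([1.0, 1.0, 0.0, 0.0]),
    "man":   np.array([0.0, 1.0, 0.0, 0.0]),
    "woman": np.array([0.0, 0.0, 1.0, 0.0]),
    "queen": np.array([1.0, 0.0, 1.0, 0.0]),
    "apple": np.array([0.0, 0.0, 0.0, 1.0]),
}

result_vec, nearest = analogy(emb, "king", "man", "woman")
points_2d = manual_project(list(emb.values()), axes=(0, 2))
```

With real embeddings the result rarely coincides exactly with a word vector, which is why the lookup ranks candidates by cosine similarity instead of testing for equality.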
Simulation

The interactive simulator is below. Use the controls to explore the concepts described above.
[Simulator controls: word input (comma-separated) with presets, PCA and Manual projection controls with axis selectors, and a model status line.]

If model initialization hangs, open this page via http (e.g. a local server), not file://.

Vector Analogy (A - B + C)

Explore semantic relationships (e.g. King - Man + Woman ≈ Queen).

[Analogy controls: preset selector, inputs for A, B, and C, and the Result readout.]
The arrow from B to A represents a "direction" in meaning; adding it to C often points toward the analogous word. PCA or manual projection puts the 512D result into this 2D/3D view.
[Panel: 512D word vectors (before PCA)]
Usage Example

Follow these steps to explore word embeddings and PCA:
Try this: run PCA or Manual projection first with a list that includes the analogy words. Then use the Vector Analogy panel so the result lands in the same 2D/3D space and the nearest neighbor is among your points.
Parameters
Controls and Visualizations
Key Concepts
Consistency: the same projection (PCA or manual axes) is used for the main word list and the analogy result, so A−B+C appears in the correct position relative to the other points in 2D/3D.
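One way to picture the stored basis is as a small object that remembers the mean and the axis matrix last applied to the word list. The sketch below is an assumption about structure, not the page's code: `StoredProjection` and its methods are hypothetical names, and random vectors replace real embeddings. It also checks a useful consequence: because the projection is affine and the weights in A − B + C sum to 1, the projected result equals the projected A minus the projected B plus the projected C.

```python
import numpy as np

class StoredProjection:
    """Hypothetical holder for whichever basis was last applied to the word list."""

    def __init__(self, mean, basis):
        self.mean = mean      # (d,) vector subtracted before projecting
        self.basis = basis    # (d, k) matrix whose columns are the plot axes

    @classmethod
    def from_pca(cls, X, k=2):
        mean = X.mean(axis=0)
        _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
        return cls(mean, Vt[:k].T)

    @classmethod
    def from_axes(cls, dim, axes):
        # Manual projection: one-hot columns select raw coordinates, no centering.
        basis = np.zeros((dim, len(axes)))
        for col, axis in enumerate(axes):
            basis[axis, col] = 1.0
        return cls(np.zeros(dim), basis)

    def apply(self, vectors):
        return (np.asarray(vectors) - self.mean) @ self.basis

# Assumption: random stand-in embeddings; rows 0, 1, 2 play the roles of A, B, C.
rng = np.random.default_rng(1)
X = rng.normal(size=(5, 512))

proj = StoredProjection.from_pca(X, k=2)
Y = proj.apply(X)                              # plotted word positions
result_2d = proj.apply(X[0] - X[1] + X[2])     # analogy result in the same view

manual = StoredProjection.from_axes(512, axes=(3, 7))
Y_manual = manual.apply(X)                     # same as X[:, [3, 7]]
```

Refitting PCA with the analogy vector included would shift the mean and the axes, so the result would no longer line up with the points already on screen; reusing the stored object avoids that.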
Limitations