Word Embeddings
Turning words into vectors. Understand Word2Vec, GloVe, and how computers capture semantic meaning. This hands-on tutorial focuses on the practical implementation of word embedding concepts.
How do you explain to a computer that "King" is similar to "Queen"? With Word Embeddings, we represent words as dense vectors (lists of numbers) in a high-dimensional space.
1. One-Hot Encoding vs. Embeddings
- One-Hot Encoding:
  Cat = [1, 0, 0, 0]
  Dog = [0, 1, 0, 0]
  - Problem: Every pair of distinct vectors is orthogonal, so the encoding says nothing about how words relate. Each vector is also as long as the vocabulary, which means huge memory usage.
- Embeddings:
  Cat = [0.2, 0.9, -0.1]
  Dog = [0.2, 0.8, -0.2]
  - Result: Similar words have similar vectors!
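The contrast above can be checked numerically. The following is a minimal sketch, assuming a toy four-word vocabulary (the words "car" and "tree" are made up for illustration) and the hand-picked Cat and Dog vectors from the list above, which are not learned from data.

```python
import numpy as np

# Toy vocabulary; "car" and "tree" are illustrative additions.
vocab = ["cat", "dog", "car", "tree"]

# One-hot: row i of the identity matrix encodes word i.
one_hot = np.eye(len(vocab))
cat_oh, dog_oh = one_hot[0], one_hot[1]

# Dense vectors copied from the text above (hand-picked, not learned).
cat_emb = np.array([0.2, 0.9, -0.1])
dog_emb = np.array([0.2, 0.8, -0.2])

# Distinct one-hot vectors never share a non-zero position,
# so their dot product is always zero.
one_hot_dot = float(cat_oh @ dog_oh)

# The dense vectors overlap in every dimension.
embedding_dot = float(cat_emb @ dog_emb)

print("one-hot dot product:", one_hot_dot)
print("embedding dot product:", round(embedding_dot, 2))
print("one-hot vector length:", one_hot.shape[1], "(grows with the vocabulary)")
```

With a realistic vocabulary of tens of thousands of words, each one-hot vector would have that many entries, while an embedding typically stays at a few hundred dimensions.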
2. Word2Vec
Introduced by researchers at Google in 2013, Word2Vec learns word associations from a large corpus of text. The key idea comes from the linguist J. R. Firth: "You shall know a word by the company it keeps." It comes in two training variants:
- CBOW (Continuous Bag of Words): Predict the target word from its surrounding context words.
- Skip-Gram: Predict the surrounding context words from the target word.
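To make the two objectives concrete, here is a minimal sketch of how training examples could be cut from a sentence. It only builds the (input, output) examples and does not train a model. The function names `skipgram_pairs` and `cbow_examples`, the `window` parameter, and the sample sentence are all hypothetical choices for illustration.

```python
def skipgram_pairs(tokens, window=2):
    """Return (target, context) pairs: one pair per neighbour of each token."""
    pairs = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs


def cbow_examples(tokens, window=2):
    """Return (context_words, target) examples: one example per token."""
    examples = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        examples.append((context, target))
    return examples


sentence = "the cat sat on the mat".split()

# Skip-Gram: the target word is the input, each neighbour is a label.
print(skipgram_pairs(sentence, window=1)[:4])

# CBOW: the neighbours are the input, the target word is the label.
print(cbow_examples(sentence, window=1)[1])
```

Skip-Gram produces one example per (word, neighbour) pair, while CBOW produces one example per word, with all of its neighbours grouped as the input.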
3. Vector Arithmetic
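The most famous example of Word2Vec magic:
King - Man + Woman ≈ Queen
The result is not exactly the Queen vector. Rather, Queen is the nearest word vector to the result once the three input words are excluded. This suggests that the model captured concepts like "Gender" and "Royalty" as directions in the vector space.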
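The analogy can be sketched with NumPy on a toy example. The two-dimensional vectors below are hand-crafted assumptions (first dimension roughly "royalty", second roughly "femininity"), so the analogy works by construction. Real Word2Vec vectors are learned and have hundreds of dimensions. The `analogy` helper and the distractor word "prince" are hypothetical.

```python
import numpy as np

# Hand-crafted toy vectors: [royalty, femininity]. Not learned from data.
vectors = {
    "king":   np.array([0.95, 0.05]),
    "queen":  np.array([0.90, 0.90]),
    "man":    np.array([0.05, 0.05]),
    "woman":  np.array([0.05, 0.95]),
    "prince": np.array([0.70, 0.10]),
}


def cosine(a, b):
    """Cosine of the angle between two non-zero vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def analogy(a, b, c):
    """Return the word closest to a - b + c, excluding the three query words."""
    query = vectors[a] - vectors[b] + vectors[c]
    candidates = {w: v for w, v in vectors.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(query, candidates[w]))


print(analogy("king", "man", "woman"))
```

Excluding the query words matters: without that step, the nearest neighbour of the result is often one of the inputs themselves.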
4. Cosine Similarity
To measure how similar two words are, we calculate the cosine of the angle between their vectors.
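- 1.0: Identical direction. Synonyms and closely related words score near this value.
- 0.0: Orthogonal. The words are unrelated.
- -1.0: Opposite direction. Note that antonyms in trained embeddings often do not score here: words like "hot" and "cold" appear in similar contexts, so their vectors tend to be fairly similar.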
Interactive Challenge: Vector Similarity
Let's simulate vector similarity with NumPy.
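A minimal sketch, assuming the hand-picked Cat and Dog vectors from section 1. The function name `cosine_similarity` is our own choice here and is not taken from a specific library.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between a and b: dot(a, b) / (|a| * |b|).

    Assumes neither vector is all zeros (that would divide by zero).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


cat = [0.2, 0.9, -0.1]
dog = [0.2, 0.8, -0.2]

# Similar vectors: close to 1.0
print("cat vs dog:", round(cosine_similarity(cat, dog), 3))

# Orthogonal vectors: 0.0
print("orthogonal:", cosine_similarity([1, 0], [0, 1]))

# Opposite directions: close to -1.0
print("opposite:", round(cosine_similarity([1, 2], [-1, -2]), 3))
```

Try changing the Dog vector and watch how the score moves. Because cosine similarity ignores vector length, scaling a vector by a positive constant leaves the score unchanged.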
Quiz
Question 1 of 3: What is the main advantage of Embeddings over One-Hot Encoding?
Key Takeaways
- Embeddings are dense vector representations of words.
- Word2Vec learns these from context.
- Vector Arithmetic allows us to manipulate concepts mathematically.
What's Next?
Word2Vec is great, but it has a flaw: it assigns each word a single fixed vector, so "Bank" has the same vector in "River Bank" and "Bank Account". We need a model that understands Context. Enter the Transformer.
Next Chapter: Transformers & Attention.