How did we make the vis? [code here]


We figured that principal component analysis (PCA) might help surface the vectors of grammar we're interested in, so we ran PCA on the entire FastText vocabulary. Since our word embeddings live in 300-dimensional space, PCA gave us 300 principal components.
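Here's a minimal sketch of that step, assuming the pre-trained 300-dimensional FastText vectors (e.g. the wiki-news-300d-1M.vec file) loaded through gensim; the file name and variable names here are illustrative, not lifted from our actual code.

```python
import numpy as np
from gensim.models import KeyedVectors
from sklearn.decomposition import PCA

# Load the pre-trained 300-dimensional FastText vectors (text .vec format).
vectors = KeyedVectors.load_word2vec_format("wiki-news-300d-1M.vec")

# The full embedding matrix: one row per word in the vocabulary, 300 columns.
embedding_matrix = vectors.vectors

# PCA on 300-dimensional data yields (at most) 300 principal components.
pca = PCA(n_components=300)
pca.fit(embedding_matrix)
```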

After we found our principal components, we chose small sample sets of word pairs that we thought would demonstrate interesting grammatical variance (e.g. past vs. present tense).

Finally, we projected each word vector onto our principal components and visualised two components at a time using a compass plot: one word of each pair sits at the origin, and the other is placed at its position relative to the first word, i.e. the vector subtraction word2 - word1.
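As a rough sketch of that projection and plotting step (continuing from the snippet above, with illustrative past-vs.-present pairs and matplotlib standing in for our plotting code):

```python
import matplotlib.pyplot as plt

# Example pairs chosen to show past vs. present tense (illustrative only).
pairs = [("walk", "walked"), ("run", "ran"), ("eat", "ate"), ("see", "saw")]

cx, cy = 0, 1  # which two principal components to plot
fig, ax = plt.subplots()
for w1, w2 in pairs:
    diff = vectors[w2] - vectors[w1]     # word2 - word1
    proj = pca.components_ @ diff        # coordinates of the offset on all 300 PCs
    x, y = proj[cx], proj[cy]
    ax.plot([0, x], [0, y], marker="o")  # ray from the origin (word1) to word2's offset
    ax.text(x, y, f"{w1} → {w2}")

ax.set_xlabel(f"PC {cx + 1}")
ax.set_ylabel(f"PC {cy + 1}")
plt.show()
```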

But what are word embeddings?


If you're not familiar with word embeddings, what follows is an attempt at a super handwavy explanation (see the Further Reading section for better, less handwavy explanations).

Word embeddings are functions that take words and map them onto high-dimensional vectors, typically on the order of hundreds of dimensions. A word embedding starts off by assigning each word a random, meaningless vector. It then “learns” meaningful vectors for each word in order to perform some task.

One such task is predicting a target word from context words. Say you read the phrase “UChicago: where ____ comes to die.” What goes in “____”? This is exactly how today’s word embedding of interest, let’s call it Walter, learns vector representations for words. Walter goes to the Reg and looks at cue cards that have phrases like “UChicago: where ____ comes to die” on one side and “fun” on the other side. At first, Walter will probably get a lot of words wrong. For instance, it might guess that UChicago is where “bicycles” come to die. But after many epochs of the life of the mind (slash the life of the grind), Walter will become one of us and correctly guess that UChicago is where “fun” comes to die.
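That fill-in-the-blank game is essentially the CBOW training objective. Here's a toy illustration with gensim, just to show the moving parts; the corpus is made up, and this is not how the pre-trained FastText vectors we used were actually produced.

```python
from gensim.models import FastText

# A tiny, made-up corpus; real embeddings are trained on billions of tokens.
sentences = [
    ["uchicago", "where", "fun", "comes", "to", "die"],
    ["the", "life", "of", "the", "mind"],
    ["the", "life", "of", "the", "grind"],
]

# sg=0 selects CBOW: predict the target word from its surrounding context words.
model = FastText(sentences, vector_size=300, window=3, min_count=1, sg=0, epochs=50)

# After training, each word has a learned 300-dimensional vector.
print(model.wv["fun"].shape)  # (300,)
```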

One beautiful thing you can do with word vectors is vector arithmetic for analogies (queen - woman + man ≈ king).
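With gensim's KeyedVectors (the `vectors` object from earlier), that arithmetic is a one-liner; whether “king” actually comes out on top depends on the embeddings:

```python
# queen - woman + man: positive words are added, negative words subtracted.
result = vectors.most_similar(positive=["queen", "man"], negative=["woman"], topn=1)
print(result)  # typically something like [('king', 0.8...)] for well-trained embeddings
```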

![Mikolov et al.'s visualisation of gender-related vector offsets](https://d2mxuefqeaa7sj.cloudfront.net/s_5E7FCBF0907A90C612E5501D97CF598D3BE9FA861F58A3B25AE514F4769FC70C_1528059354512_Mikolov-GenderVecs.png)

This visualisation is our attempt at visually exploring these directions of grammar with the help of PCA.