Word Embedding

Word embedding is a technique that converts words into numeric vectors so that machine learning models can work with text.

Can be:

  1. Count- or frequency-based (Bag of Words, TF-IDF, one-hot encoding)
  2. Deep-learning-based (Word2Vec: CBOW, Skip-gram)
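As a minimal sketch of the count-based family above, here is a Bag of Words representation built by hand (the two example documents are hypothetical, not from these notes):

```python
from collections import Counter

# Hypothetical toy corpus for illustration.
docs = ["the cat sat on the mat", "the dog sat on the log"]

# Vocabulary: every unique word across all documents, in sorted order.
vocab = sorted({word for doc in docs for word in doc.split()})

def bow_vector(doc):
    # Bag of Words: one count per vocabulary word, word order is discarded.
    counts = Counter(doc.split())
    return [counts[word] for word in vocab]

vectors = [bow_vector(doc) for doc in docs]
print(vocab)
print(vectors)
```

Note the drawback that motivates Word2Vec: the vector length equals the vocabulary size, so real corpora give very long, mostly-zero vectors.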

Word2Vec

Count- and frequency-based representations produce sparse, high-dimensional vectors. Word2Vec addresses this by learning a dense vector of fixed, limited size for each word.

Feature representation:


Distance = 1 - Cosine Similarity

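The distance formula above can be computed directly; a minimal sketch with plain Python (no library assumed):

```python
import math

def cosine_similarity(u, v):
    # cos(theta) = (u . v) / (||u|| * ||v||)
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cosine_distance(u, v):
    # Distance = 1 - cosine similarity, as in the formula above.
    return 1 - cosine_similarity(u, v)

# Vectors pointing in the same direction: similarity ~1, distance ~0.
print(cosine_distance([1, 2, 3], [2, 4, 6]))
```

Because cosine similarity depends only on the angle between the vectors, two word vectors with similar directions count as similar regardless of their magnitudes.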

CBOW (Continuous Bag of Words)

Example:

Doc_1 = Krish channel is related to data science

Window size = 5

Training Data
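The training data above can be sketched in code. Assuming the common convention where a window size of 5 means the center (target) word plus 2 context words on each side, CBOW pairs each target with its surrounding context:

```python
# Sketch: build (context, target) CBOW training pairs from Doc_1.
# Assumption: window size 5 = target word + 2 words on each side.
sentence = "Krish channel is related to data science".lower().split()
half = 2  # context words taken from each side of the target

pairs = []
for i, target in enumerate(sentence):
    # Context = up to `half` words before and after the target.
    context = sentence[max(0, i - half):i] + sentence[i + 1:i + 1 + half]
    pairs.append((context, target))

for context, target in pairs:
    print(context, "->", target)
```

The model is then trained to predict each target word from its context words; near the sentence boundaries the context is simply shorter.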