Five types of chunking in RAG-

  1. Fixed size chunking
  2. Semantic chunking
  3. Recursive chunking
  4. Structural chunking
  5. LLM chunking

Fixed size chunking-

chunk size(200mb in these notes; in practice usually set in characters or tokens)-hyperparameter

divide pages into consecutive chunks of the same fixed size, ignoring sentence and paragraph boundaries

advantages- fast processing

disadvantages- semantic breaks (a chunk can end mid-sentence or mid-idea), lost context
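
A minimal sketch of fixed size chunking, assuming the size is counted in characters (the notes do not say which unit is meant). The function name `fixed_size_chunks` and the sample text are made up for illustration.

```python
def fixed_size_chunks(text: str, chunk_size: int = 200) -> list[str]:
    """Split text into consecutive chunks of chunk_size characters.

    chunk_size is the hyperparameter; the last chunk may be shorter.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


text = "The cat sat on the mat. The dog barked loudly."
chunks = fixed_size_chunks(text, chunk_size=20)
for chunk in chunks:
    print(repr(chunk))
# 'The cat sat on the m'
# 'at. The dog barked l'
# 'oudly.'
```

The output shows the disadvantage directly: the cut falls in the middle of "mat" and "loudly", so no chunk holds a complete sentence.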


Semantic chunking-

similarity threshold(0.8)-hyperparameter

tries to solve the problem of fixed size chunking (chunk boundaries ignore meaning, so related sentences end up in different chunks)
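
A rough sketch of the idea, under two assumptions that are not in the notes: the threshold is a cosine-similarity cutoff, and a sentence joins the current chunk when its similarity to the previous sentence is at least the threshold. The `embed` function here is a toy bag-of-words stand-in so the example runs without a model; a real pipeline would use an embedding model instead. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(sentence: str) -> Counter:
    """Toy bag-of-words vector (assumption: stands in for a real embedding model)."""
    return Counter(re.findall(r"[a-z']+", sentence.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_chunks(sentences: list[str], threshold: float = 0.8) -> list[list[str]]:
    """Group adjacent sentences; start a new chunk when similarity drops below threshold."""
    chunks: list[list[str]] = []
    for sentence in sentences:
        if chunks and cosine(embed(chunks[-1][-1]), embed(sentence)) >= threshold:
            chunks[-1].append(sentence)  # similar enough: stay in the current chunk
        else:
            chunks.append([sentence])    # topic shift: open a new chunk
    return chunks


sentences = [
    "the cat sat on the mat",
    "the cat sat on the rug",
    "stock prices fell sharply today",
]
groups = semantic_chunks(sentences, threshold=0.8)
print(groups)
# [['the cat sat on the mat', 'the cat sat on the rug'], ['stock prices fell sharply today']]
```

The first two sentences have similarity 0.875, which clears the 0.8 threshold, so they stay together; the third shares no words with them and starts a new chunk. Raising the threshold gives more, smaller chunks; lowering it gives fewer, larger ones.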