<aside>
💡
Vibhor Sharma - June 20th, 2025
</aside>
- An LLM is an AI model trained to understand and generate language
- Based on the transformer architecture
- Learns from massive text datasets
- OpenAI GPT-4.5: 128k-token context
- Anthropic Claude: Sonnet and Opus variants
- Gemini 2.5: 1M+ token context, multimodal (the huge context window can pull in a lot of extra context, which can be annoying)
- Meta LLaMA 3
Key Concepts
- Parameters (the total count of weights and biases)
- Billions to trillions (larger models pick up subtler patterns; smaller models are good at one thing)
- Zero-shot (no examples in the prompt), few-shot (a handful of examples), and then fine-tuning
- Chain-of-thought reasoning - the model works through intermediate steps, modeled on how humans think
- Mixture of experts - only the relevant experts are activated per token (e.g. only 32 billion out of 100 billion parameters working at once), which is more efficient
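The mixture-of-experts idea above can be sketched as a routing step: a gate scores every expert, but only the top few are kept and their weights renormalized. This is a minimal toy in plain Python (expert count, scores, and `k=2` are made up for illustration, not from any real model):

```python
import math

def softmax(xs):
    # Numerically stable softmax: subtract the max before exponentiating
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_scores, k=2):
    """Pick the top-k experts by gate score and renormalize their
    weights, so only those k experts run on this token."""
    ranked = sorted(range(len(gate_scores)),
                    key=lambda i: gate_scores[i], reverse=True)
    chosen = ranked[:k]
    weights = softmax([gate_scores[i] for i in chosen])
    return list(zip(chosen, weights))

# 8 experts, but only 2 are activated for this token
scores = [0.1, 2.3, -0.5, 1.7, 0.0, 0.4, -1.2, 0.9]
active = route_top_k(scores, k=2)
```

Only the chosen experts' computations run, which is why a trillion-parameter MoE model can do roughly the work of a much smaller dense one per token.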
How LLMs Work
- Predict the next token based on context
- Self-attention and positional encoding (each token is compared against the tokens around it to understand context)
- Embedding: turns text into vectors of numbers so the model can work with it
- Large-scale training, refined with human feedback
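The self-attention and embedding bullets above can be combined into one toy example: each token's embedding (its query) is dot-producted against every other token's embedding (the keys), the scores are softmaxed, and the result is a weighted mix of the values. This single-head sketch uses plain Python with Q = K = V = the raw embeddings (real models learn separate projection matrices; the tiny 4-dimensional vectors here are invented for illustration):

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(embeddings):
    """Toy single-head self-attention. For each token, compare its
    query against every token's key (scaled dot product), softmax the
    scores into weights, and return the weighted sum of the values."""
    d = len(embeddings[0])
    out = []
    for q in embeddings:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in embeddings]
        weights = softmax(scores)  # weights sum to 1 across the tokens
        out.append([sum(w * v[j] for w, v in zip(weights, embeddings))
                    for j in range(d)])
    return out

# Three tokens, each embedded as a 4-dimensional vector
tokens = [[1.0, 0.0, 1.0, 0.0],
          [0.0, 1.0, 0.0, 1.0],
          [1.0, 1.0, 0.0, 0.0]]
mixed = self_attention(tokens)
```

Each output row is a context-aware blend of all the input embeddings, which is the "comparing a token to the tokens around it" step from the notes; the prediction head then scores the next token from these mixed vectors.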