https://xmind.ai/share/GP2Tl6s4?xid=y0IgtWIl
CNN | Xmind AI
Today's E2E Timeline (5 hours total)
0. Preparation (10 min)
- [x] Create a new GitHub repository (e.g. cnn-vit-comparison)
- [x] Set up the environment: PyTorch, matplotlib, tqdm, etc.
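
A quick way to confirm the environment is ready is to check that each package can be found before any code depends on it. The sketch below is only an illustration: `check_packages` is a hypothetical helper name, and the module list assumes the usual import names (`torch` for PyTorch).

```python
from importlib.util import find_spec


def check_packages(names):
    """Return {module_name: bool} telling whether each top-level module is importable."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # "torch" is the import name of PyTorch; the list mirrors the checklist item.
    for name, ok in check_packages(["torch", "matplotlib", "tqdm"]).items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

Anything reported as missing can be installed before moving on to step 1.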
1. Paper Understanding and Mind Map (40 min)
- [x] Review CNN architectures (LeNet, VGG, ResNet level)
- [ ] Review the core ViT structure (Patch Embedding, Encoder Block, Classification Head)
- [ ] Create a mind map that shows the differences between CNN and ViT at a glance (layers, data flow, pros and cons)
2. CNN Implementation (40 min)
- [ ] Implement a simple CNN based on LeNet5
- [ ] Choose one dataset: MNIST or CIFAR-10
- [ ] Write the training loop and make an intermediate commit (e.g. 'feat: basic CNN implemented')
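
As a starting point for this step, here is a minimal sketch of a LeNet5-style network and a one-epoch training loop in PyTorch. It assumes MNIST-sized inputs (1x28x28) and swaps in ReLU and max pooling for the tanh and average pooling of the classic LeNet-5. The names `LeNet5` and `train_one_epoch` are illustrative, not fixed by the plan.

```python
import torch
from torch import nn


class LeNet5(nn.Module):
    """LeNet-5 style CNN for 1x28x28 inputs (assumed MNIST-sized)."""

    def __init__(self, num_classes: int = 10) -> None:
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5, padding=2),  # 28x28 -> 28x28
            nn.ReLU(),
            nn.MaxPool2d(2),                            # -> 14x14
            nn.Conv2d(6, 16, kernel_size=5),            # -> 10x10
            nn.ReLU(),
            nn.MaxPool2d(2),                            # -> 5x5
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 5 * 5, 120),
            nn.ReLU(),
            nn.Linear(120, 84),
            nn.ReLU(),
            nn.Linear(84, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))


def train_one_epoch(model, loader, optimizer, device="cpu"):
    """Run one pass over `loader` and return the mean loss per sample."""
    criterion = nn.CrossEntropyLoss()
    model.train()
    total_loss, total_seen = 0.0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        # Weight each batch loss by its size so the mean is per sample.
        total_loss += loss.item() * images.size(0)
        total_seen += images.size(0)
    return total_loss / total_seen
```

`loader` can be any iterable of `(images, labels)` batches, for example a `torch.utils.data.DataLoader` over MNIST. For CIFAR-10 the first convolution would need 3 input channels and the flattened size would change accordingly.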
3. ViT Implementation (2 hours)
- [ ] Patch Embedding (Conv or Reshape approach)
- [ ] Transformer Encoder Block (Multi-head Self-Attention, MLP, LayerNorm, Residual)
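
The two items above can be sketched as follows. This is one possible layout under stated assumptions, not a definitive implementation: the patch embedding uses the Conv approach (a convolution with kernel size equal to stride), the encoder block uses a pre-norm arrangement, and the defaults (`patch_size=4`, `embed_dim=64`, `num_heads=4`) are placeholder values chosen for 32x32 inputs such as CIFAR-10.

```python
import torch
from torch import nn


class PatchEmbedding(nn.Module):
    """Split an image into non-overlapping patches and project each to embed_dim."""

    def __init__(self, in_channels: int = 3, patch_size: int = 4, embed_dim: int = 64) -> None:
        super().__init__()
        # kernel_size == stride, so each output position sees exactly one patch.
        self.proj = nn.Conv2d(in_channels, embed_dim, kernel_size=patch_size, stride=patch_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.proj(x)                     # (B, D, H/P, W/P)
        return x.flatten(2).transpose(1, 2)  # (B, N, D), N = number of patches


class EncoderBlock(nn.Module):
    """Pre-norm Transformer encoder block: LayerNorm -> MHSA -> residual, LayerNorm -> MLP -> residual."""

    def __init__(self, embed_dim: int = 64, num_heads: int = 4, mlp_ratio: int = 4) -> None:
        super().__init__()
        self.norm1 = nn.LayerNorm(embed_dim)
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(embed_dim)
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, embed_dim * mlp_ratio),
            nn.GELU(),
            nn.Linear(embed_dim * mlp_ratio, embed_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)  # self-attention: q = k = v
        x = x + attn_out                                      # first residual connection
        return x + self.mlp(self.norm2(x))                    # second residual connection
```

With a batch of 32x32 RGB images, `PatchEmbedding()` yields 64 tokens of width 64 per image, and `EncoderBlock()` keeps that shape, so blocks can be stacked freely. The Reshape approach would cut the image into patches with tensor reshaping and apply an `nn.Linear` instead of the convolution. Positional embeddings and a class token are left out of this sketch.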