🧾 Project Prompt (Assignment)
Train a model that classifies data
You must deliver a baseline ML classifier and a short evaluation report for stakeholders.
Train a classifier on a real dataset (Titanic, Iris, or a Kaggle tabular dataset).
Requirements
- Clean the dataset (missing values, categorical encoding)
- Engineer at least 3 features
- Train at least 2 models (baseline + improved)
- Evaluate with appropriate metrics (accuracy + at least 1 more)
- Save the trained model artifact to disk
- Write a short “model card” describing data + approach + limitations
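As a rough illustration of the "2 models, 2 metrics, saved artifact" requirements, here is a minimal sketch using the Iris dataset bundled with scikit-learn. The choice of `DummyClassifier` as the baseline, `LogisticRegression` as the improved model, macro F1 as the second metric, and the `model.joblib` file name are all assumptions for illustration, not part of the assignment. Iris needs no cleaning, so that step and the feature engineering are omitted here.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.datasets import load_iris
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

SEED = 42  # assumed seed value; any fixed integer works

# Load Iris and hold out a stratified test set.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=SEED
)

# Two models: a trivial baseline and a (hopefully) better one.
models = {
    "baseline": DummyClassifier(strategy="most_frequent"),
    "improved": LogisticRegression(max_iter=1000),
}

# Evaluate each model with accuracy plus one additional metric (macro F1).
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    scores[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "macro_f1": f1_score(y_test, pred, average="macro", zero_division=0),
    }
    print(name, scores[name])

# Save the improved model to disk and confirm it can be loaded back.
# A temporary directory is used here; a real project would pick a stable path.
artifact_path = Path(tempfile.mkdtemp()) / "model.joblib"
joblib.dump(models["improved"], artifact_path)
reloaded = joblib.load(artifact_path)
```

The scores dictionary can feed directly into the evaluation report, and the reloaded model should give the same predictions as the one in memory.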
🛠 Tools (Tech Stack)
- Python
- pandas + numpy
- scikit-learn
- matplotlib/seaborn (optional)
💡 Implementation Hints
- Use a scikit-learn Pipeline (preprocess + model)
- Fix a random seed for reproducibility
- Include a confusion matrix for classification tasks
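The three hints above could be combined roughly as follows. This is a sketch under stated assumptions: the data frame is synthetic and only loosely Titanic-like (the column names `age`, `fare`, `sex`, `survived` and the rule that generates the label are invented for illustration), and `RandomForestClassifier` is just one possible model choice.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

SEED = 42  # one fixed seed reused for data generation, split, and model
rng = np.random.default_rng(SEED)

# Synthetic stand-in data (hypothetical columns, not the real Titanic file).
n = 200
df = pd.DataFrame({
    "age": rng.uniform(1, 80, n),
    "fare": rng.uniform(5, 100, n),
    "sex": rng.choice(["female", "male"], n),
})
df["survived"] = ((df["sex"] == "female") | (df["fare"] > 80)).astype(int)
# Inject some missing ages so the imputer has work to do.
df.loc[rng.choice(n, 20, replace=False), "age"] = np.nan

numeric = ["age", "fare"]
categorical = ["sex"]

# Preprocessing: impute + scale numeric columns, one-hot encode categoricals.
preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

# Full pipeline: preprocessing and model are fitted together on training data only.
clf = Pipeline([
    ("preprocess", preprocess),
    ("model", RandomForestClassifier(n_estimators=100, random_state=SEED)),
])

X_train, X_test, y_train, y_test = train_test_split(
    df[numeric + categorical], df["survived"],
    test_size=0.25, stratify=df["survived"], random_state=SEED,
)
clf.fit(X_train, y_train)

# Rows are true classes, columns are predicted classes, in the order [0, 1].
cm = confusion_matrix(y_test, clf.predict(X_test), labels=[0, 1])
print(cm)
```

Wrapping the preprocessing inside the pipeline means the imputer and scaler learn their statistics from the training split only, which avoids leaking test information. It also means the whole pipeline can be saved as a single artifact.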