Intro
Important Concepts
- ML Experiment:
- We are not talking about A/B Testing, but the process of building an ML model.
- Experiment Run:
- Each trial in an ML Experiment (including the model’s version, type, and hyperparameters).
- Run Artifact:
- Any file associated with an ML run.
- Experiment Metadata:
- All the information that is related to the experiment.
Experiment Tracking
- It’s the process of keeping track of all the relevant information from an ML experiment.
- It’s important because it enables:
- Reproducibility
- Organization
- Optimization
- Tracking experiments in spreadsheets may work at first, but it isn’t enough because of:
- Error-prone manual entry
- No standard format
- Poor visibility & collaboration
MLflow
Intro
- It’s the tool we’ll use for experiment tracking instead of spreadsheets.
- It’s an open-source platform for the ML lifecycle.
- It contains 4 main modules:
- Tracking
- Models
- Model Registry
- Projects
- The MLflow Tracking module allows you to keep track of:
- Parameters
- Includes any data relevant to the model experiment:
- Hyperparameters
- Data used
- Metrics
- Metadata
- Artifacts
- Models
- You may skip logging the model itself if you already track the hyperparameters needed to reproduce it for each trial.
- It also logs extra information for each run:
- Source Code (File name that was run)
- Code Version
- Start & End Time
- Name of the author
- To use the model registry feature in MLflow, we’ll need to connect it to an RDBMS:
- PostgreSQL
- MySQL
- SQLite
- MSSQL Server
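For local work, SQLite is the lightest of these options. A sketch of serving the tracking UI against it (the `.db` file name is an example; it’s created if missing):

```shell
# Serve the MLflow UI with SQLite as the backend store,
# which also enables the model registry
mlflow ui --backend-store-uri sqlite:///mlflow.db
```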
Installation
- Prefer to create a requirements.txt file including:
- mlflow
- jupyter
- scikit-learn
- pandas
- seaborn
- hyperopt
- xgboost
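With that file in place, the environment can be set up in one step (the virtual-environment name is arbitrary):

```shell
# Create an isolated environment and install the listed packages
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```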