You’ve trained a promising PyTorch model in a Jupyter notebook: the results look good, the charts are pretty, and the experiment runs smoothly. Then comes the question:
How do you turn this notebook into something reproducible, observable, and ready for production?
Many ML projects stall when moving from experimentation to engineering. Pipelines provide the structure, repeatability, and traceability needed to bridge that gap.
In this article, we’ll transform the official PyTorch Sequence-to-Sequence Translation Tutorial into a production-ready ML pipeline, using ZenML for orchestration and MLflow for experiment tracking.
Before touching code, start by structuring your repo. A clear, modular layout is the foundation of a maintainable pipeline.
translation-with-sequence2sequence-and-attention/
├── main.py                  # Entry point with CLI
├── config.yaml              # Configuration management
├── pyproject.toml           # Dependencies and project metadata
├── data/
│   └── eng-fra.txt          # Raw translation data
├── src/
│   ├── pipeline.py          # Main ZenML pipeline definition
│   ├── steps/               # Individual pipeline steps
│   │   ├── a_preprocess_data.py      # Data preprocessing step
│   │   ├── b_prepare_dataloaders.py  # Data preparation step
│   │   ├── c_train_model.py          # Model training step
│   │   └── d_evaluate_model.py       # Model evaluation step
│   └── utils/
│       ├── models/
│       │   ├── seq2seq.py
│       │   ├── encoder.py
│       │   ├── simple_decoder.py
│       │   └── attention_decoder.py
│       ├── materialisers/
│       │   ├── lang_vocab_materialiser.py
│       │   └── seq2seq_model_materialiser.py
│       ├── read_language.py
│       └── string_formatting.py
└── mlruns/                  # MLflow experiment tracking data
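All tunable parameters belong in the single `config.yaml` at the repo root. As an illustration, it might look like this (the keys and values below are assumptions based on the tutorial's hyperparameters, not copied from the repo):

```yaml
data:
  path: data/eng-fra.txt
  max_length: 10          # skip sentence pairs longer than this
training:
  hidden_size: 128
  batch_size: 32
  n_epochs: 80
  learning_rate: 0.001
mlflow:
  experiment_name: seq2seq-translation
```

Keeping these values out of the code means an experiment can be reproduced, or varied, without touching any Python file.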
Design principles:
- Configuration lives in config.yaml, so experiments can be changed and reproduced without editing code.
- Shared code (models, materialisers, helpers) lives in utils/, keeping each step small and focused.
- Each stage of the original notebook becomes a ZenML step. ZenML pipelines are composed of independent, reusable steps that define the flow of data and artifacts.
Below is a simple diagram of the flow:
graph TD
A[Preprocess Data] --> B[Prepare Dataloaders]
B --> C[Train Model]
C --> D[Evaluate Model]
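To make that ordering concrete, here is a ZenML-free sketch of the same four-stage data flow in plain Python. The function names mirror the step files above, but the bodies are placeholders, not the tutorial's actual logic; in the real pipeline each function would carry ZenML's step decorator and return tracked artifacts.

```python
# Plain-Python stand-ins for the four steps. Each one consumes the
# previous step's output, which is exactly the dependency graph the
# orchestrator infers from the pipeline definition.
def preprocess_data(raw_pairs):
    # Normalise the raw sentence pairs.
    return [(src.lower().strip(), tgt.lower().strip()) for src, tgt in raw_pairs]

def prepare_dataloaders(pairs, batch_size=2):
    # Naive batching as a stand-in for building real DataLoaders.
    return [pairs[i:i + batch_size] for i in range(0, len(pairs), batch_size)]

def train_model(batches):
    # Placeholder "model": just record how much data it saw.
    return {"batches_seen": len(batches)}

def evaluate_model(model):
    # Placeholder metrics dict.
    return {"batches_seen": model["batches_seen"], "bleu": None}

raw = [("Hello .", "Bonjour ."), ("Thanks .", "Merci .")]
metrics = evaluate_model(train_model(prepare_dataloaders(preprocess_data(raw))))
```

Because each step only takes the previous step's output as input, the execution order in the diagram falls out of the data dependencies rather than being hard-coded.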
In this case, the order is enforced end to end, but you can imagine parallelising parts of the pipeline for faster runs. For example, you could split the prepare-dataloaders step into a prepare-train-dataloader step and a prepare-test-dataloader step; the test dataloader could then be built while the model trains.
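As a sketch of that idea (the step names and the 80/20 split are assumptions for illustration), the two dataloader steps have no data dependency on each other, so they can run concurrently:

```python
from concurrent.futures import ThreadPoolExecutor

def prepare_train_dataloader(pairs):
    # First 80% of the preprocessed pairs (illustrative split).
    return pairs[: int(len(pairs) * 0.8)]

def prepare_test_dataloader(pairs):
    # Remaining 20%.
    return pairs[int(len(pairs) * 0.8):]

pairs = list(range(10))  # stand-in for preprocessed sentence pairs
with ThreadPoolExecutor() as pool:
    train_future = pool.submit(prepare_train_dataloader, pairs)
    test_future = pool.submit(prepare_test_dataloader, pairs)
train_data, test_data = train_future.result(), test_future.result()
```

In a ZenML pipeline you would not spawn threads yourself: simply defining two steps that both consume the preprocessing artifact, with no edge between them, lets an orchestrator that supports it schedule them in parallel.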