Notes taken by Abd @ITNA Digital

Links

🔗 Link to the video

Keywords

Docker, Docker Image, Data Pipeline, Advantages of Docker


Introduction to Docker

Keywords: Docker, Docker Image, Data Pipeline, Advantages of Docker, Reproducibility, Integration tests.

Docker delivers software in packages called containers. Containers are isolated from one another, so a data pipeline running inside a container is effectively isolated from everything else running on the computer.

Data pipeline: Gets in Data → Processes Data → Generates more Data!
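
To make the three stages concrete, here is a minimal sketch of a pipeline in Python. The function names (ingest, process, generate), the CSV columns (id, amount), and the filtering rule are all hypothetical and chosen only for illustration; a real pipeline would read from and write to files or a database.

```python
import csv
import io


def ingest(raw_csv):
    """Get data in: parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def process(rows):
    """Process data: keep rows with a positive amount, converted to float."""
    cleaned = []
    for row in rows:
        amount = float(row["amount"])
        if amount > 0:
            cleaned.append({"id": row["id"], "amount": amount})
    return cleaned


def generate(rows):
    """Generate more data: serialise the processed rows back to CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["id", "amount"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


def run_pipeline(raw_csv):
    """Chain the three stages: data in -> processing -> data out."""
    return generate(process(ingest(raw_csv)))


if __name__ == "__main__":
    sample = "id,amount\n1,10.5\n2,-3\n3,7\n"
    print(run_pipeline(sample))
```

Running a script like this inside a container means its Python version and libraries do not depend on what is installed on the host.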

Docker lets you run multiple databases and multiple operations in isolated environments on the same host computer without conflicts.
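
As an illustration of running two databases side by side, the sketch below builds the docker run commands for two Postgres containers from Python. It only assembles and prints the commands; it does not start anything. The container names, host ports, password, and the postgres:13 image tag are assumptions made for the example. Each container listens on port 5432 internally, and mapping them to different host ports is what avoids the conflict.

```python
import shlex


def postgres_run_command(name, host_port, password):
    """Build a `docker run` command for one isolated Postgres container.

    All argument values passed in are hypothetical examples.
    """
    return [
        "docker", "run", "-d",                  # run detached, in the background
        "--name", name,                         # unique container name
        "-e", f"POSTGRES_PASSWORD={password}",  # required by the official postgres image
        "-p", f"{host_port}:5432",              # host port -> container port 5432
        "postgres:13",                          # assumed image tag
    ]


# Two databases on the same host, each in its own container.
commands = [
    postgres_run_command("pg-app-a", 5432, "example-password"),
    postgres_run_command("pg-app-b", 5433, "example-password"),
]

if __name__ == "__main__":
    for cmd in commands:
        print(shlex.join(cmd))
```

The printed lines can be pasted into a terminal on a machine with Docker installed.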

pgAdmin allows you to communicate with the Postgres db.

An advantage of Docker is reproducibility: you can create a Docker image, a snapshot of your environment with everything needed to set it up again as an isolated Docker environment.
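
The instructions used to build an image are normally written in a file called a Dockerfile. The sketch below uses Python to render the text of a small, hypothetical Dockerfile for a pipeline script. The base image python:3.9, the pandas dependency, the /app directory, and the file name pipeline.py are assumptions for illustration, not taken from the video.

```python
def render_dockerfile(base_image, packages, script):
    """Return the text of a minimal Dockerfile for a single Python script."""
    lines = [
        f"FROM {base_image}",                      # start from an existing image
        f"RUN pip install {' '.join(packages)}",   # install the dependencies
        "WORKDIR /app",                            # working directory in the image
        f"COPY {script} {script}",                 # copy the script from the host
        f'ENTRYPOINT ["python", "{script}"]',      # command run when the container starts
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical values; adjust them to your own project.
    print(render_dockerfile("python:3.9", ["pandas"], "pipeline.py"))
```

Saving the output as a file named Dockerfile and building it produces an image that anyone with Docker can build and run the same way.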

A container created from that Docker image, in which we have specified and configured the environment, is not tied to the host computer. It can run essentially anywhere, for example on Google Cloud (Kubernetes) or AWS Batch.