Notes taken by Abd @ITNA Digital
Links
🔗 Link to the video
Keywords: Docker, Docker Image, Data Pipeline, Advantages of Docker, Reproducibility, Integration tests.
Docker delivers software in packages called containers. Containers are isolated from one another and from the host, so a data pipeline running inside a container is effectively separated from everything else running on the computer.
Data pipeline: Gets in Data → Processes Data → Generates more Data!
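To make the three stages concrete, here is a minimal sketch of a pipeline in Python. The column names (`id`, `amount`), the filtering rule, and the function name `run_pipeline` are assumptions made up for illustration; they are not from the video.

```python
import csv
import io


def run_pipeline(source_csv: str) -> str:
    """Take CSV text in, process it, and return new CSV text."""
    # Stage 1: get data in (parse the incoming CSV text).
    reader = csv.DictReader(io.StringIO(source_csv))

    # Stage 2: process (keep positive amounts, add a derived column).
    rows = []
    for row in reader:
        amount = float(row["amount"])
        if amount > 0:
            rows.append(
                {
                    "id": row["id"],
                    "amount": amount,
                    "amount_cents": int(round(amount * 100)),
                }
            )

    # Stage 3: generate more data (write the processed rows out as CSV).
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["id", "amount", "amount_cents"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    raw = "id,amount\n1,2.50\n2,-1.00\n3,4.00\n"
    print(run_pipeline(raw))
```

A script like this is the kind of thing that would be packaged into a container, together with the Python version and libraries it needs.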
Docker lets you run multiple databases and multiple operations in isolated environments on the same host computer without them conflicting with each other.
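As a rough illustration of running several databases side by side, the sketch below only builds the `docker run` commands for two Postgres containers; it does not execute them. The container names, password, host ports, and image tag are placeholder assumptions. `POSTGRES_PASSWORD` is the environment variable the official `postgres` image reads.

```python
import shlex
from typing import List


def postgres_run_command(
    name: str, host_port: int, password: str, tag: str = "13"
) -> List[str]:
    """Build (but do not run) a `docker run` command for a Postgres container."""
    return [
        "docker", "run", "-d",                # start detached, in the background
        "--name", name,                       # each container needs a unique name
        "-e", f"POSTGRES_PASSWORD={password}",
        "-p", f"{host_port}:5432",            # host port -> Postgres port in the container
        f"postgres:{tag}",
    ]


if __name__ == "__main__":
    # Two databases on one host: both listen on 5432 inside their own
    # container, but each is mapped to a different (assumed) host port.
    for name, port in [("pg_a", 5432), ("pg_b", 5433)]:
        print(shlex.join(postgres_run_command(name, port, "example")))
```

Each container sees its own port 5432, so the two databases only need distinct host ports and names to coexist.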
pgAdmin is a graphical tool that lets you connect to and manage the Postgres database.
A key advantage of Docker is reproducibility. You create a Docker image, which is a snapshot of your environment built from a set of instructions (a Dockerfile), and that image contains everything needed to set up the same isolated environment again.
Because the environment is specified and configured in the image, a container created from it is not tied to the host computer. It can run essentially anywhere, for example on Google Cloud (Kubernetes) or AWS Batch.