Location: Remote, Africa
Full-time
Role Overview
We are looking for a highly skilled HPC / Full-Stack Infrastructure Engineer to architect and manage the compute, data, and deployment backbone of our scientific AI systems. You will build and scale infrastructure that supports climate modeling, geospatial pipelines, and physics-informed machine learning workloads.
Key Responsibilities
- Architect, deploy, and maintain high-performance computing (HPC) environments
- Build reliable data and model pipelines for scientific AI workloads
- Implement MLOps and DevOps systems for model training, evaluation, and deployment
- Optimize compute workflows (GPU clusters, Kubernetes, distributed systems)
- Manage APIs, backend services, cloud infrastructure, and developer tooling
- Ensure secure, scalable, and efficient infrastructure across the stack
Requirements
- 3-5 years of experience in DevOps engineering, infrastructure engineering, or similar roles
- Expertise in Linux systems, Docker, Kubernetes, and GPU computing
- Experience with HPC schedulers or distributed computing frameworks (Slurm, MPI, Ray, Dask)
- Strong backend engineering experience (Python, Go, or Node.js)
- Experience with cloud platforms (AWS, GCP, Azure)
- Understanding of CI/CD pipelines and infrastructure-as-code tools (Terraform, Ansible)
- Experience supporting scientific or ML teams is a strong plus
- Comfortable communicating in English, both in writing and verbally