📔 About Us

We are dedicated to being the go-to data layer used to build AGI. The future of intelligence won’t be defined by compute alone—it will be shaped by the quality and richness of the data these systems learn from. At Sumo, we believe data is the most under-invested and under-appreciated driver of AI progress. We give AI teams access to data that can’t be scraped from the open web and that makes the difference between incremental improvement and breakthrough capability.


🏗️ Role Overview

The Data Engineer is the owner and architect of Sumo's entire data supply chain. You will manage the complex, large-scale data flow from partner ingestion (ingress), through rigorous preparation, to secure client delivery (egress). This role requires designing and operating high-throughput pipelines built for LLM-specific data transformation and strict data governance. You will manage our cloud environment and proprietary de-identification applications, owning the processes for privacy control, curation, and the final structuring of data that directly powers next-generation AI models.


🛠️ Key Responsibilities

Data Flow & Governance