Background

Today, different kinds of workloads are typically handled by different compute systems.

| System | Batch Processing | OLAP | Streaming |
|---|---|---|---|
| Presto/Trino | ⭐️★★ | ⭐️⭐️⭐️ | ★★★ |
| Spark | ⭐️⭐️⭐️ | ⭐️★★ | ⭐️⭐️★ |
| Impala | ★★★ | ⭐️⭐️⭐️ | ★★★ |
| Flink | ⭐️★★ | ⭐️⭐️★ | ⭐️⭐️⭐️ |

Use cases

Scores

★★★: Cannot handle this workload at all.

⭐️★★: Can handle some cases, but not well.

⭐️⭐️★: Can handle most cases, but does not dominate the market.

⭐️⭐️⭐️: Dominates the market.

Problems

  1. Most of the above systems are written in Java, which is less efficient than native code for compute-heavy workloads.
  2. They were designed for Hadoop, not for the cloud.
  3. No single system supports all workloads.
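
Problem 3 can be read directly off the score table. The following is a minimal sketch that encodes the star counts and checks the claim. The dictionary layout, the function names, and the threshold of two stars for "supports a workload" are assumptions made for illustration and are not part of the design.

```python
# Star counts transcribed from the score table
# (0 = cannot process at all, 3 = dominates the market).
SCORES = {
    "Presto/Trino": {"batch": 1, "olap": 3, "streaming": 0},
    "Spark":        {"batch": 3, "olap": 1, "streaming": 2},
    "Impala":       {"batch": 0, "olap": 3, "streaming": 0},
    "Flink":        {"batch": 1, "olap": 2, "streaming": 3},
}

# Assumption for illustration: a system "supports" a workload
# if it scores at least two stars on it.
MIN_STARS = 2


def systems_covering_all(scores, min_stars=MIN_STARS):
    """Return the systems that reach min_stars on every workload."""
    return [
        name
        for name, row in scores.items()
        if all(stars >= min_stars for stars in row.values())
    ]


def best_per_workload(scores):
    """Map each workload to the system(s) with the highest score."""
    workloads = next(iter(scores.values())).keys()
    best = {}
    for workload in workloads:
        top = max(row[workload] for row in scores.values())
        best[workload] = sorted(
            name for name, row in scores.items() if row[workload] == top
        )
    return best


print(systems_covering_all(SCORES))
print(best_per_workload(SCORES))
```

Under these assumptions the first call returns an empty list, and each workload has a different top-scoring system, which is the gap a unified platform is meant to close.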

Design

Design Goal

The goal is to design a unified platform for different computing workloads: