About three months ago, @Eric Fu wrote a design doc about the watermark in our streaming system. Every watermark must be assigned over a specified time window in that design. However, many cases of watermarks are not covered in that design.
Previously, we also wrote a deprecated document [Deprecated] RFC: Watermark in RisingWave (WaterMark part I), which mixed the
We decide to introduce a new message type:
pub enum Message {
Chunk(StreamChunk),
Barrier(Barrier),
Watermark(col_idx, Timestamp),
}
Term 1.1: Every record has multiple watermark columns with the timestamp type.
Term 1.2: If the order of a data record d in the stream is after another watermark record w, then d[watermark_col] > w.val.
Based on Term 1.2, we can get a noteworthy property:
Prop 1.1: We can postpone or remove any watermark record in a valid stream.