StreamExecutionEnvironment

DataStream 结构详解

DataStream Source
- basic
- GenerateSequence
- Collection
- Socket
- Local File
- third-party
- Kafka Connector ( source / sink )
- 下面针对 Kafka Connector 进行举例说明
- ElasticSearch Connector ( sink )
- Cassandra ( sink )
- Amazon Kinesis Streams ( source / sink )
- HDFS ( sink )
- RabbitMQ ( source / sink )
- Apache NiFi ( source / sink )
- Twitter Streaming API ( source )
- JDBC ( sink )
- ...
- Customize
DataStream Transformer
- One Record
- map
- filter
- flatmap ( 一对多转换 )
- Window
- NonKeyed DataStream ( 未经过 KeyBy 处理 )
- timeWindowAll
- countWindowAll
- windowAll
- Keyed DataSteam
- timeWindow
- coutWindow
- window
- Multi DataStream
- NonKeyed DataStream
- join
- connect
- coGroup
- union
- Keyed DataSteam
- Split & Side Output ( 旁路输出 )
- Shuffle
- global(): 全部发往第一个 task
- broadcast(): 所有数据广播给所有的 task
- foward(): 上下游并发度一样时,一对一发送
- shuffle(): 随机均匀分配
- rebalance(): 轮询
- recale(): 本地轮询
- partitionCustom(): 自定义单播
- Other
- KeyBy: 将数据按 key hash 分散到不同的分区中, 后续可以支持 Keyed Operator