Event Streaming using Kafka - Novice
This story covers events and the processing of events using Kafka.
Event- A change in the state of an object. It indicates that something happened at the source of the event.
Streaming- The unbounded, continuous flow of events from a source. Such an unbounded flow is called an event stream.
Producer- The source that produces an event or event stream.
Consumer- The component that continuously captures and analyzes the event stream.
There are three types of consumers in Kafka:
- Consumer API- For very simple use cases such as read, parse and route.
- Streams API- For complex logic such as reading pulse rates, fraud detection, or time-window-based use cases.
- KSQL- Similar to the Streams API, but for pre-defined use cases covered by KSQL's built-in features.
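As a sketch of the "read, parse and route" pattern, the minimal consumer below polls a topic and routes each record with a simple rule. The broker address, group id, topic name, and the `route` rule are all hypothetical, not part of the original text:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class SimpleConsumer {

    // Hypothetical routing rule: error events go to "alerts", everything else to "audit".
    static String route(String event) {
        return event.startsWith("error") ? "alerts" : "audit";
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("group.id", "novice-group");            // hypothetical consumer group
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("events")); // hypothetical topic
            while (true) {
                // read
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // parse and route
                    String target = route(record.value());
                    System.out.printf("value=%s -> %s%n", record.value(), target);
                }
            }
        }
    }
}
```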
- A Streams API consumer works in a similar fashion to a simple Kafka Consumer API consumer.
- The Kafka Streams API is built on top of the Kafka Consumer API. It has additional built-in capabilities to transform, enrich, or route the stream to another destination.
- It can be scaled vertically by adding more threads (more CPU) to an existing instance, and horizontally by running more instances on additional machines.
- It supports automatic fault tolerance and load balancing.
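A minimal Kafka Streams sketch of "transform, enrich, or route": it upper-cases each event (transform), appends a marker (enrich), and sends matching records to a second topic (route). The topic names and application id are hypothetical:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.KStream;

public class EnrichAndRoute {

    // Building the topology needs no broker; only start() does.
    static Topology buildTopology() {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> events = builder.stream("events"); // hypothetical input topic
        events
            .mapValues(v -> v.toUpperCase())          // transform
            .mapValues(v -> v + "|processed")         // enrich with a marker
            .filter((k, v) -> v.startsWith("ALERT"))  // keep only alert events
            .to("alerts");                            // route to another topic
        return builder.build();
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "novice-streams");   // hypothetical app id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        new KafkaStreams(buildTopology(), props).start(); // requires a running broker
    }
}
```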
Vertical scaling and load balancing
props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 3);
Horizontal scaling
Horizontal scaling is not a thread-count setting: start another instance of the same application (same application.id) on an additional machine, and Kafka automatically rebalances the topic's partitions across all running instances.
When there are two different consumer groups, each may run a different number of consumer threads. If both groups run on the same machine, that machine may limit how many consumers it can host, so a new machine can be added and one of the groups can run there instead.
Fault-tolerance and load balancing
Suppose one of the machines in the above diagram goes down. Kafka handles the failure automatically: the consumers that were running on the failed node are moved to a healthy node.
Kafka Streams API
There are two types of APIs:
Streams DSL: A very high-level API for the most common uses, such as map, filter, join, and aggregations. It is recommended whenever it matches the solution's requirements.
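As one example of a time-window-based aggregation in the DSL, the sketch below counts events per key in one-minute windows. The topic names are hypothetical, and the window API shown (`ofSizeWithNoGrace`) assumes Kafka Streams 3.0 or newer:

```java
import java.time.Duration;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.TimeWindows;

public class WindowedCount {

    static Topology buildTopology() {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> pulses = builder.stream("pulse-rate"); // hypothetical topic
        pulses
            .groupByKey()
            // one-minute tumbling windows, no grace period for late events
            .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(1)))
            .count()                              // events per key per window
            .toStream((windowedKey, count) -> windowedKey.key()) // drop the window, keep the key
            .mapValues(count -> count.toString())
            .to("pulse-counts");                  // hypothetical output topic
        return builder.build();
    }
}
```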
Kafka Processor API: A low-level API that requires more manual work on the development side. It connects processors and also interacts directly with state stores. The Processor interface is used to build custom processors.
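A minimal sketch of a custom processor wired into a topology by hand; the node names, topic names, and the upper-casing logic are hypothetical:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.processor.api.Processor;
import org.apache.kafka.streams.processor.api.ProcessorContext;
import org.apache.kafka.streams.processor.api.ProcessorSupplier;
import org.apache.kafka.streams.processor.api.Record;

public class UppercaseProcessorTopology {

    // A custom processor that upper-cases each value and forwards it downstream.
    static class UppercaseProcessor implements Processor<String, String, String, String> {
        private ProcessorContext<String, String> context;

        @Override
        public void init(ProcessorContext<String, String> context) {
            this.context = context;
        }

        @Override
        public void process(Record<String, String> record) {
            context.forward(record.withValue(record.value().toUpperCase()));
        }
    }

    static Topology buildTopology() {
        ProcessorSupplier<String, String, String, String> supplier = UppercaseProcessor::new;
        Topology topology = new Topology();
        // source -> custom processor -> sink, connected by node name
        topology.addSource("source",
                Serdes.String().deserializer(), Serdes.String().deserializer(), "events");
        topology.addProcessor("uppercase", supplier, "source");
        topology.addSink("sink", "events-upper",
                Serdes.String().serializer(), Serdes.String().serializer(), "uppercase");
        return topology;
    }
}
```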