Event Streaming using Kafka - Novice

Narayan Kumar
3 min read · Oct 23, 2020


This story talks about events and how to process event streams using Kafka.

Event- A change in the state of an object; it indicates that something happened at the source of the event. An unbounded flow of such events is called an event stream.

Streaming- The continuous, unbounded flow of events from a source.

Producer- The source that produces events or an event stream.

Consumer- Continuously captures and analyzes the event stream.

There are three types of consumers in Kafka:

  • Consumer API- For simple use cases such as reading data, parsing it, and routing it (a minimal sketch follows this list).
  • Streams API- For more complex logic such as computing a pulse rate, fraud detection, or time-window based use cases.
  • KSQL- Similar to the Streams API, but for predefined use cases covered by the features available in KSQL.
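To make the Consumer API concrete, here is a minimal sketch of a plain Kafka consumer in Java that reads records and could then parse and route them. The broker address, group id, and topic name are assumptions for illustration, not part of the original story.

import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class SimpleConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "simple-consumer-group");   // assumed consumer group
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("events"));                            // assumed topic name
            while (true) {
                // Poll the broker, then parse/route each record as needed.
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("key=%s value=%s%n", record.key(), record.value());
                }
            }
        }
    }
}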
Kafka Streams Consumer
  • A streams consumer (Streams API) works in a similar fashion to a consumer built with the simple Kafka Consumer API.
Functional aspects of the Kafka Streams consumer
  • The Kafka Streams API is built on top of the Kafka Consumer API. It has additional built-in capabilities to transform or enrich the stream and to route it in another direction (see the sketch after this list).
  • It can be scaled vertically by adding more stream threads or CPU to the existing infrastructure, and horizontally by adding more consumer instances on additional machines.
  • It supports automatic fault tolerance and load balancing.
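As an illustration of these capabilities, below is a minimal Kafka Streams sketch that reads a stream, transforms each record, and routes the result to another topic. The application id, broker address, and topic names are assumptions used only for this example.

import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class EnrichAndRoute {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "enrich-and-route");   // assumed application id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // assumed broker address
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> events = builder.stream("events");            // assumed input topic

        // Transform/enrich each record, then route it to another topic.
        events.mapValues(value -> value.toUpperCase())
              .to("enriched-events");                                         // assumed output topic

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}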

Vertical scaling and Load Balancing

A single consumer instance is scaled vertically by configuring multiple stream threads:

props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 3);

[Figure: Consumer 1 scaled vertically]

Horizontal scaling

props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 3);

When there are two different consumer applications (consumer groups), each can be configured with a different number of stream threads. Here, both consumer groups are running on the same machine.

[Figure: Two consumer groups on the same machine]

A single machine has limits on how many consumers it can run. To scale further, a new machine can be added so that one consumer group runs on a different machine, as sketched below.

[Figure: A new machine, Machine-2, is added]
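In Kafka Streams, one common way to run consumers on an additional machine is to start another instance of the same application (same application.id) on Machine-2: the instances then form one consumer group, and Kafka balances the input partitions across them. A minimal sketch, reusing the assumed names from the example above:

import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class Machine2Instance {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Same application.id as on Machine-1: both instances join one consumer
        // group, so the input partitions are split across the two machines.
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "enrich-and-route");   // assumed application id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // assumed broker address
        props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 3);                // threads per instance (vertical scaling)
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("events").to("enriched-events");                       // assumed topics, same topology as Machine-1

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}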

Fault-tolerance and load balancing

Suppose one of the machines in the above diagram goes down. Kafka automatically handles the failure and moves the work of the failed instance to a healthy node.

[Figure: Machine-2 down]

Kafka automatically moves the running consumer to a healthy node when its current node goes down.
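As a sketch of how this can be configured, the optional num.standby.replicas setting keeps a warm copy of each task's local state on another instance so that failover does not have to rebuild state from scratch, and a state listener makes the rebalance visible in the logs. The application id, broker address, and topic names are again assumptions.

import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class FaultTolerantInstance {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "enrich-and-route");   // assumed application id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // assumed broker address
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        // Keep a warm standby copy of each task's local state on another instance.
        props.put(StreamsConfig.NUM_STANDBY_REPLICAS_CONFIG, 1);

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("events").to("enriched-events");                       // assumed topics

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        // Log state transitions: when an instance dies, the group rebalances and
        // the surviving instances take over its tasks.
        streams.setStateListener((newState, oldState) ->
                System.out.println("State changed from " + oldState + " to " + newState));
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}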

Kafka Streams APIs

There are two types of APIs:

Streams DSL: A high-level API for most common use cases; it is the recommended choice when it fits the solution requirements. It provides operations such as map, filter, join, and aggregations.
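A minimal Streams DSL sketch with assumed topic names, showing filter, map, and a per-key count aggregation:

import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

public class DslExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "dsl-example");        // assumed application id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // assumed broker address
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> events = builder.stream("events");            // assumed input topic

        // filter + map: keep non-empty values and normalize them.
        KStream<String, String> cleaned = events
                .filter((key, value) -> value != null && !value.isEmpty())
                .mapValues(String::trim);

        // Aggregation: count events per key.
        KTable<String, Long> countsPerKey = cleaned
                .groupByKey()
                .count();

        countsPerKey.toStream()
                .to("event-counts", Produced.with(Serdes.String(), Serdes.Long())); // assumed output topic

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}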

Kafka Processor API: A low-level API that requires more manual work on the development side. It lets you connect processors and interact directly with state stores. The Processor interface is used to implement custom processors.
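Below is a minimal Processor API sketch, using the org.apache.kafka.streams.processor.api interfaces available in Kafka Streams 2.7 and later; the topic and store names are assumptions. A custom processor keeps a per-key count in a local state store and forwards the running count downstream.

import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.processor.api.Processor;
import org.apache.kafka.streams.processor.api.ProcessorContext;
import org.apache.kafka.streams.processor.api.Record;
import org.apache.kafka.streams.state.KeyValueStore;
import org.apache.kafka.streams.state.Stores;

public class ProcessorApiExample {

    // Custom processor: counts events per key in a state store and forwards the count.
    static class CountingProcessor implements Processor<String, String, String, Long> {
        private KeyValueStore<String, Long> store;
        private ProcessorContext<String, Long> context;

        @Override
        public void init(ProcessorContext<String, Long> context) {
            this.context = context;
            this.store = context.getStateStore("counts-store");
        }

        @Override
        public void process(Record<String, String> record) {
            Long current = store.get(record.key());
            long updated = (current == null ? 0L : current) + 1;
            store.put(record.key(), updated);
            context.forward(new Record<>(record.key(), updated, record.timestamp()));
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "processor-api-example"); // assumed application id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");     // assumed broker address
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        // Wire source -> processor -> sink by hand and attach the state store.
        Topology topology = new Topology();
        topology.addSource("Source", "events");                                  // assumed input topic
        topology.addProcessor("Count", CountingProcessor::new, "Source");
        topology.addStateStore(
                Stores.keyValueStoreBuilder(
                        Stores.persistentKeyValueStore("counts-store"),
                        Serdes.String(), Serdes.Long()),
                "Count");
        topology.addSink("Sink", "event-counts",                                  // assumed output topic
                Serdes.String().serializer(), Serdes.Long().serializer(), "Count");

        KafkaStreams streams = new KafkaStreams(topology, props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}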
