Kafka Streams Repartition
In Kafka Streams, records are partitioned by the key of the input message. From time to time, we want to partition the records by a different key. Kafka Streams has a simple API to achieve this:
KStream stream = ...
KStream repartitionedStream = stream.selectKey(...)
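selectKey takes a function that maps the old key and value to the new key. This function can be a lambda:
stream.selectKey((key, value) -> value.userId())
selectKey itself takes no serde arguments. The serdes for the new key and value are supplied to the downstream operation that writes the re-keyed data, for example through Grouped.with(keySerde, valueSerde) on a grouping or Produced.with(keySerde, valueSerde) on through.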
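Calling selectKey alone does not create a new topic, because Kafka Streams repartitions lazily: the stream is only marked as needing repartitioning. Data is actually shuffled through a topic only when a key-dependent operation such as groupBy or join, or an explicit .through("topic-name"), follows .selectKey(). If using through, the new topic needs to be created beforehand. In Kafka Streams 2.6 and later, through is deprecated in favor of repartition(), which creates and manages its topic itself.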
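To illustrate why a re-keyed stream has to be shuffled before a key-dependent operation, here is a small plain-Java model with no Kafka dependency. Everything in it is hypothetical: the Rec class, the single-letter keys, the "userId:payload" value format, and the hashCode-based partitionFor, which is only a simplified stand-in for Kafka's real partitioner. It shows that changing the key leaves records in their old partitions, so records sharing a new key can sit in different partitions until an explicit redistribution step.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

public class RekeySimulation {

    /** A keyed record, loosely modelling a Kafka message (hypothetical type). */
    static final class Rec {
        final String key;
        final String value;
        Rec(String key, String value) { this.key = key; this.value = value; }
    }

    /** Simplified stand-in for a partitioner: key hash modulo partition count. */
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    /** Distributes records into partitions according to their current key. */
    static List<List<Rec>> partitionByKey(List<Rec> records, int numPartitions) {
        List<List<Rec>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
        for (Rec r : records) {
            partitions.get(partitionFor(r.key, numPartitions)).add(r);
        }
        return partitions;
    }

    /** Models selectKey: the key changes, but each record stays in its partition. */
    static List<List<Rec>> selectKey(List<List<Rec>> partitions,
                                     BiFunction<String, String, String> mapper) {
        List<List<Rec>> result = new ArrayList<>();
        for (List<Rec> partition : partitions) {
            List<Rec> rekeyed = new ArrayList<>();
            for (Rec r : partition) {
                rekeyed.add(new Rec(mapper.apply(r.key, r.value), r.value));
            }
            result.add(rekeyed);
        }
        return result;
    }

    /** Models the repartition step: redistribute all records by their current key. */
    static List<List<Rec>> repartition(List<List<Rec>> partitions) {
        List<Rec> all = new ArrayList<>();
        for (List<Rec> partition : partitions) {
            all.addAll(partition);
        }
        return partitionByKey(all, partitions.size());
    }

    /** True if every record sits in the partition that its current key maps to. */
    static boolean isPartitionedByKey(List<List<Rec>> partitions) {
        for (int p = 0; p < partitions.size(); p++) {
            for (Rec r : partitions.get(p)) {
                if (partitionFor(r.key, partitions.size()) != p) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Rec> records = List.of(
            new Rec("a", "x:1"), new Rec("b", "x:2"),
            new Rec("c", "y:3"), new Rec("d", "y:4"));
        List<List<Rec>> byOldKey = partitionByKey(records, 2);
        List<List<Rec>> rekeyed = selectKey(byOldKey, (k, v) -> v.split(":", 2)[0]);
        System.out.println("after selectKey:   " + isPartitionedByKey(rekeyed));
        System.out.println("after repartition: " + isPartitionedByKey(repartition(rekeyed)));
    }
}
```

In this model, a per-key aggregation run directly on the re-keyed partitions would see only part of each user's records, which is the situation the repartition topic exists to prevent.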
A new topic is not needed if the operations that follow selectKey are not key-dependent and run in memory, record by record, such as filter or mapValues.
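Both cases can be sketched with the Kafka Streams DSL. This is a minimal sketch, assuming kafka-streams 2.1 or later on the classpath (Grouped was added in that release). The topic name "orders", the "userId:payload" value format, and the class and method names are hypothetical. No broker is needed, because the code only builds and describes the topologies without starting them.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.KStream;

public class RepartitionSketch {

    // Reads the hypothetical "orders" topic and re-keys each record by the
    // userId part of its "userId:payload" value (an assumed format).
    static KStream<String, String> rekeyed(StreamsBuilder builder) {
        KStream<String, String> stream =
            builder.stream("orders", Consumed.with(Serdes.String(), Serdes.String()));
        return stream.selectKey((key, value) -> value.split(":", 2)[0]);
    }

    // selectKey followed only by stateless, record-by-record operations.
    static Topology rekeyOnly() {
        StreamsBuilder builder = new StreamsBuilder();
        rekeyed(builder)
            .mapValues(value -> value.toUpperCase())
            .foreach((key, value) -> System.out.println(key + " -> " + value));
        return builder.build();
    }

    // selectKey followed by a key-dependent aggregation; the serdes for the
    // new key and value are passed to the grouping step, not to selectKey.
    static Topology rekeyThenGroup() {
        StreamsBuilder builder = new StreamsBuilder();
        rekeyed(builder)
            .groupByKey(Grouped.with(Serdes.String(), Serdes.String()))
            .count();
        return builder.build();
    }

    public static void main(String[] args) {
        // Print both topology descriptions for comparison.
        System.out.println(rekeyOnly().describe());
        System.out.println(rekeyThenGroup().describe());
    }
}
```

Comparing the two printed descriptions, only the second should list an internal topic whose name ends in -repartition, which Kafka Streams creates and manages itself.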