This article is so comprehensive. One can find their way around Kafka with this.
This article is awesome. It explains every capability of the tool so that we can go deeper as needed. As a beginner in Kafka, this is very helpful. Thanks!
This article is so helpful and easy to follow.
One suggestion is to allow click-to-zoom on the images for mobile users.
I am currently working in a team that heavily uses and builds on top of Kafka infrastructure (we make use of Kafka infra + Kafka Streams + Kafka Connect), and nowhere until this article have I come across such a crisp, succinct and comprehensive explanation. I have already bookmarked it as my go-to article moving forward! Thank you Stanislav!
🙇‍♂️
This is hands down one of the most comprehensive explainers on Kafka I've seen. Breaking down the whole stream-table duality thing and explaining how the metadata log works with KRaft was super helpful; I always found that aspect a bit confusing. The tiered storage section is also fascinating. I didn't realize how much cost savings you could get by offloading to S3. One thing I'm curious about is the consumer group protocol details: how does the coordinator handle a situation where a consumer is slow but not dead? Does it just wait indefinitely, or is there some kind of timeout? Either way, great stuff, definitely bookmarking this!
The relevant consumer timeouts are `session.timeout.ms` and `max.poll.interval.ms`; I wrote about it back in the day here: https://www.confluent.io/blog/apache-kafka-data-access-semantics-consumers-and-membership/
Basically, as long as the consumer heartbeats and actively polls for messages, it’ll remain in the group. It can accumulate consumer lag in that time.
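To make that rule concrete, here is a minimal Python sketch of the membership check. It is a simplified model, not the actual coordinator or client logic: the `ConsumerTimeouts` class and `membership_status` function are hypothetical names made up for illustration, and the default values are what I believe recent Kafka versions ship with for `session.timeout.ms` and `max.poll.interval.ms`, so check them against your own client version.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConsumerTimeouts:
    # Assumed defaults for recent Kafka clients; verify for your version.
    session_timeout_ms: int = 45_000        # session.timeout.ms
    max_poll_interval_ms: int = 300_000     # max.poll.interval.ms


def membership_status(ms_since_heartbeat: int,
                      ms_since_poll: int,
                      timeouts: ConsumerTimeouts = ConsumerTimeouts()) -> str:
    """Simplified model: is the consumer still considered a group member?"""
    # No heartbeat within the session timeout: the consumer looks dead.
    if ms_since_heartbeat > timeouts.session_timeout_ms:
        return "removed: session timeout"
    # Heartbeating but not calling poll(): the consumer looks stuck.
    if ms_since_poll > timeouts.max_poll_interval_ms:
        return "removed: poll interval exceeded"
    # Slow but alive: it stays in the group and may accumulate lag.
    return "member"


if __name__ == "__main__":
    # A slow consumer that still heartbeats and polls once a minute.
    print(membership_status(ms_since_heartbeat=1_000, ms_since_poll=60_000))
```

So a slow consumer is not waited on indefinitely: it stays in the group only while both clocks keep getting reset.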
If you’re surprised how much S3 can save you, you’ll be blown away by how much you can save by eliminating networking: https://topicpartition.io/blog/kip-1150-diskless-topics-in-apache-kafka
Incredible article! I was going to use Kafka for an earlier draft of my project. I scaled the project down and realized it had become overkill. But I really do like Kafka a lot.
For small cases, you should just use Postgres. https://topicpartition.io/blog/postgres-pubsub-queue-benchmarks
One of the best high-level overviews of Kafka.
Awesome article...