Kafka
- Authors
- Name
- Lucian Oprea
- @LucianDSA_
Kafka Fundamentals
1. What are the main components of Kafka (broker, topic, partition, etc.)?
2. How does Kafka differ from traditional messaging systems like ActiveMQ or RabbitMQ?
3. What is a Kafka cluster, and why is it recommended to run multiple brokers?
4. How do producers publish messages to Kafka topics?
5. Why is Kafka sometimes described as a “distributed commit log”? What does that imply for data retention and ordering?
Cluster Architecture & Setup
6. How does Kafka ensure durability of messages even if multiple brokers fail?
7. What are the key broker configuration parameters you typically tune (e.g., log.retention.hours, num.partitions)?
8. How do you add a new broker to an existing Kafka cluster?
9. If you are using Kafka without ZooKeeper (KIP-500), how is metadata handled differently?
Producers & Consumers
10. How does a Kafka producer handle message batching and compression?
11. What are Consumer Groups, and why are they crucial for scalability?
12. How do you commit offsets in a consumer? What’s the difference between manual and automatic offset commits?
13. Can you explain how a consumer rebalance happens?
14. If a consumer crashes right after polling but before committing offsets, what happens to those uncommitted messages?
15. How would you design a system to guarantee exactly-once delivery in Kafka?
16. Can you compare synchronous vs. asynchronous consumption in Kafka consumers, and what are the trade-offs?
17. Why does Kafka use partitions, and how do partitions help with scalability?
18. How does leader-follower replication work in Kafka?
Partitions & Replication
19. Why does Kafka use partitions, and how do partitions help with scalability?
20. How do you decide the right number of partitions for a topic?
21. What are the trade-offs of having a very high number of partitions in a topic?
Kafka Streams & Connect
22. If a partition's leader crashes and a follower is not fully in sync, what is the potential data impact?
23. What is Kafka Streams, and how does it differ from Spark Streaming or Flink?
24. How does stateful stream processing work in Kafka Streams?
25. What is Kafka Connect, and why is it used for data import/export tasks?
26. How would you handle schema evolution in a Kafka Connect pipeline?
Security & Monitoring
27. What built-in security features does Kafka offer (e.g., SSL, SASL)?
28. How can you enable encryption in transit between Kafka brokers and clients?
29. What metrics do you typically monitor in a Kafka cluster (e.g., under-replicated partitions)?
30. How do tools like Prometheus, Grafana, or Confluent Control Center help with monitoring Kafka?
31. How do you maintain and update ACLs in Kafka at scale?
32. What is “lag” in Kafka Consumer Groups, and how do you detect and resolve consistently high lag?
Performance & Tuning
33. How do you tune Kafka for low latency vs. high throughput?
34. What is the impact of acks settings on performance and reliability?
35. How do you optimize producer batch sizes or linger times?
36. Why might you adjust heap sizes or GC settings for Kafka brokers?
37. How do you troubleshoot frequent “OutOfMemoryError” crashes in a heavily loaded Kafka broker?
38. If you suspect disk I/O bottlenecks, what steps would you take to isolate and fix the problem?
39. If you had to handle a major Kafka cluster outage during peak traffic, how would you systematically restore service while preserving data consistency?