Kafka

Kafka Fundamentals

1. What are the main components of Kafka (broker, topic, partition, etc.)?

2. How does Kafka differ from traditional messaging systems like ActiveMQ or RabbitMQ?

3. What is a Kafka cluster, and why is it recommended to run multiple brokers?

4. How do producers publish messages to Kafka topics?

5. Why is Kafka sometimes described as a “distributed commit log”? What does that imply for data retention and ordering?

Cluster Architecture & Setup

6. How does Kafka ensure durability of messages even if multiple brokers fail?

7. What are the key broker configuration parameters you typically tune (e.g., log.retention.hours, num.partitions)?

8. How do you add a new broker to an existing Kafka cluster?

9. If you run Kafka without ZooKeeper (KRaft mode, introduced by KIP-500), how is metadata handled differently?

Producers & Consumers

10. How does a Kafka producer handle message batching and compression?

11. What are Consumer Groups, and why are they crucial for scalability?

12. How do you commit offsets in a consumer? What’s the difference between manual and automatic offset commits?

13. Can you explain how a consumer rebalance happens?

14. If a consumer crashes right after polling but before committing offsets, what happens to those uncommitted messages?

15. How would you design a system to guarantee exactly-once delivery in Kafka?

16. Can you compare synchronous vs. asynchronous consumption in Kafka consumers, and what are the trade-offs?

Partitions & Replication

17. Why does Kafka use partitions, and how do partitions help with scalability?

18. How does leader-follower replication work in Kafka?

19. How do you decide the right number of partitions for a topic?

20. What are the trade-offs of having a very high number of partitions in a topic?

21. If a partition leader crashes and a follower is not fully in sync, what is the potential data impact?

Kafka Streams & Connect

22. What is Kafka Streams, and how does it differ from Spark Streaming or Flink?

23. How does stateful stream processing work in Kafka Streams?

24. What is Kafka Connect, and why is it used for data import/export tasks?

25. How would you handle schema evolution in a Kafka Connect pipeline?

Security & Monitoring

26. What built-in security features does Kafka offer (e.g., SSL/TLS, SASL)?

27. How can you enable encryption in transit between Kafka brokers and clients?

28. What metrics do you typically monitor in a Kafka cluster (e.g., under-replicated partitions)?

29. How do tools like Prometheus, Grafana, or Confluent Control Center help with monitoring Kafka?

30. How do you maintain and update ACLs in Kafka at scale?

31. What is “lag” in Kafka Consumer Groups, and how do you detect and resolve consistently high lag?

Performance & Tuning

32. How do you tune Kafka for low latency vs. high throughput?

33. What is the impact of acks settings on performance and reliability?

34. How do you optimize producer batch sizes or linger times?

35. Why might you adjust heap sizes or GC settings for Kafka brokers?

36. How do you troubleshoot frequent “OutOfMemoryError” crashes in a heavily loaded Kafka broker?

37. If you suspect disk I/O bottlenecks, what steps would you take to isolate and fix the problem?

38. If you had to handle a major Kafka cluster outage during peak traffic, how would you systematically restore service while preserving data consistency?