Kafka Benchmarking

Kafka Introduction

Before understand the Kafka bench-marking, let me give a quick brief of what Kafka is and a few details about how it works. Kafka is a distributed messaging system originally built at LinkedIn and now part of Apache Software Foundation and used by variety of companies.

The general setup is quite simple. Producers send records to the cluster which holds on to these records and hands them out to consumers:

kafka1

The key abstraction in Kafka is the topic. Producers publish their records to a topic, and consumers subscribe to one or more topics. A Kafka topic is just a shared write-ahead log. Producers append records to these logs and consumers subscribe to changes. Each record is a key/value pair.

Here is a simple example of a single producer and consumer reading and writing from a two-partition topic.

kafka2

Kafka Benchmarking

Scenario 1:

All of these benchmarks were run on AWS EC2 instances with the following specifications:

  • Kafka cluster is running on three Machines; the spec of each Machine is Instance Type:-c3.4xlarge, 16 VCPUs, 30 GB RAM, Disk :- 2 * 160 GB (SSD’s) ,RAID:- 0 (Stripping) and XFS Filesystem
  • Zookeeper running on three mahcines; the spec of each machine is of t2.micro

These benchmarks were run for the following cases:

  • Single producer (single thread) and single consumer producing asynchronousyly on a topic with replication factor of 3 and 1 partitions.
  1. 9 producer (3 producer with three thread so total 9 (3*3) ) and 3 consumer producing asynchronously on a topic with replication factor of 3x and 1 partitions.

In this context, a producer running in async mode would mean that the number of acknowledgements the producer requires the leader to have received from its followers before considering a request complete is 0.

Case 1: Single producer (single thread) and Single Consumer producing asynchronously on a topic with replication factor of 3 and 1 partitions.

We have performed the test for various records(2 ,5,10 and 15 lakh) with different record size (10KB,5KB,2KB) by varying throughput (1000 ,5000 and 10000 )

Command:

Producer test performance command :

./bin/kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance –topic <topic-name> –num-records 1000000 –record-size <record-size> –producer-props bootstrap.servers=<broker-list> –throughput <value>

Consumer test performance command:

bin/kafka-consumer-perf-test.sh –zookeeper <zk-list> –threads 1 –topic <topic-name> –messages 1000000

Result:

kafka3

kafka4

Conclusion:

1) With decrease in record size for a message the average latency decreases

2) More Throughput more Latency
kafka5Latency

3) Throughout and Record -Size are inversely proportional.With more Record-size we have less throughput.

kafka6

Case 2:  9 producer (3 producer with three thread so total 9 (3*3) ) and 3 Consumer producing asynchronously on a topic with replication factor of 3x and 1 partitions.

Producer

kafka7

Consumer

kafka8

Conclusion:

By Increasing the number of Producer(compared to first case) the Latency of the producer also increases .

Scenario 2:

All of these benchmarks were run on AWS EC2 instances with the following specifications:

  • Kafka cluster is running on three Machines; the spec of each Machine is c3.4xlarge, 16 VCPUs, 2 * 160 GB (SSD’s), EBS ST1 disk type, EXT4 filesystem
  • Zookeeper running on three machines; the spec of each machine is of t2.micro

These benchmarks were run for the following cases:

1.Single producer (single thread) and single consumer producing asynchronously on a topic with replication factor of 3 and 1 partitions.

2. 9 producer (3 producer with three thread so total 9 (3*3) ) and 3 consumer producing asynchronously on a topic with replication factor of 3x and 1 partitions.

Case 1: Single producer (single thread) and Single Consumer producing asynchronously on a topic with replication factor of 3 and 1 partitions.

kafka9

kafka10

Result:

If we compare above result with Scenario 1 case 1 result ,we found that the avergae latency with EBS ST1 disk is very low.

 

Case 2:  9 producer (3 producer with three thread so total 9 (3*3) ) and 3 consumer producing asynchronously on a topic with replication factor of 3x and 1 partitions.

Producer

kafka11

Consumer

kafka12

Conclusion:

With increase in producer count (compared to above Scenario 2 case 1) the Latency of the producer also increases .

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s