Kafka: A detailed introduction

I’ll cover Kafka in detail, with an introduction to its programmability, and will try to cover almost its full architecture. So here it goes:-

We need Kafka when we have to build a real-time processing system: Kafka is a high-performance, highly scalable publish-subscribe messaging system. Traditional systems are unable to process such large volumes of data and are mainly used for offline analysis. Kafka addresses the real-time problems of a software solution; that is to say, it unifies offline and online data processing and routes data to multiple consumers quickly.

Below are the characteristics of Kafka:-

Persistent messaging: – TBs worth of messages are persisted on disk and replicated within the cluster to prevent data loss.
High Throughput: – Kafka is designed to handle hundreds of MBs of reads and writes per second on commodity hardware from a large number of clients.
Distributed: – Kafka is cluster-centric and can grow transparently without downtime.
Multiple Client Support: – It integrates easily with clients written in Java, .NET, PHP, Ruby, and Python.
Real Time: – Messages produced by the producer should be immediately visible to consumer threads.

We can build a similar solution with Java Message Service (JMS), but JMS has its own limitations:-

The performance of JMS is not good when dealing with large volumes of streaming data.
JMS is not a good horizontally scalable solution.
In JMS, the broker bears the responsibility of tracking how many messages each consumer has consumed, which is a major drawback.

Note: – Tools similar to Kafka include RabbitMQ, Flume and ActiveMQ.

A Kafka cluster can be installed on a single node with multiple brokers, or on multiple nodes with multiple brokers.

Note: – I have skipped Kafka installation because good references for each cluster type are already available on the Kafka website. I therefore assume Kafka is already installed and your current directory is “/usr/hdp/current/kafka-broker”.

With the diagram below you can understand components like:-

Producer: – Producers publish data to topics by choosing the appropriate partition within the topic.
ZooKeeper: – ZooKeeper is designed to store coordination data: status information, configuration, location information, and so on.
Consumer: – Consumers are the applications or processes that subscribe to topics and process the feed of published messages.
Broker: – A Kafka cluster consists of one or more servers, each of which may run one or more server processes; each such process is called a broker. Topics are created within the context of broker processes.
Topic: – A topic is a category or feed name to which messages are published by the message producers. Each message in a partition of the topic is assigned a unique sequential ID called the offset.

A producer publishes messages to a Kafka topic (you can think of it as a “messaging queue”). Kafka topics are created on a Kafka broker, which acts as a Kafka server and can also store the messages if required. Consumers then subscribe to one or more Kafka topics to get the messages.

In Kafka topics, every partition is mapped to a logical log file that is represented as a set of segment files of equal size. Every message in a partition is assigned a unique sequential number called the offset, which is used to identify the message within the partition. Each partition is replicated across a configurable number of servers. In a Kafka cluster, each server plays a dual role: it acts as a leader for some of its partitions and as a follower for other partitions. This keeps the load balanced within the Kafka cluster.
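
To make the partition and offset idea concrete, here is a minimal Python sketch of a topic as a set of append-only lists. It is only a toy model under my own assumptions: the class name TopicLog and its methods are hypothetical and are not part of any Kafka API, and segment files and replication are left out.

```python
class TopicLog:
    """Toy in-memory model of a topic: one append-only list per partition."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, partition, message):
        """Append a message and return the offset it was assigned."""
        log = self.partitions[partition]
        log.append(message)
        return len(log) - 1  # offsets start at 0 and grow by one per message

    def read(self, partition, offset):
        """Return every message in the partition from the given offset onward."""
        return self.partitions[partition][offset:]


topic = TopicLog(num_partitions=2)
first = topic.append(0, "a")   # first message in partition 0
second = topic.append(0, "b")  # second message in partition 0
other = topic.append(1, "c")   # offsets are counted separately per partition
```

Note that offsets are only meaningful inside one partition: partition 1 starts counting from zero again, independently of partition 0.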

Now let’s cover some of the Kafka programming model on the command line. I am running Kafka in standalone mode, but the real power of Kafka is unlocked when it runs in cluster mode with replication and appropriately partitioned topics. Cluster mode gives the power of parallelism and data safety even when a Kafka node goes down:-

First we create a Kafka topic. In my case the Kafka bin files are at “/usr/hdp/current/kafka-broker/bin”. The kafka-topics.sh utility will create a topic, override the default number of partitions from two to one, and show a successful creation message. It also takes the ZooKeeper server information, in this case localhost:2181.

[root@sanbox]# bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic Test

The preceding command creates a topic “Test” with a replication factor of 1 and 1 partition. Here we need to mention the ZooKeeper host and port number as well. The number of partitions determines the parallelism that can be achieved on the consumer side.
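
Why the partition count limits consumer parallelism can be sketched in a few lines of Python. This is an illustration only: assign_partitions is a hypothetical helper that spreads partitions round-robin, and it is not the assignment algorithm Kafka itself uses. The consumer names c1 and c2 are made up.

```python
def assign_partitions(num_partitions, consumers):
    """Spread partition ids over consumers round-robin (hypothetical helper)."""
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# With a single partition, the second consumer in the group has nothing to read.
one = assign_partitions(1, ["c1", "c2"])

# With four partitions, both consumers can work in parallel.
four = assign_partitions(4, ["c1", "c2"])
```

With one partition, as in the “Test” topic, adding more consumers to the same group does not add any parallelism.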

Now let’s list the topics on my Kafka server using the command below:-

[root@sanbox]# bin/kafka-topics.sh --list --zookeeper localhost:2181



We can find out more about a topic using the command below:-

[root@sanbox]# bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic Test

Topic:Test PartitionCount:1 ReplicationFactor:1 Configs:

Topic: Test Partition: 0 Leader: 0 Replicas: 0 Isr: 0

Now start a producer to send messages to the broker. In my case the Kafka broker is listening on port 6667:-

[root@sanbox]# bin/kafka-console-producer.sh --broker-list localhost:6667 --topic Test

Now let’s start a consumer to consume messages in another window:-

[root@sanbox]# bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic Test --from-beginning

After this command, any message typed at the producer’s end will appear in the consumer window.

Note: Here we are using a single producer connected to a single broker. If we need to use multiple brokers, we can pass a comma-separated list in the broker-list argument, for example:-

[root@sanbox]# bin/kafka-console-producer.sh --broker-list localhost:6667,localhost:6668 --topic replicated-kafkatopic

We can have one more architecture, called “Multiple nodes – multiple brokers”, as below:-

You can write custom client code for producers and consumers in languages such as Java, Python, Scala and JRuby, but Java is the official client for the Kafka broker.

We can start multiple Kafka brokers by writing multiple configuration files, i.e. one server.properties file per broker.

Now let’s have a look at the server.properties file and discuss some of its default values:-

default.replication.factor=1 > Default replication factor for automatically created topics
num.replica.fetchers=1 > Number of fetcher threads used to replicate messages from a source broker
and many more …

In the same way, the producer.properties file contains some of the default parameters for the producer.

Another important configuration file is consumer.properties, in the config folder, which contains the default values for the consumer.

Now let’s debug Kafka and have a look at the tools that come with the Kafka build:-

bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker: – Gives some information on a consumer group and its offsets.
bin/kafka-run-class.sh kafka.tools.DumpLogSegments: – Dumps the Kafka log data for debugging purposes, such as understanding how much of the log has been written and the status of the various segments.
bin/kafka-run-class.sh kafka.tools.ExportZkOffsets: – Very useful when we want to take a backup of the offsets saved in ZooKeeper.
bin/kafka-run-class.sh kafka.tools.KafkaMigrationTool: – Used when we want to migrate Kafka data from a lower version to a higher version.
bin/kafka-run-class.sh kafka.tools.MirrorMaker: – Used when we are running two different Kafka instances and want to replicate data from one to the other.
bin/kafka-run-class.sh kafka.tools.UpdateOffsetsInZK: – Used to reset the offset of a consumer in ZooKeeper while our Kafka instance is up and running.

In the next blog I’ll discuss integrating Kafka with Java and Python. We will write a simple producer and consumer in both languages.

Let’s review a few things from above and run through them again:-

In Window 1, run the command below:-

echo "This is my first message" | /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list sandbox.hortonworks.com:6667 --topic new_messages_recvd --new-producer

In Window 2, run the command below to see the messages from Window 1 appear:-

kafka-console-consumer.sh --zookeeper sandbox.hortonworks.com:2181 --topic new_messages_recvd

Now let us move a step further to understand what an offset is, which I skipped completely in my previous article.

Kafka maintains offsets per consumer group. A group represents a particular use case for consuming a topic. If we have two identical jobs running in parallel on a given topic to split the workload, they share the same group ID and collectively maintain the topic offsets.

Groups are managed by the high-level consumer, which handles the major aspects of consuming topics. These consumers manage the offsets per partition per topic, including parallel consumption by groups. ZooKeeper is responsible for storing this information.
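
The per-group bookkeeping described above can be sketched in Python as follows. This is a toy model, not Kafka or ZooKeeper code: OffsetStore and consume are hypothetical names, the dictionary stands in for what ZooKeeper stores, and the group names group_a and group_b are made up for the example.

```python
class OffsetStore:
    """Toy stand-in for the offset bookkeeping that ZooKeeper holds."""

    def __init__(self):
        self._offsets = {}  # (group, topic, partition) -> next offset to read

    def commit(self, group, topic, partition, offset):
        self._offsets[(group, topic, partition)] = offset

    def fetch(self, group, topic, partition):
        # A group that has never committed starts from offset 0.
        return self._offsets.get((group, topic, partition), 0)


def consume(store, log, group, topic, partition):
    """Read everything the group has not seen yet, then commit the new offset."""
    start = store.fetch(group, topic, partition)
    messages = log[start:]
    store.commit(group, topic, partition, start + len(messages))
    return messages


log = ["m0", "m1"]  # one partition holding two messages
store = OffsetStore()
first_read = consume(store, log, "group_a", "new_messages_recvd", 0)
second_read = consume(store, log, "group_a", "new_messages_recvd", 0)
other_group = consume(store, log, "group_b", "new_messages_recvd", 0)
```

The second read by the same group returns nothing new because the group offset has already advanced to 2, while a different group still sees both messages.
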
Let us post a message to the topic using the producer:-

echo "This is my Mukesh message" | /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list sandbox.hortonworks.com:6667 --topic new_messages_recvd --new-producer

Create a custom properties configuration file named “consumer.properties” and add the consumer group setting to it (this walkthrough appears to use group.id=hungry_hippo, the group that shows up in the ZooKeeper listing below):-



Now pass this properties file to the consumer using the command below and see the message appear:-

/usr/hdp/current/kafka-broker/bin/kafka-console-consumer.sh --zookeeper sandbox.hortonworks.com:2181 --topic new_messages_recvd --from-beginning --consumer.config consumer.properties

To see the offset details we need to go into ZooKeeper using the commands below:-

zookeeper-shell.sh sandbox.hortonworks.com:2181

ls /consumers

[console-consumer-2529, console-consumer-31045, hungry_hippo, console-consumer-11236]

ls /consumers/hungry_hippo/offsets/new_messages_recvd


get /consumers/hungry_hippo/offsets/new_messages_recvd/0


This is how Kafka manages the offset for a consumer group; in our case the offset is 2.

Let us proceed further into partitions: a topic can be divided into partitions, which may be distributed. Partitions distribute (shard) data across the brokers and enable parallelism and sequencing. Kafka is very fast because it reads data sequentially from disk. When a consumer reads a topic, the messages within each partition arrive in the order they were received.

We can key our messages; the default key value is null. By default, the routing scheme distributes messages among partitions based on their keys, and this routing scheme can be overridden with a custom implementation.
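
Key-based routing can be sketched in Python like this. It is an assumption-laden illustration: choose_partition uses a CRC32 checksum, which is not the hash Kafka itself uses, and picking a random partition for a null key is a simplification. vip_first_partitioner is a made-up example of a custom routing scheme.

```python
import random
import zlib


def choose_partition(key, num_partitions):
    """Pick a partition for a message (illustrative, not Kafka's own hash)."""
    if key is None:
        # No key: spread messages by picking any partition.
        return random.randrange(num_partitions)
    # Same key -> same checksum -> same partition, so per-key order is kept.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def vip_first_partitioner(key, num_partitions):
    """Hypothetical custom scheme: pin keys starting with 'VIP' to partition 0."""
    if key is not None and key.startswith("VIP"):
        return 0
    return choose_partition(key, num_partitions)


same_a = choose_partition("KEY1", 4)
same_b = choose_partition("KEY1", 4)
keyless = choose_partition(None, 4)
pinned = vip_first_partitioner("VIP-7", 4)
```

The property that matters is that every message with the same key lands in the same partition, which is what keeps messages for one key in order.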

/usr/hdp/current/kafka-broker/bin/kafka-console-consumer.sh --zookeeper sandbox.hortonworks.com:2181 --topic new_messages_recvd --property print.key=true --property key.separator=, --from-beginning

The above command passes two new parameters to the consumer: one tells it to print the key and the other specifies the separator.

Now let us specify these values at the producer using the command below:-

echo "KEY1,A keyed message" | /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list sandbox.hortonworks.com:6667 --topic new_messages_recvd --property parse.key=true --property key.separator=, --new-producer
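
What parse.key and key.separator do to an input line can be mimicked in a few lines of Python. This is only a sketch of the behaviour as I understand it, not the console tools' actual code; parse_keyed_line and format_keyed_message are hypothetical helper names.

```python
def parse_keyed_line(line, separator=","):
    """Split an input line into (key, value) at the first separator."""
    key, found, value = line.partition(separator)
    if not found:
        raise ValueError("no key separator found in line")
    return key, value


def format_keyed_message(key, value, separator=","):
    """Render a message the way a consumer printing keys would show it."""
    key_text = "null" if key is None else key  # unkeyed messages have a null key
    return f"{key_text}{separator}{value}"


key, value = parse_keyed_line("KEY1,A keyed message")
printed = format_keyed_message(key, value)
unkeyed = format_keyed_message(None, "This is my first message")
```

Only the first separator splits the line, so a value may itself contain commas.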

Topics have some basic configuration properties that can change the behavior of a topic. Many topic-related settings exist at the broker level, and they primarily control how messages are retained.

A message can live forever, for a period of time, or until the log reaches a size threshold, after which old log data is deleted; it can also be kept only as the latest version for its key. Log compaction can be used to maintain the latest version of a message with a given key. Before using it, we first need to enable log compaction in the primary server.properties:-
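
Separately from the broker setting, the idea behind log compaction itself can be sketched in Python: keep only the most recent value for each key. This is a simplified model under my own assumptions; compact is a hypothetical function, and real compaction works on log segments in the background rather than on a whole list at once.

```python
def compact(log):
    """Keep only the latest record per key, preserving original offset order."""
    latest = {}
    for offset, (key, value) in enumerate(log):
        latest[key] = (offset, value)  # later records overwrite earlier ones
    survivors = sorted(latest.items(), key=lambda item: item[1][0])
    return [(key, value) for key, (offset, value) in survivors]


log = [("KEY1", "v1"), ("KEY2", "v1"), ("KEY1", "v2")]
compacted = compact(log)
```

After compaction only the second KEY1 record survives, alongside the single KEY2 record.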


Now some cleanup stuff:-

To alter a topic:-

kafka-topics.sh --zookeeper sandbox.hortonworks.com:2181/chroot --alter --topic new_messages_recvd --partitions 40 --config delete.retention.ms=10000 --deleteConfig retention.ms

To delete a topic we use the command below:-

kafka-topics.sh --zookeeper sandbox.hortonworks.com:2181 --delete --topic new_messages_recvd

A graceful shutdown is sometimes very important, so use the following:-


Mirroring data between Kafka clusters: – Sometimes we need to copy data from multiple Kafka clusters into a single one.
