Apache Zookeeper is a distributed coordination service for building distributed applications. Kafka uses Zookeeper to manage its cluster.
Kafka ships with Zookeeper, which can be started as a service, but Zookeeper can also be installed standalone. This article demonstrates how to set up a Zookeeper cluster on Windows.
Prerequisites
The latest Zookeeper requires Java 1.8 or above (the JAVA_HOME environment variable must be set on the system).
7-Zip is needed to extract .tgz files on Windows.
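The Java prerequisite above can be checked from a script before going further. The following is a minimal sketch, not part of the Zookeeper distribution: the function names are hypothetical, and it assumes `java` is on the PATH and prints its version in the usual `version "..."` form.

```python
import re
import subprocess


def parse_java_major(version_output):
    """Extract the major Java version from `java -version` output.

    Returns None if no version string is found.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if not match:
        return None
    first = int(match.group(1))
    # Legacy scheme: "1.8.0_281" means Java 8.
    if first == 1 and match.group(2):
        return int(match.group(2))
    return first


def installed_java_major():
    """Run `java -version` and return the major version (assumes java is on PATH)."""
    # `java -version` prints to stderr, not stdout.
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return parse_java_major(result.stderr)
```

Calling `installed_java_major()` and comparing the result against 8 gives a quick pass/fail check for the minimum version.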
Software and Tools
Windows 10
Java 1.8 or higher
Zookeeper 3.7.0
Download and Extract
Download the latest Zookeeper binary tar.gz from the Apache Zookeeper download page:
https://zookeeper.apache.org/releases.html
Direct link:
https://www.apache.org/dyn/closer.lua/zookeeper/zookeeper-3.7.0/apache-zookeeper-3.7.0-bin.tar.gz
Extract the downloaded “apache-zookeeper-3.7.0-bin.tar.gz” to a local drive using 7-Zip. We are going to set up a 3-node cluster, so copy the extracted directory two more times.
Name the directories as follows:
zookeeper-node1
zookeeper-node2
zookeeper-node3
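The copy step can also be scripted. The sketch below is one possible approach, with a hypothetical helper name. Unlike the manual steps, it leaves the extracted directory untouched and makes one fresh copy per node.

```python
import shutil
from pathlib import Path


def clone_nodes(extracted_dir, workspace, node_count=3):
    """Copy the extracted Zookeeper directory once per node.

    Returns the created directories (zookeeper-node1 .. zookeeper-nodeN).
    """
    source = Path(extracted_dir)
    targets = []
    for n in range(1, node_count + 1):
        target = Path(workspace) / f"zookeeper-node{n}"
        # copytree raises FileExistsError if the target already exists,
        # so an existing node directory is never overwritten.
        shutil.copytree(source, target)
        targets.append(target)
    return targets
```

For example, `clone_nodes(r"C:\kafka-workspace\apache-zookeeper-3.7.0-bin", r"C:\kafka-workspace")` would create the three node directories, assuming the archive was extracted to that (illustrative) location.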
Configure Cluster
Zookeeper Node1
Step:1
Data Directory
Create a data directory in “zookeeper-node1”. We will use this directory path in the Zookeeper configuration.
Example:
C:\\kafka-workspace\\zookeeper-node1\\data
Step:2
Create “myid” file
Create a file named “myid” (with no file extension) in the data directory. The file must contain only the node number, which is “1” for this node.
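Steps 1 and 2 can be sketched together in a few lines of Python. The helper name `write_myid` is hypothetical, and the sketch assumes the data directory lives directly under the node directory, as in the example path above.

```python
from pathlib import Path


def write_myid(node_dir, node_id):
    """Create <node_dir>/data and write the node number into the myid file."""
    data_dir = Path(node_dir) / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    myid = data_dir / "myid"
    # The file holds only the node number, as plain ASCII text.
    myid.write_text(f"{node_id}\n", encoding="ascii")
    return myid
```

Calling `write_myid(r"C:\kafka-workspace\zookeeper-node1", 1)` would create the data directory if needed and write “1” into the myid file.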
Step:3
Zookeeper Configuration
Next, update a few Zookeeper configuration properties. Go to the “conf” directory inside zookeeper-node1 and rename “zoo_sample.cfg” to “zoo.cfg”.
Add or update the following properties in the “zoo.cfg” file.
clientPort=2181
dataDir=C:\\kafka-workspace\\zookeeper-node1\\data
autopurge.snapRetainCount=3
autopurge.purgeInterval=24
server.1=localhost:2666:3666
server.2=localhost:2667:3667
server.3=localhost:2668:3668
tickTime=2000
initLimit=10
syncLimit=5
server.<myid>=<hostname>:<leaderport>:<electionport>
We are setting up a 3-node cluster, so we have to configure one server.<myid> property per node. Each entry gives the host name, the port followers use to connect to the leader, and the port used for leader election. Zookeeper uses this information to form the cluster.
Note:
We are running all nodes on the same machine, so we have to make sure there are no port conflicts.
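The port check in the note above is easy to automate. The following sketch uses a hypothetical helper, `find_port_conflicts`, and assumes every node runs on the same host, so that any port used twice is a real conflict.

```python
def find_port_conflicts(client_ports, server_entries):
    """Return {port: [users]} for every port used more than once.

    client_ports:   {node id: clientPort}
    server_entries: {node id: "host:leaderport:electionport"}
    Assumes all nodes share one machine.
    """
    usage = {}
    for node_id, port in client_ports.items():
        usage.setdefault(int(port), []).append(f"node{node_id} clientPort")
    for node_id, entry in server_entries.items():
        _host, leader_port, election_port = entry.split(":")
        usage.setdefault(int(leader_port), []).append(f"server.{node_id} leader port")
        usage.setdefault(int(election_port), []).append(f"server.{node_id} election port")
    # Keep only ports that are claimed by more than one setting.
    return {port: users for port, users in usage.items() if len(users) > 1}
```

An empty result means every client, leader, and election port in the planned configuration is unique.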
Zookeeper Node2 & Zookeeper Node3
Repeat Steps 1, 2 and 3 of Zookeeper Node1 for Zookeeper Node2 and Node3, keeping the client port numbers unique and pointing each node at its own data directory.
Zookeeper Node2
The myid file content is “2”.
Add or update the following properties in the “zoo.cfg” file.
clientPort=2182
dataDir=C:\\kafka-workspace\\zookeeper-node2\\data
autopurge.snapRetainCount=3
autopurge.purgeInterval=24
server.1=localhost:2666:3666
server.2=localhost:2667:3667
server.3=localhost:2668:3668
tickTime=2000
initLimit=10
syncLimit=5
Zookeeper Node3
The myid file content is “3”.
Add or update the following properties in the “zoo.cfg” file.
clientPort=2183
dataDir=C:\\kafka-workspace\\zookeeper-node3\\data
autopurge.snapRetainCount=3
autopurge.purgeInterval=24
server.1=localhost:2666:3666
server.2=localhost:2667:3667
server.3=localhost:2668:3668
tickTime=2000
initLimit=10
syncLimit=5
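Because the three configurations differ only in clientPort and dataDir, they can be generated from one template to avoid copy-and-paste mistakes. The sketch below is illustrative: `render_zoo_cfg` is a hypothetical helper, and the port numbering simply follows the values used in this article.

```python
def render_zoo_cfg(node_id, workspace=r"C:\kafka-workspace", node_count=3):
    """Build the zoo.cfg text for one node of a single-machine cluster."""
    data_dir = f"{workspace}\\zookeeper-node{node_id}\\data"
    # zoo.cfg is read as a Java properties file, so each backslash is doubled.
    escaped = data_dir.replace("\\", "\\\\")
    lines = [
        f"clientPort={2180 + node_id}",
        f"dataDir={escaped}",
        "autopurge.snapRetainCount=3",
        "autopurge.purgeInterval=24",
    ]
    # Every node lists all members of the cluster, itself included.
    for n in range(1, node_count + 1):
        lines.append(f"server.{n}=localhost:{2665 + n}:{3665 + n}")
    lines += ["tickTime=2000", "initLimit=10", "syncLimit=5"]
    return "\n".join(lines) + "\n"
```

Writing `render_zoo_cfg(n)` to the zoo.cfg file of node n, for n from 1 to 3, produces the three configurations used in this article.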
Start Zookeeper Node 1, 2 and 3
Zookeeper Node1
Open a command prompt, change to the “zookeeper-node1” bin directory, and run the following command to start the Zookeeper server.
zkServer.cmd
The Zookeeper console logs give more details on the server startup.
Note:
You may see a “java.net.ConnectException: Connection refused: connect” warning. This happens because the other nodes have not been started yet.
Zookeeper Node2
Open a command prompt, change to the “zookeeper-node2” bin directory, and run the following command to start the Zookeeper server.
zkServer.cmd
Zookeeper Node3
Open a command prompt, change to the “zookeeper-node3” bin directory, and run the following command to start the Zookeeper server.
zkServer.cmd
In the cluster, only one node is the leader and all other nodes are followers. The leader and follower information can be viewed in the logs.
Node1 (Leader)
Node2 (Follower)
Node3 (Follower)
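Besides reading the logs, each node's role can be queried over its client port. The sketch below sends Zookeeper's `srvr` four-letter command and reads the `Mode:` line from the reply. It assumes `srvr` is enabled on the server (the `4lw.commands.whitelist` setting controls this); the function names are hypothetical.

```python
import socket


def parse_mode(srvr_response):
    """Return the value of the 'Mode:' line (e.g. leader, follower), or None."""
    for line in srvr_response.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def query_mode(host, port, timeout=5.0):
    """Send 'srvr' to a Zookeeper client port and return the reported mode."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"srvr")
        chunks = []
        # The server closes the connection after replying, which ends the loop.
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return parse_mode(b"".join(chunks).decode("utf-8", errors="replace"))
```

With the cluster running, calling `query_mode("localhost", port)` for ports 2181, 2182 and 2183 should report one leader and two followers.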
We have now successfully configured a Zookeeper cluster on Windows.
References
https://zookeeper.apache.org/doc/current/index.html