Wednesday, June 17, 2020

Deploy Apache Zookeeper on Kubernetes



In this post we'll review how to deploy Apache ZooKeeper on Kubernetes.

The Apache ZooKeeper project describes itself as:

"Apache ZooKeeper is an effort to develop and maintain an open-source server which enables highly reliable distributed coordination."

It can be used for cluster coordination tasks. For example, Kafka uses it to elect a controller among the cluster's brokers.

To deploy ZooKeeper on Kubernetes, we need the following:
  • A ConfigMap with its related configuration files
  • An exposed service enabling clients to access ZooKeeper
  • A headless service enabling coordination between the ZooKeeper instances
  • A StatefulSet to run the ZooKeeper instances
  • An init container to update the ZooKeeper instances' configuration
  • An updated ZooKeeper container to run the ZooKeeper instances


You might also find the post Deploy Apache Kafka on Kubernetes relevant.


The ConfigMap


The ConfigMap holds two files:
  • The logger configuration file
  • The ZooKeeper configuration file

Notice that the ZooKeeper configuration file is a template that will later be updated by the init container to include the list of ZooKeeper instances.


apiVersion: v1
kind: ConfigMap
metadata:
  name: zookeeper-config
data:
  log4j.properties: |-
    log4j.rootLogger=INFO, stdout
    log4j.appender.stdout=org.apache.log4j.ConsoleAppender
    log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
    log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c)%n

    # Suppress connection log messages, three lines per livenessProbe execution
    log4j.logger.org.apache.zookeeper.server.NIOServerCnxnFactory=WARN
    log4j.logger.org.apache.zookeeper.server.NIOServerCnxn=WARN
  zookeeper.properties: |-
    4lw.commands.whitelist=*
    tickTime=2000
    dataDir=/var/lib/zookeeper/data
    dataLogDir=/var/lib/zookeeper/log
    clientPort=2181
    maxClientCnxns=2
    initLimit=5
    syncLimit=2
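
As a quick sanity check, the ConfigMap can be applied and inspected with kubectl. The manifest file name below is just an assumed name for the YAML above.


# Apply the ConfigMap manifest (assumed file name) and print the stored ZooKeeper configuration
kubectl apply -f zookeeper-configmap.yaml
kubectl get configmap zookeeper-config -o jsonpath='{.data.zookeeper\.properties}'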


The Exposed Service


The exposed service is the service used by the ZooKeeper clients.


apiVersion: v1
kind: Service
metadata:
  name: zookeeper-service
spec:
  selector:
    configid: zookeeper-container
  ports:
    - port: 80
      targetPort: 2181
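
Note that the service maps port 80 to the ZooKeeper client port 2181. As a minimal sketch (assuming a throwaway busybox pod, which ships with nc), a client inside the cluster could reach ZooKeeper through the service like this:


# Run a temporary pod and send the "ruok" four letter word command through the service; expected reply: imok
kubectl run zk-client --rm -it --image=busybox --restart=Never -- \
  sh -c 'echo ruok | nc zookeeper-service 80'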



The Headless Service


The headless service is used only by the ZooKeeper instances, and is not used by the ZooKeeper clients. Its purpose is coordination between the ZooKeeper cluster instances.


apiVersion: v1
kind: Service
metadata:
  name: zookeeper-internal-service
spec:
  selector:
    configid: zookeeper-container
  type: ClusterIP
  clusterIP: None
  publishNotReadyAddresses: true
  ports:
    - port: 2888
      name: peer
    - port: 3888
      name: election
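
Because the service is headless, DNS returns the individual pod addresses rather than a single virtual IP, which is what lets each ZooKeeper instance address its peers as zookeeper-statefulset-<ordinal>.zookeeper-internal-service. A quick way to see this (a sketch assuming a throwaway busybox pod, which includes nslookup) is:


# List the per-pod DNS records behind the headless service
kubectl run dns-test --rm -it --image=busybox --restart=Never -- \
  nslookup zookeeper-internal-service

# Each instance is also resolvable by its stable per-pod name, for example:
kubectl run dns-test2 --rm -it --image=busybox --restart=Never -- \
  nslookup zookeeper-statefulset-0.zookeeper-internal-service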


The StatefulSet


The StatefulSet creates the actual instances of the ZooKeeper cluster.
It contains an init container, which updates the ZooKeeper instances' configuration, and the actual ZooKeeper container.


apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: zookeeper-statefulset
spec:
  serviceName: zookeeper-internal-service
  selector:
    matchLabels:
      configid: zookeeper-container
  replicas: 3
  podManagementPolicy: Parallel
  template:
    metadata:
      labels:
        configid: zookeeper-container
    spec:
      terminationGracePeriodSeconds: 10
      initContainers:
        - name: init
          image: my-registry/zookeeper-init:latest
          env:
            - name: ZOO_REPLICAS
              value: "3"
          volumeMounts:
            - name: configmap
              mountPath: /etc/kafka-configmap
            - name: config
              mountPath: /etc/kafka
            - name: data
              mountPath: /var/lib/zookeeper
      containers:
        - name: zookeeper
          image: my-registry/zookeeper:latest
          env:
            - name: KAFKA_LOG4J_OPTS
              value: -Dlog4j.configuration=file:/etc/kafka/log4j.properties
          command:
            - ./bin/zookeeper-server-start.sh
            - /etc/kafka/zookeeper.properties
          lifecycle:
            preStop:
              exec:
                command:
                  - "/bin/bash"
                  - "/pre_stop.sh"
          readinessProbe:
            exec:
              command:
                - "/bin/bash"
                - "/readiness_probe.sh"
          volumeMounts:
            - name: config
              mountPath: /etc/kafka
            - name: data
              mountPath: /var/lib/zookeeper
      volumes:
        - name: configmap
          configMap:
            name: zookeeper-config
        - name: config
          emptyDir: {}
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: "hostpath"
        resources:
          requests:
            storage: 500Mi
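
With all the manifests above saved to files (the file names below are assumptions), the whole deployment can be applied and watched until the three pods become ready:


# Apply all the manifests (assumed file names) and watch the pods start
kubectl apply -f zookeeper-configmap.yaml
kubectl apply -f zookeeper-service.yaml
kubectl apply -f zookeeper-internal-service.yaml
kubectl apply -f zookeeper-statefulset.yaml

kubectl get pods -l configid=zookeeper-container -w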


The init container


The init container's purpose is to update the list of ZooKeeper instances in the ZooKeeper configuration file.

It is based on the following Dockerfile:


FROM ubuntu:18.04
COPY files /
ENTRYPOINT /entrypoint.sh
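
The image can be built and pushed with a couple of docker commands; my-registry is a placeholder, and the files directory is assumed to contain an executable entrypoint.sh holding the script below. The ZooKeeper image in the next section is built the same way.


# Build and push the init image (registry name is a placeholder)
docker build -t my-registry/zookeeper-init:latest .
docker push my-registry/zookeeper-init:latest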


And on the following script, which adds the list of the ZooKeeper pods to the configuration file.


#!/bin/bash
set -e

[[ -d /var/lib/zookeeper/data ]] || mkdir /var/lib/zookeeper/data

# The server id is derived from the pod ordinal (the suffix of the pod hostname)
export ZOOKEEPER_SERVER_ID=${HOSTNAME##*-}
echo "my server id is ${ZOOKEEPER_SERVER_ID}"
echo "${ZOOKEEPER_SERVER_ID}" > /var/lib/zookeeper/data/myid

# Copy the ConfigMap files to a writable location and drop any existing server entries
cp -Lur /etc/kafka-configmap/* /etc/kafka/
sed -i "/^server\\./d" /etc/kafka/zookeeper.properties

# ensure new line in file
echo "" >> /etc/kafka/zookeeper.properties

# Add a server entry per replica, using the stable per-pod DNS names of the headless service
for N in $(seq ${ZOO_REPLICAS})
do
  index=$(( $N - 1 ))
  serverName="zookeeper-statefulset-${index}.zookeeper-internal-service"
  echo "server.${index}=${serverName}:2888:3888:participant" >> /etc/kafka/zookeeper.properties
done

# Bind this instance's own entry to all interfaces
sed -i "s/server\.$ZOOKEEPER_SERVER_ID\=[a-z0-9.-]*/server.$ZOOKEEPER_SERVER_ID=0.0.0.0/" /etc/kafka/zookeeper.properties
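
To verify the result, the generated configuration can be inspected on a running pod. For ZOO_REPLICAS=3 we expect three server lines, with the pod's own entry bound to 0.0.0.0. For example, on the first pod:


# Inspect the generated configuration on the first pod;
# for pod ordinal 0 the expected server lines are:
#   server.0=0.0.0.0:2888:3888:participant
#   server.1=zookeeper-statefulset-1.zookeeper-internal-service:2888:3888:participant
#   server.2=zookeeper-statefulset-2.zookeeper-internal-service:2888:3888:participant
kubectl exec zookeeper-statefulset-0 -- cat /etc/kafka/zookeeper.properties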




The updated ZooKeeper container


The updated ZooKeeper container is created using the following Dockerfile:

FROM solsson/kafka:2.4.1@sha256:79761e15919b4fe9857ec00313c9df799918ad0340b684c0163ab7035907bb5a
RUN apt-get update && apt-get install -y net-tools curl

COPY files /
ENTRYPOINT /entrypoint.sh


The image includes a cleanup script, pre_stop.sh, which shuts ZooKeeper down gracefully:


#!/bin/bash
# Ask the ZooKeeper process (PID 1 in the container) to terminate
kill -s TERM 1

# Wait until the process has exited, so the pod shuts down cleanly
while kill -0 1 2>/dev/null
do
  sleep 1
done
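
A simple way to observe the graceful shutdown (assuming the three replicas are already running) is to delete one pod and watch the StatefulSet recreate it:


# Delete one pod and watch it terminate cleanly before being recreated
kubectl delete pod zookeeper-statefulset-2
kubectl get pods -l configid=zookeeper-container -w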



It also includes a readiness probe script, readiness_probe.sh:


#!/bin/bash
set -e

# Send the "ruok" four letter word command to the local instance and expect "imok" back
response=$(echo ruok | nc -w 1 -q 1 127.0.0.1 2181)

if [[ "$response" == "imok" ]]
then
  exit 0
fi

exit 1
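
The same check can be executed manually against a running instance; note that it relies on 4lw.commands.whitelist=* in the ConfigMap above:


# Run the probe script manually; an exit code of 0 means the instance is ready
kubectl exec zookeeper-statefulset-0 -- /bin/bash /readiness_probe.sh
echo $?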


Final Notes

In this post we have reviewed deploying a ZooKeeper cluster on Kubernetes.
If the ZOO_REPLICAS environment variable is set to "1", ZooKeeper runs in standalone mode.
If the ZOO_REPLICAS environment variable is set to "3" or more, ZooKeeper runs in cluster (replicated) mode.
In both cases, the replicas field of the StatefulSet should be set to the same value.
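
A quick way to tell which mode an instance ended up in is the srvr four letter word command, whose output includes a Mode line (standalone, leader, or follower):


# Check the instance mode; the output ends with a line such as "Mode: follower"
kubectl exec zookeeper-statefulset-0 -- bash -c 'echo srvr | nc -w 1 -q 1 127.0.0.1 2181'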

