Wednesday, June 17, 2020

Deploy Apache Zookeeper on Kubernetes



In this post we'll review how to deploy Apache ZooKeeper on Kubernetes.

The Apache ZooKeeper project describes itself as:

"Apache ZooKeeper is an effort to develop and maintain an open-source server which enables highly reliable distributed coordination."

It can be used for cluster coordination tasks. For example, Kafka uses it to elect a controller among the cluster's brokers.

To deploy ZooKeeper on Kubernetes, we need the following:
  • A ConfigMap with its related configuration files
  • An exposed service enabling clients to access ZooKeeper
  • A headless service enabling coordination between the ZooKeeper instances
  • A StatefulSet to run the ZooKeeper instances
  • An init container to update the ZooKeeper instances' configuration
  • An updated ZooKeeper container to run the ZooKeeper instances


You might also find the post Deploy Apache Kafka on Kubernetes relevant.


The ConfigMap


The ConfigMap holds two files:
  • The logger configuration file
  • The ZooKeeper configuration file

Notice that the ZooKeeper configuration file is a template that will later be updated by the init container to include the list of ZooKeeper instances.


apiVersion: v1
kind: ConfigMap
metadata:
  name: zookeeper-config
data:
  log4j.properties: |-
    log4j.rootLogger=INFO, stdout
    log4j.appender.stdout=org.apache.log4j.ConsoleAppender
    log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
    log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c)%n

    # Suppress connection log messages, three lines per livenessProbe execution
    log4j.logger.org.apache.zookeeper.server.NIOServerCnxnFactory=WARN
    log4j.logger.org.apache.zookeeper.server.NIOServerCnxn=WARN
  zookeeper.properties: |-
    4lw.commands.whitelist=*
    tickTime=2000
    dataDir=/var/lib/zookeeper/data
    dataLogDir=/var/lib/zookeeper/log
    clientPort=2181
    maxClientCnxns=2
    initLimit=5
    syncLimit=2
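
As a quick sanity check, the ConfigMap can be applied and inspected with kubectl. The manifest file name below is just an assumed name for the YAML above.


# Apply the ConfigMap manifest (assumed file name) and print the stored ZooKeeper configuration
kubectl apply -f zookeeper-configmap.yaml
kubectl get configmap zookeeper-config -o jsonpath='{.data.zookeeper\.properties}'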


The Exposed Service


The exposed service is the service used by the ZooKeeper clients.


apiVersion: v1
kind: Service
metadata:
  name: zookeeper-service
spec:
  selector:
    configid: zookeeper-container
  ports:
    - port: 80
      targetPort: 2181
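
Note that the service maps port 80 to the ZooKeeper client port 2181. As a minimal sketch (assuming a throwaway busybox pod, which ships with nc), a client inside the cluster could reach ZooKeeper through the service like this:


# Run a temporary pod and send the "ruok" four letter word command through the service; expected reply: imok
kubectl run zk-client --rm -it --image=busybox --restart=Never -- \
  sh -c 'echo ruok | nc zookeeper-service 80'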



The Headless Service


The headless service is used only by the ZooKeeper instances, and is not used by the ZooKeeper clients. Its purpose is coordination between the ZooKeeper cluster instances.


apiVersion: v1
kind: Service
metadata:
  name: zookeeper-internal-service
spec:
  selector:
    configid: zookeeper-container
  type: ClusterIP
  clusterIP: None
  publishNotReadyAddresses: true
  ports:
    - port: 2888
      name: peer
    - port: 3888
      name: election
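
Because the service is headless, DNS returns the individual pod addresses rather than a single virtual IP, which is what lets each ZooKeeper instance address its peers as zookeeper-statefulset-<ordinal>.zookeeper-internal-service. A quick way to see this (a sketch assuming a throwaway busybox pod, which includes nslookup) is:


# List the per-pod DNS records behind the headless service
kubectl run dns-test --rm -it --image=busybox --restart=Never -- \
  nslookup zookeeper-internal-service

# Each instance is also resolvable by its stable per-pod name, for example:
kubectl run dns-test2 --rm -it --image=busybox --restart=Never -- \
  nslookup zookeeper-statefulset-0.zookeeper-internal-service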


The StatefulSet


The StatefulSet creates the actual instances of the ZooKeeper cluster.
It contains an init container, which updates the ZooKeeper instances' configuration, and the actual ZooKeeper container.


apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: zookeeper-statefulset
spec:
  serviceName: zookeeper-internal-service
  selector:
    matchLabels:
      configid: zookeeper-container
  replicas: 3
  podManagementPolicy: Parallel
  template:
    metadata:
      labels:
        configid: zookeeper-container
    spec:
      terminationGracePeriodSeconds: 10
      initContainers:
        - name: init
          image: my-registry/zookeeper-init:latest
          env:
            - name: ZOO_REPLICAS
              value: "3"
          volumeMounts:
            - name: configmap
              mountPath: /etc/kafka-configmap
            - name: config
              mountPath: /etc/kafka
            - name: data
              mountPath: /var/lib/zookeeper
      containers:
        - name: zookeeper
          image: my-registry/zookeeper:latest
          env:
            - name: KAFKA_LOG4J_OPTS
              value: -Dlog4j.configuration=file:/etc/kafka/log4j.properties
          command:
            - ./bin/zookeeper-server-start.sh
            - /etc/kafka/zookeeper.properties
          lifecycle:
            preStop:
              exec:
                command:
                  - "/bin/bash"
                  - "/pre_stop.sh"
          readinessProbe:
            exec:
              command:
                - "/bin/bash"
                - "/readiness_probe.sh"
          volumeMounts:
            - name: config
              mountPath: /etc/kafka
            - name: data
              mountPath: /var/lib/zookeeper
      volumes:
        - name: configmap
          configMap:
            name: zookeeper-config
        - name: config
          emptyDir: {}
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: "hostpath"
        resources:
          requests:
            storage: 500Mi
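
With all the manifests above saved to files (the file names below are assumptions), the whole deployment can be applied and watched until the three pods become ready:


# Apply all the manifests (assumed file names) and watch the pods start
kubectl apply -f zookeeper-configmap.yaml
kubectl apply -f zookeeper-service.yaml
kubectl apply -f zookeeper-internal-service.yaml
kubectl apply -f zookeeper-statefulset.yaml

kubectl get pods -l configid=zookeeper-container -w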


The init container


The init container's purpose is to update the list of ZooKeeper instances in the ZooKeeper configuration file.

It is based on the following Dockerfile:


FROM ubuntu:18.04
COPY files /
ENTRYPOINT /entrypoint.sh
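
The image can be built and pushed with a couple of docker commands; my-registry is a placeholder, and the files directory is assumed to contain an executable entrypoint.sh holding the script below. The ZooKeeper image in the next section is built the same way.


# Build and push the init image (registry name is a placeholder)
docker build -t my-registry/zookeeper-init:latest .
docker push my-registry/zookeeper-init:latest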


And on the following script, which adds the list of the ZooKeeper pods to the configuration file.


#!/bin/bash
set -e

[[ -d /var/lib/zookeeper/data ]] || mkdir /var/lib/zookeeper/data

# The server id is derived from the pod ordinal (the suffix of the pod hostname)
export ZOOKEEPER_SERVER_ID=${HOSTNAME##*-}
echo "my server id is ${ZOOKEEPER_SERVER_ID}"
echo "${ZOOKEEPER_SERVER_ID}" > /var/lib/zookeeper/data/myid

# Copy the ConfigMap files to a writable location and drop any existing server entries
cp -Lur /etc/kafka-configmap/* /etc/kafka/
sed -i "/^server\\./d" /etc/kafka/zookeeper.properties

# ensure new line in file
echo "" >> /etc/kafka/zookeeper.properties

# Add a server entry per replica, using the stable per-pod DNS names of the headless service
for N in $(seq ${ZOO_REPLICAS})
do
  index=$(( $N - 1 ))
  serverName="zookeeper-statefulset-${index}.zookeeper-internal-service"
  echo "server.${index}=${serverName}:2888:3888:participant" >> /etc/kafka/zookeeper.properties
done

# Bind this instance's own entry to all interfaces
sed -i "s/server\.$ZOOKEEPER_SERVER_ID\=[a-z0-9.-]*/server.$ZOOKEEPER_SERVER_ID=0.0.0.0/" /etc/kafka/zookeeper.properties
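
To verify the result, the generated configuration can be inspected on a running pod. For ZOO_REPLICAS=3 we expect three server lines, with the pod's own entry bound to 0.0.0.0. For example, on the first pod:


# Inspect the generated configuration on the first pod;
# for pod ordinal 0 the expected server lines are:
#   server.0=0.0.0.0:2888:3888:participant
#   server.1=zookeeper-statefulset-1.zookeeper-internal-service:2888:3888:participant
#   server.2=zookeeper-statefulset-2.zookeeper-internal-service:2888:3888:participant
kubectl exec zookeeper-statefulset-0 -- cat /etc/kafka/zookeeper.properties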




The updated ZooKeeper container


The updated ZooKeeper container is created using the following Dockerfile:

FROM solsson/kafka:2.4.1@sha256:79761e15919b4fe9857ec00313c9df799918ad0340b684c0163ab7035907bb5a
RUN apt-get update && apt-get install -y net-tools curl

COPY files /
ENTRYPOINT /entrypoint.sh


The image includes a cleanup script, pre_stop.sh, which shuts ZooKeeper down gracefully:


#!/bin/bash
# Ask the ZooKeeper process (PID 1 in the container) to terminate
kill -s TERM 1

# Wait until the process has exited, so the pod shuts down cleanly
while kill -0 1 2>/dev/null
do
  sleep 1
done
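
A simple way to observe the graceful shutdown (assuming the three replicas are already running) is to delete one pod and watch the StatefulSet recreate it:


# Delete one pod and watch it terminate cleanly before being recreated
kubectl delete pod zookeeper-statefulset-2
kubectl get pods -l configid=zookeeper-container -w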



It also includes a readiness probe script, readiness_probe.sh:


#!/bin/bash
set -e

# Send the "ruok" four letter word command to the local instance and expect "imok" back
response=$(echo ruok | nc -w 1 -q 1 127.0.0.1 2181)

if [[ "$response" == "imok" ]]
then
  exit 0
fi

exit 1
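
The same check can be executed manually against a running instance; note that it relies on 4lw.commands.whitelist=* in the ConfigMap above:


# Run the probe script manually; an exit code of 0 means the instance is ready
kubectl exec zookeeper-statefulset-0 -- /bin/bash /readiness_probe.sh
echo $?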


Final Notes

In this post we have reviewed deploying a ZooKeeper cluster on Kubernetes.
If the ZOO_REPLICAS environment variable is set to "1", ZooKeeper runs in standalone mode.
If the ZOO_REPLICAS environment variable is set to "3" or more, ZooKeeper runs in cluster (replicated) mode.
In both cases, the replicas field of the StatefulSet should be set to the same value.
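
A quick way to tell which mode an instance ended up in is the srvr four letter word command, whose output includes a Mode line (standalone, leader, or follower):


# Check the instance mode; the output ends with a line such as "Mode: follower"
kubectl exec zookeeper-statefulset-0 -- bash -c 'echo srvr | nc -w 1 -q 1 127.0.0.1 2181'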

