Wednesday, November 20, 2019

redis cluster survivability in kubernetes

A week ago, I published an article on how to deploy a redis cluster on kubernetes.
In that article, I added an entrypoint.sh, which handles the IP change of a redis pod when it is restarted.

But, that's not enough.

If all of the redis cluster pods are restarted at once, the redis pods will no longer be able to communicate with each other.

Why?

Each redis pod holds a nodes.conf file, which lists the redis nodes in the cluster. Each line in the file contains the redis node ID and the redis node IP. If we restart all of the redis cluster nodes, all of the node IPs change, so a redis node cannot re-establish connections with the other nodes using the out-of-date IPs in its nodes.conf.
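To make the file format concrete, here is a minimal sketch that splits one nodes.conf line into its parts. The sample line (including its IP) and the parseNodeLine helper are made up for illustration; the layout assumed is node ID first, then ip:port@cport, then the remaining fields.

```javascript
// Hypothetical sample line; the IP and the trailing fields are illustrative only.
const sampleLine =
  '79315a4ceef00496afc8fa7a97874e5b71dc547b 10.42.0.15:6379@16379 master - 0 1574240000000 1 connected 0-5460'

// Split a nodes.conf line into the node ID, the IP, and everything after the IP.
// Returns null for lines that do not describe a node, such as the "vars" line.
function parseNodeLine(line) {
  const sections = line.trim().match(/^(\S+) (\S+?)(:\d+.*)$/)
  if (sections === null) {
    return null
  }
  return { nodeId: sections[1], ip: sections[2], rest: sections[3] }
}

const parsed = parseNodeLine(sampleLine)
console.log(parsed.nodeId, parsed.ip)
```

Only the ip part goes stale after a restart; the node ID stays the same, which is what makes it usable as a lookup key.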

How can we solve this?



The general solution is as follows:

  1. Create a kubernetes config map holding a mapping of redis node ID to kubernetes pod name.
    The kubernetes pod name is guaranteed not to change, since we are using a kubernetes StatefulSet. An example of such a config is:

    79315a4ceef00496afc8fa7a97874e5b71dc547b redis-statefulset-1
    b4d9be9e397d19c63bce602a8661b85ccc6e2d1d redis-statefulset-2
  2. Upon each redis pod startup, read the pods config map, find the current IP of each node by its pod name, and replace the old IP in nodes.conf. This can be done by a nodejs initContainer that runs in the redis pod before the redis container. This container should have the pods config map mounted as a volume, for example under /nodes.txt.
The redis config update init container code is below.
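const fs = require('fs')
const { exec } = require('child_process')

// nodes.conf as written by redis, and the mounted node ID to pod name mapping
const nodesConfPath = '/data/nodes.conf'
const nodesPath = '/nodes.txt'

init()

async function init() {
  // On the first startup there is no nodes.conf yet, so there is nothing to update
  if (!fs.existsSync(nodesConfPath)) {
    return
  }

  const configuration = fs.readFileSync(nodesConfPath, 'utf8')
  const updatedConfiguration = await updateConfiguration(configuration)
  // Write the updated IPs back to the file that redis reads on startup
  fs.writeFileSync(nodesConfPath, updatedConfiguration)
}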

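async function updateConfiguration(configuration) {
  const lines = []
  const configLines = configuration.split('\n')
  // Rewrite every non-empty line, replacing stale IPs with current ones
  for (let i = 0; i < configLines.length; i++) {
    const line = configLines[i].trim()
    if (line.length > 0) {
      lines.push(await updateConfigurationLine(line))
    }
  }
  return lines.join('\n')
}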

async function updateConfigurationLine(line) {
  const sections = line.match(/(\S+) (\S+)(:.*)/)
  if (sections == null) {
    return line
  }
  const nodeId = sections[1]
  const nodeIp = sections[2]
  const other = sections[3]
  const currentNodeIp = await getCurrentNodeIp(nodeId, nodeIp)
  return `${nodeId} ${currentNodeIp}${other}`
}

async function getCurrentNodeIp(nodeId, nodeIp) {
  // Each line of the mapping file is "<redis node ID> <pod name>"
  const nodesPods = fs.readFileSync(nodesPath, 'utf8')
  const nodesPodsLines = nodesPods.split('\n')
  for (let i = 0; i < nodesPodsLines.length; i++) {
    const line = nodesPodsLines[i].trim()
    if (line.length > 0) {
      // Split on any run of whitespace, so extra spaces do not break the lookup
      const sections = line.split(/\s+/)
      const configuredNodeId = sections[0]
      const configuredPodName = sections[1]
      if (configuredNodeId === nodeId) {
        const existingNodeIp = await fetchPodIpByName(configuredPodName)
        if (existingNodeIp != null) {
          nodeIp = existingNodeIp
        }
      }
    }
  }

  // Falls back to the IP from nodes.conf when the node is not in the mapping
  return nodeIp
}

async function fetchPodIpByName(podName) {
  const jsonParse = '{.status.podIP}'
  const args = `get pods ${podName} -o jsonpath='${jsonParse}'`
  const stdout = await kubectl(args)
  const ip = stdout.match(/(\d+\.\d+\.\d+\.\d+)/)
  if (ip) {
    return ip[1]
  }

  return null
}


async function kubectl(args) {
  return await new Promise((resolve, reject) => {
    const commandLine = `kubectl ${args}`
    exec(commandLine, (err, stdout, stderr) => {
      if (err) {
        reject(err)
        return
      }
      resolve(stdout)
    })
  })
}
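
The mapping file from step 1 has to be produced while the cluster is still healthy and the IPs in it are valid. One possible way to do that, sketched below, is to combine the output of the redis CLUSTER NODES command with the current pod IPs. The buildNodesMapping helper, the sample IPs, and the podNameByIp input are assumptions for illustration, not part of the original setup.

```javascript
// Build the "<node ID> <pod name>" lines for the config map.
// clusterNodesOutput: text in the CLUSTER NODES format (same layout as nodes.conf).
// podNameByIp: plain object mapping a pod IP to its pod name (assumed to be
// collected separately, for example from "kubectl get pods -o wide").
function buildNodesMapping(clusterNodesOutput, podNameByIp) {
  const mapping = []
  for (const rawLine of clusterNodesOutput.split('\n')) {
    // Capture the node ID and the IP that precedes ":port"
    const sections = rawLine.trim().match(/^(\S+) (\S+?):\d+/)
    if (sections === null) {
      continue
    }
    const podName = podNameByIp[sections[2]]
    if (podName !== undefined) {
      mapping.push(`${sections[1]} ${podName}`)
    }
  }
  return mapping.join('\n')
}

// Hypothetical sample input; the IPs are made up.
const sampleClusterNodes = [
  '79315a4ceef00496afc8fa7a97874e5b71dc547b 10.42.0.15:6379@16379 myself,master - 0 0 1 connected 0-8191',
  'b4d9be9e397d19c63bce602a8661b85ccc6e2d1d 10.42.0.16:6379@16379 master - 0 1574240000000 2 connected 8192-16383',
].join('\n')
const samplePodNameByIp = {
  '10.42.0.15': 'redis-statefulset-1',
  '10.42.0.16': 'redis-statefulset-2',
}

console.log(buildNodesMapping(sampleClusterNodes, samplePodNameByIp))
```

The printed text has the same shape as the example config in step 1, so it can be stored as the config map content.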



Summary

Using the IPs updater init container in combination with the config map allows the redis cluster to recover from both full and partial restarts. Note that the init container must be granted permissions to execute list and get on the pods resource.
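
As a rough sketch of what those permissions could look like, the snippet below builds a kubernetes Role and RoleBinding as JSON. The namespace, service account name, and role name are hypothetical placeholders, and the buildRbacManifests helper is only an illustration of the get and list rule described above.

```javascript
// Sketch: build a Role and a RoleBinding that let a service account get and list pods.
// The names passed in at the bottom are hypothetical placeholders.
function buildRbacManifests(namespace, serviceAccountName, roleName) {
  const role = {
    apiVersion: 'rbac.authorization.k8s.io/v1',
    kind: 'Role',
    metadata: { name: roleName, namespace },
    rules: [
      // Pods belong to the core API group, which is written as an empty string
      { apiGroups: [''], resources: ['pods'], verbs: ['get', 'list'] },
    ],
  }
  const roleBinding = {
    apiVersion: 'rbac.authorization.k8s.io/v1',
    kind: 'RoleBinding',
    metadata: { name: roleName, namespace },
    subjects: [{ kind: 'ServiceAccount', name: serviceAccountName, namespace }],
    roleRef: { apiGroup: 'rbac.authorization.k8s.io', kind: 'Role', name: roleName },
  }
  // Wrap both objects in a List so they can be applied in one go
  return { apiVersion: 'v1', kind: 'List', items: [role, roleBinding] }
}

const manifests = buildRbacManifests('default', 'redis-service-account', 'redis-pods-reader')
console.log(JSON.stringify(manifests, null, 2))
```

The printed JSON can be piped to kubectl apply -f -, and the redis StatefulSet pods would then need to run under the same service account for the init container to pick up the permissions.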

