
Monday, February 24, 2025

NATS GUI in Kubernetes

 


In this post we deploy NATS GUI in Kubernetes.


NATS GUI is a simple and nice tool for viewing NATS messages. Its deployment is also very simple: we include a Service and a Deployment.

The service:


apiVersion: v1
kind: Service
metadata:
  name: natsgui-service
spec:
  selector:
    configid: natsgui-container
  type: ClusterIP
  ports:
    - port: 80
      targetPort: 31311
      name: tcp-api
      protocol: TCP


The deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: natsgui-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      configid: natsgui-container
  template:
    metadata:
      labels:
        configid: natsgui-container
    spec:
      containers:
        - name: natsgui
          image: ghcr.io/nats-nui/nui:latest
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - mountPath: /db
              name: db
      volumes:
        - emptyDir: {}
          name: db



That's all.

Now we can see the messages in the queues. We can also use wildcards to filter the subjects.



Another nice feature is the arrow button (near the X that closes the window). It lets us keep one message's details open while checking another message.





Sunday, February 9, 2025

Multi Metrics Scaling Using KEDA and Prometheus


 


We've reviewed KEDA usage with Prometheus in this post. However, in real life things get complicated.

How do we handle scaling based on multiple metrics? KEDA does not provide direct support for this, and the documentation for such a task is missing.

Let's review an example: we have a deployment with multiple pods that handle some granular tasks. We want to scale the replica pods by the following metrics:

  • CPU is over 80%
    or
  • Memory is over 80%
    or
  • Tasks rate per second per pod is over 100

First, we need to understand the requirements:
When do we want to scale up?

We want to scale up if any of these metrics is over its threshold in any pod.
For example:


pod1 CPU=90%, Memory=50%, Tasks rate=20.
pod2 CPU=10%, Memory=50%, Tasks rate=20.


We should scale up in this state, even though only a single metric is above its threshold.

How do we achieve this?

The trick is to expose a new Prometheus metric from our application.
We add code to our application that calculates the following metric:


scale_metric=max(memory_ratio, cpu_ratio, tasks_ratio)


Where


memory_ratio = used_memory_percentage / 80%
cpu_ratio = used_cpu_percentage / 80%
tasks_ratio = tasks_per_second / 100



Next we configure KEDA scaling by the max of this metric for all the pods:

triggers:
  - type: prometheus
    metadata:
      serverAddress: {{ .Values.keda.prometheusServerUrl }}
      metricName: scale_metric
      threshold: '1'
      query: max(scale_metric)


Final Note

While this solution requires actual coding, and not just configuration, it provides solid scaling based on all the required aspects of our business logic.




Monday, February 3, 2025

Should We Use JSON as Message Format?

 



In this post we discuss alternatives for sending messages between microservices.


I have recently designed a system where one microservice sends captured HTTP requests to another microservice. The messaging system could be Kafka, NATS, or similar.

The question is how to send the data?

The first intuition is to send the data as JSON, for example using a Go struct representation:


type TransactionObject struct {
	Method    string
	Path      string
	QueryArgs map[string]string
	Cookies   map[string]string
	Headers   map[string]string
}

 

The first microservice parses the captured HTTP requests, converts them to objects, marshals the objects to JSON, and sends the JSON text. The second microservice reads the JSON text and unmarshals it back into an object.

While this might sound like the simple and obvious methodology, it is not always the best. We spend time converting to an object, and on the JSON marshal and unmarshal.

Instead, we can use plain text for the message. The first microservice parses the captured HTTP requests and sends the text itself. The second microservice reads the text and parses the HTTP request. Hence we avoid the marshal to and from JSON.

If we choose such a methodology, we need to notice that unlike JSON, the text message cannot be dynamically extended with attributes, such as the request time, the processing time, the geo location of the source IP, etc. Luckily, when using NATS and Kafka we can easily add these attributes as message headers.

Another issue is the number of consumers. What if we have 100 consumers subscribed to the NATS subject or the Kafka topic? Most would assume that a simple JSON parse would be cheaper than parsing the raw transaction in each of the 100 microservices.

Is it? I've created a sample test for this:


package main

import (
	"encoding/json"
	"fmt"
	"math/rand"
	"strings"
	"time"
)

var letterRunes = []rune("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")

type TransactionObject struct {
	Method    string
	Path      string
	QueryArgs map[string]string
	Cookies   map[string]string
	Headers   map[string]string
}

func generateText(
	stringLength int,
) string {
	b := make([]rune, stringLength)
	for i := range b {
		b[i] = letterRunes[rand.Intn(len(letterRunes))]
	}
	return string(b)
}

func generatePath() string {
	var sections []string
	amount := rand.Intn(10)
	for range amount {
		sectionLength := 1 + rand.Intn(15)
		sections = append(sections, generateText(sectionLength))
	}
	return "/" + strings.Join(sections, "/")
}

func generateQuery() string {
	amount := rand.Intn(4)
	if amount == 0 {
		return ""
	}
	var sections []string
	for range amount {
		nameLength := 1 + rand.Intn(15)
		valueLength := 1 + rand.Intn(15)
		query := fmt.Sprintf("%v=%v", generateText(nameLength), generateText(valueLength))
		sections = append(sections, query)
	}
	return "?" + strings.Join(sections, "&")
}

func generateCookies() string {
	amount := rand.Intn(3)
	if amount == 0 {
		return ""
	}
	var sections []string
	for range amount {
		nameLength := 1 + rand.Intn(15)
		valueLength := 1 + rand.Intn(15)
		cookie := fmt.Sprintf("Set-Cookie: %v=%v", generateText(nameLength), generateText(valueLength))
		sections = append(sections, cookie)
	}
	return "\n" + strings.Join(sections, "\n")
}

func generateHeaders() string {
	amount := rand.Intn(10)
	if amount == 0 {
		return ""
	}
	var sections []string
	for range amount {
		nameLength := 1 + rand.Intn(15)
		valueLength := 1 + rand.Intn(15)
		header := fmt.Sprintf("%v: %v", generateText(nameLength), generateText(valueLength))
		sections = append(sections, header)
	}
	return "\n" + strings.Join(sections, "\n")
}

func generateTransactionText() string {
	var lines []string

	line := fmt.Sprintf(
		"GET %v%v HTTP/1.1%v%v",
		generatePath(),
		generateQuery(),
		generateHeaders(),
		generateCookies(),
	)

	lines = append(lines, line)
	return strings.Join(lines, "\n")
}

func generateMap(
	sizeLimit int,
) map[string]string {
	amount := rand.Intn(sizeLimit)
	generatedMap := make(map[string]string)
	for range amount {
		generatedMap[generateText(15)] = generateText(15)
	}
	return generatedMap
}

func generateTransactionObject() *TransactionObject {
	return &TransactionObject{
		Method:    "GET",
		Path:      generatePath(),
		QueryArgs: generateMap(4),
		Cookies:   generateMap(3),
		Headers:   generateMap(10),
	}
}

func parseTransactionText(
	text string,
) {
	/*
		out of scope for this blog.
		we use a proprietary parser, but the Go parser can be used as well
	*/
}

func main() {
	transactionsAmount := 10000

	var objects []string
	var texts []string
	for range transactionsAmount {

		text := generateTransactionText()
		texts = append(texts, text)

		object := generateTransactionObject()
		bytes, err := json.Marshal(object)
		if err != nil {
			panic(err)
		}
		objects = append(objects, string(bytes))
	}

	activations := 1000000

	startTimeText := time.Now()
	for i := range activations {
		text := texts[i%transactionsAmount]
		parseTransactionText(text)
	}
	passedTimeText := time.Since(startTimeText)

	startTimeObject := time.Now()
	for i := range activations {
		text := objects[i%transactionsAmount]
		var transactionObject TransactionObject
		err := json.Unmarshal([]byte(text), &transactionObject)
		if err != nil {
			panic(err)
		}
	}
	passedTimeObjects := time.Since(startTimeObject)
	fmt.Printf("text per call time: %v\n", passedTimeText/time.Duration(activations))
	fmt.Printf("objects per call time: %v\n", passedTimeObjects/time.Duration(activations))
}


and the results are:

JSON parsing ~6 microseconds.

Text parsing ~4 microseconds.

We find that JSON parsing has its cost.


Final Note

We find that using simple text instead of JSON for microservice communication is a good alternative for performance-critical pipelines.