
Monday, February 23, 2026

Add Grafana Alert for Restarting Pod


 


In this post we will configure an alert in Grafana for restarting pods.

This relates to Grafana version 11.5.1. Other versions might have a different syntax.

Follow the next steps:

  • Open the Grafana GUI

  • Click on Alerting, Alert rules

  • Click on New alert rule

  • Enter the rule name: restarting pods

  • Select Prometheus as the data source

  • Make sure you have kube-state-metrics installed on the Kubernetes cluster. If it is not, install it using:

    helm repo add kube-state-metrics https://kubernetes.github.io/kube-state-metrics
    helm repo update
    helm install kube-state-metrics kube-state-metrics/kube-state-metrics \
      --namespace kube-system \
      --create-namespace


  • Use the following PromQL query: sum by (namespace, pod) ( increase(kube_pod_container_status_restarts_total[5m]) ) > 1
  • Leave the threshold as: A > 0
  • Under Configure no data and error handling, set "Alert state if no data or all values are null" to Normal
  • In the email notification message, use the following summary:

    Pod {{ $labels.pod }} restarted

  • In the email description use the following text:

    Pod {{ $labels.pod }} in namespace {{ $labels.namespace }}
    restarted more than once in the last 5 minutes.
    Current value: {{ $values.A.Value }}

  • Make sure to configure an email provider, such as SendGrid, to enable sending emails from Grafana
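
The same rule can also be provisioned from a file instead of the GUI. Below is a hedged sketch of a Grafana alert provisioning file: the folder name and the datasource UID are placeholders, and field names may differ slightly between Grafana versions.

```yaml
apiVersion: 1
groups:
  - orgId: 1
    name: pod-alerts
    folder: kubernetes                    # hypothetical folder name
    interval: 1m
    rules:
      - uid: restarting-pods
        title: restarting pods
        condition: A
        data:
          - refId: A
            relativeTimeRange:
              from: 300
              to: 0
            datasourceUid: PROMETHEUS_UID # placeholder: your Prometheus datasource UID
            model:
              expr: sum by (namespace, pod) (increase(kube_pod_container_status_restarts_total[5m])) > 1
              instant: true
        noDataState: OK                   # "Normal" in the GUI
        execErrState: OK
        annotations:
          summary: Pod {{ $labels.pod }} restarted
          description: >-
            Pod {{ $labels.pod }} in namespace {{ $labels.namespace }}
            restarted more than once in the last 5 minutes.
            Current value: {{ $values.A.Value }}
```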


Monday, February 9, 2026

Avoiding Huge Docker Image For LLM Model


 


In this post we will review a method to reduce the Docker image size for an LLM inference image.

In general, using an LLM in a Docker container tends to create images whose size might exceed 20 GB.

Downloading such Docker images from ECR or from Nexus might take a long time. In addition, storing these images in those systems might present a challenge.

To avoid this, we use the following approach:

1. We extract the folders with high disk usage from the Docker image and save them to S3. Then we create a new slim image without these folders.

2. Upon deployment, we run an init container that downloads the file from S3 and extracts it to an emptyDir volume.


Step 1: Extract folders and create slim image

The following script runs the original huge image build.

Then it extracts the huge folders ExportFolder1, ExportFolder2 into the OutputFile. This file should be uploaded to AWS S3.

Next the script creates a new slim image without the folders ExportFolder1, ExportFolder2.


#!/usr/bin/env bash
set -e

cd "$(dirname "$0")"

DockerRegistry=my-registry.my-company
ProjectVersion=latest
BuildNumber=1234

OutputFile="${OutputFile:-llm-image-${BuildNumber}.tar.gz}"
SlimImageName="${DockerRegistry}/llm-image-slim:${ProjectVersion}"

ExportFolder1="root/.cache"
ExportFolder2="usr/local/lib/python3.12"

TempFolder="$(mktemp -d)"
ImageFileSystem="${TempFolder}/rootfs"
TempContainer=llm-image-tmp

date
echo "build full image"
./build.sh


date
echo "extracting container to ${ImageFileSystem}"
mkdir -p "${ImageFileSystem}"
docker rm -f "${TempContainer}" 2>/dev/null || true
docker create --name ${TempContainer} llm-image/dev:latest
docker export ${TempContainer} | tar -x -C "${ImageFileSystem}"
docker rm -f ${TempContainer}

date
echo "compressing folders to ${OutputFile}"
tar -czf "${OutputFile}" -C "${ImageFileSystem}" "${ExportFolder1}" "${ExportFolder2}"

date
echo "creating slim image ${SlimImageName}"
rm -rf "${ImageFileSystem}/${ExportFolder1}" "${ImageFileSystem}/${ExportFolder2}"
tar -C "${ImageFileSystem}" -c . | docker import - "${SlimImageName}"
docker push ${SlimImageName}

date
echo "cleaning up"
rm -rf "${TempFolder}"

date
echo "Done"
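
The slim-image trick relies on tar archiving the folders with paths relative to the filesystem root, so they line up with the subPath mounts used later in the deployment. A minimal sketch of that round trip, with hypothetical file names and no Docker required:

```shell
# Create a fake extracted root filesystem (hypothetical content).
workdir=$(mktemp -d)
mkdir -p "$workdir/rootfs/root/.cache"
echo "model-weights" > "$workdir/rootfs/root/.cache/weights.bin"

# Archive the folder with a relative path, as the build script does.
tar -czf "$workdir/out.tar.gz" -C "$workdir/rootfs" "root/.cache"

# Extract into another directory; the relative layout is preserved,
# so root/.cache can be mounted back at /root/.cache via subPath.
mkdir -p "$workdir/restore"
tar -xzf "$workdir/out.tar.gz" -C "$workdir/restore"
cat "$workdir/restore/root/.cache/weights.bin"   # prints: model-weights
```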



Step 2: Deployment

To handle the S3 download we create an extractor image.


Dockerfile:


FROM amazon/aws-cli


COPY files /

ENTRYPOINT ["/entrypoint.sh"]


Downloader script: entrypoint.sh


#!/usr/bin/env bash
set -e

localFileName="/models-local.tar.gz"

cd /local-storage

echo "Downloading ${MODELS_S3_PATH}"
aws s3 cp "${MODELS_S3_PATH}" "${localFileName}"

echo "Extracting ${localFileName}"
tar -xzf "${localFileName}"

echo "Cleaning up..."
rm "${localFileName}"



In the kubernetes deployment file, we add an init container.


initContainers:
- name: extract
  image: modelsextract
  env:
  - name: MODELS_S3_PATH
    value: {{ .Values.modelsS3Path | quote }}
  volumeMounts:
  - mountPath: /local-storage
    name: local-storage


An emptyDir volume and mount for the exported folders:


volumes:
- name: local-storage
  emptyDir:
    sizeLimit: 100Gi


volumeMounts:
- mountPath: /root/.cache
  subPath: root/.cache
  name: local-storage
- mountPath: /usr/local/lib/python3.12
  subPath: usr/local/lib/python3.12
  name: local-storage


Also, make sure to use the slim image instead of the huge image in the deployment container.
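
Putting it together, the main container spec might look like this sketch (the image name and tag are illustrative):

```yaml
containers:
- name: llm
  image: my-registry.my-company/llm-image-slim:latest   # the slim image, not the full one
  volumeMounts:
  - mountPath: /root/.cache
    subPath: root/.cache
    name: local-storage
  - mountPath: /usr/local/lib/python3.12
    subPath: usr/local/lib/python3.12
    name: local-storage
```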



Final Note

We have reviewed the creation of a slim Docker image whose extra data is later downloaded from AWS S3. To save cost and gain speed, the download should happen in the same region as the deployment, so make sure to copy the file to a regional AWS S3 bucket near your cluster.


Wednesday, February 4, 2026

Using DuckDB in GO




In this post we show a simple example of using DuckDB in Go.

DuckDB features:

  • Can run in-memory
  • Supports multiple file types, such as JSON, NDJSON, CSV, and Excel
  • Supports multiple storage backends, such as AWS S3, GCP Storage, and PostgreSQL

In the following example we store our data in Parquet files in AWS S3.
We organize the files using a folder structure that will later enable us to fetch only the files we need.


package duckdb

import (
	"database/sql"
	"fmt"
	"testing"
	"time"

	_ "github.com/marcboeker/go-duckdb"
)

// Note: kiterr and kittime are helper packages used across this blog's
// examples; their imports are omitted here.

func TestValidation(_ *testing.T) {
	db, err := sql.Open("duckdb", "")
	kiterr.RaiseIfError(err)
	defer db.Close()

	_, err = db.Exec(`
		INSTALL httpfs;
		LOAD httpfs;
	`)
	kiterr.RaiseIfError(err)

	_, err = db.Exec(`
		SET s3_region='us-east-1';
		SET s3_access_key_id='XXX';
		SET s3_secret_access_key='XXX';
	`)
	kiterr.RaiseIfError(err)

	createTable(db)
	createData(db)
	exportData(db)
}

func createTable(
	db *sql.DB,
) {
	_, err := db.Exec(`
		CREATE TABLE IF NOT EXISTS events (
			my_text TEXT,
			my_value INTEGER,
			event_time TIMESTAMP
		);
	`)
	kiterr.RaiseIfError(err)
}

func createData(
	db *sql.DB,
) {
	statement, err := db.Prepare(`
		INSERT INTO events (my_text, my_value, event_time)
		VALUES (?, ?, ?)
	`)
	kiterr.RaiseIfError(err)
	defer statement.Close()

	startTime := kittime.ParseDay("2000-01-01")
	for day := range 2 {
		for hour := range 24 {
			for dataIndex := range 2 {
				dayTime := startTime.Add(time.Duration(day) * 24 * time.Hour)
				hourTime := dayTime.Add(time.Duration(hour) * time.Hour)
				fmt.Printf("saving data %v for %v\n", dataIndex, kittime.NiceTime(&hourTime))
				_, err = statement.Exec("aaa", 11, &hourTime)
				kiterr.RaiseIfError(err)
			}
		}
	}
}

func exportData(
	db *sql.DB,
) {
	_, err := db.Exec(`
		COPY (
			SELECT
				my_text,
				my_value,
				event_time,
				CAST(event_time AS DATE) AS event_date,
				strftime(event_time, '%H') AS event_hour
			FROM events
		)
		TO 's3://my-bucket/duck/'
		(
			FORMAT PARQUET,
			COMPRESSION GZIP,
			PARTITION_BY (event_date, event_hour)
		);
	`)
	kiterr.RaiseIfError(err)
}


The result in AWS S3 is:


$ aws s3 ls --recursive s3://my-bucket
2026-02-04 11:27:40 542 duck/event_date=2000-01-02/event_hour=10/data_0.parquet
2026-02-04 11:27:45 541 duck/event_date=2000-01-02/event_hour=13/data_0.parquet
2026-02-04 11:27:47 541 duck/event_date=2000-01-02/event_hour=14/data_0.parquet
2026-02-04 11:27:49 542 duck/event_date=2000-01-02/event_hour=15/data_0.parquet
2026-02-04 11:27:50 541 duck/event_date=2000-01-02/event_hour=16/data_0.parquet
2026-02-04 11:27:52 542 duck/event_date=2000-01-02/event_hour=17/data_0.parquet
2026-02-04 11:27:54 541 duck/event_date=2000-01-02/event_hour=18/data_0.parquet
2026-02-04 11:27:58 541 duck/event_date=2000-01-02/event_hour=20/data_0.parquet
2026-02-04 11:28:00 542 duck/event_date=2000-01-02/event_hour=21/data_0.parquet
2026-02-04 11:28:03 542 duck/event_date=2000-01-02/event_hour=23/data_0.parquet
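
The hive-style folder names pay off at read time: DuckDB can prune partitions based on the WHERE clause and download only the matching files. A sketch of such a query, assuming the same bucket and the httpfs setup shown above:

```sql
SELECT my_text, my_value, event_time
FROM read_parquet('s3://my-bucket/duck/**/*.parquet', hive_partitioning = true)
WHERE event_date = DATE '2000-01-02'
  AND event_hour = '10';
```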


So for a quick-start project that needs to store data over time and read it when needed, this is a nice solution. A heavier project that requires more data, with fine control over the storage type, parallelism, and timing, might require writing proprietary code.