Sunday, October 30, 2022

Argo PostSync Hook vs Helm PostInstall Hook


 

In this post we will review a compatibility issue between Argo and Helm.


TL;DR

A job that uses helm's post-install hook might never run when the chart is deployed by argo, so the hook should be avoided in some cases.


Helm vs. Argo

Helm is a package manager for kubernetes.

Argo is an open source tool used to manage CI/CD workflows in a kubernetes environment.

Argo wraps the deployment of helm charts as part of its own flow. However, in practice, argo does not use helm to install the chart. It uses its own implementation to deploy the chart's resources, so helm-specific behavior such as hooks is not guaranteed to work the same way. This means that we have compatibility issues.


The Problem Scenario

In my case, I had a helm chart that used the helm post-install hook. Deploying the chart using helm on a kubernetes cluster worked fine.


apiVersion: batch/v1
kind: Job
metadata:
  name: my-job
  annotations:
    "helm.sh/hook": "post-install"
    "helm.sh/hook-delete-policy": "hook-succeeded"
    "helm.sh/hook-weight": "5"


However, deploying the same chart using argo does not complete: argo never runs the post-install hook.


The Problem Cause

The reason for the symptom is that argo translates helm's post-install hook to argo's PostSync hook, which is documented as:


"Executes after all Sync hooks completed and were successful, a successful application, and all resources in a Healthy state."


That documentation is not precise about what a "Healthy" state means.

Argo does not only wait for all pods to be alive, that is, to pass the kubernetes liveness probe.

Argo also waits for the pods to be ready, that is, to pass the kubernetes readiness probe.

This became an issue in my case, because the post-install job creates entities that must exist before the pods can become ready for service. The pods wait for the job, and the job waits for the pods, so the deployment never completes.
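
To make the deadlock concrete, here is a minimal sketch of the two sides. The first manifest shows roughly how argo treats the hook job above, using the argo annotations that the helm ones are documented to be equivalent to. The second is a hypothetical deployment: the name my-app, the image, the /ready path and port 8080 are all assumptions for illustration, not taken from my chart.

```yaml
# Sketch only: roughly how argo treats the helm hook job from above.
# The job spec is omitted, as in the original snippet.
apiVersion: batch/v1
kind: Job
metadata:
  name: my-job
  annotations:
    # post-install is handled like a PostSync hook
    argocd.argoproj.io/hook: PostSync
    # hook-succeeded is handled like HookSucceeded
    argocd.argoproj.io/hook-delete-policy: HookSucceeded
    # hook-weight is handled like a sync wave
    argocd.argoproj.io/sync-wave: "5"
---
# Hypothetical deployment (all names and values are assumptions).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: my-registry/my-app:1.0.0
          # Assumed to fail until the entities created by my-job exist.
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            periodSeconds: 5
```

With manifests like these, the PostSync job only starts once my-app is Healthy, while my-app only becomes ready once the job has run, so neither side can make progress.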


The Bypass

I've changed the job not to use helm hooks at all. This means that the job needs to explicitly wait for the deployment, and then create the related entities that enable the pods to reach a ready state.

Notice that once the helm hooks are removed, the job runs only once, upon the first deployment. In my case I wanted the job to run also after an upgrade, so I used the trick described in this post to cause a rerun of the job by using the revision as part of the job name:
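
A minimal sketch of such a hook-free job might look as follows. The image and the two scripts, check-target.sh and create-entities.sh, are hypothetical placeholders for whatever check and creation logic the chart actually needs.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  # No helm.sh/hook annotations: the job is deployed as a regular resource.
  name: my-job
spec:
  backoffLimit: 10
  template:
    spec:
      restartPolicy: OnFailure
      containers:
        - name: create-entities
          # Hypothetical image that contains the two scripts used below.
          image: my-registry/my-job:1.0.0
          command:
            - /bin/sh
            - -c
            - |
              # Wait until the target is reachable. check-target.sh is a
              # hypothetical script that exits with 0 once it can connect.
              until /scripts/check-target.sh; do
                echo "target not reachable yet, retrying"
                sleep 5
              done
              # Hypothetical script that creates the related entities.
              /scripts/create-entities.sh
```

Note that the wait must not depend on pod readiness, since readiness is exactly what the job is supposed to unblock. For example, a regular kubernetes service routes traffic only to ready pods, so the check should use an address that is reachable before the pods are ready.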


apiVersion: batch/v1
kind: Job
metadata:
  name: my-job-{{ .Release.Revision }}


Final Note

In this post we've demonstrated a compatibility issue between helm and argo, and explained how it can be bypassed. For most helm charts, which do not use post-install hooks or do not depend on the hook results to make the pods ready, this will not be an issue. For the charts that do fall into this category, a bypass should be implemented.


Yet Another Bug

A month has passed, and another argo hook compatibility bug was found...
Argo runs the pre-install hooks before every upgrade, not only before the first installation. This will probably cause many issues for a deployment that uses a pre-install hook. Let's hope that argo will be fixed sometime in the near future.






