Wednesday, September 25, 2019

Using a common JavaScript folder for several NodeJS applications

There are several alternatives for sharing JavaScript code between multiple projects.
The most obvious one is to build a common package, and use it in every project.

Great, right?

Well, not really.

But what if you have multiple projects, each using the same common package, and you update the common package every day? What should you do after changing the common package code, when you want each project to use the new version?


The required steps are:

  1. Update the common package code
  2. Update the common package version
  3. Push the changes
  4. Publish the updated common package
But wait, we're not done yet.
For each project that uses the common package, follow these steps:
  1. Update the project package.json to include the new common package version
  2. Run `npm install` to update node_modules and the package-lock.json
  3. Push the change
Now, this is not fun.
Especially if you have many projects...
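
To make the pain concrete, here is a rough sketch of that cycle, assuming the package is named common-package and each project lists it as a dependency in its package.json:

# in the common package repository
npm version patch                 # e.g. 1.2.29 -> 1.2.30
git push && git push --tags
npm publish

# in every project that uses the common package
npm install common-package@latest --save
git add package.json package-lock.json
git commit -m "bump common-package version"
git push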

A Common Code Alternative

If you are the only maintainer of the package, it might be much simpler to directly include the common files from each project. This is especially easy if you keep everything in a single GIT repository (a monorepo).

This prevents the update hell every time you change the common package.
It does mean you need to invest time in setting up the projects and the common package build, but this is a one-time effort. Then, every time you change the common package code, the steps are:
  1. Update the common package code
  2. Push the changes
  3. Run build for all of the projects. This can be done automatically by Jenkins, without manual actions.

Common code setup

Place the common code and the projects in the same root folder, for example:
  • GIT root folder
    • common-package
      • common1.js
      • common2.js
    • project1
    • project2
    • ...
In each project include the common package directly, for example:
const commonPackage = require('common-package/common1');
And use webpack.config.js to handle the common package includes:

const path = require('path');

module.exports = {

...

  resolve: {
    alias: {
      'common-package': path.resolve(__dirname, "../common-package"),
    }
  },

...

};

Docker build

In case you build the projects using docker, you will need to run a script before the docker build, to handle the following:
  • copy the common-package folder into project1
  • set an environment variable to the copied location
  • use the environment variable in webpack
For example, the build-docker.sh could copy the folder:
rm -rf ./common-package
cp -r ../common-package ./
docker build ...
rm -rf ./common-package

The docker file should use the copied folder:
FROM node:12.7 as builder
WORKDIR /app
COPY src/package.json ./
COPY src/package-lock.json ./
RUN npm install

ENV COMMON_DIRECTORY=./common-package
COPY common-package ./common-package

COPY src ./
RUN npm run build

CMD ["npm", "start"]
And the webpack.config.js should be updated to:
const commonDir = process.env.COMMON_DIRECTORY || "../common-package";

...

    'common-package': path.resolve(__dirname, commonDir),
...
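
Outside of docker, a local build simply falls back to the sibling folder. If you ever need to override the location, you can set the variable yourself (a sketch, assuming the project build script runs webpack):

# default: resolves common-package from ../common-package
npm run build

# override the location explicitly
COMMON_DIRECTORY=./common-package npm run build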



Automatic NPM version increment upon Jenkins build



You have a Node.js package that is automatically built by a Jenkins job.
The Jenkins job is automatically triggered upon push of updates to GIT.

That's very nice, but you also want to automatically increment the version in the package.json, without manually handling it upon every push.



This can be done using a Jenkins pipeline job. The job should do the following:
  1. Build the package - to ensure that it has no compilation or test errors
  2. Update the version in package.json, and push the update back to GIT
  3. Rebuild the package, this time with the updated version
  4. Publish the package

But wait, what do you think would happen once the job pushes to GIT in step #2?
We will get an infinite loop, since the push would re-trigger the build job.
To avoid this, we need to identify whether the last push was made by Jenkins, and if so, skip the version update. So the updated steps are:

  1. Build the package - to ensure that it has no compilation or test errors
  2. Check the last push, and if it was made by Jenkins, abort.
  3. Update the version in package.json, and push the update back to GIT
  4. Rebuild the package, this time with the updated version
  5. Publish the package
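
As a reference, here is a minimal scripted pipeline sketch of this flow. The stage bodies are filled in by the snippets in the next sections, and the npm commands assume a standard package.json with build and test scripts:

node {
  stage('build') {
    checkout scm
    sh 'npm install'
    sh 'npm run build'
    sh 'npm test'
  }
  stage('version and publish') {
    // 1. check the last commit (see "Checking the Last GIT Push" below)
    // 2. if it was not pushed by Jenkins: run npm version patch,
    //    push the updated package.json, rebuild, and publish the package
  }
}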

Checking the Last GIT Push

This pipeline groovy script checks the last commit.
The commit and publish step will run only if the last commit message does not include the text 'update_version_by_jenkins'.
def gitUrl = 'git@bitbucket.org:myrepo/mygit.git'
def lastCommit = ''
sshagent (credentials: ['1938eb47-e7e8-4fed-b88c-87561e128345']) {
  sh('git config --global user.name "Jenkins"')
  sh('git config --global user.email "jenkins@company.com"')
  sh('git checkout master')
  sh('git pull origin master')
  sh('git checkout HEAD --force')
  lastCommit = sh([script: 'git log -1', returnStdout: true])
}

if (lastCommit.contains('update_version_by_jenkins')){
  echo "self triggered build, no need to update version"
} else {
  CommitAndPublish()
}

Update the Version

The version update is handled by npm.
In this case we update the third (patch) part of the version.
For example, version 1.2.29 will be updated to 1.2.30.

sh '/usr/bin/npm version patch'

Push the Updated package.json

Now we need to push the changes back to GIT.
sh "git add package.json"
sh "git commit -m update_version_by_jenkins"

sshagent (credentials: ['1938eb47-e7e8-4fed-b88c-87561e128345']) {
  sh('git push origin master')
}

Notice

The same logic can be used in a more limited mode, for example: incrementing the version only upon a merge to the master branch, as sketched below.
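
A possible guard for that case (a sketch, assuming the pipeline exposes the branch name in env.GIT_BRANCH):

if (env.GIT_BRANCH == 'master') {
  CommitAndPublish()
} else {
  echo "not on the master branch, skipping the version increment"
}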

Monday, September 16, 2019

Ethereum contract transactions decoding


Ethereum is a platform for decentralized applications.
It is based on a database containing blocks. The first block is the genesis block, and it is followed by newer blocks. Each block contains, among other things, a set of transactions that were committed.

The transactions could be ether transfer from one account to another, but the transactions could also represent a contract method activation.

Now, let's assume you have a contract and access to the Ethereum web3 API, and you want to view the transactions related to activation of a specific contract method.

How can we get the method activation history?

We can scan the ethereum blocks, and get transaction from each block:
const Web3 = require('web3');
const web3 = new Web3('ws://127.0.0.1:7475');
// assuming we have a blockNumber
const block = await web3.eth.getBlock(blockNumber, true);
block.transactions.forEach(async (transaction) => {
 // assuming we want a specific contract instance
 if (transaction.to === myContractAddress) {
  // decode the transaction input
 }
});

But wait...
The transaction.input is a hex string of the related method data. How can we know which method is being called?

The answer is that the first 4 bytes of transaction.input are the hash of the method signature (the method selector). I've tried several ways to get this hash, and found that the easiest is to use the web3 API to encode a call with dummy arguments. Notice that the web3 API encodes the actual argument data as well, but we are only interested in the method hash, so we use the first 10 characters.
(10 characters are the '0x' prefix plus 4 bytes, for example: 0x12345678)

const contract = new web3.eth.Contract(abi, contractAddress);
const methods = contract.methods;
const hashes = {};

// example for activation of function with 2 ints arguments
let hash = methods['MyMethod1'](1, 2).encodeABI();
hashes[hash.substring(0, 10)] = 'MyMethod1';

// example for activation of function with address argument
const dummyAddress = '0x7d6BeA6003c1c8944cAECe5b4dde4831180A1eC2';
hash = methods['MyMethod2'](dummyAddress).encodeABI()
hashes[hash.substring(0, 10)] = 'MyMethod2';


Once we have all our methods of interest in the "hashes" variable, we can locate the transactions related to these method hashes:

// decode the transaction input
if (transaction.input.length >= 10) {
 const hash = transaction.input.substring(0, 10);
 if (hashes[hash]) {
  console.log(`method ${hashes[hash]} located`);
 }
}

Next we can use this information for any purpose we choose.
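
For example, one possible next step is to decode the call arguments as well. This is a sketch, assuming the located method is MyMethod1 with the two int arguments from the example above:

// decode the arguments that follow the 4-byte method selector
const args = web3.eth.abi.decodeParameters(
 ['uint256', 'uint256'],
 '0x' + transaction.input.substring(10)
);
console.log(`MyMethod1 called with ${args[0]} and ${args[1]}`);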

Timelion sparse graph display

Kibana Timelion visualization is a great tool!
However, there are some small secrets you need to know when using it.

Suppose you have a document stored in the Elasticsearch index about once a minute, at an unstable rate. The document has a "value1" field that you want to display as an average graph, so you could use the Timelion expression:


.es(index=MyIndexPattern*,timefield=time,metric=avg:value1)

Running this expression using a "1 minute" interval seems fine:




But when you change the interval to auto or to "1 hour", you get weird or even empty results:



Why?

Well, Timelion splits the data into buckets sized by the selected interval. If a bucket has no documents for its time period, Timelion does not display any point in the graph.

We can try bypassing the problem by forcing the visualization to use a "1 minute" interval, but in this case, if the user selects a large time span for the graph, an error occurs:

Timelion: Error: Max buckets exceeded: 10080 of 2000 allowed. 
Choose a larger interval or a shorter time span


What can we do?

The solution is very simple: we should let Timelion fill in the missing values.
For example, we can ask Timelion to use the last existing value when a bucket has no documents.
This is done using the "fit" function (a new line was added below for readability; don't use new lines in the actual expression):

.es(index=MyIndexPattern*,timefield=time,metric=avg:value1)
.fit(mode=carry)

This creates a "step-wise" graph:



Other fit modes can be used:
(taken from https://github.com/elastic/kibana/issues/17717)


  • None: Don't draw that value on the graph. Example: [2, null, null, 8]
  • Carry: Use the last non-null value before that. Example: [2, 2, 2, 8]
  • Nearest: Use the closest non-null value (either before or after). Example: [2, 2, 8, 8]
  • Lookahead: Use the next non-null value after that (opposite of Carry). Example: [2, 8, 8, 8]
  • Average: Use the average of the last and next non-null values. Example: [2, 5, 5, 8]
  • Linear Scale: Linearly interpolate between the closest values. Example: [2, 4, 6, 8]
  • Explicit value: Specify an explicit value (x) that should be used instead. Example: [2, x, x, 8]

You can select the one that best matches your data.
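
For example, if you prefer a smoother graph over the step-wise one, the nearest mode can be used in the same way (again, the new line was added here only for readability):

.es(index=MyIndexPattern*,timefield=time,metric=avg:value1)
.fit(mode=nearest)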


Tuesday, September 10, 2019

How to automatically load dashboards into Kibana?

Kibana is a great tool to visualize your Elasticsearch data, and I've used it as part of my kubernetes cluster deployment. You can configure your index patterns, visualizations, and dashboards to fully cover the product needs.



However, once the product is reinstalled, the index patterns, visualizations, and dashboards configuration are lost. You don't really want to reconfigure everything from scratch upon each reinstall of Kibana.

I've handled it using export and import of the Kibana dashboards.

Export of the Kibana configuration

After configuring the Kibana dashboard to your needs, you need to manually export its configuration to a JSON file.
Use the following command to locate your dashboard ID:

curl -s 'http://localhost:5601/api/saved_objects/_find?type=dashboard'

In case Kibana is running within a kubernetes cluster, you can do the same using kubectl:
kubectl exec KIBANA_POD_NAME -- \
  curl -s 'http://localhost:5601/api/saved_objects/_find?type=dashboard' 

The output will indicate the dashboard ID:
{
  "page": 1,
  "per_page": 20,
  "total": 1,
  "saved_objects": [
    {
      "type": "dashboard",
      "id": "44532f10-d211-11e9-9c96-fbd87ecb6df2", <== This is the ID
      "attributes": {
        "title": "MyDashboard",

Next use the DASHBOARD_ID to export it:
curl -s \
'http://localhost:5601/api/kibana/dashboards/export?dashboard=DASHBOARD_ID' \
> dashboard.json
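
In a kubernetes deployment, the export can go through kubectl in the same way as before (a sketch, using the same placeholder names):

kubectl exec KIBANA_POD_NAME -- \
  curl -s 'http://localhost:5601/api/kibana/dashboards/export?dashboard=DASHBOARD_ID' \
  > dashboard.json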

Import of the Kibana configuration

To import the dashboard, use the dashboard.json file that you created as part of the export, and run the following command:

curl -s -X POST -H 'kbn-xsrf: true' -H 'Content-Type: application/json' \
  'http://localhost:5601/api/kibana/dashboards/import?force=true' \
  --data-binary "@/dashboard.json"

In case of a kubernetes installation, create an image containing the dashboard.json in the root folder, and run the following entrypoint:
#!/usr/bin/env bash
/import_dashboard.sh &
exec /usr/local/bin/kibana-docker
Where the import_dashboard.sh script should loop on the import command until it succeeds (since it has to wait for Kibana startup).
#!/usr/bin/env bash

function importDashboard {
	result=$(curl -s -X POST -H 'kbn-xsrf: true' \
	  -H 'Content-Type: application/json' \
	  'http://localhost:5601/api/kibana/dashboards/import?force=true' \
	  --data-binary "@/dashboard.json")
	  
	rc=$?

	if [[ "$rc" == "0" ]]; then
		if [[ $result == *"objects"* ]]; then
			return 0
		fi
	fi
	return 1
}

function importLoop {
    until importDashboard
    do
        echo "Waiting for Kibana startup"
        sleep 3
    done
}

importLoop





Using ingress for bare metal kubernetes cluster

You have a bare metal kubernetes cluster, and you want to enable access to it from the outer world.
So far, living within the kubernetes cluster, everything was simple. When you want to expose a new microservice, you configure a kubernetes service, and that's it.

But now you want your actual customers to access the various services. How can you do it?

You can configure the services to use nodePort, and then each service is accessible on its own port, but that is not user friendly. Users usually prefer using an FQDN instead of the IP:port syntax.

Also, in my case, I had a module, the fetcher-module, running both inside the kubernetes cluster as part of one of the deployments, and outside the kubernetes cluster as part of an external process. This module accesses multiple services using the names service-a and service-b.




So I wanted the fetcher-module to use the same service names, service-a and service-b, regardless of whether it runs inside the kubernetes cluster or outside of it.

To do this, I've created an ingress object in kubernetes:

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: my-ingress
spec:
  rules:
  - host: service-a
    http:
      paths:
      - backend:
          serviceName: service-a
          servicePort: 80
  - host: service-b
    http:
      paths:
      - backend:
          serviceName: service-b
          servicePort: 80

But kubernetes does not provide the actual ingress implementation.
You need to install an ingress controller.
I've used the NGINX controller, which can be installed using helm:

helm install stable/nginx-ingress \
  --name nginx-ingress \
  --namespace ingress \
  --set controller.kind=DaemonSet \
  --set controller.daemonset.useHostPort=true \
  --set controller.service.enabled=false \
  --set controller.hostNetwork=true

Notice that I've configured the NGINX controller to use the host network.
This means that it runs as a DaemonSet on each kubernetes node, listening on ports 80 and 443, so make sure no other process is using these ports on the nodes.

Also, the services can be used transparently both inside and outside of the kubernetes cluster only if they are configured on the same ports as the NGINX controller. In this case I've used port 80.

One last issue is the name resolution. In my case, I had to add service-a and service-b to the /etc/hosts file on the machine outside of the kubernetes cluster. A better solution is to add these to a DNS server. Notice that this means that the service-a name points to a single node IP, which is a single point of failure. In my case this was a reasonable limitation. If this is not the case for you, consider using HAProxy in front of the NGINX controller.
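
For example, the /etc/hosts entries could look like the following, where 10.1.2.10 is a hypothetical IP of one of the kubernetes nodes:

# /etc/hosts on the machine outside the kubernetes cluster
10.1.2.10  service-a
10.1.2.10  service-b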


Additional resources:

  1. Kubernetes ingress 
  2. NGINX controller
  3. HAProxy

Monday, September 2, 2019

Jenkins pipeline for a monorepo git


What is a mono repo?

This means you have a single GIT repository for multiple software services.
For this example, let's assume that you have a folder for each service under the root folder of the git repository: service1, service2, service3.
Each service build should create a docker image, and push it to a docker repository.


What is Jenkins pipeline?

Jenkins pipeline is a method to implement CI/CD using a Groovy-like script.
See the Jenkins documentation for more details.


So, what is our purpose?

Whenever a push is made to GIT, we want a build job in Jenkins to run.
The job should build only the docker images whose source was changed since the last successful Jenkins build.
Then it should push the docker images, using a tag name based on the GIT branch that was updated.

How to create this?

In Jenkins, create a new job, and select "Pipeline".


The job can be triggered to start using a Jenkins webhook, which means that any push to the central GIT repository (such as GitHub or Bitbucket) notifies Jenkins of a change.
Alternatively, the job can use polling to check when a change was pushed.

Configure the job to:

  • Use "Pipeline script from SCM"
  • Specify the GIT repo details
  • Specify "Jenkinsfile" as the "script path"
  • Use branch specifier of **/**



Now, add Jenkinsfile in the root of your GIT repo.

// no def keyword here to make this global variable
imagesNames = [
    "service1",
    "service2",
    "service3"
]

def BuildImage(imageName){
    stage ("${imageName}") {
        dir("images/${imageName}") {
            def branch = env.GIT_BRANCH
            // replace the IP here with your docker repository IP
            def tagPrefix = "10.1.2.3:5000/${imageName}/${branch}:"
            def tagBuildName = "${tagPrefix}${env.BUILD_NUMBER}"
            def tagLatestName = "${tagPrefix}latest"
            // we assume each service has a build.sh script that receives the docker tag name to build
            sh "./build.sh -tag ${tagBuildName}"
            sh "docker tag ${tagBuildName} ${tagLatestName}"
            sh "docker push ${tagBuildName}"
            sh "docker push ${tagLatestName}"
        }
    }
}

def BuildByChanges(){
    def buildImages = [:]
    def gitDiff = sh(script: "git diff --name-only ${env.GIT_COMMIT} ${env.GIT_PREVIOUS_SUCCESSFUL_COMMIT}", returnStdout: true)

    echo "git diff is:\n${gitDiff}"
    gitDiff.readLines().each { line ->
        imagesNames.each { imageName ->
            if (line.startsWith(imageName)) {
                buildImages[imageName] = true
            }
        }
    }
    echo "building images: ${buildImages}"
    buildImages.each { imageName, build ->
        BuildImage(imageName)
    }
}

def IsBuildAll(){
    if (env.GIT_PREVIOUS_SUCCESSFUL_COMMIT == null){
        echo "No previous successful build on this branch, performing full build"
        return true
    }
    if (env.GIT_BRANCH == 'dev'){
        echo "dev branch, performing full build"
        return true
    }
    if (env.GIT_BRANCH == 'master'){
        echo "The master branch, performing full build"
        return true
    }
    return false
}

pipeline {
    agent any

    stages{
        stage("prepare") {
            steps {
                script {
                    echo "Branch ${env.GIT_BRANCH}"
                    if (IsBuildAll()){
                        imagesNames.each { imageName ->
                            BuildImage(imageName)
                        }
                    } else {
                        BuildByChanges()
                    }
                 }
            }
        }
    }
}

That's it, your build is ready.
Push a change to one of the services, and see the results in Jenkins.


Create docker registry and proxy

Your team starts working with docker, and as part of your build process you need to create two docker services.




  1. docker proxy
    The docker proxy caches images that you download from the internet. The purpose of such a server is to prevent downloading an image that was already cached. This is relevant for new machines pulling docker images, and for existing machines that had their local docker images removed.
  2. docker local registry
    The docker local registry stores the images created by your team. This is relevant when you don't want to store the images in a public registry, such as Docker Hub.


To perform this, use the following docker compose file:

version: '3.1'
services:
    docker-registry:
        restart: always
        image: registry:2
        ports:
          - 5000:5000
        environment:
          REGISTRY_STORAGE_DELETE_ENABLED: "true"
        volumes:
          - /opt/docker-registry/registry:/var/lib/registry
        networks:
          - node-network
    docker-proxy:
        restart: always
        image: registry:2
        ports:
          - 6000:5000
        environment:
          REGISTRY_PROXY_REMOTEURL: "http://registry-1.docker.io"
        volumes:
          - /opt/docker-registry/proxy:/var/lib/registry
        networks:
          - node-network
    docker-gui:
        restart: always
        image: joxit/docker-registry-ui:static
        ports:
          - 80:80
        environment:
          REGISTRY_URL: http://docker-registry:5000
          DELETE_IMAGES: "true" 
        networks:
          - node-network
networks:
  node-network:
    driver: bridge

Log in to the server on which you want the services to run. In this example, suppose the server IP is 10.1.2.3.

Create a new folder, for example under /opt/docker-registry, and create the docker compose file in it.
Then run:

cd /opt/docker-registry
docker-compose up -d

Your servers are ready:

  • local registry server available at 10.1.2.3:5000
  • docker proxy available at 10.1.2.3:6000
  • GUI for the local registry available at http://10.1.2.3

Notice that each machine that should use these servers must update its docker configuration.
In /etc/docker/daemon.json, add


{
  "insecure-registries" : ["10.1.2.3:5000","10.1.2.3:6000"],
  "registry-mirrors" : ["http://10.1.2.3:6000"]
}


and then restart the docker service:
sudo service docker restart
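
A quick way to verify the setup (a sketch, assuming you have some locally built image named my-image):

# pull an image through the proxy (the registry mirror configured above)
docker pull ubuntu:18.04

# tag and push a locally built image to the local registry
docker tag my-image:latest 10.1.2.3:5000/my-image:latest
docker push 10.1.2.3:5000/my-image:latest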


If you want to use a secured docker registry, follow the instructions in the docker documentation.


See also Delete images from a Private Docker Registry.