From locally running Node application to Cloud based Kubernetes Deployment

In this article I will discuss the steps I had to go through in order to take my locally running Node application – with various hard coded and sometimes secret values – and deploy it on a cloud-based Kubernetes cluster. I will discuss the containerization of the application, the replacement of hard coded values with references to environment variables, the building, tagging and pushing of the Docker container image, the creation of the yaml files for the Kubernetes resources and finally the actual execution of the application.

Background

A few days ago in Tokyo I presented at the local J-JUG event as part of the Oracle Groundbreakers Tour of Asia and Pacific. I had prepared a very nice demo: an update in a cloud-based Oracle Database was replicated to another cloud-based database – a MongoDB database. In this demo, I first used Twitter as the medium for exchanging the update event and then the Oracle Event Hub (managed Apache Kafka) cloud service.

This picture visualizes what I was trying to do:

[diagram: the update in the Oracle Database replicated to MongoDB via Twitter or the Oracle Event Hub (Kafka)]

However, my demo failed. I ran a local Node (JS) application that would be invoked over HTTP from within the Oracle Database – and that would publish to Twitter and Kafka. When I was working on the demo in my hotel room, it was all working just fine. I used ngrok to expose my locally running application on the public internet – a great way to easily integrate local services in cloud-spanning demonstrations. It turned out that use of ngrok was not allowed by the network configuration at the Oracle Japan office where I did my presentation. There was no way I could get my laptop to create the tunnel to the ngrok service that would allow it to hand over the HTTP request from the Oracle Database.

This taught me a lesson. No matter how convenient it may be to run stuff locally – I really should be able to have all components of this demo running in the cloud. And the most obvious way – apart from using a Serverless Function – is to deploy that application on a Kubernetes cluster. Even though I know how to get there, I realized the steps are not as ingrained in my head and fingers as they should be – certainly not well enough to restore my demo to its former glory in less than 30 minutes.

The Action Plan

My demo application – somewhat quickly put together – contains quite a few hard coded values, including confidential settings such as Kafka Server IP address and Topic name as well as Twitter App Credentials. The first step I need to take is to remove all these hard coded values from the application code and replace them with references to environment variables.

The second big step is to build a container for and from my application. This container needs to provide the Node runtime, have all npm modules used by the application and contain the application code itself. The container should automatically start the application and expose the proper port. At the end of this step, I should be able to run my application locally in a Docker container – injecting values for the environment variables with the docker run command.

The third step is the creation of a Container Image from the container – and pushing that image (after meaningful tagging) to a container registry.

Next is the preparation of the Kubernetes resources. My application consists of a Pod and a Service (in Kubernetes terms) that are combined in a Deployment in its own Namespace. The Deployment makes use of two Secrets – one contains the confidential values for the Kafka Server (IP address and topic name) and the other the Twitter client app credentials. Values from these Secrets are used to set some of the environment variables. Other values are hard coded in the Deployment definition.

After arranging access to a Kubernetes Cluster instance – running in the Oracle Cloud Infrastructure, offered through the Oracle Kubernetes Engine (OKE) service – I can deploy the K8S resources and get the application running. Now, finally, I can point my Oracle Database trigger to the service endpoint on Kubernetes in the cloud and start publishing tweets for all relevant database updates.

At this point, I should – and you likewise after reading the remainder of this article – have a good understanding of how to Kubernetalize a Node application, so that I will never again be stymied in my demos by stupid network problems. I do not want to even have to think twice about taking my local application and turning it into a containerized application that runs on Kubernetes.

Note: the sources discussed in this article can be found on GitHub: https://github.com/lucasjellema/groundbreaker-japac-tour-cqrs-via-twitter-and-event-hub/tree/master/db-synch-orcl-2-mongodb-over-twitter-or-kafka.

 

1. Replace Hard Coded Values with Environment Variable References

My application contained hard coded values for the Kafka Broker endpoint and for my secret Twitter App credentials. For a locally running application that is barely acceptable. For an application that is deployed in a cloud environment (and whose sources are published on GitHub) that is clearly not a good idea.

Any hard coded value is to be removed from the code – replaced with a reference to an environment variable, using the Node expression:

process.env.NAME_OF_VARIABLE

or

process.env['NAME_OF_VARIABLE']

Let’s for now not worry how these values are set and provided to the Node application.

I have created a generic code snippet that checks, when the application starts, whether all expected environment variables have been defined, and writes a warning to the output for each one that is missing:

const REQUIRED_ENVIRONMENT_SETTINGS = [
    {name:"PUBLISH_TO_KAFKA_YN" , message:"with either Y (publish event to Kafka) or N (publish to Twitter instead)"},
    {name:"KAFKA_SERVER" , message:"with the IP address of the Kafka Server to which the application should publish"},
    {name:"KAFKA_TOPIC" , message:"with the name of the Kafka Topic to which the application should publish"},
    {name:"TWITTER_CONSUMER_KEY" , message:"with the consumer key for a set of Twitter client credentials"},
    {name:"TWITTER_CONSUMER_SECRET" , message:"with the consumer secret for a set of Twitter client credentials"},
    {name:"TWITTER_ACCESS_TOKEN_KEY" , message:"with the access token key for a set of Twitter client credentials"},
    {name:"TWITTER_ACCESS_TOKEN_SECRET" , message:"with the access token secret for a set of Twitter client credentials"},
    {name:"TWITTER_HASHTAG" , message:"with the value for the twitter hashtag to use when publishing tweets"},
]

for(var env of REQUIRED_ENVIRONMENT_SETTINGS) {
  if (!process.env[env.name]) {
    console.error(`Environment variable ${env.name} should be set: ${env.message}`);  
  } else {
    // convenient for debug; however: this line exposes all environment variable values - including any secret values they may contain
    // console.log(`Environment variable ${env.name} is set to : ${process.env[env.name]}`);  
  }
}

This snippet is used in the index.js file of my Node application. This file also contains several references to process.env – where there used to be hard coded values.

It seems convenient to use npm start to run the application – for example because it allows me to define environment variables as part of the application start up. When you execute npm start, npm will check the package.json file for a script with the key “start”. This script will typically contain something like “node index” or “node index.js”. You can extend this script with the definition of environment variables to be applied before running the Node application, like this (taken from package.json):

"scripts": {

"start": "(export KAFKA_SERVER=myserver.cloud.com && export KAFKA_TOPIC=cool-topic ) || (set KAFKA_SERVER=myserver.cloud.com && set KAFKA_TOPIC=cool-topic && set TWITTER_CONSUMER_KEY=very-secret )&& node index",

…

},

Note: we may have to cater for both Linux and Windows environments, which treat the setting of environment variables differently – hence the export and set branches in this start script.
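A cleaner, cross-platform alternative – not what the demo itself uses, just a sketch – is the cross-env package, which hides the Windows/Linux difference:

npm install --save-dev cross-env

"scripts": {
  "start": "cross-env KAFKA_SERVER=myserver.cloud.com KAFKA_TOPIC=cool-topic node index.js"
},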

 

2. Containerize the Node application

In my case, I was working on my Windows laptop, developing and testing the Node application from the Windows command line. Clearly, that is not an ideal environment for building and running a Docker container. What I have done is use Vagrant to run a Virtual Machine with Docker Engine inside. All Docker container manipulation can easily be done inside this Virtual Machine.

Check out the Vagrantfile that instructs Vagrant on leveraging VirtualBox to create and run the desired Virtual Machine. Note that the local directory that contains the Vagrantfile and from which the vagrant up command is executed is automatically shared into the VM, mounted as /vagrant.

Note: I have used this article for inspiration for this section of my article: https://nodejs.org/en/docs/guides/nodejs-docker-webapp/ .

Note 2: I use a .dockerignore file to exclude files and directories in the root folder that contains the Dockerfile. Anything listed in .dockerignore is not added to the build context and will not end up in the container.
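For illustration, a minimal .dockerignore for a project like this could look as follows (the exact entries depend on what lives in your project root; the Vagrant-related entries only apply to my particular setup):

node_modules
npm-debug.log
.git
.gitignore
.vagrant
Vagrantfile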

A Docker container image is built using a Docker build file (Dockerfile). The starting point of the Dockerfile is the base image that is subsequently extended. In this case, the base image is node:10.13.0-alpine, a small and recent Node runtime environment. I create a directory /usr/src/app and have Docker set this directory as its working directory for all subsequent actions.

Docker container images are created in layers. Each build step in the Dockerfile adds a layer. If the build is rerun, layers for steps whose inputs have not changed are taken from the cache, and only changed layers are actually uploaded when the image is pushed. Therefore, it is smart to have the steps that change the most at the end of the Dockerfile. In my case, that means that the application sources should be copied to the container image at a very late stage in the build process.

First I only copy the package.json file – assuming this will not change very frequently. Immediately after copying package.json, all node modules are installed into the container image using npm install.

Only then are the application sources copied. I have chosen to expose port 8080 from the container – a fairly arbitrary decision. However, the environment variable PORT – whose value is read in index.js using process.env.PORT – needs to correspond exactly to whatever port I expose.
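As a sketch of how that looks in the application code (the real index.js is in the GitHub repository; this is a simplified, hypothetical fragment):

// simplified, hypothetical sketch – the real index.js is in the GitHub repository
const http = require('http');

// read the port to listen on from the environment, falling back to 8080 (the port exposed in the Dockerfile)
const PORT = process.env.PORT || 8080;

http.createServer((req, res) => {
  // handle the incoming HTTP request (in the real application: publish to Twitter or Kafka)
  res.end('OK');
}).listen(PORT, () => console.log(`Listening on port ${PORT}`));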

Finally the instruction to run the Node application when the container is started: npm start, passed to the CMD instruction.

Here is the complete Dockerfile:

# note: run docker build in a directory that contains this Dockerfile, the package.json file and all your application sources and static files
# this directory should NOT contain node_modules or any other resources that should not go into the Docker container - unless these are explicitly excluded in a .dockerignore file!
FROM node:10.13.0-alpine

# Create app directory
WORKDIR /usr/src/app

# Install app dependencies
# A wildcard is used to ensure both package.json AND package-lock.json are copied
# where available (npm@5+)
COPY package*.json ./

RUN npm install

# Bundle app source - copy Node application from the current directory
COPY . .

# the application will be exposed at port 8080 
ENV PORT=8080

# so we should expose that port
EXPOSE 8080
# run the application, using npm start (which runs the start script in package.json)
CMD [ "npm", "start" ]

Running docker build – to be exact, I run: docker build -t lucasjellema/http-to-twitter-app . – gives the following output:

[screenshot: output of the docker build command]

 

The container image is created.

I can now run the container itself, for example with:

docker run -p 8090:8080 -e KAFKA_SERVER=127.1.1.1 -e KAFKA_TOPIC=topic -e TWITTER_CONSUMER_KEY=818 -e TWITTER_CONSUMER_SECRET=secret -e TWITTER_ACCESS_TOKEN_KEY=tokenkey -e TWITTER_ACCESS_TOKEN_SECRET=secret lucasjellema/http-to-twitter-app

[screenshot: output of the docker run command]

 

The container is running, the app is running, and at port 8090 on the Docker host I should be able to access the application: http://192.168.188.120:8090/about (note: 192.168.188.120 is the IP address exposed by the Virtual Machine managed by Vagrant).
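A quick check from the command line (from inside the VM, or from any machine that can reach the VM's IP address) looks like this:

curl http://192.168.188.120:8090/about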

[screenshot: the application's /about page in the browser]

3. Build, Tag and Push the Container Image

In order to run a container on a Kubernetes cluster – or indeed on any other machine than the one on which it was built – this container must be shared or published. The easiest way of doing so is through a Container (Image) Registry, such as Docker Hub. In this case I simply tag the container image with the currently applicable tag lucasjellema/http-to-twitter-app:0.9:

docker tag lucasjellema/http-to-twitter-app:latest lucasjellema/http-to-twitter-app:0.9

I then push the tagged image to the Docker Hub registry (note: before executing this statement, I have used docker login to connect my session to the Docker Hub):

docker push lucasjellema/http-to-twitter-app:0.9

 

[screenshot: output of the docker push command]

At this point, the Node application is publicly available for pull – and can be run on any Docker compatible container engine. It does not contain any secrets – all dependencies (such as Twitter credentials and Kafka configuration) need to be injected through environment variable settings.

4. Prepare Kubernetes Resources (Pod, Service, Secrets, Namespace, Deployment)

When the Node application is running on Kubernetes, it has a number of constituents:

  • a namespace cqrs-demo to isolate the other artifacts in their own compartment
  • two secrets to provide the sensitive and dynamic, deployment specific details regarding Kafka and regarding the Twitter client credentials
  • a Pod for a single container – with the Node application
  • a Service – to expose the Pod on an (externally) accessible endpoint and guide requests to the port exposed by the Pod
  • a Deployment http-to-twitter-app – to configure the Pod through a template that is used for scaling and redeployment

The separate namespace cqrs-demo is created with a simple kubectl command:

kubectl create namespace cqrs-demo

The two secrets are two sets of sensitive data entries. Each entry has a key and a value – and the value, of course, is the sensitive part. In the case of the application in this article I have ensured that only the secret objects contain sensitive information. There is no password, endpoint or credential in any other artifact. So I can freely share the other files – even on GitHub. But not the secrets files. They contain the valuable goods.

Note: even though the secrets may seem encrypted – in this case they are not. They simply contain the base64 representation of the actual values. These base64 values can easily be produced on the Linux command line using:

echo -n '<value>' | base64
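and decoded again – which shows these values are merely encoded, not protected – with:

echo '<base64-value>' | base64 --decode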

The secrets are created from these yaml files:

apiVersion: v1
kind: Secret
metadata:
  name: twitter-app-credentials-secret
  namespace: cqrs-demo
type: Opaque
data:
  CONSUMER_KEY: U0hhQjA0QURpT
  CONSUMER_SECRET: dTBDT2lasasasas=
  ACCESS_TOKEN_KEY: OTEusd7878
  ACCESS_TOKEN_SECRET: aUNjkasjsdyusdfyusdf

and:

apiVersion: v1
kind: Secret
metadata:
  name: kafka-server-secret
  namespace: cqrs-demo
type: Opaque
data:
  kafka-server-endpoint: Masasas
  kafka-topic: aasasasasasqwqwq==

using these kubectl statements:

kubectl create -f ./kafka-secret.yaml
kubectl create -f ./twitter-app-credentials-secret.yaml
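Note: as an alternative to hand-crafting these yaml files, the secrets can also be created directly from literal values – kubectl then takes care of the base64 encoding. A sketch, with placeholder values:

kubectl create secret generic kafka-server-secret --namespace cqrs-demo \
  --from-literal=kafka-server-endpoint=<kafka-server-ip> \
  --from-literal=kafka-topic=<kafka-topic-name>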

The Kubernetes Dashboard displays the two secrets:

[screenshot: the two secrets in the Kubernetes Dashboard]

And some details for one (but not the sensitive values):

[screenshot: details for one of the secrets – without the sensitive values]

The file k8s-deployment.yml contains the definition of both the service and the deployment – and, through the deployment, indirectly also the pod.

The service is defined with type LoadBalancer. On Oracle Kubernetes Engine this results in a dedicated external IP address being assigned to this service. That could be considered somewhat wasteful. A more elegant approach would be to use an Ingress controller – which allows us to handle more than just a single service on one external IP address. For the current example, LoadBalancer will do. Note: when you run the Kubernetes artifacts on an environment that does not support LoadBalancer – such as minikube – you can change type LoadBalancer to type NodePort. A random port is then assigned to the service and the service will be available on that port on the IP address of the K8S cluster.

The service is exposed externally at port 80 – although other ports would be perfectly fine too. The service connects to the container port with the logical name app-api-port in the cqrs-demo namespace. This port is defined for the http-to-twitter-app container in the http-to-twitter-app deployment. Note: multiple containers can be started for this single container definition – depending on the number of replicas specified in the deployment and, for example, on whether (re)deployments are taking place. The service mechanism ensures that traffic is load balanced across all container instances that expose the app-api-port.

kind: Service
apiVersion: v1
metadata:
  name: http-to-twitter-app
  namespace: cqrs-demo
  labels:
    k8s-app: http-to-twitter-app
    kubernetes.io/name: http-to-twitter-app
spec:
  selector:
    k8s-app: http-to-twitter-app
  ports:
  - protocol: TCP
    port: 80
    targetPort: app-api-port
  type: LoadBalancer
  # with type LoadBalancer, an external IP will be assigned - if the K8S provider supports that capability, such as OKE
  # with type NodePort, a port is exposed on the cluster; whether that can be accessed or not depends on the cluster configuration; on Minikube it can be, in many other cases an IngressController may have to be configured  

After creating the service, it will take some time (up to a few minutes) before an external IP address is associated with the (load balancer for the) service. Until then, the external IP is shown as pending. Below is what it looks like in the dashboard once the external IP has been assigned (although I blurred most of the actual IP address):

[screenshot: the service in the dashboard with the external IP assigned]
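The same information can be retrieved on the command line, for example by watching the service until the EXTERNAL-IP column changes from pending to an actual address:

kubectl get service http-to-twitter-app --namespace cqrs-demo --watch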

 

The deployment for now specifies just a single replica. It specifies the container image on which the container (instances) in this deployment are based: lucasjellema/http-to-twitter-app:0.9. This is of course the container image that I pushed in the previous section. The container exposes port 8080 (container port) and this port has been given the logical name app-api-port, which we have seen before.

The K8S cluster instance I was using had an issue with DNS translation from domain names to IP addresses. Initially, my application was not working because the host name api.twitter.com could not be translated into an IP address. Instead of trying to fix this DNS issue, I have made use of a built-in feature in Kubernetes called hostAliases. This feature allows us to specify DNS entries that are added at runtime to the hosts file in the container. In this case I instruct Kubernetes to inject the mapping between api.twitter.com and its IP address into the hosts file of the container.

Finally, the container template specifies a series of environment variable values. These are injected into the container when it is started. Some of the values for the environment variables are defined literally in the deployment definition. Others consist of references to entries in secrets, for example the value for TWITTER_CONSUMER_KEY that is derived from the twitter-app-credentials-secret using the CONSUMER_KEY key.

apiVersion: extensions/v1beta1
kind: Deployment
metadata: 
  labels: 
    k8s-app: http-to-twitter-app
  name: http-to-twitter-app
  namespace: cqrs-demo
spec: 
  replicas: 1
  strategy: 
    rollingUpdate: 
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template: 
    metadata: 
      labels: 
        k8s-app: http-to-twitter-app
    spec: 
      hostAliases:
      - ip: "104.244.42.66"
        hostnames:
        - "api.twitter.com"
      containers: 
        - 
          image: "lucasjellema/http-to-twitter-app:0.9"
          imagePullPolicy: Always
          name: http-to-twitter-app
          ports: 
            - 
              containerPort: 8080
              name: app-api-port
              protocol: TCP
          env: 
            - 
              name: PUBLISH_TO_KAFKA_YN
              value: "N"
            - 
              name: TWITTER_HASHTAG
              value: "#GroundbreakersTourOrderEvent"
            - 
              name: TWITTER_CONSUMER_KEY
              valueFrom:
                secretKeyRef:
                  name: twitter-app-credentials-secret
                  key: CONSUMER_KEY
            - 
              name: TWITTER_CONSUMER_SECRET
              valueFrom:
                secretKeyRef:
                  name: twitter-app-credentials-secret
                  key: CONSUMER_SECRET
            - 
              name: TWITTER_ACCESS_TOKEN_KEY
              valueFrom:
                secretKeyRef:
                  name: twitter-app-credentials-secret
                  key: ACCESS_TOKEN_KEY
            - 
              name: TWITTER_ACCESS_TOKEN_SECRET
              valueFrom:
                secretKeyRef:
                  name: twitter-app-credentials-secret
                  key: ACCESS_TOKEN_SECRET
            - 
              name: KAFKA_SERVER
              valueFrom:
                secretKeyRef:
                  name: kafka-server-secret
                  key: kafka-server-endpoint
            - 
              name: KAFKA_TOPIC
              valueFrom:
                secretKeyRef:
                  name: kafka-server-secret
                  key: kafka-topic
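The service and the deployment are then created from this file with a single kubectl statement – along the lines of (the namespace does not have to be passed on the command line, since it is part of the resource metadata):

kubectl create -f ./k8s-deployment.yml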

The deployment in the dashboard:

[screenshot: the deployment in the Kubernetes Dashboard]

 

Details on the Pod:

[screenshot: details on the Pod]

Given admin privileges, I can inspect the real values of the environment variables that were derived from secrets.
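The same can be done from the command line – for example (again assuming sufficient privileges):

kubectl get secret twitter-app-credentials-secret --namespace cqrs-demo -o jsonpath='{.data.CONSUMER_KEY}' | base64 --decode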

The Pod logging is easily accessed as well:

[screenshot: the Pod logging in the dashboard]
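Or from the command line, with something like the following (the actual pod name is generated, so look it up first):

kubectl get pods --namespace cqrs-demo
kubectl logs <pod-name> --namespace cqrs-demo --follow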

5. Run and Try Out the Application

When the external IP has been allocated to the Service and the Pod is running successfully, the application can be accessed. From the Oracle Database – and also just from any browser:

[screenshot: the application accessed from the browser at the service's public IP address]

The public IP address was blurred in the location bar. Note that no port is specified in the URL – the port defaults to 80, and that happens to be the port defined in the service as the port to map to the container's exposed port (8080).

When the database makes its HTTP request, we can see in the Pod logging that the request is processed:

[screenshot: the Pod logging showing the processed request]

And I can even verify that the application has actually done what its logging states it has done:

[screenshot]

Resources

GitHub sources: https://github.com/lucasjellema/groundbreaker-japac-tour-cqrs-via-twitter-and-event-hub

Kubernetes Cheatsheet for Docker developers: https://technology.amis.nl/2018/09/26/from-docker-run-to-kubectl-apply-quick-kubernetes-cheat-sheet-for-docker-users/

Kubernetes Documentation on Secrets: https://kubernetes.io/docs/concepts/configuration/secret/

Kubernetes Docs on Host Aliases: https://kubernetes.io/docs/concepts/services-networking/add-entries-to-pod-etc-hosts-with-host-aliases/

Docker docs on .dockerignore: https://docs.docker.com/engine/reference/builder/#dockerignore-file

Kubernetes Docs on Deployment: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/