How to deploy InfluxDB in Azure using a VM service with dedicated storage

InfluxDB isn’t natively supported on Azure. This blog post will teach you how to deploy InfluxDB (or any other database) in a VM with a managed disk on the Azure platform, so you can use this fast time-series database for your project. If the standard range of supported databases (MySQL, CosmosDB, …) on Azure doesn’t suffice, this blog post is for you.

Prerequisites:

  • Azure account
  • PuTTY(gen)
    • It is possible to generate keys through Azure Cloud Shell but we will use PuTTY.

Where are the keys at?
PuTTY is a free, open-source terminal emulator. It supports multiple network protocols, but in this blog post we will only use SSH. Fun fact: PuTTY has no official meaning. We will use this tool to create keys and connect to our VM.

Download PuTTY with the link provided in the Prerequisites section. Once PuTTY and PuTTYgen are installed, launch PuTTYgen, a tool for creating SSH keys.

Click the Generate button and move your cursor around the blank area below the progress bar to ‘generate some randomness’. We will use these keys to connect to our VM later. Click the “Save public key” and “Save private key” buttons and save your keys to a secure place on your computer. You can then close PuTTYgen and log into the Azure Portal.

Creating a Virtual Machine
Now that we have our keys, we can get started in Azure. But first I’ll explain why we are using a VM. InfluxDB isn’t natively supported on Azure, so we have to host it ourselves.

Most people (myself included) would first think that a Container Instance with Shared File Storage is the appropriate option. It’s stateless, secured and a cheap alternative to a VM. The major problem is that after a restart InfluxDB can’t read its own data anymore. InfluxDB doesn’t support Shared File Storage because it causes a lot of bugs. InfluxDB isn’t alone in this: MongoDB explicitly notes that Shared File Storage isn’t supported at all.

We now understand why we are using a VM. Let’s create one! Search for Virtual Machines and click it.


Since we want to create a new VM, click Add.


Now we are in the Virtual Machine wizard with a lot of options. Note that most options depend on your case. We will create something cheap for this demo.

  1. Choose your subscription; this is where the expenses are billed.
  2. Choose your resource group. If you don’t have one yet, use the Create new option.
  3. Choose your VM name. Something recognizable is advisable, since Azure creates several resources named after it ($NAME-ip, …).
  4. Choose your preferred region.
  5. The availability options setting protects your data from datacenter outages. If you are reading this blog post you probably don’t need it, so skip it for now.
  6. Choose your image. I will use Ubuntu 18.04, but you are free to use whatever you like. Keep in mind that Linux is cheaper than Windows.
  7. Choose your size. B1s has 1 vCPU and 1 GB memory. Depending on the load you put on your VM, choose something more powerful; you can always scale up later. Ignore the disk space here, since we will create and attach our own managed disk later.
  8. We will use SSH public key authentication.
  9. Choose your login name (I will use “influx”) and paste the public key we generated with PuTTYgen.
  10. When the VM is running we want to connect to it through PuTTY. Open up port 22 (SSH) with this setting. Otherwise the error “Connection Refused” will occur.
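A small aside on step 9: PuTTYgen shows the key as a single line in the box at the top of its window (the OpenSSH form, starting with something like ssh-rsa), while the file written by “Save public key” is a multi-line block that starts with ---- BEGIN SSH2 PUBLIC KEY ----. If the key field complains about the format, this is the first thing I would check. The sketch below is only a helper to tell the two apart; the function name pubkey_format and the file name in the usage comment are made up for this example.

```shell
# Classify public-key text read from stdin by looking at its first line.
# Prints "openssh", "rfc4716" or "unknown".
pubkey_format() {
  local first_line=""
  # `read` fails on a last line without newline but still fills the variable.
  IFS= read -r first_line || true
  case "$first_line" in
    ssh-rsa\ *|ssh-ed25519\ *|ecdsa-sha2-*\ *) echo "openssh" ;;
    "---- BEGIN SSH2 PUBLIC KEY ----"*) echo "rfc4716" ;;
    *) echo "unknown" ;;
  esac
}

# Usage (mykey.pub is a hypothetical file name):
#   pubkey_format < mykey.pub
```

If it prints rfc4716, reopen the private key in PuTTYgen and copy the single-line text from the box at the top instead.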

A database stores data, so we need some extra disk space. Click Next : Disks.

Select your OS disk. I am using the Standard HDD. If you want something faster, go for it.

  1. Choose your disk type. A standard HDD offers up to 60 MB/s and is really cheap. For a lot of use cases this is enough.
  2. Choose a name.
  3. Choose the size. I will use 100 GB. HDD prices are divided into tiers, so check the Azure managed disks pricing page before you pick one.
  4. We need an empty disk, so click Create and go to Next : Management.

We are now in the Management section. Make sure you select the correct Storage Account for diagnostics. If you do not have one, click the create new button. This is where your logs are stored.

We are done with the VM settings. Click Review + create and check once more that everything is properly configured. Azure will deploy your VM to the selected resource group, which can take a minute. Once the VM is up and running, go to your resource group and click on your VM.

On the right there is a label named “Public IP Address”. Copy the IP address and save it. If you ever forget the IP, this is where you can find it. In this tutorial we are not going to configure a DNS name. Next, click the Networking tab on the left.

Because we want to access the database from outside the VM as well, we need to open up port 8086. In the Networking tab, click Add inbound port rule.

Change the port to 8086 and choose a name and description. Then click Add.

Okay good, but what did we do? We created an Ubuntu VM with an extra HDD to store our data. We also opened port 22 for SSH and port 8086 for InfluxDB. We still need to partition our disk, move Docker to that disk and install InfluxDB. Let’s get right into that.
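If you would rather script this than click through the portal, the same setup can be approximated with the Azure CLI. The sketch below only prints the commands so you can review them first. The resource group and VM names in the usage comment are placeholders, the image is left as a parameter because image names change over time, and the priority value is an arbitrary pick of mine.

```shell
# Print (do not run) Azure CLI commands that roughly mirror the portal steps.
# $1: resource group, $2: VM name, $3: image, $4: OpenSSH public key (file or text)
# Assumption: `az vm create` sets up SSH access on port 22 with its defaults;
# check the Networking tab afterwards to be sure.
print_az_commands() {
  cat <<EOF
az vm create --resource-group $1 --name $2 --image $3 --size Standard_B1s --admin-username influx --ssh-key-values $4 --storage-sku Standard_LRS --data-disk-sizes-gb 100
az vm open-port --resource-group $1 --name $2 --port 8086 --priority 1010
EOF
}

# Usage (all four values are placeholders):
#   print_az_commands influx-rg influx-vm YOUR_IMAGE ~/.ssh/id_rsa.pub
```

Printing instead of executing is a deliberate choice here: you can compare the flags with what you selected in the wizard before anything gets billed.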

Establishing a connection with our VM
Open up PuTTY!

Our VM is running, but we secured it with an SSH key. We need to attach our private key to open a session. Go to Auth (under Connection > SSH), click Browse, select your private key and click OK. Then go back to Session.

Paste your VM’s IP address, use port 22 and choose connection type SSH. Click Open. Log in using your username (the one you chose during the VM setup). Congratulations! You are now logged in to a computer running in one of Microsoft’s datacenters.

Inside the VM
No more fancy GUIs, real (wo)men use the CLI. But don’t worry, you don’t have to be a CLI wizard (yet). When copying a line from this blog, use the right mouse button to paste it into your own CLI.

First we need to partition our managed disk (in my case a 100 GB HDD). Afterwards we need to mount it. That way we can move our InfluxDB data to the managed disk.

Let’s find our managed disk with the following commands. We are using two commands to verify that it’s indeed the right disk.

dmesg | grep SCSI
lsblk
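If the lsblk output is long, a small filter can point at the likely candidate: a freshly attached data disk is a whole disk without any partitions yet. This is only a sketch; the function name is made up, and it matches partitions to disks by name prefix, which is a simplification that works for names like sda/sda1.

```shell
# Read `lsblk -ln -o NAME,TYPE` output on stdin and print whole disks that
# have no partition rows, i.e. candidates for the newly attached disk.
unpartitioned_disks() {
  awk '
    $2 == "disk" { disks[$1] = 1 }
    $2 == "part" { parts[$1] = 1 }
    END {
      for (d in disks) {
        used = 0
        # A partition belongs to a disk when its name starts with the disk name.
        for (p in parts) if (index(p, d) == 1) used = 1
        if (!used) print d
      }
    }' | sort
}

# Usage on the VM:
#   lsblk -ln -o NAME,TYPE | unpartitioned_disks
```

Still compare the result with the size you chose in the portal before touching the disk.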

I made a 100 GB disk which is now called “sdc”. Verify that the disk is in both outputs like in the image below.

First we need to partition the disk. If your disk has a different name than sdc, change the commands accordingly.

sudo fdisk /dev/sdc

You are prompted to enter a command. Use the n command to create a new partition, choose p for a primary partition and accept the defaults so it spans the whole disk. After creating the partition we need to write it to our disk. Use the w command to write and exit. We successfully created a new partition (/dev/sdc1) and wrote it to our managed disk.

Now write a file system to the new partition using the mkfs command. We are going to create an ext4 file system. Note that we format the partition (/dev/sdc1), not the whole disk (/dev/sdc), because formatting the whole disk would overwrite the partition table we just wrote. Execute the command below.

sudo mkfs -t ext4 /dev/sdc1

We formatted the partition as ext4, nice! The last thing to do is to create a directory to mount the file system on, using mkdir. I am going to name my folder /databasedata but you can opt for a different name. Afterwards we mount the partition on that folder.

sudo mkdir /databasedata
sudo mount /dev/sdc1 /databasedata

To ensure that the drive is mounted automatically after a reboot, it must be added to the fstab file. It’s a best practice to use the UUID to refer to the drive rather than the device name. To find the UUID use:

sudo -i blkid


Ah, there it is! /dev/sdc1 is the one we need. Copy the UUID. Now adapt the following line to your own needs. My $FOLDERNAME is /databasedata.

UUID=$UUID $FOLDERNAME ext4 defaults,nofail 1 2

Now we are going to add the line above to the fstab file. If you are not familiar with vim: press i to go into insert mode, and once you have added the line press Esc and type :wq to save and exit.

sudo vi /etc/fstab
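If you don’t feel like editing fstab by hand, the line can also be generated and appended from the shell. This is a sketch under a few assumptions: the helper name fstab_entry is mine, and the device in the usage comment is a guess, so use whichever device blkid listed for your data disk.

```shell
# Build an /etc/fstab line for an ext4 data disk.
# $1: file system UUID (as printed by blkid), $2: absolute mount point.
fstab_entry() {
  printf 'UUID=%s %s ext4 defaults,nofail 1 2\n' "$1" "$2"
}

# Usage on the VM (device name is an assumption, adjust to your disk):
#   uuid="$(sudo blkid -s UUID -o value /dev/sdc1)"
#   fstab_entry "$uuid" /databasedata | sudo tee -a /etc/fstab
```

Note that tee -a appends, so running it twice gives you a duplicate line. Have a quick look at the file afterwards.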

To verify that everything works:

lsblk


It’s mounted correctly!
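Looking at the lsblk output works fine. If you want a check you can reuse in a script, the kernel’s mount table in /proc/mounts can be filtered for the folder. The helper below is a sketch and its name is made up.

```shell
# Read /proc/mounts-style lines on stdin and print the device mounted at $1.
# Returns non-zero when nothing is mounted there.
mounted_device() {
  awk -v mp="$1" '$2 == mp { print $1; found = 1 } END { exit !found }'
}

# Usage on the VM:
#   sudo mount -a    # mounts everything from /etc/fstab that is not mounted yet
#   mounted_device /databasedata < /proc/mounts
```

Running sudo mount -a before you reboot is a cheap way to catch a typo in the fstab line while you still have a working session.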

VMception
We are going to use Docker to run InfluxDB in our VM. Let’s start by installing Docker.

sudo apt-get update
sudo apt-get install \
apt-transport-https \
ca-certificates \
curl \
software-properties-common

Now that we have updated our package index and installed some prerequisites, let’s install Docker. First we acquire the Docker GPG key and verify it using its fingerprint, then we add the Docker repository for Ubuntu. We update apt-get once more and install docker-ce.

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo apt-key fingerprint 0EBFCD88
sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) \
   stable"
sudo apt-get update
sudo apt-get install docker-ce

We finally have Docker installed and we are close to installing InfluxDB. By default Docker stores all of its data under /var/lib/docker on the OS disk. Since we want everything on our managed disk, we need to change Docker’s default directory. First, verify that Docker is working.

docker -v

When Docker starts it checks the daemon.json file for settings. We will edit this file to let Docker know we want all our data in $YOURFOLDER. Enter the following command to edit it (remember the vim commands?).

sudo vi /etc/docker/daemon.json

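Insert the following, replacing /databasedata with the folder where your disk is mounted, then save and exit. Recent Docker releases call this setting "data-root" (older ones used "graph") and use the "overlay2" storage driver instead of "overlay".

{
    "data-root": "/databasedata",
    "storage-driver": "overlay2"
}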
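For those who prefer not to use vim, the same file can be written from the shell. A minimal sketch, assuming a current Docker release where the data directory setting is called data-root (older releases called it graph) and where overlay2 is the usual storage driver; the function name is made up.

```shell
# Print a minimal daemon.json that points Docker's data directory at $1.
daemon_json() {
  cat <<EOF
{
    "data-root": "$1",
    "storage-driver": "overlay2"
}
EOF
}

# Usage on the VM (this overwrites an existing daemon.json):
#   daemon_json /databasedata | sudo tee /etc/docker/daemon.json
```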

Good, but Docker only reads this file when it starts. Let’s reload systemd and restart Docker.

sudo systemctl daemon-reload
sudo systemctl restart docker

To confirm that it’s really using the correct directory, type the following:

sudo docker info|grep "Docker Root Dir"
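Since the next step deletes a directory, it doesn’t hurt to turn this check into a yes/no answer. The two helpers below are a sketch with made-up names; they only parse the text that docker info prints.

```shell
# Extract the "Docker Root Dir" value from `docker info` text read on stdin.
docker_root_dir() {
  sed -n 's/^ *Docker Root Dir: *//p'
}

# Succeed only when the root dir found on stdin equals the expected folder ($1).
check_root_dir() {
  [ "$(docker_root_dir)" = "$1" ]
}

# Usage on the VM:
#   sudo docker info 2>/dev/null | check_root_dir /databasedata && echo "ok"
```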

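Once you have verified that Docker uses the new directory, we can remove the old Docker files. Docker does not copy existing data over, but on this fresh install there is nothing worth keeping.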

sudo rm -rf /var/lib/docker

You’re still here? Good! Let’s install InfluxDB
A couple of steps ago we changed the root folder of Docker to our managed disk. All Docker data (including Influx) will be stored there. We are creating a volume because volumes are completely managed by Docker and have several advantages over bind mounts. Let’s create a volume! I’m going to call it influxdb_data, but choose whatever you want.

sudo docker volume create influxdb_data

Let’s install InfluxDB. You are free to pass in as many environment variables as you want. In this example we only enable authentication, specify the port, make sure the container restarts on failure and link the volume. The environment variable, data path and -config flag used here belong to InfluxDB 1.x, so we pin the image to a 1.x tag instead of relying on latest.

sudo docker run -d \
--name="influxdb" \
--restart on-failure \
-p 8086:8086 \
-e INFLUXDB_HTTP_AUTH_ENABLED=true \
-v influxdb_data:/var/lib/influxdb \
influxdb:1.8 -config /etc/influxdb/influxdb.conf

Verify that your container is running.

sudo docker container ls

To get into our container and start using the database:

sudo docker exec -it influxdb /bin/bash
influx


You did it! We now have InfluxDB running in a VM, and we can access it from outside through $VMIPADDRESS:8086.
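To try the database from your own machine, you can write a point over HTTP. The sketch below assumes InfluxDB 1.x and that you already created a database and a user inside the influx shell (with authentication enabled, the first user has to be an admin, for example via CREATE USER ... WITH PASSWORD '...' WITH ALL PRIVILEGES). The names mydb, myuser and mypassword are placeholders and the helper name is made up.

```shell
# Build one InfluxDB line-protocol record: measurement,tag_set field_set
# $1: measurement, $2: tag set (key=value[,key=value...]), $3: field set.
line_protocol() {
  printf '%s,%s %s\n' "$1" "$2" "$3"
}

# Usage from your own machine (database, user and password are placeholders):
#   curl -i -XPOST "http://$VMIPADDRESS:8086/write?db=mydb" \
#        -u myuser:mypassword \
#        --data-binary "$(line_protocol cpu host=influx-vm value=0.64)"
```

Without a timestamp in the record the server assigns one itself, which is fine for a first test.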

What to do now?
InfluxDB has many options and there is a lot to learn. Read their documentation and start inserting, getting and modifying data. You can also visualize your data in Grafana, which you can also run in a VM! Hopefully this guide has given you a better understanding of how to run databases in Azure.

About Author

Passionate nerd who wants to know everything about technology. Besides my passion for IT, I try to pick up as many other skills as possible.

2 Comments

  1. Hi thank you for your tutorial, on the final step when running influxdb -config /etc/influxdb/influxdb.conf I get:
    Unable to find image ‘influxdb:latest’ locally
    docker: Error response from daemon: received unexpected HTTP status: 503 Service Unavailable.
    See ‘docker run --help’.

    Therefore when running `sudo docker container ls` there is nothing:
    CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES

    • Hello Dean!

      The “Unable to find image ‘influxdb:latest’ locally” is normal. Docker is saying, “I don’t have the image, that’s why I am going to pull it from the interwebz”.

      “Received HTTP status 503” means that your Docker instance has problems connecting to the Docker image registry.

      First try:
      apt-get update

      If this works, you know that your internet connection is fine.

      Next try:
      sudo systemctl daemon-reload
      sudo systemctl restart docker

      Sometimes, for example after you reset a password, you get errors while pulling, and restarting Docker can help.
      Now try the following again:

      sudo docker run -d \
      --name="influxdb" \
      --restart on-failure \
      -p 8086:8086 \
      -e INFLUXDB_HTTP_AUTH_ENABLED=true \
      -v influxdb_data:/var/lib/influxdb \
      influxdb -config /etc/influxdb/influxdb.conf

      If you still get the same error, restart your VM completely. Don’t worry about everything you’ve done before. Thanks to all the steps we took in this tutorial it’s completely safe =)

      Please let me know if this worked.

      Regards
      Sam
