
InfluxDB V2.0 – Stack Implementation Proof of Concept

This blog post gives you detailed instructions and information about the InfluxDB stack. Since February 2019, InfluxDB consists not only of the database itself but also of Telegraf, Chronograf and Kapacitor. Later you’ll read what each part of the stack does.

18-2-2019: Living on the bleeding edge, do not run this in production.

What is InfluxDB?

InfluxDB is a time series database made by InfluxData. It is widely used for telemetry and other time-related data. For more information about InfluxDB, read https://www.influxdata.com/time-series-platform/

The Stack

A little bit of background information. InfluxDB used to be a stand-alone database optimized for time series data. It was first released in 2013 and has been a market leader since. Over a longer period of time the Influx team developed multiple components that revolve around the database. These components are:

  • Telegraf
  • Chronograf
  • Kapacitor

Together with InfluxDB this is called the TICK Stack. When my colleague (Gertjan) and I were at the Influx Conference roadtrip in Amsterdam, they announced the biggest change since the first release: a single Docker image with all the components in one. When you pull InfluxDB you also get the rest of the functionality.

As mentioned earlier, InfluxDB now comes in a stack with different components. Each component is described in detail below so you have a better understanding of what’s happening.

In figure 1 below you can see the architecture behind the stack. When we discuss the stack later you will see that some names have changed. For example, Telegraf is split into a Scraper and Telegraf (both use the same component under different names). Besides these occasional name changes, not all functionality is implemented (yet!). Kapacitor, for example, does not have anomaly detection or machine learning implemented (yet!). On this page we discuss the current functionality; new functionality will be added with each update.

https://www.influxdata.com/wp-content/uploads/Architecture-970x500.png

Telegraf

Telegraf is a plugin-driven server agent for collecting and reporting metrics. If you’re familiar with a cloud platform like AWS or Azure, it is roughly comparable to Kinesis Data Streams or Stream Analytics. Telegraf has plugins to source a variety of metrics directly from the system it’s running on, pull metrics from third-party APIs, or even consume them from StatsD and Kafka. We will mostly use an API as our source.

Telegraf also has output plugins to send metrics to a variety of other datastores, services and message queues, such as InfluxDB, Kafka, Graphite, OpenTSDB, Datadog and many others. Here are some key features of Telegraf:

  • Written entirely in Go. The main advantage of this is that it compiles into a single binary with no external dependencies
  • Minimal memory footprint
  • Plugin system allows new inputs and outputs to be easily added
  • Plugins already exist for many well-known services and APIs

Chronograf

Chronograf is the administrative user interface and visualization engine of the TICK stack. It makes monitoring and alerting for your infrastructure and data easy to set up and maintain. It is fairly simple to use and includes templates and libraries that let you rapidly build dashboards with real-time visualizations of your data and easily create alerting and automation rules. Here are some key features of this component:

  • Infrastructure monitoring
  • Alert management
  • Data visualization
  • Database management
  • Multi-organization and multi-user support

Kapacitor

Kapacitor is a native data processing engine for hot data. It can process both stream and batch data from InfluxDB. Kapacitor lets you plug in your own custom logic or user-defined functions to process alerts with dynamic thresholds, match metrics for patterns, compute statistical anomalies, and perform specific actions based on these alerts, such as dynamic load rebalancing.

Good sales pitch, but is it implemented? Unfortunately not yet. InfluxData is not lying: the stand-alone version of Kapacitor does all this, but in the new 2.0 stack it’s yet to be implemented. We will talk about Kapacitor’s functionality later on this page.

InfluxDB

I hope you read the InfluxDB page; if not, here is a short summary. InfluxDB is a time series database built from the ground up to handle high write and query loads. It is a custom high-performance datastore written specifically for timestamped data, including DevOps monitoring, application metrics, IoT sensor data, and real-time analytics. You can conserve space on your machine by configuring InfluxDB to keep data for a defined length of time, after which it automatically expires and deletes unwanted data. InfluxDB also offers a SQL-like query language for interacting with data.
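To give an idea of what timestamped data looks like when it is written to InfluxDB, here is a minimal sketch that formats a single point in InfluxDB line protocol (measurement, tag set, field set, timestamp). The helper name line_protocol and the host tag and usage_idle field are made up for illustration; they are not taken from this setup.

```shell
# Sketch: format one point in InfluxDB line protocol.
# Layout: <measurement>,<tag_set> <field_set> <timestamp>
# The tag and field names used in the example call are illustrative only.
line_protocol() {
  measurement=$1   # e.g. cpu
  tags=$2          # comma-separated key=value pairs
  fields=$3        # comma-separated key=value pairs
  timestamp=$4     # Unix time in nanoseconds
  printf '%s,%s %s %s\n' "$measurement" "$tags" "$fields" "$timestamp"
}

# Prints: cpu,host=vm01,cpu=cpu-total usage_idle=97.5 1550448000000000000
line_protocol cpu "host=vm01,cpu=cpu-total" "usage_idle=97.5" 1550448000000000000
```

Telegraf produces lines like this for you, so you normally never write them by hand; the sketch only shows the shape of the data.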

Flux

More? Yes! Although it is not shown in figure 1 (architecture), it’s a super important part of the new 2.0 release. InfluxDB used to have only a SQL-like query language for the usual database work, like querying data, database management and more. Unfortunately it had some limitations. The solution? Flux!

Flux is InfluxDB’s new functional data scripting language designed for querying, analyzing, and acting on time series data. It takes the power of InfluxQL and the functionality of TICKscript and combines them into a single unified syntax. The syntax is inspired by 2018’s most popular scripting language, JavaScript. Even with Flux as the new scripting language, the old SQL-like query language does not disappear.

A Flux code example can be viewed below. It illustrates querying data stored in the last hour, filtering by the cpu measurement and the cpu=cpu-total tag, windowing the data in 1 minute intervals, and calculating the average of each window. We will not go into detail about the Flux language; there will be a dedicated blog post for Flux itself.

from(bucket:"example-bucket")
  |> range(start:-1h)
  |> filter(fn:(r) =>
    r._measurement == "cpu" and
    r.cpu == "cpu-total"
  )
  |> aggregateWindow(every: 1m, fn: mean)
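To make the windowing step a bit more concrete, here is a rough shell and awk sketch of what "group into fixed windows and take the mean" amounts to. It is not Flux and not how InfluxDB implements it; the helper name window_mean and the sample values are made up, and the sketch labels each window by its start time, which may differ from how Flux timestamps its output.

```shell
# Sketch: average values per fixed time window.
# Reads "<unix_seconds> <value>" lines on stdin; $1 is the window size in seconds.
window_mean() {
  awk -v every="$1" '
    { w = $1 - ($1 % every); sum[w] += $2; n[w]++ }   # bucket by window start
    END { for (w in sum) printf "%d %.2f\n", w, sum[w] / n[w] }
  ' | sort -n
}

# Four samples, 60-second windows. Prints:
#   0 15.00
#   60 40.00
printf '%s\n' "0 10" "30 20" "60 30" "90 50" | window_mean 60
```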

Setup your environment

The best way to learn is to read and do. We will set up a VM with the new stack running. Afterwards we will set up Telegraf to collect telemetry data from our VM instance. As a final step we will visualize the data. This will give you a great first impression of how the stack works.

We will start by setting up a VM with Docker and the new Influx stack. I recommend you follow along with a VM in Azure or AWS. This way you can immediately familiarize yourself with the cloud environment. Of course you can run it locally as well. I will not go over how to set up a VM; you can easily click through the wizard.

For my example I will use Ubuntu 18.04.2 LTS with a standard 30 GB HDD on Azure. I opened up ports 22 and 9999. I’m using SSH and generated my key with PuTTY.

Note: Using cutting edge database technology with Docker, that is so 2018. For this proof-of-concept we are indeed not using Kubernetes (with Minikube) and Terraform. For the purpose of this explanation it’s too much. 

Protip: Familiarize yourself with tmux to be a terminal 31337. 3 panes in one? Easy.


Docker and containers

First we need to install Docker. We are using Docker CE 18.09.2. If you have not yet installed Docker, please follow the official Docker installation guide first.

Now that we have Docker installed, we are going to pull and run our Influx container. Remember, instead of running the components separately we are running one container called influxdb with all the components included. To pull and run the container use:

sudo docker run --name influxdb -p 9999:9999 quay.io/influxdb/influxdb:2.0.0-alpha

If you don’t want to send InfluxDB telemetry data back to InfluxData, opt out by including “--reporting-disabled” when starting the Docker container.

Note: This is basic configuration. We have not specified what disk to use and other options. This is okay for our proof of concept.
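For the opt-out flag mentioned above, a minimal sketch of where it goes: arguments placed after the image name are handed to the process inside the container. The helper name start_influxdb and the -d (detached) option are my additions and not part of the original command.

```shell
# Sketch: start InfluxDB 2.0 alpha detached, with usage reporting turned off.
# Everything after the image name is passed to the process in the container,
# which is where --reporting-disabled belongs.
start_influxdb() {
  docker run -d --name influxdb -p 9999:9999 \
    quay.io/influxdb/influxdb:2.0.0-alpha --reporting-disabled
}

# Usage (prefix docker with sudo if your user is not in the docker group):
#   start_influxdb
```

If a container named influxdb already exists from an earlier run, Docker will refuse to reuse the name until that container is removed.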

Docker pulls the InfluxDB 2.0 alpha image from Quay (quay.io); once downloaded, it is automatically started as a container. To check if your container is running please use:

sudo docker container ls

If everything went well, the output lists a running container named influxdb.
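If you would rather have a yes/no answer than read the table, the check can be scripted. This is a small sketch; the helper name influxdb_running is made up. The name filter matches substrings, so the exact name is verified with grep.

```shell
# Sketch: succeed only if a running container is named exactly "influxdb".
influxdb_running() {
  docker container ls --filter "name=influxdb" --format '{{.Names}}' \
    | grep -qx 'influxdb'
}

# Usage (prefix docker with sudo if needed):
#   influxdb_running && echo "influxdb is up" || echo "influxdb is not running"
```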

Now that we have our container we are going to log in.

sudo docker exec -it influxdb /bin/bash

And we’re all set. We have downloaded and run the Influx stack and we are now going to configure it.

Getting started with the new platform

Normally this is the part where I shoot terminal jargon at you. But with the new Influx release we have Chronograf, our brand new GUI that guides us through the initial setup. If you read the docker run command carefully, you know that we published the container’s port 9999. Make sure you have opened port 9999 on the VM as well.

Note: We could do this in the CLI, but the GUI is easier for the first time.

Let’s go to 

[ip]:9999

You should be greeted by the Influx welcome page.

Click on “Get Started” to set up the initial user. This user is going to be the “owner” of their organization. You are not required to choose an organization, but it is advised, mainly because we are going to discuss organization and user/role management. The bucket name is your database name. Click Continue.


Now we are done with the initial setup. Please click “Advanced” and don’t be scared if you’re not familiar with it yet.


There are two ways of collecting data: Scraper and Telegraf. Both are managed by Telegraf.

  • Scraper
    • A scraper collects data from specified targets at regular intervals and then writes the scraped data to a bucket. Scrapers can collect data from any available data source as long as the data is in Prometheus data format. For more information about this format, see the Prometheus documentation.
  • Telegraf
    • A Telegraf configuration collects metric data from various systems. Currently it can receive data from: System, Docker, Kubernetes, NGINX and Redis.
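To show what a scraper target has to serve, here is a minimal sketch that prints one gauge in the Prometheus text format. The helper name prom_gauge and the metric room_temperature_celsius are invented for the example.

```shell
# Sketch: print one gauge in the Prometheus text exposition format.
# The metric name and help text are made up for illustration.
prom_gauge() {
  name=$1; help=$2; value=$3
  printf '# HELP %s %s\n' "$name" "$help"
  printf '# TYPE %s gauge\n' "$name"
  printf '%s %s\n' "$name" "$value"
}

# Prints:
#   # HELP room_temperature_celsius Temperature of the meeting room.
#   # TYPE room_temperature_celsius gauge
#   room_temperature_celsius 21.5
prom_gauge room_temperature_celsius "Temperature of the meeting room." 21.5
```

A scraper would fetch text like this over HTTP from the target at each interval and write the values to the bucket.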

We are going to configure Telegraf System.


On the left bar go to Organisations >  -your company name- > Telegraf > create configuration


Select System and click continue


Type a name for your Telegraf system configuration and click “create and verify”. If you need additional configuration, click the plugin name on the left and follow the instructions. On the last page, click the verify button to check that it’s working correctly. Now click finish.

Telegraf

We have configured InfluxDB correctly. Now we need to set up a new Docker container with Telegraf. Once again, Telegraf is a plugin-driven server agent for collecting and reporting metrics. Since we want to collect our metrics, let’s build it.

First we need to obtain the Telegraf config. Once you have successfully created your Telegraf agent, click view to see the config.

If you configured everything as described, your config will match the template below. Copy your config.

# Configuration for telegraf agent
[agent]
  ## Default data collection interval for all inputs
  interval = "10s"
  ## Rounds collection interval to 'interval'
  ## ie, if interval="10s" then always collect on :00, :10, :20, etc.
  round_interval = true
 
  ## Telegraf will send metrics to outputs in batches of at most
  ## metric_batch_size metrics.
  ## This controls the size of writes that Telegraf sends to output plugins.
  metric_batch_size = 1000
 
  ## For failed writes, telegraf will cache metric_buffer_limit metrics for each
  ## output, and will flush this buffer on a successful write. Oldest metrics
  ## are dropped first when this buffer fills.
  ## This buffer only fills when writes fail to output plugin(s).
  metric_buffer_limit = 10000
 
  ## Collection jitter is used to jitter the collection by a random amount.
  ## Each plugin will sleep for a random time within jitter before collecting.
  ## This can be used to avoid many plugins querying things like sysfs at the
  ## same time, which can have a measurable effect on the system.
  collection_jitter = "0s"
 
  ## Default flushing interval for all outputs. Maximum flush_interval will be
  ## flush_interval + flush_jitter
  flush_interval = "10s"
  ## Jitter the flush interval by a random amount. This is primarily to avoid
  ## large write spikes for users running a large number of telegraf instances.
  ## ie, a jitter of 5s and interval 10s means flushes will happen every 10-15s
  flush_jitter = "0s"
 
  ## By default or when set to "0s", precision will be set to the same
  ## timestamp order as the collection interval, with the maximum being 1s.
  ##   ie, when interval = "10s", precision will be "1s"
  ##       when interval = "250ms", precision will be "1ms"
  ## Precision will NOT be used for service inputs. It is up to each individual
  ## service input to set the timestamp at the appropriate precision.
  ## Valid time units are "ns", "us" (or "µs"), "ms", "s".
  precision = ""
 
  ## Logging configuration:
  ## Run telegraf with debug log messages.
  debug = true
  ## Run telegraf in quiet mode (error log messages only).
  quiet = false
  ## Specify the log file name. The empty string means to log to stderr.
  logfile = ""
 
  ## Override default hostname, if empty use os.Hostname()
  hostname = ""
  ## If set to true, do no set the "host" tag in the telegraf agent.
  omit_hostname = false
[[outputs.influxdb_v2]]
  ## The URLs of the InfluxDB cluster nodes.
  ##
  ## Multiple URLs can be specified for a single cluster, only ONE of the
  ## urls will be written to each interval.
  ## urls exp: http://127.0.0.1:9999
  urls = ["http://127.0.0.1:9999"]
 
  ## Token for authentication.
  token = "$INFLUX_TOKEN"
 
  ## Organization is the name of the organization you wish to write to; must exist.
  organization = "amis"
 
  ## Destination bucket to write into.
  bucket = "vergaderzalen"
[[inputs.cpu]]
  ## Whether to report per-cpu stats or not
  percpu = true
  ## Whether to report total system cpu stats or not
  totalcpu = true
  ## If true, collect raw CPU time metrics.
  collect_cpu_time = false
  ## If true, compute and report the sum of all non-idle CPU states.
  report_active = false
[[inputs.disk]]
  ## By default stats will be gathered for all mount points.
  ## Set mount_points will restrict the stats to only the specified mount points.
  # mount_points = ["/"]
  ## Ignore mount points by filesystem type.
  ignore_fs = ["tmpfs", "devtmpfs", "devfs", "overlay", "aufs", "squashfs"]
[[inputs.diskio]]
[[inputs.mem]]
[[inputs.net]]
[[inputs.processes]]
[[inputs.swap]]
[[inputs.system]]

SSH 

Let’s go back into our SSH session and create a new folder and file.

pwd
mkdir telegraf
cd telegraf
vi telegraf.conf

With pwd we check which folder we are in. Find a good location; I made mine in “/home/amis/telegraf/telegraf.conf”.
Make your folder and cd into it, then create a file called “telegraf.conf”. Remember when you copied the contents of the Telegraf agent config? Once you’re in Vim, paste your config (press i to insert, then right click > paste). Save your file by typing “:wq” (after pressing ESC).
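Once telegraf.conf is saved, a one-off dry run can confirm that the file parses and the inputs produce data. The sketch below assumes the official telegraf image from Docker Hub and that you run it from the folder containing telegraf.conf; the helper name telegraf_dry_run is made up. Telegraf's --test flag gathers metrics once, prints them to stdout, and exits without writing to InfluxDB.

```shell
# Sketch: run Telegraf once against the config in the current directory.
# --test gathers metrics a single time, prints them to stdout, and exits.
telegraf_dry_run() {
  docker run --rm \
    -v "$PWD/telegraf.conf:/etc/telegraf/telegraf.conf:ro" \
    telegraf telegraf --config /etc/telegraf/telegraf.conf --test
}

# Usage (prefix docker with sudo if needed):
#   cd /home/amis/telegraf && telegraf_dry_run
```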

Good, we now have a file with our configuration. But there is one problem: where is the security? Well, we need an InfluxDB token to gain write access. Go back to the InfluxDB portal and click on “setup details”.

Copy only the INFLUX_TOKEN={token} part shown there. If you look at the config you will see this part:

  ## Token for authentication.
  token = "$INFLUX_TOKEN"

That means the Telegraf agent will look for our token in its environment variables. When starting our Docker container we need to pass this environment variable.
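A forgotten token is easy to miss until writes start failing, so a small guard before starting the container can help. This is a sketch; the helper name require_token is made up.

```shell
# Sketch: refuse to continue when the token variable is missing or empty.
require_token() {
  if [ -z "${INFLUX_TOKEN:-}" ]; then
    echo "INFLUX_TOKEN is not set" >&2
    return 1
  fi
}

# Usage:
#   export INFLUX_TOKEN=insert_your_token
#   require_token && echo "token found, safe to start Telegraf"
```

With the variable exported in your shell, docker run also accepts -e INFLUX_TOKEN without a value and copies it from your environment, which keeps the token out of the command line.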

Go back into the VM and run the following command. Replace $PWD with the folder that contains telegraf.conf, or run the command from that folder. This mounts our file into the container’s file system. We also pass --network="host" to share the host’s network. Alternatively, you could create a new Docker network and attach both containers to it. For this demo we’ll use the host network, which lets Telegraf reach InfluxDB on localhost.

sudo docker run -v $PWD/telegraf.conf:/etc/telegraf/telegraf.conf:ro --network="host" -e INFLUX_TOKEN=insert_your_token telegraf
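For the separate Docker network mentioned above, a rough sketch could look like this. The network name influx-net, the container name telegraf and the helper name use_shared_network are hypothetical. On a user-defined network, containers can reach each other by container name, so telegraf.conf would need urls = ["http://influxdb:9999"] instead of 127.0.0.1.

```shell
# Sketch: attach the running influxdb container and a new Telegraf container
# to one user-defined network. "influx-net" is a hypothetical name.
use_shared_network() {
  docker network create influx-net &&
  docker network connect influx-net influxdb &&
  docker run -d --name telegraf --network influx-net \
    -v "$PWD/telegraf.conf:/etc/telegraf/telegraf.conf:ro" \
    -e INFLUX_TOKEN telegraf
}

# Usage (INFLUX_TOKEN must be exported; prefix docker with sudo if needed):
#   use_shared_network
```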

Telegraf is running! We have configured it to capture only system information, and every 10 seconds (the interval in the config) Telegraf posts it to InfluxDB. If you stop your VM or containers, you only have to start the influxdb and telegraf containers again. Settings and everything else will persist.


Let’s say you want to change some configuration, for example the bucket where the data goes. The only thing you need to do is stop your container, change the file (on the VM) we made earlier and spin the container back up. Every time you start the container it will read the mounted file. Easy!
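The bucket change described above can also be scripted instead of done by hand in Vim. This is a sketch under the assumption that the config contains a single line of the form bucket = "..." and that the new name contains no | or & characters; the helper name set_bucket and the bucket name servers are made up.

```shell
# Sketch: rewrite the bucket line of a Telegraf config in place.
# Comment lines that merely mention "bucket" are left untouched.
set_bucket() {
  conf=$1; bucket=$2
  tmp=$(mktemp) || return 1
  if sed "s|^\([[:space:]]*bucket = \)\".*\"|\1\"$bucket\"|" "$conf" > "$tmp"; then
    cat "$tmp" > "$conf"   # write back into the same file
  fi
  rm -f "$tmp"
}

# Usage:
#   set_bucket /home/amis/telegraf/telegraf.conf servers
```

Afterwards, restart the Telegraf container so it picks up the changed file.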

Chronograf

Wooooo! We are up and running; now we need to configure a dashboard. Luckily Influx already made a dashboard called “System”. Go to dashboards and click the System dashboard.


Unfortunately in this release they have not yet added the automatic switch from the default bucket to your custom bucket. Luckily it’s quite easy to fix. Click on System and go to your dashboard.

Hover over a panel and click the configure button


The panel settings will pop up. Now we need to make some changes. Go to the script editor if you’re not in it already.

This section is the script edit mode. For more information about Flux, see the Flux documentation.
For now we are only going to change one thing on the first line. The from statement needs a bucket, so please fill in your bucket name.

from(bucket: "$yourbucket")

Click the submit button and use the green button at the top right to save. Do this for every panel, and eventually the whole dashboard will show data from your bucket.

We have now successfully made our own dashboard with metrics from our system. Definitely try to make some graphs and charts yourself!

Resource

  • https://www.influxdata.com/time-series-platform/
  • https://www.influxdata.com/time-series-platform/telegraf/
  • https://en.wikipedia.org/wiki/InfluxDB
  • https://v2.docs.influxdata.com/v2.0/reference/release-notes/
  • https://docs.influxdata.com/flux/v0.12/introduction
  • https://v2.docs.influxdata.com/v2.0/query-data/get-started/
  • https://hub.docker.com/_/telegraf
