To run some experiments and prepare several demonstrations, I needed a locally running Kafka cluster (of a recent release) in combination with a KSQL server instance. Additional components from the core Kafka project and the Confluent Open Source Platform (release 4.1) would be convenient to have. Everything had to run on my Windows laptop.
This article describes how I got what I needed using Vagrant and VirtualBox, Docker and Docker Compose, and two declarative files. One is the Vagrantfile that defines the Ubuntu virtual machine that Vagrant spins up in collaboration with VirtualBox and that will contain Docker and Docker Compose. This file is discussed in more detail in this article: https://technology.amis.nl/2018/05/21/rapidly-spinning-up-a-vm-with-ubuntu-and-docker-on-my-windows-machine-using-vagrant-and-virtualbox/. The file itself can be found here, as a GitHub Gist: https://gist.github.com/lucasjellema/7593677f6d03285236c8f0391f1a78c2.
The second file is the Docker Compose file – which can be found on GitHub as well: https://gist.github.com/lucasjellema/c06f8a790114396f11eadd10434d9b7e . Note: I received great help from Guido Schmutz from Trivadis for this file!
The Docker Compose file is shared into the VM when Vagrant boots up the VM and is executed automatically by the Vagrant docker-compose provisioner.
Alternatively, you can ssh into the VM and execute it manually using these commands:
cd /vagrant
docker-compose up -d
Docker Compose will start all Docker containers configured in this file, in the order determined by the dependencies between the containers. Note: the IP address in this file (192.168.188.102) should correspond with the IP address defined in the Vagrantfile. The two gists currently do not correspond, because the Vagrantfile defines 192.168.188.110 as the IP address for the VM.
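Because the address is hard-coded in several places in the compose file, it can help to list the addresses the file actually contains before comparing them with the Vagrantfile. The following is a minimal sketch and not part of the original setup: the function names list_ips and rewrite_ip are made up for illustration, and it assumes a grep that supports the -o option (GNU and BSD grep do).

```shell
# Print every distinct IPv4-looking address found in a file, one per line.
# Version strings such as 4.1.0 have only three components and do not match.
list_ips() {
  grep -Eo '[0-9]{1,3}(\.[0-9]{1,3}){3}' "$1" | sort -u
}

# Print a copy of a file to stdout with one address replaced by another.
# The dots in the old address are escaped so they only match literal dots.
# The input file itself is left untouched.
rewrite_ip() {
  old_escaped=$(printf '%s' "$1" | sed 's/\./\\./g')
  sed "s/$old_escaped/$2/g" "$3"
}
```

Inside the VM, list_ips /vagrant/docker-compose.yml shows which addresses the compose file uses; running the same function against the Vagrantfile shows the address of the VM. If they differ, rewrite_ip 192.168.188.102 192.168.188.110 /vagrant/docker-compose.yml prints an adjusted copy that can be reviewed and saved under a new name.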
Once Docker Compose has done its thing, all containers configured in the docker-compose.yml file will be running. The Kafka broker is accessible at 192.168.188.102:9092, ZooKeeper at 192.168.188.102:2181 and the REST Proxy at port 8084; the Kafka Connect UI at port 8001, the Schema Registry UI at port 8002 and the KSQL Server at port 8088. The Kafka Manager listens at port 9000.
To run the KSQL command line, use this command to open a shell in the Docker container of the ksql-server service:
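A quick way to confirm that the containers really are listening is to try a TCP connection to each of the ports listed above. This is only a sketch under a few assumptions: the function names endpoints and check_endpoints are invented here, netcat (nc) must be installed on the machine running the check, and an open port says nothing about whether the service behind it is healthy.

```shell
# Print "name host port" for each service port mentioned in the article.
# The host defaults to the IP address used in docker-compose.yml.
endpoints() {
  host="${1:-192.168.188.102}"
  cat <<EOF
kafka-broker $host 9092
zookeeper $host 2181
rest-proxy $host 8084
connect-ui $host 8001
schema-registry-ui $host 8002
ksql-server $host 8088
kafka-manager $host 9000
EOF
}

# Try a TCP connection to every endpoint and report the result.
# Assumes `nc` is available; -z only probes the port, -w 2 sets a 2 second timeout.
check_endpoints() {
  endpoints "$1" | while read -r name host port; do
    if nc -z -w 2 "$host" "$port" </dev/null 2>/dev/null; then
      echo "$name ($host:$port): open"
    else
      echo "$name ($host:$port): not reachable"
    fi
  done
}
```

Calling check_endpoints without an argument probes 192.168.188.102; pass a different address, for example check_endpoints 192.168.188.110, if the VM was given the IP address from the Vagrantfile gist.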
docker exec -it vagrant_ksql-server_1 /bin/bash
Then, inside that container, simply type:
ksql
And, for example, list all topics:
list topics;
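The container name vagrant_ksql-server_1 follows Docker Compose's project_service_index naming, where the project name is taken from the /vagrant directory; started from another directory, the container gets a different name. The sketch below looks the name up instead of hard-coding it. The function names ksql_container and open_ksql_cli are hypothetical, and the second one assumes that ksql can be started directly through docker exec in the same way as from the interactive shell.

```shell
# Print the name of the first running container whose name contains "ksql-server".
ksql_container() {
  docker ps --filter "name=ksql-server" --format '{{.Names}}' | head -n 1
}

# Start the KSQL command line in that container without opening a shell first.
open_ksql_cli() {
  container=$(ksql_container)
  if [ -z "$container" ]; then
    echo "no running ksql-server container found" >&2
    return 1
  fi
  docker exec -it "$container" ksql
}
```

After defining both functions in the VM's shell, open_ksql_cli drops straight into the KSQL prompt, where list topics; can be entered as before.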
Here follows the complete contents of the docker-compose.yml file (largely credited to Guido Schmutz):
version: '2'
services:
  zookeeper:
    image: "confluentinc/cp-zookeeper:4.1.0"
    hostname: zookeeper
    ports:
      - "2181:2181"
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000

  broker-1:
    image: "confluentinc/cp-enterprise-kafka:4.1.0"
    hostname: broker-1
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_BROKER_RACK: rack-a
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_ADVERTISED_HOST_NAME: 192.168.188.102
      KAFKA_ADVERTISED_LISTENERS: 'PLAINTEXT://192.168.188.102:9092'
      KAFKA_METRIC_REPORTERS: io.confluent.metrics.reporter.ConfluentMetricsReporter
      KAFKA_DELETE_TOPIC_ENABLE: "true"
      KAFKA_JMX_PORT: 9999
      KAFKA_JMX_HOSTNAME: 'broker-1'
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      CONFLUENT_METRICS_REPORTER_BOOTSTRAP_SERVERS: broker-1:9092
      CONFLUENT_METRICS_REPORTER_ZOOKEEPER_CONNECT: zookeeper:2181
      CONFLUENT_METRICS_REPORTER_TOPIC_REPLICAS: 1
      CONFLUENT_METRICS_ENABLE: 'true'
      CONFLUENT_SUPPORT_CUSTOMER_ID: 'anonymous'

  schema_registry:
    image: "confluentinc/cp-schema-registry:4.1.0"
    hostname: schema_registry
    container_name: schema_registry
    depends_on:
      - zookeeper
      - broker-1
    ports:
      - "8081:8081"
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema_registry
      SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL: 'zookeeper:2181'
      SCHEMA_REGISTRY_ACCESS_CONTROL_ALLOW_ORIGIN: '*'
      SCHEMA_REGISTRY_ACCESS_CONTROL_ALLOW_METHODS: 'GET,POST,PUT,OPTIONS'

  connect:
    image: confluentinc/cp-kafka-connect:3.3.0
    hostname: connect
    container_name: connect
    depends_on:
      - zookeeper
      - broker-1
      - schema_registry
    ports:
      - "8083:8083"
    environment:
      CONNECT_BOOTSTRAP_SERVERS: 'broker-1:9092'
      CONNECT_REST_ADVERTISED_HOST_NAME: connect
      CONNECT_REST_PORT: 8083
      CONNECT_GROUP_ID: compose-connect-group
      CONNECT_CONFIG_STORAGE_TOPIC: docker-connect-configs
      CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_OFFSET_STORAGE_TOPIC: docker-connect-offsets
      CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_STATUS_STORAGE_TOPIC: docker-connect-status
      CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_KEY_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL: 'http://schema_registry:8081'
      CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: 'http://schema_registry:8081'
      CONNECT_INTERNAL_KEY_CONVERTER: org.apache.kafka.connect.json.JsonConverter
      CONNECT_INTERNAL_VALUE_CONVERTER: org.apache.kafka.connect.json.JsonConverter
      CONNECT_ZOOKEEPER_CONNECT: 'zookeeper:2181'
    volumes:
      - ./kafka-connect:/etc/kafka-connect/jars

  rest-proxy:
    image: confluentinc/cp-kafka-rest
    hostname: rest-proxy
    depends_on:
      - broker-1
      - schema_registry
    ports:
      - "8084:8084"
    environment:
      KAFKA_REST_ZOOKEEPER_CONNECT: '192.168.188.102:2181'
      KAFKA_REST_LISTENERS: 'http://0.0.0.0:8084'
      KAFKA_REST_SCHEMA_REGISTRY_URL: 'http://schema_registry:8081'
      KAFKA_REST_HOST_NAME: 'rest-proxy'

  adminer:
    image: adminer
    ports:
      - 8080:8080

  db:
    image: mujz/pagila
    environment:
      - POSTGRES_PASSWORD=sample
      - POSTGRES_USER=sample
      - POSTGRES_DB=sample

  kafka-manager:
    image: trivadisbds/kafka-manager
    hostname: kafka-manager
    depends_on:
      - zookeeper
    ports:
      - "9000:9000"
    environment:
      ZK_HOSTS: 'zookeeper:2181'
      APPLICATION_SECRET: 'letmein'

  connect-ui:
    image: landoop/kafka-connect-ui
    container_name: connect-ui
    depends_on:
      - connect
    ports:
      - "8001:8000"
    environment:
      - "CONNECT_URL=http://connect:8083"

  schema-registry-ui:
    image: landoop/schema-registry-ui
    hostname: schema-registry-ui
    depends_on:
      - broker-1
      - schema_registry
    ports:
      - "8002:8000"
    environment:
      SCHEMAREGISTRY_URL: 'http://192.168.188.102:8081'

  ksql-server:
    image: "confluentinc/ksql-cli:4.1.0"
    hostname: ksql-server
    ports:
      - '8088:8088'
    depends_on:
      - broker-1
      - schema_registry
    # Note: The container's `run` script will perform the same readiness checks
    # for Kafka and Confluent Schema Registry, but that's ok because they complete fast.
    # The reason we check for readiness here is that we can insert a sleep time
    # for topic creation before we start the application.
    command: "bash -c 'echo Waiting for Kafka to be ready... && \
                       cub kafka-ready -b 192.168.188.102:9092 1 20 && \
                       echo Waiting for Confluent Schema Registry to be ready... && \
                       cub sr-ready schema_registry 8081 20 && \
                       echo Waiting a few seconds for topic creation to finish... && \
                       sleep 2 && \
                       /usr/bin/ksql-server-start /etc/ksql/ksql-server.properties'"
    environment:
      KSQL_CONFIG_DIR: "/etc/ksql"
      KSQL_OPTS: "-Dbootstrap.servers=192.168.188.102:9092 -Dksql.schema.registry.url=http://schema_registry:8081 -Dlisteners=http://0.0.0.0:8088"
      KSQL_LOG4J_OPTS: "-Dlog4j.configuration=file:/etc/ksql/log4j-rolling.properties"
    extra_hosts:
      - "moby:127.0.0.1"
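The compose file mixes images pinned to 4.1.0, one pinned to 3.3.0 (Kafka Connect) and several without any tag, which Docker resolves to latest. To keep an eye on that, the images can be listed straight from the file. This is a rough sketch based on plain text matching rather than real YAML parsing, and the function names list_images and untagged_images are made up for this example.

```shell
# Print the distinct image references from a compose file, without quotes.
list_images() {
  sed -n 's/^[[:space:]]*image:[[:space:]]*//p' "$1" | tr -d "\"'" | sort -u
}

# Print only the images that carry no explicit tag (no colon in the reference).
untagged_images() {
  list_images "$1" | grep -v ':'
}
```

Running untagged_images /vagrant/docker-compose.yml inside the VM shows which containers may change between one docker-compose up and the next, simply because a newer latest image was published.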
Resources
Vagrant File: https://gist.github.com/lucasjellema/7593677f6d03285236c8f0391f1a78c2
Docker Compose file: https://gist.github.com/lucasjellema/c06f8a790114396f11eadd10434d9b7e