One of the session at CodeOne 2018 discussed an upcoming feature for Oracle Database – supported in Release 12.2 and up – that would allow developers to consume Kafka events directly from SQL and PL/SQL and – at a late stage – also publish events from within the database straight to Kafka Topics. This article briefly walks through the feature as outlined in the session by Melli Annamalai, Senior Principal Product Manager at Oracle.
Note: the pictures in this article are a little bit unclear as they are pictures taken of the slides shown in the session.
The first stage of the Kafka support in Oracle Database is around consuming events. The database can be registered as a consumer (group) on a Kafka Topic (on a single, several or on all partitions). It then allows each database application that has an interest in the Kafka Topic to fetch the events and it will keep track of the application’s offset so as to allow easy control over at least once delivery.
The Oracle Database will not support continuous queries or streaming event analysis (like KSQL or Flink do). It makes it easy to receive (by periodically fetching) all events on a Kafka topic of interest.
The Kafka-to-SQL connector as discussed in this article is planned to be available as part of Big Data Connectors (paid for product) and of Autonomous Data Warehouse Cloud. Depending on customer demand, other ways to get hold of the functionality may arise.
The format of the Kafka message payload is described to the database through a table definition: each column is mapped to an element in the messages. CSV and JSON are supported – Avro is considered. At this moment, only flat payload structures (no nested elements) can be handled (similar as with external tables).
Syntax for registering a Kafka Cluster with the database:
‘<Kafka broker URL>:9092’
,’DBMSKAFKA DEFAULT DIR’ ,
‘Testing DBMS KAFKA’);
An example of the syntax required to create views to read messages from a specific Kafka Topic
views created INTEGER;
view_prefix VARCHAR2(128) ;
DBMS_KAFKA.CREATE KAFKA VIEWS
(‘SENS2’ — logical identifier of the Kafka Cluster
, ‘MONITORAPP’ — name of application (aka consumer group) in database
, ‘sensor2’ — name of Kafka Topic
, ‘sensormessages_shape_table’ — name of the database table that describes the message shape
, views_created — number of views created, corresponding to the number of partitions in the Kafka Topic
Two examples of SQL queries to retrieve Kafka messages from the views just created; note that Oracle adds message properties partition, timestamp and offset :
select timestamp, sensorunitid, temperaturereading
These queries do not load any data into the database: the data is retrieved from the Kafka Topic, returned as query result not stored anywhere.
Messages can be loaded directly from the Kafka Topic into a table using a statement like the following:
( ‘SENS2’, ‘LOADAPP’, ‘sensor2’
dbms_output.put_1ine (‘rows loaded: ‘|| rows loaded) ;
Publish to Kafka Topic
At a later stage – version 2 of the connector – support is added for publishing of events to Kafka:
Also on the roadmap is the ability to query messages from a Kafka Topic from a specified timestamp range.
Note: Oracle Big Data SQL also has support for retrieving data from a Kafka Topic, but in a fairly roundabout way; it requires an Hadoop cluster where Hive is running to get the Kafka event data and make that available to the BigData SQL conector.