One of the sessions at CodeOne 2018 discussed an upcoming feature for Oracle Database – supported in Release 12.2 and up – that would allow developers to consume Kafka events directly from SQL and PL/SQL and – at a later stage – also publish events from within the database straight to Kafka Topics. This article briefly walks through the feature as outlined in the session by Melli Annamalai, Senior Principal Product Manager at Oracle.
Note: the pictures in this article are a little unclear, as they are photos of the slides shown during the session.
The first stage of the Kafka support in Oracle Database is around consuming events. The database can be registered as a consumer (group) on a Kafka Topic (on a single partition, on several partitions or on all partitions). Each database application with an interest in the Kafka Topic can then fetch the events, and the database keeps track of each application's offset, allowing easy control over at-least-once delivery.
The Oracle Database will not support continuous queries or streaming event analysis (as KSQL or Flink do). Instead, it makes it easy to receive (by periodically fetching) all events on a Kafka Topic of interest.
The Kafka-to-SQL connector discussed in this article is planned to be available as part of Big Data Connectors (a paid-for product) and of the Autonomous Data Warehouse Cloud. Depending on customer demand, other ways to get hold of the functionality may arise.
The format of the Kafka message payload is described to the database through a table definition: each column is mapped to an element in the messages. CSV and JSON are supported; Avro is being considered. At this moment, only flat payload structures (no nested elements) can be handled (similar to external tables).
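To give an idea of what such a shape table could look like for the sensor example used below, here is a minimal sketch; the column names are taken from the queries shown later in this article, while the data types and the exact DDL the connector expects are assumptions:

-- hypothetical shape table describing the (flat) payload of the 'sensor2' topic;
-- each column maps to one element in the CSV or JSON message
CREATE TABLE sensormessages_shape_table
( sensorunitid       VARCHAR2(50)
, temperaturereading NUMBER(6,2)
);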
Syntax for registering a Kafka Cluster with the database:
BEGIN
  dbms_kafka.register_cluster
  ( 'SENS2'
  , '<Zookeeper URL>:2181'
  , '<Kafka broker URL>:9092'
  , 'DBMSKAFKA_DEFAULT_DIR'
  , 'DBMSKAFKA_LOCATION_DIR'
  , 'Testing DBMS KAFKA'
  );
END;
An example of the syntax required to create views to read messages from a specific Kafka Topic:
DECLARE
  views_created INTEGER;
  view_prefix   VARCHAR2(128);
BEGIN
  dbms_kafka.create_kafka_views
  ( 'SENS2'                      -- logical identifier of the Kafka Cluster
  , 'MONITORAPP'                 -- name of the application (aka consumer group) in the database
  , 'sensor2'                    -- name of the Kafka Topic
  , 'sensormessages_shape_table' -- name of the database table that describes the message shape
  , views_created                -- number of views created, corresponding to the number of partitions in the Kafka Topic
  , view_prefix
  );
END;
Two examples of SQL queries to retrieve Kafka messages from the views just created; note that Oracle adds the message properties partition, timestamp and offset:
select count(*)
from KV_SENS2_MONITORAPP_SENSOR2_0;
select timestamp, sensorunitid, temperaturereading
from KV_SENS2_MONITORAPP_SENSOR2_0;
These queries do not load any data into the database: the data is retrieved from the Kafka Topic and returned as a query result, not stored anywhere.
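Because the Kafka views behave like regular relational views, ordinary SQL analytics can be applied to the (finite) set of messages they return. For example, a simple aggregation per sensor unit; this is a sketch that reuses the view and column names from the queries above, the specific aggregation itself is just an illustration:

select sensorunitid
,      count(*)                as number_of_readings
,      avg(temperaturereading) as avg_temperature
,      max(temperaturereading) as max_temperature
from   KV_SENS2_MONITORAPP_SENSOR2_0
group by sensorunitid;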
Messages can be loaded directly from the Kafka Topic into a table using a statement like the following:
DECLARE
  rows_loaded NUMBER;
BEGIN
  dbms_kafka.load_table
  ( 'SENS2', 'LOADAPP', 'sensor2'
  , 'sensormessages_shape_table', rows_loaded
  );
  dbms_output.put_line('rows loaded: ' || rows_loaded);
END;
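Since the database fetches rather than streams, such a load would typically be executed periodically. Below is a minimal sketch of how this could be scheduled with DBMS_SCHEDULER; the job name and the five-minute interval are illustrative assumptions, not part of the Kafka connector itself:

BEGIN
  dbms_scheduler.create_job
  ( job_name        => 'load_sensor2_events'   -- hypothetical job name
  , job_type        => 'PLSQL_BLOCK'
  , job_action      => q'[DECLARE
                            rows_loaded NUMBER;
                          BEGIN
                            dbms_kafka.load_table
                            ( 'SENS2', 'LOADAPP', 'sensor2'
                            , 'sensormessages_shape_table', rows_loaded
                            );
                          END;]'
  , repeat_interval => 'FREQ=MINUTELY;INTERVAL=5'   -- fetch new messages every five minutes
  , enabled         => TRUE
  );
END;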
Publish to Kafka Topic
At a later stage – in version 2 of the connector – support will be added for publishing events from the database to Kafka.
Also on the roadmap is the ability to query messages from a Kafka Topic from a specified timestamp range.
Note: Oracle Big Data SQL also has support for retrieving data from a Kafka Topic, but in a fairly roundabout way; it requires a Hadoop cluster running Hive to get hold of the Kafka event data and make it available to the Big Data SQL connector.
Good Afternoon Lucas, We are thinking of exploring and building a use case for consuming Kafka topics from our Oracle 12.2 Database. Just wondering if the Kafka-to-SQL connector is available? If so, is it part of the Big Data Connectors, or can we just get the SQL connector? If it's part of the Big Data Connectors, could you please give me the full name of the product we have to order?
Thanks for your time.
Hi Vijay,
The connector I discussed in this article does not seem to have materialized yet. I have not heard anything about it since this session at OOW 2018.
At this point, the way to consume from a Kafka Topic and use Oracle Database as a sink seems to be the Kafka Connect JDBC Sink Connector.
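For reference, a minimal configuration for that connector, writing the messages of one topic into an Oracle table, could look something like the sketch below; the connection details, topic name and table handling options are placeholders and assumptions:

name=oracle-jdbc-sink
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
tasks.max=1
topics=sensor2
connection.url=jdbc:oracle:thin:@//dbhost:1521/ORCLPDB1
connection.user=app_user
connection.password=app_password
auto.create=true
insert.mode=insert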
In the other direction – using Oracle Database as a source – there are some options:
– custom Java, leveraging the Database Change Notification mechanism: https://medium.com/@venkyrao/kafka-oracle-dcn-5703c4ca495f
– Kafka Connect handler in Oracle GoldenGate https://dzone.com/articles/creates-a-cdc-stream-from-oracle-database-to-kafka
– Debezium Connector for Oracle https://debezium.io/documentation/reference/1.1/connectors/oracle.html (note: this requires you to have a license for the GoldenGate product because the connector uses the XStream API)
– Dbvisit Replicate Connector – https://replicate-connector-for-kafka.readthedocs.io/en/latest/source_connector.html
I hope this helps.
best regards,
Lucas
Thanks a lot Lucas.
Appreciate your very quick response. I tried to look up this connector online and did not find any information after your article was published, which is why I asked you the question.
We will try to explore the JDBC sink option for our case here.
Thanks
Vijay
Hi Lucas,
This is a really good post. Is there any template or something like a recipe on how to install or configure "Querying and Publishing Kafka Events from Oracle Database SQL and PL/SQL"?
If so, I would really appreciate it if you could send it to me.
Thanks in advance.
Hi Marcello,
FYI – http://oebsnotes.blogspot.com/2019/01/oracle-to-kafka-made-easy.html
Here it is http://oebsnotes.blogspot.com
Hi Lucas, thanks for attending my session at OOW and writing about it.
Regarding streaming event analysis, it is possible to do analytics using Oracle SQL with this feature. Many SQL analytics functions require the data to be finite, and in that case the analytics have to be done on a window of data. In addition, we are also planning future functionality to include processing of a stream without windowing, perhaps using limited SQL (no aggregation, for example).