In the IoT world, some devices generate large volumes of events that can be difficult for back-end systems to process in real time. You can of course use NiFi to throttle messages, but this will not be sufficient if the flow of events is consistently higher than what can be […]
Big Data
Apache NiFi: Monitoring metrics and provenance events using Azure Log Analytics
There are several cases where you might want to use Azure Log Analytics to monitor your NiFi instances. An obvious one is when NiFi is running in Azure. Azure Log Analytics can also be used as a single monitoring/alerting solution for multiple applications, making operations easier by providing a single interface […]
Apache NiFi: Importing and exporting parameters
When you import a new process group or upgrade an existing one, missing parameter contexts and parameters will automatically be added. The new parameters will be filled with values from the environment where the process group was committed to the Registry (except sensitive parameter values). This is usually a development […]
Apache NiFi: Having fun with Jolt transformations
Jolt is a Java library which can be used to transform JSON to JSON. A Jolt transformation specification itself is also a JSON file. You can use it in products such as Apache NiFi and Apache Camel. In this blog post I’ll describe my first experiences with Jolt transformations. For […]
Apache NiFi: JSON to SOAP
Apache NiFi is a powerful open source integration product. A challenge you might encounter when integrating systems is that one system can produce JSON messages and the other has a SOAP API available. In this blog post I’ll show how you can use NiFi to convert JSON input to a […]
Apache NiFi: Automating tasks using NiPyAPI
Apache NiFi has a powerful web-based interface which provides a seamless experience between design, control, feedback, and monitoring. Sometimes, however, you want to automate tasks instead of doing them manually using the UI. This not only allows you to perform the tasks a lot quicker, but it also helps […]
Apache NiFi: Avoid these common pitfalls
Apache NiFi is an easy-to-use, powerful, and reliable system to process and distribute data. It has a powerful UI which can be used for both development and operations. In addition, the NiFi Registry is available to make it easy to promote software from one environment to the next. In order […]
Merge AVRO schema and generate random data or Java classes
Previously I wrote about generating random data which conforms to an AVRO schema (here). In a recent use-case, I encountered the situation where there were several separate schema files containing different AVRO types. The message used types from those different files. For the generation of random data, I first needed […]
Apache NiFi: Forwarding HTTP headers
Apache NiFi can be used to expose various flavors of webservices. Using NiFi in such a way provides benefits like quick development using a GUI and of course data provenance. You know who called you with which data and where the data went. NiFi is very scalable, delivery can […]
Apache NiFi: Reading COVID data from a REST API and producing it to a Kafka topic
Apache NiFi can be used to accelerate big data projects by allowing easy integration between various data sources. Using Apache NiFi it is easy to track what happened to your data (data provenance) and to provide features like guaranteed ordered delivery and error handling. In this example I’m going to […]
A quick and easy Apache NiFi development environment
Vagrant can be used to quickly create development environments in, for example, VirtualBox, VMware or Hyper-V. I decided to use Vagrant to create a quick Apache NiFi development environment. For Apache NiFi development, you also often require input/output for which Kafka can be used, the NiFi Registry to manage shared […]
What is Apache Drill and how to setup your Proof-of-Concept
Apache Drill is a schema-free SQL query engine. Drill supports a variety of NoSQL databases and file systems, including HBase, MongoDB, MapR-DB, HDFS, MapR-FS, Amazon S3, Azure Blob Storage, Google Cloud Storage, Swift, NAS and local files. A single query can join data from multiple datastores. For example, you can […]
Filesystem events to Elasticsearch / Kibana through Kafka Connect / Kafka
Filesystem events are useful to monitor. They can indicate a security breach. They can also help you understand how a complex system works by looking at the files it reads and writes. When monitoring events, you can expect a lot of data to be generated quickly. The events might be interesting […]
Querying connected data in Graph databases with Neo4j
TL;DR; · Graph databases are ideal for query use cases with data with complex relationships and layers of connections · Its query language is fast, efficient and allows for retrieval of information at deeper levels of abstraction in the data · Neo4j is currently the most popular Graph database, and […]
Some impressions from Oracle Analytics Cloud–taken from keynote at Oracle OpenWorld 2017
In his keynote on October 3rd during Oracle OpenWorld 2017, Thomas Kurian stated that the vision at Oracle around analytics has changed quite considerably. He explained this change and the new vision using this slide. All kinds of data, all kinds of users, many more ways to present and visualize […]
The Hello World of Machine Learning – with Python, Pandas, Jupyter doing Iris classification based on quintessential set of flower data
Plenty of articles describe this hello world of Machine Learning. I will merely list some references and personal notes – primarily for my own convenience. The objective is: get a first hands-on exposure to machine learning – using a well-known example (Iris classification) and using commonly used technology […]
Oracle Service Bus: A quickstart for the Kafka transport
As mentioned in the following blog post by Lucas Jellema, Kafka is going to play a part in several Oracle products. For some use cases it might eventually even replace JMS. In order to allow for easy integration with Kafka, you can use Oracle Service Bus to create a virtualization layer […]
The normalization of Big Data – reporting from Oracle OpenWorld 2016 on Big Data and Data Integration
The importance of data has never been in doubt in the world of Oracle. Through machine learning and predictive analytics as well as real-time streaming data and Big Data, the data spectrum has broadened considerably. With the quickly expanding range of storage and processing facilities, analysis algorithms and visualization means, […]
Talk of the Town at Oracle OpenWorld 2016: Machine Learning & Predictive Analytics
Talk of the town during Oracle OpenWorld 2016 definitely included the term Machine Learning. Machine learning was mentioned in every other session, it seemed. Sometimes fully justified – and sometimes quite far-fetched. Our working definition of Machine Learning for the purpose of this article: software using historic data to […]
Reflections after Oracle OpenWorld 2015 – Business Analytics (Big Data, GoldenGate, OBI (EE), ODI, NoSQL)
Note: I would like to thank Mark Rittman of RittmanMead for sharing many of his findings from Oracle OpenWorld 2015 as well as a comprehensive slide deck. Business Analytics covers the areas of Business Intelligence, Data Discovery and Big Data as well as some of the data gathering and preparation […]
How-to bulk delete ( or archive ) as fast as possible, using minimal undo, redo and temp
Deleting a few rows and deleting tens of millions of rows from an Oracle database should be treated in completely different ways. Though the delete itself is technically the same, maintaining indexes and validating constraints may have such a time- and resource-consuming influence that a vast amount of undo and […]
Gathering data for demo projects – Data Visualization, Pattern Recognition and Data Analysis based on the 2014 Eurovision Song Contest
Having access to useful data to create demonstrations and sample applications can be quite a challenge. Demonstrating the power of data visualizations (for example with ADF DVT) or the capabilities of pattern recognition (such as through Oracle Database 12c Match_Recognize) requires a data set that allows for interesting manipulation and […]
Oracle Database 12c: XQuery Full Text
New in Oracle 12c, and one of the big new features in XMLDB, is the XQuery Full Text functionality which, as mentioned in the post about XQuery Update, is the official W3C standard to handle unstructured pieces of XML content. The XQuery Full Text and XQuery Full Text Index extends […]
Oracle Database 12c: XQuery Update
New, new…? No, not really new, XQuery Update (W3C standard/draft 2011) was already implemented in 11.2.0.3.0, but is now officially also announced. Besides the XQuery Full Text support (XQFT for short, W3C standard/draft 2011), this is one of the big new features in Oracle 12c. With this update XQuery functionality, […]
Hotsos Revisited 2013 – Presentation material
Thanks again to everyone who attended the once again well-filled, informative and enjoyable evening at “Hotsos Revisited 2013”. We presenters enjoyed the atmosphere. For those who would like to read it all again, here is the presentation material from Toon, Jacco, Gerwin, Frits and me… […]
Oracle ADF 11g training, April 15 through 19
From April 15 through 19, Luc Bors will teach the 5-day ADF 11g training at the AMIS office in Nieuwegein. In 5 days you will learn the basics of Oracle ADF 11g. The training consists of presentation, demonstration and hands-on work, interspersed with best practices and examples from […]
Hotsos 2013 – Presentation material “Creating Structure in Unstructured Data”
For those who want another look, or want to share it, here is my presentation “Creating Structure in Unstructured Data”, given during the Hotsos 2013 Symposium on Monday morning. HTH, Marco
Hotsos 2013 – From Unstructured to Structured…
It has been a while since I last attended Hotsos, or at least that is how it feels. In 2011 I flew to Hotsos to see, among others, presentations from Maria Colgan, but I ended up being sick the whole week, learning to enjoy American TV in my hotel room. […]
Hotsos Revisited 2013
From March 3 through 7, the international Oracle performance Hotsos Symposium takes place in Irving, Texas. This year the symposium promises to guarantee high-quality presentations and discussions, because in addition to presentations by Tom Kyte, Cary Millsap, Maria Colgan and Steven Feuerstein on performance, topics are also […]
Basisregistraties Adressen en Gebouwen – Importing Kadaster BAG data into an Oracle Database
Last year I received quite a few questions about whether there was a tool, or a methodology, to get BAG data from the Dutch Kadaster into an Oracle database for all kinds of purposes. Basisregistraties Adressen en Gebouwen (BAG) data is supplied by, among others, the Kadaster in XML files in which all […]
Oracle XML Training With Marco Gralike
I was asked by Jože Senegačnik if I would be interested in doing a Masterclass/Seminar in Slovenia and, yes of course, I really liked the idea. So after having a quick look in my agenda, regarding my free time, we started to set things up. This 2 day […]
Using the Oracle XMLDB Repository to Automatically Shred Windows Office Documents (Part 1)
People who have attended the UKOUG presentation this year where Mark Drake, Sr. Product Manager XML Technologies / XMLDB, Oracle HQ, and I demonstrated the first principles of the XDB Repository, might have been impressed with its (GEO/KML Spatial, Image EXIF info) capabilities combined with Google Earth. This post will […]
UKOUG 2011: Using your Database as a Fileserver
UKOUG 2011 is approaching, and one of the coolest things in Oracle 11g and onwards is, IMHO, a functionality called XDB Repository Events. Most of you probably know that, based on XMLDB functionality in the database, the database can also be used in a file server kind of way by […]
2-day seminar by Steven Feuerstein: Best of Oracle PL/SQL (December 8 and 9)
In this two-day seminar, Steven Feuerstein takes you far beyond the basic capabilities of PL/SQL. During this seminar Steven will cover the best practices that he has collected in dozens of places around the world and that he has also, partly through his close collaboration with Oracle's PL/SQL product team, […]
OOW 2011 – Oracle XMLDB and Big Data
Last day of Oracle Open World and I am currently attending the last presentations. The first presentation, “Oracle XMLDB: A noSQL Approach to Managing all your Unstructured Data”, deals with the no-SQL approach and using Oracle XML DB in the context of using it with “Big Data”, that is unstructured […]
OOW 2011 – NoSQL Databases and Oracle Database Environments
I am currently at a presentation by Patrick Schwanke, Quest Germany, regarding easy and high-speed connectivity between NoSQL and Oracle Databases. Not really what I planned but, as mentioned by Alex Nuijten in an earlier post, unstructured data and its handling is gaining ground, so I thought it would […]