Category: Data Analytics

Lessons Learned: Efficient Retrieval-Augmented Generation (RAG)

Maarten Smeets June 23, 2024 Artificial Intelligence (AI) and LLM, Technology No Comments

Explore key lessons in implementing a Retrieval-Augmented Generation (RAG) system, balancing innovation with practicality for enhanced AI responses.

Run LLMs Locally: Essential Tools and Tips

Maarten Smeets May 21, 2024 Artificial Intelligence (AI) and LLM, Technology No Comments

Running Large Language Models (LLMs) locally offers enhanced privacy, independence, cost effectiveness, and unrestricted use. This guide covers tools like LM Studio and Ollama for setup and hands-on learning.

Building a high-performance API using GPT-4

Romano Schoonheim March 11, 2024 Technology, Artificial Intelligence (AI) and LLM, Software Engineering, Tools No Comments

Can the power of GPT-4 be leveraged to build a high-performance API? In this article we will instruct GPT-4 to build a high-performance API in GoLang using Protocol Buffers …

Using GitHub Copilot in IntelliJ IDEA

Marc Lameriks January 31, 2024 Java, Artificial Intelligence (AI) and LLM No Comments

At the beginning of this month, at AMIS, I attended a Special Interest Group (SIG) meeting about GitHub Copilot, given by one of my colleagues. As we do so …

The Devil’s Dilemma: OpenAI’s Allure Versus the Raw Power of DIY Language Models

Romano Schoonheim January 9, 2024 Artificial Intelligence (AI) and LLM No Comments

In the rapidly evolving world of artificial intelligence, language models stand at the forefront of technological innovation and ethical debate. Among these, OpenAI’s suite of models, including the renowned …

Quickly creating a geo map to illustrate an article–QGIS and Datawrapper (and the PDOK data sets)

Lucas Jellema January 8, 2024 Data Analytics No Comments

Quickly creating a geo-map illustration for an article. Showing off PDOK as geo data source, QGIS as geo data inspector and editor and Datawrapper to create the actual map. …

Prepare custom map data with MapShaper and Present with DataWrapper

Lucas Jellema January 7, 2024 Data Analytics No Comments

In this article I take a custom GeoJSON file that defines some 20 custom locations – the office locations for the Conclusion ecosystem – and process it in the …

PDOK Data Sets in QGIS using QGIS Plugin

Lucas Jellema January 6, 2024 Data Analytics No Comments

PDOK – Publieke Data op Kaart – is a Dutch data service that offers 230+ geo data sets on many different aspects of the public space in The Netherlands. …

Adding GeoJSON features to an OpenStreetMap in Leaflet

Lucas Jellema January 6, 2024 Data Analytics, Frontend technology No Comments

Leaflet is a great JavaScript library for creating map visualizations in web applications. Leaflet is created with that specific purpose in mind. It caters for most of the things …

Using CartoDB and OpenStreetMap in QGIS

Lucas Jellema January 6, 2024 Data Analytics No Comments

Using QGIS to gather, visualize, analyze, edit, process and export Geo Data is fun. I only discovered QGIS fairly recently. And now of course I see it popping up …

Live Inspection of Detailed Railway Data in free QGIS viewer

Lucas Jellema January 3, 2024 Data Analytics 2 Comments

In a recent investigation I stumbled on two valuable things: the free QGIS desktop tool that allows inspection, transformation and analysis of Geo data, and the ProRail ArcGIS Map Server …

Visualize Dutch Rail Tracks on Map – QGIS, GML, PDOK, D3 and more

Lucas Jellema January 3, 2024 Data Analytics, Frontend technology No Comments

My challenge: show Dutch rail tracks on a map. This challenge is composed of several steps: find a dataset that contains geographic data on Dutch railways; convert that data …

Map Visualization of (office locations in) The Netherlands –using GeoJSON, D3 and SVG

Lucas Jellema January 2, 2024 Data Analytics, Frontend technology No Comments

The combination of GeoJSON data sets and the JavaScript d3.js for SVG visualizations is quite valuable for creating rich and interactive visualizations of data that is related to geography …

World Map Data Visualization with d3.js, GeoJSON and SVG–zooming and panning and dragging

Lucas Jellema January 1, 2024 Data Analytics, Frontend technology No Comments

In several recent articles on Data Visualization using a Thematic World Map with color shades assigned to countries based on their value for a specific property in a world …

Presenting the World in Data using World Map Visualization

Lucas Jellema December 30, 2023 Data Analytics, Frontend technology No Comments

This article is a sequel to my introduction to World Map visualization (using d3, SVG and JavaScript). The previous article introduced the visualization of country specific data on a …

Create Interactive World Map to Visualize Country Data

Lucas Jellema December 30, 2023 Data Analytics, JavaScript, Web HTML5 CSS3 No Comments

Data associated with countries is fairly common. Presenting such data in the form of a world map that shows the countries and presents the associated data through a color is …

Using ChatGPT as the best virtual IoT domain expert

Robbrecht van Amerongen August 16, 2023 Artificial Intelligence (AI) and LLM No Comments

With the use of ChatGPT, anyone has access to vast domain knowledge. Even a generalist like me (my colleague would call me a dummy) can act as an expert …

Platys – generate a customized container powered Data Platform environment

Lucas Jellema May 29, 2023 Data Analytics, Platform Technology, Software Engineering No Comments

Platys is a tool created by Guido Schmutz, architect and data specialist at Accenture and frequent teacher at various universities. In order to be able to quickly create an …

Analyze and Visualize data from any cloud and anywhere with Apache Superset and Steampipe

Lucas Jellema February 25, 2023 Data Analytics No Comments

Apache Superset is an open source end user tool for analyzing and visualizing data. Dozens of chart types are available out of the box, dashboards can be created, finetuned …

Getting started working with Apache Superset – the open source data exploration and visualization platform

Lucas Jellema February 25, 2023 Data Analytics No Comments

Apache Superset is an open source platform for data exploration and visualization. It can be described as an open, free alternative to Microsoft PowerBI, Tableau, Qlik and Oracle Analytics …

How Cheppy thrills and accelerates us at AMIS – and what it does not yet do

Lucas Jellema February 16, 2023 AMIS, Artificial Intelligence (AI) and LLM, Digital Innovation, Software Development No Comments

ChatGPT is a bit of a “tongue twister” so I will speak of Cheppy. AMIS has a long history of spotting, exploring, embracing and rolling out new concepts and …

Steampipe–analyze data from cloud, file, platform, IaC using SQL queries

Lucas Jellema May 28, 2022 Database, Cloud, Data Analytics, DevOps 1 Comment

In our daily work we are dealing with data from many sources. Data in CSV files, from Cloud APIs, in mail servers, configuration files, Terraform plans, in logging systems, …

Quickest way to try out Jupyter Notebook: zero install, 3 CLI commands and 5 minutes to action

Lucas Jellema November 20, 2020 Data Analytics 2 Comments

This brief article shows you the quickest way to try out Jupyter Notebooks. It will not try to persuade you that you should try them out. You probably know …

Introducing OCI Data Integration – New cloud native, serverless service for ETL/ELT and Data Pipelines

Lucas Jellema July 9, 2020 Data Analytics No Comments

TL;DR: Oracle offers a new cloud native, serverless service on OCI for data processing and ETL/ELT, called Data Integration. It seems a new incarnation of Oracle Data Integrator or …

Welcoming the Data Catalog Service on Oracle Cloud Infrastructure

Lucas Jellema February 9, 2020 Data Analytics No Comments

Data is really important to any organization. Data tells us what the organization is doing, and where it is going. And how it can improve quality and efficiency of …

Ordering rows in Pandas Data Frame and Bars in Plotly Bar Chart (by day of the week or any other user defined order)

Lucas Jellema October 16, 2019 Data Analytics No Comments

I have time series data in my Pandas Data Frame. And I want to present an aggregation of the data by day of the week in an orderly fashion …

Introduction to Oracle Machine Learning – SQL Notebooks on top of Oracle Cloud Always Free Autonomous Data Warehouse

Lucas Jellema October 12, 2019 Data Analytics No Comments

One of the relatively new features available with Oracle Autonomous Data Warehouse is Oracle Machine Learning Notebook. The description on Oracle’s tutorial site states: “An Oracle Machine Learning notebook …

Convert Groupby Result on Pandas Data Frame into a Data Frame using …. to_frame()

Lucas Jellema October 11, 2019 Data Analytics 6 Comments

It is such a small thing. That you can look for in the docs, on Stackoverflow and in many blog articles. After I have used groupby on a Data …

Dissecting Dutch Death Statistics with Python, Pandas and Plotly in a Jupyter Notebook

Lucas Jellema October 10, 2019 Data Analytics 1 Comment

The CBS (the Dutch Centraal Bureau voor de Statistiek) keeps track of many things in The Netherlands. And shares many of its data sets as open data, typically in the form …

Loading Data into Always Free Oracle Autonomous Data Warehouse Cloud – from JSON and CSV to Database Table

Lucas Jellema October 9, 2019 Cloud, Data Analytics, Database, Oracle Cloud No Comments

In a number of recent articles, I have described how to provision an instance of Oracle Data Warehouse Cloud in Oracle Cloud’s Always Free tier. I have also described …

Oracle Data Visualization Desktop Connecting to Oracle Cloud Always Free Autonomous Database

Lucas Jellema October 8, 2019 Cloud, Data Analytics, Oracle Cloud No Comments

Oracle Cloud now offers the Always Free Tier that comes with an always free Autonomous Data Warehouse (up to 20 GB data storage) as well as a free Autonomous …

Downsizing the Data Set – Resampling and Binning of Time Series and other Data Sets

Lucas Jellema September 16, 2019 Data Analytics No Comments

Data Sets are often too small. We do not have all data that we need in order to interpret, explain, visualize or use for training a meaningful model. However, …

Prepare Jupyter Notebook Workshop Environment through Docker container image and Bootstrap Notebook

Lucas Jellema September 14, 2019 Data Analytics No Comments

Earlier this week, I presented a workshop on Data Analytics. I wanted to provide each of the participants with a fully prepared environment, right on everyone’s own laptop (and …

Determine the Language of a Document from the Letter Frequency – using Levenshtein Distance between sequences

Lucas Jellema August 21, 2019 Data Analytics 2 Comments

Even though many languages share the same or a very similar alphabet, the use of letters in documents written in these languages is quite distinct. The letter “e” …

Tour de France Data Analysis using Strava data in Jupyter Notebook with Python, Pandas and Plotly – Step 2: combining and aligning multi rider data for analyzing and visualizing the Race

Lucas Jellema August 17, 2019 Data Analytics No Comments

In this article, I analyze the race that took place in stage 14 of the 2019 Tour de France in a Jupyter Notebook using Python, Pandas and Plotly and …

Tour de France Data Analysis using Strava data in Jupyter Notebook with Python, Pandas and Plotly – Step 1: single rider loading, exploration, wrangling, visualization

Lucas Jellema August 16, 2019 Data Analytics No Comments

In this article, I will show how to analyze the performance of Steven Kruijswijk during stage 14 of the 2019 Tour de France in a Jupyter Notebook using Python, …

Analyzing the 2019 Tour de France in depth using Strava performance data from Race Riders

Lucas Jellema August 14, 2019 Data Analytics No Comments

This year’s Tour de France was quite a spectacle. Great performances, exciting stages, unexpected events: it had it all. Analyzing the race events as they unfolded during the stages …

Correlation calls for Common Cause Consideration

Lucas Jellema June 10, 2019 Data Analytics No Comments

Correlation is a powerful thing. When two metrics rise and fall in a similar way, surely that cannot be just coincidence. It has to be meaningful in some way. …

Jupyter Notebook for retrieving JSON data from REST APIs

Lucas Jellema April 29, 2019 Data Analytics No Comments

If data is available from REST APIs, Jupyter Notebooks are a fine vehicle for retrieving that data and storing it in a meaningful, processable format. This article introduces an …

The Full Oracle OpenWorld and CodeOne 2018 Conference Session Catalog as JSON data set (for data science purposes)

Lucas Jellema April 26, 2019 Data Analytics No Comments

Oracle OpenWorld and CodeOne 2018 are two co-located conferences that took place in October 2018. Some 2000 sessions presented by over 2500 presenters form the core of these conferences. …
