Apache Superset is an open source end user tool for analyzing and visualizing data. Dozens of chart types are available out of the box, and dashboards can be created, fine-tuned and shared or published. Superset can work with virtually any SQL data source. (I have recently written an introduction to Superset and created […]
Data Analytics
Getting started working with Apache Superset – the open source data exploration and visualization platform
Apache Superset is an open source platform for data exploration and visualization. It can be described as an open, free alternative to Microsoft PowerBI, Tableau, Qlik and Oracle Analytics Desktop. Superset connects (through the SQLAlchemy framework) to dozens of SQL-compliant databases and can work with CSV and JSON […]
Steampipe – analyze data from cloud, file, platform, IaC using SQL queries
In our daily work we are dealing with data from many sources. Data in CSV files, from Cloud APIs, in mail servers, configuration files, Terraform plans, in logging systems, source code repositories and many more. Different formats, access methods, tools. And retrieving data from one such source can be challenging […]
Quickest way to try out Jupyter Notebook: zero install, 3 CLI commands and 5 minutes to action
This brief article shows you the quickest way to try out Jupyter Notebooks. It will not try to persuade you that you should try them out. You probably know that already. It will not show a complex Notebook in detail – plenty of those are available. It simply tells you: […]
Introducing OCI Data Integration – New cloud native, serverless service for ETL/ELT and Data Pipelines
TL;DR: Oracle offers a new cloud native, serverless service on OCI for data processing and ETL/ELT, called Data Integration. It seems to be a new incarnation of Oracle Data Integrator or even of Warehouse Builder. It provides data flows that can filter, convert, join and aggregate. It currently only supports Object Storage […]
Welcoming the Data Catalog Service on Oracle Cloud Infrastructure
Data is really important to any organization. Data tells us what the organization is doing, and where it is going. And how it can improve quality and efficiency of processes. Achieve better results. Data can be one of the key products an organization delivers to its customers. I am sure […]
Ordering rows in Pandas Data Frame and Bars in Plotly Bar Chart (by day of the week or any other user defined order)
I have time series data in my Pandas Data Frame. And I want to present an aggregation of the data by day of the week in an orderly fashion – sorted by day of the week. Not alphabetically, but sorted the way humans would order the days – starting from […]
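The excerpt above describes sorting aggregated rows by weekday rather than alphabetically. One common way to get that ordering in Pandas is an ordered categorical column. The following is only a minimal sketch under assumptions: the sample data, the column names (`timestamp`, `value`, `day`) and the choice of Monday as the first day are hypothetical and not taken from the article.

```python
import pandas as pd

# Hypothetical sample data (not from the article): one value per date.
df = pd.DataFrame({
    "timestamp": pd.to_datetime(
        ["2019-07-01", "2019-07-03", "2019-07-07", "2019-07-08", "2019-07-05"]
    ),
    "value": [10, 20, 30, 40, 50],
})

# The order humans expect; starting on Monday is an assumption.
day_order = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# An ordered categorical makes sorting follow day_order instead of the alphabet.
df["day"] = pd.Categorical(df["timestamp"].dt.day_name(),
                           categories=day_order, ordered=True)

# Aggregate per weekday; observed=True keeps only the days present in the data.
per_day = (
    df.groupby("day", observed=True)["value"]
      .sum()
      .reset_index()
      .sort_values("day")
)
print(per_day)
```

A bar chart built from `per_day` would then typically follow the same row order, although the charting library may need to be told explicitly which category order to use.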
Introduction to Oracle Machine Learning – SQL Notebooks on top of Oracle Cloud Always Free Autonomous Data Warehouse
One of the relatively new features available with Oracle Autonomous Data Warehouse is Oracle Machine Learning Notebook. The description on Oracle’s tutorial site states: “An Oracle Machine Learning notebook is a web-based interface for data analysis, data discovery, and data visualization.” If you are familiar with Jupyter Notebooks (often Python […]
Convert Groupby Result on Pandas Data Frame into a Data Frame using …. to_frame()
It is such a small thing. One that you can look up in the docs, on Stack Overflow and in many blog articles. After I have used groupby on a Data Frame, instead of getting a Series result, I would like to turn the result into a new Data Frame [to continue […]
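As an illustration of the step named in the title, here is a small sketch that aggregates with groupby and converts the resulting Series into a Data Frame with `to_frame()`. The data and the names `city`, `sales` and `total_sales` are hypothetical and chosen only for this example.

```python
import pandas as pd

# Hypothetical data, used only to illustrate the conversion.
df = pd.DataFrame({
    "city": ["A", "B", "A", "B", "A"],
    "sales": [1, 2, 3, 4, 5],
})

# Selecting one column after groupby and aggregating yields a Series
# indexed by the grouping key.
totals = df.groupby("city")["sales"].sum()

# to_frame() turns that Series into a one-column Data Frame;
# reset_index() moves the grouping key back into a regular column.
totals_df = totals.to_frame(name="total_sales").reset_index()
print(totals_df)
```

Whether to call `reset_index()` afterwards depends on whether the grouping key is more convenient as the index or as an ordinary column in the follow-up processing.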
Dissecting Dutch Death Statistics with Python, Pandas and Plotly in a Jupyter Notebook
The CBS (the Dutch Centraal Bureau voor de Statistiek) keeps track of many things in the Netherlands, and shares many of its data sets as open data, typically in the form of JSON, CSV or XML files. One of the data sets it publishes is the one on the number of births […]
Loading Data into Always Free Oracle Autonomous Data Warehouse Cloud – from JSON and CSV to Database Table
In a number of recent articles, I have described how to provision an instance of Oracle Data Warehouse Cloud in Oracle Cloud’s Always Free tier. I have also described how to connect both SQL Developer and Data Visualization Desktop to this ADW instance. In this article, we take this one […]
Oracle Data Visualization Desktop Connecting to Oracle Cloud Always Free Autonomous Database
Oracle Cloud now offers the Always Free Tier that comes with an always free Autonomous Data Warehouse (up to 20 GB data storage) as well as a free Autonomous Database for Transaction Processing. In an earlier article, I described how to provision your own Free Autonomous Data Warehouse in Oracle […]
Downsizing the Data Set – Resampling and Binning of Time Series and other Data Sets
Data Sets are often too small. We do not have all data that we need in order to interpret, explain, visualize or use for training a meaningful model. However, quite often our data sets are too large. Or, more specifically, they have higher resolution than is necessary or even than […]
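To make the two ideas from the title concrete, the sketch below lowers the resolution of a time series by resampling and groups values into bins. It is a hedged illustration only: the synthetic hourly series, the bin edges and the labels `low`, `mid` and `high` are assumptions, not data or choices from the article.

```python
import pandas as pd

# Synthetic hourly series covering two days (hypothetical data).
index = pd.date_range("2019-01-01", periods=48, freq=pd.Timedelta(hours=1))
series = pd.Series(range(48), index=index, dtype=float)

# Resampling: reduce the resolution from hourly values to one mean per day.
daily = series.resample("D").mean()

# Binning: replace exact values by the range they fall into.
# The bin edges and labels are arbitrary choices for this sketch.
binned = pd.cut(series, bins=[-1, 15, 31, 47], labels=["low", "mid", "high"])

print(daily)
print(binned.value_counts())
```

Resampling trades time resolution for fewer rows, while binning trades value precision for fewer distinct values; which of the two fits depends on the analysis at hand.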
Prepare Jupyter Notebook Workshop Environment through Docker container image and Bootstrap Notebook
Earlier this week, I presented a workshop on Data Analytics. I wanted to provide each of the participants with a fully prepared environment, right on everyone’s own laptop (and optionally in a cloud environment such as Katacoda). The environment consisted of Python 3.7, JupyterLab (for Notebooks), many additional Python […]
Determine the Language of a Document from the Letter Frequency – using Levenshtein Distance between sequences
Even though many languages share the same or a very similar alphabet, the use of letters in documents written in these languages is quite distinct. The letter “e” is quite popular, but not the most used letter in every language. In fact, the letter frequency is very specific to […]
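The idea in the title can be sketched as follows: rank the letters of a text by frequency, then compare that ranking with per-language rankings using the Levenshtein (edit) distance. This is a minimal sketch under assumptions, not the article's implementation. The helper names, the `top=10` cut-off and the tiny sample sentences used to build the language profiles are all hypothetical; real profiles would need far larger reference texts.

```python
from collections import Counter


def letter_ranking(text, top=10):
    """Return the `top` most frequent letters as a string, most frequent first."""
    counts = Counter(ch for ch in text.lower() if ch.isalpha())
    # Ties are broken alphabetically so the result is deterministic.
    ranked = sorted(counts, key=lambda ch: (-counts[ch], ch))
    return "".join(ranked[:top])


def levenshtein(a, b):
    """Edit distance between sequences a and b (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def closest_language(text, profiles):
    """Pick the language whose letter ranking is nearest to that of `text`."""
    ranking = letter_ranking(text)
    return min(profiles, key=lambda lang: levenshtein(ranking, profiles[lang]))


# Hypothetical, very small reference texts; for illustration only.
samples = {
    "english": "the quick brown fox jumps over the lazy dog and then rests",
    "dutch": "de snelle bruine vos springt over de luie hond en rust dan uit",
}
profiles = {lang: letter_ranking(sample) for lang, sample in samples.items()}

print(profiles)
print(closest_language("the dog rests over then", profiles))
```

With profiles this small the outcome for short inputs is not reliable; the sketch only shows how the pieces fit together.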
Tour de France Data Analysis using Strava data in Jupyter Notebook with Python, Pandas and Plotly – Step 2: combining and aligning multi rider data for analyzing and visualizing the Race
In this article, I analyze the race that took place in stage 14 of the 2019 Tour de France in a Jupyter Notebook using Python, Pandas and Plotly and based on the Strava performance data published by Steven Kruijswijk, Thomas de Gendt, Thibaut Pinot and Marco Haller. In this previous […]
Tour de France Data Analysis using Strava data in Jupyter Notebook with Python, Pandas and Plotly – Step 1: single rider loading, exploration, wrangling, visualization
In this article, I will show how to analyze the performance of Steven Kruijswijk during stage 14 of the 2019 Tour de France in a Jupyter Notebook using Python, Pandas and Plotly. Strava collects data from athletes regarding their activities – such as running, cycling, walking and hiking. Members can […]
Analyzing the 2019 Tour de France in depth using Strava performance data from Race Riders
This year’s Tour de France was quite a spectacle. Great performances, exciting stages, unexpected events: it had it all. Analyzing the race events as they unfolded during the stages of this year’s Tour is something I am keen to attempt. Using Jupyter Notebooks, Python and Pandas and Plotly for visualization, […]
Correlation calls for Common Cause Consideration
Correlation is a powerful thing. When two metrics rise and fall in a similar way, surely that cannot be just coincidence. It has to be meaningful in some way. In our minds correlation is easily turned into causality. Our minds are wired to think like that: find the narrative in […]
Jupyter Notebook for retrieving JSON data from REST APIs
If data is available from REST APIs, Jupyter Notebooks are a fine vehicle for retrieving that data and storing it in a meaningful, processable format. This article introduces an example of such a dataset: Oracle OpenWorld 2018 was a conference that took place in October 2018 in San Francisco. […]
The Full Oracle OpenWorld and CodeOne 2018 Conference Session Catalog as JSON data set (for data science purposes)
Oracle OpenWorld and CodeOne 2018 are two co-located conferences that took place in October 2018. Some 2000 sessions presented by over 2500 presenters form the core of these conferences. Many details are known about each of the sessions and the speakers – from title, abstract, room (size), date and time, […]
What is Apache Drill and how to set up your Proof-of-Concept
Apache Drill is a schema-free SQL query engine. Drill supports a variety of NoSQL databases and file systems, including HBase, MongoDB, MapR-DB, HDFS, MapR-FS, Amazon S3, Azure Blob Storage, Google Cloud Storage, Swift, NAS and local files. A single query can join data from multiple datastores. For example, you can […]
Running an enriched Jupyter Notebook runtime in a Docker Container – locally or cloud side
Data Wrangling is a crucial stage in the data science workflow. Or in any workflow that starts from raw data and hopes to achieve business insight – and perhaps ready to run well trained machine learning models. Data wrangling encompasses various steps and activities – from gathering raw data, exploring, validating, […]
Oracle Analytics Cloud – Data Flow to produce a Date Value and Timeline to Visualize the time related data
In two earlier articles on Oracle Analytics Cloud I have introduced the Oracle OpenWorld 2018 Session Catalog Data Set (First steps with Oracle Analytics Cloud – Gather, Explore, Wrangle, Visualize and Creating a Data Flow in Oracle Analytics Cloud to enriching with Geo Encoding to Map visualization of data). […]
Creating a Data Flow in Oracle Analytics Cloud to enriching with Geo Encoding to Map visualization of data
In this article, I will show how I have created a Data Flow in Oracle Analytics Cloud to enrich a data set with geocoding data from a different data set, in order to be able to create a map based visualization of data. More specifically: in a previous article I […]
First steps with Oracle Analytics Cloud – Gather, Explore, Wrangle, Visualize
Data analytics is what turns data into business value. Oracle has a long history in Data Analytics – from Oracle Discoverer and its predecessors such as Data Browser through OBI EE and Endeca to several cloud services. Oracle Analytics Cloud is the strategic offering to adopt going forward – not […]
When to use the Oracle Database In-Memory option?
The application and usage of the Oracle Database In-Memory option has been described by Pom Bleeksma in this post. Oracle Database In-Memory can result in a huge improvement in application query performance. This post will answer the question: “what would be an optimal situation for using the Oracle Database In-Memory feature?” The […]
The Performance-button on Oracle Warehouse Builder Design Center
Here is a somewhat embarrassing story about a performance problem with an Oracle Warehouse Builder database (11.2.0.3). Embarrassing, because it took too much time to figure out what was going on. The case: unexpectedly, within a week, the performance of an Oracle Warehouse Builder environment decreased drastically: logging in […]
The AMIS Summary of Oracle OpenWorld 2013 is available for download – 60-page white paper
Oracle OpenWorld is a monster event – 10Ks of attendees, thousands of sessions and 100Ks of private conversations that all help convey and define the message about Oracle’s strategy and the roadmap for its close to 4000 products. Concurrent with OOW is the JavaOne conference that – at a […]
OOW13: summarizing one week and 2000 sessions in 3 hours and a bit – the yearly AMIS OOW Review session – 10th October
On Thursday 10th of October, the 12-strong AMIS delegation at Oracle OpenWorld and JavaOne 2013 will present its findings in a 3-hour session at AMIS HQ in Nieuwegein, The Netherlands. You are welcome to attend this free session (from 16.30 on, food provided). Please register here: http://www.amis.nl/nl-NL/evenementen/technologie-evenementen/oow-review. […]
OOW13: What questions to get answered and plans to see evolved at this year’s Oracle OpenWorld Conference
As I am about to start my ninth Oracle OpenWorld Conference, I am wondering what this year’s conference will have in store for me. My schedule is largely filled up, I know who I am going to meet, where I have to speak and where I need to go. Now […]
The Oracle OpenWorld Preview Event (5 September 2013) – 15 speakers & sessions
From 22 September, the Oracle OpenWorld conference takes place in San Francisco: the event where Oracle lays out its product strategy and where Oracle specialists from all over the world exchange experiences. The JavaOne conference, the meeting point for the worldwide Java community, is held at the same time as Oracle OpenWorld. […]
AYTS: Summary of Oracle Business Intelligence Applications – Customizations
Three months ago, Oracle started the program Are You The Smartest. For me it is an opportunity to test my current knowledge level and to extend my knowledge. After every session I follow, I will write a brief summary as part of the preparation for the test. I will continue with the summary of […]
AYTS: Summary of the Introduction of Oracle Business Intelligence Applications
Three months ago, Oracle started the program Are You The Smartest. For me it is an opportunity to test my current knowledge level and to extend my knowledge. After every session I follow, I will write a brief summary as part of the preparation for the test. I will continue with the summary of […]
Oracle ADF 11g training, 15 to 19 April
From 15 to 19 April, Luc Bors gives the 5-day ADF 11g training at the AMIS office in Nieuwegein. In 5 days you learn the basics of Oracle ADF 11g. The training consists of presentation, demonstration and hands-on work, interspersed with best practices and examples from […]
AMIS summarizes Oracle OpenWorld in a special white paper
To conclude the yearly Oracle OpenWorld conference, AMIS is publishing a white paper: a handy document that bundles the complete story of Oracle OpenWorld 2012. A team from AMIS was prominently present during the conference in October, as sponsor, participant, networker and speaker – and as attentive listener and […]
OOW 2012: The yearly AMIS Review from Oracle Open World and JavaOne – slides available
Yesterday (16th October), 10 days after the end of the yearly Oracle show in San Francisco, AMIS organized its ‘Review from Oracle Open World 2012’ session with an overview of news, trends, announcements, special finds and interesting rumors. This session was ‘sold out’ (even though it was free). For […]
Come and meet AMIS and take part in challenging projects
We hereby invite you to come and get acquainted with us. Are you a (junior) Oracle consultant who wants to take the next step? Do you want to grow and develop into a principal consultant? AMIS gives you the chance to take that step. With us you get the room to […]
OOW 2012 – The Big Stories
The show is over, the visitors are on their way home. The process of digesting the announcements, roadmaps and rumors – confirmed or not – can proceed in full swing. What has become of last year’s plans, what are this year’s plans (for next year and beyond) and what has […]
OOW 2012: Questions to get answered during this conference
The show of the year is around the corner: on Sunday it will all start again, the Oracle Open World conference. Tens of thousands of developers, architects, administrators, project managers, decision makers and others involved with Oracle products one way or another are gathering in and around San Francisco. AMIS […]