Plenty of articles describe this hello world of Machine Learning. I will merely list some references and personal notes – primarily for my own convenience.

The objective: get a first hands-on exposure to machine learning, using a well-known example (Iris classification) and commonly used technology (Python). After this first step, a second step seems logical: doing the same thing with my own set of data.

Useful Resources:

To set up an isolated environment in which to work with Python and friends: How to Create a Linux Virtual Machine For Machine Learning Development With Python 3 (Jason Brownlee) – http://machinelearningmastery.com/linux-virtual-machine-machine-learning-development-python-3/
To work through a well known example of machine learning using Python: Your First Machine Learning Project in Python Step-By-Step (Jason Brownlee) – http://machinelearningmastery.com/machine-learning-in-python-step-by-step/
Machine Learning Notebook – example of step by step data analysis pipeline on Iris data set: https://github.com/rhiever/Data-Analysis-and-Machine-Learning-Projects/blob/master/example-data-science-notebook/Example%20Machine%20Learning%20Notebook.ipynb

Starting time: 6.55 AM

6.55 AM Download and install the latest version of Oracle VirtualBox (5.1.22)

7.00 AM Download Fedora 64-bit ISO image (https://getfedora.org/en/workstation/download/)

7.21 AM Create the Fedora VM and install Fedora Linux on it from the ISO image (creating users root/root and python/python); reboot and complete the installation. Run dnf update (850 MB of updates, 1348 upgrade actions; I regret this step). Install the VirtualBox Guest Additions (non-trivial) using this article: https://fedoramagazine.org/install-fedora-virtualbox-guest/.

8.44 AM Save a Snapshot of the VM to retain its fresh, mint, new car smell condition.

8.45 AM Install Python environment for Machine Learning (Python plus relevant libraries; possibly install Notebook server)
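
A quick way to confirm that the environment is complete is to import each library and print its version. The sketch below is my own minimal check, not code taken from the referenced articles; the list of libraries (SciPy, NumPy, Matplotlib, pandas, scikit-learn) is an assumption about what "relevant libraries" covers, and the helper name library_versions is made up for illustration.

```python
import importlib
import sys

# Import names (not pip package names) of the libraries assumed to be needed.
LIBRARIES = ["scipy", "numpy", "matplotlib", "pandas", "sklearn"]


def library_versions(names):
    """Return a dict mapping each name to its version, or None if the import fails."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
            # Not every module exposes __version__; fall back to a placeholder.
            versions[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    print("Python:", sys.version.split()[0])
    for name, version in library_versions(LIBRARIES).items():
        print("{}: {}".format(name, version or "NOT INSTALLED"))
```

Any library reported as NOT INSTALLED still has to be added before continuing.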

8.55 AM Save another snapshot of the VM in its current state

Now that the environment has been prepared, it is time for the real action, based on the second article in the list of resources.

10.05 AM start on machine learning notebook sample – working through Iris classification

10.15 AM done with sample; that was quick. And pretty impressive.
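
To give an idea of how little code the Iris classification takes, here is a minimal sketch using scikit-learn. It is not the code from the tutorial or the notebook: the choice of a k-nearest-neighbours classifier, the 80/20 split, the seed value and the function name train_and_score are all my own assumptions for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def train_and_score(test_size=0.2, seed=7):
    """Train a k-NN classifier on Iris and return accuracy on the held-out part."""
    iris = load_iris()  # bundled with scikit-learn, no download needed

    # Hold back part of the data so the model is scored on flowers it has not seen.
    X_train, X_test, y_train, y_test = train_test_split(
        iris.data,
        iris.target,
        test_size=test_size,
        random_state=seed,
        stratify=iris.target,  # keep the three species balanced in both parts
    )

    model = KNeighborsClassifier(n_neighbors=5)
    model.fit(X_train, y_train)

    predictions = model.predict(X_test)
    return accuracy_score(y_test, predictions)


if __name__ == "__main__":
    print("Hold-out accuracy: {:.3f}".format(train_and_score()))
```

The accuracy is measured on a single small hold-out set, so it will shift a little with a different seed or split size.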

It seems the Anaconda distribution of Python may be valuable to use. I have downloaded and installed it from https://www.continuum.io/downloads .

Note: to make the contents of a shared Host Directory available to all users:

cd (go to home directory of current user)

mkdir share (in the home directory of the user)

sudo mount -t vboxsf Downloads ~/share/ (this makes the shared folder called Downloads on the VirtualBox host available as the directory share in the Fedora guest)

Let’s see about this thing with Jupyter Notebooks (formerly known as IPython Notebooks). Installing the Jupyter notebook is discussed here: https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch01/README.md . Since I installed Anaconda (4.3.1 for Python 3.6), I have the Jupyter app installed already.

With the following command, I download a number of notebooks:

git clone https://github.com/rhiever/Data-Analysis-and-Machine-Learning-Projects

Let’s try to run one.

cd /home/python/Data-Analysis-and-Machine-Learning-Projects/example-data-science-notebook

jupyter notebook 'Example Machine Learning Notebook.ipynb'

And the notebook opens in my browser.

I can run the notebook, walk through it step by step, edit the notebook’s contents and run the changed steps. Hey mum, I’m a Data Scientist!

Oh, it’s 11.55 AM right now.

Some further interesting reads to get going with Python, Pandas and Jupyter Notebooks – and with data:

10-minute intro to Pandas – data analysis library for Python – http://pandas.pydata.org/pandas-docs/stable/10min.html
Tutorials to get going with Pandas – http://pandas.pydata.org/pandas-docs/stable/tutorials.html
Processing JSON with Python and analyzing the data as well (including map and heatmap) – https://www.dataquest.io/blog/python-json-tutorial/
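
As a small taste of what these Pandas resources cover, the sketch below loads the Iris data into a DataFrame and summarises it per species. It is my own illustration rather than something taken from the linked tutorials; the helper name iris_dataframe and the column name species are assumptions, and the data comes from the copy bundled with scikit-learn.

```python
import pandas as pd
from sklearn.datasets import load_iris


def iris_dataframe():
    """Return the Iris measurements as a DataFrame with a readable species column."""
    iris = load_iris()
    df = pd.DataFrame(iris.data, columns=iris.feature_names)
    # Translate the numeric class codes (0, 1, 2) into species names.
    df["species"] = [iris.target_names[code] for code in iris.target]
    return df


if __name__ == "__main__":
    df = iris_dataframe()
    print(df.head())                     # first few rows
    print(df.describe())                 # summary statistics per measurement
    print(df.groupby("species").mean())  # average measurements per species
```

Running these three calls in a Jupyter notebook cell by cell is a convenient way to get a feel for the data before training any model.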

The Hello World of Machine Learning – with Python, Pandas, Jupyter doing Iris classification based on quintessential set of flower data

About The Author

Lucas Jellema
