AMIS Technology Blog

Continuous Generation and Publication of Docstring Documentation on Azure – using Sphinx, Pydoc, Storage Account and App Service

Sam Vruggink | August 17, 2021 | Cloud, Containers, Deployment, Docker, Microsoft Azure, Python, Software Engineering, Technology

In this blog I will explain how to generate static HTML pages from your project's Pydoc (docstring) comments with Sphinx. We then host these pages in an Azure Web App so that everyone on your team can access them. Because the Web App uses a storage mount, publishing newly generated HTML files only requires replacing them in the storage account; the change is then reflected on the endpoint.

This way you always have a hosted version of the latest documentation. See figure 1 for the architecture.

Figure 1: Architecture

Why?

For big data engineering projects we use Azure Databricks a lot, and we created many Jupyter notebooks that live inside Databricks. Notebooks offer no easy way to reuse code, and plain Jupyter notebooks are hard to test. With a library we can unit test all of our functionality before deploying it, which gives our team a lot of confidence while developing.

Because of these issues, we decided to create a library with all functionality separated into packages. Once it is installed on the cluster, we can easily call that functionality from the notebooks.

When we find a bug in a function, we no longer have to fix it in every notebook that contains the same (duplicated) code. We only have to fix the library and install it on our cluster (with CI/CD).

While programming in notebooks we can instantly see the definition, parameters and explanation of every function because we use Pydoc. But when we want to search for a certain function, or get an overview of all functionality, we use the hosted version of our documentation.

Please read below on how to achieve this.

Repository

All code and settings you need for this blog are located in this repository: https://github.com/samvruggink/hosting-sphinx-docs-in-azure-webapp-blog

Step 1: Pydoc (docstring)

First of all we need to document our functions. We use Pydoc (docstring) comments for this, written here in the Google docstring style, which lets us document our code in an easy way. See the code block below for an example.

def plusOne(number: int) -> int:
    """[summary]

    Args:
        number (int): [description]

    Returns:
        int: [description]
    """ 

The first part is the summary where you can give a short description of what the function actually does. Afterwards you can add a description to all the args. If you specify a return value, add a description to it as well. Below is an example of a function where this is implemented.

import logging

from pyspark.sql import DataFrame, SparkSession

logger = logging.getLogger(__name__)


def read_parquet(spark: SparkSession, path: str) -> DataFrame:
    """Reads a parquet file or directory and returns a DataFrame

    Args:
        spark (SparkSession): SparkSession
        path (str): path of the input file/dir

    Returns:
        DataFrame: A DataFrame with parquet data
    """
    df = spark.read.format("parquet").load(path, inferSchema=True)
    # Log the type that was read, so an unexpected result shows up in the logs
    if isinstance(df, DataFrame):
        logger.info(f"Read parquet : {type(df).__name__}")
    else:
        logger.error(
            f"Is an instance of : {type(df).__name__}, not a DataFrame, exiting now !"
        )
    return df
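
Docstrings are more than comments: Python stores them on the function object, and that is where tools such as help() and Sphinx read them from. The sketch below uses a hypothetical plus_one function (a filled-in version of the template above, not code from the repository) to show how to inspect a docstring:

```python
import inspect


def plus_one(number: int) -> int:
    """Adds one to the given number.

    Args:
        number (int): The number to increment.

    Returns:
        int: The input number plus one.
    """
    return number + 1


# inspect.getdoc returns the docstring with its indentation cleaned up.
doc = inspect.getdoc(plus_one)

# The first line is the summary that documentation tools show in overviews.
print(doc.splitlines()[0])
```

If the summary line is missing or still reads [summary], the generated overview pages will show that placeholder as well, so it is worth checking this before building the documentation.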

When you have all your functions documented, it’s time to generate the Sphinx documentation.

Step 2: Generate Sphinx static HTML from your Pydoc definitions

Sphinx is a great library for generating static HTML files from Pydoc comments. It is highly customizable, which also makes it a bit more complex. The guide below explains how to generate static HTML files from your src folder using a standard template.

This is our project structure:

Demo-project-sphinx-doc
|-- src
|   |-- __init__.py
|   |-- foo.py
|   |-- bar.py
|-- test
|   |-- __init__.py
|   |-- test_foo.py
|   |-- test_bar.py
|-- source
|   |-- index.rst
|   |-- conf.py
|   |-- _templates
|       |-- custom-module-template.rst
|       |-- custom-class-template.rst
|-- Makefile
|-- make.bat

We want to generate documentation from our functions in foo.py and bar.py. First you need to install Sphinx on your computer, for which you need pip. pip is a package manager for Python (comparable to Maven, npm or NuGet). Download instructions and more information can be found in the pip documentation.

In the project root execute the following commands:

pip install sphinx
sphinx-quickstart

sphinx-quickstart generates the basic configuration files. We keep the default name for the source directory, but you can change it. When it asks whether to separate the source and build directories, type “y”.

This will give you the following structure (see figure 2)

Figure 2: Project structure

Now we are going to change some Sphinx settings in order to generate our static HTML files. Change your conf.py to the following:

import os
import sys

sys.path.insert(0, os.path.abspath(".."))

# -- Project information -----------------------------------------------------

project = "demo-project-sphinx-doc"
copyright = "2021, sam"
author = "sam"

# -- General configuration ---------------------------------------------------

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.

extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.autosummary",
]

autosummary_generate = True  # Turn on sphinx.ext.autosummary
# Add any paths that contain templates here, relative to this directory.
templates_path = ["_templates"]

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = []

# -- Options for HTML output -------------------------------------------------

# The theme to use for HTML and HTML Help pages.  See the documentation for
# a list of builtin themes.
#
html_theme = "alabaster"

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ["_static"]


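We add two extensions for automatic generation of the static HTML files: autodoc, which imports the modules and reads their docstrings, and autosummary, which generates the summary pages. We also set the templates path, and the sys.path.insert line makes sure Sphinx knows where to find the src directory. The docstrings in step 1 use the Google style; to have the Args and Returns sections rendered as structured fields, you can additionally enable the sphinx.ext.napoleon extension.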
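
Because autodoc imports your code, the package has to be importable from the folder that is put on sys.path. The sketch below mimics that check outside of Sphinx. It builds a throwaway project with a hypothetical package called demo_pkg (used instead of src so it cannot clash with anything on your machine) and tests whether it can be imported once the project root is on the path. The can_import helper is an illustration, not part of Sphinx.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def can_import(module_name: str, project_root: Path) -> bool:
    """Return True if module_name can be imported with project_root on sys.path."""
    root = str(project_root)
    sys.path.insert(0, root)  # the same trick conf.py uses
    try:
        importlib.invalidate_caches()
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False
    finally:
        sys.path.remove(root)


# Build a throwaway project that contains one package with one module.
with tempfile.TemporaryDirectory() as tmp:
    project_root = Path(tmp)
    package_dir = project_root / "demo_pkg"
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("")
    (package_dir / "foo.py").write_text(
        'def plus_one(n):\n    """Adds one."""\n    return n + 1\n'
    )
    importable = can_import("demo_pkg.foo", project_root)

print(importable)
```

If a check like this fails for your own package, autodoc will most likely produce import warnings and empty pages, and the path passed to sys.path.insert is the first thing to look at.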

Now open index.rst and replace its contents with the following:

Welcome to demo-project-sphinx-doc's documentation!
===================================================

.. autosummary::
   :toctree: _autosummary
   :template: custom-module-template.rst
   :recursive:

   src

.. toctree::
   :maxdepth: 2
   :caption: Contents:



Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`

The :recursive: option makes sure that a nested structure in the src folder is discovered automatically. The entry listed under the autosummary directive should be the top-level package to document, here src. For each module, autosummary then summarises every attribute, function, class and exception in that module.
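
To get a feeling for what recursive discovery means, the sketch below walks a package with the standard library's pkgutil module and lists every module found under it. It is an illustration of the idea, not Sphinx's own code. The standard library json package stands in for src, and the list_submodules helper is a hypothetical name.

```python
import importlib
import pkgutil
from typing import List


def list_submodules(package_name: str) -> List[str]:
    """Return the dotted names of all modules found under a package."""
    package = importlib.import_module(package_name)
    # Only packages have __path__; a plain module has nothing to recurse into.
    search_path = getattr(package, "__path__", None)
    if search_path is None:
        return []
    return sorted(
        info.name
        for info in pkgutil.walk_packages(search_path, prefix=package_name + ".")
    )


# The standard library json package stands in for src here.
for name in list_submodules("json"):
    print(name)
```

Running the same function against your own package, with the project root on sys.path, gives a quick overview of the modules you can expect to see in the generated documentation.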

Now we need templates that tell autosummary how to render each module and class. Add two files to the _templates folder:

custom-module-template.rst

{{ fullname | escape | underline}}

.. automodule:: {{ fullname }}
  
   {% block attributes %}
   {% if attributes %}
   .. rubric:: Module Attributes

   .. autosummary::
      :toctree:                                          
   {% for item in attributes %}
      {{ item }}
   {%- endfor %}
   {% endif %}
   {% endblock %}

   {% block functions %}
   {% if functions %}
   .. rubric:: {{ _('Functions') }}

   .. autosummary::
      :toctree:                                         
   {% for item in functions %}
      {{ item }}
   {%- endfor %}
   {% endif %}
   {% endblock %}

   {% block classes %}
   {% if classes %}
   .. rubric:: {{ _('Classes') }}

   .. autosummary::
      :toctree:                                         
      :template: custom-class-template.rst             
   {% for item in classes %}
      {{ item }}
   {%- endfor %}
   {% endif %}
   {% endblock %}

   {% block exceptions %}
   {% if exceptions %}
   .. rubric:: {{ _('Exceptions') }}

   .. autosummary::
      :toctree:                                         
   {% for item in exceptions %}
      {{ item }}
   {%- endfor %}
   {% endif %}
   {% endblock %}

{% block modules %}
{% if modules %}
.. rubric:: Modules

.. autosummary::
   :toctree:
   :template: custom-module-template.rst               
   :recursive:
{% for item in modules %}
   {{ item }}
{%- endfor %}
{% endif %}
{% endblock %}

custom-class-template.rst

{{ fullname | escape | underline}}

.. currentmodule:: {{ module }}

.. autoclass:: {{ objname }}
   :members:                                    
   :show-inheritance:                          
   :inherited-members:                          

   {% block methods %}
   .. automethod:: __init__

   {% if methods %}
   .. rubric:: {{ _('Methods') }}

   .. autosummary::
   {% for item in methods %}
      ~{{ name }}.{{ item }}
   {%- endfor %}
   {% endif %}
   {% endblock %}

   {% block attributes %}
   {% if attributes %}
   .. rubric:: {{ _('Attributes') }}

   .. autosummary::
   {% for item in attributes %}
      ~{{ name }}.{{ item }}
   {%- endfor %}
   {% endif %}
   {% endblock %}

Now we can run the command make clean html from the project root to generate our documentation. If you go into build/html/ and open index.html you will see the following:

Figure 3
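
If you prefer to trigger the build from Python, for example from a CI script, Sphinx can also be run as a module with python -m sphinx. The sketch below assumes the source and build/html folder names used above. The helper names are hypothetical, and unlike make clean html it does not remove an earlier build first.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def sphinx_build_command(source_dir: str = "source", build_dir: str = "build/html") -> List[str]:
    """Build the argument list for an HTML build with the current interpreter."""
    return [sys.executable, "-m", "sphinx", "-b", "html", source_dir, build_dir]


def build_docs(source_dir: str = "source", build_dir: str = "build/html") -> int:
    """Run the Sphinx HTML build and return its exit code."""
    # Fail early with a clear message when the folder is not a Sphinx source dir.
    if not (Path(source_dir) / "conf.py").is_file():
        raise FileNotFoundError(f"No conf.py found in {source_dir!r}")
    completed = subprocess.run(sphinx_build_command(source_dir, build_dir), check=False)
    return completed.returncode


# Show the command that would be executed; call build_docs() to actually run it.
print(" ".join(sphinx_build_command()))
```

A pipeline step can call build_docs() and fail the run when the returned exit code is not zero.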

Step 3: Host the static HTML in an Azure Web App

In our own environment we deploy everything using CI/CD. We deploy resources using Terraform and pipelines.

Because this is a blog post, I will show you how to host the documentation while provisioning everything by hand.

What we need to provision:

  • Storage Account
    • StorageV2 (general purpose v2)
    • Standard/Hot data
  • Linux App Service plan
    • A cheap tier to test with is B1
  • Web App
    • Docker Container running on Linux
Figure 4: Architecture

I created a video on how to actually do this. Please view the video below.

Video on how to deploy and configure the Azure resources
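
Once the resources exist, publishing a new version of the documentation comes down to replacing the files in the storage account, as described in the introduction. The sketch below shows only the generic part of such a step: it walks the build folder and hands every file, together with its relative path, to an upload function. The upload function is a placeholder you would implement with the Azure tooling of your choice; the names iter_build_files and publish are hypothetical, and the demo uses a fake upload that only records paths.

```python
import tempfile
from pathlib import Path
from typing import Callable, Iterator, List, Tuple


def iter_build_files(build_dir: Path) -> Iterator[Tuple[Path, str]]:
    """Yield (local_path, remote_path) pairs for every file under build_dir."""
    for path in sorted(build_dir.rglob("*")):
        if path.is_file():
            # Use forward slashes so the remote path is the same on every OS.
            yield path, path.relative_to(build_dir).as_posix()


def publish(build_dir: Path, upload: Callable[[Path, str], None]) -> int:
    """Call upload() for every generated file and return the number of files."""
    count = 0
    for local_path, remote_path in iter_build_files(build_dir):
        upload(local_path, remote_path)
        count += 1
    return count


# Demo with a throwaway build folder and a fake upload that only records paths.
uploaded: List[str] = []
with tempfile.TemporaryDirectory() as tmp:
    build_dir = Path(tmp)
    (build_dir / "_static").mkdir()
    (build_dir / "index.html").write_text("<html></html>")
    (build_dir / "_static" / "style.css").write_text("body {}")
    total = publish(build_dir, lambda local, remote: uploaded.append(remote))

print(total, uploaded)
```

Keeping the upload behind a plain function makes the step easy to test locally and leaves the choice of Azure client or command line tool open.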

Thanks for reading my blog. Leave a comment if you have any questions.


About The Author

Sam Vruggink

Consultant at AMIS Conclusion.
