Python and OpenCV: Batch resize and pad images

Maarten Smeets

When using AI models like Stable Diffusion, input images sometimes need to be of a specific size. Stable Diffusion requires dimensions that are multiples of 64, and version 1.5 at least works best with images that are 512 pixels in width or height. Take an image of 599×205 pixels: resizing it so that the height becomes 512 while maintaining the aspect ratio gives 1496×512. The width of 1496 is not a multiple of 64, so the image needs to be padded to 1536×512 before the model can process it. To automate this for a large number of images, I wrote a Python script using OpenCV. I chose OpenCV because it is widely used, easy to use, and has powerful capabilities.
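The arithmetic in this example can be checked with a few lines of plain Python. This is a minimal sketch, not part of the script itself; the function name `target_size` and its parameters are made up for illustration.

```python
import math


def target_size(width, height, short_side=512, multiple=64):
    """Return ((resized_w, resized_h), (padded_w, padded_h)) for an image."""
    # Scale so that the shorter side becomes `short_side`, keeping the aspect ratio.
    scale = short_side / min(width, height)
    resized_w = round(width * scale)
    resized_h = round(height * scale)
    # Round each dimension up to the next multiple of `multiple`.
    padded_w = math.ceil(resized_w / multiple) * multiple
    padded_h = math.ceil(resized_h / multiple) * multiple
    return (resized_w, resized_h), (padded_w, padded_h)


print(target_size(599, 205))  # ((1496, 512), (1536, 512))
```

Because the shorter side always ends up at 512, which is already a multiple of 64, only the longer side ever needs padding.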

Batch resize and pad images

First you need to install OpenCV. This can be done with: “pip install opencv-python” or “conda install -c conda-forge opencv” (whichever package manager you prefer).

The script is straightforward and can be viewed below. It demonstrates how OpenCV can be used to perform basic tasks such as opening a file, resizing, padding, and writing the output back to the filesystem.

OpenCV does not support the AVIF file format, which is why I included the option to exclude extensions. This allows processing directories with mixed content, for example obtained by scraping the web. Most other image file formats are supported (such as JPEG, PNG, BMP, TIFF and WEBP).
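As an illustration of the exclusion idea, here is a small sketch using only the standard library. The names `EXCLUDED`, `is_excluded` and `list_images` are hypothetical and do not appear in the script. Unlike the script, the comparison here is case-insensitive, which is an assumption about what you might want for scraped files.

```python
from pathlib import Path

EXCLUDED = {"avif"}  # extensions to skip: lowercase, without the dot


def is_excluded(path, excluded=EXCLUDED):
    """True if the file extension (compared case-insensitively) is excluded."""
    return Path(path).suffix.lower().lstrip(".") in excluded


def list_images(directory, excluded=EXCLUDED):
    """Return the regular files in `directory` whose extension is not excluded."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and not is_excluded(p, excluded)
    )
```

A file named `photo.AVIF` would be skipped by this check, while `photo.png` or a file without any extension would be passed on for processing.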

Output images are written as [original filename without extension].png, with spaces in the filename replaced by underscores. OpenCV automatically determines the file format based on the extension, so you do not need to specify it explicitly.
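Deriving the output name can be sketched with `pathlib` as follows. The function `output_path` and its `suffix` parameter are assumptions for illustration: pass `suffix="_scaled"` if you want that marker in the name, or leave it empty to only swap the extension.

```python
from pathlib import Path


def output_path(input_path, output_dir, suffix="", output_ext="png"):
    """Map an input file to an output path with a new extension."""
    # Path.stem drops only the last extension, so "a.b.jpg" becomes "a.b".
    stem = Path(input_path).stem.replace(" ", "_")
    return Path(output_dir) / f"{stem}{suffix}.{output_ext}"


print(output_path("in/my photo.jpeg", "out"))                # out/my_photo.png
print(output_path("in/a.b.jpg", "out", suffix="_scaled").name)  # a.b_scaled.png
```

Only the file name is touched here, so an output directory that contains spaces is left as it is.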

I chose white as the padding color, since the Stable Diffusion WebUI tool uses a black mask; this way I can easily see what has been masked and what hasn’t. I pad only to the right and to the bottom since, in my experience, those are the areas for which inpainting is usually most useful.
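To make the right-and-bottom padding concrete, here is a dependency-free sketch on a tiny grid of pixels. It is not how the script does it: the script relies on `cv2.copyMakeBorder` with a top and left border of 0. The helper `pad_right_bottom` is hypothetical and only mimics that effect on nested lists.

```python
WHITE = (255, 255, 255)


def pad_right_bottom(pixels, pad_right, pad_bottom, color=WHITE):
    """Pad a row-major grid of pixels on the right and bottom only."""
    width = len(pixels[0]) if pixels else 0
    # Extend every existing row to the right.
    padded = [list(row) + [color] * pad_right for row in pixels]
    # Append full-width rows at the bottom.
    padded += [[color] * (width + pad_right) for _ in range(pad_bottom)]
    return padded


image = [[(0, 0, 0), (10, 10, 10)]]  # 1 row, 2 columns
result = pad_right_bottom(image, pad_right=1, pad_bottom=1)
print(len(result), len(result[0]))  # 2 3
```

The original pixels stay in the top-left corner, and everything that was added is white.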

import os
import cv2


input_directory = ''
output_directory = ''
output_ext = "png"
excluded = ['avif']


def get_file_ext(filename):
    # os.path.splitext returns a (root, extension) tuple; drop the leading dot
    file_extension = os.path.splitext(filename)[1]
    return file_extension[1:]


def get_output_filename(filename):
    # swap the extension for output_ext and replace spaces in the file name only
    root = os.path.splitext(os.path.basename(filename))[0]
    return os.path.join(output_directory, root.replace(" ", "_") + "." + output_ext)


# gets the smallest multiple of 64 that is >= number, starting at 512
def get_nearest_larger(number):
    nearest_larger = 512
    while nearest_larger < number:
        nearest_larger = nearest_larger + 64
    return nearest_larger


def processFile(filename):
    print("Process image "+filename)
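    img = cv2.imread(filename, cv2.IMREAD_COLOR)
    # imread returns None when the file cannot be read or decoded
    if img is None:
        print("Skipping unreadable image "+filename)
        return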
    height = img.shape[0]
    width = img.shape[1]
    if height > width:
        preferred_width = 512
        preferred_height = round(preferred_width / width * height)
        pad_bot = get_nearest_larger(preferred_height) - preferred_height
        pad_right = 0
    else:
        preferred_height = 512
        preferred_width = round(preferred_height / height * width)
        pad_bot = 0
        pad_right = get_nearest_larger(preferred_width) - preferred_width

    img_new = cv2.resize(img, (preferred_width, preferred_height))
    # pad only to the right and bottom, in white (255, 255, 255 in BGR order)
    img_new_padded = cv2.copyMakeBorder(img_new, 0, pad_bot, 0, pad_right, borderType=cv2.BORDER_CONSTANT, value=(255, 255, 255))
    cv2.imwrite(get_output_filename(filename), img_new_padded)
    print("Outputfile: "+get_output_filename(filename))


def processDirectory(dir_name):
    for filename in os.listdir(dir_name):
        f = os.path.join(dir_name, filename)
        # checking if it is a file
        if os.path.isfile(f):
            ext = get_file_ext(f)
            if ext.lower() not in excluded:
                processFile(f)


if __name__ == '__main__':
    processDirectory(input_directory)

About Post Author

Maarten Smeets

Maarten is a Software Architect at AMIS Conclusion. Over the past years he has worked for numerous customers in the Netherlands in developer, analyst and architect roles on topics like software delivery, performance, security and other integration related challenges. Maarten is passionate about his job and likes to share his knowledge through publications, frequent blogging and presentations.