When using AI models like Stable Diffusion, input images sometimes need to be of a specific size. In the case of Stable Diffusion, both dimensions must be multiples of 64, and Stable Diffusion (at least version 1.5) works best with images of 512 pixels in width or height. If you have an image whose dimensions do not fit, like 599×205 pixels, you can maintain the aspect ratio by scaling the shorter side to 512, which gives 1496×512. However, 1496 is not a multiple of 64, so the image needs to be padded to 1536×512 in order to be processed by the model. To automate this process for a large number of images, I wrote a Python script using OpenCV. I chose OpenCV because it is a widely used, easy-to-use library with powerful capabilities.
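The arithmetic behind the 599×205 example can be sketched as a small standalone function. This is only an illustration of the calculation described above; the name target_sizes and its parameters are my own and are not part of the script.

```python
def target_sizes(width, height, short_side=512, multiple=64):
    """Return ((resized_w, resized_h), (padded_w, padded_h)) for one image."""
    # Scale the shorter side to `short_side` and keep the aspect ratio.
    if height > width:
        new_w = short_side
        new_h = round(short_side / width * height)
    else:
        new_h = short_side
        new_w = round(short_side / height * width)

    # Round each side up to the next multiple of `multiple` (integer arithmetic).
    def round_up(value):
        return (value + multiple - 1) // multiple * multiple

    return (new_w, new_h), (round_up(new_w), round_up(new_h))


resized, padded = target_sizes(599, 205)
print(resized, padded)  # (1496, 512) (1536, 512)
```

The difference between the padded and the resized size (here 40 pixels in width) is the amount of padding that has to be added.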
Batch resize and pad images
First you need to install OpenCV. This can be done with: “pip install opencv-python” or “conda install -c conda-forge opencv” (whichever package manager you prefer).
The script is straightforward and can be viewed below. It demonstrates how OpenCV can be used to perform basic tasks such as opening a file, resizing, padding, and writing the output back to the filesystem.
At the time of writing, OpenCV does not support the AVIF file format, which is why I included an option to exclude extensions. This makes it possible to process directories with mixed content, for example ones obtained by scraping the web. Most other image file formats are supported though (such as JPEG, PNG, BMP, TIFF, WebP and others).
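As a rough sketch of the exclusion idea, the filter below lists only the files whose extension is not excluded. It uses pathlib instead of the os calls in the script, and the function name list_processable is hypothetical. It compares extensions case-insensitively, so that a file like photo.AVIF is skipped as well.

```python
from pathlib import Path


def list_processable(directory, excluded=("avif",)):
    """Return the files in `directory` whose extension is not in `excluded`."""
    # Normalise the excluded extensions: lowercase, no leading dot.
    skip = {ext.lower().lstrip(".") for ext in excluded}
    files = []
    for path in sorted(Path(directory).iterdir()):
        # Ignore subdirectories and compare the extension case-insensitively.
        if path.is_file() and path.suffix.lower().lstrip(".") not in skip:
            files.append(path)
    return files
```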
Output images are saved as [original filename without extension]_scaled.png, with spaces in the name replaced by underscores. OpenCV automatically determines the file format based on the extension, so you do not need to specify it explicitly.
I chose white for the padding because the Stable Diffusion WebUI tool uses a black mask, so I can easily see what has been masked and what has not. I also pad only to the right and to the bottom since, in my experience, those are the areas where inpainting is usually most useful.
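To make the padding placement concrete: cv2.copyMakeBorder takes the border sizes in the order top, bottom, left, right. The helper below is a hypothetical sketch (border_widths and its centered option are my own additions for illustration, not part of the script) that turns the required extra width and height into those four values.

```python
def border_widths(pad_width, pad_height, centered=False):
    """Return (top, bottom, left, right) border sizes in pixels."""
    if centered:
        # Split the padding over both sides; an odd pixel goes to bottom/right.
        top = pad_height // 2
        left = pad_width // 2
        return top, pad_height - top, left, pad_width - left
    # Default: pad only at the bottom and at the right.
    return 0, pad_height, 0, pad_width


top, bottom, left, right = border_widths(40, 0)
print(top, bottom, left, right)  # 0 0 0 40
# These values would then be passed on, for example:
# cv2.copyMakeBorder(img, top, bottom, left, right,
#                    borderType=cv2.BORDER_CONSTANT, value=(255, 255, 255))
```

Padding only to the right and bottom keeps the original image anchored at the top-left corner, so pixel coordinates in the resized image stay valid after padding.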
import os
import cv2

# Set these before running the script.
input_directory = ''
output_directory = ''
output_ext = "png"
# Extensions (without the dot) that should be skipped.
excluded = ['avif']

def get_file_ext(filename):
    # os.path.splitext returns a tuple of root and extension; drop the leading dot
    return os.path.splitext(filename)[1][1:]

def get_output_filename(filename):
    # builds "<name>_scaled.<output_ext>" in the output directory, spaces become underscores
    stem = os.path.splitext(os.path.basename(filename))[0]
    new_filename = os.path.join(output_directory, stem.replace(" ", "_") + "_scaled." + output_ext)
    return new_filename

# gets the nearest larger multiple of 64, starting with 512
def get_nearest_larger(number):
    nearest_larger = 512
    while nearest_larger < number:
        nearest_larger = nearest_larger + 64
    return nearest_larger

def processFile(filename):
    print("Process image " + filename)
    img = cv2.imread(filename, cv2.IMREAD_COLOR)
    if img is None:
        # cv2.imread returns None when the file cannot be read or decoded
        print("Skipping unreadable file: " + filename)
        return
    height = img.shape[0]
    width = img.shape[1]
    # scale the shorter side to 512 and pad the longer side up to a multiple of 64
    if height > width:
        preferred_width = 512
        preferred_height = round(preferred_width / width * height)
        pad_bot = get_nearest_larger(preferred_height) - preferred_height
        pad_right = 0
    else:
        preferred_height = 512
        preferred_width = round(preferred_height / height * width)
        pad_bot = 0
        pad_right = get_nearest_larger(preferred_width) - preferred_width
    img_new = cv2.resize(img, (preferred_width, preferred_height))
    # pad only at the bottom and at the right, in white (BGR 255, 255, 255)
    img_new_padded = cv2.copyMakeBorder(img_new, 0, pad_bot, 0, pad_right, borderType=cv2.BORDER_CONSTANT, value=(255, 255, 255))
    output_filename = get_output_filename(filename)
    cv2.imwrite(output_filename, img_new_padded)
    print("Output file: " + output_filename)

def processDirectory(dir_name):
    for filename in os.listdir(dir_name):
        f = os.path.join(dir_name, filename)
        # checking if it is a file
        if os.path.isfile(f):
            ext = get_file_ext(f)
            # compare in lowercase so that e.g. "AVIF" is excluded as well
            if ext.lower() not in excluded:
                processFile(f)

if __name__ == '__main__':
    processDirectory(input_directory)