I recently spent quite some time getting to know Playwright, a great tool for automating virtually any operation performed in a browser. Playwright is great for automated testing of web applications as well as for screen scraping (for data), tactical integration, RPA, prototyping and even customizing third-party applications for my personal use. I am now looking to extend my NodeJS/Playwright programs, which execute in the context of browser-based applications, with a complementary tool for automating tasks on Windows.
One thing I would like to be able to do is schedule my NodeJS/Playwright applications to run periodically and automatically. I have learned that Windows Task Scheduler can be used for this. Another thing I want to be able to do is take automated control of Windows applications. I cannot or do not want to program all actions in NodeJS. I want to be able to make automated use of the tools I use on Windows myself, for example to automate editing PowerPoint slides, Word documents or video files.
I learned about AutoIt, an automation tool for Windows GUI interactions. AutoHotKey is another option, a fork from AutoIt with better support for “hot keys” (or shortcuts). UI.Vision (fka Kantu) is another option. I decided to first give AutoIt a try.
Download and Install
AutoIt can be downloaded as a single installer from https://www.autoitscript.com/site/autoit/downloads/. The second download option on that page is an extended editor for AutoIt scripts. I have not yet worked with that editor.
The downloaded file is an .exe file
After installation, these options are available:
On the file system, a directory with examples has been installed:
These scripts are ready to run. They also provide clear, concrete examples of what the AutoIt scripts look like.
A fairly extensive help system is installed as well, accessible in the AutoIt3/AutoItX directory:
Online documentation with the same contents is available at https://www.autoitscript.com/autoit3/docs/.
I got started by going through the tutorials in the online help:
These tutorials help you create a few simple first scripts that, for example, display a message window, run Notepad to create a new file, or install WinZip, all fully automatically. The help system provides many more script snippets: creating screen captures; creating, opening and manipulating Word and Excel documents; interacting with the file system; running commands; playing sound files. AutoIt also provides a set of functions to create your own GUIs, for gathering input from a user before running a script and for providing feedback while a script executes. These GUIs range from extremely simple to potentially quite sophisticated.
I am still in my very early days with AutoIt. I have been able to create a few simple scripts that already do nice things, more than I was ever able to accomplish on Windows before. For example:
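To give an impression of how small such a first script can be, here is a minimal sketch of a message window. The title and message text are made up for illustration.

```autoit
#include <MsgBoxConstants.au3>

; Show a simple message window with an OK button
MsgBox($MB_SYSTEMMODAL, "Tutorial", "Hello World!")
```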
- take a screenshot of a specific Window, create a new Word document and paste the screenshot into that document – just a few lines of code
- open a screen recording application, have it point at the right Window and start recording; again, just a few lines of code
I can think of many more things that could be interesting, involving combinations of desktop automation (with AutoIt) and browser automation (Playwright would be my first option).
Create and Run a First Script – Create Hello World document with Notepad
The AutoIt installer also installs the SciTE Script Editor. This editor is configured for AutoIt: it provides auto completion and has menu options to run the script you are working on. This makes development quite convenient.
So our first script should have a Hello World flavor, should it not?
Open the SciTE Script editor.
Type this script:
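A minimal sketch of such a Hello World script, modeled on the Notepad tutorial in the AutoIt help. It assumes an English-language Windows where a freshly started Notepad window is titled "Untitled - Notepad"; on other systems the title will differ.

```autoit
; Start Notepad and wait until its window is active
; (the window title is an assumption for an English-language Windows)
Run("notepad.exe")
WinWaitActive("Untitled - Notepad")

; Type text into the active window, as if entered on the keyboard
Send("Hello World")
```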
Press F5 (or from the menu: Tools | Go) to execute the script. Here is the result (as expected):
To also save this file and close the application, add a few lines to the script:
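A sketch of what the extended script could look like, under a few assumptions: the target file name is hypothetical, Notepad is matched by its window class instead of its title, and the Save As dialog is matched by the standard Windows dialog class `#32770`.

```autoit
; Hypothetical target file; remove an earlier copy so no overwrite prompt appears
Local $sFile = @TempDir & "\HelloWorld.txt"
FileDelete($sFile)

Run("notepad.exe")
Local $hNotepad = WinWaitActive("[CLASS:Notepad]")
Send("Hello World")

; CTRL+S opens the Save As dialog (#32770 is the standard dialog window class)
Send("^s")
WinWaitActive("[CLASS:#32770]")

; Flag 1 sends the path as raw text, so characters such as ! or ^ are not treated as keys
Send($sFile, 1)
Send("{ENTER}")

; Once the Notepad window is active again, ALT+F4 closes the application
WinWaitActive($hNotepad)
Send("!{F4}")
```

Keeping the handle returned by `WinWaitActive` avoids depending on the window title, which changes once the file has been saved.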
Note how the characters ^ for CTRL and ! for ALT are used to insert shortcut keys in calls to the Send() function. Whenever an application makes its controls (fields, buttons and so on) accessible through shortcut keys, it is easiest to use that route for programmatic manipulation. Execution of this script results in:
Script to take screenshot and paste into freshly created Word Document
My second challenge is a little bit more involved than the first one. It consists of:
- take a screenshot of a specific window and save that screenshot to a file
- create a new Word application object (run Word) and create a new document
- add a picture from the file created from the screenshot to the Word document
- remove the screenshot-image-file from the file system
The code that performs these actions is shown below:
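A minimal sketch of these four steps, using the ScreenCapture and Word UDFs that ship with AutoIt. The file name is hypothetical, the whole screen is captured, and Microsoft Word is assumed to be installed.

```autoit
#include <MsgBoxConstants.au3>
#include <ScreenCapture.au3>
#include <Word.au3>

; Hypothetical temporary file for the screenshot
Local $sImage = @TempDir & "\autoit_screenshot.jpg"

; 1. Capture the full screen and save it to the file
_ScreenCapture_Capture($sImage)

; 2. Start Word and create a new, empty document
Local $oWord = _Word_Create()
If @error Then Exit MsgBox($MB_ICONERROR, "Word", "Could not start Word, @error = " & @error)
Local $oDoc = _Word_DocAdd($oWord)

; 3. Insert the screenshot file as a picture into the document
_Word_DocPictureAdd($oDoc, $sImage)

; 4. Remove the screenshot file from the file system
FileDelete($sImage)

MsgBox($MB_SYSTEMMODAL, "Done", "The screenshot was added to a new Word document")
```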
Hopefully the commands with the comments are self-explanatory.
The resulting Word document with the message box still showing:
There are several ways to take a screenshot of a more specific area. The function _ScreenCapture_Capture can be passed coordinates to specify a region: the left and top position followed by the right and bottom position, for example 0, 0, 200, 500. Alternatively, to capture only the contents of a specific window, the handle for that window can be passed to the function _ScreenCapture_CaptureWnd. A window handle is returned by several functions that open applications, wait for windows and so on. The function WinGetHandle can also be used.
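Both variants can be sketched as follows. The file names are hypothetical, and `[ACTIVE]` simply selects whichever window is active when the script runs.

```autoit
#include <ScreenCapture.au3>

; Region capture: left, top, right, bottom in screen coordinates
_ScreenCapture_Capture(@TempDir & "\region.jpg", 0, 0, 200, 500)

; Window capture: look up a handle with WinGetHandle and pass it on
Local $hWnd = WinGetHandle("[ACTIVE]")
If Not @error Then _ScreenCapture_CaptureWnd(@TempDir & "\window.jpg", $hWnd)
```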
For example, to capture the contents of a running Notepad application, the code would be:
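A sketch of that capture step on its own. It starts a fresh Notepad instead of attaching to one that is already running, matches the window by its class, and writes to a hypothetical file name; the resulting file can be fed into the Word steps described earlier.

```autoit
#include <ScreenCapture.au3>

Run("notepad.exe")

; WinWaitActive returns the handle of the window once it is active
Local $hNotepad = WinWaitActive("[CLASS:Notepad]")

; Capture only the area occupied by that window
_ScreenCapture_CaptureWnd(@TempDir & "\notepad.jpg", $hNotepad)
```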
See how the function WinWaitActive – that waits for the Notepad application window – returns a handle. This handle is used in the call to _ScreenCapture_CaptureWnd to restrict the screen capture to only this window.
The resulting Word document now becomes:
Identifying Windows using AutoIt Window Info
Sometimes, getting the right identification for a window can be a little bit hard. What exactly is the window title? And when multiple windows have the same title, how do I get a handle on the right one? (One answer: pass a snippet of text that appears in the window you want, in addition to the title.)
AutoIt ships with a convenient utility, called AutoIt Window Info:
This utility provides an overview of details of various types of components: windows, fields, buttons and other controls as well as the text displayed in the window (and useable in AutoIt to identify the right window). Simply click on the Finder Tool icon and drag to the window or button or field you are interested in. Then release the mouse button. The properties are refreshed as shown below. When you double click a property value, it is copied to the clipboard and can easily be pasted into the AutoIt script you are coding.
Here you can see the exact title of a window, and the various ways in which controls can be identified: by name but also by class, instance and ID, and finally by position. AutoIt Window Info is very useful for quickly gathering the information you need to pinpoint the Windows objects you want to manipulate in your AutoIt scripts.
I have also added another tool to my toolkit: MPos. This is a tiny tool that constantly displays the current mouse position. MPos is very convenient for determining the X and Y coordinates where AutoIt should click on a component when that component cannot easily be identified in a different way.
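As an illustration of how these properties end up in a script, here is a minimal sketch. It assumes the classic Notepad, whose text area is reported as class Edit, instance 1; newer Notepad versions report a different control class, so the values should be taken from the info tool on your own system.

```autoit
#include <MsgBoxConstants.au3>

Run("notepad.exe")

; Identify the window by class instead of by title
Local $hWnd = WinWaitActive("[CLASS:Notepad]")

; Identify the control by class and instance (assumed values for classic Notepad)
ControlSetText($hWnd, "", "[CLASS:Edit; INSTANCE:1]", "Text set through the control")

; The same control, addressed by its combined class name and instance number
Local $sText = ControlGetText($hWnd, "", "Edit1")
MsgBox($MB_SYSTEMMODAL, "Control text", $sText)
```

Addressing a control directly like this does not depend on keyboard focus, unlike Send().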
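A sketch of a click at a fixed position. The coordinates are made up and are assumed to be absolute screen coordinates, such as the ones read from a mouse position tool.

```autoit
; Interpret mouse coordinates as absolute screen coordinates (this is also the default)
Opt("MouseCoordMode", 1)

; Single left click at the hypothetical position x=350, y=220
MouseClick("left", 350, 220)
```

Position-based clicks break as soon as a window moves or is resized, so they are best kept as a fallback.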
Introduction to Windows Task Scheduler – https://www.windowscentral.com/how-create-automated-task-using-task-scheduler-windows-10
Comparing AutoHotKey and AutoIt: https://ui.vision/blog/ahk-vs-autoit/ (also compares Kantu, now known as UI.Vision) and, from a long time ago, https://stackoverflow.com/questions/1686975/choosing-a-windows-automation-scripting-language-autoit-vs-autohotkey#:~:text=AutoHotkey%20includes%20a%20DLL%20file,all%20in%20its%20initial%20download.
Download AutoIt: https://www.autoitscript.com/site/autoit/downloads/
Online Documentation for AutoIt: https://www.autoitscript.com/autoit3/docs/
UI.Vision (fka Kantu) – https://ui.vision/rpa – open source, with a limited number of scripts and a limited number of actions per script
MPos – desktop tool for locating mouse cursor – https://sourceforge.net/projects/mpos/