TL;DR: Run demonstrations or instructions of browser actions. Allow the user to pause and skip acts, and to reset and switch scenarios. Allow the user to interact with the browser before, after and during the scenarios. The open source Playwright library is used from a custom NodeJS application in which the scenarios are defined, using a fair number of CSS selectors and some DOM manipulation.

The demonstration in this article shows three scenarios (The Netherlands, France, UK). Each country is introduced using specific pages and sections on Wikipedia as well as supporting sites. A callout explains the scenario and each act. Balloon texts further guide the user.

This screenshot shows the beginning of the scenario Tour of The Netherlands, act one. The user presses Play, the callout is shown, the balloon text is shown for the query field and Playwright starts typing the characters of “The Netherlands” in the search field.

The animation below shows the full scenario for The Netherlands, starting with the first act from the Play button. The scenario has four acts: opening, history, sports and culture. Each act is started from the Play button. Acts can be skipped, the scenario can be reset (to the beginning) and other scenarios can be selected. The only user interaction here is pressing the Play button to trigger each act. However, the user could be browsing Wikipedia between the acts: the browser session is freely available to the user.

Note: this demo was created with vanilla Wikipedia. The unchanged site was loaded in the Chromium browser instantiated through Playwright. All further manipulation was applied from the NodeJS application.

An act can use Playwright from NodeJS for manipulation of the website – and it also has full access to the browser DOM and the JavaScript context. An act can for example: fill in fields, highlight text, press buttons, scroll the page, open links, make selections, hover over elements, switch between tabs.

All sources for this example are available on GitHub: https://github.com/lucasjellema/playwright-scenarios/tree/main/step-by-step. Note: this article is intended as inspiration, to show you what is possible with Playwright. It is certainly not a ready-to-use solution or a great example of professional clean coding. Please read it in the spirit in which it was intended. And please share your ideas. What do you think of what I have described? Does it make sense? Can you see a way of applying it yourself? Do you have suggestions for me? Please let me know!

Introduction

In the last few weeks, I have done many interesting things with Playwright. The option of programmatically interacting with a browser running any web application or web site is powerful. It opens up many opportunities for automated browser actions, both headless (for testing, RPA, integration, screen scraping, health checks, automated reporting) and headful (prototyping, instructions, demonstrations, deeplink shortcuts, customized SaaS).

I have described some of the things I have achieved using Playwright in several earlier articles. These include: injecting a shortcut key handler into any web application running in the Playwright embedded browser (for example for taking screenshots or downloading all images), adding a toolbar into any web page, creating an API for working with WhatsApp on top of the WhatsApp Web UI, creating a translation REST API on top of Google Translate, creating Deepmark Booklinks that navigate into a fully initialized context in a SaaS web application, and retrieving JSON reports for movies from IMDb.

The next challenge I identified, and the subject of this article: using Playwright, I want to create a demonstration of a web application or web site. Playwright commands perform the actions in the web page. What is special is that these actions are grouped in steps (aka acts). Together, these steps form a scenario. Using the toolbar, the user can play an act, pause execution, skip an act, or reset (return execution to the first act). In between the automated steps, the user can interact freely with the web page. The acts do not have to be executed, and they do not need to be the only thing that is done. Each act can have a title and a description that can be shown in a callout. The figure shows a callout that describes the current act and scene.

Additionally, the steps within an act (aka scenes) can have a text balloon or an arrow with text associated with them, positioned on the page near the element that is manipulated in the step. An example is shown in the figure.

In this way, demo scripts or tutorials can be created that take a user by the hand in a live web environment. The user can choose to let the prepared scenario play out or to intervene, contribute or even take over (in part).

I have created multiple scenarios for Wikipedia (Netherlands, France, UK) and the user can choose a scenario to execute. At any point, the user can decide to run a different scenario.

Implementation

At the heart of the implementation is a very simple piece of code that leverages the Playwright library to start a browser, create a browser context and open a web page.
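A minimal sketch of such a startup sequence is shown here. The function name startBrowser and the idea of passing the playwright module in as a parameter are assumptions made for illustration; the code in the repository may be organized differently.

```javascript
// Sketch: start a headful browser, create a browser context and open a page.
// The playwright module is passed in (an assumption of this sketch), for
// example: startBrowser(require('playwright'), 'https://www.wikipedia.org/')
async function startBrowser(playwright, url) {
  // headless: false shows the browser window so the user can interact with it
  const browser = await playwright.chromium.launch({ headless: false });
  const context = await browser.newContext();
  const page = await context.newPage();
  await page.goto(url);
  return { browser, context, page };
}
```

Returning the browser, context and page together keeps the later steps (toolbar, callout, director) free to attach to whichever of the three they need.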

Step 1 invokes a function to add the toolbar to every page in the browser context, every time after the page has navigated or reloaded and the DOM has been rebuilt. This toolbar has controls to run a scenario (as well as to switch between scenarios, pause execution, reset a scenario). The toolbar is passed the director function; this function handles toolbar events regarding scenario execution.
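One way to keep a toolbar present after every navigation is to listen for the domcontentloaded event on each page of the context, roughly as sketched here. The names toolbarHtml, injectToolbar and keepToolbarOnPages, the element id and the markup are all hypothetical; only page.evaluate, page.on, context.pages and context.on are Playwright calls.

```javascript
// Hypothetical toolbar markup: each link reports an instruction to the
// directorFunction binding that the NodeJS side is assumed to expose.
function toolbarHtml(instructions) {
  const links = instructions
    .map((i) => `<a href="#" onclick="window.directorFunction('${i}'); return false;">${i}</a>`)
    .join(' | ');
  return `<div id="scenario-toolbar" style="position:fixed;top:0;right:0;z-index:9999;background:#ffc;padding:4px">${links}</div>`;
}

// Runs in NodeJS; the arrow function passed to page.evaluate runs in the browser.
async function injectToolbar(page, instructions) {
  await page.evaluate((html) => {
    // Skip when the toolbar is already present in this DOM
    if (!document.getElementById('scenario-toolbar')) {
      document.body.insertAdjacentHTML('afterbegin', html);
    }
  }, toolbarHtml(instructions));
}

// Re-inject the toolbar whenever a page in the context has rebuilt its DOM.
function keepToolbarOnPages(context, instructions) {
  const watch = (page) =>
    page.on('domcontentloaded', () => injectToolbar(page, instructions).catch(console.error));
  context.pages().forEach(watch); // pages that already exist
  context.on('page', watch);      // pages (tabs) opened later
}
```

A call such as keepToolbarOnPages(context, ['next', 'skip', 'pause', 'reset', 'switch']) would then cover both the current page and any tab opened afterwards.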

Step 2 injects the callout object into the current page. This is done through direct manipulation of the DOM in function injectCalloutIntoPage. The title and description of the initial scenario are passed in to be displayed in the callout.

The scenarioStatus object contains all scenarios and keeps track of which scenario is the current one and how far along in that scenario the user has come. The pause state is also recorded in this object.
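To make this concrete, a status object of this kind could look as follows. The field names currentScenario, nextAct and scenarioPaused, the two helper functions and the France and UK titles are assumptions for illustration.

```javascript
// Hypothetical shape of the status object; all field names are assumptions.
const scenarioStatus = {
  scenarios: [
    { title: 'Tour of The Netherlands', description: '', acts: [] },
    { title: 'Tour of France', description: '', acts: [] },
    { title: 'Tour of the UK', description: '', acts: [] },
  ],
  currentScenario: 0,    // index into scenarios
  nextAct: 0,            // index of the act that Play will run next
  scenarioPaused: false, // toggled by the pause control
};

// Small helpers that read from a status object of this shape
function currentScenario(status) {
  return status.scenarios[status.currentScenario];
}

function upcomingAct(status) {
  // undefined when all acts of the current scenario have been played
  return currentScenario(status).acts[status.nextAct];
}
```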

The scenarios are defined with a title and description and nested steps or acts. Each act also has a title and a description as well as an action. The action is a function that is invoked when the act is executed. This function manipulates the callout, bubble or balloon help and the browser UI controls, DOM elements and JavaScript context. Here is a small example of the NL scenario:

Define the title and description for the scenario.
Define the array with acts in the scenario; each act has a title and a description, and these are displayed in the callout.
Each act has an action. This action is a server-side JavaScript function (NodeJS context) that receives the current (Playwright) page object as input; frequently this function will evaluate selector expressions and JavaScript statements in the context of the web page inside the browser. The first action in this example writes a balloon help text, types the string “The Netherlands” into the search field and presses the search button. When the page has reloaded with the details for The Netherlands, a fresh balloon help text is displayed. Note: the calls to waitForUnpause() check whether the user has paused execution; if so, they block until the pause is ended.
The second action shows an example of calling scrollToElement, a custom NodeJS function, to scroll the page to the element handle republic, which was retrieved using page.$() and a selector for a link element with a specific title attribute value. The scrolling is performed in the browser and is smooth, so the user sees it happening.
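A scenario definition along these lines might look like the following sketch. The factory buildNetherlandsScenario, the helper showBalloon, the descriptions and every CSS selector (#searchInput, #searchButton, #firstHeading, the title attribute value) are assumptions for illustration and may not match Wikipedia's current markup or the repository's code.

```javascript
// Sketch of a scenario definition. The helpers are passed in so the sketch
// stays self-contained; their names and all selectors are assumptions.
function buildNetherlandsScenario({ showBalloon, waitForUnpause, scrollToElement }) {
  return {
    title: 'Tour of The Netherlands',
    description: 'An introduction to The Netherlands using Wikipedia',
    acts: [
      {
        title: 'Opening',
        description: 'Search Wikipedia for The Netherlands',
        // Runs in NodeJS; receives the Playwright page object
        action: async (page) => {
          await showBalloon(page, '#searchInput', 'The country name is typed here');
          await waitForUnpause(); // blocks while the user has paused execution
          await page.type('#searchInput', 'The Netherlands', { delay: 100 });
          // Click and wait for the resulting navigation together
          await Promise.all([page.waitForNavigation(), page.click('#searchButton')]);
          await showBalloon(page, '#firstHeading', 'The page for The Netherlands');
        },
      },
      {
        title: 'History',
        description: 'Scroll to the Dutch Republic',
        action: async (page) => {
          await waitForUnpause();
          const republic = await page.$('a[title="Dutch Republic"]');
          if (republic) await scrollToElement(page, republic);
        },
      },
    ],
  };
}
```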

The function scrollToElement is simple enough:

This function uses the Playwright function page.evaluate() to pass the DOM element for a specific element handle into a JavaScript function that is executed in the context of the browser. The function leverages the DOM Element function scrollIntoView() to do the hard work.
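A minimal sketch of such a function is shown here; the options passed to scrollIntoView() are an assumption.

```javascript
// Runs in NodeJS. Playwright hands the DOM element behind elementHandle to
// the arrow function, which is executed inside the browser.
async function scrollToElement(page, elementHandle) {
  await page.evaluate((element) => {
    // behavior: 'smooth' animates the scroll so the user can follow it
    element.scrollIntoView({ behavior: 'smooth', block: 'center' });
  }, elementHandle);
}
```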

Another example of what the scenarios can do is highlight text. This too turns out to be fairly simple. A link is located with a specific text content (regarding Max Verstappen, hence the element name maxText). Subsequently, a browser side snippet is executed that takes the <P> parent of this link, wraps its entire innerHTML in <mark> tags and scrolls this <P> element into view.

It is a little bit crude. Working with Range and Selection objects is the more precise approach. However, it does the trick for me in this example:
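A sketch of this highlighting approach follows. The function name highlightParagraphAround is hypothetical, the text selector is one possible way to locate the link, and closest('p') is used here in place of taking the direct parent so that nested markup is tolerated.

```javascript
// Locates an element by its text and marks the whole paragraph around it.
// Returns false when no element with that text is found.
async function highlightParagraphAround(page, linkText) {
  const maxText = await page.$(`text=${linkText}`);
  if (!maxText) return false;
  await page.evaluate((link) => {
    const paragraph = link.closest('p'); // nearest enclosing <p>
    if (!paragraph) return;
    // Crude but effective: wrap the paragraph's entire content in <mark>
    paragraph.innerHTML = `<mark>${paragraph.innerHTML}</mark>`;
    paragraph.scrollIntoView({ behavior: 'smooth', block: 'center' });
  }, maxText);
  return true;
}
```

Note that replacing innerHTML recreates the child nodes, so the element handle for the link should not be reused afterwards.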

The director() function is at the heart of things: it handles the toolbar events regarding the scenarios (play, reset, skip, pause, switch). These events are captured in the browser context and passed to the NodeJS context from the onclick event handlers on the toolbar links. This statement creates the bridge from the browser back to NodeJS:

The call to context.exposeBinding ensures that NodeJS function director will be available anywhere in the browser on the window object as directorFunction. This function is invoked from the onclick handlers in the toolbar.
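As a sketch, the bridge could be set up as follows. The wrapper function exposeDirector is hypothetical; context.exposeBinding is the Playwright call, and it supplies the source object as the first argument of the callback.

```javascript
// Makes the NodeJS function director callable from any page in the context
// as window.directorFunction(instruction).
async function exposeDirector(context, director) {
  await context.exposeBinding('directorFunction', (source, instruction) =>
    // source is added by Playwright: { browserContext, page, frame }
    director(source, instruction));
}
```

In the browser, an onclick handler then only passes the instruction, for example directorFunction('skip'); the source argument never has to be provided by page code.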

And now for the director() function itself. It receives the source object – which contains the page and browser context – from Playwright and the instruction input parameter from the onclick handler, to indicate which action was triggered (next aka play, skip, reset, pause, switch).

Depending on the value of instruction, the function will manipulate the scenarioStatus object that keeps track of the current scenario and its status (next act, paused or not). For skip, for example, the next act is simply incremented. Pause means either pause or unpause (it is used as a toggle) and does nothing but manipulate a flag in scenarioStatus. Perhaps I should add a visual indication as well. Reset means resetting the next act to 0, the beginning of the scenario. Switch is interpreted as selecting the next scenario. The call to populateCallOut() is made to synchronize the callout with the current scenario and its upcoming act.

Finally, next (aka play) is the trigger for executing the action for an act, meaning invoking its function. The director cannot stop an action once it is executing. However, the action itself may check the scenarioPaused status and honor it by waiting for a pause to be concluded.
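The behavior described above can be sketched as follows. The factory createDirector, the polling interval, the field names in scenarioStatus and the choice to restart at the first act after a switch are assumptions; the instruction names follow the text.

```javascript
// Sketch of a director and the pause check used by actions.
function createDirector(scenarioStatus, populateCallOut) {
  // Returns at once when not paused; otherwise polls until the pause ends.
  async function waitForUnpause(pollMs = 200) {
    while (scenarioStatus.scenarioPaused) {
      await new Promise((resolve) => setTimeout(resolve, pollMs));
    }
  }

  async function director(source, instruction) {
    const scenario = scenarioStatus.scenarios[scenarioStatus.currentScenario];
    switch (instruction) {
      case 'next': { // aka play: run the upcoming act, if there is one
        const act = scenario.acts[scenarioStatus.nextAct];
        if (!act) break;
        scenarioStatus.nextAct += 1;
        await act.action(source.page);
        break;
      }
      case 'skip':
        scenarioStatus.nextAct = Math.min(scenarioStatus.nextAct + 1, scenario.acts.length);
        break;
      case 'pause': // used as a toggle
        scenarioStatus.scenarioPaused = !scenarioStatus.scenarioPaused;
        break;
      case 'reset':
        scenarioStatus.nextAct = 0;
        break;
      case 'switch': // select the next scenario, wrapping around
        scenarioStatus.currentScenario =
          (scenarioStatus.currentScenario + 1) % scenarioStatus.scenarios.length;
        scenarioStatus.nextAct = 0;
        break;
    }
    // Keep the callout in step with the scenario and its upcoming act
    if (instruction !== 'next') await populateCallOut(source.page, scenarioStatus);
  }

  return { director, waitForUnpause };
}
```

Because the director only flips a flag for pause, an action keeps running until it next calls waitForUnpause(); the more often an action calls it, the more responsive the pause control feels.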