“Can we let go of that office building?” (We pay rent, utilities, furniture – is that still necessary or justified?)
“Well, I believe Chris is still making use of that building.”
Chris: “My team meets Wednesday morning every other week in one of the conference rooms in that building.”
(Chris’ team of five are the only people still making use of a five-story building that the company started renting four years ago, when the world and the company were in a different place.)
Of course it is silly in this case to continue holding on to this building and all its associated costs.
(I asked StableDiffusion: can you create a painting based on the phrase “if you do not use it, you should lose it”? This is what it came up with)
Now let’s change the perspective – just a little:
“Can we let go of that application?” (we pay license and support fees, compute resources, operational management – is that still necessary or justified? Or when it comes to custom built applications: instead of license and support fees, we have maintenance / life cycle management costs)
“Well, I believe Chris is still making use of that application. Or maybe someone else?”
Or perhaps even more fine-grained: “Can we let go of that specific application feature or extra option?” (Is that still necessary or justified?)
When it comes to software we seem better at thinking up new applications and features than at managing the ones we have. Keeping track of the applications in our organization – both custom built and bought from external vendors – is hard. Monitoring the actual usage of these applications – and the value we are getting out of that set off against the ongoing costs associated with these applications – is not commonly done.
We may have an idea about who is using an application – and sometimes even more tangible information (because of registered users) – but frequently not about the frequency or intensity of the usage and certainly not of the value.
Just as a building has many floors, rooms and facilities, an application can consist of different modules and options and provide many different features. To know that a person is using the application does not tell the entire story: they may use 90% of what the application offers but they may also use only a very small subset of the features. Can we let go of a building that is still used if all we need to cater for is one conference room for one morning every two weeks? We probably can and should. So what about applications?
It seems clear to me that it is imperative that we know about the usage of our applications – and the meaningful constituents of these applications (modules, options, features). And that we use that information to manage the application portfolio and the individual products.
If we pay license fees for applications or options that are hardly used, we should consider if we have to continue spending money (perhaps we should, because the value of even that limited usage outweighs the associated costs). Instead of getting rid of an application altogether, perhaps we can trim costs by scaling down the number of licenses and/or the compute resources assigned to the application.
If our custom built application – that we do maintenance on, use compute resources for and spend Ops effort on – is not used much, or has specific features that are not used, we should consider decommissioning the application or those specific features.
When talking about cloud migrations, we commonly talk about the 6R strategy. The 6Rs represent six alternative ways forward with the components in our existing IT landscape, although the first R (for Retire) is not really a way forward: it is the decision to discontinue a component altogether.
It is of course not necessary to wait for a cloud migration to decide whether to stop or continue using an application or a feature. That is a decision we can make every day. A careful consideration of applications and features, and of whether to continue using them, is in order perhaps not every day, but surely periodically – and at least yearly. (The actual frequency depends on the rate of change and the costs involved with specific applications and components.)
Note: it is easy to overlook the infrastructure and platform components that support the applications. A server or storage unit or database used for one or more applications can be reused or retired itself when some or all of the applications are retired. Perhaps we can retire entire technology stacks – and with it the need to have people with dedicated knowledge.
Keeping track of the applications and features, the supporting technology components for each of them and the actual usage should be part of our IT operations.
What starts out as a neutral overview of applications, features and technology components may quickly turn into a to-do list when the (lack of) usage of these is charted.
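To make the step from overview to to-do list concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the item names, the cost figures, the `PortfolioItem` fields and the threshold of ten uses per observation period are all made up, and a real review would weigh value as well as usage.

```python
from dataclasses import dataclass


@dataclass
class PortfolioItem:
    name: str
    kind: str            # "application", "feature" or "component"
    yearly_cost: float   # license, support, compute and ops combined
    active_users: int    # distinct users seen in the observation period
    uses: int            # recorded uses in the observation period


def review_list(items, min_uses=10):
    """Return items with fewer than min_uses, most expensive first."""
    flagged = [item for item in items if item.uses < min_uses]
    return sorted(flagged, key=lambda item: item.yearly_cost, reverse=True)


# Hypothetical portfolio; none of these figures come from a real landscape.
PORTFOLIO = [
    PortfolioItem("crm", "application", 120_000, 85, 40_000),
    PortfolioItem("crm: campaign module", "feature", 30_000, 1, 4),
    PortfolioItem("legacy-archive", "application", 45_000, 0, 0),
    PortfolioItem("report-server", "component", 8_000, 2, 6),
]

for item in review_list(PORTFOLIO):
    print(f"{item.name}: {item.uses} uses, {item.yearly_cost:,.0f} per year")
```

Sorting by cost puts the most expensive hardly-used items at the top of the list, which is where the conversation with the owner should probably start.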
Trickle down value of decommissioning?
Crucial for the decision to decommission an application or platform or infra component is of course knowledge about the usage and the dependencies. What is the impact of shutting down a functional component – for the current users and perhaps somewhere down the line in the end to end chain? What is the purpose of each component?
Part of this is straightforward configuration management. Each application should be known and recorded and should have an owner – not someone from IT (unless the application is an IT tool) but someone from the business. This application or product owner needs to be able to justify the continued existence of the application and the corresponding costs. This owner should be able to decide if options/features should still be continued or can be retired. Anything that is marked as retired should be included on the IT backlog and should be considered “debt”.
Assessing the impact of the retirement of the application or some of its features on the platform and infrastructure is the responsibility of the architect (or architecture team). It may be that retiring an application (module) means that a platform component can (finally) be decommissioned, that infrastructure components can be repurposed and that we can finally start forgetting everything we have had to learn and remember about that unloved, legacy technology. The value derived from removing one application feature can be considerable through this trickle down effect (and conversely: the cost of not retiring it is substantial and should be justified).
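The trickle down effect can be worked out mechanically once the dependencies are recorded. The sketch below assumes a simple inventory that maps each application to the components it relies on; the application and component names are hypothetical, and a real configuration management database would hold far more detail.

```python
from collections import defaultdict

# Hypothetical inventory: application -> components it depends on.
DEPENDENCIES = {
    "port-planner": {"oracle-db-11", "app-server-a"},
    "invoice-archive": {"oracle-db-11", "file-share-2"},
    "hr-portal": {"postgres-14", "app-server-a"},
}


def retirable_components(dependencies, retired_apps):
    """Return components that no remaining application depends on."""
    users = defaultdict(set)
    for app, components in dependencies.items():
        for component in components:
            users[component].add(app)
    retired = set(retired_apps)
    # A component is free once all of its users are retired.
    return sorted(c for c, apps in users.items() if apps <= retired)


print(retirable_components(DEPENDENCIES, ["invoice-archive"]))
print(retirable_components(DEPENDENCIES, ["invoice-archive", "port-planner"]))
```

Retiring only the archive frees the file share. Retiring the planner as well also frees the old database, and with it the need to keep that knowledge around.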
Beware of classifying a crucial component as no longer used simply because it is, for example, an end-of-year report and your investigation did not cover the December-February period. Assessing the actual status of components has to include an understanding of the purpose of the component.
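One way to guard against this mistake is to record the expected usage cycle of each component and to refuse a verdict when the observation window is shorter than that cycle. The following is a sketch under that assumption; the function names, the three outcome labels and the example dates are invented for illustration.

```python
from datetime import date, timedelta


def window_covers_cycle(first_observed, last_observed, cycle_days):
    """True if the observation window spans at least one full usage cycle."""
    return (last_observed - first_observed) >= timedelta(days=cycle_days)


def classify(uses, first_observed, last_observed, cycle_days):
    """Classify a component based on observed uses and its expected cycle."""
    if uses > 0:
        return "in use"
    if not window_covers_cycle(first_observed, last_observed, cycle_days):
        return "inconclusive"
    return "candidate for retirement"


start, end = date(2023, 3, 1), date(2023, 11, 30)

# A yearly report with no recorded use between March and November.
print(classify(0, start, end, cycle_days=365))
# A weekly export with no recorded use over the same period.
print(classify(0, start, end, cycle_days=7))
```

The cycle length has to come from someone who understands the purpose of the component; the metrics alone cannot supply it.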
The actual decommissioning is a delicate operation – just like the release of a new application version, the decommissioning is a process with tests, checks, approvals and rollback options in case of failure. Typically, it is a multistep process with staged switching off and subsequent removal.
How to assess usage? Tactical Observability
In order to evaluate the usage and derived value of an application and its features, we have to know how much (and by whom) it is used. When, why, how often, how useful and valuable?
How do we know about the usage of applications and features? We can go round and ask people – but that is neither an efficient nor an effective approach, and it may well not give reliable outcomes.
Ideally, the application will let us know about how it is used. Similar to a car that tracks the distance it has traveled, an application should be able to record and report its usage. Not just at the level of the application, but down to modules, options and meaningful features.
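As an illustration of an application keeping its own odometer, here is a minimal in-process sketch. It assumes that every tracked function receives the user as its first argument; the feature names and the `track_usage` decorator are hypothetical. A production setup would more likely export these counts to a metrics backend than keep them in memory.

```python
from collections import Counter, defaultdict
from functools import wraps

usage_count = Counter()          # feature -> number of calls
usage_users = defaultdict(set)   # feature -> distinct users seen


def track_usage(feature):
    """Decorator that records each call to a feature and who made it."""
    def decorator(func):
        @wraps(func)
        def wrapper(user, *args, **kwargs):
            usage_count[feature] += 1
            usage_users[feature].add(user)
            return func(user, *args, **kwargs)
        return wrapper
    return decorator


@track_usage("reports.year_end")
def year_end_report(user, year):
    return f"Year-end report {year} for {user}"


@track_usage("planning.schedule")
def schedule(user):
    return f"Schedule for {user}"


year_end_report("chris", 2023)
schedule("chris")
schedule("dana")

for feature in sorted(usage_count):
    print(feature, usage_count[feature], len(usage_users[feature]))
```

Naming features with a module prefix (such as `reports.` or `planning.`) makes it possible to roll the counts up to module level later.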
We frequently talk about observability of software and about instrumentation in order to achieve observability. Typically our immediate concerns are availability, issue resolution, performance, scalability, bottleneck analysis, trend analysis etc. We do not talk as much about observing simply the usage made of our application and its constituents and features. We do not need that information for operational management (the main motivation for observability) – but we do need it for tactical (application) management. Product owners and application managers should demand such metrics: who (at least a count of individual users) is using my application (or its features and modules), how frequently and at what times (and perhaps in what context or coming from where)?
This usage information also comes in handy to determine the coverage of automated tests and the thoroughness of manual (acceptance) testing: which features have actually been touched in the test? (You may think that is not so useful. Allow me to share this anecdote: I was part of a project that created the replacement for a port management system that was targeted at seven ports on four continents; each local organization had been mandated to thoroughly test each iteration of the new system. Their feedback and acquiescence were essential: the new system would be rolled out and everyone would have to adopt it. We had a suspicion that most sites did not actually do much testing at all. However, we could not be sure and they did not admit to it. Having a mechanism to track which testing had been done and which areas of the product had not been tried out would have been really useful.)
If instrumentation to report usage is not embedded in the application – as it probably is not in existing and commercial applications – we can hope to achieve something along the same lines using external instrumentation. Depending on the nature of the application (browser based web application, mobile app, API, desktop application, binary executable, ..) we can make use of various sources of information, agents or wrappers.
In the world of monitoring, many agents have become available (for example as part of the OpenTelemetry project) that can add some level of tracking of application activity and interactions to existing applications (treating the application as a black box). API gateways, load balancers, service meshes, proxies, sidecars and web servers are all components that route requests to applications and act as wrappers around applications; log files produced by these components or explicit instrumentation added to them can help provide insight into when specific operations were performed regarding these applications – and where the requests originated from. Operating system monitoring can at least reveal if application processes are active at all – and perhaps to what extent (variations in memory/CPU usage). Recent monitoring tools that leverage eBPF instrumentation of the Linux kernel can extend our reach of non-invasive observation of activities in and calls to or from applications. Somewhat cruder but potentially very useful is the information we can learn from job scheduling tools (Cron, Control-M, IBM Workload Automation, ..) and the jobs they trigger.
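As an example of such external instrumentation, the sketch below derives usage per module from the access log of a proxy or web server that sits in front of the application. It assumes log lines in a Common Log Format style with the authenticated user in the third field, and it assumes that the first path segment identifies the module; the sample lines, user names and paths are made up.

```python
import re
from collections import defaultdict

# Hypothetical sample of access log lines.
LOG_LINES = [
    '10.0.0.5 - chris [04/Mar/2024:09:12:01 +0000] "GET /reports/year-end HTTP/1.1" 200 512',
    '10.0.0.7 - dana [04/Mar/2024:09:15:44 +0000] "POST /planning/berths HTTP/1.1" 201 87',
    '10.0.0.5 - chris [18/Mar/2024:09:03:10 +0000] "GET /planning/berths HTTP/1.1" 200 1024',
    'malformed line',
]

LINE = re.compile(
    r'^(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)


def summarize(lines):
    """Count requests and distinct users per module (first path segment)."""
    summary = defaultdict(lambda: {"requests": 0, "users": set()})
    for line in lines:
        match = LINE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        module = match["path"].strip("/").split("/")[0] or "(root)"
        summary[module]["requests"] += 1
        summary[module]["users"].add(match["user"])
    return dict(summary)


for module, stats in sorted(summarize(LOG_LINES).items()):
    print(module, stats["requests"], sorted(stats["users"]))
```

This treats the application as a black box: nothing inside it has to change, but the granularity is limited to what the URL structure reveals.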
What we are after is insight – for each application and each part (module, option, feature) of an application that we should discern (because of costs and security) – into when, how much and by whom and in which context (from where, as part of which process) it is used. The metrics will not be able to tell why and with what value the usage takes place. But we will know when and how much usage takes place.
Non Functional Requirement: usage observability
One non functional requirement I would like to suggest for any new application or application feature: usage observability. The solution design for any new application or substantial feature should describe how the product owner & application manager will learn about the usage. There should be instrumentation that reports on when the feature is used and what the context of the usage is (a relevant subset of user, (calling) system, transaction, process, location, …).
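What such a requirement could ask for in practice is a usage event with an agreed structure. The sketch below shows one possible shape, using the context elements listed above; the `UsageEvent` class, its field names and the sample values are assumptions for illustration, not an established standard.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class UsageEvent:
    application: str
    feature: str
    user: Optional[str] = None
    calling_system: Optional[str] = None
    process: Optional[str] = None
    location: Optional[str] = None
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def to_json_line(event):
    """Serialize an event, leaving out context fields that were not supplied."""
    supplied = {k: v for k, v in asdict(event).items() if v is not None}
    return json.dumps(supplied, sort_keys=True)


# Hypothetical event; the timestamp is fixed here only to keep the output stable.
event = UsageEvent(
    application="port-management",
    feature="reports.year_end",
    user="chris",
    location="site-a",
    timestamp="2024-01-08T09:00:00+00:00",
)
print(to_json_line(event))
```

Writing one such line per use to a log or event stream is enough for a product owner to answer when, how often and by whom a feature is used, without touching operational monitoring.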
The True Cost of a Feature – Kill your darlings (Lucas Jellema, July 2021) https://lucasjellema.medium.com/the-true-cost-of-a-feature-kill-your-darlings-6733445ebbad
The True Cost of a Feature (András Juhász, May 2022) https://productprinciple.co/p/the-true-cost-of-a-feature
What is Feature Usage and When to Use It – https://chartio.com/learn/product-analytics/what-is-feature-usage/