Scaling [down] the Mountain of Debt – Four dimensions of IT Debt

WYSIWYG may not be true for all IT systems. What you see may be more than what you actually get. As a Product Owner, you have to be very careful not to deceive yourself. The latest iteration of your product may be shiny and new, but all may not be well because of the hidden debt in your product and your team: shortcomings that may haunt you, perhaps from day one but usually in the slightly longer term. These are shortcomings that you should be aware of, do something about, and in the future try to prevent altogether.

This article discusses several types of debt that most teams and products accrue. Without proper attention and action, this debt may cause various problems, possibly even leading to the complete write-off of the product and the severe disillusionment of the team and other stakeholders. Of course the entire team participates, but final responsibility rests with the Product Owner, the Architect and the Scrum Master/Team Lead to take control of the debt and scale the mountain down to molehill proportions.

The table shows an overview of which types of debt I have identified (with two more to follow below the table). For each type, the table describes some symptoms, the risk and potential impact, the time range for when the impact will be experienced, the situation in which the debt is most likely to be (first) identified and the owner – the person most responsible (for managing that type of debt).





Type of Debt: Functional, Technical, OPS (operational), Team Process

What (examples, symptoms)

Functional:

  • Some of the desired behavior and features are not available
  • Specific situations and conditions are not supported
  • Errors, unknown behavior
  • Style deviations (color, font, layout)
  • Accessibility shortcomings
  • Specific browsers or devices not supported (well)
  • End User experience suffers from poor performance
  • (Too) Simple passwords allowed

Technical:

  • Hard coded styles in web pages
  • Old library versions
  • Low (to no) test coverage
  • Meaningless or even confusing variable names
  • High degree of coupling
  • Low quality documentation
  • Magic numbers (hard coded values) in program code
  • Use of deprecated technologies
  • Manual steps in CI/CD
  • Frequent substantial redesign of architecture / platform / tech stack

OPS (operational):

  • Incomplete working instructions, checklists and operating guidelines
  • Lack of SLA objectives and KPI definitions
  • No regular “fire drill” (to test recovery, fail over, re-provisioning, …)
  • No log file [management]
  • No monitoring (on critical user experience or business metrics)
  • No purging of temporary files or outdated data
  • No backups taken (or recovery tested)
  • No fail-over done to standby environment when an incident happens

Team Process:

  • Incomplete definition of ready or DoR not applied
  • No glossary of business terminology
  • No clarity on business objectives, stakeholders
  • Peer Review not [thoroughly] performed
  • Lack of ownership from the team of the product[s]
  • Dependency on individuals regarding specific components or tasks
  • No automated functional [regression] testing
  • Limited automation (in build, code QA, delivery/deployment)
  • No coding standards (applied)
  • Incomplete intake for products to go to Production (and be put under Ops)
  • No onboarding instructions for new team members
  • No register of design decisions
  • No Continuous Improvement cycle based on periodic evaluation
  • Frequent and repeated discussions (to realign, refocus)
  • Culture in which pointing out deficiencies in someone else’s work is not done

Impact & Risk

Functional:

  • Incidents, requiring costly resolution
  • Missed business opportunities
  • Dissatisfied users
  • Public image suffers
  • Loss of user productivity
  • Security breaches

Technical:

  • Changes increasingly become harder (lengthy, costly, risky), low agility
  • Hard to keep/find & motivate technical staff
  • Drop in Team Productivity (velocity)
  • Production incidents
  • Increased Vulnerability

OPS (operational):

  • High OPEX
  • High number of incidents
  • Slow response to incidents
  • Unclear communication regarding incident status
  • Incidents only handled when explicitly submitted (no auto detection)
  • Loss of performance and availability
  • Loss of data

Team Process:

  • Accelerated build-up of all debt (this is the root cause of much other debt)
  • Loss of productivity
  • Loss of synergy
  • Loss of work pride and joy
  • High impact of team changes

How long until impact

  • Functional: Immediate (this type of debt is visible almost from the go live)
  • Technical: Longer Term (this may go unnoticed for quite some time, and then hit you hard)
  • OPS (operational): Short Term (at go live, there may not be any cracks visible; but shortly after, incidents will occur in production, with way too much impact)
  • Team Process: Short to Mid-term. It may take some time to realize Functional and Operational debt can be traced back to Team Process Debt. It may also take time for a team to call out this debt in the frenzy of creating functionality and operating systems.

When Identified

  • Functional: Development, Test, Peer Review, Production Usage
  • Technical: Development, Code QA, Peer Review
  • OPS (operational): Development, Peer Review, Ops Intake of Release, Operations, Production Usage
  • Team Process: Retrospective, (external) Audit, Refinement of stories, Development, Peer Review

Owner

  • Functional: Product Owner
  • Technical: Architect (well, the team actually; but represented by the person in the role of architect)
  • OPS (operational): Product Owner
  • Team Process: Scrum Master/Team Lead (well, the team actually; the one role most eligible for managing this type of debt would be the person coordinating the team process)

To get a good feel for how serious managing (and preventing) debt really is, I suggest you go through the row labeled “Impact & Risk” and let it sink in. Unchecked debt will cause serious harm. Only if these risks and impacts do not faze you can you stop reading now.

Moral Debt – to the Organization and the Community

I have identified two other categories of debt. Neither of them is harmful to the team or the product, at least not in the foreseeable future. However, these types of debt may, and perhaps should, weigh a little on the team’s conscience.

The team benefits from the Organization and all it has in place: reusable components, automation scripts, frameworks, platforms, infrastructure, standards, ways of working, tool instructions. It is only right that the team should give back to the organization by improving existing artefacts and creating new ones; the organization should grow as a result of the growth in the team. That in turn helps other teams, just as contributions from other teams in the past provided the stepping stones for our own team.

Slightly further afield is the Community. The team makes good use of community resources: Wikipedia, StackOverflow, blog articles, open source tools, libraries and frameworks. The community is not a separate entity. We all are the community. In order for the community to thrive, all its members, including the team, should make community contributions. These can range from kudos (tweets about using community resources, comments on community forums, ...), posting enhancement requests and bug reports, and publishing articles on using resources (why, how) to preparing actual pull requests for code improvements or making financial contributions. If no teams contributed to the community in general or to sub-communities for specific products or technologies, then the community and its technology would be in danger of stagnating and foundering altogether.

Of course your one blog article is not going to make the difference for the continued survival of the Spring Framework. However, we are all in this together. And even seemingly small contributions can make a difference, if only through the feeling of appreciation and the ensuing motivation of committers to (smaller) open source projects or even commercial product teams.

Organization debt and community debt are not typically recognized as debt in a team. And perhaps it is asking too much to have them explicitly included in the team’s debt register. However, consider from time to time what you gain as a team from the larger organization and from the community. And think about spending even a fraction of what you have gained on a reciprocal contribution. Expressed in a semi-biblical way: Do unto others as you would have them do unto you.

Why is there Debt at all?

No one wants to introduce shortcomings; no one does sloppy or incomplete work on purpose. Yet debt happens to the best of us. Time pressure is probably the most important factor in the creation of debt: we know we are not quite done, yet there is pressure to ship the product all the same. We realize there is some work left to be done and we are committed to doing it. And then the next sprint starts and there is pressure for new features.

Another cause of the introduction of debt is lack of clarity in procedures and standards: we have not explicitly defined our way of working and the rules to abide by. Or we have, but there is not enough knowledge or discipline to follow these rules and check the products for compliance. Then again, we may have a culture of avoiding confrontation that makes it hard for team members to point out product deficiencies to fellow team members; as a result, debt-laden products get accepted as done and are rolled out to production and Ops.

When there is debt, it forms a breeding ground for more debt. Just like the first few items of garbage seem to attract widespread littering, the first occurrences of debt may lead to more debt if we do not explicitly address them. It should be clear to all that while debt is sometimes almost unavoidable for our pragmatically oriented team, we do not accept that such debt becomes structural in the longer run.

How to deal with Debt?

I do not pretend to have the final answer for dealing with debt. It is a large issue with financial, cultural, technical and other dimensions. There are a few things I would recommend:

Stimulate the identification of debt. It is a good thing to find and call out debt. This should be a natural element in processes that do some form of evaluation, such as story refinement, review, test, Ops intake, retrospective, (external) audit. The onboarding of new team members is a perfect moment for finding debt, especially Team Process debt: a fresh pair of eyes takes an objective, close and critical look at existing processes and products. Noticing debt during more regular activities such as analysis, development and day-to-day production usage and operations should also be part of every team member’s routine.

Make debt explicit and visible – record all occurrences of debt in a debt register and also include any debt that carries noticeable risk in the team’s risk log (not having a risk log in itself is a form of Team Process Debt). The debt register should indicate:

  • when was the debt first noted and optionally by whom and how
  • what is the debt and where is it located
  • what is the severity, how do we assess the risk and impact and what, if any, is the immediate running cost of (not fixing) this debt
  • what does it take to resolve this debt: how and estimated effort
  • what is the plan? (initially, there will probably not be a plan for newly discovered debt)
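
The fields listed above could be captured in a simple data structure. The following is only a minimal sketch in Python, not a prescribed format: the class and field names, the 1 to 5 severity scale, the effort unit (person-days) and the sample entries are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class DebtType(Enum):
    FUNCTIONAL = "Functional"
    TECHNICAL = "Technical"
    OPS = "OPS (operational)"
    TEAM_PROCESS = "Team Process"


@dataclass
class DebtEntry:
    noted_on: date                 # when was the debt first noted
    description: str               # what is the debt
    location: str                  # where is it located
    debt_type: DebtType
    severity: int                  # assumed scale: 1 (minor) to 5 (critical)
    risk_and_impact: str
    resolution: str                # how to resolve it
    estimated_effort_days: float   # assumed unit: person-days
    noted_by: Optional[str] = None
    noted_how: Optional[str] = None
    running_cost: Optional[str] = None  # immediate cost of not fixing, if any
    plan: Optional[str] = None          # newly discovered debt usually has no plan yet


def unplanned(register):
    """Return the entries that have no plan yet, most severe first."""
    return sorted(
        (entry for entry in register if entry.plan is None),
        key=lambda entry: entry.severity,
        reverse=True,
    )


# Hypothetical sample entries, using symptoms from the overview above.
register = [
    DebtEntry(
        noted_on=date(2024, 1, 15),
        description="No automated functional regression testing",
        location="order service (hypothetical component)",
        debt_type=DebtType.TEAM_PROCESS,
        severity=4,
        risk_and_impact="Regressions reach production unnoticed",
        resolution="Add an automated regression suite to the build",
        estimated_effort_days=10,
        noted_by="new team member",
        noted_how="onboarding",
    ),
    DebtEntry(
        noted_on=date(2024, 2, 1),
        description="Old library versions",
        location="web front end",
        debt_type=DebtType.TECHNICAL,
        severity=3,
        risk_and_impact="Increased vulnerability",
        resolution="Upgrade to the desired library versions",
        estimated_effort_days=3,
        plan="Upgrade one library per sprint",
    ),
    DebtEntry(
        noted_on=date(2024, 2, 10),
        description="No backups taken (or recovery tested)",
        location="production database",
        debt_type=DebtType.OPS,
        severity=5,
        risk_and_impact="Loss of data",
        resolution="Schedule backups and rehearse a restore",
        estimated_effort_days=5,
    ),
]

for entry in unplanned(register):
    print(entry.severity, entry.debt_type.value, entry.description)
```

A spreadsheet or an issue tracker with the same columns would serve equally well; the point is that every entry answers the same questions and that debt without a plan is easy to list.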

Discuss debt: the risks and running cost of not fixing it, and plan actions to resolve it. Talk about debt at every opportunity and make sure that it is a nagging problem for everyone on the team. At the very least, debt should be a standard agenda item in every sprint planning and in every steering committee session.

Debt status should be a Team KPI. “Only things that are measured, reported on and evaluated get true focus” is the perhaps somewhat cynical or highly realistic assessment of some of my colleagues. If you want to assess the performance of a DevOps Team, then the way they handle debt should definitely be part of the assessment. Controlling debt (knowing about the debt through processes and tools, actively managing the debt and ensuring the debt and its negative effects do not structurally increase) is measurable and should be used in a Key Performance Indicator. Perhaps separate KPIs are useful for the different types of debt and their respective owners.
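
One possible way to make "debt does not structurally increase" measurable is to compare the number of open debt items per type between two sprints. This is only a sketch under assumptions: counting items (rather than, say, estimated effort or risk) is just one choice of metric, and the two snapshots below are invented sample data.

```python
from collections import Counter

# Hypothetical snapshots: the type of each open debt item at the end of two sprints.
previous_sprint = ["Functional", "Technical", "Technical", "OPS",
                   "Team Process", "Team Process"]
current_sprint = ["Functional", "Technical", "OPS",
                  "Team Process", "Team Process", "Team Process"]


def debt_trend(previous, current):
    """Return {debt type: change in number of open items} between two snapshots."""
    before, after = Counter(previous), Counter(current)
    return {t: after[t] - before[t] for t in sorted(set(before) | set(after))}


trend = debt_trend(previous_sprint, current_sprint)
for debt_type, change in trend.items():
    status = "growing" if change > 0 else "shrinking" if change < 0 else "stable"
    print(f"{debt_type}: {change:+d} ({status})")
```

Reporting the trend per type makes it possible to hold each owner (Product Owner, Architect, Scrum Master/Team Lead) accountable for their own category.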

Continuously work on reducing debt, in small steps. Every long trip starts with a single step. The mountain of debt may be overwhelming at some point. Do not despair. Start whittling the debt away, one small step at a time. Every sprint (or at least most sprints) should reduce the debt a little. Even if it takes a long time to really scale the mountain back to the molehill, each small step helps, both by reducing the debt and by instilling a way of working and a cultural stance on debt in the team.

One easy-to-adopt rule is the “boy scout principle”, paraphrased as: improve everything you touch, leave it a little better (in terms of debt) than you first found it. This can mean: improve documentation, refactor some of the code, increase the test coverage, update to the desired version of the library, write a checklist for an operational procedure. Anything that makes the products a little better.

Another thing to consider: explicitly set sprint budget aside for reducing debt. For example: make the architect the product owner of a backlog with technical life cycle management and technical debt reducing stories – with 10% of the sprint capacity at their disposal. Similarly, task the scrum master with reducing team process debt and allow a percentage of the sprint capacity to be allocated to this.

Final Word

One last word of advice: focus on one of the common root causes of various occurrences of debt: Team Process Debt. Poorly executed Peer Reviews, very unready user stories, only manual build processes and lack of regression testing are in my opinion among the most important factors contributing to functional and technical debt.
