Unfit Platforms

TL;DR:

Platforms often grow too big to respond to change. They slow your value-delivering engineers down. They grow too large to maintain due to some combination of over-optimising, having unclear ownership and being technology-led rather than user-led.

Introduction

One of my favourite books in recent times has been Team Topologies. I felt compelled to write about an area I identified a lot with.

The book spends a fair amount of time covering the idea of a modern platform and the "Thinnest Viable Platform".

If you're unfamiliar, this video from the authors is a useful introduction.

Matthew Skelton and Manuel Pais discuss the concept of a Thinnest Viable Platform (TVP)

The concept of thinnest viable platform has parallels to "Hire When It Hurts", from the Rework book/podcast. Maintain the smallest platform you can get away with. Only build when the cost of not building is worse.

My experience

I've never worked with a platform I love. In contrast to the thinnest viable platform, most of my experience as an engineer has been platforms that aren't fit for purpose: platforms that are too large, too complicated and don't really solve the problems I wish were being solved for me.

The video above might dismiss this as "a platform of the past", but even the most developer-friendly environments I've worked in have been plagued by issues with their platform - a good idea two years ago no longer seems like a good idea today.

This isn't the same thing as having a large platform - a "World's Strongest Platform" that can pull aeroplanes, lift cars and press giant logs. If your business specialises in lifting heavy things, that sounds perfect.

However, change is the only constant. Change could come as new product ideas, slow-moving industry trends or a large lurch forward in technology that lets new competitors easily get into your market. Building a platform is a bet that solving a problem once, globally, will reduce the amount of work and cognitive load carried locally by individual teams.

An unfit platform is one that can't respond to those changing circumstances fast enough.

How do I know if I have an unfit platform?

Here are some early warning signs that you might have an unfit platform. If you have more, please tweet me!

Over-committing

You see this often in companies built around single programming languages, single architectural approaches, or single database technologies.

A breaking change somewhere in the platform becomes a massive adoption hurdle because you haven't got the capacity to fix the resulting issues. It snowballs into other risks - a security vulnerability is detected or a vendor refuses to support an old version.

A team falls back to the established way of doing things after experimenting with other approaches because it's too difficult to iterate on the platform itself. Other teams give up on experimenting with alternative approaches at all.

In both cases you're over-committed - platforms need ongoing nurturing and a healthy respect for avoiding technical debt.

Lack of clear ownership

6 teams maintaining a platform has a different dynamic to 1 team trying to maintain a platform for 5 teams.

A single team can prioritise a backlog of work as it sees fit and spend its efforts on the most important thing. Sometimes the value-stream-aligned teams might be unhappy with the direction the platform goes in, but they can trust that the platform team has the org's best interests at heart.

Multiple teams always have to juggle their own priorities against what's best for everyone as a whole. Disagreements over what should be implemented as a platform, how it should be implemented or who should do the work create extra friction and usually end up with nothing implemented.

Platform teams not listening to their users

Fundamentally, your platform is a product. If you were outsourcing whatever your platform specialises in, would your developers actively choose it? Are developers actively avoiding your platform even when you think it fits their needs?

Even with clear ownership and really deliberate specialisation, it can be easy to get distracted by what seems like the latest and greatest tech. Moving to Kubernetes won't necessarily improve your deployment practices, and even if it did, was that really the most valuable thing you could have done with your effort and attention? Maybe your users hated Jenkins, rather than how the application ran. Solving a problem your users don't have just creates work for you and them.

Reflections

I've been involved in lots of "new platform"-like initiatives. In the most successful, I've done the exact opposite of what I've identified above. My team has had the autonomy to focus on an identified user need without over-specialising on any particular use-case.

However, they've often failed over longer timescales due to a lack of continuity in ownership. A one-time 're-platforming' is a myth. Like any product, a platform needs ongoing care from engineers that have empathy for the end users.