Nuclear catastrophes, speeding tickets and agility

How do you stop your stock of nuclear weapons accidentally blowing up the world? How do you devise a straightforward system for recording penalties on driving licences? Those sound like very different questions, but they turn out to have some unexpected similarities.

Let’s start with nuclear weapons. Eric Schlosser has written an extraordinary book about the command and control systems for the US nuclear arsenal. He describes the deep structural weaknesses in those systems, illustrated with a seemingly endless stream of examples of how those weaknesses came close to causing disaster, for reasons ranging from operational carelessness to fundamental design flaws, and with potential consequences ranging from contamination to conflagration.

The book is well worth reading, but you can also watch Schlosser speaking at a recent RSA event in this video (the whole thing is almost an hour, but the meat starts at 3:30 and runs for about 20 minutes; the rest is Q&A – or there is a shorter version here).

One of the themes which came through very strongly from the book was the importance of maintenance and improvements. It turned out to be relatively easy to get eye-wateringly large budgets for the development and deployment of new weapons and almost impossibly difficult to get any money at all to improve control and safety systems for existing weapons. That’s partly a reflection of a military preference which underplays safety (given the risk of a bomb which doesn’t go off when it should, and one which does go off when it shouldn’t, some may see the first as more important than the second), but it’s also a reflection of a much more general political issue: it’s much more attractive to be responsible for delivering a new thing than for doing maintenance on an old one.

Meanwhile, rather less cataclysmically (except perhaps for him), Matthew Cain got a speeding ticket. He didn’t contest it, paid the fine and accepted the points on his licence. That’s an apparently simple and apparently online transaction which, at the end of a terrific blow-by-blow account, he summarises as involving ‘three public bodies, three different websites, four outbound letters, eight pieces of post in total’.

The problem is not that it’s impossible to go through the process. Indeed, the problem (in this form) only exists because it is possible to complete the transaction: if it weren’t, somebody would fix it. The problem is that it isn’t anybody’s priority (or anybody’s budget) to improve, streamline and integrate the current fragmented process. As Matthew points out, the obstacles to improvement are very real:

1. West Yorkshire Police has higher priorities. I suspect no senior manager will be held accountable for a slow, inefficient money-making service

2. Left to its own devices, West Yorkshire Police would probably redesign the service inefficiently, either relying on contractors to build a unique service or purchasing a proprietary service

3. The opportunities to improve the service are only incremental. West Yorkshire Police could redesign its part of the service but lacks control over payments or licensing issues (and, it appears, speed awareness courses)

4. Probably only the MOJ has the convening power to bring together its payment service, the DVLA’s licensing service and a police force’s processes. But to do so across 42 police forces would be a considerable hassle

5. The current incentives for government digital services prize redesigning existing high volume, central government services. The redesign of speeding fines is probably low on attractiveness and achievability for the MOJ, although of all departments it’s probably best placed to make progress

Every description of agile development ends with ‘iterate’, and the GDS service design manual is explicit about what they call the live phase:

[Going live] is not the end of the process. The service should now be improved continuously, based on user feedback, analytics and further research.

You’ll repeat the whole process (discovery, alpha, beta and live) for smaller pieces of work as the service continues running. Find something that needs improvement, research solutions, iterate, release. That should be a constant rhythm for the operating team, and done rapidly.

Those are good principles, but they don’t on their own solve the problem. When budgets are tight, it’s easy enough to slip back to thinking that what is there is good enough, to finding workarounds rather than solutions, to accepting what works rather than looking for ways to make it better.

It’s not just the money, of course. Perhaps a little counter-intuitively, the opposite can be a problem too. We have all seen systems where features have been added and designs tweaked in ways which reduce utility, rather than adding to it. Continuous improvement is virtuous if it delivers improvement, not if it is only continuous. Getting – and keeping – people with the right skills and the right attitudes to maintain or enhance a service may be more difficult than assembling the team to build it in the first place.

Nor of course is this just about IT. There are new public buildings which struggle to cover their running costs, new buses designed to run with a second crew member who will increasingly be absent, new phones with patchy network coverage.

But in a sense all of those are consequences of the deeper problem, that the wholly new is grander, more exciting and generally better rewarded. Slowly and painstakingly reducing risk and increasing resilience has much less obvious benefits. Except, perhaps, for avoiding nuclear devastation.