Building bigger software

When we learn to program computers, we typically take a course that focuses on small, single-purpose programs, each demonstrating a single technique. This works pretty well for teaching purposes: keeping software small this way makes it easy to learn from and easy to assess in terms of quality.

Outside of the classroom, software is neither small nor single-purpose. New developers are thrown into large, poorly specified projects equipped with tools designed only for building small ones. Applying these small-scale techniques tends to result in software with a criss-crossing maze of dependencies, too large to be fully understood and too fragile to be easily maintained. This situation is made worse by the pressures of commercial software development: to a business, code quality matters a lot less than the software's purpose, so as long as the software works there is unlikely ever to be time provided to really fix its problems.

Fixing this issue is something of a holy grail in software engineering. There have been quite a few attempts to do so: starting with object-orientation, going through the component frameworks of the '90s, then design patterns, TDD and microservices (not to mention countless other techniques). Something interesting emerges: in the hands of an experienced development team all of these techniques can be very effective, but once they are widely adopted they start to be the source of architectural problems more often than they are the solution.

The problem is a general one of programmer psychology. There’s a tendency to complicate things, and the more elaborate, featureful solutions appear ‘better’ than the simpler ones. Those who have seen Indiana Jones (or have spent any time with CORBA) should be well aware of what happens when the shiniest option is chosen.

Simplicity is hard: the reasons are complicated but boil down to a couple of problems. Firstly, there is often a checklist of things that developers want to achieve and some of them seem obvious to build in at a high level. Secondly, it’s difficult to know where a big project really needs to go so a lot of discovered requirements get bolted on at the last minute (generally when time pressure has created an environment where ‘works’ beats ‘good’).

Take microservices for example. The idea seems sound: instead of creating big monolithic applications, we build much smaller ones. As we’re already building for the web, each small application runs on a webserver and talks using REST so we can easily interface with them from the front end.

Problems start to make themselves known when you try to implement that large application - a certain familiar feeling of ageing 1000 years and blowing away as dust on the wind comes over you. The step of starting a webserver for each service is not simple at all - it just sounds simple. Webservers have all sorts of issues. Using SSL for secure communications is enough of a pain all by itself. Add the extra reliability issues, authentication, the need to build a REST interface for everything and the fact that the services sometimes turn out to need to talk to each other as well as to the front end, and the shine really comes off the idea.

Worse, if you complain about it there’s always someone around to tell you that you’ve never had it so good: the last project was a monolith and it caused everyone’s face to melt off and explode, so you should be glad you’re just turning to dust. (There may also be an example where the idea actually worked the way it was supposed to, but that almost never turns out to be how things are going now.)

What often happens is that a simple, good idea (e.g. small tools that talk to each other using JSON) is corrupted by a terrible one (e.g. everything should be its own webserver and talk HTTP at all times). Don’t think I’m just picking on microservices here: pretty much every ‘universal’ architecture acquires this issue at some point or another. The problem arises because ‘tool that talks JSON’ and ‘webserver’ seem to be about the same sort of size when they’re written on a whiteboard.

There’s no easy fix for this: it requires thought and care. One thing that can help make the problems more visible at the start is to always expand the abstractions so that what’s actually required is visible. ‘Webserver’ implies a whole raft of things, some of which I listed above. ‘Tool with JSON’ doesn’t actually imply that much: some way to get JSON in and out and some way to interpret that. Maybe the webserver element doesn’t have to be integral.
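
To make that concrete, here is a minimal sketch in Python of what ‘tool with JSON’ might expand to. Everything in it is an assumption made for illustration: the names word_count, serve_stream and main are hypothetical, and the one-JSON-object-per-line framing is just one convenient choice. The point is only that the part that interprets the JSON knows nothing about how the JSON arrived.

```python
import json
import sys
from typing import Any, Callable, Dict, TextIO

Request = Dict[str, Any]
Response = Dict[str, Any]


def word_count(request: Request) -> Response:
    """Hypothetical tool logic: count the words in request["text"]."""
    text = str(request.get("text", ""))
    return {"words": len(text.split())}


def serve_stream(
    handler: Callable[[Request], Response],
    source: TextIO,
    sink: TextIO,
) -> None:
    """Read one JSON object per line from source and write one JSON
    reply per line to sink. Blank lines are skipped."""
    for line in source:
        line = line.strip()
        if not line:
            continue
        try:
            request = json.loads(line)
            if not isinstance(request, dict):
                raise ValueError("expected a JSON object")
            reply = handler(request)
        except ValueError as err:  # json.JSONDecodeError is a ValueError
            reply = {"error": str(err)}
        sink.write(json.dumps(reply) + "\n")
        sink.flush()


def main() -> None:
    """Assumed transport for this sketch: plain stdin and stdout."""
    serve_stream(word_count, sys.stdin, sys.stdout)
```

Calling main() from a script turns this into a tool you can pipe JSON through, with no webserver anywhere. If an HTTP front end does turn out to be needed, it can call word_count directly, which makes the webserver one possible transport rather than part of the tool.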