Change Engineering
In software design, we’re often obsessed with perfecting the qualities software has right now, or the qualities of how the software should be under ideal conditions in the future. We don’t often focus on designing changes to software – we mostly focus on the start and end points.
Imagine different components of a software system dancing together – each step of the dance is a change to the component. In an ideal situation, when software development is complete the dance stops and players remain perfectly still in their final state of interlock. No change is required, because the software is “finished”.
But no software is like this. When a software project first commences, the dance floor is empty. Gradually new players/dancers are introduced to the dance floor. Dancers’ positions and poses are changed to accommodate each other. Bug fixes and refactorings change the position or pose of dancers that needed fixing.
But by today’s software design practices this dance is nothing but a crowd of players, each one rigid in his/her shape. We go to great lengths to fix solid the interface between the players (components) – their interlocking posture – and then we hope never to change it again. Instead of a dance, this is like a weird game of Twister. Players have to hold their position as steadily as possible. It’s considered a design flaw if changes to one player cause changes to another player – this is software coupling.
Let me emphasize this point: today’s design practices emphasize immobility and decoupling between components of the system. This is a very valid way of dealing with the dancing problem: limit the movement of the dancers so that they don’t run into each other. Give them each enough space so that when they do need to move a little they won’t influence others around them.
But we know all this comes at enormous cost. The interfaces have to be clearly defined and agreed upon – never to be changed – so they had better be correct right at the beginning. We have to use complicated interfacing patterns – facades, indirection, visitors, injection. No longer can we just call function foo. We now have to add foo to the interface, and inject the concrete object through the constructor, saving it until we need to call it, and then call it through an adapter because it didn’t quite have the right structure [1].
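To make that cost concrete, here is a minimal sketch of the same call written both ways. Everything except foo is a hypothetical name invented for illustration (FooService, LegacyFoo, LegacyFooAdapter, Caller), and the doubling behaviour is just a placeholder.

```python
from typing import Protocol


# Coupled version: the caller simply calls foo.
def foo(x: int) -> int:
    return x * 2


def caller_direct(x: int) -> int:
    return foo(x)


# Decoupled version: foo is added to an interface (hypothetical name).
class FooService(Protocol):
    def foo(self, x: int) -> int: ...


class LegacyFoo:
    """Concrete object whose method does not quite have the right structure."""

    def compute(self, value: str) -> str:
        return str(int(value) * 2)


class LegacyFooAdapter:
    """Adapter that reshapes LegacyFoo so it satisfies FooService."""

    def __init__(self, legacy: LegacyFoo) -> None:
        self._legacy = legacy

    def foo(self, x: int) -> int:
        return int(self._legacy.compute(str(x)))


class Caller:
    def __init__(self, service: FooService) -> None:
        # Injected through the constructor and saved until it is needed.
        self._service = service

    def run(self, x: int) -> int:
        return self._service.foo(x)


direct_result = caller_direct(21)
injected_result = Caller(LegacyFooAdapter(LegacyFoo())).run(21)
```

Both paths compute the same value, but the second needs three extra types to get there.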
So these principles solve half the problem: slow down the game so that it’s easier to keep everyone separate. But there’s another solution which also deserves attention: formalize methods of change. That is, formalize the dance moves.
For example, here’s a situation I’ve encountered. Let’s say a program X calls a web service Y. So X and Y hold a single dance position with each other. Now let’s say we need to add a new parameter to the service. What moves can we make?
- We can simply change X and Y, and deploy them at the same time.
- We can change Y first to accept the new parameter, but not require the new parameter, and deploy Y. Then we can change X to provide the new argument, and deploy it.
- We can change X first to provide the new argument, and deploy X. Then we can change Y to use the argument, and deploy Y.
- We can create a new service Z which supports the new parameter, and then update X to use Z.
The order of footwork is important: we can’t change X to require that Y have the parameter if it doesn’t yet. We can’t change Y to require the new argument, if X isn’t supplying it yet. We can’t deploy X to require the new service Z, if the new service hasn’t been implemented yet. There are a number of steps in this part of the dance, and the order is important.
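The second move in the list (change Y first, then X) can be sketched with plain functions standing in for the deployed versions. This is only an illustration under stated assumptions: the names service_y_v1, service_y_v2, client_x_v1, client_x_v2 and the greeting parameter are all hypothetical, and a Python function rejects unknown keyword arguments, whereas a lenient web service that ignores unknown parameters would also make the "X first" move legal.

```python
from typing import Callable, Optional


def service_y_v1(name: str) -> str:
    """Y before the change: knows nothing about the new parameter."""
    return f"hello {name}"


def service_y_v2(name: str, greeting: Optional[str] = None) -> str:
    """Y after step 1: accepts the new parameter but does not require it."""
    return f"{greeting or 'hello'} {name}"


def client_x_v1(service: Callable[..., str]) -> str:
    """X before the change: does not supply the new argument."""
    return service("world")


def client_x_v2(service: Callable[..., str]) -> str:
    """X after step 2: supplies the new argument."""
    return service("world", greeting="hi")


# Each pairing that exists at some point during the dance keeps working.
start = client_x_v1(service_y_v1)        # before any deployment
after_y = client_x_v1(service_y_v2)      # Y deployed, X not yet
after_both = client_x_v2(service_y_v2)   # both deployed

# Deploying X first against this strict Y is the illegal move.
try:
    client_x_v2(service_y_v1)
    illegal_move_detected = False
except TypeError:
    illegal_move_detected = True
```

The point of the sketch is that legality is a property of each intermediate pairing, not just of the start and end states.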
The above example is simple, and could just be solved by reasoning it out, but you can see it could get complicated quite quickly. If some data had to be passed back from the service as well, then the moves become more intricate.
Do any of these dance moves have a name? Not that I know of (but correct me if I’m wrong). We don’t have names for them, and we also don’t really have a standard notation to describe the moves. And what about tools, and what about static analysis? What if the development tools could give a compilation error if we make an illegal dance move? [2]
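As a rough idea of what such a tool might check, here is a toy sketch, not an existing tool or notation. It assumes each version of X can be described by the set of parameters it sends, each version of Y by the parameters it requires and optionally accepts, and that Y rejects parameters it does not know. All names (ClientVersion, ServiceVersion, compatible, first_illegal_move) are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple, Union


@dataclass(frozen=True)
class ClientVersion:
    sends: FrozenSet[str]


@dataclass(frozen=True)
class ServiceVersion:
    required: FrozenSet[str]
    optional: FrozenSet[str] = frozenset()


Move = Tuple[str, Union[ClientVersion, ServiceVersion]]


def compatible(client: ClientVersion, service: ServiceVersion) -> bool:
    # The client must send everything required and nothing the service rejects.
    accepted = service.required | service.optional
    return service.required <= client.sends <= accepted


def first_illegal_move(
    client: ClientVersion, service: ServiceVersion, moves: List[Move]
) -> Optional[int]:
    """Return the index of the first deployment that breaks X and Y, or None."""
    for index, (target, version) in enumerate(moves):
        if target == "X":
            client = version  # deploy a new version of the client
        else:
            service = version  # deploy a new version of the service
        if not compatible(client, service):
            return index
    return None


x1 = ClientVersion(frozenset({"name"}))
x2 = ClientVersion(frozenset({"name", "greeting"}))
y1 = ServiceVersion(required=frozenset({"name"}))
y2 = ServiceVersion(required=frozenset({"name"}), optional=frozenset({"greeting"}))
y3 = ServiceVersion(required=frozenset({"name", "greeting"}))

# Accept the parameter, then send it, then require it: every step is legal.
legal = first_illegal_move(x1, y1, [("Y", y2), ("X", x2), ("Y", y3)])

# Requiring the parameter before X supplies it fails at the first step.
illegal = first_illegal_move(x1, y1, [("Y", y3), ("X", x2)])
```

A real checker would need a much richer model (return values, data migrations, more than two components), but even this toy version turns "the order is important" into something a machine can verify.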
This doesn’t just apply to distributed systems. It applies equally to self-contained programs, or within modules and components themselves. For one thing, things don’t get developed atomically. A developer can only write one line of code at a time, and each new line of code is a small dance move. If the move is illegal, you might get a compilation error or perhaps your unit tests fail. It’s much better to be able to develop in small increments, rather than having to go a week with failing tests. I’ve certainly had times like that: where I started a “small” refactoring but it was weeks before the code even compiled again. Instead of engineering a sequence of small legal moves, I made one multi-week leap – flying dangerously through the air. This is the same thing that I’ve seen happen with producing new major versions of software: it takes months or years to develop the new major version from scratch, and only then can it be introduced to the dance floor with the customers. It would be better to engineer a sequence of small changes to get from the old version to the new one, even if the new one has a completely different architecture.
This is also important for collaboration, since software teams don’t make changes atomically either. There are dependencies between different people’s development, and order is often important. Every time you commit code to the common repository you’re making a dance move with your teammates, and you don’t want to tread on their feet. A formal discipline of change engineering would make it easier to coordinate this dance without having to resort to tons of scaffolding or indirection code.
Coupling can be good
I’d like to make one last point and say that coupling can be a good thing. The reason is perhaps obvious: the more connected two pieces of code are, the easier it is to directly access information between them. All the rules we have for decoupling generally add layers of indirection, which in turn makes it harder to see what’s really going on (the runtime structure looks less and less like the code structure), and harder to add new functionality because you have to weave new pieces of data through all the decoupling layers and interfaces. The “god-class” anti-pattern is, practically speaking, a great pattern to use if you have a program that’s only a few hundred lines of code.
The apparent problem with coupling is that it solidifies the program. You can’t change one thing because it requires a change to something else. Each coupling link is an iron beam that connects the two pieces: when one moves, so does the other. Enough iron beams and the whole system becomes completely rigid. You don’t want to change anything because you would risk breaking everything else.
Although decoupling addresses this issue, so does dancing. That is, if we can define formal rules for changing components that are coupled together, we are expanding the number of coupled pieces that we can handle in the system. We’re replacing the iron beams with hinges, and allowing the dancers to interact directly with each other without being rigid.
Conclusion
In software engineering and design, there is perhaps too much emphasis on the fixed state of code. You generally design software with a single endpoint in mind, rather than the sequence of atomic changes required to get there without breaking the system at each checkpoint. The endpoint and the sequence of changes are almost independent of each other, and there should be a new discipline for change engineering, so that we can build more robust systems, minimize risk, and reduce code. For large systems there should perhaps even be dedicated engineers focusing on the problem of change design, formally engineering the sequences needed to move the software towards long-term goals of the company, rather than letting the dancers wander undirected into a dead-end of rotted software before starting again on a new version.