Structure-driven design.


Running Wildebeasts

"Before the trauma of coupling faded coupling faded."
(After Gertrude Stein.)

Prior to releasing the book of the same name, Larry Constantine and friends penned in 1974 one of the great papers of software development: Structured design.

The paper considered programs in terms of modules and connections, a module being a set of, "Contiguous program statements having a name by which other parts of the system can invoke it," and a connection a, "Reference to some label or address defined (or also defined) elsewhere." The paper then defined coupling as, "The measure of strength of association established by a connection from one module to another."

The modern programmer might find these terms quaint and little is to be gained by seizing dogmatically on old texts like monks poring over crumbling papyrus fragments; instead, if we wish to read this paper today we must acknowledge technological shifts unimaginable at its time of writing and interpret it accordingly before exhuming whatever value still lies buried after thirty-nine years. Certainly the passage of time will have antiquated some ideas horribly. Others, however, will remain fresh and worryingly so. For these ideas describe problems vexing programmers a year before the agonized fall of Saigon ended the Vietnam War. That such problems prowl the collective consciousness of programmers still today hints both at a rather puzzling technological stagnation and at a programmer mindset thoroughly inoculated against learning from past experiences. The very timelessness of these problems suggests fundamental strata the neglect of which discredits our profession.

At any rate, the paper forged ahead, dispensing a passage of inimitable insight:

"The fewer and simpler the connections between modules, the easier it is to understand each module without reference to other modules. Minimizing connections between modules also minimizes the paths along which changes and errors can propagate into other parts of the system, thus eliminating disastrous, 'Ripple effects,' where changes in one part cause errors in another, necessitating additional changes elsewhere, giving rise to new errors, etc."

As a matter of interpretation, most programmers would perhaps agree that the, "Modules," of the paper might equally apply to packages, classes and methods today, and that the, "Associations," have grown fur and evolved into dependencies. This, however, only deepens the mystery. If the paper can unfurl the problem of coupling in terms understandable to all, why then is coupling still with us? Why has it not been solved?

Some consider the underestimation of three problems to have stirred widespread and unending confusion, those problems being: the pathogen of transitivity, the inefficacy of second-order controls and the anesthetic of invisibility.

The pathogen of transitivity.

An unfortunate logic has settled over the programming community, one equating coupling with dependency. On the minds of those holding this belief a mental interpreter runs. This interpreter intercepts all information concerning coupling, performs a series of invalid translations and produces a clanging dissonance guaranteed to perplex. Seeing, "Reduce coupling," the programmer registers only, "Reduce dependencies," and will search for a dependency - between, say, two classes - to be eliminated. But invariably none will be found. For removing even a single dependency from a program usually renders it hopelessly broken. If B depends on C, that is: B → C, then B requires the services of C, a requirement that no coupling-reduction exercise can wish away. The programmer will give up, dissatisfied but faced with a seemingly irreducible tangle of necessary dependencies.

The only way to unpick this thinking is to avoid the terms on which the interpreter latches, to banish the words, "Dependency," and, "Coupling," at least temporarily. Their replacements lie in the passage above. Ask software engineers not to reduce coupling, nor to hunt for mythical superfluous dependencies but instead to identify, "Paths along which changes and errors can propagate." That word, "Path," generally stimulates an interesting response, expanding focus from the connection B → C to the elongated A → B → C → D. Paths, after all, naturally run between and through milestones; roads do not terminate in each and every city encountered but wind through to the expansive distances in between. The path introduces a more appropriate metaphor which helps eradicate binary thinking in terms of the connections between two classes and encourages analysis in terms of the transitive dependencies stretching between sequences of classes. It is the ripple-effect that underpins the perniciousness of coupling but it is transitivity, not the dependency per se, that underpins the ripple-effect. The war on coupling is the war on transitivity.

Figure 1 highlights the difference between binary and transitive dependencies.

Figure 1: Three small classes.

The figure presents the historical development of a single class. On the left, the class has nine methods and this increases by just three methods each in the two later snapshots to the right. With relatively few extra methods added in each case, figure 2 shows the corresponding growth in the number of binary and transitive dependencies.

Figure 2: Binary and transitive dependencies.

Figure 2 highlights how, though the number of binary dependencies grows approximately linearly with the number of methods, the number of transitive dependencies explodes. And though the figure depicts a class clearly contrived for demonstration, that the phenomenon unmasks itself even under such controlled conditions presages its capacity for mayhem when elaborated to industrial scales. Ant, Apache's flagship compilation and deployment tool, wheezes under the weight of half a million transitive dependencies, a tremendous opportunity for a young ripple effect looking to make a career for itself.
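
The divergence between the two counts can be reproduced with a toy call graph. The sketch below is not the class from figure 1: it assumes a hypothetical class whose methods sit in layers of three, each method calling every method in the layer beneath it, and it assumes that a transitive dependency is counted as any path of one or more hops (the counting rule behind figure 2 is not spelled out here). The names layered_call_graph, count_binary and count_transitive are invented for illustration.

```python
from functools import lru_cache


def layered_call_graph(layers, width=3):
    """Hypothetical class: each method calls every method in the next layer down."""
    graph = {}
    for layer in range(layers):
        has_next = layer + 1 < layers
        callees = [f"m{layer + 1}_{j}" for j in range(width)] if has_next else []
        for i in range(width):
            graph[f"m{layer}_{i}"] = list(callees)
    return graph


def count_binary(graph):
    """Direct caller-to-callee dependencies, i.e. the edges of the graph."""
    return sum(len(callees) for callees in graph.values())


def count_transitive(graph):
    """Every path of one or more hops (assumed rule); the graph must be acyclic."""

    @lru_cache(maxsize=None)
    def paths_from(method):
        # One path ends at each callee, plus every path that continues beyond it.
        return sum(1 + paths_from(callee) for callee in graph[method])

    return sum(paths_from(method) for method in graph)


for layers in (3, 4, 5):
    graph = layered_call_graph(layers)
    print(len(graph), count_binary(graph), count_transitive(graph))
```

For nine, twelve and fifteen methods this prints 18, 27 and 36 binary dependencies but 45, 162 and 522 paths: the first column of counts climbs by a constant nine, the second more than triples each time.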

The inefficacy of second-order controls.

Given the elusiveness of ripple effects, one might expect the counter-measures deployed against them to be brutally excessive: to the city suspected of harboring a secret weapons factory the drone of an approaching thousand-bomber raid signals an air marshal's uncertainty as much as imminent obliteration. Yet this is not the case.

Today, ripple effects delight in finding themselves treated with energy-sapping indirectness. Great plumes of developer ingenuity rise not in the service of dependency management but in the paving of detours that take programmers miles off course. Software engineers fight ripple effects by making packages more modular, classes more focused, interfaces more segregated, services more flexible, features more testable. And these are noble aims but as ripple-effect constraints they remain lamentably second-order. They may reduce ripple effects. But they may not. Worse still, they may lull the programmer into mistaking the detour for the destination, creating programs of inarguable flexibility and impressive test-coverage but nonetheless writhing with transitive dependencies. Nor does this afflict only junior ranks. Consider the package diagrams presented in figure 3, one a system (FitNesse) created by a world-famous programmer and author of numerous principles of coupling management, the other a system of structure-driven design. Which, do you think, is which?

Figure 3: Two systems of between 2100 and 2600 methods.

The only direct first-order means of managing ripple effects is the identification and ruthless elimination of the paths - necessarily transitive in nature - through which these ripple effects radiate.

The anesthetic of invisibility.

The uninitiated often entertain the oddest opinions of programming, seeing it as merely typing without realizing that the waggling of fingers over a keyboard is just its end result and most visible component. The truth lies deeper. For when programmers bend to their work they overwhelmingly engage in the insertion, into existing tracts of logic, of fresh new logic the mental carving of which marks the moment when programming takes place. This sandwiching between vast and shifting logical structures triggers a claustrophobia that pervades the programmer's work. With insufficient time to inspect the code into which a class, say, is being inserted, the programmer must focus, attention contracting to illuminate only the immediate surfaces in contact with which the new class will come to rest. There simply isn't time for much else, certainly not for back-tracking far into the legacy to see how classes use the classes that will use the new class, nor how services to be used use further services. Yet in this unlit territory, beyond the mental horizon, coupling gathers.

Something of this mindset helps explain the weak claim ripple effects have on programmer priorities and their startling rise to dominance among the characteristics that describe vast swathes of modern source code. As mentioned previously, if programmers are to fight transitive dependencies then they must feel their pain long before their threat becomes critical and to do this they must observe these transitive dependencies. This involves seeing past immediate surfaces and appreciating that source code can change without its text ever changing. Ripple effects are statistical beasts: if A depends on B, again: A → B, and a programmer updates B alone to depend on a third class such that A → B → C, then even though the text of class A has not changed, its statistical properties have. Previously, A could be disrupted only by a change to B; now changes to C might disrupt it too. The transitive dependency is a fact, the ripple effect a probability. Though minuscule in isolation, such probabilistic subtleties play out and inflate among the tens of thousands of transitive dependencies pulsing through a system; eventually, they matter and when they matter little else matters.
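
The point that A changes without its text changing can be made concrete with a few lines of Python. This is a minimal sketch: the dictionaries before and after and the helper reachable are hypothetical stand-ins for a real dependency graph extracted from source code.

```python
def reachable(graph, start):
    """Everything `start` depends on, directly or transitively."""
    seen = set()
    stack = list(graph.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen


# Before: A depends on B only.
before = {"A": ["B"], "B": [], "C": []}
# After: only B's entry is edited, so that B now depends on C.
after = {"A": ["B"], "B": ["C"], "C": []}

print(before["A"] == after["A"])        # A's own dependencies are untouched
print(sorted(reachable(before, "A")))   # what could disrupt A before the edit
print(sorted(reachable(after, "A")))    # what could disrupt A after the edit
```

A's entry is identical in both graphs, yet the set of classes whose changes might reach it has grown from B alone to B and C.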

Solution.

The solution to ripple effects is achieved most efficiently through focusing on ripple effects. Not testing. Not modularity. Not flexibility. Not responsibility. Not performance. Not segregation. Ripple effects. Efforts spent straining towards other goals yield their own rewards and may solve ripple effects but hardly efficiently and mostly, as the unfortunate state of our industry suggests, not at all. Programmers must analyze the paths along which changes and errors can propagate into other parts of the system. These paths must then be curtailed. The principle of depth asserts precisely this, where the depth of a system is simply the average length of all its transitive dependencies: this figure should be minimized.
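
As a rough illustration of the measurement, depth can be computed from a dependency graph in a single pass. The sketch assumes an acyclic graph, takes a transitive dependency to be any path of one or more hops and measures its length in hops; both conventions are assumptions, as is the function name depth.

```python
def depth(graph):
    """Average length, in hops, over every path of one or more hops (acyclic graphs only)."""
    memo = {}

    def paths_from(node):
        """Return (number of paths, summed length of those paths) starting at node."""
        if node not in memo:
            count = total = 0
            for callee in graph[node]:
                sub_count, sub_total = paths_from(callee)
                # The one-hop path to the callee, plus each of the callee's
                # own paths lengthened by that one hop.
                count += 1 + sub_count
                total += 1 + sub_total + sub_count
            memo[node] = (count, total)
        return memo[node]

    results = [paths_from(node) for node in graph]
    path_count = sum(count for count, _ in results)
    total_length = sum(total for _, total in results)
    return total_length / path_count if path_count else 0.0


# Hypothetical four-element system: a uses b and c, both of which use d.
system = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(depth(system))
```

Here there are four one-hop paths and two two-hop paths, so the depth is 8/6, roughly 1.33. A real system containing dependency cycles would need those cycles collapsed or broken before a calculation like this applies.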

A system can minimize its depth in many ways but two stand out.

Firstly, and most obviously, the system's elements should ideally be organized such that long transitive dependencies shatter into multiple shorter ones, thus preferring sunbursts to serpents, see figure 4.

Figure 4: Serpent and sunburst.

Figure 4 shows a system of seven methods (or classes, or packages; ripple effects favor no scale). On the left, the system arranges itself into a single transitive dependency whereby an impact on e(), say, could ripple back to d(), c(), b() and a(). On the right, however, the methods are organized such that one method calls all others; now an impact on e() - or on any, for that matter - can only ripple back to a(). This, of course, is just an example; functionalism constrains the structures from which programmers compose their systems. Nevertheless, given the statistical nature of ripple effects, a statistical posture - such as that of the right-hand side - offers the finest defence. (By the way: which of the above structures would you consider more testable, flexible or modular?)
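
The two shapes are easy to rebuild and interrogate. The sketch below assumes the seven methods are named a to g and that the serpent runs a → b → c → d → e → f → g, which matches the ripple described for e(); the helper impacted_by is hypothetical.

```python
def impacted_by(graph, changed):
    """Every method that depends, directly or transitively, on `changed`."""
    # Invert the graph so each method knows who calls it.
    callers = {method: set() for method in graph}
    for caller, callees in graph.items():
        for callee in callees:
            callers[callee].add(caller)
    impacted, stack = set(), [changed]
    while stack:
        for caller in callers[stack.pop()]:
            if caller not in impacted:
                impacted.add(caller)
                stack.append(caller)
    return impacted


names = "abcdefg"

# Serpent: each method calls the next one along.
serpent = {method: [callee] for method, callee in zip(names, names[1:])}
serpent["g"] = []

# Sunburst: a() calls every other method directly.
sunburst = {"a": list(names[1:]), **{method: [] for method in names[1:]}}

print(sorted(impacted_by(serpent, "e")))
print(sorted(impacted_by(sunburst, "e")))
```

In the serpent, a change to e() can reach a(), b(), c() and d(); in the sunburst it can reach a() only.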

The second solution is age-old and thoroughly respected and widely touted and oh-so very, very dull. It is, in a word, encapsulation: systems should decompose into sub-units access to which is gained through well-defined interfaces only. Poorly encapsulated systems where implementation depends on implementation depends on implementation, etc., provide the richest imaginable breeding grounds for ripple effects. Pundits have thumped keyboards flat in writing similar advice so little more remains to be said here except - for those in any doubt - to recommend a safari through the wilderness of GitHub where, if you wait just a short while and keep your binoculars handy, you can enjoy one of the great migrations of digital nature and stare as masterpiece after wildly coupled, encapsulation-oblivious, complexity-soaked masterpiece commits by your Jeep.

Summary.

The title of this post may be somewhat tongue-in-cheek, the message somewhat trite, but matters are greatly amiss in a field that roars passionately of quality yet lies supine before the one true enemy. At a time when personal computing leaps app-fueled from desktop to pocket, ripple effects remain unrecognized, unstudied and unvanquished.

Photo credit attribution.

CC image Running Wildebeasts courtesy of Brandon Daniel on Flickr.