The refactorer spectrum.

The three groups.

Fifteen years ago, the programming world swooned at the publication of Fowler's seminal, "Refactoring: Improving the Design of Existing Code." Refactoring, today, plays a foundational role in many software design processes, all of which rank the satisfying of functional requirements only marginally ahead of the addressing of structural inadequacies.

Some programmers, however, refactor better than others. A spectrum of refactorers exists.

On the left of this spectrum sit those who equate, "Refactoring," with, "Unmotivated, mood-dependent churn," of whom little remains to be said.

Towards the center sit those more skilled in the lethal arts who understand refactorings and target them, with assassin-like detachment, at those structural weaknesses cataloged by Fowler in his book. These hits usually take place ad-hoc, as the programmer encounters those weaknesses, though sometimes a crisis will prompt the hiring of a shady specialist from out-of-town to eliminate a particularly troublesome mark. Such work - good work - contains a strong localized element in the belief that the accumulation of all such localized improvements inevitably leads to overall excellence. Small quality improvements tend to accrete just so. But not always.

The problem with this approach is that some code smells - Fowler's unforgettable term for structural weaknesses - are the opposite of others and so what one programmer might refactor away, another might refactor back.

Data Clumps occur, Fowler tells us, when a programmer sees, "... three or four data items together in lots of places." The solution? "Use Extract Class on the fields to turn the clumps into an object." Another smell is Lazy Class, "A class that isn't doing enough to pay for itself ..." The solution? "Nearly useless components should be subjected to Inline Class." Fowler identifies Divergent Change as another smell which, "... occurs when one class is commonly changed in different ways for different reasons." The solution? " ... use Extract Class to put them all together." Then there is Shotgun Surgery, "... similar to Divergent Change but is the opposite ... every time you make a kind of change, you have to make a lot of little changes in a lot of different classes." The solution? "Often you can use Inline Class to bring a whole bunch of behaviour together." With Middle Man, a programmer will eliminate a class via Replace Delegation with Inheritance; with Inappropriate Intimacy, a programmer will introduce a middle man via Replace Inheritance with Delegation. Many more examples pepper the book.
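The first opposing pair can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a design is modelled as nothing more than a mapping from class name to member names, and the functions extract_class and inline_class, like the Booking and DateRange names, are hypothetical stand-ins invented here rather than anything taken from Fowler's book. A real Extract Class would also leave a reference from the old class to the new one, which this model ignores.

```python
from typing import Dict, FrozenSet, Iterable

# Hypothetical model: a design is a mapping of class name -> member names.
Design = Dict[str, FrozenSet[str]]


def extract_class(design: Design, source: str, new_class: str,
                  members: Iterable[str]) -> Design:
    """Move the given members out of `source` into a brand-new class."""
    moved = frozenset(members)
    if new_class in design:
        raise ValueError(f"{new_class} already exists")
    if not moved <= design[source]:
        raise ValueError("can only extract members that the source owns")
    result = dict(design)
    result[source] = design[source] - moved
    result[new_class] = moved
    return result


def inline_class(design: Design, absorbed: str, target: str) -> Design:
    """Fold every member of `absorbed` into `target` and delete `absorbed`."""
    result = dict(design)
    result[target] = design[target] | result.pop(absorbed)
    return result


# One programmer sees a Data Clump and extracts it ...
before = {"Booking": frozenset({"guest", "start_day", "end_day"})}
extracted = extract_class(before, "Booking", "DateRange",
                          ["start_day", "end_day"])

# ... another sees a Lazy Class and inlines it again.
restored = inline_class(extracted, "DateRange", "Booking")
```

In this toy model the second refactoring exactly undoes the first: restored compares equal to before, so two programmers have each done careful work and the structure has not moved.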

With such code smells arriving in opposing pairs, refactorings too necessarily occur in pairs. Like matter and anti-matter, when such refactorings collide they annihilate one another in a sizzling flash of wasted energies and leave on the underlying code no trace of their passing. This ad-hoc approach thus suffers from a potential lack of direction. Among groups of programmers it may lead to better overall quality, but it may not, or it may do so only inefficiently. If refactoring were driving, then good source code structure would be the destination and, unless all involved agree on the nature of good source code structure, many programmers will drive out of the parking lot at the sprint's start only to arrive alone, at many different destinations, at the sprint's end, each believing themselves to have taken the correct route.

Those who sit on the right of the refactoring spectrum avoid this problem. Here sit the principled refactorers. These programmers perform their refactoring as do the center-group, that is, ad-hoc, as encountered, as needed. They also, however, agree on those global source code properties that constitute good source code structure. More importantly, they know that for every desirable source code property there exists a corresponding principle through which it asserts itself. If refactoring is driving and good structure the destination, then principles are the crisply creased map scrawled with helpful directions. It is the conscious sharing of these global principles that sets these programmers apart from the rest of the spectrum and that ultimately builds systems of such striking structural integrity.

Semantic and syntactic structure.

Given that refactoring relates entirely to source code structure - and not user-experience realisation, that is, it has nothing to do with runtime correctness - it should be acknowledged here that source code structure grows in two splendid varieties: semantic and syntactic. Semantic structure requires human interpretation and there are two reasons why it fails to excite interest among principled refactorers.

With no other information available, programmers assume a Pear class to model a type of fruit and expect it to display fruit-ish dependencies such as implementing an Edible interface. They would not expect this Pear class to implement a Boeing747 interface. Such jagged dissonance springs from semantics, our expectations building on subjective understanding of what a pear and a passenger jetliner are. Being subjective, semantic structure is ungeneralizable, depending instead on the concepts involved in the particular case under study. It therefore does not lend itself to widely applicable principle and so here rests the first reason why - despite its unquestioned prowess in any given scenario - it will not be considered here.

The second reason to discount semantic structure leaps from Fowler's tome. Seventy-six refactorings cram its pages. None of them is a semantic refactoring. None of them relates only to mobile phone apps or washing machine controllers. They are all syntactic, relating to classes and methods only.

Syntactic structure builds from a set of elements and their relationships, shorn of correspondences with pre-existing entities (or at least of correspondences with entities outside the programming vocabulary on which they graze). Thus a Pear class implementing a Boeing747 can be seen merely as generic class relationship X → Y. Lacking specificity, principles relating to syntactic structure enjoy grand generalizability and thus figure highest on the altars of the principled refactorers.
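A small Python sketch may make the point concrete. It assumes a dependency can be written as a (source, target) pair of names; the function syntactic_view and the neutral labels C0, C1 are hypothetical, chosen here purely for illustration. Renaming every element by order of first appearance throws away exactly the semantic information and keeps only the shape of the relationships.

```python
from typing import Dict, List, Tuple

Dependency = Tuple[str, str]  # (source element, target element)


def syntactic_view(dependencies: List[Dependency]) -> List[Dependency]:
    """Replace each element name with a neutral label (C0, C1, ...).

    Labels are handed out in order of first appearance, so two designs
    with the same shape produce the same result whatever their names.
    """
    labels: Dict[str, str] = {}

    def label(name: str) -> str:
        if name not in labels:
            labels[name] = f"C{len(labels)}"
        return labels[name]

    return [(label(source), label(target)) for source, target in dependencies]


sensible = [("Pear", "Edible")]
jarring = [("Pear", "Boeing747")]

# Semantically these differ wildly; syntactically both are just X -> Y.
same_shape = syntactic_view(sensible) == syntactic_view(jarring)
```

Here same_shape is True: once the names are gone, a principle stated over the remaining structure applies equally to fruit and to jetliners.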

So the loftiest principles are syntactic, but what of the source code properties they champion? Do these properties also reflect the syntactic/semantic divide?

Semantic and syntactic properties.

Though Fowler wisely constrained his refactorings with tight syntactic straps he did not, alas, similarly constrain the properties which he thought characterize great source code. For just as structure fractures along semantic and syntactic dimensions, so too do source code properties. Fowler asserts the goal of refactoring to be the changing of the, "... internal structure of software to make it easier to understand and cheaper to modify," and therefore, presumably, that the properties, "Easier to understand, cheaper to modify," exemplify good source code structure. He is probably right. Unfortunately, the property, "Easier to understand," requires subjective evaluation: that which some find easy, others find difficult. Being subjective, it lacks generalizability and, though beloved of centrists, usually finds itself demoted by principled refactorers. "Cheaper to modify," however, is deliciously, gloriously syntactic.

The squabbling software industry has never reached agreement on a complete set of syntactic source code properties that minimise potential development costs, but two at least find broad acceptance. The first property is source code duplication: the less, the better. The second property concerns that old chestnut, ripple effect, whose description has not been bettered since its inception in 1974, "Minimizing connections between modules also minimises the paths along which changes and errors can propagate into other parts of the system, thus eliminating disastrous, 'Ripple effects,' where changes in one part causes errors in another, necessitating additional changes elsewhere, giving rise to new errors, etc."

Again, these properties walk hand-in-hand with their associated principles. The first syntactic principle that guides refactoring, therefore, is the principle of duplication, which implores the reduction of duplication where practicable. Fowler himself elevates Duplicated Code to the first code smell of the book. The second property offers greater intellectual challenge, for it is compound, enjoying many contributing factors. Given that ripple effects cause potentially more damage along long transitive dependencies than short, however, one clear principle that can be extracted is that of depth: minimize the average length of transitive dependencies with which a system is woven.
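The principle of depth can be made measurable, though the text does not fix a precise formula. The Python sketch below adopts one possible reading, labelled here as an assumption: for each element, take the length of the longest chain of dependencies that starts there, then average over all elements. The function name average_depth and the sample graphs are hypothetical, and the dependency graph is assumed to be acyclic.

```python
from functools import lru_cache
from typing import Dict, List


def average_depth(dependencies: Dict[str, List[str]]) -> float:
    """Average, over all elements, of the longest dependency chain
    starting at that element.

    Assumption: the graph is acyclic; a cycle would recurse without end.
    """
    @lru_cache(maxsize=None)
    def depth(node: str) -> int:
        targets = dependencies.get(node, [])
        if not targets:
            return 0  # depends on nothing, so no chain starts here
        return 1 + max(depth(target) for target in targets)

    # Elements that are only ever depended upon still count as elements.
    nodes = set(dependencies)
    for targets in dependencies.values():
        nodes.update(targets)
    if not nodes:
        return 0.0
    return sum(depth(node) for node in nodes) / len(nodes)


# Same four elements, same three dependencies, different shapes.
chain = {"A": ["B"], "B": ["C"], "C": ["D"]}  # A -> B -> C -> D
fan = {"A": ["B", "C", "D"]}                  # A -> B, A -> C, A -> D

chain_depth = average_depth(chain)
fan_depth = average_depth(fan)
```

Under this reading the chain scores 1.5 and the fan 0.25, so a team that has agreed on the depth principle would prefer the fan: a change to D can ripple to at most one other element rather than along three links.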

Many more syntactic principles exist. A pragmatic lot, principled refactorers simply codify those they agree upon before embarking on their labour. Nor do they fall prey to rigidity, abiding by their principles mostly but acknowledging that rare cases can be made for violation: the point is that they can deliberate those cases precisely because they have a global framework against which localized cases stand out.

Principled refactorers are not more, "Right," than centrists; they just take greater pains to acknowledge the difficulties involved in building a system-wide structure via an essentially particularist mechanism. Refactoring is difficult because it seems so easy. Some programs tout refactoring throughout their entire development cycle yet end up structurally fragmented. Principled refactorers merely accept this possibility and take mitigating precautions.

Summary.

Programmers used to attempt the fixing of a system's structure largely in advance, before coding, with plenty of hope. Nowadays, the spatula of the modern design process folds the creation of structure in with user-experience realization. This doughy mix affords great benefits but also risks programmers' focusing their structuring skills piecemeal, moving blinkered from case to independent case without global consideration.

To battle this tendency, some programmers agree upfront on those global syntactic principles according to which their final product will rise. They then evaluate their refactorings against these principles as they proceed, thereby avoiding the parochial inconsistencies that render other systems that much more costly to modify.

Photo credit attribution.

CC image Sinclair ZX Spectrum 48k courtesy of Inaki Quenerapu on Flickr.

CC image ZX81 courtesy of Barney Livingston on Flickr.