Thursday, June 08, 2006

Epistemology and Agile Software Development

Non-programmers probably will not get anything out of this post.

Epistemology has long been a favorite subject of mine. I’m fascinated generally with how the mind works, and reading Introduction to Objectivist Epistemology was electrifying for me. As a software developer, I continually strive to find better ways to write code. For the last six years, there has not been a single week when I have not looked at code I wrote just a couple of months before and thought, “that’s crap.” About a year and a half ago, thanks to my friend the Philosophical Detective, I read Agile Software Development by Robert C. Martin. It was the Atlas Shrugged of my software development career. Lately, I have been devouring books on design patterns and refactoring. Mostly I’ve been focused on the Martin Fowler-related books, due to the respect I’ve gained for him while reading his blog. During all this reading, I have been intrigued by the relationship between software design and technical epistemology. I’m going to talk a little about two elements of Agile Software Development, Design Patterns and Test Driven Development, in this post.

Design Patterns

In his book Patterns of Enterprise Application Architecture, Martin Fowler quotes architect Christopher Alexander on what a pattern is: “Each pattern describes a problem which occurs over and over again in our environment, and then describes the core of the solution to that problem, in such a way that you can use the solution a million times over, without ever doing it the same way twice.” To rephrase, a pattern is a conceptual solution to a conceptual problem, where both the problem and the solution differ in their particulars from concrete instance to concrete instance. Martin Fowler continues: “a pattern is a chunk of advice, and the art of creating patterns is to divide up many pieces of advice into relatively independent chunks so that you can refer to them and discuss them more or less separately.” Further, “Once you need a pattern, you have to figure out how to apply it to your circumstances. A key thing about patterns is that you can never just apply the solution blindly… you see the same solution many times over, but it’s never exactly the same.” Fowler then emphasizes how important it is to give patterns a name: “patterns create a vocabulary about design, which is why naming is such an important issue.”

Let’s step back for a moment. Software development is an inherently logical and creative process. As developers, we are given some basic tools to work with that are constants in just about every programming language. These are variables, statements, conditional constructs, and loops. In languages such as COBOL, that’s all there is. All variables are global, and processing proceeds in sequence from one instruction to the next. This is great for small programs that just do one thing—but if the program achieves a certain level of complexity, it begins to overload the mind—to bust the crow, so to speak.

In more advanced languages, variables, statements, conditionals, and loops began to be organized into self-contained reusable code constructs called functions. Functions can be called over and over again—they always do the same set of instructions in the same order, but those instructions are only written once. Functions had the advantage of allowing you to consider a single set of instructions as a unit. This had the effect of organizing and clarifying the logic of the code. It had a further side-effect of enabling programmers to write ever more complicated programs: by organizing our logic into functions, we can treat whole pieces of logic as if they were single statements. The relationship to first-level concepts should be clear to any Objectivist.
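To make the point concrete, here is a minimal sketch in Python. The names average and describe_scores are hypothetical ones I made up for illustration. Once the averaging logic has a name, the second function can use it as if it were a single statement, without holding its details in mind.

```python
def average(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def describe_scores(scores):
    """Summarize scores; the averaging steps are hidden behind one name."""
    return f"{len(scores)} scores, average {average(scores):.1f}"


print(describe_scores([80, 90, 100]))  # prints "3 scores, average 90.0"
```

The caller of describe_scores never sees the sum or the division. It deals with one unit, not with the instructions that make it up.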

In OOP languages, you can go a step further and organize functions into objects. An object in an OOP language is of a higher order of complexity than functions, much as abstractions from abstractions are of a higher order of complexity than first-level concepts. There are a few kinds of objects that can be created, and a few ways to create them, but there are millions of programming problems to solve. In my first couple of years as a developer, I thought objects were the conceptual height of software development. I didn’t know about Design Patterns at the time, and had yet to struggle with the decisions of when to use inheritance or polymorphism. I would have to say, as an aside, that I overused inheritance.

Design Patterns make programming fully conceptual. Patterns focus not only on the abstractions describing programming constructs, but also on abstractions that describe common software problems and solutions. Since patterns are not limited to particular software constructs, they are not limited to any particular programming language, environment, or time. They assume an OOP language as a foundation. This is a requirement since you cannot have a conceptual solution to a problem without a conceptual tool. Patterns have a name, an intent (think definition), and are usually described with an example—much like a dictionary provides you with a sample usage of a word it has just defined for you. The example is a simple one so that the essential idea can be easily seen, but must be adapted to your particular solution. The result for the programmer is a new concept to use to solve development problems. Patterns can themselves be organized into more complex patterns, thus making them open-ended for extension and discovery.
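As an illustration of name, intent, and example, here is a rough sketch of one well-known pattern, Strategy, whose intent is to make a family of algorithms interchangeable behind a common interface. The shipping scenario and every class name in it are assumptions of mine, not taken from any of the books mentioned. As with any pattern example, it would have to be adapted to a real problem.

```python
from typing import Protocol


class ShippingStrategy(Protocol):
    """The common interface: any object with a matching cost() method fits."""

    def cost(self, weight_kg: float) -> float: ...


class FlatRate:
    """Charges the same (hypothetical) fee regardless of weight."""

    def cost(self, weight_kg: float) -> float:
        return 5.0


class PerKilogram:
    """Charges a rate multiplied by the weight."""

    def __init__(self, rate: float) -> None:
        self.rate = rate

    def cost(self, weight_kg: float) -> float:
        return self.rate * weight_kg


class Order:
    """Delegates the pricing decision to whichever strategy it was given."""

    def __init__(self, weight_kg: float, shipping: ShippingStrategy) -> None:
        self.weight_kg = weight_kg
        self.shipping = shipping

    def shipping_cost(self) -> float:
        return self.shipping.cost(self.weight_kg)


print(Order(3.0, FlatRate()).shipping_cost())        # prints 5.0
print(Order(3.0, PerKilogram(2.0)).shipping_cost())  # prints 6.0
```

Order never changes when a new pricing rule is added. Saying "use a Strategy here" communicates that whole arrangement to another developer in one word.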

Test Driven Development

Test Driven Development is the practice of first writing a unit test: a small piece of code that fails because some feature of the program is not present or does not work correctly. It is important to write the test code before attempting to implement the feature, so that when you first run the test, it fails. Now that you have a test for the feature, it is time to write the feature. When the test for the new feature passes, and all other tests for all other features also pass, you are still not done. You must now look at your code, isolate similarities to other code, and refactor. A unit test should be no more than a few lines of code (I try to limit myself to 10 lines) so that what is being tested can be easily understood (in other words, don’t bust the crow of the reader).
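Here is a small sketch of that cycle in Python. ShoppingCart and its methods are hypothetical names chosen for illustration, and the tests are plain functions using assert, the style a runner such as pytest can discover. In practice the two tests would be written and run first, failing with a NameError because ShoppingCart does not exist yet. The class below them is just enough code to make them pass.

```python
# Step 1: the tests, written before the feature exists.
def test_cart_total_sums_item_prices():
    cart = ShoppingCart()
    cart.add("book", 12.50)
    cart.add("pen", 1.50)
    assert cart.total() == 14.00


def test_empty_cart_total_is_zero():
    assert ShoppingCart().total() == 0


# Step 2: the simplest implementation that makes both tests pass.
class ShoppingCart:
    def __init__(self):
        self._prices = []

    def add(self, name, price):
        # The name is accepted but not stored; no test requires it yet.
        self._prices.append(price)

    def total(self):
        return sum(self._prices)


# Step 3: run the tests; they pass silently once the feature works.
test_cart_total_sums_item_prices()
test_empty_cart_total_is_zero()
```

Each test is only a few lines long and reads close to a plain statement of what the cart should do.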

Since a requirement of unit tests is clarity, you are forced, before you write any code, to think about how the code should look so as to “read” clearly. I actually read the code aloud so that I can hear how it sounds. The code must read almost like English. You have to think carefully about how you name objects, functions, and variables, so that each name identifies what the thing it names does. Gone are the days when you abbreviate everything; long function and variable names are fine if they're needed to describe what they do. In short, you have to encapsulate a piece of programming functionality in a “unit” of code. This makes every feature of your system executable in only a few lines of code. This requirement alone has huge implications for how you design your class libraries: you must keep your classes small and lightweight, and make interaction between your classes easy and open-ended. Each class should have one responsibility. All of these pressures are in place before you ever write the first line of code! The unit test “defines” how the code will work.

During the implementation phase, you are just trying to get the test to pass. You’re not worried if the code is pretty, if it’s organized, if it’s the best design. You’re just trying to get something together that actually works. This is analogous to “chewing” a concept, in my opinion.

During the refactoring phase, you are looking for ways to more deeply integrate your new code into the existing code base. This may mean breaking your new code into smaller functions or objects, extending existing objects to handle some of the functionality of your new code, etc. Regardless, the refactoring phase is analogous to integrating the concept after it has been defined and chewed. The refactoring phase could be a tempting one for some programmers to skip. After all, at this point the code works, so why toy with it? Because software must have integrity. If you cheat your design on this feature, then you'll cheat on your next one. Before you know it, your program will be a mass of repetitive, slightly different, unmanageable spaghetti code. Refactoring is the principle that keeps your code integrated.
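A minimal sketch of what isolating similarities can look like, using made-up invoice and quote functions and a hypothetical 8 percent tax rule. Amounts are integers in cents so the arithmetic stays exact. Both "before" functions work, but each carries its own copy of the tax calculation. The "after" versions share one named function, and the assertions at the end play the role of the existing tests that must keep passing.

```python
# Before refactoring: both functions work, but the tax rule is repeated.
def invoice_total_before(subtotal_cents):
    return subtotal_cents + subtotal_cents * 8 // 100


def quote_total_before(subtotal_cents, discount_cents):
    discounted = subtotal_cents - discount_cents
    return discounted + discounted * 8 // 100


# After refactoring: the shared rule lives in one place with one name.
TAX_PERCENT = 8  # hypothetical rate, for illustration only


def with_tax(amount_cents):
    return amount_cents + amount_cents * TAX_PERCENT // 100


def invoice_total(subtotal_cents):
    return with_tax(subtotal_cents)


def quote_total(subtotal_cents, discount_cents):
    return with_tax(subtotal_cents - discount_cents)


# The behavior is unchanged, which is what the tests are there to confirm.
assert invoice_total(10000) == invoice_total_before(10000) == 10800
assert quote_total(10000, 1000) == quote_total_before(10000, 1000) == 9720
```

If the tax rule changes later, there is now exactly one place to change it.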

Another benefit of unit testing is automated regression testing. This is the easiest benefit for project managers to see, but I think the further conceptualization of the code is by far the best benefit. I make heavy use of the automated regression testing, but I find that because the code is conceptualized so well before I write it, I rarely have to make large structural changes to the code base. Of course, since all the features of the app are unit-tested, I can make those changes freely when necessary.

Closing Thoughts

One of the most interesting changes that has taken place in the last couple of years, since I started practicing TDD and studying design patterns, is how seldom I use inheritance. When I first started programming in OOP languages, I used inheritance a lot to add functionality to existing objects. The result was that I would have inheritance chains 5 or 10 layers deep. One of the things that TDD and Patterns forced me to learn was to recognize when a class has too many responsibilities. I had to learn to create new, smaller classes, and define interaction patterns between them instead of relying on inheritance to solve my problems. I’ve reached a point now where inheritance is pretty near the last solution I reach for when trying to solve a programming problem. I’m not saying that inheritance is bad. I’m just observing that it’s not nearly so great a tool now that I have access to more powerful abstractions to use in my code.
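To show the kind of design I mean, here is a rough sketch with hypothetical classes of my own invention. Instead of a Report base class with a chain of subclasses for each combination of formatting and output, each small class has one responsibility, and Report simply holds the collaborators it needs.

```python
class UppercaseFormatter:
    """One responsibility: decide how the text looks."""

    def format(self, text):
        return text.upper()


class ListWriter:
    """One responsibility: decide where the text goes (here, into memory)."""

    def __init__(self):
        self.lines = []

    def write(self, line):
        self.lines.append(line)


class Report:
    """Coordinates its collaborators; it inherits from neither of them."""

    def __init__(self, formatter, writer):
        self.formatter = formatter
        self.writer = writer

    def publish(self, text):
        self.writer.write(self.formatter.format(text))


writer = ListWriter()
Report(UppercaseFormatter(), writer).publish("build passed")
print(writer.lines)  # prints ['BUILD PASSED']
```

Adding a new format or a new destination means writing one more small class, not another layer in an inheritance chain. It also makes each piece easy to unit test on its own.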
