Tuesday, December 18, 2007

Why coupling is bad.

There's a large difference between a 1000 line program and a 100000 line program. Most of it has to do with making an application's architecture scalable.

Uh-oh, someone asked me to introduce a new feature into my established codebase!

When programming there's twos sorts of new features. There's the kind of feature I like to call a "vertical" new feature and there's a "horizontal" new feature.

A vertical feature is one that doesn't really interact much with other code. If you picture the code for the feature as being coded as a stack of layers that only interact with one another you end up with a vertical stack. When this stack is added to the established codebase it doesn't make it much more complicated. I mean there's the complexity in the feature itself but that complexity is nicely contained it's own stack. It's almost as if it's another program that just happens to be compiled with the established codebase. If you were to make a change somewhere in the code of the existing codebase you wouldn't need to worry about breaking the new feature because there's practically no chance that what you're doing will affect that feature. Vertical features are self contained and are therefore lightly coupled with the established code.

Horizontal features are those that affect a large cross-section of the application. The best example of a horizontal feature in InteleViewer would be key images.

InteleViewer is a application that shows medical images like CTs or magnetic imaging or X-rays etc.. The thing is, key images are images that aren't really images. They are references to images. From the Viewer's perspective, it gets a command that says download image X. The Viewer goes "OK", downloads it and then is surprised to find out that image X is actually a reference to three other images Y, Z and Q. The Viewer then has to go out and transfer them.

Now this all sounds perfectly simple. In your head you can imaging that all you'd need to do is have the KeyImage code transparently get the image it's referring to and download it. Well, yeah.. except for that fact that KeyImages came relatively late in the history of the Viewer and the Viewer was making lots of assumptions about the nature of an images to implement other existing features. Here are a few issues:

- The KeyImage might refer to an image that's already loaded and these things are huge so we don't want to load them twice. We have to add some code to make sure we're not loading the same thing twice in the loading code itself instead of in the code calling the loader.
- We have multiple different protocols which we can use to download images. Some of them have their own constraints as to what's possible to do vis-a-vis downloading images. We have to be aware of this and deal with each source individually or try and build the key images code out of abstracted operations that already exist.
- If we had any protocols that we wrote that assumed we were only sending images we need to re-write them a bit.
- Since key images, on their own, are a file, but don't contain any image data, you can't blindly send the files themselves to image manipulation routines.
- We cache all images on disk but key images don't have any image data so we need to be aware of this in the cache code too.
- key images have filtering operations that apply to the underlying images, so these filtering operations have to be combinable.

..the list goes on. key images have introduced constraints all over the code and the more constraints you have the more likely the next feature you add will need to know about key images. Key images adds constraints across the application's loading and caching systems and therefore makes any code in those systems more complex and subtle than before.

Here's a question for you? Can we abstract away the annoyances of key images by clever use of layers and factories and such?

Well, every programmer should try and some of the things I mentioned can be hidden by interfaces and abstractions. The simple fact that adding KeyImages was possible, was because we'd worked hard trying to hide the complexity of the loading system. The thing is, there's a fundamental limit to what you can do with abstractions.

Consider this: You're abstractions are shaped by what is possible to do with thing(s) you're abstracting. It's fairly easy to come up with a case that makes abstraction impossible. Here's an example:

You want create a program that downloads a movie file from an existing website and then plays it. The system must allow for the movie to start playing as it is coming in. Unfortunately, one of these movie formats puts some important information at the end of the file. It's not possible to actually play the movie until it's arrived and the server you're talking to doesn't allow you to asks for specific bytes before others. Net result: you're doomed. No abstraction can save you because it's not possible to provide an implementation that will do what you want.

This exact same thing can happen in more subtle ways with horizontal features. If you have a feature that adds on a requirement that's a contradiction of an existing requirement there's no abstraction you can do that will fix it.

The first rule here is to make horizontal features into vertical features whenever humanly possible. If you don't you're application will get old before its time. I would go further and say "no" to horizontal features or even look for little used horizontal features to remove from your app. Functionality not being used? Every piece of functionality has some horizontal component. If it's not used it should be removed.

What we're doing here is actually reducing the feature's coupling with other components. Being paranoid about coupling is a very powerful idea. When programmers discover it they jump for joy and then go onto make applications that are much larger and more complicated then ever before.



Then they run into the next wall.. More about that later.

Part 2 - When reducing coupling goes bad.

No comments: