Minimizing ripple effects

In the good old times, when racing was dangerous, sex was safe and real programmers coded in machine language, we used jumps and branches to addresses, or to labels if you had more than just an ML monitor. BASIC just took this paradigm to a higher language level, and the art of programming involved tracing all the jumps and conditions at the same time, until Edsger Dijkstra spoiled it all, and rightly so! There are better ways to waste your testosterone and show you’re a real man than working successfully with untraceable spaghetti code that no one else understands, including yourself some months later.

Sadly, you can still create a tangled mess if you’re programming in Java or other OOP languages. Often you find yourself in a situation where a small change you need to make would break a lot of code, because it ripples through several classes. The higher the fan-in of your code, that is, the more places that depend on it, the worse it gets. This is why some code analysis tools punish high fan-in, which may be misleading if you take the DRY principle seriously. Especially if you work on frameworks, you need to take countermeasures against ripple effects so you can continue to develop your framework while it’s being used. Let me introduce you to a coding style that helps to reduce ripple effects.

Not exposing inner state

How often do you see code like this?

class Foo{
 private final List<Data> list = new ArrayList<Data>();
 public List<Data> getData(){
  return list;
 }
}

This code is simple and looks completely harmless, but it isn’t! The first thing I notice is that this class exposes inner state: it hands out its list, which is mutable. Any class could get the list and modify it, potentially breaking your class. Your class cannot control its inner state in any way, which also means that you cannot make a class like this thread-safe. You cannot even work on this list safely from within the class, because modifying the list from the inside will potentially break every class that uses it. Even switching to another List implementation can ruin the performance of any algorithm that works on this list. And what happens if you decide to use a set instead? Every class that expected a list would need to adapt! As a user of this class, you would need to copy the returned list into your favourite data structure to be safe.

class Bar{
 private final Foo foo = new Foo();
 public void doSomething(){
    //might break foo:
    foo.getData().add(new Data()); 

    //runtime is sensitive to changes in the list type:
    final Iterator<Data> itr = foo.getData().iterator();
    while(itr.hasNext()){
     final Data data = itr.next();
     if(condition(data)){
      itr.remove();
     }
    }

    //what foo returns might not be what you like,
    //so you start copying or wrapping:
    final Set<Data> set = new TreeSet<Data>(foo.getData());    
  }
}

Now let’s imagine you had to change Foo from using lists to using sets:

class Foo{
 private final Set<Data> set = new HashSet<Data>();
 public List<Data> getData(){ //legacy API
  return new ArrayList<Data>(set);
 }
}

Wait… this one will break every class which relies on the fact that the getter always returns the same list instead of a copy! So you need to do this:

class Foo{
 private final List<Data> list = new ArrayList<Data>();
 public List<Data> getData(){
  return list;
 }
 private void doSomeWork(){
  final Set<Data> set = new HashSet<Data>(list);
  //work with data
  list.clear();
  list.addAll(set);
 }
}

You can see the ugly creeping into the source code, can’t you? No matter what you’re trying to achieve, problems are all over you. You can’t change the return type of the method without hassle, and you can’t even change the inner workings of your class without running into problems. Imagine a full-blown, complex application with thousands of lines of code, all built on classes like this, and now look at your Eclipse workspace. Do you see something familiar? I’m sure you do! And I guess that would be a safe bet, because I see this happening everywhere. Can a programming style suck more than that? Why is it that even experienced programmers keep using this style, even though they must feel the pain every time they need to change their code? I guess this pattern is used a lot because beans are used a lot, because it is quickly written without much boilerplate code, and because it’s simply a bad habit. The worst thing is: I can’t deny that I fall into the same pit over and over again. I’m trying to improve though, as the solution is astonishingly simple.
Let’s get rid of the problems by simply moving the responsibility for copying the list from outside of the class to the inside, pepper it with generics and serve it:

class Foo{
 private final List<Data> list = new ArrayList<Data>();
 public <T extends Collection<? super Data>> T dataTo(final T target){
  for(final Data data : list){
   target.add(data);
  }
  return target;
 }
}

Any user is now free to choose the collection type they like and can control which collection instances share the data and which don’t. The user-side code doesn’t look too bad either:

class Bar{
 private final Foo foo = new Foo();
 public void doSomething(){
    final Set<Data> set = foo.dataTo(new TreeSet<Data>());
    //...    
  }
}

The class Foo can be thread-safe, and changing the collection implementation in Foo won’t affect the performance of any class that obtained data from it. It will not break the build of dependent classes, and you don’t need ugly workarounds inside Foo to honor a legacy API: you could even make Foo use a HashMap internally, and all you would need to change is the method, so that it copies the HashMap’s values instead. And if you need to share state with Foo, you can still use listeners, at the expense of some annoying boilerplate code.
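To make the HashMap case concrete, here is a minimal sketch of what such a Foo might look like. The Data class, its key field, the add method and the DataToDemo class are assumptions made up for this illustration; only the dataTo signature comes from the code above. The synchronized keyword is just one simple way to let Foo guard its own state.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical stand-in for the article's Data type.
final class Data implements Comparable<Data> {
    final String key;
    Data(final String key) { this.key = key; }
    public int compareTo(final Data other) { return key.compareTo(other.key); }
}

class Foo {
    // The internal representation is now a Map; callers never see it.
    private final Map<String, Data> byKey = new HashMap<String, Data>();

    // Hypothetical mutator; synchronized so that Foo alone guards its state.
    public synchronized void add(final Data data) {
        byKey.put(data.key, data);
    }

    // Same signature as before: copies the current values into the caller's collection.
    public synchronized <T extends Collection<? super Data>> T dataTo(final T target) {
        for (final Data data : byKey.values()) {
            target.add(data);
        }
        return target;
    }
}

class DataToDemo {
    public static void main(final String[] args) {
        final Foo foo = new Foo();
        foo.add(new Data("b"));
        foo.add(new Data("a"));

        final List<Data> copy = foo.dataTo(new ArrayList<Data>());
        copy.clear(); // only empties the caller's copy, Foo is untouched

        final TreeSet<Data> sorted = foo.dataTo(new TreeSet<Data>());
        System.out.println(sorted.size() + " " + sorted.first().key); // prints "2 a"
    }
}
```

Callers written against the list-based Foo keep compiling, because they only ever depended on dataTo and on the collection they passed in themselves.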
But wait! Doesn’t "copy on get" mean you waste performance? Sure. But how often does performance matter, and how often do ripple effects matter? If you measure a real performance problem somewhere, you can still tune that particular class, use a caching proxy or build a workaround. It’s better to have a small number of workarounds for performance problems than hundreds of them for structural problems.

Hiding Structure

This is very similar to the previous section, but concentrates on the data structure. Let’s take a look at a piece of code you might see once in a while, even after applying the lesson from above:

class Foo{
 public void doSomething(final Fizz fizz){
    final Iterable<Data> datas
     = fizz.getBar().getBuzz().dataTo(new ArrayList<Data>());
    for(final Data data : datas){
     if(attrib.equals(data)){
      //...
     }
    }
  }
}

The problem is almost the same as with the list example. As you can easily see, Foo depends on three other classes: Fizz, Bar and Buzz. As soon as any of these classes needs a change in its API, for example if Bar needs to hold a list of Buzz objects, Foo has to change as well. This means Foo is a target of ripple effects. To avoid that, the Law of Demeter should always be in the back of your mind. The Law of Demeter roughly says that an object should only talk to its closest friends, which are its own fields, its own parameters and objects it created itself. It’s a rule that is sometimes hard to follow, and you shouldn’t waste your time hacking up your code like mad just to follow it perfectly. It’s enough to follow its intent: keep dependency chains very short!
The example above violates the Law of Demeter, because Foo talks to Bar, even though it has to ask Fizz for it. It would be better if Foo only talked to Fizz and let Fizz handle its business with Bar and Buzz.
As a result the code could look like this:

class Foo{
  public void doSomething(final Fizz fizz){
    final Iterable<Data> datas= fizz.dataTo(new ArrayList<Data>());
    for(final Data data : datas){
     if(attrib.equals(data)){
      //...
     }
    }
  }
}

Alas, the example above means that Fizz has to delegate the call to Bar, and Bar has to delegate the call to Buzz, which obviously creates a lot of boilerplate code. If you want to avoid some of that, it may be a good idea to eliminate only some of the dependencies rather than all of them. Still, the boilerplate may be worth it: since Foo has no business with any of the intermediate objects any more, this style rids us of some of the ripple effects.
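The delegation chain described above could be sketched roughly like this. The class bodies, constructors and the add method are assumptions for illustration, since the article never shows Fizz, Bar or Buzz themselves; only the dataTo signature is taken from the earlier example.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in for the article's Data type.
final class Data {
    final String name;
    Data(final String name) { this.name = name; }
}

// Buzz owns the data and is the only class that knows how it is stored.
class Buzz {
    private final List<Data> list = new ArrayList<Data>();

    void add(final Data data) { list.add(data); }

    <T extends Collection<? super Data>> T dataTo(final T target) {
        for (final Data data : list) {
            target.add(data);
        }
        return target;
    }
}

// Bar only forwards the call to its Buzz.
class Bar {
    private final Buzz buzz;
    Bar(final Buzz buzz) { this.buzz = buzz; }

    <T extends Collection<? super Data>> T dataTo(final T target) {
        return buzz.dataTo(target);
    }
}

// Fizz only forwards the call to its Bar; its callers never learn about Bar or Buzz.
class Fizz {
    private final Bar bar;
    Fizz(final Bar bar) { this.bar = bar; }

    <T extends Collection<? super Data>> T dataTo(final T target) {
        return bar.dataTo(target);
    }
}
```

Each forwarding method is a single line, but you pay for one per class in the chain, which is exactly the boilerplate the paragraph above complains about.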
Unfortunately, Foo still has to iterate over a copy of Fizz’ data, which means that this approach will fail if the Data objects are stored in a structure that is not easy to iterate, say, because it’s so huge that it won’t fit into memory. Therefore it might be a good idea to just pass the algorithm and let the data owner do the iteration. We wouldn’t need to copy anything either, which is good for the application’s performance and memory consumption. This is how it works:

interface Function<R,A>{
  R execute(A param);
}

class Foo{
  public void doSomething(final Fizz fizz){
    final Function<Void, Data> something = new Function<Void, Data>(){
      public Void execute(final Data candidate){
        if(attrib.equals(candidate)){
         //...
        }
        return null;
      }
    };
    fizz.forEachDataDo(something);
  }
}

Alas, this solution requires a new interface, which means even more code, and creating a class that implements the interface is rather cumbersome and hard to read.
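For completeness, here is a sketch of what the data owner's side of forEachDataDo might look like, together with a caller. The body of Fizz, the Data class, the String attrib and the matches counter are assumptions for illustration; the Function interface is the one from the example above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the article's Data type.
final class Data {
    final String name;
    Data(final String name) { this.name = name; }
}

interface Function<R, A> {
    R execute(A param);
}

class Fizz {
    private final List<Data> list = new ArrayList<Data>();

    void add(final Data data) { list.add(data); }

    // The data owner does the iteration; nothing is copied or handed out.
    void forEachDataDo(final Function<Void, Data> function) {
        for (final Data data : list) {
            function.execute(data);
        }
    }
}

class Foo {
    private final String attrib; // hypothetical attribute to compare against
    private int matches = 0;     // hypothetical result of the "algorithm"

    Foo(final String attrib) { this.attrib = attrib; }

    void doSomething(final Fizz fizz) {
        fizz.forEachDataDo(new Function<Void, Data>() {
            public Void execute(final Data candidate) {
                if (attrib.equals(candidate.name)) {
                    matches++;
                }
                return null;
            }
        });
    }

    int matches() { return matches; }
}
```

Fizz could later move its data to a set, a file or a database cursor without Foo noticing, because Foo only supplies the function and never touches the structure.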

Enter Closures

With Java 7, Sun promised to bring closures to the Java language. Closures, or lambda expressions, will in most situations save you from writing yet another interface. Following the recent Java closure proposal by Alex Buckley, the code above would look as follows:

class Foo{
  public void doSomething(final Fizz fizz){
    final #void(Data) something = #(Data data){ 
     if(attrib.equals(data)){
      //...
     }
   };
   fizz.forEachDataDo(something);
  }
}

Or shorter:

class Foo{
  public void doSomething(final Fizz fizz){
   fizz.forEachDataDo(#(Data data){ 
    if(attrib.equals(data)){
     //...
    }
   });
  }
}

The code with closures is almost as short as the naïve original, but it has none of its shortcomings, apart from the delegation. So there we are, stuck between a rock and a hard place: this code doesn’t look quite as readable and requires more boilerplate than the first, naïve example, but it decouples far better, which will pay off when you have to apply changes later. Since we don’t have closures yet, and the interface is therefore still necessary for this coding style, it is especially hard to break the habit of writing code the way you always have. You should consider it though.
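For reference, lambda expressions eventually shipped in Java 8 rather than Java 7, with an arrow syntax instead of the # notation of the proposal, and the standard java.util.function.Consumer interface can stand in for the hand-written Function. A sketch of the same idea in that syntax follows; as before, the Data class, the body of Fizz, the String attrib and the matches counter are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for the article's Data type.
final class Data {
    final String name;
    Data(final String name) { this.name = name; }
}

class Fizz {
    private final List<Data> list = new ArrayList<>();

    void add(final Data data) { list.add(data); }

    // The data owner iterates; Consumer replaces the custom Function interface.
    void forEachDataDo(final Consumer<? super Data> action) {
        list.forEach(action);
    }
}

class Foo {
    private final String attrib; // hypothetical attribute to compare against
    private int matches = 0;     // hypothetical result of the "algorithm"

    Foo(final String attrib) { this.attrib = attrib; }

    void doSomething(final Fizz fizz) {
        fizz.forEachDataDo(data -> {
            if (attrib.equals(data.name)) {
                matches++;
            }
        });
    }

    int matches() { return matches; }
}
```

With this syntax the decoupled version is hardly longer than the naïve one, so the readability argument against it largely disappears.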
