Thursday, 29 March 2012

Is "Copy-Paste" really a problem? The DRY ("Don’t Repeat Yourself") Principle

We’ve all seen it. We’ve all done it. Duplication! But why? In this post I will discuss some ways duplication happens in code. But first, why do we care about duplication?


It has been my experience that the majority of my time is spent maintaining code, even when writing a new application. Maintenance is not going away anytime soon. Therefore, the way to keep maintenance costs down is to speed up the task. The key to that is to make sure that “every piece of knowledge must have a single, unambiguous, authoritative representation within a system”.

The opposite of the statement above is to express the same knowledge in multiple places, which creates parallel maintenance: when one is changed, the other must change. Can you say, “Hello bugs!”?

This is where the “Don’t Repeat Yourself” (DRY) principle gets its roots. There are several ways that duplication happens. Here are a few:
  • When we are rushed or impatient. Sometimes eliminating duplication requires more time and effort. To remove duplication we have to modify our design. Whether we extract a method or extract a class, something has to change. But we say “it can wait, because it’s working and we don’t want to break it”.
  • When we follow examples or ‘patterns’ that already exist in code. This is the easiest duplication to spot, but sometimes the hardest to remove. “Copy and paste is a design error”, says David Parnas. Instead, when possible, we should follow object-oriented principles and use code generation tools to remove duplication.
  • When we are unfamiliar with the code, for example because we work in a team or are new to a language. Sometimes we don’t know what classes and methods are available. Without due diligence during research we create duplication, because we WILL write it ourselves.
We must head off this maintenance nightmare by treating duplication as evil. Remove duplication as soon as it’s spotted.
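
To make "extract a method" concrete, here is a minimal sketch. The class name Pricing, the 20% VAT rate and the two line formats are made up for illustration. The point is only that the price rule lives in one method, so a rate change touches one place.

```java
// Hypothetical example: every name and the VAT rate are assumptions.
final class Pricing {
    // Stated once, so a change of rate happens in exactly one place.
    static final long VAT_PERCENT = 20;

    // The single authoritative representation of "gross price from net price".
    static long grossCents(long netCents) {
        return netCents * (100 + VAT_PERCENT) / 100;
    }

    // Both callers reuse grossCents instead of repeating the formula.
    static String invoiceLine(String item, long netCents) {
        return item + ": " + grossCents(netCents) + " cents";
    }

    static String receiptLine(String item, long netCents) {
        return item + " (incl. VAT): " + grossCents(netCents) + " cents";
    }
}
```

Had the formula been pasted into both invoiceLine and receiptLine, a new rate would have to be applied twice, and forgetting one of them is exactly the "Hello bugs!" case.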

Happy Coding!

Sunday, 11 March 2012

Dependency Inversion Principle. Is it IoC?

Today we are going to talk about one of the most confusing topics of all and see if we can unravel the mess of Dependency Inversion, Inversion of Control and Dependency Injection.

It’s not completely important that we understand the specifics of each of these names, because many people end up using them interchangeably. It’s pretty unlikely that we are going to correct all that in this blog post.

What is important is that we understand the concepts and ideas behind each of these topics and understand what architectural problems each one is trying to solve.


Today, we are going to focus on Dependency Inversion and get a little bit into inversion of control.


Dependency inversion is the root

It all started with this simple concept introduced by Uncle Bob in his article in the C++ Report of May 1996, The Dependency Inversion Principle.

This principle is actually very simply stated:
  • High-level modules should not depend on low-level modules. Both should depend on abstractions.
  • Abstractions should not depend upon details. Details should depend upon abstractions.

I think this concept is easily misunderstood and misapplied, because the reason why we apply this principle is often neglected.

If you are familiar with IoC and Dependency Injection, you can see how both are based on this definition of Dependency Inversion.


What problem does dependency inversion solve?

Dependency inversion solves the problem of higher level modules being dependent on and coupled to the interfaces of lower level modules and their details.

Let me give you a real world example of a current problem that could be solved by dependency inversion.

Take a look around your house and count up all the devices that have batteries that must be charged somehow.

Things like:
  • Digital camera
  • Cell phone
  • Camcorder (flip cam)
  • Wireless headphones
  • Game controllers

What do all of these things have in common? They don’t have a charging interface in common. Some use micro-USB, some use mini-USB, some use their own funky plug.

So as a result, you can’t just have one thing that charges all your devices. You have to have a different thing for each device. Your home’s “charging mobile devices module” is dependent on the device. If you change your cell phone, you need a new charger. If you upgrade your camera, you need a new charger.

The dependency is going the wrong way. Lower-level appliances are defining the interface that your home has to use to charge them. Your home’s charging capability should define the interface the devices have to use. The dependency should be inverted. It would make your life a lot easier.

Let’s look at one more example of a place where I would bet dependency inversion is used. Now, I don’t know about Walmart’s IT structure, but I would venture to guess that when they receive invoices from all of their many distributors the invoices come in the file format that Walmart specifies and not the other way around.

I would bet that Walmart specifies the schema for all of the data it receives from its many business partners. Let’s assume they do. In this case they have inverted the dependency; actually, they have inverted the control. Instead of their vendors controlling their interface, Walmart controls the vendors through their interface.

What this means is that every time a vendor changes their internal system they still have to conform to Walmart’s interface, instead of Walmart having to make changes to accommodate each vendor’s changes to their format.

Back to your code…

Now let’s look at a code example to see how dependency inversion helps us out. Let’s say you are creating a high level module for parsing log files and storing some basic information into a database.

In this case you want to be able to handle several different log files from a number of different sources and write some common data they all share to a database.
One approach to this kind of problem is to have your module handle each kind of log file based on what kind of data and format it contains and where it is. Using this approach, in your module you would handle various kinds of log files based on the interface those individual log files present to you. (When I use interface here, I am not talking about the language construct, but the concept of how we interface with something.)

Using this approach, in our module we might have a switch statement or series of if-else statements that lead us to a different code path depending on what kind of log file we are processing. For one log file we might open up a file on disk, and read a line, then split that line based on some delimiter. For another perhaps we open a database connection and read some rows.

The problem is the log files are defining the interface our higher level code has to use. They are in effect “in control” of our code, because they are dictating the behavior our code must conform to.

We can invert this control, and invert the dependencies by specifying an interface that the log files we process must conform to. We don’t even have to use a language level interface.

We could simply create a data class called LogFile that is the input to our module. Anyone who wanted to use our module would first have to convert their files to our format.

We could also create an ILogFileSource interface that classes could implement to contain the logic of parsing log files from different sources. Our module would depend on ILogFileSource and specify what kind of methods and data it needs to parse the log files instead of the other way around.

The key point here is that our high level module should be controlling the interface (non language construct kind) that the lower level modules need to adhere to instead of being at the whim of the interfaces of each lower level module.

One way to think of this is that lower level modules provide a service to higher level modules. The higher level module specifies the interface for that service and the lower level module provides that service.
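
Here is a rough sketch of what that could look like in Java. Only the name ILogFileSource comes from the description above. LogEntry, CsvLogFileSource, LogFileParser and all of their members are hypothetical, and the in-memory list merely stands in for the database.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical data class owned by the high-level module: the common data
// every log source has to deliver.
final class LogEntry {
    final String timestamp;
    final String message;

    LogEntry(String timestamp, String message) {
        this.timestamp = timestamp;
        this.message = message;
    }
}

// The high-level module defines this interface; sources must conform to it.
interface ILogFileSource {
    List<LogEntry> readEntries();
}

// One low-level detail (hypothetical): comma-separated lines "timestamp,message".
final class CsvLogFileSource implements ILogFileSource {
    private final List<String> lines;

    CsvLogFileSource(List<String> lines) {
        this.lines = lines;
    }

    @Override
    public List<LogEntry> readEntries() {
        List<LogEntry> entries = new ArrayList<>();
        for (String line : lines) {
            String[] parts = line.split(",", 2);
            entries.add(new LogEntry(parts[0], parts.length > 1 ? parts[1] : ""));
        }
        return entries;
    }
}

// The high-level module: no switch on the kind of log file, it only knows
// ILogFileSource. The list stands in for the database.
final class LogFileParser {
    private final List<LogEntry> stored = new ArrayList<>();

    void process(ILogFileSource source) {
        stored.addAll(source.readEntries());
    }

    int storedCount() {
        return stored.size();
    }
}
```

Adding a database-backed or event-log source would mean writing another ILogFileSource implementation. LogFileParser itself would not change.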

One thing I want to point out in this example is that we knew there would be more than one log file source. If we were writing a log file parsing module that was only ever going to work against one source, it might not be worth trying to invert this dependency, because we wouldn’t see any benefit from doing so. It isn’t very hard for us to write our code as cleanly as possible working with one source and then refactor it later to invert the dependencies once we have additional sources.

Just because you can invert dependencies doesn’t mean you should.

In this case since we are always writing to a database, I don’t feel any particular need to invert our dependency on writing out the log files. However, there is some real value in encapsulating all of our code that interacts with the database into one place, but that is for another post.

Notice we haven’t talked about unit testing yet

You see, the problem of dependency inversion and inversion of control has nothing specifically to do with unit testing.

Simply slapping an interface on top of a class and injecting it into another class may help with unit testing, but it doesn’t necessarily invert control or dependencies.

I want to use the log parsing example to illustrate my point. Let’s say we had created our log parser to have a switch statement to handle each type of log file, and now we want to unit test the code.

There is no reason why we can’t create IDatabaseLogFile, ICSVFileSystemLogFile, IEventLogLogFile and IAnNotReallyDoingIoCLogFile, pass them all into the constructor of our LogFileParser as dependencies and then write our unit tests passing in mocks of each.

That is an extreme example for sure, but the point is slapping an interface onto a class does not an IoC make.

We shouldn’t be trying to implement this principle to make it easier to write unit tests. Unit tests that are difficult to write should give us hints like:
  • Our class is trying to do too much
  • Our class has lots of different dependencies
  • Our class requires a lot of setup to do work
  • Our class is just like this other class that does the same thing only for a different input

All of these kinds of hints tell us that we might want to invert control and invert dependencies to improve the overall design of our class, not because it makes it easier to test. (Although it should also make it easier to test.)

Ok, ok, so is dependency inversion the same as inversion of control or what?

Short answer: yes.

It depends on what you mean by control. There are three basic “controls” that can be inverted.
  1. The control of the interface. (How do these two systems, modules, or classes, interact with each other and exchange data?)
  2. The control of the flow. (What controls the flow of the program? This control inversion happens when we go from procedural to event driven.)
  3. The control of dependency creation and binding. (This is the kind of inversion of control IoC containers do. This inversion is passing the control of the actual creation of and selection of dependencies to a 3rd party which is neutral to either of the other 2 involved.)
Each of these 3 is a specific form of dependency inversion and may even involve multiple kinds of dependencies being inverted.
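
As a small illustration of the second kind, inverting the control of the flow, here is a hypothetical sketch. The Button class and its methods are invented for this example. Instead of our code polling or calling things in a fixed order, we hand a callback to the button and it decides when our code runs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical event source: it owns the flow and calls back into our code.
final class Button {
    private final List<Runnable> listeners = new ArrayList<>();

    // Our code registers what should happen...
    void onClick(Runnable listener) {
        listeners.add(listener);
    }

    // ...and the button decides when it happens.
    void click() {
        for (Runnable listener : listeners) {
            listener.run();
        }
    }
}
```

A caller would write something like button.onClick(() -> System.out.println("clicked")) and then wait for the button to invoke it.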

So when someone says “inversion of control”, you should be thinking “what control is being inverted here?”

Dependency inversion is a principle that we use in architecting software.

Inversion of control is a specific pattern that is applied to do so.

Most people only think of inversion of control as #3 above, inverting the control of dependency creation and binding. This is where IoC containers and dependency injection take root.

What can we learn from this?

My goal is that we stop grouping the concepts of inversion of control and dependency inversion automatically with dependency injection.

We have learned that dependency inversion is the core principle that guides many of the other practices that have derived from it.

Whenever we apply a pattern we should be looking for the core principle it is tied to and what problem it is helping us solve.

With this base understanding of dependency inversion and inversion of control, we have the prerequisite knowledge to look at dependency injection and understand better what specific problem it tries to solve.

Saturday, 10 March 2012

The Wizard Design Pattern


We all love wizards... (Software wizards I mean). We are always happy to jump on those "Next" buttons like we were dancing the funky chicken on our… well you get the point. So today we bring your beloved wizard into your coding experience. Let's jump right into an example.

Say you want to design a ConservativePerson class.

import java.util.List;

class ConservativePerson {
    private boolean isVirgin;
    private boolean isMarried;
    private List<String> children;

    ConservativePerson(boolean virgin, boolean married, List<String> children) {
        this.isVirgin = virgin;
        this.isMarried = married;
        this.children = children;
    }

    public boolean isVirgin() {
        return isVirgin;
    }

    public boolean isMarried() {
        return isMarried;
    }

    public List<String> getChildren() {
        return children;
    }
}

As such it has some constraints.

  • He must be married before he can be... well, not a virgin.
  • He must not be a virgin before he can have children (as far as we know).

In the old days, which is basically all days until today..., you would probably define all kinds of modifier methods for this class, which would throw an exception in case of an invariant violation, such as NotMarriedException and VirginException. Not anymore.
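
For comparison, the "old days" approach might look roughly like the sketch below. The two exception names come from the paragraph above. MutableConservativePerson and its methods are hypothetical. Note that a violation only shows up when the code runs.

```java
import java.util.ArrayList;
import java.util.List;

final class NotMarriedException extends RuntimeException {
    NotMarriedException() {
        super("must be married first");
    }
}

final class VirginException extends RuntimeException {
    VirginException() {
        super("a virgin cannot have children");
    }
}

// Hypothetical mutable variant: each modifier checks the invariants at runtime.
final class MutableConservativePerson {
    private boolean virgin = true;
    private boolean married = false;
    private final List<String> children = new ArrayList<>();

    void marry() {
        married = true;
    }

    void loseVirginity() {
        if (!married) {
            throw new NotMarriedException();
        }
        virgin = false;
    }

    void addChild(String name) {
        if (virgin) {
            throw new VirginException();
        }
        children.add(name);
    }

    List<String> getChildren() {
        return children;
    }
}
```

Nothing stops a caller from writing addChild before marry. The compiler accepts it and the mistake surfaces as an exception at runtime.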

Today we will do it by using the Wizard Design Pattern. We use a fluent interface style and utilize the power of a modern IDE to create a wizard-like feeling when building a ConservativePerson object. We know, we know, stop talking and show us the code... but before we present the wizard code, we will show you its usage so you get a grasp of what we are talking about...

public class Main {
    public static void main(String[] args) {
        ConservativePersonWizardBuilder wizard = new ConservativePersonWizardBuilder();
        ConservativePerson singlePerson = wizard.
                createConservativePerson().
                whichIsSingle().
                getObject();
        ConservativePerson familyManPerson = wizard.
                createConservativePerson().
                whichIsMarried().
                andNotVirgin().
                andHasChildNamed("Noa").
                anotherChildNamed("Guy").
                lastChildName("Alon").
                getObject();
    }
}

Now, it may look just like an ordinary fluent interface, but the cool thing here is that a method is available for calling only if the current object state allows it. This means you will not be able to call the method andNotVirgin if you haven't called the method whichIsMarried.
The IDE's auto-completion makes this visible: right after createConservativePerson() only whichIsMarried and whichIsSingle are offered, and after we state he is married we can choose between andNotVirgin and butStillAVirgin.

Here is the wizard code. I urge you to copy/paste it to your IDE and give it a try by building an object with it.

import java.util.ArrayList;
import java.util.List;

public class ConservativePersonWizardBuilder {
    private boolean isVirgin;
    private boolean isMarried;
    private List<String> children;

    // Entry point. Resets the state so that one wizard can build several
    // objects without the children of one leaking into another.
    public SetMarriedStep createConservativePerson() {
        isVirgin = true;
        isMarried = false;
        children = new ArrayList<String>();
        return new SetMarriedStep();
    }

    // Step 1: marital status.
    class SetMarriedStep {
        public SetVirginStep whichIsMarried() {
            isMarried = true;
            return new SetVirginStep();
        }

        public FinalStep whichIsSingle() {
            isMarried = false;
            isVirgin = true; // constraint: an unmarried person is still a virgin
            return new FinalStep();
        }
    }

    // Step 2: only reachable for a married person.
    class SetVirginStep {
        public AddChildrenStep andNotVirgin() {
            isVirgin = false;
            return new AddChildrenStep();
        }

        public FinalStep butStillAVirgin() {
            isVirgin = true;
            return new FinalStep();
        }
    }

    // Last step: the only place where the object can be obtained.
    class FinalStep {
        public ConservativePerson getObject() {
            return new ConservativePerson(isVirgin, isMarried, children);
        }
    }

    // Step 3: only reachable for a person who is not a virgin.
    class AddChildrenStep {
        public AddChildrenStep andHasChildNamed(String childName) {
            children.add(childName);
            return new AddChildrenStep();
        }

        public AddChildrenStep anotherChildNamed(String childName) {
            children.add(childName);
            return new AddChildrenStep();
        }

        public FinalStep lastChildName(String childName) {
            children.add(childName);
            return new FinalStep();
        }
    }
}

As you can see the wizard consists of several steps. Each step is represented by a dedicated inner class. Each step reveals the legal available operations by its methods. Each method will then return a new step according to the change it has made. This way an attempt to create an illegal object will be detected at compile time instead of runtime.

This pattern is actually being used in our production code. One example that comes to mind is the MediaJob class. This class describes a manipulation on some media files. In order to submit a job to the system, one has to create a MediaJob object. The problem is that this object has many parameters that could be assigned contradicting values that create an illegal object state. By using the Wizard pattern, one can easily build a legal job without the need to know the entire (and complicated…) set of constraints.

That is all for now. Hope you'll give it a try... We plan to write a more formal description of it (GoF style) in the near future.

Sunday, 4 March 2012

Do it short but do it right !! (Java Best Practices)

Writing concise, elegant and clear code has always been a difficult task for developers. Not only will your colleagues be grateful to you, but you will also be surprised to see how exciting it is to constantly look for ways to refactor solutions in order to do more (or at least the same) with less code. It is often said that good programmers are lazy programmers. True, true... But really good programmers add beauty to it.
You can easily improve the readability of your code, exploiting the power of the Java language, even for pretty basic things.

Let's start with a concrete example:
String color = "green";
...
if (color != null && color.equals("red")) {
    System.out.println("Sorry, red is forbidden !");
}
One of the first lessons you probably learned from your Java (or Object-Oriented programming) teacher is the importance of testing the nullity of an object before invoking a method on it. Null pointer exceptions (NPEs) are indeed among the most common (and irritating) faults raised in the code of object-oriented languages.

In the above example, it is prudent to ensure the 'color' String object is not null before comparing it to a constant. I personally have always considered this an unnecessary burden on the programmer, especially for modern OO languages such as Java. As a workaround, there exists a (really stupid) trick for rewriting the condition without having to test for nullity. Remember, the equals() method is symmetric (if a=b then b=a), and calling it on the constant means the receiver can never be null.
if ("red".equals(color)) {
    System.out.println("Sorry, red is forbidden !");
}
At first glance, it might seem a bit unnatural when read, but eliminating the polluting code is certainly not worthless.
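
A small sketch to convince yourself that the trick is null-safe. NullSafeEquals and its two methods are made-up names. Objects.equals from java.util (available since Java 7) is shown as an alternative for the case where neither side is a constant.

```java
import java.util.Objects;

// Hypothetical helper class, for illustration only.
final class NullSafeEquals {
    // The constant is the receiver, so a null argument simply yields false.
    static boolean isRed(String color) {
        return "red".equals(color);
    }

    // When both sides are variables, Objects.equals handles null on either side.
    static boolean sameColor(String a, String b) {
        return Objects.equals(a, b);
    }
}
```

Calling isRed(null) returns false instead of throwing a NullPointerException, and sameColor(null, null) returns true.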

Let's continue our example, and imagine we now want to compare our color with multiple values. Java beginners would usually write something like:
if ("red".equals(color) ||
    "yellow".equals(color) ||
    "blue".equals(color)) {
    System.out.println("This is a primary color");
}
I have sometimes met more experienced Java programmers who shorten such a long if statement to:
if ("red|yellow|blue".indexOf(color) >= 0) {
    System.out.println("This is a primary color");
}
Smart isn't it ? Not that much actually. Playing with substrings can be a dangerous game. For instance the following code might not give the expected results, especially if you are a man:
String type = "man";
...
if ("woman|child".indexOf(type) >= 0) {
    System.out.println("Women and children first !");
}
If you are looking for a good balance between elegance and readability, you had better opt for one of the following alternatives.
import java.util.HashSet;
import java.util.Set;

public static final Set<String> PRIMARY_COLORS;
static {
    PRIMARY_COLORS = new HashSet<String>();
    PRIMARY_COLORS.add("red");
    PRIMARY_COLORS.add("yellow");
    PRIMARY_COLORS.add("blue");
}
...
if (PRIMARY_COLORS.contains(color)) {
    System.out.println("This is a primary color");
}
Few people know it, but there is still a way to reduce code verbosity when initializing the Set of primary colors:
public static final Set<String> PRIMARY_COLORS = new HashSet<String>() {{
    add("red");
    add("yellow");
    add("blue");
}};
In the event concision of code becomes an obsession, the Java Collections Framework can also come to the rescue:
import java.util.Arrays;
import java.util.Collection;

public static final Collection<String> PRIMARY_COLORS = Arrays.asList("red", "yellow", "blue");
...
if (PRIMARY_COLORS.contains(color)) {
    System.out.println("This is a primary color");
}
The final keyword prevents the PRIMARY_COLORS variable from being re-assigned to another collection of values – this is particularly important when your variable is defined as public. If security is a major concern, you should also wrap the original collection into an unmodifiable collection. This will guarantee a read-only access.
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;

public static final Collection<String> PRIMARY_COLORS =
    Collections.unmodifiableCollection(Arrays.asList("red", "yellow", "blue"));
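
To see the read-only guarantee in action, here is a minimal sketch. The class name PrimaryColors and the helper tryToAdd are invented for this example. Any attempt to modify the wrapped collection fails with an UnsupportedOperationException.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;

// Hypothetical holder class, for illustration only.
final class PrimaryColors {
    static final Collection<String> PRIMARY_COLORS =
        Collections.unmodifiableCollection(Arrays.asList("red", "yellow", "blue"));

    // Returns true if the color could be added, false if the collection refused.
    static boolean tryToAdd(String color) {
        try {
            PRIMARY_COLORS.add(color);
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }
}
```

The final keyword alone would not give you this. It only stops the variable from being re-assigned, not the collection's content from being changed.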
It must be noted that, though more readable, using a collection of values (especially with large collections) will generally remain slower than classical lazy ORs (i.e. using '||' instead of '|') because of the short-circuit evaluation. Nowadays such considerations have become largely futile.

After 16 years of complaints, Java 7 has - at last! - introduced the support of String in switch-case statements. This allows us to code things such as:
boolean primary;
switch (color) {
    case "red":
        primary = true; break;
    case "yellow":
        primary = true; break;
    case "blue":
        primary = true; break;
    default:
        primary = false;
}
if (primary) System.out.println("This is a primary color");
Let us finally end with what is probably the most object-oriented solution to our (so to say) problem. Java enumerations are primarily classes, and can therefore have methods and fields just like any other classes. By applying the Template Method design pattern, one can define an abstract method (modeling the test) which has to be implemented by all subclasses (modeling the response of the test applied to a particular item of the enumeration):
Color c = Color.valueOf("RED");
if (c.isPrimaryColor()) {
    System.out.println("This is a primary color");
}

public enum Color {
    RED() {
        @Override
        public boolean isPrimaryColor() {
            return true;
        }
    },
    BLUE() {
        @Override
        public boolean isPrimaryColor() {
            return true;
        }
    },
    YELLOW() {
        @Override
        public boolean isPrimaryColor() {
            return true;
        }
    },
    GREEN() {
        @Override
        public boolean isPrimaryColor() {
            return false;
        }
    };

    public abstract boolean isPrimaryColor();
}
The resulting code is clear and self-documenting. Using this pattern is a great alternative in many cases to the more common “if - else if” logic since it is easier to read, extend, and maintain.

To conclude, as very often - and this is the power of the Java language – for one problem, there exist so many different solutions in terms of implementation. But deciding which one is the best is another story...