The Agile Architect

Enforcing Unique Keys with NULL values

Users of MySQL and some other databases often find themselves in this situation: You need to create a unique key as a combination of table columns but some of them are nullable. The problem is that uniqueness is not enforced in the way you would expect.

For example, suppose you have a table called SONG with columns TITLE, ARTIST, and ALBUM. You postulate that the combination of these three columns should uniquely identify a song, and you want to enforce this constraint in your database model. Suppose, however, that the ALBUM column sometimes has no value because some songs were never released on an album. In this case MySQL would let you insert two rows such that TITLE='song', ARTIST='Band', ALBUM=NULL, because a unique index treats NULL values as distinct from one another. This is not the desired behavior, as you would like to have only one record for a particular song and artist with no album.

In a simple situation like this, an easy workaround would be to alter ALBUM so that it is not nullable and supply some special value that represents the absence of an album. The empty string for example would work in this case. A unique key containing these columns produces the desired effect.

Consider, however, the trickier situation where the columns participating in the unique key are foreign keys to other tables. In this case the SONG table has columns (TITLE, ARTIST_ID, ALBUM_ID) with foreign keys to the ARTIST and ALBUM tables. As before, ALBUM_ID is nullable. But now the technique of populating a “dummy” value for ALBUM_ID no longer works, because it would break the integrity of the foreign key.

Discussion of Unsatisfactory Solutions

In this section I discuss some solutions that have been proposed and I did not find satisfactory. If you are anxious to get to the ultimate solution, you can skip to the next section.

Solution 1: Drop the foreign key constraint so you can populate a dummy value such as 0 or -1.

This solution would clearly work as far as the enforcement of the unique constraint is concerned. I find it unsatisfactory because it opens up other, possibly harder, problems. The first problem is that you are trading one aspect of database integrity for another. Allowing references to point to non-existent rows in other tables can create problems that are much harder to debug than a duplicate entry. Another serious problem, which is very hard to work around, is that this solution will not work (without significant effort) with some ORM layers such as Hibernate. With or without a foreign key, when you define an association with Hibernate, it expects to retrieve an Album value for every value in ALBUM_ID. Missing values will cause an error.

Solution 2: Create a dummy row in the ALBUM table corresponding to an otherwise invalid value of the ALBUM_ID, such as -1.

The immediate problem with this solution is that, if you adopt it, you must write special code in your application to deal with these dummy values. Furthermore, if ALBUM has foreign keys to other tables, you will find yourself having to populate and manage more dummy rows in several tables across the database. The ideal solution should not require introducing any special logic in the code.

Solution 3: Do not enforce the unique key in the database, but rather enforce the uniqueness in the application logic (which is to say, the code).

This solution is obviously a last resort, to be used only if you cannot find a way to enforce uniqueness in the database. The obvious problem with this approach is that it is always possible for a code path to bypass the uniqueness check. It is also possible for someone to insert a duplicate record in the database directly.

A Satisfactory Solution

Here is a solution that works and I find superior to the other proposals for reasons that I will soon discuss. The solution involves a certain degree of “hackiness” but it is mostly benign. Here it is:

For each nullable foreign key that you need to include in a unique index, create a dummy column that is not nullable. In our example, this would look something like this:

ALTER TABLE SONG ADD COLUMN ALBUM_ID_REQ int(11) NOT NULL;

The idea is that this column is going to mirror the foreign key ALBUM_ID but it is going to have a zero where the foreign key has a NULL. To make sure the new column is properly populated, the existing data (if any) needs to be updated as follows:

UPDATE SONG SET ALBUM_ID_REQ = 0 WHERE ALBUM_ID IS NULL;
UPDATE SONG SET ALBUM_ID_REQ = ALBUM_ID WHERE ALBUM_ID IS NOT NULL;

There is one other step that’s necessary: We need to make sure the new column is correctly populated whenever a new row is inserted. We could enforce this at the application (code) level, but this solution would suffer from some of the drawbacks of the previous solutions. In particular, it is open to the possibility that some code path will simply forget to do it. It is also possible to insert a record in the database with the wrong values for the dummy column.

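To make sure that our little hack stays in the database and does not leak into the application code, we need an insert trigger that populates the dummy column with the correct value. (If ALBUM_ID can also change on existing rows, a matching BEFORE UPDATE trigger is needed to keep the two columns in sync.) The insert trigger is straightforward and inexpensive: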

CREATE TRIGGER SONG_ALBUM_TRIGGER BEFORE INSERT ON SONG
FOR EACH ROW BEGIN 
  IF NEW.ALBUM_ID is null THEN SET NEW.ALBUM_ID_REQ = 0; 
  ELSE SET NEW.ALBUM_ID_REQ = NEW.ALBUM_ID; 
  END IF; 
END;

At this point, we can go ahead and create a unique index:

CREATE UNIQUE INDEX SONG_U1 ON SONG (TITLE, ARTIST_ID, ALBUM_ID_REQ);

Note that it is the required dummy column ALBUM_ID_REQ and not the original nullable column that participates in the unique index.

And voila! We have a unique constraint that works exactly as expected without having to change a single line of application code.
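To see why the mirrored column changes the outcome, the following sketch models the two kinds of unique index in plain Java. It is only an in-memory analogy, not database code: SongTableModel and its methods are hypothetical names, and it assumes (as the solution itself does) that 0 is never a real ALBUM_ID.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory stand-in for the SONG table and its unique indexes.
class SongTableModel {

    private final Set<List<Object>> nullableIndex = new HashSet<>();
    private final Set<List<Object>> mirrorIndex = new HashSet<>();

    // Models a unique index on (TITLE, ARTIST_ID, ALBUM_ID): a key that
    // contains NULL never conflicts with anything, so the row is accepted.
    boolean insertWithNullableKey(String title, int artistId, Integer albumId) {
        if (albumId == null) {
            return true;
        }
        return nullableIndex.add(Arrays.<Object>asList(title, artistId, albumId));
    }

    // Mirrors the trigger: NULL becomes 0, any other value is copied as is.
    static int albumIdReq(Integer albumId) {
        return albumId == null ? 0 : albumId;
    }

    // Models the unique index SONG_U1 on (TITLE, ARTIST_ID, ALBUM_ID_REQ).
    // Returns false when the index would reject the row as a duplicate.
    boolean insertWithMirrorColumn(String title, int artistId, Integer albumId) {
        return mirrorIndex.add(
                Arrays.<Object>asList(title, artistId, albumIdReq(albumId)));
    }

    public static void main(String[] args) {
        SongTableModel table = new SongTableModel();
        table.insertWithNullableKey("song", 1, null);
        System.out.println(table.insertWithNullableKey("song", 1, null));  // true: duplicate slips in
        table.insertWithMirrorColumn("song", 1, null);
        System.out.println(table.insertWithMirrorColumn("song", 1, null)); // false: duplicate rejected
    }
}
```

The second insert of the same album-less song is accepted by the nullable key but rejected once the key is built from the non-nullable mirror value.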

Solution Justification

Here is why I consider this solution a better solution than the ones discussed earlier:

1. No code changes are necessary: You do not need to introduce the dummy columns into your SQL queries or in your ORM layer. Any existing code will work as is. All the other solutions we discussed require code changes to some extent.

2. Low failure probability: Any solution that tries to enforce uniqueness in the code suffers from two problems: a) It is always possible to create code that bypasses the solution. b) You can always bypass the constraint by inserting directly in the database. Our solution does not suffer from these drawbacks. The only way to fail is for someone to manually update the records in the database, and this is pretty hard to do by mistake.

3. No superfluous data: Unlike solution 2, you don’t need to maintain dummy data in your database.

4. It is simple: The solution consists of four steps, each of which is very straightforward SQL.

 

An Encapsulation-Preserving Builder Pattern in Java

The Builder pattern is used to construct complex objects. Typically, it is used for objects that would be impractical to construct with a single constructor call, either because of the large number of parameters or because of the many different ways in which the object can be constructed.

Here is a usage example of a builder from the Quartz Job Scheduling Library.

JobDetail job = JobBuilder.newJob().
                withDescription("A quick and dirty job").
                withIdentity("RushJob").
                usingJobData("checkTwice", false).
                build();

A good builder must satisfy the following requirements:

  1. It must not allow the creation of invalid, or partially created, objects.
  2. It must hide the implementation details of the object being built.
  3. It must hide the implementation details of the builder itself.

The Quartz builder presented above is a pretty good builder. It guarantees that the constructed object is complete, and it has simple, intuitive build steps, each returning a Builder object to allow convenient method chaining. Where it falls a bit short is in satisfying requirement #2: as a user of the library, I can cheat by explicitly creating a JobDetailImpl object (the implementation of JobDetail) and calling its setter methods to create a possibly invalid job. In other words, the Quartz builder does not preserve the encapsulation of the JobDetail object.

The root of the problem is that in order to satisfy #2, the constructors and setters of the object must be private, which means that no external object can use them. This, in turn, implies that if the Builder were an external object, it could have no access to the constructors and setters. If you are really serious about encapsulation, you need to work around this problem, and this is what I am demonstrating in this article.

Recipe for a Builder that preserves encapsulation

Here is a simple recipe for creating a fully encapsulated builder:

  1. Define an interface for the object you are creating. The interface should expose only methods necessary to use the object, not build it.
  2. Define an interface for the object’s builder.
  3. In the implementation of the target object, make the object’s constructor and setters private.
  4. Define the builder implementation as an inner private class of the target object.
  5. Provide a static method that gives you an instance of the builder.

To put this recipe in action, let’s see how it can be used to create a Finite State Machine (FSM). An FSM is a good example of an object that requires a builder since there is an infinite number of ways to create them.

1. Define the interface of the Finite State Machine:

public interface FiniteStateMachine<S>  {

    S getInitialState();

    S getNextState(S from, char symbol);

    boolean accepts(String word);
}

2. Define the interface for the builder:

public interface FSMBuilder<S> {

    FSMBuilder<S> setInitialState(S initialState) 
                   throws BuilderException;

    FSMBuilder<S> addFinalState(S state) throws BuilderException;

    FSMBuilder<S> addTransition(S from, S to, char c)
            throws BuilderException;

    FiniteStateMachine<S> build() throws BuilderException;

}

This interface is characteristic of builder interfaces. The first three methods provide building blocks, whereas the last method, which must always be present, returns the final constructed object – or throws an exception if the object is not complete.

The following snippet of code illustrates steps 3, 4, and 5.

3. FSM is the implementation of FiniteStateMachine. Note that its constructor and internal variables are private.

4. PrivateFSMBuilder, which implements FSMBuilder, is an inner private class, only providing access to valid build steps. This is the key to making the builder have access to the inner workings of the class without exposing them to the outside world.

5. Note that in the last line of code we provide the only possible way to construct a new FSM, by getting a hold of a builder interface.

public class FSM<S> implements FiniteStateMachine<S> {

    private StateTransitionTable<S> transitionTable = new StateTransitionTable<S>();

    private FSM() {}

    private class PrivateFSMBuilder implements FSMBuilder<S> {

        @Override
        public FSMBuilder<S> setInitialState(S initialState)
                throws BuilderException {
            if (null != FSM.this.initialState) {
                throw new BuilderException("Initial state already set.");
            }
            FSM.this.initialState = initialState;
            states.add(initialState);
            return this;
        }

        @Override
        public FSMBuilder<S> addFinalState(S state)
                throws BuilderException {
            FSM.this.addFinalState(state);
            return this;
        }

        @Override
        public FSMBuilder<S> addTransition(S from, S to, char c)
                throws BuilderException {
            FSM.this.addTransition(from, to, c);
            return this;
        }

        @Override
        public FiniteStateMachine<S> build()
                throws BuilderException {
            if (FSM.this.getInitialState() == null) {
                throw new BuilderException("Initial state is not specified.");
            }
            if (FSM.this.finalStates.isEmpty()) {
                throw new BuilderException(
                        "There must be at least one final state");
            }
            return FSM.this;
        }
    }

    //More FSM code goes here...

    private FSMBuilder<S> builder = new PrivateFSMBuilder();

    public static <S> FSMBuilder<S> newFSM() {
        return new FSM<S>().builder;
    }
}

Here is an example of how to use the FSM builder to build an FSM:

FSMBuilder<String> builder = FSM.newFSM();
FiniteStateMachine<String> machine =      
    builder.setInitialState("A")
           .addFinalState("B")
           .addTransition("A", "A", '0')
           .addTransition("A", "B", '1')
           .addTransition("B", "A", '1')
           .addTransition("B", "B", '0')
           .build();
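For readers who want to compile the recipe end to end, here is a compact, self-contained sketch of all five steps. It is not the JavaFSM source: the nested map is an assumed stand-in for StateTransitionTable, the fields that the article leaves under "More FSM code goes here" are filled in with guesses, and BuilderException is defined locally as a plain checked exception.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class BuilderException extends Exception {
    BuilderException(String message) { super(message); }
}

interface FiniteStateMachine<S> {
    S getInitialState();
    S getNextState(S from, char symbol);
    boolean accepts(String word);
}

interface FSMBuilder<S> {
    FSMBuilder<S> setInitialState(S initialState) throws BuilderException;
    FSMBuilder<S> addFinalState(S state) throws BuilderException;
    FSMBuilder<S> addTransition(S from, S to, char c) throws BuilderException;
    FiniteStateMachine<S> build() throws BuilderException;
}

class FSM<S> implements FiniteStateMachine<S> {
    // Assumption: a nested map stands in for the article's StateTransitionTable.
    private final Map<S, Map<Character, S>> transitions = new HashMap<>();
    private final Set<S> finalStates = new HashSet<>();
    private S initialState;
    private final FSMBuilder<S> builder = new PrivateFSMBuilder();

    private FSM() {}

    // The only way for outside code to obtain an FSM is through its builder.
    public static <S> FSMBuilder<S> newFSM() {
        return new FSM<S>().builder;
    }

    @Override
    public S getInitialState() { return initialState; }

    @Override
    public S getNextState(S from, char symbol) {
        Map<Character, S> row = transitions.get(from);
        return row == null ? null : row.get(symbol);
    }

    @Override
    public boolean accepts(String word) {
        S current = initialState;
        for (char symbol : word.toCharArray()) {
            current = getNextState(current, symbol);
            if (current == null) {
                return false; // no transition defined for this symbol
            }
        }
        return finalStates.contains(current);
    }

    // Private inner class: it can reach the FSM's private state, but outside
    // code only ever sees it through the FSMBuilder interface.
    private class PrivateFSMBuilder implements FSMBuilder<S> {
        @Override
        public FSMBuilder<S> setInitialState(S state) throws BuilderException {
            if (initialState != null) {
                throw new BuilderException("Initial state already set.");
            }
            initialState = state;
            return this;
        }

        @Override
        public FSMBuilder<S> addFinalState(S state) throws BuilderException {
            finalStates.add(state);
            return this;
        }

        @Override
        public FSMBuilder<S> addTransition(S from, S to, char c) throws BuilderException {
            transitions.computeIfAbsent(from, key -> new HashMap<>()).put(c, to);
            return this;
        }

        @Override
        public FiniteStateMachine<S> build() throws BuilderException {
            if (initialState == null) {
                throw new BuilderException("Initial state is not specified.");
            }
            if (finalStates.isEmpty()) {
                throw new BuilderException("There must be at least one final state");
            }
            return FSM.this;
        }
    }
}
```

With these definitions, the usage example above builds a machine that accepts exactly the binary words containing an odd number of 1s: accepts("0100") is true and accepts("11") is false.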

Conclusion

In this article we visited a well known Design Pattern, termed the Builder. We demonstrated how to implement a Builder pattern in Java in a manner that preserves the Encapsulation principle both for the object being built and the builder itself.

You can see this builder pattern in action for an actual Finite State Machine by downloading the source code of the JavaFSM project on SourceForge: http://sourceforge.net/projects/javafsm/

Brief Introduction to JUnit

Once upon a time, testing used to be an isolated step of the development process, traditionally the last stage before the release. As software development evolved through the hard lessons of delays and failures, testing has been progressively moved into earlier stages of the development process. Modern development philosophies, such as RAD, XP, and Agile, advocate the development of tests in parallel with the development of the code.

Although the quest for a software development process that works is ongoing, unit testing has proven to be one of the necessary ingredients of the optimal mixture. Unit tests facilitate and encourage modular development by testing small modules (methods or classes) of code. Unit testing is ideally suited to Object Oriented Programming, where objects are semi-independent functional modules.

There are two factors that set unit testing apart from other types of testing such as functional, acceptance, and integration testing. The first is that unit tests, as the name suggests, aim to test each unit of code (module or function) independently of the others and not as part of an integrated deployment. The second factor is that unit tests are written by developers and not by QA engineers.

Quick Overview of JUnit

JUnit is the most widespread unit-testing tool for Java. The principal value of JUnit does not lie in an extensive testing API (in fact its API is quite small) but in giving the development team the ability to create easily re-runnable test cases. In particular, JUnit integrates with all standard IDEs and popular build frameworks such as Ant and Maven, so it can be easily incorporated in the daily development process. Let me illustrate this point with an example.

Consider a common legacy way of testing code, which is to write a main method and print out the result:

public static void main(String[] args) {      
     System.out.println(NumberUtilities.numDecimalPlaces(-123.03993232)); 
}

This code snippet will execute the function under test and print out the result. The developer examines the result and determines if the function does the right thing.

Here are some immediate problems with this approach:

  1. Suppose you have a hundred such tests. Can you run all the tests every time and verify each individual result?
  2. Suppose a number of months elapses since the test was written. Will you be able to remember what the desired output of this particular method was supposed to be?

Let’s contrast this with the equivalent JUnit snippet:

import static junit.framework.Assert.assertEquals;

import org.junit.Test;

public class NumberUtilitiesTest {

    @Test
    public void testNumDecimalPlaces() {
        assertEquals(10, NumberUtilities.numDecimalPlaces(-123.03993232));
    }
}

The @Test annotation flags the method as a unit test. This tells your IDE, Ant, or Maven script to execute it as a unit test. The assert statement, rather than displaying the result, compares it against an expected value. The first thing to observe is that there is no extra effort whatsoever in writing the test case as a unit test rather than a main method. Let’s see how doing this resolves the problems we discovered:

    1. You don’t have to run the tests manually because your IDE or ant script can run all of them as a testing suite. Furthermore, you don’t need to inspect any results because they are compared against expected values.
    2. You don’t have to remember the expected output because it is part of the test.
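To make the role of the annotation concrete, the sketch below shows roughly how a runner can discover and execute annotated methods through reflection. It is a toy, not JUnit's implementation: ToyTest, ToyAssert, ToyRunner, and SampleToyTests are hypothetical names invented for this illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Constructor;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for JUnit's @Test annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface ToyTest {}

// Hypothetical stand-in for assertEquals: compares instead of printing.
class ToyAssert {
    static void equal(int expected, int actual) {
        if (expected != actual) {
            throw new AssertionError(
                    "expected:<" + expected + "> but was:<" + actual + ">");
        }
    }
}

class ToyRunner {
    // Runs every method marked @ToyTest and returns one line per failure.
    // An empty list means all tests passed, so the runner stays silent.
    static List<String> run(Class<?> testClass) {
        List<String> failures = new ArrayList<>();
        for (Method method : testClass.getDeclaredMethods()) {
            if (!method.isAnnotationPresent(ToyTest.class)) {
                continue; // not flagged as a test, so it is never invoked
            }
            try {
                Constructor<?> constructor = testClass.getDeclaredConstructor();
                constructor.setAccessible(true);
                method.setAccessible(true);
                method.invoke(constructor.newInstance());
            } catch (InvocationTargetException e) {
                // The test body threw; report the underlying failure.
                failures.add(method.getName() + ": " + e.getCause().getMessage());
            } catch (ReflectiveOperationException e) {
                failures.add(method.getName() + ": could not run (" + e + ")");
            }
        }
        return failures;
    }
}

class SampleToyTests {
    @ToyTest
    void passes() { ToyAssert.equal(4, 2 + 2); }

    @ToyTest
    void fails() { ToyAssert.equal(10, 8); }

    void notATest() { throw new IllegalStateException("never invoked"); }
}
```

Calling ToyRunner.run(SampleToyTests.class) returns a single entry, "fails: expected:<10> but was:<8>". The passing test produces no output, and the unannotated method is never called.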

As long as your code works as expected, JUnit does not make you do any extra work. The tests run as part of your build and if they pass JUnit remains silent. You only get a report if any test fails, in which case you are given information about the test that failed and a full stack trace. For example, suppose that someone introduced a bug in the code that truncates to 8 decimal places instead of 10. JUnit would produce the following exception in the report:

junit.framework.AssertionFailedError: expected:<10> but was:<8>
	at junit.framework.Assert.fail(Assert.java:47)
	at junit.framework.Assert.failNotEquals(Assert.java:282)
	at junit.framework.Assert.assertEquals(Assert.java:64)
	at junit.framework.Assert.assertEquals(Assert.java:201)
	at junit.framework.Assert.assertEquals(Assert.java:207)
	at com.adaptiveinternational.wsg.util.NumberUtilitiesTest.testNumDecimalPlaces(NumberUtilitiesTest.java:8)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)

Exceptional Programming

How you manage exceptions in a Java program can significantly affect the maintainability and stability of your code. To a programmer transitioning from a different language, managing exceptions may seem like unnecessary overhead, but to the experienced programmer they are a necessary part of a professional API.

Interestingly, I am observing that the favorite way for novice programmers to deal with exceptions is to catch them and report an error (frequently to the console), which happens to be, in general, the worst way to deal with them. I think this behavior stems from a tendency to associate exceptions with code problems. Indeed, in C, which has no exception mechanism, the closest equivalents (such as a segmentation fault) are uncaught and uninformative, and they almost always indicate a catastrophic code failure. In Java, on the other hand, exceptions are a standard way of communicating failures from one class to another in a controlled way. Throwing the right exceptions safeguards your code against misuse.

Types of Java Exceptions

The Java exception hierarchy starts at Throwable, which has only two direct subclasses: Error and Exception. Application code should neither throw Error nor try to catch it. The type of Throwable that most programmers have to deal with is Exception. There are two flavors of Exception: checked and unchecked. RuntimeException and all of its subclasses are unchecked, whereas every other subclass of Exception is checked.

Here are the compiler rules regarding checked and unchecked exceptions:

Unchecked Exceptions: A method does not need to declare the unchecked exceptions that it throws, and calling code does not need to catch them or declare them. If an unchecked exception is not caught, it propagates up the call stack until it leaves the thread’s entry point and terminates that thread. In a standalone, single-threaded program, this means the JVM exits.

Checked Exceptions: A method that throws a checked exception must declare it in a method signature using the throws keyword. Any code that calls that method must either catch the exception or throw it by declaring it with the throws keyword.
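The compiler rules above can be seen side by side in a small sketch. ExceptionRulesDemo and its methods are hypothetical, and readSetting only pretends to do I/O so that the example stays self-contained.

```java
import java.io.IOException;

class ExceptionRulesDemo {

    // Unchecked: IllegalArgumentException extends RuntimeException,
    // so the method needs no "throws" clause and callers need no try/catch.
    static int parsePositive(String text) {
        int value = Integer.parseInt(text); // may throw unchecked NumberFormatException
        if (value <= 0) {
            throw new IllegalArgumentException("Not positive: " + value);
        }
        return value;
    }

    // Checked: IOException must appear in the signature, otherwise the
    // compiler rejects the throw statement.
    static String readSetting(String name) throws IOException {
        if (name.isEmpty()) {
            throw new IOException("No setting name given");
        }
        return "value-of-" + name;
    }

    // Caller option 1: catch the checked exception.
    static String readOrDefault(String name, String fallback) {
        try {
            return readSetting(name);
        } catch (IOException e) {
            return fallback;
        }
    }

    // Caller option 2: declare it and let it propagate.
    static String readStrict(String name) throws IOException {
        return readSetting(name);
    }
}
```

Removing the throws clause from readStrict, or the try/catch from readOrDefault, turns this into a compile error, while parsePositive compiles either way.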

Exceptional Programming Practices

When to throw unchecked exceptions

Application code should throw unchecked exceptions only when it encounters a condition under which the application is not going to be able to continue operation. For example, if your application depends on a database and you fail to establish a database connection, this is a good candidate for an unchecked exception. There is no way to handle this failure, so the best thing to do is fail fast.

There are, however, some programmers and some frameworks that relax the conditions for throwing unchecked exceptions. For example, the Spring framework will throw unchecked exceptions for any database failure, even if it affects only a single query. The rationale is that there is nothing the calling code can do to recover from the error; therefore, even if the problem does not affect the whole application, it should still be propagated. It is true that 90% of the time there is nothing to do with these exceptions but propagate them. Nevertheless, the framework clearly documents the unchecked exceptions that it throws so that programmers can still catch and handle them if they so wish.

When to catch unchecked exceptions

Unchecked exceptions should only be caught under the following two scenarios:

1. Catch unchecked exceptions in the application tier that interacts with the user. For example, if you have a Swing or Web client, displaying a stack trace to an end user who can’t do anything about it is inadvisable. The code should catch the exception, ideally send a notification to support (via email, for example), and produce a friendly (or at least friendlier) message informing the user that there was a problem.

2. Catch an unchecked exception if your code knows how to handle it gracefully. For example, suppose your code is using an Object argument whose real type belongs to one of three classes (not necessarily a good design but a good example). You can try casting it to each of the possible classes, catch the ClassCastException and proceed (see code fragment below).

public void convertAndUse(Object object) {
    try {
        BigInteger bi = (BigInteger) object;
        //TODO use it
        return;
    } catch (ClassCastException cce) {
        //not a BigInteger; try the next candidate type
    }
    try {
        String s = (String) object;
        //TODO use it
        return;
    } catch (ClassCastException cce) {
        //not a String; try the next candidate type
    }
    //TODO try the third candidate type in the same way
}
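The first scenario, catching at the tier that faces the user, might look something like the following sketch. UserFacingBoundary, handleRequest, and the in-memory supportNotifications list are hypothetical; a real application would send an email or write to a log instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

class UserFacingBoundary {

    // Hypothetical notification sink standing in for an email to support.
    static final List<String> supportNotifications = new ArrayList<>();

    // Runs a piece of application logic on behalf of the user. Unchecked
    // exceptions stop here: support gets the details, the user gets a
    // friendly message instead of a stack trace.
    static String handleRequest(Supplier<String> action) {
        try {
            return action.get();
        } catch (RuntimeException e) {
            supportNotifications.add(e.toString());
            return "Sorry, something went wrong. Support has been notified.";
        }
    }
}
```

Lower tiers stay free of try/catch blocks and simply let the exception propagate; only this outermost layer translates it for the user.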

When to catch checked exceptions

You should catch checked exceptions in two circumstances:

1. To re-throw them as part of a new exception. We will discuss when and how to do this in the next section.

2. If your code knows how to recover from the exception. This is identical to the second scenario in the case of unchecked exceptions.

The Art of Throwing

When you call a method that throws a checked exception and you don’t know how to handle the exception, you have two choices:

1. Throw the exception as it is.

2. Create a new exception to throw.

If you decide to throw a new exception, you should preserve the original exception by passing it into the constructor of the new exception. This is called chaining exceptions and it makes it much easier to determine the original cause when looking at a stack trace (see code sample below).

public void retrievePassword() throws PasswordAccessException {
    try {
        InputStream fis = new FileInputStream(passwordFile);
        //TODO get password from stream
    } catch (FileNotFoundException e) {
        throw new PasswordAccessException("The password could not be retrieved.", e);
    }
}

Under what circumstances should a programmer create new exceptions rather than propagate existing ones? The answer should be sought in the Object Oriented Programming principles of Encapsulation and Abstraction. These two concepts are similar, and what they amount to in practice is sparing the user of a class (or interface) the technicalities of the underlying implementation.

Consider for example the following interface for sending an email message:

public interface EmailNotificationService {

    void sendEmail(String from, String subject, String[] to, String content);

}

The interface affords any number of implementations: for example, JavaMail, Spring’s JavaMailSender, or a lower-level implementation that explicitly manages the communication with the mail server. The point is that the interface hides the implementer’s choices from the user.

Now how about exceptions? Depending on the implementer’s choice, each underlying technology will throw a different exception. For example, JavaMail may throw the (checked) MessagingException, whereas Spring would throw the (unchecked) MailSendException. Throwing either of these would reveal the implementation; therefore, proper encapsulation dictates that a new exception be created to abstract the specific error, as follows:

public interface EmailNotificationService {

  void sendEmail(String from, String subject, String[] to, String content)
            throws EmailNotificationException ;

}
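An implementation of this interface could then translate its own low-level failures into the abstract exception, keeping the original as the cause. The sketch below assumes EmailNotificationException is a checked exception with a (message, cause) constructor; SmtpTransport, SmtpTransportException, and SmtpEmailNotificationService are hypothetical stand-ins for whatever mail technology sits underneath.

```java
class EmailNotificationException extends Exception {
    EmailNotificationException(String message, Throwable cause) {
        super(message, cause);
    }
}

interface EmailNotificationService {
    void sendEmail(String from, String subject, String[] to, String content)
            throws EmailNotificationException;
}

// Hypothetical implementation-specific failure (think MessagingException).
class SmtpTransportException extends Exception {
    SmtpTransportException(String message) {
        super(message);
    }
}

// Hypothetical low-level mail API used by one particular implementation.
interface SmtpTransport {
    void deliver(String from, String to, String subject, String body)
            throws SmtpTransportException;
}

class SmtpEmailNotificationService implements EmailNotificationService {

    private final SmtpTransport transport;

    SmtpEmailNotificationService(SmtpTransport transport) {
        this.transport = transport;
    }

    @Override
    public void sendEmail(String from, String subject, String[] to, String content)
            throws EmailNotificationException {
        for (String recipient : to) {
            try {
                transport.deliver(from, recipient, subject, content);
            } catch (SmtpTransportException e) {
                // Chain the original so the stack trace still shows the root
                // cause, while callers only depend on the abstract exception.
                throw new EmailNotificationException(
                        "Could not send email to " + recipient, e);
            }
        }
    }
}
```

Callers catch only EmailNotificationException, so swapping the transport for a different mail library does not change their code; the root cause remains available through getCause() for debugging.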

Summary

In conclusion, Java Exceptions are part of the API of a class or interface and are used to communicate the fact that a method failed to complete its intended function in a controlled manner. Exceptions (both checked and unchecked) should only be caught if the code that catches them can recover from the original problem. In all other cases, exceptions should be propagated. Exceptions can be chained and should be chained as part of a higher-level exception whenever code encapsulation must be preserved.

Coding and Development Standards

I use the term Coding Standards to refer to guidelines pertaining to the appearance of the source code, and the term Development Standards to refer to more general programming guidelines that an organization agrees to adhere to. In this sense, coding standards are “easy” given that there are numerous code-checking and formatting tools that can be applied to make adherence to coding standards a mechanical practice. Development Standards on the other hand, require some conscious discipline on the part of the developer and, although their validation can be partially automated, probably need some additional enforcement mechanisms such as pair programming, peer review, and code review. Often, Development Standards consist of avoiding practices that have been proven over time to result in poor quality code.

Let us examine the pros and cons of having standards in an organization. The primary reason for not having coding standards is that it is nearly impossible for everyone to agree on them. Over time, developers evolve their own personal programming style, and they are loath to modify it. Consequently, an enforced development standard will displease some people to some extent.

Now let us explore some of the main benefits that can be reaped from adhering to a standard and that make adopting it worthwhile.

Quality
Code Quality is the most obvious benefit of development standards. To give a simple example, consider several nested loops or conditionals. If there are no enclosing braces, it is hard to tell where each one starts and ends, which can easily lead to a bug. Even if the developer who wrote it is adept at this sort of thing, it will not be as obvious to another developer who tries to understand or modify the code.
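A classic instance of this is the "dangling else". In the hypothetical BraceDemo below, the indentation of withoutBraces suggests that the else belongs to the outer if, but Java attaches it to the nearest one, so the two methods behave differently.

```java
class BraceDemo {

    // Misleading: the else is indented under the outer if, but the
    // compiler binds it to the inner if (admin).
    static String withoutBraces(boolean loggedIn, boolean admin) {
        String result = "unset";
        if (loggedIn)
            if (admin)
                result = "admin";
        else
            result = "guest"; // runs when loggedIn is true and admin is false
        return result;
    }

    // With braces, the structure matches what the indentation promised.
    static String withBraces(boolean loggedIn, boolean admin) {
        String result = "unset";
        if (loggedIn) {
            if (admin) {
                result = "admin";
            }
        } else {
            result = "guest";
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(withoutBraces(false, false)); // unset
        System.out.println(withBraces(false, false));    // guest
    }
}
```

A standard that requires braces on every block makes the first form impossible to write, and a style checker can enforce that rule mechanically.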

Readability
Code Readability is another benefit of Coding Standards. Large classes and methods with deeply nested loops are notoriously hard to understand and even harder to test. It is also much harder to understand a piece of code, as you leap from class to class, if each file has its own programming style.

Maintainability
Last but not least comes the necessity of maintaining the code over time. Software applications have surprisingly long lifetimes, usually surpassing the employment span of the developers who work on them. Code that is hard to read and understand is harder to maintain and more bug-prone.

Tools for Monitoring Standards

There are several tools that can help developers adhere to coding standards, measure deviations from the standards, and report on code quality in general. To begin with, standard IDEs such as Eclipse and NetBeans have formatting and style-checking tools, and they come packaged with the Java conventions. Checkstyle is a tool that lets an architect or manager define any coding standards and reports violations. It comes with Ant tasks and plugins for Eclipse and NetBeans. PMD is an excellent code quality tool that detects possible bugs, dead code, sub-optimal code, code duplication, and more. Like Checkstyle, it is highly configurable. FindBugs is a tool similar to PMD. Cobertura and Emma are tools that measure unit test coverage. Finally, Hudson is an excellent continuous integration tool for Java with plugins for incorporating reports produced by all of the tools mentioned here.
