Monthly Archives: January 2010

Order, Order

This post is about how to impose order on instances of Java classes. The context of the post will be centred around reading in some text and then counting the number of times each word appears in the text. The output will then be sorted either alphabetically or in order of the number of times each word appears etc. Links to the source code are provided if you want to take a look.

Some instances of classes have an implicit order – a natural ordering. For example, in Java, String objects are ordered lexicographically; Integer objects are ordered numerically. If you find you are writing a value class whose instances have an obvious natural order, you should consider implementing the Comparable interface:

public interface Comparable {
  int compareTo(T t);
}

The String class is an example of a class that implements Comparable; instances of it are ordered lexicographically. If you choose to implement Comparable, your class will be able to take advantage of many of the generic algorithms and collections available in the Java APIs, assuming you need to of course! The following example reads in some text and counts the instances of each word:

public class WordCountTest {
   public static void main(String[] args) throws Exception {
	WordReader reader = new WordReader(new InputStreamReader(System.in));
	Map wordMap = new TreeMap();
	String word;
	while ((word=reader.readWord()) != null) {
		Integer count = wordMap.containsKey(word) ? wordMap.get(word) : 0;
		wordMap.put(word, ++count);
        }
	// Print out using the natural order of the key
	for (Map.Entry entry : wordMap.entrySet()) {
		System.out.println(entry.getKey()+" : "+entry.getValue());
	}
   }
}

A TreeMap sorts entries based on the natural order of the key, in this case, a string, but what if you want to order your instances in an unnatural order? e.g. ordering integers in decreasing order from largest to smallest. Well, in Java, you would use a Comparator:

public interface Comparator {
    int compare(T o1, T o2);
	boolean equals(Object obj);
}

Comparators also allow you to provide an ordering for objects that don’t have a natural ordering, e.g. classes that don’t implement Comparable. Let’s take a look at an example that uses a comparator – as defined in the Order class – that sorts the words based on the number of times they appear:

public class WordCountTest2 {
   public static void main(String[] args) throws Exception {
	WordReader reader = new WordReader(new InputStreamReader(System.in));
	Map wordMap = new HashMap();
	String word;
	while ((word=reader.readWord()) != null) {
		Integer count = wordMap.containsKey(word) ? wordMap.get(word) : 0;
		wordMap.put(word, ++count);
	}

	// Sort by greatest occurrence first
	Set entries = new TreeSet(Order.INCREASING_COUNT_COMPARATOR);
	for (Map.Entry entry : wordMap.entrySet()) {
		entries.add(new WordCount(entry.getKey(), entry.getValue()));
	}

	// Print
	Iterator result = entries.iterator();
	while (result.hasNext()) {
		WordCount e = result.next();
		System.out.println(e.getWord()+" "+e.getCount());
	}
  }
}

It’s also quite common to instantiate a Comparator as an anonymous class and pass it in to the constructor of a sorted collection.

It’s worth noting that in the compare method in the Comparator used in the example, we first compare the word counts and if they are equal, we then return the result of comparing the value of the words. This is essential because sorted collections in Java use the compareTo method – or in this case, the compare method of the Comparator – in place of equals. What this means is that if we were to compare just the counts and return zero because they are equal – even though the words may be different – the WordCount instance would NOT get added to the (sorted) collection if an instance already exists in the collection with the same count. There’s a whole section in Joshua Bloch’s book – Effective Java (2nd edition) – that discusses this in greater detail (Item 12) for those that are interested.

That’s it really. There’s not much to it but it is worth taking the time to understand the difference(s) between implementing Comparable and creating a custom Comparator. Happy sorting!

By the way, if anybody has any better/alternative ideas for how to implement this, let me know. For example, my initial implementation used a TreeMap but that only allows you to sort on the key and not on the value(s) so wanting to sort in order of greatest number of occurrences won’t work with this particular data structure. In the second example, after constructing the hash table, I then add each entry to a (sorted) set then print out each value. Is there a way of doing this and avoiding the second step? I can’t think of one but maybe I’m missing something 🙂

Reading List

At the beginning of last year (2009) I planned to maintain a list of books that I had read during the course of the year. Of course, true to form, I didn’t! Anyway, this year I plan to do the same thing; however, unlike last year I have actually created a page for the list of books that I have read so far in 2010. I’ll update it as I go along. If you have any suggestions for books that I should read, leave a comment. Happy reading!