Simon Buckle's Weblog

Random thoughts for random people

Order, Order

This post is about how to impose order on instances of Java classes. The running example reads in some text and counts the number of times each word appears. The output is then sorted either alphabetically or by the number of times each word appears. Links to the source code are provided if you want to take a look.

Some instances of classes have an implicit order – a natural ordering. For example, in Java, String objects are ordered lexicographically; Integer objects are ordered numerically. If you find you are writing a value class whose instances have an obvious natural order, you should consider implementing the Comparable interface:

public interface Comparable<T> {
  int compareTo(T t);
}
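As a minimal sketch of what implementing the interface looks like for a value class, here is a made-up Version class (it is not part of the post's source code) whose instances are ordered by major number and then by minor number:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical value class, used only for illustration.
final class Version implements Comparable<Version> {
    private final int major;
    private final int minor;

    Version(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    // Natural order: by major number, then by minor number.
    @Override
    public int compareTo(Version other) {
        if (major != other.major) {
            return major < other.major ? -1 : 1;
        }
        if (minor != other.minor) {
            return minor < other.minor ? -1 : 1;
        }
        return 0;
    }

    // Keep equals and hashCode consistent with compareTo.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Version)) return false;
        Version v = (Version) o;
        return major == v.major && minor == v.minor;
    }

    @Override
    public int hashCode() {
        return 31 * major + minor;
    }

    @Override
    public String toString() {
        return major + "." + minor;
    }
}

class VersionDemo {
    // Returns a sorted copy; Collections.sort relies on Version.compareTo.
    static List<Version> sorted(List<Version> input) {
        List<Version> copy = new ArrayList<Version>(input);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        List<Version> versions = new ArrayList<Version>();
        versions.add(new Version(2, 0));
        versions.add(new Version(1, 10));
        versions.add(new Version(1, 2));
        System.out.println(sorted(versions)); // prints [1.2, 1.10, 2.0]
    }
}
```

Note that 1.10 sorts after 1.2 here because the comparison is numeric, not lexicographic.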


The String class is an example of a class that implements Comparable; instances of it are ordered lexicographically. If you choose to implement Comparable, your class will be able to take advantage of many of the generic algorithms and collections available in the Java APIs, assuming you need to of course! The following example reads in some text and counts the instances of each word:

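import java.io.InputStreamReader;
import java.util.Map;
import java.util.TreeMap;

// WordReader is defined in the post's source code (not shown here)
public class WordCountTest {
   public static void main(String[] args) throws Exception {
      WordReader reader = new WordReader(new InputStreamReader(System.in));
      Map<String, Integer> wordMap = new TreeMap<String, Integer>();
      String word;
      while ((word = reader.readWord()) != null) {
         // Start from zero the first time a word is seen
         Integer count = wordMap.containsKey(word) ? wordMap.get(word) : 0;
         wordMap.put(word, ++count);
      }
      // Print out using the natural order of the key
      for (Map.Entry<String, Integer> entry : wordMap.entrySet()) {
         System.out.println(entry.getKey() + " : " + entry.getValue());
      }
   }
}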


A TreeMap sorts entries based on the natural order of the key, which in this case is a string. But what if you want to order your instances in an unnatural order, e.g. ordering integers from largest to smallest? Well, in Java, you would use a Comparator:

public interface Comparator<T> {
    int compare(T o1, T o2);
    boolean equals(Object obj);
}


Comparators also allow you to provide an ordering for objects that don’t have a natural ordering, e.g. classes that don’t implement Comparable. Let’s take a look at an example that uses a comparator – as defined in the Order class – that sorts the words based on the number of times they appear:

import java.io.InputStreamReader;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// WordReader, WordCount and Order are defined in the post's source code (not shown here)
public class WordCountTest2 {
   public static void main(String[] args) throws Exception {
      WordReader reader = new WordReader(new InputStreamReader(System.in));
      Map<String, Integer> wordMap = new HashMap<String, Integer>();
      String word;
      while ((word = reader.readWord()) != null) {
         Integer count = wordMap.containsKey(word) ? wordMap.get(word) : 0;
         wordMap.put(word, ++count);
      }

      // Sort by greatest occurrence first
      Set<WordCount> entries = new TreeSet<WordCount>(Order.INCREASING_COUNT_COMPARATOR);
      for (Map.Entry<String, Integer> entry : wordMap.entrySet()) {
         entries.add(new WordCount(entry.getKey(), entry.getValue()));
      }

      // Print
      Iterator<WordCount> result = entries.iterator();
      while (result.hasNext()) {
         WordCount e = result.next();
         System.out.println(e.getWord() + " " + e.getCount());
      }
   }
}


It’s also quite common to instantiate a Comparator as an anonymous class and pass it in to the constructor of a sorted collection.
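For instance, the decreasing-integer ordering mentioned earlier could be sketched as follows. The class name DescendingDemo is made up for the example; only the JDK collection classes are real.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.SortedSet;
import java.util.TreeSet;

class DescendingDemo {
    static SortedSet<Integer> descending(Integer... values) {
        // Anonymous Comparator passed straight to the TreeSet constructor.
        SortedSet<Integer> set = new TreeSet<Integer>(new Comparator<Integer>() {
            public int compare(Integer a, Integer b) {
                return b.compareTo(a); // reversed natural order
            }
        });
        set.addAll(Arrays.asList(values));
        return set;
    }

    public static void main(String[] args) {
        System.out.println(descending(3, 10, 7)); // prints [10, 7, 3]
    }
}
```

For this particular ordering the JDK already offers Collections.reverseOrder(), so the anonymous class is only worth writing when the ordering is something more specific.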

It’s worth noting that in the compare method in the Comparator used in the example, we first compare the word counts and if they are equal, we then return the result of comparing the value of the words. This is essential because sorted collections in Java use the compareTo method – or in this case, the compare method of the Comparator – in place of equals. What this means is that if we were to compare just the counts and return zero because they are equal – even though the words may be different – the WordCount instance would NOT get added to the (sorted) collection if an instance already exists in the collection with the same count. There’s a whole section in Joshua Bloch’s book – Effective Java (2nd edition) – that discusses this in greater detail (Item 12) for those that are interested.
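The effect is easy to reproduce. The sketch below uses a stand-in class called Tally rather than the post's WordCount (whose source is not shown here), and two hypothetical comparators: one that compares counts only and one that breaks ties on the word.

```java
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

// Stand-in for the post's WordCount class (hypothetical, for illustration).
final class Tally {
    final String word;
    final int count;

    Tally(String word, int count) {
        this.word = word;
        this.count = count;
    }
}

class TieBreakDemo {
    // Compares counts only: different words with the same count compare as 0.
    static final Comparator<Tally> COUNT_ONLY = new Comparator<Tally>() {
        public int compare(Tally a, Tally b) {
            return a.count < b.count ? -1 : (a.count == b.count ? 0 : 1);
        }
    };

    // Compares counts first, then falls back to the words to break ties.
    static final Comparator<Tally> COUNT_THEN_WORD = new Comparator<Tally>() {
        public int compare(Tally a, Tally b) {
            int byCount = COUNT_ONLY.compare(a, b);
            return byCount != 0 ? byCount : a.word.compareTo(b.word);
        }
    };

    // Adds three tallies, two of which share a count, and reports the set size.
    static int sizeAfterAdding(Comparator<Tally> order) {
        Set<Tally> set = new TreeSet<Tally>(order);
        set.add(new Tally("apple", 2));
        set.add(new Tally("pear", 2)); // same count, different word
        set.add(new Tally("plum", 5));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterAdding(COUNT_ONLY));      // prints 2: "pear" was dropped
        System.out.println(sizeAfterAdding(COUNT_THEN_WORD)); // prints 3
    }
}
```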

That’s it really. There’s not much to it but it is worth taking the time to understand the difference(s) between implementing Comparable and creating a custom Comparator. Happy sorting!

By the way, if anybody has any better/alternative ideas for how to implement this, let me know. For example, my initial implementation used a TreeMap but that only allows you to sort on the key and not on the value(s) so wanting to sort in order of greatest number of occurrences won’t work with this particular data structure. In the second example, after constructing the hash table, I then add each entry to a (sorted) set then print out each value. Is there a way of doing this and avoiding the second step? I can’t think of one but maybe I’m missing something :)
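One alternative, offered as a sketch rather than an answer to the question: copy the map entries into a list and sort the list with a comparator. It is still a second pass over the data, but it avoids creating the intermediate WordCount objects. The class and method names below are made up for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SortEntriesDemo {
    // Returns the entries ordered by count (largest first), then by word.
    static List<Map.Entry<String, Integer>> byCountDescending(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries =
                new ArrayList<Map.Entry<String, Integer>>(counts.entrySet());
        Collections.sort(entries, new Comparator<Map.Entry<String, Integer>>() {
            public int compare(Map.Entry<String, Integer> a, Map.Entry<String, Integer> b) {
                int byCount = b.getValue().compareTo(a.getValue());
                return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
            }
        });
        return entries;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        counts.put("order", 3);
        counts.put("java", 1);
        counts.put("sort", 3);
        for (Map.Entry<String, Integer> e : byCountDescending(counts)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

Sorting a list never discards elements that compare as equal, so the tie-break on the word is only there to make the output order deterministic.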

Written by admin

January 15th, 2010 at 8:48 am

Posted in Uncategorized

Reading List

At the beginning of last year (2009) I planned to maintain a list of books that I had read during the course of the year. Of course, true to form, I didn’t! Anyway, this year I plan to do the same thing; however, unlike last year I have actually created a page for the list of books that I have read so far in 2010. I’ll update it as I go along. If you have any suggestions for books that I should read, leave a comment. Happy reading!

Written by admin

January 9th, 2010 at 3:59 am

Posted in Uncategorized

Invalid Certificate Error with the Fonera 2.0 Downloader

This is a brief note about how to fix an invalid certificate error when using the Fonera Downloader add-on for Firefox.

I installed the latest version of the Fonera Downloader plugin (0.1.5) for Firefox (3.5.6) on my Mac but I kept getting an “invalid certificate” error whenever I tried to use the downloader, which Firefox helpfully kept reminding me about every 2 minutes even when I did press cancel! This blog post explains how to fix the problem but unfortunately it didn’t work for me – I used the (correct) WAN address for my Fonera as instructed in the blog post but to no avail. What did work was the following:

  1. Open up Firefox and go to https://fonera/
  2. Click on “Add Exception” and follow the rest of the instructions

That’s it! No more annoying pop-up window. And now you can save files directly to your Fonera providing, of course, you remember to insert a USB thumb drive into the Fonera, which I forgot to do but that’s another story …

Written by admin

December 17th, 2009 at 8:03 am

Posted in Uncategorized

New News

While at the Over The Air conference just over a week ago, I got talking to a couple of people who were publicising the NHS web services (yes, they have web services). Anyway, I signed up for an account and decided to play around with the news API by building a very simple Mobile Safari application using Apple’s Dashcode tool.

If you have an iPhone or iPod Touch you can view it in Safari here: http://nhsdemo.webteq.eu/

There were a couple of things I wanted to do – such as adding a topic filter – but because Dashcode is rather annoying, I haven’t been able to figure out how to do what I wanted to do … so I didn’t. It’s just a proof of concept anyway, although I’m not entirely sure what concept I was trying to prove.

It wasn’t as straightforward as it looks though. Because XML HTTP requests are restricted to the domain that the page was initially loaded from, I had to write a proxy (in Java) to pass the request through to the NHS web service – I had to do this anyway because you have to pass your username and password to the NHS web service(s) as URL parameters (very secure!) and I didn’t want to put them directly into JavaScript for obvious reasons!

Anyway, as soon as I recover from my first encounter with Dashcode I intend to play around with some of the new HTML 5 features that Safari supports such as client side storage etc, and further explore the NHS web services. Until then you have no excuse for not finding out when the next wave of pig flu is going to bring the nation to its knees!

Written by admin

October 6th, 2009 at 4:28 am

Posted in Uncategorized

Sieving Numbers

Here’s a problem that you sometimes come across in problem sets and job interviews:

Q. Write a program to generate all the prime numbers up to N.

The simplest algorithm I can think of is the Sieve of Eratosthenes.
Here’s my attempt:


import java.util.Arrays;

public class PrimeSieve {

   public static void main(String[] args) {
      int N = Integer.parseInt(args[0]);

      // isPrime[k] stays true until k is found to be a multiple of a smaller prime
      boolean[] isPrime = new boolean[N+1];
      Arrays.fill(isPrime, true);

      // Any composite number up to N has a factor no larger than sqrt(N)
      int max = (int)Math.sqrt(N);

      for (int i=2; i<=max; i++) {
         if (isPrime[i]) {
            // Remove all of the multiples of i
            for (int j=2; i*j<=N; j++) {
               isPrime[i*j] = false;
            }
         }
      }

      // List the primes
      for (int k=2; k<=N; k++) {
         if (isPrime[k]) System.out.print(k+" ");
      }
   }
}

That's all. You can leave now.

Written by admin

September 9th, 2009 at 4:10 am

Posted in Uncategorized

Mobile Mentality

During the course of last week I have been getting up to speed on the iPhone SDK; yes, I did eventually get around to buying a new MacBook Pro! The more I learn, the more doubts I have about developing for the iPhone. Nothing to do with the technology – although Objective-C is an interesting extension to C – but everything to do with the (closed) platform, app store etc. For example, I discovered that in order to test my application on a real device I need to stump up $99 to get onto the developer program; the SDK only has an iPhone simulator. I have highlighted other concerns that I have in previous posts that become more real when you read posts such as this one.

What particularly irks me is the fact that the price of apps is perpetually being driven down to zero! Sure, there have been some hits but these are exceptions to the rule; the rule being: you aren’t going to make any money off your app! If you are a developer (or a company) trying to make a living out of this and you sell your app for 59p, you work out how many copies of it you are going to have to sell in order to make a living off of it. The numbers don’t add up!

I came across a good example of this over the weekend. There is a website called Breaking News Online and they have an iPhone app that they charge $2 for and there is a recurring $1 a month subscription to receive breaking news via push notifications. You can read about it on ReadWriteWeb here. Anyway, if you look at the comments, some people are bitching about having to fork out $1 a month! What? “Is anything worth that?”, they cry! $1 a month buys you half a latte from Starbucks (depending on the size of course) yet even this seems to be too much for some people! The bottom line is this: if you are in the business of just selling applications for the iPhone then you are soon going to be going out of business! You can blame cheap users for that.

I see two primary ways of making money from mobile applications: you can offer a mobile version that supplements whatever it is you happen to be selling, e.g. Salesforce does CRM and they have an iPhone app but it’s not their main line of business; or you can write mobile applications for other companies. There are a number of apps that I have come across recently that have inline ads but your app really needs to become popular before you can start making any serious money from it.

Myself, I am focusing on both approaches. I have done Cocoa development in the past (on the desktop) so focusing on mobile right now is just another string to add to my bow. I hope to have something in the app store during the course of the next couple of weeks … months … years … assuming of course that it doesn’t get blocked. We’ll see!

Finally, if you do want an iPhone app building then let me know. Just don’t ask me to try and sell it for you!

Written by admin

August 10th, 2009 at 4:53 am

Posted in Uncategorized

Opportunity Knocks

What follows is a brief list (without much of an explanation I might add) of what I perceive to be gaps in the market. Opportunities if you prefer. Areas where the current state of the art sucks! Not good enough. Please fix. You’ll be making my life much easier, as well as countless others. You’ll probably make a few billion on the side too!

  1. Create a search engine that works – Yahoo, Google, Bing etc. Err, no. I’m not suggesting it’s an easy problem to solve (it isn’t) but with some of the biggest companies in the search game you would think they’d be able to come up with something that didn’t involve me still having to type in hundreds of different combinations of words in the hope of finding what I’m looking for (and still not finding it!). I find myself having to do this all the time.
  2. Develop a decent desktop mail client – I currently use the Mail application on Mac OS X. It’s woeful! I can’t tag individual messages so they are easy to refer back to and I still can’t figure out how to mark all items (in the RSS feed reader) as being read. Windows users have Outlook. Enough said. And the response, “But I use Gmail all the time …”, is not a valid one! I want something that works on my desktop.
  3. System for delivering targeted adverts – Firstly, I very rarely click on adverts. Period. Some people do, judging by the amount of revenue Google makes! Anyway, I have lost track of the number of times I have come across adverts on sites that have nothing at all to do with what the site is about. I once saw an advert for Crucial memory on some fashion site! It has been shown that if you have adverts that are relevant to the user, click-through rates are much higher. Now Google attempts to do this with AdSense by looking at keywords in the page content but it doesn’t work well in my opinion. I used to have Google Ads on this blog but it kept showing adverts for belt buckles! I don’t tend to write about belt buckles much on this blog so not really relevant.

Anyway, these are just a few ideas to get you started. Now go forth and conquer and let me know when you’re done.

Written by admin

July 23rd, 2009 at 6:24 am

Posted in Uncategorized

Flushing The Document Early

This post is a note to myself and concerns the Transfer-Encoding header field, defined in the HTTP spec. I was reminded of its use while reading Even Faster Web Sites.

First, some assumptions for the example that follows. Let’s say that the header of your page contains some annoying Flash banner advert that is downloaded from a different host and that the body of the page takes a few seconds to generate – in the example below I suspend the current thread for 10 seconds.

The page will be generated on the server and then served back to the browser. The browser will then parse the HTML and proceed to fetch the banner ad etc. In the meantime the user will be sitting there wondering what, if anything, is going on! So how can we present a more user-friendly page? Something that feels more responsive to the user. Enter the Transfer-Encoding header. By setting it to chunked we can serve the header part of the page – the first chunk – to the browser while the server works on generating the body of the page; in other words we don’t need to generate the page all in one go and then serve it up. The Transfer-Encoding header informs the browser that the content for the current page is going to come down the pipe in pieces (“chunks”) and not all at once. It also has the added benefit that as soon as the browser retrieves the first part of the page (the “header”) it can start to download the banner ad in parallel* while it waits for the remaining part of the page from the server. Overall the page should feel more responsive.

So, how to do it in a Java servlet. You would think you would just call the setHeader method on the servlet response object but you don’t – what were you thinking? Turns out it’s even easier than that! An example is given below:


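// Needs java.io.IOException, java.io.PrintWriter, javax.servlet.ServletException,
// javax.servlet.http.HttpServletRequest and javax.servlet.http.HttpServletResponse
protected void doGet(HttpServletRequest request,
                     HttpServletResponse response)
   throws ServletException, IOException {

   response.setContentType("text/plain");
   PrintWriter out = response.getWriter();
   out.println("The start of the page.");
   // Send what has been written so far to the browser
   out.flush();
   // Wait 10 seconds for no reason whatsoever
   try {
      Thread.sleep(10000);
   } catch (InterruptedException e) {
      // Restore the interrupt flag and carry on
      Thread.currentThread().interrupt();
   }
   out.println("The rest of the page ...");
}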

Basically, if you write something to the output buffer and then call flush() that will automatically set the Transfer-Encoding header for you. If you remove the call to flush() then all the output will be buffered before it is sent back to the browser. If you fire up the example code in Tomcat (or something) and then look at it in a browser, the first line will be returned immediately followed 10 seconds later by the rest of the page; if you remove the call to flush() then the page content will be returned all at the same time after about 10 seconds or so.
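To make the “chunks” a little more concrete, here is a sketch of the framing that HTTP/1.1 chunked transfer coding puts around each piece: the size in hexadecimal, CRLF, the data, CRLF, with a zero-size chunk marking the end. The servlet container does this for you; the class ChunkedFramingDemo is purely illustrative and is not something you would write in a servlet.

```java
import java.nio.charset.Charset;

class ChunkedFramingDemo {
    private static final Charset ASCII = Charset.forName("US-ASCII");

    // Frames each piece as: size in hex, CRLF, data, CRLF.
    // A final zero-size chunk (with no trailers) ends the body.
    static String encode(String... pieces) {
        StringBuilder out = new StringBuilder();
        for (String piece : pieces) {
            int size = piece.getBytes(ASCII).length; // chunk sizes are counted in bytes
            if (size == 0) {
                continue; // a zero-size chunk would end the body early
            }
            out.append(Integer.toHexString(size)).append("\r\n");
            out.append(piece).append("\r\n");
        }
        out.append("0\r\n\r\n");
        return out.toString();
    }

    public static void main(String[] args) {
        String body = encode("The start of the page.\n", "The rest of the page ...\n");
        // Make the CR and LF characters visible when printing.
        System.out.println(body.replace("\r", "\\r").replace("\n", "\\n"));
    }
}
```

Each call to flush() in the servlet would correspond, roughly, to one such chunk being written to the socket.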

End of note.

(*) Most browsers open up a limited number of connections to a given host. For example, Firefox 3 opens up a maximum of 6 connections at any one time to a given host.

Written by admin

June 26th, 2009 at 8:22 am

Posted in Uncategorized

URL Shorteners

Yes, another post about URL shorteners. Recently, apart from complaining about them, I have been thinking about how URL shortening services work; services such as bit.ly and tinyurl.

Many of these services reduce a URL to a small string; typically a length of 3 to 6 characters. As a result, they can’t be simply hashing the URL. For example, if you use MD5 to hash the domain name of this site you get (in hexadecimal): 4302e8ae08795f0c67c932338f516e2f. The resulting hash value is longer than the URL itself! Not very useful for a URL shortening service.
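The length is a property of MD5 itself: the digest is always 16 bytes, i.e. 32 hexadecimal digits, whatever the input. A small sketch using the JDK’s MessageDigest shows this; the input "example.com" is a placeholder and not the domain used above.

```java
import java.math.BigInteger;
import java.nio.charset.Charset;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Md5LengthDemo {
    // Returns the MD5 digest of the input as 32 lowercase hex digits.
    static String md5Hex(String input) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(input.getBytes(Charset.forName("UTF-8")));
            // 16 bytes -> 32 hex digits, left-padded with zeros.
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        String hash = md5Hex("example.com"); // placeholder input
        System.out.println(hash + " (" + hash.length() + " characters)");
    }
}
```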

So how do they work? I still don’t know but here’s one approach that I took:

Let’s say we want to produce a code using characters from the following alphabet [a-zA-Z0-9]; that gives a total of 62 different alphanumeric characters. For a 5 character code there are 62*62*62*62*62 (=916,132,832) possible combinations. If we associate each code with a given URL – a simple one to one mapping – then that’s a lot of URLs! The key point is that, unlike a hash function, I don’t think the URL is used as input to determine what comes out of the other end; a “random” character code is generated and then is just stored with the URL so it can be retrieved with a simple table look-up.

I came up with a probabilistic approach to generating these “hash” codes. I say probabilistic as it just generates a code at random. If there is a collision, it just tries again and generates another one. So how likely are collisions to occur? Well, according to the birthday paradox we should expect to see a collision after generating roughly 2^(n/2) items for an n-bit code, or approx. every 2^15 items for a 5 character (30-bit) code using the example code, assuming, of course, that all generated codes are equally likely to occur. It’s an example, it will do!
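A minimal sketch of the generate-and-retry idea follows. It is not the linked source code: the class name ShortCodeDemo is made up, and a HashSet stands in for the record of used codes to keep the example small.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

class ShortCodeDemo {
    private static final String ALPHABET =
            "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
    private static final int CODE_LENGTH = 5;

    private final Random random;
    private final Set<String> used = new HashSet<String>(); // assumption: in-memory record

    ShortCodeDemo(Random random) {
        this.random = random;
    }

    // Picks CODE_LENGTH characters at random; on a collision, simply tries again.
    String nextCode() {
        while (true) {
            StringBuilder code = new StringBuilder(CODE_LENGTH);
            for (int i = 0; i < CODE_LENGTH; i++) {
                code.append(ALPHABET.charAt(random.nextInt(ALPHABET.length())));
            }
            String candidate = code.toString();
            if (used.add(candidate)) { // add returns false if the code was already used
                return candidate;
            }
        }
    }

    public static void main(String[] args) {
        ShortCodeDemo generator = new ShortCodeDemo(new Random());
        for (int i = 0; i < 3; i++) {
            System.out.println(generator.nextCode());
        }
    }
}
```

In a real service the code would be stored alongside the URL, and the retry loop would get slower as the code space fills up; neither concern is handled here.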

You can look at the source code here. In the example I use a bit vector to record what codes have already been generated; the bit vector is limited to representing 2^31-1 different values, therefore the example code is restricted to generating codes of at most 5 characters; each character requires 6 bits. I’ll let you do the math as I have already done it :)
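The arithmetic can be sketched as follows, assuming each character is packed into its own 6-bit slot of an int. This packing is a guess at what “each character requires 6 bits” implies, not a copy of the linked source. Five characters need 30 bits, which fits in a non-negative int; a sixth would need 36 bits and would not.

```java
import java.util.BitSet;

class CodeIndexDemo {
    private static final String ALPHABET =
            "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

    // Packs a 5 character code into an int, 6 bits per character (hypothetical scheme).
    static int toIndex(String code) {
        if (code.length() != 5) {
            throw new IllegalArgumentException("expected exactly 5 characters");
        }
        int index = 0;
        for (int i = 0; i < code.length(); i++) {
            int symbol = ALPHABET.indexOf(code.charAt(i)); // 0..61 fits in 6 bits
            if (symbol < 0) {
                throw new IllegalArgumentException("unexpected character: " + code.charAt(i));
            }
            index = (index << 6) | symbol;
        }
        return index;
    }

    public static void main(String[] args) {
        BitSet seen = new BitSet();
        int index = toIndex("aB3x9");
        seen.set(index); // mark the code as used
        System.out.println(seen.get(index)); // prints true
        System.out.println(toIndex("99999") < (1 << 30)); // prints true: 30 bits suffice
    }
}
```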

If you do run it you may have to increase the maximum heap size, e.g. -Xmx256m. I ran out of the heap space the first time I ran it using Eclipse!

It’s a first attempt so there is likely some room for improvement but it’s a start. Would be great to hear any alternative thoughts on how these things work.

Written by admin

June 24th, 2009 at 7:37 am

Posted in Uncategorized

Alternative App Store?

I am about to embark* on developing an application for the iPhone, of which I am going to sell thousands of copies and then retire to the Caribbean (just like all iPhone apps right?) but I have a few concerns, primarily with Apple’s app store policy about what qualifies (and disqualifies) an application from being sold – or given away – on the app store. Now I don’t know much about Apple’s policy but I have followed the various discussions about it in the “media” so I know about things such as how if your application is deemed to compete in some way with Apple then your app will be rejected etc but it doesn’t help when I keep reading articles like this. From a business perspective I don’t want to spend months developing something only to have Apple turn around and reject it!

This neatly segues into my next point: Why isn’t there an independent store selling applications for the iPhone? A place where all the misfit applications rejected by Apple can be sold on – think of it as a council estate for mobile phone apps. Maybe there is one; I have no idea. I can see why developers would want to sell their apps through the App store as it’s baked right into iTunes and it’s easy to pay for and put on your iPhone etc but surely there must be some other way to manage applications? I guess I’ll find out soon enough but if you have any useful advice in the meantime, please leave a comment.

* when I say “about to embark” that’s actually dependent on dragging myself away from my keyboard and down to the Apple store to buy myself a new MacBook Pro; mine is getting a bit long in the tooth. As they have just released the latest versions and lowered the price I guess I don’t have any excuse not to get one.

Written by admin

June 19th, 2009 at 2:03 am

Posted in Uncategorized