Quote

I came across this quote the other day and thought I would share it:

Nothing in the world can take the place of persistence. Talent will not; nothing is more common than unsuccessful men with talent. Genius will not; unrewarded genius is almost a proverb. Education will not; the world is full of educated derelicts. Persistence and determination alone are omnipotent. The slogan “press on” has solved and always will solve the problems of the human race.

– Calvin Coolidge

Introduction to VoltDB

Following up on the recent tradition, or so it seems, of starting every one of my blog posts with the words, “Introduction to”, my VoltDB tutorial has (finally!) been published on the developerWorks site: Introduction to VoltDB.

The latest version of the source code that accompanies the article can be cloned from the VoltDB example project on my GitHub account here.

If you have any feedback, please leave a comment.

Introduction to Riak: Part Deux

Part 2 of my introduction to Riak has just been published. You can view it here:

http://www.ibm.com/developerworks/web/library/os-riak2/index.html

I’ve had a quick look and there appears to be an encoding issue in Listing 2. Ignore the question marks. It should read:

$ curl -i http://localhost:8098/riak/odds/
...
{ "odds":"", "description":"" }

Hopefully they will have corrected it by the time you read this.

Other than that it’s more or less how I submitted it (I think). I’ll go over it in more detail later on. Let me know what you think.

Simulating Auto Increment in VoltDB

Auto-incrementing fields are quite useful, particularly for allocating values to primary keys. MySQL has AUTO_INCREMENT and PostgreSQL has a SERIAL data type. VoltDB has neither, nor anything remotely close to them. This brief article will show you how to simulate auto-incrementing fields in VoltDB. It assumes some knowledge of VoltDB.

VoltDB implements a subset of ANSI-standard SQL. It supports the basic CRUD operations (INSERT, SELECT, UPDATE, DELETE) but it has no support for automatically generating unique identifiers. It is possible, however, to simulate these in VoltDB, as per this entry in the FAQ. What we can do is create a table with one row per table name, holding the current value to use as the unique value for, say, a given column in that table. The schema for the table is shown below:

CREATE TABLE IDENTIFIER (
   TABLE_NAME VARCHAR(100) NOT NULL,
   CURRENT_VALUE INTEGER DEFAULT 1 NOT NULL,
   PRIMARY KEY (TABLE_NAME)
);
The next step is to create a stored procedure that, when called, will return the current value for a given table. The stored procedure will read the current value, increment it, and then return the value to the client. More…
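The read, increment and return steps described above can be sketched roughly as follows. This is only an illustration of the pattern: a real VoltDB stored procedure is written in Java, whereas this sketch uses Python with an in-memory SQLite database as a stand-in, and the helper name next_id is made up for the example.

```python
import sqlite3

# SQLite is only a stand-in here so the sketch runs anywhere;
# it is not VoltDB and this is not a VoltDB stored procedure.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IDENTIFIER ("
    " TABLE_NAME VARCHAR(100) NOT NULL,"
    " CURRENT_VALUE INTEGER DEFAULT 1 NOT NULL,"
    " PRIMARY KEY (TABLE_NAME))"
)


def next_id(conn, table_name):
    """Hypothetical helper: return the current value for table_name, then increment it."""
    with conn:  # read and increment inside one transaction
        # Create the row on first use; CURRENT_VALUE defaults to 1.
        conn.execute(
            "INSERT OR IGNORE INTO IDENTIFIER (TABLE_NAME) VALUES (?)",
            (table_name,),
        )
        (current,) = conn.execute(
            "SELECT CURRENT_VALUE FROM IDENTIFIER WHERE TABLE_NAME = ?",
            (table_name,),
        ).fetchone()
        conn.execute(
            "UPDATE IDENTIFIER SET CURRENT_VALUE = CURRENT_VALUE + 1"
            " WHERE TABLE_NAME = ?",
            (table_name,),
        )
    return current
```

Each call hands out the value currently stored for the named table and bumps the stored value by one, so successive calls for the same table never return the same number.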

Introduction to Riak

Several months ago – in a galaxy far, far away – I received an email inquiring as to whether I was interested in writing a couple of articles about Riak for IBM’s developerWorks site. I was, so I did – I first wrote something about Riak a while back on this site over here. Anyway, after a bit of a wait, the first one was released into the wild today. You can read it here:

http://www.ibm.com/developerworks/library/os-riak1/

It’s mostly intact although a few paragraphs appear to have fallen by the wayside. Not really surprising as the article was supposed to be under 3000 words whereas the (final) version I submitted was quite a bit over that.

There’s a second installment but I have no idea when it will be published. Or if, for that matter. I guess that may depend on the reaction to the first one :)

Multi-Tenancy

I attended the Alfresco conference in London in the middle of November and there was a fair amount of talk about Alfresco’s cloud offering that – if it’s not already available – was due to be launched fairly soon. It will be a hosted service and will allow a single instance of Alfresco to host multiple sites (or tenants). This is usually referred to as multi-tenancy. There are a number of different approaches but the simplest one involves sharing the same database; at the database level you can think of each entry in a table, e.g. forum posts, having something like a site ID column that indicates which site the entry belongs to.
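To make the shared-database approach concrete, here is a minimal sketch of a table with a site ID column. It uses Python with an in-memory SQLite database; the table name forum_post, its columns and the helper posts_for_site are all invented for illustration and have nothing to do with Alfresco's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE forum_post ("
    " id INTEGER PRIMARY KEY,"
    " site_id INTEGER NOT NULL,"  # which tenant (site) owns the row
    " body TEXT NOT NULL)"
)
# Rows for two different tenants live side by side in the same table.
conn.executemany(
    "INSERT INTO forum_post (site_id, body) VALUES (?, ?)",
    [
        (1, "Hello from site 1"),
        (2, "Hello from site 2"),
        (1, "Another post on site 1"),
    ],
)


def posts_for_site(conn, site_id):
    """Hypothetical helper: return only the posts that belong to one site."""
    rows = conn.execute(
        "SELECT body FROM forum_post WHERE site_id = ? ORDER BY id",
        (site_id,),
    )
    return [body for (body,) in rows]
```

Every query the application issues has to carry the site ID filter, which is what being multi-tenant aware amounts to at this level.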

I started thinking about it and I don’t get it. I understand technically how multi-tenancy works; I just don’t see the benefits of making an application multi-tenant aware! More…

Yubico Java Client Changes

Just a quick note. As part of the integration work I did getting YubiKey to work with Alfresco, I forked the original Yubico Java client and added support for signatures and for making validation queries in parallel, so it should now work with version 2 of the validation protocol; see this FAQ. Hopefully I didn’t fork it up!

You can grab it from my GitHub account: https://github.com/sbuckle/yubico-java-client

If you do decide to use it, you might want to pick and choose which bits you want to pull as I have made other changes not related to the enhancements in version 2.0 of the validation protocol. Now I wonder if I’ll get my five free YubiKeys ;)

Update: I did. Just ordered them. Thanks Yubico :)

Two-Factor Authentication with Alfresco

So what is two-factor authentication? I’ll defer that explanation to the Wikipedia page on the subject. Most systems require users to identify themselves using a username and password. The problem is that if people choose a weak password, which evidence suggests they do, suddenly your secure authentication system is not so secure. Using “something you have” in the authentication process makes it much more secure. Take cash machines as an example. If I discover your PIN, I can only take money out of an ATM if I am in possession of your bank card. Without the card, knowing the PIN is not going to help me steal your money.

I’ve created an Alfresco extension that implements two-factor authentication using a YubiKey.

What is a YubiKey? It’s a device that you plug into your USB port and it generates one-time passwords (OTPs). It’s similar to RSA’s SecurID, only a lot cheaper. So now, in addition to specifying your username and password, you also have to submit an OTP when logging in – the OTPs are validated by Yubico’s servers.

Using a key like this makes logging in a lot more secure as it is now no longer possible to log in just using a username and password. In addition, each key is tied to a particular user account – the extension takes care of this – so it’s not possible to just use any key; the user has to use the key that has been (uniquely) assigned to them. The screencast below shows how it works.
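As a rough illustration of tying a key to an account (this is not the extension's code, just a sketch of the idea): a standard Yubico OTP ends with a 32-character encrypted part, and whatever precedes it is the key's public ID, which identifies the physical key. The ASSIGNED_KEYS mapping, the sample ID and the function names below are all hypothetical.

```python
MODHEX = set("cbdefghijklnrtuv")  # the 16 characters a YubiKey types
TOKEN_LEN = 32                    # trailing encrypted part of the OTP


def public_id(otp):
    """Return the key's public ID: everything before the last 32 characters."""
    otp = otp.strip().lower()
    if len(otp) <= TOKEN_LEN or not set(otp) <= MODHEX:
        raise ValueError("not a well-formed YubiKey OTP")
    return otp[:-TOKEN_LEN]


# Hypothetical store of which key has been assigned to which account.
ASSIGNED_KEYS = {"alice": "cccccccfhcbe"}


def key_belongs_to(username, otp):
    """True only if the OTP came from the key assigned to this user."""
    try:
        return ASSIGNED_KEYS.get(username) == public_id(otp)
    except ValueError:
        return False
```

A check like this only establishes which key produced the OTP. The OTP itself still has to be validated by Yubico's servers, and the username and password still have to be checked as usual.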

I will release the extension shortly. You can download the extension from here. I’ll be attending Alfresco DevCon in London so come and say hi and I can give you a live demo of the system. In the meantime, if you have any questions, feel free to leave a comment or send me an email.

Using Hadoop to Analyze Apache Log Files

After my post a few days ago about analyzing Apache log files with Riak, I thought I would follow that up by showing how to do the same thing using Hadoop. I am not going to cover how to install Hadoop; I am going to assume you already have it installed. What is it they say about assumptions? Also, any Hadoop commands are executed relative to the directory where Hadoop is installed ($HADOOP_HOME). Read More…

Analyzing Apache Logs with Riak

This article will show you how to do some Apache log analysis using Riak and MapReduce. Specifically, it will give an example of how to extract URLs from Apache logs stored in Riak (the map phase) and provide a count of how many times each URL was requested (the reduce phase).
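To give a feel for the two phases before involving Riak at all, here is a small plain-Python sketch. It does not use Riak's API; the function names and the regular expression are assumptions made for the example, and the log lines are expected to be in the usual common or combined Apache format.

```python
import re
from collections import Counter

# Matches the quoted request in a log line, e.g. "GET /index.html HTTP/1.0",
# and captures the URL.
REQUEST = re.compile(r'"(?:[A-Z]+) (\S+) [^"]*"')


def map_phase(line):
    """Map: emit [url] for one log line, or [] if no request is found."""
    match = REQUEST.search(line)
    return [match.group(1)] if match else []


def reduce_phase(urls):
    """Reduce: count how many times each URL appears."""
    return dict(Counter(urls))


def count_urls(lines):
    """Run the map phase over every line, then reduce the combined output."""
    urls = [url for line in lines for url in map_phase(line)]
    return reduce_phase(urls)
```

The map function looks at one log entry at a time and knows nothing about the others, which is what lets a system like Riak run it wherever the data lives; only the reduce step needs to see the combined output.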

So what is Riak? According to Wikipedia it’s “a NoSQL database implementing the principles from Amazon’s Dynamo paper”. Or, put another way, it’s a distributed key-value store that has built-in support for MapReduce. If you aren’t familiar with MapReduce, a good starting point would be to read Google’s MapReduce paper. I am not going to go over how to install Riak; there’s a good tutorial for that on the Riak website. Riak also has a lot of other features that won’t be covered here. Read More…