Tuesday, December 23, 2008

Hit Them Harder Until They Learn

Do you believe that hitting people harder will make them learn? No? Do you believe that prison will turn a bad person into a good citizen? If not, why don't you do something about it?

People like simple solutions. Out of sight, problem solved. Lock someone up, it will turn them into a nice fellow. Because you don't like to be locked up, you believe that others feel the same. The question remains: Why should locking someone up turn them into a better person?

Well, it does work ... for monks. They willingly lock themselves up somewhere to be able to concentrate on those things that are dear to them and where they want to improve. But there is a big difference between locking up yourself (like a hermit) and locking up someone with a bunch of criminals and guards. Just make a wild guess where most criminals learn their trade ...

The relapse rates vary a lot (by age, type of crime, prison, treatment). Still, I'm more astonished that not all inmates become criminals again than that some do.

That said, it is refreshing when you meet people who don't fall for the temptation of retribution. The guys at stackoverflow.com recently discovered that some people gamed the system to hurt other members or to increase their status. Instead of punishing them, they decided to just take away any advantage of trying to game the system. Problem really solved. Carry on, commander!

Links: Vote Fraud And You
Ein Gefängnisersatz mit Rückfallquote null (German)
GEFÄNGNIS - LEBEN HINTER GITTERN (German)

Thursday, December 18, 2008

Holding a Program in One's Head

In my perpetual search for brain food for programmers, I've found this article: Holding a Program in One's Head

Tuesday, December 02, 2008

Obama Goes Creative Commons

In case you've been living under a rock for the past few months: the next president of the USA, Barack Obama, had his team build a web site where he shares his thoughts, ideas and plans. That in itself probably constitutes a revolution, but it gets better: You can talk to these guys. Or rather send them your ideas, hopes and worries. And it seems these really count. I mean, how much better can it get?

It can: The site and all its content are under the Creative Commons license CC-BY, which basically means the content is free as long as you say where you got it from (read the license for details). Amazing :)

Thanks for the nice Christmas present, Mr. Soon-President! It really makes me happy to know that there is finally someone who gets the Internet.

Links: Lawrence Lessig's Blog (he's one of the founders of CC, just in case).

Stackoverflow: Reputation over 1000 :)

Just a tiny post to cheer the fact that my reputation on Stackoverflow.com has transcended 1,000. Yay!

Monday, December 01, 2008

Writing Testable Code

Just stumbled over this article: "Writing Testable Code". Apparently, it's a set of rules which Google uses. While I'm not a 100% fan of Google, this is something every developer should read and understand.

Saturday, November 29, 2008

How To Be Agile

The article "When Agile Projects Go Bad" got me thinking. I've talked to many people about XP and Agile Development and TDD and the usual question is: "How do we make it work?" And the next sentence is: "This won't work for us because we can't do this or that."

This is a general misconception which comes from the ... uh ... "great" methodologies which you were taught in school: the waterfall model, the V model, the old dinosaurs. They told you: "You must follow the rules to the letter or doom will rain on your head!" Since you could never follow all the rules, they could easily say "Told you so!" when things didn't work out.

Agile development is quite different in this respect. First of all, it assumes that you're an adult. That you have a brain and can actually use it. It also assumes that you want to improve your situation. It also assumes nothing else.

When a company is in trouble, it will call for help. Expensive external advisers will be called in, and they will think about the situation for a long time (= more money for them). After a while (when the new yacht is paid for), they will come up with what's wrong and how to fix it. Did you know that in most companies in trouble, the external advisers will just repeat what they heard from the people working there?

It's not that people don't know what's wrong, it's just not healthy to mention it ... at least if you want to work there. So people walk around, with the anger in their hearts and the fist in the pocket and nothing will happen until someone from the outside comes in and states the obvious. Can't happen any other way because if it could, you wouldn't be in this situation in the first place.

Agile Development is similar. It acknowledges that you're smart and that you know what's wrong and that you don't have the power to call in help. What it does is it offers you a set of tools, things that have worked for other people in the past and some of them might apply to you. Maybe all. Probably not. Most likely, you will be able to use one or two. That doesn't sound like much but the old methodologies are pretty useless if you can't implement 90%+. Agile is agile. It can bend and twist and fit in your routine.

So you're thinking about doing TDD. Do you have to ask your boss? No. Do you have to get permission from anyone? No. Do you have to tell anyone? No. Can you do it any time you like, as often as you like, stop at will? Yes. If it doesn't work for you in your situation, for the current project, then don't use it. No harm done, nothing gained either.

But if you can use it, every little bit will help. Suddenly, you will find yourself able to deliver on time. Your code will work and it will be much more solid than before. You will be able to do more work in less time. People will notice. Your reputation will increase. And eventually, they will be curious: How do you do it? "TDD." What's that?

You win.

Be agile. Pick and choose. Pick what you think will work, try it, drop it if it doesn't deliver. And if it works, try the next thing. Evolve. Become the better you.

Agile is not a silver bullet. It won't miraculously solve all your issues. You still have to think and be an adult about your work. It's meant to be that way. I don't do every Agile practice every day. Sometimes, I don't even TDD (and I regret it every time). But I always return because life is just so much simpler.

Friday, November 28, 2008

Space: Not So Black And Empty After All

If you always wanted to know what NASA does with all the billions of dollars spent, here are some images.

Wednesday, November 26, 2008

Navigating SharePoint Folders With Axis2

I've just written some test code to get a list of items in a SharePoint folder with Apache Axis2 and since this was "not so easy", I'll share my insights here.

First, you need Axis2. If you're using Maven2, put this in your pom.xml:

    <dependency>
        <groupId>org.apache.axis2</groupId>
        <artifactId>axis2-kernel</artifactId>
        <version>1.4.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.axis2</groupId>
        <artifactId>axis2-adb</artifactId>
        <version>1.4.1</version>
    </dependency>

Next stop: Setting up NTLM authentication.

import java.util.ArrayList;
import java.util.List;

import org.apache.axis2.transport.http.HttpTransportProperties;
import org.apache.commons.httpclient.auth.AuthPolicy;

        // Credentials for the NT domain; replace the placeholder values
        HttpTransportProperties.Authenticator auth = new
            HttpTransportProperties.Authenticator();
        auth.setUsername ("username");
        auth.setPassword ("password");
        auth.setDomain ("ntdom");
        auth.setHost ("host.domain.com");

        // Only offer NTLM to the server
        List<String> authPrefs = new ArrayList<String> (1);
        authPrefs.add (AuthPolicy.NTLM);
        auth.setAuthSchemes (authPrefs);

This should be the username/password you use to log in to the NT domain "ntdom" on the NT domain server "host.domain.com". Often, this server is the same as the SharePoint server you want to connect to.

If the SharePoint server is somewhere outside your intranet, you may need to specify a proxy:

        HttpTransportProperties.ProxyProperties proxyProperties =
            new HttpTransportProperties.ProxyProperties();
        proxyProperties.setProxyName ("your.proxy.com");
        proxyProperties.setProxyPort (8888);

You can get these values from your Internet browser.

If there are several SharePoint "sites" on the server, set site to the relative URL of the site you want to connect to. Otherwise, leave site empty. If you have no idea what I'm talking about, browse the SharePoint server in Internet Explorer. In the location bar, you'll see a URL like this: https://sp.company.com/projects/demo/Documents2/Forms/AllItems.aspx?RootFolder=%2fprojects%2fdemo%2fDocument2%2f&FolderCTID=&View=%7b18698D80%2dE081%2d4BBE%2d96EB%2d73BA839230B9%7d. Scary, huh? Let's take it apart:

https:// = the protocol,
sp.company.com = The server name (with domain),
projects/demo = The "site" name
Documents2 = A "list" stored on the site "projects/demo"
/Forms/AllItems.aspx?RootFolder=... is stuff to make IE happy. Ignore it.

So in our example, we have to set site to:

        String site = "/projects/demo";

Mind the leading slash!
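
The decomposition above can be sketched in a few lines of Java. This is only an illustration: the class SharePointUrlParts and its methods are hypothetical names, and the sketch assumes that the list is the last path segment in front of "/Forms/", which holds for the example URL but may not hold for every kind of list. It also builds the "_vti_bin/Lists.asmx?WSDL" address that is used for verification in the next step.

```java
import java.net.URI;

// Hypothetical helper: splits a SharePoint browser URL into server, site and list.
final class SharePointUrlParts {
    final String server; // e.g. "https://sp.company.com"
    final String site;   // e.g. "/projects/demo", or "" for the root site
    final String list;   // e.g. "Documents2"

    private SharePointUrlParts(String server, String site, String list) {
        this.server = server;
        this.site = site;
        this.list = list;
    }

    static SharePointUrlParts parse(String browserUrl) {
        URI uri = URI.create(browserUrl);
        String path = uri.getPath(); // the query string ("?RootFolder=...") is dropped
        int forms = path.indexOf("/Forms/");
        if (forms < 0) {
            throw new IllegalArgumentException("No /Forms/ segment in " + browserUrl);
        }
        // Everything before "/Forms/" is "<site>/<list>"
        String siteAndList = path.substring(0, forms);
        int lastSlash = siteAndList.lastIndexOf('/');
        if (lastSlash < 0) {
            throw new IllegalArgumentException("No list name in " + browserUrl);
        }
        String server = uri.getScheme() + "://" + uri.getAuthority();
        return new SharePointUrlParts(server,
                siteAndList.substring(0, lastSlash),   // keeps the leading slash
                siteAndList.substring(lastSlash + 1));
    }

    // Address of the WSDL definition of the Lists web service for this site
    String wsdlUrl() {
        return server + site + "/_vti_bin/Lists.asmx?WSDL";
    }
}
```

For a list on the root site, site comes out as the empty string, which matches the "leave site empty" rule above.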

To verify that this is correct, replace "/Documents2/Forms/" and anything beyond with "/_vti_bin/Lists.asmx?WSDL". That should return the WSDL definition for this site. Save the result as "sharepoint.wsdl" (File menu, "Save as..."). Install Axis2, open a command prompt in the directory where you saved the WSDL file and run this command (don't forget to replace the Java package name):

%AXIS2_HOME%\bin\WSDL2Java -uri sharepoint.wsdl -p java.package.name -d adb -s

This will create a "src" directory with the Java package and a single file "ListsStub.java". Copy it into your Maven2 project.

Now, we can get a list of the lists on the site:

        ListsStub lists = new ListsStub
            ("https://sp.company.com"+site+"/_vti_bin/Lists.asmx");
        // org.apache.axis2.client.Options; also used for the proxy settings
        Options options = lists._getServiceClient ().getOptions ();
        options.setProperty (HTTPConstants.AUTHENTICATE, auth);

If you need a proxy, specify it here:

        options.setProperty (HTTPConstants.HTTP_PROTOCOL_VERSION,
            HTTPConstants.HEADER_PROTOCOL_10);
        options.setProperty (HTTPConstants.PROXY, proxyProperties);

We need to reduce the HTTP protocol version to 1.0 because most proxies don't allow sending multiple requests over a single connection. If you want to speed things up, you can try commenting out this line, but be prepared to see it fail afterwards.

Okay. The plumbing is in place. Now we query the server for the lists it has:

        String liste = "Documents2";
        String document2ID;
        {
            ListsStub.GetListCollection req = new ListsStub.GetListCollection();
            ListsStub.GetListCollectionResponse res = lists.GetListCollection (req);
            displayResult (req, res);
            
            document2ID = getIDByTitle (res, liste);
        }

This downloads all lists defined on the server and searches for the one we need. If you're in doubt about what the name of the list might be: Check the bread crumbs in the blue part of the page in Internet Explorer. The first two items are the title of the site and the list you're currently in.

displayResult() is the usual XML dump code:

    private void displayResult (GetListCollection req,
            GetListCollectionResponse res)
    {
        System.out.println ("Result OK: "
                +res.localGetListCollectionResultTracker);
        OMElement root = res.getGetListCollectionResult ()
                .getExtraElement ();
        dump (System.out, root, 0);
    }

    private void dump (PrintStream out, OMElement e, int indent)
    {
        indent(out, indent);
        out.print (e.getLocalName ());
        for (Iterator iter = e.getAllAttributes (); iter.hasNext (); )
        {
            OMAttribute attr = (OMAttribute)iter.next ();
            out.print (" ");
            out.print (attr.getLocalName ());
            out.print ("=\"");
            out.print (attr.getAttributeValue ());
            out.print ("\"");
        }
        out.println ();
        
        for (Iterator iter = e.getChildElements (); iter.hasNext (); )
        {
            OMElement child = (OMElement)iter.next ();
            dump (out, child, indent+1);
        }
    }

    private void indent (PrintStream out, int indent)
    {
        for (int i=0; i<indent; i++)
            out.print ("    ");
    }

We also need getIDByTitle() to search for the ID of a SharePoint list:

    private String getIDByTitle (GetListCollectionResponse res, String title)
    {
        OMElement root = res.getGetListCollectionResult ().getExtraElement ();
        QName qnameTitle = new QName ("Title");
        QName qnameID = new QName ("ID");
        for (Iterator iter = root.getChildrenWithLocalName ("List"); iter.hasNext (); )
        {
            OMElement list = (OMElement)iter.next ();
            if (title.equals (list.getAttributeValue (qnameTitle)))
                return list.getAttributeValue (qnameID);
        }
        return null;
    }

With that, we can finally list the items in a folder:

        {
            String dir = "folder/subfolder";

            ListsStub.GetListItems req
                = new ListsStub.GetListItems ();
            req.setListName (document2ID);
            QueryOptions_type1 query
                = new QueryOptions_type1 ();
            OMFactory fac = OMAbstractFactory.getOMFactory();
            OMElement root = fac.createOMElement (
                new QName("", "QueryOptions"));
            query.setExtraElement (root);

            OMElement folder = fac.createOMElement (
                new QName("", "Folder"));
            root.addChild (folder);
            folder.setText (liste+"/"+dir); // <--!!

            req.setQueryOptions (query);
            GetListItemsResponse res = lists.GetListItems (req);
            displayResult (req, res);
        }

The important bit here is: To list the items in a folder, you must include the name of the list in the "Folder" element! For reference, this is the XML which is actually sent to the server:

<?xml version='1.0' encoding='UTF-8'?>
<soapenv:Envelope xmlns:soapenv="http://www.w3.org/2003/05/soap-envelope">
    <soapenv:Body>
        <ns1:GetListItems xmlns:ns1="http://schemas.microsoft.com/sharepoint/soap/">
            <ns1:listName>{12AF2346-CCA1-486D-BE3C-82223DEC3F42}</ns1:listName>
            <ns1:queryOptions>
                <QueryOptions>
                    <Folder>Documents2/folder/subfolder</Folder>
                </QueryOptions>
            </ns1:queryOptions>
        </ns1:GetListItems>
    </soapenv:Body>
</soapenv:Envelope>

If the folder name is not correct, you'll get a list of all files and folders that the SharePoint server can find anywhere. The folder names can be found in the bread crumbs. The first two items are the site and the list name, respectively, followed by the folder names.

The last missing piece is displayResult() for the items:

    private void displayResult (GetListItems req,
         GetListItemsResponse res)
    {
        System.out.println ("Result OK: "
                +res.localGetListItemsResultTracker);
        OMElement root = res.getGetListItemsResult ()
                .getExtraElement ();
        dump (System.out, root, 0);
    }

If you run this code and you see the exception "unable to find valid certification path to requested target", this article will help.

If the SharePoint server returns an error, you'll see "detail unsupported element in SOAPFault element". I haven't found a way to work around this bug in Axis2. Try to set the log level of "org.apache.axis2" to "DEBUG" and you'll see what the SharePoint server sent back (not that it will help in most of the cases ...)

Links: GetListItems on MSDN, How to configure Axis2 to support Basic, NTLM and Proxy authentication?, Java to SharePoint Integration - Part I (old, for Java 1.4)

Good luck!

Wednesday, November 19, 2008

"Hunderte von Milliarden" auf Perry-Rhodan.net

I've published a story :) The story itself is in German.

I admit it, I'm a Perry Rhodan fan. Not only because it's the largest sci-fi series in the world (by now 2466 issues of 64 pages each, one every week, for almost 50 years now!). The current events around Roi Danton and Dantyren kept me busy until I had to put a story on paper (or, in this case, into a PDF).

Arndt Ellmer was kind enough to place it in the LKS Galerie on the Perry Rhodan homepage. The title is "Hunderte von Milliarden" ("Hundreds of Billions") and it contains my interpretation of statements like "Der Erbe des Universums" ("The Heir of the Universe").

Enjoy!

Feedback is welcome. Either attach it as a comment or send a mail (digulla at hepe dot com or dark at pdark dot de).

Stuck? Ask Stack Overflow

Stuck with a hard programming problem? Just solved an impossible problem and want to show the world your genius? Don't know how to solve a problem with your favorite OS or programming language? Check out stackoverflow.com.

Testing the Impossible: Rules of Thumb

When people say "we can't test that", they usually mean "... with a reasonable effort". They say "we can't test that because it's using a database" or "we can't test the layout of the UI" or "to test this, we need information which is buried in private fields of that class".

And they are always wrong. You can test everything. Usually with a reasonable effort. But often, you need to take a step back and do the unusual. Some examples.

So your app is pumping lots of data into a database. You can't test the database. You'd need to scrap it for every test run and build it from scratch which would take hours or at least ages. Okay. Don't test the database. Test how you use it. You're not looking for bugs in the database, you're looking for bugs in your code. Saying "but some bugs might get away" is just a lame excuse.

Here is what you need to do: Identify independent objects (which need no other objects stored in the database). Write tests for those. Put the test data for them in an in-memory database. HSQLDB and Derby are your friends. If you must, use your production database but make the schema configurable. Scrap the tables before the test and load them from clean template tables.

So you need some really spiffy SQL extensions? Put them in an isolated place and test them without everything else against the real database. You need to test that searching a huge amount of data works? Put that data in a static test database. Switch database connections during the tests. Can't? Make that damn connection provider configurable at runtime! Can't? Sure you can. If everything else fails, get the source with JAD, compile that into an independent jar and force that as the first thing into the classpath when you run your tests. Use a custom classloader if you must.
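
A connection provider that can be reconfigured at runtime might look like the following sketch. Everything named here is an assumption for illustration: the class ConfigurableConnectionProvider, the system property "app.jdbc.url" and the user "sa" with an empty password are made up for this example. The two JDBC URLs are the in-memory forms of HSQLDB and Derby; the matching driver jar has to be on the classpath before open() can succeed.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical provider: the JDBC URL can be changed per test run or at runtime.
final class ConfigurableConnectionProvider {
    // Assumed property name, not taken from any framework
    static final String URL_PROPERTY = "app.jdbc.url";
    // In-memory HSQLDB database used when nothing else is configured
    static final String DEFAULT_TEST_URL = "jdbc:hsqldb:mem:unittest";
    // In-memory Derby alternative
    static final String DERBY_TEST_URL = "jdbc:derby:memory:unittest;create=true";

    private String url = System.getProperty(URL_PROPERTY, DEFAULT_TEST_URL);
    private String user = "sa";   // assumed test credentials
    private String password = "";

    String getUrl() {
        return url;
    }

    // Switch databases between tests, e.g. to a static database with bulk data
    void setUrl(String url) {
        if (url == null || url.isEmpty()) {
            throw new IllegalArgumentException("JDBC URL must not be empty");
        }
        this.url = url;
    }

    void setCredentials(String user, String password) {
        this.user = user;
        this.password = password;
    }

    // Opens a connection to whatever database is currently configured
    Connection open() throws SQLException {
        return DriverManager.getConnection(url, user, password);
    }
}
```

Production code asks the provider for connections and never sees the URL, so a test can point the same code at an in-memory database, a static test database or the real thing.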

While this is not perfect, it will allow you to learn how to test. How to test your work. Testing is always different just like every program is different. Allow yourself to make mistakes and to learn from them. Tackle the harder problems after the easier ones. Make the tests help you learn.

So you have this very complex user interface. Which you can't test. Let alone starting the app takes ten minutes and the UI changes all the time and ... Okay. Stop the whining. Your program is running on a computer and for the same inputs, a computer should return the same outputs, right? Or did you just build a big random number generator? Something to challenge the Infinite Improbability Drive? No? Then you can test it. Follow me.

First, cut the code that does something from the code that connects said code to the UI. As a first simple step, we'll just assume that pressing a button will actually invoke your method. If this fails for some reason, that reason can't be very hard to find, so we can safely ignore these simple bugs for now.

After this change, you have the code that does stuff by the scruff of its neck. Now, you can write tests for it. Reduce entanglement. Keep separate issues separate. A friend of mine builds all his code around a central event service. Service providers register themselves and other parts of the code send events to do stuff. It costs a bit of performance but it makes testing as easy as overwriting an existing service provider with a mock-up.
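
To make the event service idea concrete, here is a minimal sketch. It is not my friend's actual design; EventService, Provider, SaveButtonLogic and the event name "document.save" are hypothetical names chosen for this example. The point is only that the code behind the button talks to the service, so a test can register a mock provider under the same event name.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical central event service: one provider per event name.
final class EventService {
    interface Provider {
        void handle(String payload);
    }

    private final Map<String, Provider> providers = new HashMap<>();

    // Registering a second provider for the same event overwrites the first one
    void register(String eventName, Provider provider) {
        providers.put(eventName, provider);
    }

    void send(String eventName, String payload) {
        Provider provider = providers.get(eventName);
        if (provider == null) {
            throw new IllegalStateException("No provider registered for " + eventName);
        }
        provider.handle(payload);
    }
}

// The code "that does something", free of any UI classes.
final class SaveButtonLogic {
    private final EventService events;

    SaveButtonLogic(EventService events) {
        this.events = events;
    }

    // The UI layer only forwards the click to this method
    void onSaveClicked(String documentName) {
        events.send("document.save", documentName.trim());
    }
}

final class EventServiceDemo {
    public static void main(String[] args) {
        EventService events = new EventService();
        // In a test, a mock provider simply records what it was asked to do
        List<String> saved = new ArrayList<>();
        events.register("document.save", saved::add);
        new SaveButtonLogic(events).onSaveClicked("  report.txt ");
        System.out.println(saved);
    }
}
```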

Your software needs an insanely complex remote server? How about replacing this with a small proxy that always returns the same answers? Or at least fakes something that looks close enough to a real answer to make your code work (or fail when you're testing the error handling).
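
One way to fake such a server is the small HTTP server that ships with the JDK (com.sun.net.httpserver). The following is a sketch under assumptions: CannedAnswerServer is a hypothetical name, it serves a single fixed answer for a single path, and it assumes the remote server speaks plain HTTP. Returning a status like 500 lets you test the error handling.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a complex remote server: always returns the same answer.
final class CannedAnswerServer implements AutoCloseable {
    private final HttpServer server;

    CannedAnswerServer(String path, int status, String body) throws IOException {
        // Port 0 lets the operating system pick a free port
        server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        server.createContext(path, exchange -> {
            exchange.sendResponseHeaders(status, bytes.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(bytes);
            }
        });
        server.start();
    }

    // URL that the code under test should be pointed at
    String url(String path) {
        return "http://127.0.0.1:" + server.getAddress().getPort() + path;
    }

    @Override
    public void close() {
        server.stop(0);
    }

    // Tiny client used here only to show that the fake answers
    static String fetch(String url) throws IOException {
        HttpURLConnection con = (HttpURLConnection) URI.create(url).toURL().openConnection();
        try (InputStream in = con.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } finally {
            con.disconnect();
        }
    }
}
```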

And if you need data that some stubborn object won't reveal, use the source, Luke (download the source and edit the offender to make the field public, remove "final" from all fields, add a getter or make it protected and extend the class in the tests). If everything else fails, turn to java.lang.reflect.Field.setAccessible(true).
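
The reflection route could look like this minimal sketch. PrivateFieldPeek and the nested Counter class are hypothetical; Counter stands in for the stubborn object. On newer Java versions, setAccessible() can be refused for classes in other modules, so treat this as a last resort for your own code and ordinary libraries.

```java
import java.lang.reflect.Field;

// Hypothetical test helper: reads a private field via reflection.
final class PrivateFieldPeek {
    // Stand-in for a class that hides the state a test wants to check
    static final class Counter {
        private int hits = 0;

        void hit() {
            hits++;
        }
    }

    // Looks up the field on the object's own class (not on superclasses)
    static Object read(Object target, String fieldName)
            throws ReflectiveOperationException {
        Field field = target.getClass().getDeclaredField(fieldName);
        field.setAccessible(true); // lifts the "private" restriction for this Field object
        return field.get(target);
    }
}
```

In a test, PrivateFieldPeek.read(counter, "hits") returns the current value boxed as an Integer.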

If you're using C/C++, always invoke methods via a trampoline: Put a pointer somewhere which contains the function to call and always use that pointer instead of the real function. Use header files and macros so no human can tell the difference. In your tests, bend those pointers. The Amiga did it in 1985. #ifdef is your friend.

If you're using some other language, put the test code in comments and have a self-written preprocessor create two versions that you can compile and run.

If all else fails, switch to Python.

Tuesday, November 18, 2008

Testing the Impossible: JavaScript in a Web Page

How do you run JUnit tests on JavaScript in a web page? Impossible?

Here is what you need: First, get a copy of Rhino (at least 1.6R7). Then, save a copy of the JavaScript code at the bottom as "env.js". And here is the setup code for the JUnit test:

    Context cx;
    Global scope;

    public void setupContext () throws IllegalAccessException,
            InstantiationException, InvocationTargetException, IOException
    {
        cx = Context.enter();
        scope = new Global();
        scope.init (cx);

        addScript(cx, scope, new File ("html/env.js"));

        File f = new File ("html/demo.html");
        // Assigning window.location triggers the (asynchronous) page load
        cx.evaluateString(scope,
                "window.location = '"+f.toURL()+"';\n",
                "<"+getName ()+">", 1, null);
    }

    public void addScript (Context cx, Scriptable scope, File file) throws IOException
    {
        Reader in = new FileReader (file);
        try
        {
            cx.evaluateReader(scope, in, file.getAbsolutePath(), 1, null);
        }
        finally
        {
            in.close ();
        }
    }

This will load "demo.html" into the browser simulation. The problem here: The loading is asynchronous (just like in a real browser). Now what? We need synchronization:

import org.mozilla.javascript.ScriptableObject;

public class JSJSynchronize extends ScriptableObject
{
    public Object data;
    public Object lock = new Object ();
    // True while data has been set but not yet fetched. Without this flag,
    // a notify() that happens before the wait() would be lost and the
    // reader would block forever.
    private boolean ready;
    
    public JSJSynchronize()
    {
    }
    
    @Override
    public String getClassName ()
    {
        return "JSJSynchronize";
    }
    
    // Called from JavaScript when reading jsjSynchronize.data
    public Object jsGet_data()
    {
        return getData ();
    }

    // Called from JavaScript when assigning jsjSynchronize.data
    public void jsSet_data(Object data)
    {
        setData (data);
    }
    
    // Blocks until someone has called setData()
    public Object getData()
    {
        synchronized (lock)
        {
            try
            {
                while (!ready)
                    lock.wait ();
            }
            catch (InterruptedException e)
            {
                throw new RuntimeException ("Should not happen", e);
            }
            
            ready = false;
            return data;
        }
    }

    public void setData(Object data)
    {
        synchronized (lock)
        {
            this.data = data;
            ready = true;
            lock.notifyAll ();
        }
    }
    
}

With this code and "window.onload", we can wait for the html to load:

        JSJSynchronize jsjSynchronize;
        ScriptableObject.defineClass(scope, JSJSynchronize.class);
        
        jsjSynchronize = (JSJSynchronize)cx.newObject (scope, "JSJSynchronize");
        scope.put("jsjSynchronize", scope, jsjSynchronize);

        cx.evaluateString(scope, 
                "window.location = '"+f.toURL()+"';\n" +
      "window.onload = function(){\n" +
      "    print('Window loaded');\n" +
      "    jsjSynchronize.data = window;\n" +
      "};\n" +
      "", "<"+getName ()+">", 1, null);

        ScriptableObject window = (ScriptableObject)jsjSynchronize.getData();
        System.out.println ("window="+window);
        ScriptableObject document = (ScriptableObject)scope.get ("document", scope);
        System.out.println ("document="+document);
        System.out.println ("document.forms="+document.get ("forms", document));
        ScriptableObject navigator = (ScriptableObject)scope.get ("navigator", scope);
        System.out.println ("navigator="+navigator);
        System.out.println ("navigator.location="+navigator.get ("location", navigator));

        // I've been too lazy to parse the HTML for the scripts:
        addScript(cx, scope, new File ("src/main/webapp/script/prototype.js"));
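
The waiting done by JSJSynchronize can also be sketched with java.util.concurrent. This is an alternative illustration, not part of the setup above: OneShotHandoff is a hypothetical name, it is not wired into Rhino, and it hands over exactly one value. The timeout makes a test fail instead of hanging when the page never fires its load event.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical one-shot handoff: one thread sets a value, another waits for it.
final class OneShotHandoff<T> {
    private final CountDownLatch done = new CountDownLatch(1);
    private volatile T value;

    // Called by the producer, e.g. from the onload callback
    void set(T newValue) {
        value = newValue;
        done.countDown();
    }

    // Called by the test; works even if set() already happened
    T await(long timeout, TimeUnit unit) throws InterruptedException {
        if (!done.await(timeout, unit)) {
            throw new IllegalStateException("Timed out waiting for the value");
        }
        return value;
    }
}
```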

Slightly modified version of env.js, original by John Resig (original code):

/*
 * Simulated browser environment for Rhino
 *   By John Resig <http://ejohn.org/>
 * Copyright 2007 John Resig, under the MIT License
 * http://jqueryjs.googlecode.com/svn/trunk/jquery/build/runtest/
 * Revision 5251
 */

// The window Object
var window = this;

// generic enumeration
Function.prototype.forEach = function(object, block, context) {
 for (var key in object) {
  if (typeof this.prototype[key] == "undefined") {
   block.call(context, object[key], key, object);
  }
 }
};

// globally resolve forEach enumeration
var forEach = function(object, block, context) {
 if (object) {
  var resolve = Object; // default
  if (object instanceof Function) {
   // functions have a "length" property
   resolve = Function;
  } else if (object.forEach instanceof Function) {
   // the object implements a custom forEach method so use that
   object.forEach(block, context);
   return;
  } else if (typeof object.length == "number") {
   // the object is array-like
   resolve = Array;
  }
  resolve.forEach(object, block, context);
 }
};

function collectForms(document) {
 var result = document.body.getElementsByTagName('form');
 //print('collectForms');
 document.forms = result;
  
 for (var i=0; i<result.length; i++) {
     var f = result[i];
     f.name = f.attributes['name'];
     //print('Form '+f.name);
     document[f.name] = f;
     f.elements = f.getElementsByTagName('input');
     
     for(var j=0; j<f.elements.length; j++) {
         var e = f.elements[j];
         var attr = e.attributes;
         
         //forEach(attr, print);
         e.type = attr['type'];
         e.name = attr['name'];
         e.className = attr['class'];
         
         f[e.name] = e;
  //print('    Input '+e.name);
     }
 }
}

(function(){

 // Browser Navigator

 window.navigator = {
  get userAgent(){
   return "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.3) Gecko/20070309 Firefox/2.0.0.3";
  },
  get appVersion(){
   return "Mozilla/5.0";
  }
 };
 
 var curLocation = (new java.io.File("./")).toURL();
 
 window.__defineSetter__("location", function(url){
  var xhr = new XMLHttpRequest();
  xhr.open("GET", url);
  xhr.onreadystatechange = function(){
   curLocation = new java.net.URL( curLocation, url );
   window.document = xhr.responseXML;
   collectForms(window.document);

   var event = document.createEvent();
   event.initEvent("load");
   window.dispatchEvent( event );
  };
  xhr.send();
 });
 
 window.__defineGetter__("location", function(url){
  return {
   get protocol(){
    return curLocation.getProtocol() + ":";
   },
   get href(){
    return curLocation.toString();
   },
   toString: function(){
    return this.href;
   }
  };
 });
 
 // Timers

 var timers = [];
 
 window.setTimeout = function(fn, time){
  var num;
  return num = setInterval(function(){
   fn();
   clearInterval(num);
  }, time);
 };
 
 window.setInterval = function(fn, time){
  var num = timers.length;
  
  timers[num] = new java.lang.Thread(new java.lang.Runnable({
   run: function(){
    while (true){
     java.lang.Thread.currentThread().sleep(time);
     fn();
    }
   }
  }));
  
  timers[num].start();
 
  return num;
 };
 
 window.clearInterval = function(num){
  if ( timers[num] ) {
   timers[num].stop();
   delete timers[num];
  }
 };
 
 // Window Events
 
 var events = [{}];

 window.addEventListener = function(type, fn){
  if ( !this.uuid || this == window ) {
   this.uuid = events.length;
   events[this.uuid] = {};
  }
    
  if ( !events[this.uuid][type] )
   events[this.uuid][type] = [];
  
  if ( events[this.uuid][type].indexOf( fn ) < 0 )
   events[this.uuid][type].push( fn );
 };
 
 window.removeEventListener = function(type, fn){
  if ( !this.uuid || this == window ) {
   this.uuid = events.length;
   events[this.uuid] = {};
  }
  
  if ( !events[this.uuid][type] )
   events[this.uuid][type] = [];
   
  events[this.uuid][type] =
   events[this.uuid][type].filter(function(f){
    return f != fn;
   });
 };
 
 window.dispatchEvent = function(event){
  if ( event.type ) {
   // Defined up front so the "on<type>" handler below also gets the right "this"
   var self = this;
   if ( this.uuid && events[this.uuid][event.type] ) {
   
    events[this.uuid][event.type].forEach(function(fn){
     fn.call( self, event );
    });
   }
   
   if ( this["on" + event.type] )
    this["on" + event.type].call( self, event );
  }
 };
 
 // DOM Document
 
 window.DOMDocument = function(file){
  this._file = file;
  var factory = Packages.javax.xml.parsers.DocumentBuilderFactory.newInstance();
  factory.setValidating(false);
  this._dom = factory.newDocumentBuilder().parse(file);
  
  if ( !obj_nodes.containsKey( this._dom ) )
   obj_nodes.put( this._dom, this );
 };
 
 DOMDocument.prototype = {
  createTextNode: function(text){
   return makeNode( this._dom.createTextNode(
    text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;")) );
  },
  createElement: function(name){
   return makeNode( this._dom.createElement(name.toLowerCase()) );
  },
  getElementsByTagName: function(name){
   return new DOMNodeList( this._dom.getElementsByTagName(
    name.toLowerCase()) );
  },
  getElementById: function(id){
   var elems = this._dom.getElementsByTagName("*");
   
   for ( var i = 0; i < elems.length; i++ ) {
    var elem = elems.item(i);
    if ( elem.getAttribute("id") == id )
     return makeNode(elem);
   }
   
   return null;
  },
  get body(){
   return this.getElementsByTagName("body")[0];
  },
  get documentElement(){
   return makeNode( this._dom.getDocumentElement() );
  },
  get ownerDocument(){
   return null;
  },
  addEventListener: window.addEventListener,
  removeEventListener: window.removeEventListener,
  dispatchEvent: window.dispatchEvent,
  get nodeName() {
   return "#document";
  },
  importNode: function(node, deep){
   return makeNode( this._dom.importNode(node._dom, deep) );
  },
  toString: function(){
   return "Document" + (typeof this._file == "string" ?
    ": " + this._file : "");
  },
  get innerHTML(){
   return this.documentElement.outerHTML;
  },
  
  get defaultView(){
   return {
    getComputedStyle: function(elem){
     return {
      getPropertyValue: function(prop){
       prop = prop.replace(/\-(\w)/g,function(m,c){
        return c.toUpperCase();
       });
       var val = elem.style[prop];
       
       if ( prop == "opacity" && val == "" )
        val = "1";
        
       return val;
      }
     };
    }
   };
  },
  
  createEvent: function(){
   return {
    type: "",
    initEvent: function(type){
     this.type = type;
    }
   };
  }
 };
 
 function getDocument(node){
  return obj_nodes.get(node);
 }
 
 // DOM NodeList
 
 window.DOMNodeList = function(list){
  this._dom = list;
  this.length = list.getLength();
  
  for ( var i = 0; i < this.length; i++ ) {
   var node = list.item(i);
   this[i] = makeNode( node );
  }
 };
 
 DOMNodeList.prototype = {
  toString: function(){
   return "[ " +
    Array.prototype.join.call( this, ", " ) + " ]";
  },
  get outerHTML(){
   return Array.prototype.map.call(
    this, function(node){return node.outerHTML;}).join('');
  }
 };
 
 // DOM Node
 
 window.DOMNode = function(node){
  this._dom = node;
 };
 
 DOMNode.prototype = {
  get nodeType(){
   return this._dom.getNodeType();
  },
  get nodeValue(){
   return this._dom.getNodeValue();
  },
  get nodeName() {
   return this._dom.getNodeName();
  },
  cloneNode: function(deep){
   return makeNode( this._dom.cloneNode(deep) );
  },
  get ownerDocument(){
   return getDocument( this._dom.ownerDocument );
  },
  get documentElement(){
   return makeNode( this._dom.documentElement );
  },
  get parentNode() {
   return makeNode( this._dom.getParentNode() );
  },
  get nextSibling() {
   return makeNode( this._dom.getNextSibling() );
  },
  get previousSibling() {
   return makeNode( this._dom.getPreviousSibling() );
  },
  toString: function(){
   return '"' + this.nodeValue + '"';
  },
  get outerHTML(){
   return this.nodeValue;
  }
 };

 // DOM Element

 window.DOMElement = function(elem){
  this._dom = elem;
  this.style = {
   get opacity(){ return this._opacity; },
   set opacity(val){ this._opacity = val + ""; }
  };
  
  // Load CSS info
  var styles = (this.getAttribute("style") || "").split(/\s*;\s*/);
  
  for ( var i = 0; i < styles.length; i++ ) {
   var style = styles[i].split(/\s*:\s*/);
   if ( style.length == 2 )
    this.style[ style[0] ] = style[1];
  }
 };
 
 DOMElement.prototype = extend( new DOMNode(), {
  get nodeName(){
   return this.tagName.toUpperCase();
  },
  get tagName(){
   return this._dom.getTagName();
  },
  toString: function(){
   return "<" + this.tagName + (this.id ? "#" + this.id : "" ) + ">";
  },
  get outerHTML(){
   var ret = "<" + this.tagName, attr = this.attributes;
   
   for ( var i in attr )
    ret += " " + i + "='" + attr[i] + "'";
    
   if ( this.childNodes.length || this.nodeName == "SCRIPT" )
    ret += ">" + this.childNodes.outerHTML + 
     "</" + this.tagName + ">";
   else
    ret += "/>";
   
   return ret;
  },
  
  get attributes(){
   var attr = {}, attrs = this._dom.getAttributes();
   
   for ( var i = 0; i < attrs.getLength(); i++ )
    attr[ attrs.item(i).nodeName ] = attrs.item(i).nodeValue;
    
   return attr;
  },
  
  get innerHTML(){
   return this.childNodes.outerHTML; 
  },
  set innerHTML(html){
   html = html.replace(/<\/?([A-Z]+)/g, function(m){
    return m.toLowerCase();
   });
   
   var nodes = this.ownerDocument.importNode(
    new DOMDocument( new java.io.ByteArrayInputStream(
     (new java.lang.String("<wrap>" + html + "</wrap>"))
      .getBytes("UTF8"))).documentElement, true).childNodes;
    
   while (this.firstChild)
    this.removeChild( this.firstChild );
   
   for ( var i = 0; i < nodes.length; i++ )
    this.appendChild( nodes[i] );
  },
  
  get textContent(){
   return nav(this.childNodes);
   
   function nav(nodes){
    var str = "";
    for ( var i = 0; i < nodes.length; i++ )
     if ( nodes[i].nodeType == 3 )
      str += nodes[i].nodeValue;
     else if ( nodes[i].nodeType == 1 )
      str += nav(nodes[i].childNodes);
    return str;
   }
  },
  set textContent(text){
   while (this.firstChild)
    this.removeChild( this.firstChild );
   this.appendChild( this.ownerDocument.createTextNode(text));
  },
  
  style: {},
  clientHeight: 0,
  clientWidth: 0,
  offsetHeight: 0,
  offsetWidth: 0,
  
  get disabled() {
   var val = this.getAttribute("disabled");
   return val != "false" && !!val;
  },
  set disabled(val) { return this.setAttribute("disabled",val); },
  
  get checked() {
   var val = this.getAttribute("checked");
   return val != "false" && !!val;
  },
  set checked(val) { return this.setAttribute("checked",val); },
  
  get selected() {
   if ( !this._selectDone ) {
    this._selectDone = true;
    
    if ( this.nodeName == "OPTION" && !this.parentNode.getAttribute("multiple") ) {
     var opt = this.parentNode.getElementsByTagName("option");
     
     if ( this == opt[0] ) {
      var select = true;
      
      for ( var i = 1; i < opt.length; i++ )
       if ( opt[i].selected ) {
        select = false;
        break;
       }
       
      if ( select )
       this.selected = true;
     }
    }
   }
   
   var val = this.getAttribute("selected");
   return val != "false" && !!val;
  },
  set selected(val) { return this.setAttribute("selected",val); },

  get className() { return this.getAttribute("class") || ""; },
  set className(val) {
   if (typeof val != 'string') { val = "" + val; }
   return this.setAttribute("class",
    val.replace(/(^\s*|\s*$)/g,""));
  },
  
  get type() { return this.getAttribute("type") || ""; },
  set type(val) { return this.setAttribute("type",val); },
  
  get value() { return this.getAttribute("value") || ""; },
  set value(val) { return this.setAttribute("value",val); },
  
  get src() { return this.getAttribute("src") || ""; },
  set src(val) { return this.setAttribute("src",val); },
  
  get id() { return this.getAttribute("id") || ""; },
  set id(val) { return this.setAttribute("id",val); },
  
  getAttribute: function(name){
   return this._dom.hasAttribute(name) ?
    new String( this._dom.getAttribute(name) ) :
    null;
  },
  setAttribute: function(name,value){
   this._dom.setAttribute(name,value);
  },
  removeAttribute: function(name){
   this._dom.removeAttribute(name);
  },
  
  get childNodes(){
   return new DOMNodeList( this._dom.getChildNodes() );
  },
  get firstChild(){
   return makeNode( this._dom.getFirstChild() );
  },
  get lastChild(){
   return makeNode( this._dom.getLastChild() );
  },
  appendChild: function(node){
   this._dom.appendChild( node._dom );
  },
  insertBefore: function(node,before){
   this._dom.insertBefore( node._dom, before ? before._dom : before );
  },
  removeChild: function(node){
   this._dom.removeChild( node._dom );
  },

  getElementsByTagName: DOMDocument.prototype.getElementsByTagName,
  
  addEventListener: window.addEventListener,
  removeEventListener: window.removeEventListener,
  dispatchEvent: window.dispatchEvent,
  
  click: function(){
   var event = document.createEvent();
   event.initEvent("click");
   this.dispatchEvent(event);
  },
  submit: function(){
   var event = document.createEvent();
   event.initEvent("submit");
   this.dispatchEvent(event);
  },
  focus: function(){
   var event = document.createEvent();
   event.initEvent("focus");
   this.dispatchEvent(event);
  },
  blur: function(){
   var event = document.createEvent();
   event.initEvent("blur");
   this.dispatchEvent(event);
  },
  get elements(){
   return this.getElementsByTagName("*");
  },
  get contentWindow(){
   return this.nodeName == "IFRAME" ? {
    document: this.contentDocument
   } : null;
  },
  get contentDocument(){
   if ( this.nodeName == "IFRAME" ) {
    if ( !this._doc )
     this._doc = new DOMDocument(
      new java.io.ByteArrayInputStream((new java.lang.String(
      "<html><head><title></title></head><body></body></html>"))
      .getBytes("UTF8")));
    return this._doc;
   } else
    return null;
  }
 });
 
 // Helper method for extending one object with another
 
 function extend(a,b) {
  for ( var i in b ) {
   var g = b.__lookupGetter__(i), s = b.__lookupSetter__(i);
   
   if ( g || s ) {
    if ( g )
     a.__defineGetter__(i, g);
    if ( s )
     a.__defineSetter__(i, s);
   } else
    a[i] = b[i];
  }
  return a;
 }
 
 // Helper method for generating the right
 // DOM objects based upon the type
 
 var obj_nodes = new java.util.HashMap();
 
 function makeNode(node){
  if ( node ) {
   if ( !obj_nodes.containsKey( node ) )
    obj_nodes.put( node, node.getNodeType() == 
     Packages.org.w3c.dom.Node.ELEMENT_NODE ?
      new DOMElement( node ) : new DOMNode( node ) );
   
   return obj_nodes.get(node);
  } else
   return null;
 }
 
 // XMLHttpRequest
 // Originally implemented by Yehuda Katz

 window.XMLHttpRequest = function(){
  this.headers = {};
  this.responseHeaders = {};
 };
 
 XMLHttpRequest.prototype = {
  open: function(method, url, async, user, password){ 
   this.readyState = 1;
   // Honor an explicit false; anything else keeps the asynchronous default
   this.async = async !== false;
   this.method = method || "GET";
   this.url = url;
   this.onreadystatechange();
  },
  setRequestHeader: function(header, value){
   this.headers[header] = value;
  },
  send: function(data){
   var self = this;
   
   function makeRequest(){
    var url = new java.net.URL(curLocation, self.url);
    
    if ( url.getProtocol() == "file" ) {
     if ( self.method == "PUT" ) {
      var out = new java.io.FileWriter( 
        new java.io.File( new java.net.URI( url.toString() ) ) ),
       text = new java.lang.String( data || "" );
      
      out.write( text, 0, text.length() );
      out.flush();
      out.close();
     } else if ( self.method == "DELETE" ) {
      var file = new java.io.File( new java.net.URI( url.toString() ) );
      file["delete"]();
     } else {
      var connection = url.openConnection();
      connection.connect();
      handleResponse();
     }
    } else { 
     var connection = url.openConnection();
     
     connection.setRequestMethod( self.method );
     
     // Add headers to Java connection
     for (var header in self.headers)
      connection.addRequestProperty(header, self.headers[header]);
    
     connection.connect();
     
     // Stick the response headers into responseHeaders
     for (var i = 0; ; i++) { 
      var headerName = connection.getHeaderFieldKey(i); 
      var headerValue = connection.getHeaderField(i); 
      if (!headerName && !headerValue) break; 
      if (headerName)
       self.responseHeaders[headerName] = headerValue;
     }
     
     handleResponse();
    }
    
    function handleResponse(){
     self.readyState = 4;
     self.status = parseInt(connection.responseCode) || undefined;
     self.statusText = connection.responseMessage || "";
     
     var stream = new java.io.InputStreamReader(connection.getInputStream()),
      buffer = new java.io.BufferedReader(stream), line;
     
     while ((line = buffer.readLine()) != null)
      self.responseText += line;
      
     self.responseXML = null;
     
     if ( self.responseText.match(/^\s*</) ) {
      //try {
       self.responseXML = new DOMDocument(
        new java.io.ByteArrayInputStream(
         (new java.lang.String(
          self.responseText)).getBytes("UTF8")));
      //} catch(e) {
      //}
     }
    }
    
    self.onreadystatechange();
   }

   if (this.async)
    (new java.lang.Thread(new java.lang.Runnable({
     run: makeRequest
    }))).start();
   else
    makeRequest();
  },
  abort: function(){},
  onreadystatechange: function(){},
  getResponseHeader: function(header){
   if (this.readyState < 3)
    throw new Error("INVALID_STATE_ERR");
   else {
    var returnedHeaders = [];
    for (var rHeader in this.responseHeaders) {
     if (rHeader.match(new RegExp(header, "i")))
      returnedHeaders.push(this.responseHeaders[rHeader]);
    }
   
    if (returnedHeaders.length)
     return returnedHeaders.join(", ");
   }
   
   return null;
  },
  getAllResponseHeaders: function(header){
   if (this.readyState < 3)
    throw new Error("INVALID_STATE_ERR");
   else {
    var returnedHeaders = [];
    
    for (var header in this.responseHeaders)
     returnedHeaders.push( header + ": " + this.responseHeaders[header] );
    
    return returnedHeaders.join("\r\n");
   }
  },
  async: true,
  readyState: 0,
  responseText: "",
  status: 0
 };
})();

Sunday, November 16, 2008

UPCScan 0.7: Where is my stuff?

UPCScan 0.7 is released. New features:

UPCScan can now find music CDs
If UPCScan can't find something on Amazon, it will still create an entry which you can then edit to fill in the details.
Entries can be deleted.
I've added lending information so you can quickly figure out who your new "ex-friends" should be.
I'm working on a series/issue information system to make it simpler to complete your collection. With this version, you'll need to edit the database directly to add series/issue information but the user interface can already display this data.
I'm working on a feature to create an OpenOffice document with the locations. This would allow you to print it out and then scan the locations as you scan your collection to tell UPCScan under which location to file the items. If you can't wait, you can use the barcode.py script to generate PNG images with barcodes which you can import into OpenOffice to achieve the same effect.

Download: upcscan-0.7.tar.gz (26,921 Bytes, MD5)

Tuesday, November 11, 2008

Testing the Impossible: User Dialogs

How do you test a user dialog like "Do you really want to quit?"

This code usually looks like this:

    public void quit () {
        if (!MessageDialog.ask (getShell(),
            "Really quit?",
            "Do you really want to quit?"
        ))
            return;

        ... quit ...
    }

The solution is simple:

    public void quit () {
        if (!askToQuit ())
            return;

        ... quit ...
    }

    /** For tests */
    protected boolean askToQuit () {
        ... ask your question here ...
    }

In test cases, you can now extend the class, and override askToQuit:

    public boolean askToQuitWasCalled = false;
    public boolean askToQuitResult = true;

    protected boolean askToQuit () {
        askToQuitWasCalled = true;
        return askToQuitResult;
    }

Now, you can find out if the question would be asked and you can verify that the code behaves correctly depending on the answer. Tests that just want to quit won't need to do anything special to get the desired behavior.
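
A test built on this seam might look like the following sketch. It is written in Python 3 rather than Java because the pattern does not depend on the language. All names in it (Application, TestableApplication, ask_to_quit, has_quit) are made up for illustration and stand in for your own class and its askToQuit method.

```python
import unittest


class Application:
    """Stand-in for the class that owns quit(); all names are hypothetical."""

    def __init__(self):
        self.has_quit = False

    def quit(self):
        # Bail out unless the user confirms.
        if not self.ask_to_quit():
            return
        self.has_quit = True

    def ask_to_quit(self):
        # Production code would open the modal dialog here.
        raise NotImplementedError("would open a real dialog")


class TestableApplication(Application):
    """Test double: records the call and returns a canned answer."""

    def __init__(self, answer=True):
        super().__init__()
        self.ask_to_quit_was_called = False
        self.ask_to_quit_result = answer

    def ask_to_quit(self):
        self.ask_to_quit_was_called = True
        return self.ask_to_quit_result


class QuitTest(unittest.TestCase):
    def test_quits_when_user_confirms(self):
        app = TestableApplication(answer=True)
        app.quit()
        self.assertTrue(app.ask_to_quit_was_called)
        self.assertTrue(app.has_quit)

    def test_stays_open_when_user_declines(self):
        app = TestableApplication(answer=False)
        app.quit()
        self.assertTrue(app.ask_to_quit_was_called)
        self.assertFalse(app.has_quit)
```

Neither test ever opens a dialog, yet both branches of quit() are covered.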

The same applies to more complex dialogs: Refactor them to put their data into an intermediate structure which you can mock during the tests. That means copying the data if the dialog is a black box but that's a small price to pay for being able to test modal user dialogs.

Lesson: You don't want to test the dialog, you want to test whether it is opened at the right place, under the right circumstances and whether the result is processed correctly.

Monday, November 10, 2008

I Have Nothing to Hide ... I Think

So it has happened again. Someone put a nice web site online and when it came to picking between security and comfort, guess who won. Alas, those who do as you shouldn't still serve as a bad example. What has happened?

DHL, a German parcel delivery service, offers a web site where you can track where your brand new gadget is now so you can guess how long it will take until you rip the wrapping off it. That's good.

Not so good is that all customers of DHL get the same default password.

Bad is that DHL reuses the tracking numbers after roughly six months (depending on the number of parcels that go through the system; if there are fewer, you can look further into the past).

Really bad is that part of DHL's tracking number is fixed. It's based on the DHL customer number. That's not you; this "customer" is the guy or company you ordered from (DHL renders a service for them).

So this leaves us with a convenient way to check who else has ordered anything from that shop.

Now imagine you ordered something innocent ... oh, maybe porn or "adult toys" or something from company B which is the arch enemy of company A which incidentally pays your wage. All of a sudden, a couple of innocent bits of information have turned ugly.

Whenever you put something out to the world, step away for a few moments from your dreams of how much good someone could do with your service and think about how much bad someone could do with it. And if you can't think of anything, you should be very, very worried.

Thursday, October 30, 2008

Multi-line String Literals in Java

You want multi-line string literals in Java which work in Java 1.4 and up? Can't be done, you say?

Sven Efftinge found a way.

Monday, October 27, 2008

Compile VMware tools on openSUSE 10.3

I just tried to install the VMware tools on openSUSE 10.3 (Linux kernel 2.6.22.18) so the virtual machine would survive more than 10 days on an ESX server and failed. If you have the same problem, the solution is here.

Errors you'll see during the installation otherwise:

The directory of kernel headers (version @@VMWARE@@ UTS_RELEASE) does not match your running kernel (version 2.6.22.18-0.2-default). Even if the module were to compile successfully, it would not load into the running kernel.

Wednesday, October 22, 2008

Failure is not an Option

Everyone loves war stories. Here is one of mine. I need a special diet, especially bread. So one Friday evening, I was taking the train home after buying a couple of custom-made loaves of bread. In Dübendorf, I left the train and walked home.

About halfway home, I noticed that I had my head, my arms, my bag ... but not my bread! ARGH! Stupid, stupid, stupid! I knew I should have stuffed them in my rucksack but didn't because it was so full and ... yeah ... okay. My baker needs three days to make these breads so that meant about a week without any for me.

When I arrived home, an idea struck me and I fired up the VBZ online service to find out where the train was and when the driver would take the next break. A few moments later, the SBB Train Police got a really strange call from me: "I need my bread. It's in this train and can you please, please ask the driver to check if the white plastic bag with the bread is still there?"

The woman on the other end was surprised and promised to call me back.

Ten minutes can be sooo long.

From the timetable, I knew that my train would probably come through my town in about twenty minutes when I got the call. Yes, they found it and the driver would take the plastic bag into his cabin and she told me where to wait on the platform so he could hand it over.

The train was on time (as usual), the driver handed me my bag (and it really was mine and all the bread was still there) and I was really relieved. After thanking him, I went home to have my dinner. Thanks to the SBB train police, a train driver and an unknown person who put my bread in the overhead compartment when I left it behind, I didn't go hungry that weekend.

Lesson: If all seems lost, take a step back, do something else and you might have the idea which will save the day.

Train-related joke: The SBB (Schweizerische Bundesbahnen, the Swiss Federal Railways) and the German Bundesbahn (the counterpart of the SBB in Germany) wanted to save some money and decided to buy the same information system to announce arriving trains on the platform. After a lengthy evaluation, the plan was dropped. The SBB needed signs which said "Train is 1, 2, 3, 4, 5 minutes late" and the German Bundesbahn needed "Train is half an hour, 1 hour, 2 hours, 3 hours, 4 hours, Train Cancelled."

Tuesday, October 14, 2008

So... You want your code to be maintainable.

A great post if you're interested in TDD or testing in general: So... You want your code to be maintainable. by Uncle Bob. Thanks, Bob!

Saturday, October 11, 2008

Good Games for the PS3

I've recently upgraded to a PlayStation 3. I kept my old PS2, though, since the new PS3 can't emulate the PS2. I wonder why that is ... maybe it's because Sony is still selling so many PS2s? Ah, rumors :) Easy to create and hard to kill.

So what good games are there? Here is my list:

Burnout Paradise City
PixelJunk Eden
Ratchet & Clank - Tools of Destruction
Flow

Burnout Paradise City

Mindless street racing with a high adrenaline level. Ideal to waste a couple of minutes or an hour. Great graphics, no blood, no violence (it's more like bumper cars) and nice ideas like smashing ads or the super jumps. If you don't like some events (I haven't managed to win a single race yet; I excel at kicking other cars off the street), you can simply ignore them and still complete enough of the game to have fun.

PixelJunk Eden

A definite feel of Tarzan or Spider-Man when you want to relax a bit. Simple, fitting graphics, no violence, no aggression. All that, and at the price it's a steal.

Ratchet & Clank - Tools of Destruction

My favorite jump'n'run. Lots of insane weapons, Ratchet's ears look great on the PS3, the story has more depth than usual; not sure I like the depressing realization at the end, though. Judge for yourself.

Flow

Like Eden, it's a brand new kind of game. One of a kind. I play that when I want to come down from all the stress in my life. Go get it!

Bad games

I've got a couple of other games. First, we have the Orange Box with Half Life 2, two of the extra episodes, Portal and Team Fortress. I liked the puzzles in Portal. That game was much too short. I didn't like Half Life 2 much and I hate the episodes. The story was great, the levels gigantic and intelligent. You could almost always find a way around without getting killed. But the handling ... Freeman feels like a block of wood when you move him through the levels: You'll get stuck all the time at hand rails and stuff like that. Sometimes, he'll be able to jump on something, sometimes not. Sometimes, he'll stay on top of a barrel, sometimes not. This sucks. And then those stupid zombie levels. Yeah, I'm stuck in the elevator scene in the first episode. Got killed five times in the dark and now, the game goes where it belongs: The trash.

Resistance - Fall of Man. I like the other games by Insomniac and I like this one, too. It's just too violent for my taste. I like shooting pixels, pushing empty cars off the street or zooming down a highway at break-neck speed. I don't like shooting at people. I finished the game but it left me asking: Is that all? Running around, shooting people, blowing up stuff? Is that the result of many years of game evolution? Better graphics?

Uncharted: Drake's Fortune. Oh well. Okay, the levels look great. When you scale the wall of the castle, there is a sense of vertigo. It's breathtaking. The jeep escape is a lot of fun. Smart story (mostly). The game character moves smartly. You press a button and he takes cover. He's smart, not a dead puppet like Freeman. He moves as if he were real. Again, the violence cooled me off quickly. Too much killing, not enough puzzles.

Thursday, October 09, 2008

Enthought Traits

I'm always looking for simpler ways to build applications. Let's face it, it's 2008 and after roughly 50 years, writing something that collects a few bits of data and presents them in a nice way is still several days of work. And that's without Undo/Redo, a way to persist the data, a way to evolve the storage format, etc.

Python was always promising and with the tkinter module, they set a rather high watermark on how easily you could build UIs ... alas, Tk is not the most powerful UI framework out there and ... well ... let's just leave it at that.

With Traits, we have a new contender and I have to admit that I like it ... a lot. The traits framework solves a lot of the standard issues out of the box while leaving all the hooks and bolts available beneath a very thin polish so you can still get at them when you have to.

For example, you have a list of persons and you want to assign each person a gender. Here is the model:

class Gender(HasTraits):
    name = Str
    
    def __repr__(self):
        return 'Gender %s' % self.name

class Person(HasTraits):
    name = Str
    gender = Instance(Gender)
    
    def __repr__(self):
        return 'Person %s' % self.name

class Model(HasTraits):
    genderList = List(Gender)
    persons = List(Person)

Here is how you use this model:

female = Gender(name='Female')
male = Gender(name='Male')
undefined = Gender(name='Undefined')

aMale = Person(name='a male', gender=male)
aFemale = Person(name='a female', gender=female)

model = Model()
model.genderList.append(female)
model.genderList.append(male)
model.genderList.append(undefined)
model.persons.append(aFemale)
model.persons.append(aMale)

Nothing fancy so far. Unlike the rest of Python, with Traits, you can make sure that an attribute of an instance has the correct type. For example, "aMale.gender = aFemale" would throw an exception in the assignment.
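
To get a feel for what Traits takes off your hands here, this is a rough plain-Python 3 sketch of a type-checked attribute built with a descriptor. It is not how Traits implements Instance. The Typed class is a hypothetical helper, and it raises TypeError where Traits would raise its own TraitError.

```python
class Typed:
    """Descriptor that only accepts instances of one class (or None)."""

    def __init__(self, klass):
        self.klass = klass

    def __set_name__(self, owner, name):
        # Store the value under a private name on the instance.
        self.private_name = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.private_name, None)

    def __set__(self, obj, value):
        if value is not None and not isinstance(value, self.klass):
            raise TypeError(
                "expected %s, got %r" % (self.klass.__name__, value))
        setattr(obj, self.private_name, value)


class Gender:
    def __init__(self, name):
        self.name = name


class Person:
    gender = Typed(Gender)

    def __init__(self, name, gender=None):
        self.name = name
        self.gender = gender


male = Gender("Male")
a_male = Person("a male", gender=male)
a_female = Person("a female", gender=Gender("Female"))

try:
    a_male.gender = a_female  # a Person, not a Gender
except TypeError as error:
    print("rejected:", error)
```

The rejected assignment leaves the old value in place, so a_male.gender is still male afterwards. With Traits, the one line "gender = Instance(Gender)" replaces the whole descriptor.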

The nice stuff is that the UI components honor the information you use to build your model. So if you want to show a tree with all persons and genders, you use code like this:

class Model(HasTraits):
    genderList = List(Gender)
    persons = List(Person)
    tree = Property
    
    def _get_tree(self):
        return self

class ModelView(View):
    def __init__(self):
        super(ModelView, self).__init__(
            Item('tree',
                editor=TreeEditor(
                    nodes = [
                       TreeNode(node_for = [ Model ],
                           children = 'persons',
                           label = '=Persons',
                           view = View(),
                       ),
                       TreeNode(node_for = [ Person ],
                           children = '',
                           label = 'name',
                           view = View(
                               Item('name'),
                               Item('gender',
                                  editor=EnumEditor(values=model.genderList,)
                               ),
                           ),
                       ),
                       TreeNode(node_for = [ Model ],
                           children = 'genderList',
                           label = '=Persons by Gender',
                           view = View(),
                       ),
                       TreeNode(node_for = [ Gender ],
                           children = '',
                           label = 'name',
                           view = View(),
                       ),
                    ],
                ),
            ),
            Item('genderList', style='custom'),
            title = 'Tree Test',
            resizable = True,
            width = .5,
            height = .5,
        )

model.configure_traits(view=ModelView())

First of all, I needed to add a property "tree" to my "Model" class. This is a calculated field which just returns "self" and I need this to be able to reference it in my tree editor. The tree editor defines nodes by defining their properties. So a "Model" node has "persons" and "genderList" as children. The tree view is smart enough to figure out that these are in fact lists of elements and it will try to turn each element into a node if it can find a definition for it.

That's it. Everything else has already been defined in your model and what would be the point in doing that again?

But there is more. With just a few more lines of code, we can get a list of all persons from a Gender instance and with just a single change in the tree view, we can see them in the view. If you select a person and change its name, all nodes in the tree will update. Without any additional wiring. Sounds too good to be true?

First, we must be able to find all persons with a certain sex in Gender. To do that, we add a property which gives us access to the model and then query the model for all persons, filter this list by gender and that's it. Sounds complex? Have a look:

class Gender(HasTraits):
    name = Str
    persons = Property
    
    def _get_persons(self):
        return [p for p in self.model.persons
                if p.gender == self]

But how do I define the attribute "model" in Gender? This is a chicken-and-egg problem. Gender references Model and vice versa. Python to the rescue. Add this line after the definition of Model:

Gender.model = Instance(Model)

That's it. Now we need to assign this new field in Gender. We could do this manually but Traits offers a much better way: You can listen for changes on genderList!

    def _genderList_items_changed(self, new):
        for child in new.added:
            child.model = self

This code will be executed for every change to the list. I walk over the list of new children and assign "model".
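
If you want to see the idea without the framework, here is a minimal plain-Python 3 sketch of the same wiring. ObservedList is a hypothetical helper that only reports append() calls. Traits also covers inserts, removals and slice assignments, which this sketch ignores.

```python
class ObservedList(list):
    """List that reports appended items to a callback (append only)."""

    def __init__(self, on_added):
        super().__init__()
        self._on_added = on_added

    def append(self, item):
        super().append(item)
        self._on_added([item])


class Gender:
    def __init__(self, name):
        self.name = name
        self.model = None  # filled in when the gender joins a model

    @property
    def persons(self):
        return [p for p in self.model.persons if p.gender is self]


class Person:
    def __init__(self, name, gender):
        self.name = name
        self.gender = gender


class Model:
    def __init__(self):
        self.persons = []
        self.gender_list = ObservedList(self._gender_list_items_added)

    def _gender_list_items_added(self, added):
        # Same job as _genderList_items_changed: hand each new child its model.
        for child in added:
            child.model = self


model = Model()
male = Gender("Male")
female = Gender("Female")
model.gender_list.append(male)
model.gender_list.append(female)

a_male = Person("a male", male)
model.persons.append(a_male)
model.persons.append(Person("a female", female))

assert male.model is model
assert male.persons == [a_male]
```

The back-reference is set as a side effect of adding the gender to the list, so nobody has to remember to assign it by hand.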

Does that work? Let's check: Append this line at the end of the file:

assert male.persons == [aMale], male.persons

And the icing on the cake: The tree. Just change the argument "children=''" to "children = 'persons'" in the TreeNode for Gender. Run and enjoy!

One last polish: The editor for genders looks a bit ugly. To suppress the persons list, add this to the Gender class:

    traits_view = View(
        Item('name')
    )

There is one minor issue: You can't assign a type to the property "persons" in Gender. If you do, you'll get strange exceptions and bugs. Other than that, this is probably the simplest way to build a tree of objects in your model that I've seen so far.

To make things easier for you to try, here is the complete source again in one big block. On the Enthought website, you can download the Enthought Python Distribution, which contains everything you need.

from enthought.traits.api import \
        HasTraits, Str, Instance, List, Property, This

from enthought.traits.ui.api import \
        TreeEditor, TreeNode, View, Item, EnumEditor

class Gender(HasTraits):
    name = Str
    # Bug1: This works
    persons = Property
    # This corrupts the UI:
    # wx._core.PyDeadObjectError: The C++ part of the ScrolledPanel object has been 
    # deleted, attribute access no longer allowed.
    #persons = Property(List)
    
    traits_view = View(
        Item('name')
    )
    
    def _get_persons(self):
        return [p for p in self.model.persons if p.gender == self]
    
    def __repr__(self):
        return 'Gender %s' % self.name

class Person(HasTraits):
    name = Str
    gender = Instance(Gender)
    
    def __repr__(self):
        return 'Person %s' % self.name

# Bug1: This doesn't work; you'll get ForwardProperty instead of a list when
# you access the property "persons"!
#Gender.persons = Property(fget=Gender._get_persons, trait=List(Person),)
# Same
#Gender.persons = Property(trait=List(Person),)
# Same
#Gender.persons = Property()
# Same, except it's now a TraitFactory
#Gender.persons = Property

class Model(HasTraits):
    genderList = List(Gender)
    persons = List(Person)
    tree = Property
    
    def _get_tree(self):
        return self
    
    def _genderList_items_changed(self, new):
        for child in new.added:
            child.model = self

Person.model = Instance(Model)
Gender.model = Instance(Model)

female = Gender(name='Female')
male = Gender(name='Male')
undefined = Gender(name='Undefined')

aMale = Person(name='a male', gender=male)
aFemale = Person(name='a female', gender=female)

model = Model()
model.genderList.append(female)
model.genderList.append(male)
model.genderList.append(undefined)
model.persons.append(aFemale)
model.persons.append(aMale)

assert male.persons == [aMale], male.persons

# This must be external because it references "Model"
# Usually, you would define this in the class to edit
# as a class field called "traits_view".
class ModelView(View):
    def __init__(self):
        super(ModelView, self).__init__(
            Item('tree',
                editor=TreeEditor(
                    nodes = [
                       TreeNode(node_for = [ Model ],
                           children = 'persons',
                           label = '=Persons',
                           view = View(),
                       ),
                       TreeNode(node_for = [ Person ],
                           children = '',
                           label = 'name',
                           view = View(
                               Item('name'),
                               Item('gender',
                                  editor=EnumEditor(
                                      values=model.genderList,
                                  )
                               ),
                           ),
                       ),
                       TreeNode(node_for = [ Model ],
                           children = 'genderList',
                           label = '=Persons by Gender',
                           view = View(),
                       ),
                       TreeNode(node_for = [ Gender ],
                           children = 'persons',
                           label = 'name',
                           view = View(),
                       ),
                    ],
                ),
            ),
            Item('genderList', style='custom'),
            title = 'Tree Test',
            resizable = True,
            width = .5,
            height = .5,
        )

model.configure_traits(view=ModelView())

Wednesday, October 08, 2008

UPCScan 0.6: It's Qt, Man!

Update: Version 0.7 released.

Getting drowned in your ever growing CD, DVD, book or comic collection? Then UPCScan might be for you.

UPCScan 0.6 is ready for download. There are many fixes and improvements. The biggest one is probably the live PyQt4 user interface (live means that the UI saves all your changes instantly, so no data loss if your computer crashes because of some other program ;-)).

The search field accepts barcodes (from a barcode laser scanner) and ISBN numbers. There is a nice cover image dialog where you can download and assign images if Amazon doesn't have one. Note: Amazon sometimes has an image but it's marked as "customer image". Use the "Visit" button on the UI to check if an image is missing and click on the "No Cover" button to open the "Cover Image" dialog where you can download and assign images. I haven't checked if the result of the search query contains anything useful in this case.

UPCScan 0.6 - 24,055 bytes, MD5 Checksum. Needs Python 2.5. PyQt4 4.4.3 is optional.

Security notice: You need an Amazon Web Service Account (get one here). When you run the program for the first time, it will tell you what to do. This means two things:

Your queries will be logged. So if you don't want Amazon to know what you own, this program is not very useful for you.
Your account ID will be stored in the article database at various places. I'm working on an export function which filters all private data out. Until then, don't give this file to your friends unless you know what that means (and frankly, I don't). You have been warned.

Tuesday, October 07, 2008

Name of the Longest Distance Between Two Points

Q: What's the name of the shortest distance between two points?

A: The straight line.

Q: What's the name of the longest distance between two points?

A: The shortcut.

Monday, October 06, 2008

You Can't Stop OOXML!? Watch Me :-)

Microsoft is the de facto standard on the desktop. Despite all the efforts to break the monopoly, the average user still doesn't want to switch. Alas, the average user is not an expert and when Microsoft tries to have its way with the experts, that usually backfires. Unlike the Average Joe, geeks and nerds are no cattle and they find creative ways to get even when they are served what Microsoft dishes out.

So in Norway, 21 of 23 experts voted against OOXML as a new ISO standard. That didn't stop the ca...administration of Standard Norge to embrace this great work from Seattle (which has eaten thousands, maybe millions, of dissertations over the years) and so they announced that Norway votes "Yes". Every geek out there knows what it means when your management has stopped listening to you: Get a new job. And they did.

Right on, commander!

Somebody is Playing Pong With Two Elevators and the Floorlights...

Once, it was just a joke in the UserFriendly comic strip. Now, it's real: Welcome to Project Blinkenlights

Saturday, October 04, 2008

Scanning Your DVD, Book, Comic, ... Collection

Update: Version 0.6 released.

If you're like me, you have a lot of DVDs, books, comics, whatever ... and a few years ago, you kind of lost your grip on your collection. Whenever there is a DVD sale, you invariably come home with a movie you already have.

After the German Linux Magazin published an article on how to set up a laser scanner with Amazon, I decided to get one and give it a try. Unfortunately, the Perl script has a few problems:

It's written in Perl.
It's written in Perl.
It's written in Perl.
There is no download link for the script without line numbers.
The DB setup script is missing.
The script uses POE.
It's hard to add new services.
Did I mention that it's written in Perl? Right.

So I wrote a new version in Python. You can find the docs on how to use it in the header of each file. Additionally, I've included a file "Location codes.odt". You can edit it with OpenOffice and put the names of the places where you store your stuff in there. Before you start to scan in the EAN/UPC codes of the stuff in a new place, scan the location code and upcscan.py will make the link for you. It will also ask you for a nice name of the location when you scan a location code for the first time.

If you need more location codes, you can generate them yourself. The codes starting with "200" are for private use, so there is no risk of a collision. I'm using this Python script to generate the GIF images. Just put this at the end of the script:

if __name__=='__main__':
    import sys
    s = checksum(sys.argv[1])
    img = genbarcode(s, 1)
    img.save('EAN13-%s.gif' % s, 'GIF')
    print error
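The `checksum` helper above comes from the linked script, which isn't reproduced here. To show what such a helper has to compute, here is a minimal sketch of the standard EAN-13 check digit, applied to private-use codes starting with "200". The names `ean13_check_digit` and `location_code` are made up for this illustration and are not part of the linked script or of upcscan.py.

```python
def ean13_check_digit(first12):
    """Return the EAN-13 check digit for a string of exactly 12 digits."""
    if len(first12) != 12 or not first12.isdigit():
        raise ValueError('expected exactly 12 digits')
    total = 0
    for index, char in enumerate(first12):
        # Counting from the left, digits 1, 3, 5, ... weigh 1 and
        # digits 2, 4, 6, ... weigh 3.
        weight = 3 if index % 2 else 1
        total += int(char) * weight
    return (10 - total % 10) % 10


def location_code(number):
    """Build a 13-digit private-use code: '200' + 9-digit number + check digit.

    Hypothetical helper; the numbering scheme is an assumption.
    """
    payload = '200%09d' % number
    return payload + str(ean13_check_digit(payload))


if __name__ == '__main__':
    for n in range(1, 4):
        print(location_code(n))
```

The resulting 13-digit strings are the kind of value you would hand to a barcode image generator.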

There is a primitive tool to generate a HTML page from your goods and a small tool to push your own cover images into the database if Amazon doesn't provide one.

Note: You'll need an AWS account for the script to work. The script will tell you where to get your account ID and where you need to put the ID when you start it for the first time.

Download upscan-0.1.tar.gz (54KB, MD5 Checksum)

Monday, September 29, 2008

Firefox 3.0.1 and GMail: Gray Background???

Am I the only one who gets a gray background on many websites after the update to Firefox 3.0.1?

Update: Apparently so. If not, browse to about:config and search for "browser.display.background_color". In my version of FF, that was set to #C0C0C0 for some reason (when it should have been #FFFFFF).

Installing Eclipse 3.4.1 Despite p2

If you're, like me, one of the unlucky ones that aren't on p2's friends list (translation: Eclipse p2 provisioning causes you an endless stream of pain and suffering), then you can't install the 3.4.1 patches because p2 won't let you.

There are several ways to deal with this. One of them is to delete your workspace's .metadata, your Eclipse install and start from scratch, installing all plug-ins again, etc., always hoping that p2 doesn't mess up until you've installed everything.

The other way is a workaround. It needs a bit of disk space and discipline. Do this:

Leave your original Eclipse install alone. Specifically, never ever use the menu "Software updates..." again! Never. I mean it. Disable the entry if you can.
Install Eclipse again in a new place. This must be a standard install (not a shared one!!!)
Do not start this install! Specifically, do not attempt to add all your update sites to this base template! Just unpack it and rename the "eclipse" directory to "eclipse-template".
Copy "eclipse-template" to "eclipse-install".
Start eclipse-install. If you worry that you might accidentally start the template once in a while, rename "eclipse.exe" to "eclipse.exe Is this install".
Download the 3.4.1 updates.
Exit eclipse-install.
Use your favorite file copy tool to copy all new files and directories in eclipse-install\plugins and eclipse-install\features to your working copy of eclipse.
Start your working copy.
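If you don't trust your favorite file copy tool with the "only new files" part, the copy step can be scripted. This is a rough sketch, assuming that every plug-in and feature has a unique, versioned name at the top level of "plugins" and "features", so comparing top-level names is enough. The function name and the directory arguments are placeholders of mine.

```python
import os
import shutil


def copy_new_entries(install_dir, working_dir, subdirs=('plugins', 'features')):
    """Copy top-level entries that exist in install_dir but not in working_dir.

    Existing entries in the working copy are never overwritten.
    Returns the relative paths that were copied.
    """
    copied = []
    for subdir in subdirs:
        source = os.path.join(install_dir, subdir)
        target = os.path.join(working_dir, subdir)
        os.makedirs(target, exist_ok=True)
        for name in sorted(os.listdir(source)):
            src = os.path.join(source, name)
            dst = os.path.join(target, name)
            if os.path.exists(dst):
                continue  # already in the working copy; leave it alone
            if os.path.isdir(src):
                shutil.copytree(src, dst)
            else:
                shutil.copy2(src, dst)
            copied.append(os.path.join(subdir, name))
    return copied
```

Call it as `copy_new_entries('eclipse-install', 'eclipse')` (paths are examples) after you have exited eclipse-install, and check the returned list before starting your working copy.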

Installing and updating plug-ins works in a similar way:

Delete eclipse-install and recreate it from eclipse-template.
Start eclipse-install.
Open the install software dialog. Add the update site. You may be tempted to add the update sites to eclipse-template. Don't do this! As soon as p2 can see more than one update site, it will eventually mess up in the dependency calculation.
Install the plug-in.
Create a directory for your new plug-in in the directory "dropins" of your working copy of Eclipse.
Copy the new files and directories from eclipse-install\features and eclipse-install\plugins to the new directory below "dropins" in your working copy of Eclipse.
If you need to install more than one plug-in, start with step 1. After you have installed anything in eclipse-install, the Eclipse instance is tainted and shouldn't be used again.

That's all folks. At least until the p2 guys fix the many bugs in their code. Which will probably happen in the Eclipse release in 2010.

That's not because I believe that the p2 guys are stupid or lazy but because this kind of product just takes three years to mature and they started in 2007, so the first working version can be expected in 2010.

Thursday, September 18, 2008

Stack Overflow Launches

Stack Overflow is a Q&A site for programmers. If you're looking for an answer to a question or if you know a lot and can't really satisfy your need to help in your current position, have a go at it.

Monday, September 08, 2008

Amiga Forever 2008

I've had the set for a while now (thanks Michele!) and after reading the announcement elsewhere, I'd like to remind all you Amiga fans out there that the 2008 release of Amiga Forever is ready. Did you know that Andy Warhol (yeah, that Andy Warhol) gave a demo of the system when it was launched in 1985? If not, grab the premium edition and watch him do his thing with Marilyn Monroe (or rather a picture of her).

Friday, September 05, 2008

How To Launch Software

If you're still wondering if "big bang" software releases are a good thing, read this.

Sunday, August 31, 2008

The Space Between Two Characters

If you're claustrophobic, you're afraid of confined spaces. If you're a software developer, you can be afraid of non-existing space.

When it comes to editing text, we usually don't think about the space between the characters. There simply isn't any. When you write a text editor, things start to look different. Suddenly, you have a caret or cursor which goes between the characters and that space between two characters can suddenly become uncomfortably tight.

Fire up your favorite text editor, Word, Writer, whatever. It has to support character formatting, though. Now enter this:

Hello, world!

If in doubt, the bold text ends before the comma and the italic part starts with the "w" and ends with "!" including both.

Now move to the "e" and type "x". What do you get?

Hxello, world!

Piece of cake.

Now move to the "H" and type "x". What do you get? Is the new x bold or not? Do you get "xHxello" with the leading x in bold or in plain text? How about typing "x" after the "o" of our abused "Hello"? Is that new x bold or not? If it is, what is the simplest way to make it non-bold? Do you have to delete the comma, do you have to go through menus or toolbars or is there a simple, consistent way to add a character inside and outside of a formatted range of text?

Let's go one step further. Add a character after the "!". Is it italic? If not, you're lucky. If it is ... what's the simplest way to get rid of the italic? If you press Return now, will the italic leak to the next line? If not, how can you make it leak? If that italic is the last thing in your text, can you add non-italic text beyond without fumbling with the formatting options?

There is no space between two characters and when you write a text editor, that non-existing space is biting you. Which is actually the problem: There is no consistent way to move in and out of a formatted range of characters.

The naive attempt would be to say "depending on the side you came from, you're inside or outside." So, if we have this (| is the cursor or caret): "Hello |world" and you type something, the question is: How did the caret end up there? Did it come from "w|o" and move one position to the left? Or from "o| " and move one position to the right?

That works somewhat but it fails at the beginning and the end of the text, plus you're in trouble when deleting text. What should happen after the last character of "Hello" has been deleted? Should that also delete the character range or should there be an empty, invisible bold range left and when you type something now, it should appear again? If you keep the empty invisible range, when do you drop it? Do you keep it as long as the user stays "in" it? Or until the document is saved? Loaded again from disk?

It's a mess and there is a reason why neither Word nor OpenOffice get it right: You can't. There is information in the head of the user (what she wants) but no way for her to tell the computer. Duh.

That is, unless you start to give the user a visual cue about what is going on. The problem we have is that there is no simple, obvious way for the user to say "I want ..." because there is no space on the screen reserved for this. We barely manage to squeeze a caret between the characters. There is just not enough room.

Well, there could be. A simple solution might be to add a little hint to the cursor to show which way it is leaning right now. Right. How about "A|B"? Here, you have three options. Add bold, italic and normal.

In HTML, this is simple. I'm editing this text in Firefox using the standard text area. What looks fancy to you looks like this to me: "<b>A</b>|<i>B</i>"

And this is the solution: I need to add a visual cue for the start and end of the format ranges. Maybe a simple U-shape which underlines the text for which the character format applies. Or an image (> and < in this example): ">A<|>B<". And suddenly, it's completely obvious on which side of the range start and end you are and what you want. You can delete the text in the range without losing it or you can delete both and you can move in and out of the range at will.
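To convince myself that explicit boundaries really remove the ambiguity, here is a toy model. It is only a sketch: the token list, the `OPEN`/`CLOSE` markers and all function names are invented for this example and have nothing to do with the final editor. The text is a list of characters and boundary markers, and the caret is an index into that list, so it can sit before or after a boundary.

```python
OPEN, CLOSE = 'open', 'close'


def active_formats(tokens, caret):
    """Return the formats that apply to a character typed at the caret."""
    active = []
    for token in tokens[:caret]:
        if isinstance(token, tuple):      # a boundary marker, not a character
            kind, name = token
            if kind == OPEN:
                active.append(name)
            else:
                active.remove(name)
    return active


def type_char(tokens, caret, char):
    """Insert a character at the caret; return the new tokens and caret."""
    return tokens[:caret] + [char] + tokens[caret:], caret + 1


def render(tokens):
    """Render the token list as HTML-like text."""
    parts = []
    for token in tokens:
        if isinstance(token, tuple):
            kind, name = token
            parts.append('<%s>' % name if kind == OPEN else '</%s>' % name)
        else:
            parts.append(token)
    return ''.join(parts)


# "<b>A</b><i>B</i>" as tokens
tokens = [(OPEN, 'b'), 'A', (CLOSE, 'b'), (OPEN, 'i'), 'B', (CLOSE, 'i')]

if __name__ == '__main__':
    for caret in (2, 3, 4):   # still in bold, between the ranges, already in italic
        new_tokens, _ = type_char(tokens, caret, 'x')
        print(caret, active_formats(tokens, caret), render(new_tokens))
```

The three caret positions between "A" and "B" are exactly the three options (bold, normal, italic). The price is that the caret now has more stops than there are characters.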

The drawback is that you need to keep that information somewhere. It adds a pretty huge cost to the limits of a format range. I'll have to try and see how much that is and if I can get away with less by cleverly using the information I already have.

Also, it clearly violates WYSIWYG. On the other hand, we get WYSIWYW which is probably better for the user.

DecentXML 1.2

DecentXML 1.2, my own XML 1.1-compliant parser, is now available.

Wednesday, August 27, 2008

Text Editor Component and JADS

While working on DecentXML (1.2 due this weekend), I've had those other two things that were bugging me. One is that there is no high-quality, open-source framework with algorithms and data structures. I'm not talking about java.util.Collections, I'm talking about red-black trees, interval trees, gap buffers, things like that. Powerful data structures you need to build complex software.

Welcome the "Java Algorithm and Data Structure" project - jads. I haven't opened a project page on SourceForge or Google Code yet, but I'll probably do that this weekend.

Based on that, I'm working on a versatile text editor component for Java software. The final editor will work with user interfaces implemented in Swing, SWT and Qt. It's an extensible framework where you can easily replace parts with your own code to get the special features you need. I currently have a demo running which can display text, which allows scrolling and where you can do some basic editing. Nothing fancy but it's coming along nicely.

If you want to hear more about these projects, post a comment or drop me a mail.

Tuesday, August 19, 2008

Death Star in EVE Online

Apparently, a group of 4000 players of EVE Online have built a kind of a "Death Star" (a "titan ship" in the language of the game) to rule the game galaxy. Assembly took 8 months in total secrecy and the result was destroyed completely within 3 months.

Another Lesson on Performance

Just another story you can tell someone who fears that "XYZ might be too slow":

I'm toying with the idea to write a new text editor. I mean, I've written my own OS, my own XML parser and I once maintained XDME, an editor written originally by Matthew Dillon. XDME has a couple of bugs and major design flaws that I always wanted to fix but never really got to it. Anyway.

There are various data structures which are suitable for a text editor and some of those depend on copying data around (see gap buffers). The question is: How effective is that? The first instinct of a developer is to avoid copying large amounts of data and to optimize the data structure instead.
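For readers who haven't met one: a gap buffer keeps the text in one array with a hole (the gap) at the editing position. Typing fills the gap; moving the caret means copying characters from one side of the gap to the other. The following is a minimal sketch in Python, not code from any editor I've written; the class and method names are my own choice.

```python
class GapBuffer:
    """Text stored in one list with a movable gap at the edit position."""

    def __init__(self, text='', gap_size=16):
        self._buf = list(text) + [None] * gap_size
        self._gap_start = len(text)       # first free cell
        self._gap_end = len(self._buf)    # first used cell after the gap

    def __len__(self):
        return len(self._buf) - (self._gap_end - self._gap_start)

    def _move_gap(self, pos):
        """Move the gap to pos; this is where the copying happens."""
        if not 0 <= pos <= len(self):
            raise IndexError(pos)
        while self._gap_start > pos:      # shift characters to the right
            self._gap_start -= 1
            self._gap_end -= 1
            self._buf[self._gap_end] = self._buf[self._gap_start]
        while self._gap_start < pos:      # shift characters to the left
            self._buf[self._gap_start] = self._buf[self._gap_end]
            self._gap_start += 1
            self._gap_end += 1

    def _grow(self, extra):
        """Replace the current gap with a gap of `extra` free cells."""
        tail = self._buf[self._gap_end:]
        self._buf = self._buf[:self._gap_start] + [None] * extra + tail
        self._gap_end = self._gap_start + extra

    def insert(self, pos, text):
        self._move_gap(pos)
        if len(text) > self._gap_end - self._gap_start:
            self._grow(max(len(text), 16))
        for char in text:
            self._buf[self._gap_start] = char
            self._gap_start += 1

    def delete(self, pos, count):
        """Delete up to `count` characters starting at pos."""
        self._move_gap(pos)
        self._gap_end = min(self._gap_end + count, len(self._buf))

    def text(self):
        return ''.join(self._buf[:self._gap_start] + self._buf[self._gap_end:])
```

In the worst case (caret jumps from one end of the file to the other), `_move_gap` copies the whole text once. That copy is the cost in question.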

After years of training, I've yet to overcome this instinct and start to measure:

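    public class CopyBenchmark
    {
        public static void main (String[] args)
        {
            long start = System.currentTimeMillis ();
            
            int N = 10000;
            for (int i=0; i<N; i++)
            {
                // 1M ints = 4MB; the allocation is part of the measured time
                int[] buffer = new int[1024*1024];
                // Shift the whole buffer by one element
                System.arraycopy (buffer, 0, buffer, 1,
                    buffer.length-1);
            }
            
            long duration = System.currentTimeMillis () - start;
            // Total time in milliseconds
            System.out.println (duration);
            // Milliseconds per iteration (integer division)
            System.out.println (duration / N);
        }
    }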

On my computer at work (which is pretty fast but not cutting edge), this prints "135223" and "13". That's thirteen milliseconds to allocate and copy 4MB of RAM. Okay. It's obviously not worth spending a second thinking about the cost of moving data around in a big block of bytes.

Lesson: If you're talking about performance and you didn't measure, you have no idea what you're talking about.

Still not convinced? Read this.

Monday, August 04, 2008

Quantity Always Trumps Quality

While I wouldn't completely subscribe to that without a grain of salt, the story is nice.

Four harmful Java idioms, and how NOT to fix them

In his article "Four harmful Java idioms, and how to fix them", John O'Hanley writes about how to make Java more maintainable. He picks four common patterns and gives tips on how to fix them ... or not. Follow me.

Names

The first idea is to prefix names with a letter giving a hint what they mean: Is "start" a method? A field? A parameter? The goal is to make the code more readable to humans.

Unfortunately, this doesn't work. The human brain doesn't read letters, it reads words. So "fStart" (meaning a field with the name "start") is rejected by the brain because it's not a word. This triggers the conscious analysis which John tries to avoid! Which is why modern IDEs use color to tell you what something is: The brain can decode color and words in independent parts - unconsciously.

Packaging Convention

Next, he moves on to how to split code into packages. Currently, we use a "package by layer" scheme, meaning all DB code goes into one package and the model code into another and the UI layer into a third, etc. He proposes to use a "by-feature" packaging with the litmus test "you should be able to delete a feature by deleting a single directory, without leaving behind any cruft".

Uhm. When have you ever written any code where you could remove a feature just by deleting a single directory? This sounds nice and simple but it fails Einstein's litmus test: "Make it as simple as possible but not more simple". Even if you have a plug-in based software like Eclipse, this doesn't work because there are still references outside (otherwise, your plug-in wouldn't be able to do anything).

Also, to keep a feature as isolated from everything else as possible (which is a good thing), you need to copy a lot of code into the feature which would otherwise reside elsewhere, neatly packed up in its own package. Really just a limitation of Java where you can't tell the compiler to generate boilerplate code for you. Still, you need to cut code in such a way that it reduces dependencies, not increases them. Therefore, a general rule won't cut it (or maybe it will cut: you).

Immutables

John quotes: "'Classes should be immutable unless there is a very good reason for making them mutable,' says Bloch.". And later: "From a practical perspective, many widely-used frameworks require application programmers to use JavaBeans (or something similar) to model database records. This is deeply unfortunate, because it doesn't allow programmers to take advantage of the many positive qualities of immutable objects."

From a practical perspective, immutable objects are dead weight. Applications are all about changing something. I read data from the database, I modify it, I write it back. I rarely read, display and forget about something. Yes, immutables have advantages because they can be shared between threads but that's their only advantage.

Just think about this: You must modify data from the database. So you read the data into an immutable. How to modify it, now? Obviously, you need a method to change it. If you prefer setters, the "setter" must return a copy. So you need to copy the object for every single change. If you want to get a feeling for that, try to do math with BigDecimal. Okay, after the copy you can write the copy back to the database. Question: How do you notify everyone else who might have a (now stale) copy of the old immutable? There are no listeners; immutables can't have listeners. Duh. Driving this to the extreme, lists wouldn't offer methods to add or remove items; or rather they would return new copies of themselves after every add/remove operation.

Sorry, no sale. I can't add money to my cash register. It's immutable.
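To make the "copy for every change" and "stale copy" points concrete, here is a small sketch. It's in Python instead of Java to keep it short; the class `Account` and the method `with_balance` are invented for the example.

```python
from dataclasses import dataclass, replace, FrozenInstanceError


@dataclass(frozen=True)
class Account:
    owner: str
    balance: int

    def with_balance(self, balance):
        """The immutable 'setter': returns a modified copy."""
        return replace(self, balance=balance)


original = Account('alice', 100)
stale_view = original                     # someone else holds a reference
updated = original.with_balance(150)      # every change creates a new object

if __name__ == '__main__':
    try:
        original.balance = 150            # direct modification is refused
    except FrozenInstanceError as error:
        print('cannot modify:', error)
    print(updated.balance, stale_view.balance)
```

Nobody holding `stale_view` finds out about `updated` unless you build a notification mechanism outside of the object.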

And a colleague just introduced me to another great concept: Constructors which require values for all fields. The class in question has 95 fields. This idea has the following flaws: a) No matter how big your screen, you can't fit the call onto it. b) After argument #10, you lost track and you can't see anymore which value goes into which argument. Now imagine you have to remove a field. How do you find the right one in this mega-call?

No, nothing beats the no-arg constructor plus a list of setters, all costs considered.

Private members

John proposes to move private members to the end of the class. Here, I agree. I'd even put them close to the getter and setter so that a lot of stuff that belongs together is together.

In today's IDEs with their superb code navigation (I can't really believe there was a time before the F3 key), this doesn't matter much, though.

Conclusion: Think about it, but don't bother.

Friday, August 01, 2008

DecentXML 1.1

I've just released 1.1 of DecentXML.

Wednesday, July 30, 2008

Update to DecentXML

I've updated my XML parser. The tests now cover 97.7% of the code (well, actually 100% of the code which can be executed; there are a couple of exceptions which will never be thrown but I still have to handle them) and there are classes to read XML from InputStream and Reader sources (including encoding detection).

The XMLInputStreamReader class can be used standalone, if you ever want to read an XML file with the correct encoding.
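If you wonder what "encoding detection" involves, the general idea is: look for a byte order mark first, then for the encoding in the XML declaration, and fall back to UTF-8. The following Python sketch shows that idea only. It is not how XMLInputStreamReader is implemented, the function name is mine, and it ignores cases like UTF-16 without a byte order mark.

```python
import codecs
import re

# UTF-32 must be checked before UTF-16: the UTF-32 LE mark starts with
# the same two bytes as the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF32_LE, 'utf-32-le'),
    (codecs.BOM_UTF32_BE, 'utf-32-be'),
    (codecs.BOM_UTF8, 'utf-8'),
    (codecs.BOM_UTF16_LE, 'utf-16-le'),
    (codecs.BOM_UTF16_BE, 'utf-16-be'),
]

_DECLARATION = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._\-]*)["\']')


def sniff_xml_encoding(data):
    """Guess the encoding of an XML document from its first bytes."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    match = _DECLARATION.match(data[:200])
    if match:
        return match.group(1).decode('ascii')
    return 'utf-8'    # the default when nothing else is specified
```

With the result, you can decode the bytes (after skipping the byte order mark, if there is one) and hand the text to the parser.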

You can download the sources and report issues in the new Google Code project I've created.

Information Management With Zotero

I've long been looking for a nice tool to manage my vast extra-brain information collection, i.e. the stuff that I don't want to save in my long term memory. Web snippets, notes, that kind of stuff. All the usual solutions didn't appeal to me. Either I was locked to Windows or to a single computer or the UI was bad or the feature list lacked some important points.

Zotero to the rescue. This beast is advertised as "Zotero [zoh-TAIR-oh] is a free, easy-to-use Firefox extension to help you collect, manage, and cite your research sources. It lives right where you do your work — in the web browser itself."

Which makes sense. I watch most of my information in my web browser, so why not collect it in there, too? The UI is nice, I'm just missing a few features. Also being able to sync with my own server would be nice. But I'm sure that will be fixed soon. In the meantime, I can at least tag and order my snippets.

Tuesday, July 29, 2008

A Decent XML Parser

Since there isn't one, I've started writing one myself. Main features:

Allows 100% round-tripping, even for weird whitespace between attributes in elements
Suitable for building editors and filters which want to preserve the original file layout
Error messages have line and column information
Easy to reuse
XML 1.0 compatible

You can download the latest sources here as a Maven 2 project.

Monday, July 28, 2008

DSLs: Introducing Slang

Did you ever ask for a more compact way to express something in your favorite programming language? Say hello to DSL (Domain Specific Language). A DSL is a slang, a compact way to say what you want. When two astronauts talk, they use slang. They need to get information across and presto. "Over" instead of "I'll now clear the frequency so you can start talking." And when these guys do it, there's no reason for us not to.

Here is an article on JavaWorld which gives some nice examples of how to create a slang in Java and in Groovy. Pizza-lovers of the world, eat your heart out.

FREE! Really.

I just found a nice comment under my blog. It offered a free service. One sentence was: "REGISTRATION IS ABSOLUTELY FREE!" When you see that, you know you're being ripped off. I'm not mentioning the name of the guys who tried that stunt in order to give them no additional advertisement. 'Nuff said.

Tip: If you want me to join your planet or RSS mega feed or whatever, it's not smart to post a comment in my blog. This is my blog, my reputation, my honor. I decide who gets free advertisement here.

Saturday, July 26, 2008

Testing With Databases

When it comes to testing, databases are kind of a sore spot. People like to think that "you can't test when you need a database" or "it's too complicated" or "it's not worth it." I'd like to give you some ideas what you can do when you need to test code that depends on a database. This list is sorted in the order in which I try to tackle the problem:

Use POJOs to store the data from the database in the real code and for the tests, create some dummy objects with test data and use them.
Make the database layer a plug-in of your application and replace it with a mockup for testing that doesn't need the database and which returns test objects instead.
Instead of connecting to the real database, get HSQLDB or Derby and use an embedded or at least local database. I prefer HSQLDB because it's smaller and starts faster (and tests should always be fast) but Derby has more features.
Create a second instance of the production database system on a different machine, preferably your own computer.
Create another instance of the real database with test data on the same machine as the real database.
Use database schemas to create a logical database in the real database, for example if all tables are in the schema APP, create APP_TEST and in your code, add a way to replace the schema name in the SQL statements. If you wrote the DB layer yourself, use a system property which isn't set in production. If you're using Hibernate, walk the mapping objects which are created and replace the table names after loading the production configuration. Field.setAccessible(true) is your friend.
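The first two options can be sketched like this. The example is in Python to keep it short (the same shape works with POJOs and an interface in Java), and every name in it (`Person`, `InMemoryPersonStore`, `find_all`, `adult_names`) is made up for the illustration.

```python
class Person:
    """Plain data object, as it would be filled from a database row."""

    def __init__(self, name, age):
        self.name = name
        self.age = age


class InMemoryPersonStore:
    """Test replacement for the DB layer: same method, no database."""

    def __init__(self, persons):
        self._persons = list(persons)

    def find_all(self):
        return list(self._persons)


def adult_names(store):
    """Code under test: it only sees the store, not where the data lives."""
    return sorted(p.name for p in store.find_all() if p.age >= 18)


if __name__ == '__main__':
    store = InMemoryPersonStore([Person('Bob', 12), Person('Ann', 34)])
    print(adult_names(store))
```

The production code would get a store that runs SQL behind `find_all()`; the test never needs to know.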

If you can't decide, here are a few hints:

Creating two databases using schemas in the same instance can get you into serious trouble without you noticing. For example, the tests should be able to rebuild the test database from scratch at the press of a button so you can be sure in which state the database really is. If you make a mistake with the schema name during that setup, you'll destroy the real database. You might not notice you did, because the flawed statement is usually hidden under a few hundred others.

Installing a second instance on a different machine might trigger license fees or your DB admin might not like it for some reason. Also, a test database should be very flexible because you'll need to be able to drop and recreate it a dozen times per hour if you need to. Your DB admins might not like to give you the necessary rights to do that. Lastly, this means only one developer can run all the tests at any given point in time because you're all going against the same database. This is bad, really bad. More often than not, you'll have spurious errors because of that.

If you can legally get a copy of the real database on your own machine, that's cool ... until you see the memory, CPU and hard disk requirements plus a DB admin will probably hog your machine for a day or two to install it. Having to run two applications which need 1GB of RAM (your IDE and the DB) on a machine that has only 1GB of RAM isn't going to be fun.

For many cases, using HSQLDB or Derby is a good compromise between all forces that pull at you. While that will make your tests slow, they will often run much faster than against the real DB. You can install these as many times you like without any license issues, fees or DB admins bothering you. They don't take much memory or hard disk space and they are under your total control.

Only, they are not the real DB. There might be slight differences, hidden performance issues and other stuff that you won't notice. Don't bother about that, though. If you can test your application, you'll find that you'll be able to fix any problems that come up when you run against the real database in little time. If you can't test your application, though, well, you're doomed.

I strongly recommend being able to set up the database from scratch automatically. With Derby, you can create a template database and clone that on the first connection. With HSQLDB, loading data is so fast that you can afford to rebuild it with INSERT statements every time you run the tests.
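As an illustration of "rebuild it every time", here is a sketch that uses Python's built-in sqlite3 module with an in-memory database as a stand-in for an embedded HSQLDB. The table, the rows and the function name are invented; the point is only that every test gets its own freshly built database.

```python
import sqlite3

SCHEMA = """
CREATE TABLE person (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    age  INTEGER NOT NULL
);
"""

# As little data as possible: just enough for the tests to work.
TEST_ROWS = [
    (1, 'Ann', 34),
    (2, 'Bob', 12),
]


def fresh_test_database():
    """Create a new in-memory database with schema and test data."""
    connection = sqlite3.connect(':memory:')
    connection.executescript(SCHEMA)
    connection.executemany(
        'INSERT INTO person (id, name, age) VALUES (?, ?, ?)', TEST_ROWS)
    connection.commit()
    return connection


if __name__ == '__main__':
    db = fresh_test_database()
    print(db.execute('SELECT COUNT(*) FROM person').fetchone()[0])
    db.close()
```

A test that deletes or changes rows can't hurt the next one, because the next one calls `fresh_test_database()` again.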

Still, test as much code as possible without a database connection. For one, any test without a DB will run 100-1000 times faster. Secondly, you're adding a couple more points of failure to your test which are really outside the scope of your test. Remember, every test in your suite should test exactly one thing. But if you can't get rid of the connection, you're testing the feature you want plus the DB layer plus the DB connection plus the DB itself. Plus you'll have the overhead of setup, etc. It will be hard to run a single test from your suite.

At the end of the day, testing needs to be fun. If you feel that the tests are the biggest obstacle in being productive, you wouldn't be the good developer you are if you didn't get rid of them.

One last thing: Do not load as much data as possible! It is a common mistake to think that your tests will be "better" if you have "as much data as possible". Instead load as little data as possible to make the tests work. When you find a bug, add as little data as possible to create a test for this bug. Otherwise, you'll hog your database with useless junk that a) costs time, b) no one can tell apart from the useful stuff and c) it will give you a false feeling of safety that isn't there.

If you don't know which data is useful and which isn't, then you don't know. Loading huge amounts of junk into your database won't change that. In order to learn, you must start with what you know and work from there. Simply copying the whole production system will only slow you down and it will overwrite the carefully designed test cases you inserted yesterday.