Friday, January 30, 2009

A Different View on Exceptions

The discussion about checked/unchecked exceptions is almost as old as Java. While we all have a point in our stance towards this, maybe we are looking at the problem from the wrong angle. Manuel Woelker wrote an article which concentrates on the receiver of the exception, the user, and how exceptions should behave to help the user: Exceptions From a User’s Perspective

Wednesday, January 28, 2009

How to Hide a Virus in Source Code

I've been looking for quite some time for this article: How can you hide a virus in the source code? Basically, you create a binary of a compiler which contains the virus and which is patched to infect other programs as it compiles them.

Reflections on Trusting Trust by Ken Thompson.

Friday, January 23, 2009

Another Lesson on Performance

Just another story you can tell someone who fears that "XYZ might be too slow":

I'm toying with the idea of writing a new text editor. I mean, I've written my own OS, my own XML parser and I once maintained XDME, an editor written originally by Matthew Dillon. XDME has a couple of bugs and major design flaws that I always wanted to fix but never really got around to. Anyway.

So what is the best data structure for a text editor in 2008? List of lines? Gap-Buffer? Multi-Gap-Buffer?

XDME would keep the text in a list of lines and each line would point to a character array with the actual data. When editing, the characters would be copied into an edit buffer, the changes made and after the edit, the changed characters would be copied back, allocating a new character array if necessary.
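To make the design above concrete, here is a minimal Java sketch of a list-of-lines buffer. It is not XDME's actual code (which isn't shown here); the class name `LineListBuffer` and all of its methods are made up for illustration, and a growable `StringBuilder` stands in for the fixed-size edit buffer.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a list-of-lines text buffer; not XDME's real code. */
public class LineListBuffer {
    // Each line owns its own character array.
    private final List<char[]> lines = new ArrayList<>();

    public void addLine(String text) {
        lines.add(text.toCharArray());
    }

    /** Inserts text into one line by going through a temporary edit buffer. */
    public void insert(int lineIndex, int column, String text) {
        // Copy the line into an edit buffer ...
        StringBuilder editBuffer = new StringBuilder();
        editBuffer.append(lines.get(lineIndex));
        // ... apply the change ...
        editBuffer.insert(column, text);
        // ... and copy the result back into a freshly allocated array.
        char[] updated = new char[editBuffer.length()];
        editBuffer.getChars(0, editBuffer.length(), updated, 0);
        lines.set(lineIndex, updated);
    }

    public String getLine(int lineIndex) {
        return new String(lines.get(lineIndex));
    }

    public int lineCount() {
        return lines.size();
    }
}
```

Every edit costs two copies of the line plus an allocation, which is cheap for short lines but adds up when a file has millions of them.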

This worked and the design was simple, but it had a flaw: it didn't scale. The length of a line was limited to the max size of the edit buffer and loading a huge file took ages because each line was read, analyzed, memory was allocated ... you get the idea.

So I wanted to make it better. Much better. I'd start with reading the file into a big buffer, chopped into evenly sized chunks to make reading both fast and memory efficient (imagine loading a 46MB file into a single memory chunk - after a couple of changes, I'd need to allocate a second 46MB chunk, copy the whole stuff over, etc, needing twice the amount of RAM for a short time).

During the weekend, I mulled the various ideas over, planned, started with a complex red-black tree structure for markers (positions in the text that move when you insert before them). It's huge, complex. It screams "wrong way!"
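For comparison, the marker idea itself needs very little code if you accept a linear scan. The sketch below is only an assumption about how markers could be kept without a tree; `MarkerList` and its methods are hypothetical names, and treating an insert exactly at a marker's position as "before" it is a design choice made here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: markers kept in a plain list and shifted on every insert. */
public class MarkerList {
    private final List<Integer> positions = new ArrayList<>();

    /** Registers a marker at the given text offset and returns its handle. */
    public int add(int offset) {
        positions.add(offset);
        return positions.size() - 1;
    }

    public int get(int handle) {
        return positions.get(handle);
    }

    /** Call after inserting `length` characters at `offset`. */
    public void textInserted(int offset, int length) {
        for (int i = 0; i < positions.size(); i++) {
            int pos = positions.get(i);
            // Markers at or behind the insertion point move right;
            // markers in front of it stay where they are.
            if (pos >= offset) {
                positions.set(i, pos + length);
            }
        }
    }
}
```

Each insert touches every marker once, so this only makes sense as long as the number of markers stays small.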

So today, I sat back and did what I should have done first: Get some figures. How much does it really cost to copy 4MB of RAM? Make a guess. Here is the code to check:

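    // Wrapper class added so the snippet compiles; the name is arbitrary.
    public class CopyTest
    {
        public static void main (String[] args)
        {
            long start = System.currentTimeMillis ();

            int N = 10000;
            for (int i=0; i<N; i++)
            {
                // 1M ints = 4MB
                int[] buffer = new int[1024*1024];
                // Shift the whole buffer by one element
                System.arraycopy (buffer, 0, buffer, 1, buffer.length-1);
            }

            long duration = System.currentTimeMillis () - start;
            System.out.println (duration);      // total milliseconds
            System.out.println (duration / N);  // milliseconds per iteration
        }
    }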

On my machine, this prints "135223" and "13". That's thirteen milliseconds to copy 4MB of RAM. Okay. It's obviously not worth spending a second thinking about the cost of moving data around in a big block of bytes.
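One caveat: the loop above also allocates a fresh 4MB array on every iteration, so the measured time covers allocation as well as copying. A variant that times only the copy might look like the sketch below. The class and method names are hypothetical, the warm-up count is an arbitrary guess, and no result is quoted because it will differ from machine to machine.

```java
/** Hypothetical variant of the test that keeps allocation out of the timed loop. */
public class CopyBenchmark {
    /** Returns the average time in nanoseconds for one shifting copy of the buffer. */
    public static long averageCopyNanos(int[] buffer, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            // Shift the whole buffer by one element, as in the original test.
            System.arraycopy(buffer, 0, buffer, 1, buffer.length - 1);
        }
        return (System.nanoTime() - start) / iterations;
    }

    public static void main(String[] args) {
        int[] buffer = new int[1024 * 1024]; // 4MB, allocated once
        averageCopyNanos(buffer, 100);       // warm-up so the JIT can kick in
        System.out.println(averageCopyNanos(buffer, 1000) + " ns per copy");
    }
}
```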

That leaves the memory issue. I would really like to be able to load and edit a 40MB file in a VM which has 64MB heap. Also, I would like to be just as efficient at loading a file with 40MB worth of line-feeds as at loading a file which contains just a single line with 40MB of data in it.

But this simple test has solved one problem for me: I can keep the lines in an ArrayList for fast access and need not worry too much about performance. The actual character data needs to go into a chunked memory structure, though.
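A rough sketch of that split, under my own assumptions: character data lives in fixed-size chunks, and an `ArrayList` holds the offset where each line starts. `ChunkedText` and its methods are hypothetical, the sketch only covers appending (loading) and not inserting in the middle, and a real editor would probably use a primitive `int` array rather than boxed `Integer` objects for the line starts.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: text stored in fixed-size chunks, line starts in an ArrayList. */
public class ChunkedText {
    private final int chunkSize;
    private final List<char[]> chunks = new ArrayList<>();
    private final List<Integer> lineStarts = new ArrayList<>();
    private int length = 0;

    public ChunkedText(int chunkSize) {
        this.chunkSize = chunkSize;
        lineStarts.add(0); // the first line starts at offset 0
    }

    /** Appends text, allocating a new chunk only when the last one is full. */
    public void append(CharSequence text) {
        for (int i = 0; i < text.length(); i++) {
            if (length % chunkSize == 0) {
                chunks.add(new char[chunkSize]);
            }
            char c = text.charAt(i);
            chunks.get(length / chunkSize)[length % chunkSize] = c;
            length++;
            if (c == '\n') {
                lineStarts.add(length); // the next line begins after the line feed
            }
        }
    }

    public char charAt(int offset) {
        return chunks.get(offset / chunkSize)[offset % chunkSize];
    }

    public int lineCount() {
        return lineStarts.size();
    }

    /** Returns the line without its trailing line feed. */
    public String line(int index) {
        int start = lineStarts.get(index);
        int end = (index + 1 < lineStarts.size()) ? lineStarts.get(index + 1) - 1 : length;
        StringBuilder sb = new StringBuilder(end - start);
        for (int i = start; i < end; i++) {
            sb.append(charAt(i));
        }
        return sb.toString();
    }
}
```

Because no chunk ever has to grow, loading never needs a second copy of the whole file in memory.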

Moral: There is no way to tell the performance of a piece of code by looking at it.

25 Most Dangerous Programming Errors

If you want to improve your l33t [0d|ng skillz, especially keeping script kiddies off your back, here is a list of the 25 most common coding errors: http://www.sans.org/top25errors/

Thursday, January 22, 2009

Sorting for Humans: Natural Sort Order

Kate Rhodes sums it up nicely:

Silly me, I just figured that alphabetical sorting was such a common need (judging by the number of people asking how to do it I'm not wrong either) that I wouldn't have to write the damn thing. But I didn't count on the stupid factor. Jesus Christ people. You're programmers. You're almost all college graduates and none of you know what the f**k "Alphabetical" means. You should all be ashamed. If any of you are using your language's default sort algorithm, which is almost guaranteed to be ASCIIbetical (for good reason) to get alphabetical sorting you proceed to the nearest mirror and slap yourself repeatedly before returning to your desks and fixing your unit tests that didn't catch this problem.

So if you want to sort your lists the right way (instead of the ASCII way), read this.
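To give an idea of what this means in code, here is a minimal Java sketch of a natural-order comparator. It is not the algorithm from the linked article; `NaturalOrder` is a hypothetical name, and the sketch assumes ASCII digits and simple case-insensitive comparison of everything else.

```java
import java.util.Comparator;

/** Hypothetical sketch of a natural-order comparator: digit runs compare as numbers. */
public class NaturalOrder implements Comparator<String> {
    private static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    @Override
    public int compare(String a, String b) {
        int i = 0, j = 0;
        while (i < a.length() && j < b.length()) {
            char ca = a.charAt(i), cb = b.charAt(j);
            if (isAsciiDigit(ca) && isAsciiDigit(cb)) {
                // Find the end of both digit runs.
                int iEnd = i, jEnd = j;
                while (iEnd < a.length() && isAsciiDigit(a.charAt(iEnd))) iEnd++;
                while (jEnd < b.length() && isAsciiDigit(b.charAt(jEnd))) jEnd++;
                // Skip leading zeros so that "007" and "7" have the same value.
                int iNum = i, jNum = j;
                while (iNum < iEnd - 1 && a.charAt(iNum) == '0') iNum++;
                while (jNum < jEnd - 1 && b.charAt(jNum) == '0') jNum++;
                // A longer run of significant digits is the larger number.
                int lenDiff = (iEnd - iNum) - (jEnd - jNum);
                if (lenDiff != 0) return lenDiff;
                // Same length: compare digit by digit.
                for (int k = 0; k < iEnd - iNum; k++) {
                    int d = a.charAt(iNum + k) - b.charAt(jNum + k);
                    if (d != 0) return d;
                }
                i = iEnd;
                j = jEnd;
            } else {
                int d = Character.compare(Character.toLowerCase(ca), Character.toLowerCase(cb));
                if (d != 0) return d;
                i++;
                j++;
            }
        }
        // The string with characters left over sorts last.
        return (a.length() - i) - (b.length() - j);
    }
}
```

Passing `new NaturalOrder()` to `List.sort` puts "file2.txt" before "file10.txt", where the default `String` ordering would put "file10.txt" first.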

Saturday, January 03, 2009

The Temporal Void

Holidays. The only time when I can read or "dream with open eyes" (text from a bookmark). This year, it was "The Temporal Void" by Peter F. Hamilton. It's the sequel to "The Dreaming Void" (my review).

Again, the series is coming along great (which Peter can probably see on his bank account :) Well deserved if you ask me). I like the rich characters, the story is sound and believable. Recommendation: Buy. Now.

There were three spots which I didn't "buy" in "The Temporal Void", places where I dropped from the story and thought "WTF?" Note: Only mild spoilers below; you can read on even if you haven't read the book, yet.

  1. So Aaron is stranded on Hanko, the planet is about to blow up and the Navy scout is about to pick him up. After being warned that he's dangerous, having the best sensors military money can buy, they let him simply walk onto their ship battle-ready and kill them. I mean, OK, shit happens and maybe this was the Omega ship with the best morons the Navy could find and such ... but ... nah, really :) With instant comm available at all times, no one is watching this important operation? There isn't even a recording? Didn't buy that one.

    The same happened in the first part when Aaron broke into the storage vault to claim Inigo's memories. Why did you place the guards *inside* (where all that delicate stuff will break if they ever have to engage someone)? Why not place them on the other side of the vault door where they can pummel any intruder against a foot or two of solid steel, without any cover?

  2. Edeard finds his childhood friend Salrana in the clutches of Ranalee and leaves her there. I never thought he would be the character to leave someone behind. He knows only bad can come from this; I mean it's only the tenth time this happens, he's got to learn something, right? If he dragged Salrana away, the girl would be mad but he could leave her with the Pythia and look for a solution if she doesn't know one. If all else fails, he could simply blackmail Ranalee into fixing what she did. So I accept that he's tired and worn out and all that but this just didn't fit.
  3. Paula and the quantumbuster. So this thing really distorts spacetime to wreak havoc with matter. How can she get away when space is so twisted? How about just nailing her in place using the ships in orbit and blowing up the station the traditional way?

Other than that, the story is the usual perfect piece of work from Peter. I've posted the text above in Peter's inbox; should I get a reply, I'll post it here.