Why not?

Thursday, October 30, 2008

2 Days with Google Gears

I had a chance to play with Google Gears on a mockup project recently. I was surprised to learn that my understanding of Gears did not match the reality. I had expected a JS library that helps rich HTML applications cope with spotty network connections. It turned out that Gears is a browser plugin that adds tools that any rich web app developer would find useful, though they are all aimed at handling an app that loses its network connection. In the two days that I spent playing with Gears, I was pretty much blown away by its power and simplicity.

Besides understanding what Gears is, it's important to also understand what Gears is not. Gears doesn't try to be a UI library, or a general "glue" library (see JQuery, Prototype, MochiKit, MooTools, or any other Javascript framework). It doesn't really force you to code in any particular way (this is a bit of a lie, but more on that later). Just as a sampling, Gears includes:

A cross-browser compatible XMLHttpRequest
Local database storage
A web resource cache, so that items can still be fetched even if the network connection goes down. This is nearly completely transparent to the developer, and works for resources loaded by Gears or by the browser itself.
A mechanism to run scripts in the background (for real).

In our little demo, we have a set of checkboxes that can be checked. Initially, these would perform an asynchronous post to the server, where they would update some server-side state. However, if the app goes offline, nothing will ever reach the server. We modified this to enqueue changes into a local database, with a background script pulling items out of the database and sending them to the server. If the server ever goes down, that thread simply stops pulling items out of the database. In addition, we set our app up so that its resources (HTML, Javascript, CSS, and images) are cached when the app is first loaded. A neat feature of Gears seems to be that it will monitor the apps that it knows about and will automatically update its cache if the cache ever gets stale. Unfortunately, it's not perfect: it depends on the developer updating a version string, which is what causes Gears to refresh its cache from the source.
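
To make the queue-and-drain idea concrete, here is a rough sketch in plain Java. It is not the code from our demo and it does not use the Gears API; the class name Outbox, its methods, and the in-memory queue standing in for the local database are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Predicate;

// Hypothetical names throughout; this is not the Gears API.
class Outbox {
    // Pending changes, oldest first. An in-memory stand-in for the local database.
    private final Queue<String> pending = new ArrayDeque<>();

    // Called by the UI whenever a checkbox changes; never touches the network.
    void enqueue(String change) {
        pending.add(change);
    }

    // Called periodically by the background worker. `send` returns false when the
    // server cannot be reached; the change then stays queued for the next attempt.
    int drain(Predicate<String> send) {
        int sent = 0;
        while (!pending.isEmpty() && send.test(pending.peek())) {
            pending.remove();
            sent++;
        }
        return sent;
    }

    int size() {
        return pending.size();
    }

    public static void main(String[] args) {
        Outbox outbox = new Outbox();
        outbox.enqueue("item-1=checked");
        outbox.enqueue("item-2=unchecked");

        List<String> server = new ArrayList<>();
        System.out.println(outbox.drain(change -> false)); // offline: prints 0
        System.out.println(outbox.drain(server::add));     // online: prints 2
        System.out.println(server);
    }
}
```

The point of the shape is that the UI only ever talks to the queue, so going offline changes nothing on the page; only the drain step notices.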

One problem that we had is that the HTML that we served up would include information about what items were checked. That is to say, when you would load the page, we would serve up some <input type="checkbox" /> and <input type="checkbox" checked="checked" /> elements. This makes total sense in a traditional web app. The client requests the page, and you serve it, and everybody is happy. Every time the page is served, it is reconstructed with the current state of the data. As you might imagine, this caused all kinds of problems for us. Concretely, we noticed that every time the page was reloaded (whether the network connection is up or down), the browser would display the state of the page as it was when the cache first acquired it. In a real application, that could mean that you are seeing data that is several months out of date. Now you see how I lied earlier. Gears does influence the way you code your application, but its requirements are about the same as those of any Javascript-heavy web app. As long as you separate your application presentation from your data, you should be fine.

Another thing that surprised and greatly pleased me was Gears' WorkerPool implementation. As everybody knows, it is impossible to run Javascript in the background in a normal web browser. I think that's because multi-threaded programming is hard, and Javascript can be pretty hairy as it is. I think that the browser designers have held off on implementing a threading solution out of fear that multithreaded Javascript would cause the apocalypse. As it turns out, though, Gears' implementation is both simple and powerful. Gears uses a message-passing mechanism for communication, with absolutely no shared state. This is great news. As far as I can tell, just as your main JS code has an event loop, each worker also has an event loop. Whenever a message is sent from the main JS code to a worker, that message is copied and onMessage is invoked by that worker's event loop. Likewise, when a worker sends a message back to the main JS, the message is copied and onMessage is invoked on the main event loop. This has some interesting implications. For one, none of the workers have access to the DOM or to global variables defined on the page, and they cannot participate in closures with mainline Javascript code. By placing a concrete wall between your page and your workers, Gears forces you to think about the interactions that the page and the worker will have, and that's a Good Thing. I'm sure that it's still possible for threading to ruin you; it's just a lot harder with a scheme like this.
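
The same copy-on-send, no-shared-state idea can be sketched outside the browser. The following is plain Java, not the Gears WorkerPool API; WorkerSketch, Worker, send, stop, and the STOP sentinel are names I made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative names only; this is plain Java, not the Gears WorkerPool API.
class WorkerSketch {
    // Sentinel message that tells the worker to exit (a convention of this sketch).
    private static final List<Integer> STOP = new ArrayList<>();

    static final class Worker implements Runnable {
        private final BlockingQueue<List<Integer>> inbox = new LinkedBlockingQueue<>();
        private final BlockingQueue<Integer> replies;

        Worker(BlockingQueue<Integer> replies) {
            this.replies = replies;
        }

        // Copy the message on the way in, so sender and worker never share it.
        void send(List<Integer> message) {
            inbox.add(new ArrayList<>(message));
        }

        void stop() {
            inbox.add(STOP);
        }

        // The worker's own event loop: take a message, handle it, reply.
        @Override
        public void run() {
            try {
                for (List<Integer> m = inbox.take(); m != STOP; m = inbox.take()) {
                    int sum = 0;
                    for (int n : m) {
                        sum += n;
                    }
                    replies.put(sum);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    // Sends one message to a fresh worker and waits for the single reply.
    static int sumOnWorker(List<Integer> numbers) {
        BlockingQueue<Integer> replies = new LinkedBlockingQueue<>();
        Worker worker = new Worker(replies);
        Thread thread = new Thread(worker);
        thread.start();
        try {
            worker.send(numbers);
            int result = replies.take();
            worker.stop();
            thread.join();
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the worker", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOnWorker(List.of(1, 2, 3))); // prints 6
    }
}
```

Java does not enforce the wall the way Gears does, so the copy in send is doing by convention what Gears does by construction.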

And that's it. There's more to Gears than what I described (though not much more). It also includes some geolocation bits (presumably for Android, and maybe Safari Mobile, integration), desktop integration stuff, a standards-compliant timer, a file multi-chooser (yay!), and a binary data type (as opposed to String, which is for textual content). It's a shame that Gears is still in beta. I would really like to see some sites that use it. Of course, since I just recently installed Gears, there might be some sites that do and I never realized it.

Wednesday, October 29, 2008

User-Visible Permissions in Android

I picked up a T-Mobile G1 (danger: Flash-heavy site) at the local T-Mobile store. For those that don't know, the G1 is the first device to run Google's Android platform. So far I like it a lot, and I'll probably post a lot more about it in the near future.

Like the iPhone, Android has its own app store. Unlike the iPhone, nobody moderates apps submitted to the Android app store. If an app tries to do anything of consequence (i.e. anything that a user might want to know about), it must explicitly request that permission. When you start to download an app from the marketplace, it tells you what permissions that app will require. Most apps are well behaved, but some ask for way too much.

For example, I wanted a weather app. I saw that there is a Weather Channel app. When I went to download it, however, I was very surprised. Here is a list of the permissions that it requested.

Network communication: full Internet access
Your location: coarse (network-based) location, fine (GPS) location
System tools: change network communication, change your UI settings, modify global system settings
Your messages: edit SMS or MMS
Services that cost you money: send SMS messages
Your personal information: read contact data

What? Why does this app need access to my contacts, or to send text messages? I hope that this was just a lazy developer who requested more permissions than he actually needed, but I'm suspicious. It's entirely possible that The Weather Channel intends to compile a list of all my contacts. Not cool. Especially since it makes no mention of that.

I think it's great that Android provides some ability for the end user to judge the software that they might install on their phone. I'll wait until The Weather Channel updates their app.

Tuesday, October 14, 2008

Scala's Model of Functions

I was a little dismayed to learn that Scala models functions with different numbers of parameters as instances of distinct function trait definitions. A practical upshot of this is that you can't really work with functions that take more than 22 parameters.

def crazy(a:Int, b:Int, c:Int, d:Int, e:Int, f:Int, g:Int, h:Int, i:Int, j:Int, 
          k:Int, l:Int, m:Int, n:Int, o:Int, p:Int, q:Int, r:Int, s:Int, t:Int, 
          u:Int, v:Int) = 0
(crazy _).curry    :    (Int) => (Int) => (Int) => (Int) => (Int) => (Int) => (Int) => (Int) => (Int) => (Int) => 
                        (Int) => (Int) => (Int) => (Int) => (Int) => (Int) => (Int) => (Int) => (Int) => (Int) => 
                        (Int) => (Int) => Int = <function>

def crazier(a:Int, b:Int, c:Int, d:Int, e:Int, f:Int, g:Int, h:Int, i:Int, j:Int, 
            k:Int, l:Int, m:Int, n:Int, o:Int, p:Int, q:Int, r:Int, s:Int, t:Int, 
            u:Int, v:Int, w:Int) = 0
(crazier _).curry    :    <error>

The Scala runtime apparently has traits Function0 through Function22 defined. I guess this is so that they can have apply methods that take a statically known list of parameters (rather than, say, an array). That's all well and good, and probably necessary for proper Java interop, but it's still a little sad. Still, I don't expect to run into that limit any time soon. Oh wait, I have already worked on projects with functions that take more than 20 parameters. Maybe this was added just for me. Now I'm sad.
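
Java's standard library models functions the same way, one type per arity, which makes the trade-off easy to see. In this sketch, Function and BiFunction are real JDK types; Function3, curry2, and curry3 are hypothetical names that I had to write by hand, because the JDK stops at two arguments.

```java
import java.util.function.BiFunction;
import java.util.function.Function;

class AritySketch {
    // Hypothetical interface: the JDK has no three-argument function type.
    @FunctionalInterface
    interface Function3<A, B, C, R> {
        R apply(A a, B b, C c);
    }

    // Currying has to be written once per arity, because each arity is its own type.
    static <A, B, R> Function<A, Function<B, R>> curry2(BiFunction<A, B, R> f) {
        return a -> b -> f.apply(a, b);
    }

    static <A, B, C, R> Function<A, Function<B, Function<C, R>>> curry3(Function3<A, B, C, R> f) {
        return a -> b -> c -> f.apply(a, b, c);
    }

    public static void main(String[] args) {
        BiFunction<Integer, Integer, Integer> add = (a, b) -> a + b;
        Function3<Integer, Integer, Integer, Integer> simple = (x, y, z) -> x * (y + z);

        System.out.println(curry2(add).apply(2).apply(3));             // prints 5
        System.out.println(curry3(simple).apply(2).apply(3).apply(4)); // prints 14
    }
}
```

Every extra arity needs another interface and another curry method, which is presumably why any library built this way has to pick a number and stop there.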

Monday, October 13, 2008

Partial Application in Haskell and Scala

This is an attempt to squeeze out a blog post while I wait for my laundry to finish.

Functional languages are fun. Fun in ways that Java (and, for that matter, Ruby) are not. Take Haskell. In that language, we can take any operator and turn it into a function. Normally, we use the symbol + to represent addition. If we enclose it in parentheses, we instead have a function.

(+) :: (Num a) => a -> a -> a

In this case, (+) is a function of 2 number parameters, which returns a number. Now that we have a function, we can apply all of the standard Haskell magic to it. Since Haskell is automatically curried (no function really ever takes more than one parameter), we chain calls to fully evaluate our (+) function.

(+) 2 3 => 5

We can also partially apply this operator.

add5 :: Integer -> Integer
add5 = (+) 5
add5 3 => 8

In this case, we have created an alias for the partially bound + operator. Rather than jump through so many hoops, we could specify add5 more directly.

add5 = (5+)

Finally, a slightly more complicated example.

simple :: Integer -> Integer -> Integer -> Integer
simple x y z = x * (y + z)

simpler :: Integer -> Integer -> Integer
simpler = simple 2

simplest :: Integer
simplest = simpler 3 4 => 14

All functions are also values in Haskell.

easy = simple
easy 2 3 4 => 14

As you can see, in Haskell, we can turn any operator into a function. Functions are curried, and can be partially evaluated from the left. Functions are also values that can be assigned and passed around as needed.

Scala takes a different approach. In Scala, operators are actually methods on values. There is no global + operator. Instead, you invoke the + method on the left hand parameter.

5 + 3 //is the same as...
(5).+(3)

If you want to refer to a function as a value in Scala, you must "partially apply" it to zero parameters.

val output = println //will result in a compilation error
val output: Any => Unit = println _  //the type annotation picks the one-argument overload of println
output("Oh Hai, World!")

The underscore is the Scala placeholder operator. If used as we did with println, it stands in for the whole argument list, effectively turning the function into a function value. It is also the mechanism by which we can partially apply a function.

def simple(x:Int, y:Int, z:Int) = x * (y + z)
val simpler = simple(2, _:Int, _:Int)
simpler(3, 4) => 14

The underscores, when used this way, compel the result of the expression to itself be a function that takes n parameters, where n is the number of placeholders. Sometimes, it is possible to infer the type of the missing parameters; other times, it isn't. It depends on how the parameters are used.

It is very important to notice that, unlike Haskell, it is very easy to bind only the parameter in the middle of this expression.

val sample = simple(_:Int, 3, _:Int)
sample(2, 4) => 14
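
For readers who know neither language, here is roughly the same thing spelled out in Java, which has no placeholder syntax. The helper names bindFirst and bindMiddle are made up for this sketch; each one simply returns a lambda that remembers the bound argument.

```java
import java.util.function.BiFunction;

class PartialSketch {
    static int simple(int x, int y, int z) {
        return x * (y + z);
    }

    // Bind the first parameter, like Haskell's `simple 2`.
    static BiFunction<Integer, Integer, Integer> bindFirst(int x) {
        return (y, z) -> simple(x, y, z);
    }

    // Bind only the middle parameter, like Scala's `simple(_:Int, 3, _:Int)`.
    static BiFunction<Integer, Integer, Integer> bindMiddle(int y) {
        return (x, z) -> simple(x, y, z);
    }

    public static void main(String[] args) {
        System.out.println(bindFirst(2).apply(3, 4));  // prints 14
        System.out.println(bindMiddle(3).apply(2, 4)); // prints 14
    }
}
```

Without language support, every binding pattern needs its own helper, which is exactly the boilerplate that the underscore removes.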

By combining placeholder syntax with operators, it is possible to turn an operator into a function, even a function that takes its left operand as a parameter.

List(1, 2, 3).map(_ + 2) => List(3, 4, 5)
List(1, 2, 3).reduceLeft(_ + _) => 6

As you can see, Haskell and Scala have a lot in common. Haskell's syntax is a bit more concise (and its inference rules much better), but Scala's ability to bind any parameter is pretty handy, too. There's something both cluttered and clean about Scala's use of underscores, especially when types aren't required. Of course, I'm not an expert (or, in fact, experienced at all) with either language, so please correct me if I got any of my facts wrong.

Looks like I failed. My laundry was done 30 minutes ago.

Saturday, October 04, 2008

The Machine

Two weeks ago, I had the opportunity to see The Machine with my family. The Machine is a Pink Floyd tribute band. That is to say, at their shows, they play nothing but Pink Floyd music. All of the musicians are clearly extreme Floyd fans. I mean, why else would you spend 20 years of your life playing somebody else's music? Now, some people don't like tribute bands. I had a hard time getting people to see The Australian Pink Floyd Show when they came to New York (playing literally a few blocks from where we were staying). Who cares that these aren't the original musicians? Would you also refuse to go to a performance of Beethoven's 5th because it wasn't being conducted by the man himself? Of course not! The music is just as good, and the musicians are going to make it special and awesome anyway. But I digress...

It was interesting to see the variety in people in the theater. Obviously, many of the patrons were my parents' age, but there were also some college kids and folks whose heads were completely gray. What was perhaps more interesting to me is that the 50 year olds were more animated and crazy than the college kids. They had some smoke machines up on stage, but I don't think that was the source of all the smoke in the hall. It's fun to watch adults relive their youth.

The set was Dark Side of the Moon (with the Wizard of Oz projected onto their own version of Mr. Screen), followed by an intermission, followed by The Wall. Not a bad setlist at all. As they launched into the beginning songs from Dark Side, I was carefully listening for any variation from the album tracks that I know so well. I couldn't help it. These guys were playing well-known and well-loved music, so it's only natural to compare their performance to the original. By The Great Gig in the Sky, though, I was totally sold. The woman that belted out those notes was simply amazing. She absolutely hit every note. It was surreal. The keyboardist was younger than the rest and totally crazy, with a maniacal grin that was somehow larger than his actual face. The drummer hid behind the drums for most of the show, but did a very good job. The bassist seemed detached, standing apart from the others. I suspect that was completely intentional. The saxophone player was decent, but wasn't very memorable (after all, he only played on a few songs). Rounding out the group is the lead guitarist / lead singer. His ability to mimic both David Gilmour and Roger Waters was spooky. The man knew his guitar well, and made it sound just like the original.

By the time they were playing The Wall, people in the crowd were singing along. Performing Dark Side first was a good idea. People were more mellow when they entered the theater than when they left, and Dark Side is best appreciated without whoops and cheers. The Wall, on the other hand, is great with audience participation. In the end, they ended up getting 4 standing ovations (after Dark Side, after (I think) Comfortably Numb, after The Wall, and after their encore of Run Like Hell). They deserved each and every one of them. They probably played for 2.5 hours all told.

I never got a chance to see Pink Floyd live. As one of the people sitting next to us pointed out, this is the closest you can get at this point. While I agree with him, it is wrong to think of these guys as a facsimile of that famous band. These are all very talented musicians who love this music so much that they have dedicated a big chunk of their lives to it. As a fan, I'm grateful to them for doing that.

Saturday, September 06, 2008

No More Statics! Part 2

In a previous post, I explained how Scala's use of singleton objects is better than Java's use of static members. I was asked for some sample code after that post, so I thought I would throw some together. Let's look at a simple Java class.

class Foo {
    private static int number = 1;
    
    public static Foo create(String a) {
        return new Foo("Some " + a);
    }
    
    private String a;
    private int id = number++;
    
    public Foo(String a) {
        this.a = a;
    }
    
    @Override
    public String toString() {
        return "Foo #" + id + " is " + a;
    }
}

This class keeps track of how many instances have ever been created. You construct a Foo with a name, and the Foo's name and id are part of its string representation. In addition, there is a create method that has been defined on the Foo class itself.

Scala doesn't have a "static" keyword. Instead, members that would otherwise be static are placed onto the so-called companion object.

object Foo {
  var number:Int = 1;

  def create(a:String) = new Foo("Some " + a)
}

class Foo(a:String) {
  private val id:Int = Foo.number
  Foo.number = Foo.number + 1

  override def toString() = {
    "Foo #" + id + " is " + a
  }
}

Because Foo is the name of both an object and a class in the same package, they are allowed to access each other's private members. Basically, this makes the instance members of a singleton object equivalent to static members of an ordinary Java class. However, since the singleton object is a fully fledged object, it can be passed around in a way that Java classes normally can't be.

def createList(f : Foo.type) = {
  List(f.create("One"), f.create("Two"))
}

Have you ever wanted a Java class' static members to obey an interface? Well, the singleton object can mix in Scala traits (Scala traits seem to take the place of both interfaces and mixins from other languages).

trait Creatable[A] {
  def create(a:String) : A

  def createDefault() : A = {
    return create("Default")
  }
}

object Foo extends Creatable[Foo] {
  var number:Int = 1;

  override def create(a:String) = new Foo(a)
}

And here's a whole sample program:

trait Creatable[A] {
  def create(a:String) : A
  
  def createDefault() : A = {
    return create("Default")
  }
}

object Foo extends Creatable[Foo] {
  var number:Int = 1;
  
  override def create(a:String) = new Foo(a)
}

class Foo(a:String) {
  private val id:Int = Foo.number
  Foo.number = Foo.number + 1

  override def toString() = {
    "Foo #" + id + " is " + a
  }
}

def createList(f : Creatable[Foo]) = {
  List(f.create("Three"), f.create("Four"))
}

println(Foo.create("One"))
println(Foo.create("Two"))
println(createList(Foo))
println(Foo.createDefault())

----------

Foo #1 is One
Foo #2 is Two
List(Foo #3 is Three, Foo #4 is Four)
Foo #5 is Default

Why are singleton objects better than static members? To begin with, Scala's singleton objects are at least as expressive as static class members, so you're not losing anything from Java. You define a singleton object differently than you define static members in Java, but you access them using notation identical to Java (i.e. Foo.bar(5) in both languages). In addition, you get some other nice features - first class object status and the ability to participate in the normal class hierarchy. As an added bonus, Scala's simpler syntax actually made the class/singleton-object pair shorter than the equivalent Java solution. Not bad!
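
For contrast, here is one way to approximate the Creatable example in Java. This is only a sketch, and newer than the Java shown above since it relies on default methods and method references: static methods still cannot implement an interface, so a method reference serves as a small adapter object. CreatableSketch is just a wrapper name for the example.

```java
import java.util.List;

class CreatableSketch {
    // One abstract method, so a method reference or lambda can implement it.
    interface Creatable<A> {
        A create(String a);

        default A createDefault() {
            return create("Default");
        }
    }

    static final class Foo {
        private static int number = 1;

        private final int id = number++;
        private final String a;

        Foo(String a) {
            this.a = a;
        }

        // Still a static method; it cannot be declared to implement Creatable.
        static Foo create(String a) {
            return new Foo(a);
        }

        @Override
        public String toString() {
            return "Foo #" + id + " is " + a;
        }
    }

    static <A> List<A> createList(Creatable<A> f) {
        return List.of(f.create("Three"), f.create("Four"));
    }

    public static void main(String[] args) {
        Creatable<Foo> foos = Foo::create; // adapter object wrapping the static method
        System.out.println(createList(foos));
        System.out.println(foos.createDefault());
    }
}
```

The adapter carries only the one method it wraps. The Scala companion object, by comparison, is the whole set of "static" members as a single value.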

Tuesday, September 02, 2008

On Architecture

The other day, a coworker told me that they want to improve their architectural skills. That brought up a number of questions for me. What is architecture and how does it differ from programming? What are traits of a good architecture? What about a good architect? Am I a good architect?

"Traits of a good architecture" is something that gets covered in a lot of CS textbooks: extensibility, security, resilience, and so on. However, concentrating on those traits won't make you a better architect.

I don't know what will improve your architectural skills, but I do know what I think about when I'm working on architecture. I don't necessarily architect by reason. I have "good code" values, and I evaluate all the code that I write or encounter against them. For example, I think that exceptions (if available) should be used to signal the presence of and to recover from erroneous conditions. Status codes are straight out, and "fixups"[1] are a bad idea, too. Only in rare performance-critical areas are other mechanisms appropriate.

Where did I get my "good code" values? I've built them up from my personal and professional experience. I've been writing toy programs for 18 years. A lot of that is useless today, but some of those experiences have taught me valuable lessons. Just like a master carpenter didn't acquire his skills overnight, a software architect needs to develop his skills over a period of time. Trying, failing, reading, watching, discussing, and reflecting are all useful for developing this sense.

Just like other developers, I like to program. In fact, for an architect, it's essential to get one's hands dirty. However, it's also important to know when to step back and look at the big picture. For me, it's all about relying on that design "sense". When people talk about code smells, it's because something smells rank to their design sense. On the other hand, it's important not to get paralyzed by that sense. If you're unsure what direction you should be heading, it might be time to try something. Commit your current work (or, if using Git or Mercurial, create a new branch) and try going down a path. Keep your eyes open along the way, and learn what works and what does not. If you have time, try the other path. It might also make sense to start a new "toy" project to try some ideas without being burdened by the current code base. Also, be ready to revert all of your changes. Most of the work in programming is wrapped up in the thinking. Once you know how to do something, it should be reasonably simple to reproduce your work[2].

Finally, I'm a big fan of deciding on a direction and pursuing it. Sometimes, you don't know what to do, and it's too much work to mock up or try out even one approach. In this case, I ask myself how I would want it to work, and I try to do that. I ignore technical constraints unless they are insurmountable. Blindly following a path because it is "right" can get you into a lot of trouble, but so can meandering about without any goal in sight.

What do other people think? What is architecture to you, and how do you improve those skills?

[1] In this case, I'm talking about code which detects invalid input or state (which is a good thing), but then changes the input or state to be valid. An example would be a function with a parameter whose value needs to be in the range [1, 1000], but which will assume 0 for any invalid value. This makes the function always terminate normally (after all, there are no longer any invalid inputs), but it doesn't necessarily ensure correctness of the program. Who's to say that 24124142 and 0 are equivalent, anyway? Exceptions are better because they allow the caller, not the callee, to determine the result in the case of an invalid condition.

[2] If, on the other hand, most of your programming effort is spent on pushing keys on a keyboard, you're in a difficult position. Programming is primarily a thinking man's game, not a typing man's game. Your goal should not be to generate the most code in a day, but rather to generate the most value per line of code. Partially, then, your goal should be to keep the number of lines of code to a minimum. Removing code while retaining functionality, flexibility, and performance is always a good thing.
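
To make footnote [1] concrete, here is a small sketch of the two styles side by side. The method names fixup and require are mine, and the range [1, 1000] is the one from the footnote.

```java
class RangeCheckSketch {
    // The "fixup" style: silently replaces any invalid value with 0.
    static int fixup(int value) {
        return (value >= 1 && value <= 1000) ? value : 0;
    }

    // The exception style: rejects invalid input and lets the caller decide what to do.
    static int require(int value) {
        if (value < 1 || value > 1000) {
            throw new IllegalArgumentException("value must be in [1, 1000], got " + value);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(fixup(24124142)); // prints 0, and the bad input is forgotten
        try {
            require(24124142);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With fixup, the caller cannot tell a legitimate result from a patched-over mistake; with require, the caller is the one who chooses the fallback.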

No More Statics!

As I read more about Scala, I'm running across a lot of things that I like. In Scala, there are no static members: no static methods; no static fields. Instead, Scala has so-called "singleton" objects. These singleton objects are globally accessible, though their instance methods and fields are still subject to access restriction. This is great because it exposes what we all knew all along: that static fields and methods in Java are really just global variables and functions. Granted, they are access-controlled, namespaced globals, but they're still globals.

Since each class' singleton object is in fact an object, it can subclass another object or mix in traits, just like objects that are spawned by a class. The singleton object has the same rights as any other object in the system.

In addition, a singleton object can share a name with a class; if it does so, they can access each other's private data. I'm not sure yet, but I assume that this is how Scala accesses static members of Java classes - it creates a singleton object that doesn't derive or mix in anything, but turns all the static methods and fields of the Java class into instance members of the singleton object.

Sunday, August 31, 2008

Terseness for Terseness' Sake

I've been reading up on Scala, since it seems like it may be a better Java than Java itself. As I was reading through the pre-release PDF of Programming in Scala, I came across something goofy.

Scala, like (as I understand it) F#, tries to nestle itself comfortably between the functional and imperative camps. It has a syntax that supports both schools of thought. So, as you might expect, some functions in Scala will behave nicely and will return a value without any side effects. Other functions will be executed solely for the side effects, and will return nothing (the Unit type in Scala). To further the Functional mindset, Scala does not require an explicit return statement at the end of a function. Instead, the last value in the function is used as the value of the function. Programming in Scala is quick to point out that, if you want, you can just as easily use explicit return statements (if that floats your boat).

The functional and imperative worlds collide in a shower of fireworks. From Programming in Scala:

One puzzler to watch out for is that whenever you leave off the equals sign before the body of a function, its result type will definitely be Unit. This is true no matter what the body contains, because the Scala compiler can convert any type to Unit. For example, if the last result of a method is a String, but the method’s result type is declared to be Unit, the String will be converted to Unit and its value lost.

The book then goes on to provide an example where a function's value is accidentally lost.

Now, I'm all for shortening my programs. The less I have to type, the better. This is, in fact, one of the big advantages Scala has over Java. But wait just a minute! I thought that our compilers were supposed to help us, not trip us up! Here's a situation where 2 different things (a function's result type and the function's return statement) are optional. If they are not specified, they are inferred. In that case, the only difference between retaining and losing your return value is a single character - a '='.

To get all concrete, here are a pair of Scala programs that do different things.

package org.balefrost.demo

object Sample {
  def foo {
    "bar"
  }
  
  def main(args : Array[String]) : Unit = {
    val baz = foo
    println(baz)
  }
}

=> ()

package org.balefrost.demo

object Sample {
  def foo = {
    "bar"
  }
  
  def main(args : Array[String]) : Unit = {
    val baz = foo
    println(baz)
  }
}

=> "bar"

I don't know. To me, that's goofy. Other people might find it completely reasonable. Of course, you can protect yourself with explicit types.

package org.balefrost.demo

object Sample {
  def foo {
    "bar"
  }
  
  def main(args : Array[String]) : Unit = {
    val baz:String = foo  //compiler error: can't assign Unit to String
    println(baz)
  }
}

Anyway, kudos to Programming in Scala for pointing out the potential cause of a hair-yankingly-frustrating bug. Now that I understand what's going on, I will probably be better able to handle it when it comes up in a real program.

Thursday, August 14, 2008

Should I Squash Underhanded Corporate Comments or Let Them Live?

At this point, my Memeo Autosync post has gotten a few comments that clearly originate from somebody who works for (or otherwise has a stake in) Memeo. On one hand, I really dislike this corporate intrusion in an otherwise pristine blog. They have masqueraded as a genuine user, which is misleading and underhanded. On the other hand, it appears that they have offered a discount on Memeo software.

What do other bloggers do with these situations? Do they squash comments that are subversive like this? Do they just allow them, realizing that blog readers are intelligent individuals and will notice the obvious deception? What do you think?

Why You Can Throw Away "If", and Why You Shouldn't

Introduction

Most reasonably experienced object-oriented programmers have probably stumbled upon the same realization; namely, that it's possible to replace if statements with polymorphism. Polymorphism is simply a way to delay a decision until runtime. The if statement does the same thing. In fact, procedural programmers need to resort to things like if and switch statements because they have no other tool. Functional programmers, on the other hand, simply toss functions around willy nilly.

This realization can be powerful. It can also really hurt a code base (I know - I've smashed my share of algorithmic china with this hammer). I recently ran into a place on a project where it was a great idea, and I thought I would share why I thought it worked so well.

Current Implementation

Suppose you have 2 methods:

public void storeNewItem() {
    Item item = new Item();
    item.name = request["name"];
    item.description = request["description"];
    item.quantity = request["quantity"];
    item.value = someComplexCalculation();
    item.totalValue = item.quantity * item.value;
    // calculate and store some more fields here
    items.addNewItem(item);
}

public void storeExistingItem() {
    Item item = items.get(request["itemId"]);
    item.name = request["name"];
    item.description = request["description"];
    item.quantity = request["quantity"];
    item.value = someComplexCalculation();
    item.totalValue = item.quantity * item.value;
    // calculate and store some more fields here
    item.update();
}

These two functions should look pretty similar. In fact, they are nearly identical. Both acquire an item, populate it with data, and then store it. The only difference is the way that the item is acquired and the way that the item is stored.

First Attempt

I wanted to merge these methods, and this was my first attempt.

public void storeItem() {
    Item item;
    if (request["itemId"] == null) {
        item = new Item();
    } else {
        item = items.get(request["itemId"]);
    }

    item.name = request["name"];
    item.description = request["description"];
    item.quantity = request["quantity"];
    item.value = someComplexCalculation();
    item.totalValue = item.quantity * item.value;
    // calculate and store some more fields here

    if (request["itemId"] == null) {
        items.addNewItem(item);
    } else {
        item.update();
    }
}

This works, but is obvious crap. I found myself saying "I wish Item were able to handle those details by itself".

Second Attempt

Well, I wasn't brave enough to change Item, so I instead wrapped it.

public void storeItem() {
    Persister persister;
    if (request["itemId"] == null) {
        persister = new NewItemPersister();
    } else {
        persister = new ExistingItemPersister(request["itemId"]);
    }

    Item item = persister.getItem();
    item.name = request["name"];
    item.description = request["description"];
    item.quantity = request["quantity"];
    item.value = someComplexCalculation();
    item.totalValue = item.quantity * item.value;
    // calculate and store some more fields here
    persister.persist();
}

interface Persister {
    Item getItem();
    void persist();
}

class NewItemPersister implements Persister {
    private Item item = new Item();
    
    public Item getItem() { return item; }
    
    public void persist() { items.addNewItem(item); }
}

class ExistingItemPersister implements Persister {
    private Item item;
    
    public ExistingItemPersister(String itemId) {
        item = items.get(itemId);
    }
    
    public Item getItem() { return item; }
    
    public void persist() { item.update(); }
}

We still have an ugly if at the top of the function, and we have certainly ballooned the code. I still think that this is better than what we started with.

There is less duplication, which will make maintenance here much easier.
The Persister interface could be made into a generic class, and the implementations could be re-used all over the system. Some reflection here could really simplify your life.
A good web framework would allow you to remove that pesky initial if statement. In a less good framework, you could hide this behind some sort of object that knows how to generate a persister from an itemId (or null).

The practical upshot is that these changes should make it easier to apply metaprogramming techniques to this chunk of code. The only code that can't really be made declarative is some of the code which assigns values to fields.

There is one thing that bothers me, though. We have made the Persister implementors responsible for the lifetime of the Item. That's not at all clear from the interface, but it is obvious from the use. The tell-tale sign is that we have a getItem() method. Getters that expose their class' internals like this are evil, and if you don't believe me, you're just plain wrong. I won't try to justify that statement in this post, but trust me.

Third Attempt

To solve this, we could change the interface yet again (and I will switch to Javascript, because Java doesn't have any convenient lambda syntax).

function storeItem() {
    if (request["itemId"] == null) {
        var persister = newItemPersister;
    } else {
        var persister = new ExistingItemPersister(request["itemId"]);
    }
    
    persister.update(function(item) {
        item.name = request["name"];
        item.description = request["description"];
        item.quantity = request["quantity"];
        item.value = someComplexCalculation();
        item.totalValue = item.quantity * item.value;
        // calculate and store some more fields here
    });
}

var newItemPersister = {
    update:function(f) {
        var item = new Item();
        f(item);
        items.addNewItem(item);
    }
}

function ExistingItemPersister(itemId) {
    this.itemId = itemId;
}

ExistingItemPersister.prototype.update = function(f) {
    var item = items.get(this.itemId);
    f(item);
    item.update();
}

Now, the item's lifetime is only as long as a call to update() is on the stack. This is a common idiom in Ruby, as well.

Conclusion

In the end, I wasn't completely happy with any of these solutions. I think that things are better than they were before. There are also a number of other permutations that will get it marginally closer to ideal. I think that the real solution is to update Item so that you can create a new item and save an existing item with a single method. After that, the code to choose whether to create a new object or fetch an existing object should be minimal and extractable.
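
As a rough sketch of that idea (in Javascript, and assuming a hypothetical in-memory store in place of the real items collection), an Item could carry a single save() method that decides for itself whether to insert or update. The names store, Item.findOrCreate, and save are made up for illustration; none of them come from the code base described above.

```javascript
// Hypothetical in-memory store standing in for the real `items` collection.
var store = {};
var nextId = 1;

function Item(id) {
    this.id = id || null;   // null means "not stored yet"
}

// One method for both cases: the item knows whether it is new.
Item.prototype.save = function() {
    if (this.id === null) {
        this.id = String(nextId++);
    }
    store[this.id] = this;
    return this;
};

// The only remaining decision: fetch an existing item or make a new one.
Item.findOrCreate = function(itemId) {
    if (itemId == null) {
        return new Item();
    }
    if (!Object.prototype.hasOwnProperty.call(store, itemId)) {
        throw new Error("No item with id " + itemId);
    }
    return store[itemId];
};

function storeItem(request) {
    var item = Item.findOrCreate(request["itemId"]);
    item.name = request["name"];
    item.description = request["description"];
    item.quantity = request["quantity"];
    return item.save();
}
```

With something like this, storeItem no longer cares which case it is in, and the remaining if lives in one small function that could be shared by every handler.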

I did learn a rule of thumb for deciding when to replace an if statement with polymorphism. If you find yourself saying "I don't want to deal with this, I wish it were handled for me," there's a good chance that you could benefit from some polymorphism. Also, if you find yourself checking the same condition multiple times in a function (as we had in the original implementation), you might want to consider whether polymorphism will help you.

Thursday, August 07, 2008

Experiments in Firmware Hacking

I got a new ethernet-ready printer today, and wanted to add it to my existing wireless network. This is not the usual home wireless network use case - most people want to share an upstream connection with a bunch of wireless clients. I wanted to connect a wired device to a wireless network. I first tried using a spare Airport Express. That worked perfectly. Then, I decided to try getting my Linksys WRT54G v2 to work. When I realized that the stock firmware was definitely not up to the task, I grabbed Tomato. It claims to be solid, fast, and AJAX-y with realtime, SVG charts. How could I resist? However, the settings that I needed weren't obvious at first. After some fiddling, I think I've made it work. I'll share the settings here in case they're useful to somebody.

The most important setting is Wireless Mode (under Basic/Network). Here's my current best understanding of these modes:

Access Point	This is what the Linksys router would do with the default firmware. It allows wireless clients to connect to it in infrastructure mode, and will route packets between the wireless network, the LAN network, and the WAN network (with NAT).
Access Point + WDS	I think this may work like the Airport Express' WDS Remote mode. That would mean that it can accept wireless clients and simultaneously connect to a WDS network.
Wireless Client	This mode appeared to work like the Wireless Ethernet Bridge mode, except with NAT. It appeared that the router can run a DHCP server on the LAN interfaces. It also requires that the WAN port be configured, which seems very strange to me.
Wireless Ethernet Bridge	This is the one that ended up doing what I need. As far as I can tell, the WAN port is disabled. The device connects to an existing wireless network. It will then route packets between the wired and wireless network without NAT. Furthermore, contrary to other reports, it appears that you can connect devices to more than one of the LAN ports. I had both my printer and my laptop connected to LAN ports, and things still seemed to be working.
WDS	I think this may be similar to the Airport Express' "WDS Relay" mode.

It might be fun to pick up a WRT54GL. (The 54G has been simplified and will no longer work with most custom firmware. The 54GL restores the missing features. It appears to have been created specifically so that people can continue to use alternative firmware.)

Sunday, August 03, 2008

Date Formatting in Javascript

Problem

I found myself doing some stuff in Javascript. In particular, I needed to be able to turn a Date into HTML specific to the application that I'm working on. Let's say that the server sends us a task's creation date. We need to format it:

var fromServer = new Date("Sun Aug 03 2008 23:12:52 GMT-0400 (EDT)");

td.innerHTML = format_date(fromServer);

<td>
    <span class="date">2008-08-03</span><span class="time">11:12 PM</span>
</td>

However, I have no guarantee that it will always be a parseable date. In fact, the server sends '-' for dates that don't exist. We still need to output something.

var fromServer = "-";

td.innerHTML = format_date(fromServer);

<td>-</td>

I would like format_date to accept either a Date that should be formatted, or a string that should be passed along verbatim (with only necessary HTML character entity escaping). How can we do this in an object-oriented fashion?

Attempt 1

The default object-oriented mindset would encourage us to use polymorphism. We have an object, and we want to be able to call a method on that object. Well, we have a Date object.

Since this is Javascript, we could stick an extra method onto Date.prototype that would let us do this. While we're at it, we can put a similar function onto String.prototype:

Date.prototype.formatAsAppSpecificHTML = function() {
    return "<span>" + this.getFullYear() + ... + "</span><span>" + this.getHours() ... + "</span>";
}

String.prototype.formatAsAppSpecificHTML = function() {
    return this.toString();   // `this` may be a String wrapper object, so hand back a primitive string
}

function format_date(o) {
    return o.formatAsAppSpecificHTML();
}

There are some problems with this.

We have an ugly, ugly method name. This is because we're mixing abstractions. A Date, in general, shouldn't know how to format itself in this way. Why not? Because it probably doesn't apply to most usages of Date. It may be a common behavior in my application, but it's a nonsensical behavior in your application. Since the method is very context-specific, the name has to be equally specific.
This only works in an "open" language that lets us add methods to an existing class/prototype (Ruby and Lua (and arguably C#) fall into this camp, Java and C++ do not). Even if your language has the necessary support, you still have to wonder whether it's a good idea to handle stuff this magical.
It's not obvious. You don't normally connect Date and String. You don't expect to see methods shared between them. They are very orthogonal primitives. Yet we've tied them together in an unnatural way. In order for somebody to discover this, they need to think to look in two (potentially distant) places in the code.

Attempt 2

Polymorphism is a form of runtime decision making. Rather than use language constructs (such as 'if' and '? :'), polymorphism leverages the power of pointers. Since polymorphism created some problems in this example, what if we switch to a more traditional (i.e. not object-oriented) solution?

function format_date(o) {
    if (o.constructor === Date) {
        return "<span>" + o.getFullYear() + ... + "</span><span>" + o.getHours() ... + "</span>";
    } else {
        return o.toString();
    }
}

This approach also creates some problems. We've simply moved the complexity further up the ladder. Before, the knowledge that a task may or may not have a creation date was strewn across 2 types: Date and String. Looking at either type in isolation, you only see half of the picture. You might not realize that you can take anything that comes from the server and format it correctly. Now, however, that knowledge is pushed in your face. We're making explicit decisions about concrete types wherever we need to. It's easy to get it right once. So far, we're only handling formatting. What if we also want to draw a timeline? What if we want to find the earliest task in a list of tasks? What if we want to relate a task to source control submissions that occurred while the task was active? In all of these cases, you will need to deal with the fact that a task might or might not have a starting date. At some point, somebody's going to forget that they need to check this, and there's going to be a bug.

Attempt 3

If object-oriented didn't work, and procedural didn't work, what are we going to do? Well, actually, I lied. The first attempt used one form of object-oriented abstraction. There are many more. Both of the attempts so far have suffered from primitive obsession. They dealt with both Strings and Dates. In actuality, we don't want to concern ourselves with either of these. We actually have something different - we have an OptionalDate. An OptionalDate knows whether it represents an actual date or whether it represents no date at all. It can format itself correctly in either case, and can be compared to other OptionalDates for sorting purposes. In fact, OptionalDate handles any operation that needs to work with both actual dates and "not dates".

function format_date(o) {
    return o.format();
}

function OptionalDate(d) {
    this.d = d;
}

OptionalDate.prototype.format = function() {
    if (this.d) {
        return "<span>" + this.d.getFullYear() + ... + "</span><span>" + this.d.getHours() ... + "</span>";
    } else {
        return "-";
    }
};

What makes this a better solution? After all, the code looks very similar to the code in attempt 2.

It localizes the code better than attempt 2. Rather than checking for the presence of a date all throughout your code base, you can collect all those if/else statements in one place. You also get the chance for some pretty cool higher-order programming, where OptionalDate has a method that takes a function to be called if the OptionalDate actually has a date.
It also gives us a better place to hang domain-specific code. Hey, business logic has to live somewhere. It never seems right to put it on the primitive objects, and it also doesn't make sense to put it at the highest level of abstraction. Business logic is the foundation upon which you build an application. As a foundation, it needs a separate place to live.
It makes more sense. When a new programmer is brought onto the team, they will be able to better understand just what is going on. This is extremely important. I believe that, if a person ever has a question about the code, it's a good sign that the code should change. That doesn't mean that you actually take the time to refactor the code, but it's a sign that this is a place that could use some attention.
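
To make the first point a little more concrete, here is a minimal sketch of what those extra operations might look like. The method names ifPresent and compareTo, the earliest helper, and the decision to sort "no date" after every real date are all my own assumptions for illustration, not something the application already has.

```javascript
function OptionalDate(d) {
    // Anything that is not a valid Date (e.g. the server's "-") counts as "no date".
    this.d = (d instanceof Date && !isNaN(d.getTime())) ? d : null;
}

// Calls f with the underlying Date if there is one; otherwise returns fallback.
OptionalDate.prototype.ifPresent = function(f, fallback) {
    return this.d ? f(this.d) : fallback;
};

// Orders real dates chronologically. "No date" sorts after every real date,
// which is an arbitrary choice made for this sketch.
OptionalDate.prototype.compareTo = function(other) {
    if (!this.d && !other.d) { return 0; }
    if (!this.d) { return 1; }
    if (!other.d) { return -1; }
    return this.d.getTime() - other.d.getTime();
};

// Example use: find the earliest of a list of OptionalDates.
function earliest(dates) {
    return dates.slice().sort(function(a, b) { return a.compareTo(b); })[0];
}
```

The if/else checks still exist, but they all sit inside OptionalDate. Callers such as earliest never have to ask whether a date is really there.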

My Solution

In the end, I went with something close to Attempt 2. This was actually my first choice; Attempt 1 was purely synthetic. I'm working with a legacy code base, and I'm a little wary about introducing big changes just yet. I'm also very conscious about time. In any case, this is an improvement over what was there before (it just treated the date/time as an opaque string, which wouldn't work at all for my requirements).
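
For reference, a filled-in version of Attempt 2 might look roughly like the sketch below. This is not the code I actually committed. The helpers escapeHtml and pad2 are hypothetical, and the 12-hour time format is inferred from the single example in the Problem section.

```javascript
// Escape the characters that are special in HTML.
function escapeHtml(s) {
    return String(s)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;");
}

// Left-pad a number to two digits.
function pad2(n) {
    return (n < 10 ? "0" : "") + n;
}

function format_date(o) {
    if (o instanceof Date) {
        var hours = o.getHours();
        var date = o.getFullYear() + "-" + pad2(o.getMonth() + 1) + "-" + pad2(o.getDate());
        // 0 and 12 both display as 12 on a 12-hour clock.
        var time = (hours % 12 || 12) + ":" + pad2(o.getMinutes()) + " " + (hours < 12 ? "AM" : "PM");
        return '<span class="date">' + date + '</span><span class="time">' + time + '</span>';
    }
    // Not a Date: pass the value through, escaping only what HTML requires.
    return escapeHtml(o);
}
```

Note that the Date getters used here report local time, so the output depends on the time zone of the browser that runs it.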

Conclusion

There are definitely some problems for which the object-oriented noun/verb paradigm breaks down. Or, perhaps stated more precisely, there are problems where that paradigm confuses more than it helps. However, that isn't true in this case. We were able to introduce some good refactorings even while staying true to the spirit of good design.

You may wonder why I care so much. I mean, any of the attempts would have solved the problem perfectly well. Why spend time even thinking about it? I believe that pragmatism is an important trait in programmers, but so too is learning. Whenever you start working on a problem, you need to choose a direction to pursue. Until you start walking, you'll never make progress. You may find yourself at a dead-end, but you wouldn't have known if you hadn't gone that way. My goal is to develop a strong enough design sense that the path that I choose with little thought tends to be one that will work out in the long run. Programming is both tactical and strategic. Most programmers develop their tactical skill as a natural part of writing code. I'm trying to sharpen my strategic skill.

Blogger Timestamping

Interesting thing about Blogger - it looks like the "Posted by... at..." clause uses the time the post was started, not the time that you eventually push the big Publish button.

F is for Fail

It's really sad to see code copied-and-pasted. I feel like a teacher who is grading a test, only to find that two kids have exactly the same answers. It's even sadder to see copied-and-pasted comments. That's like seeing that two kids have exactly the same answers, and both kids' answers are wrong.

Monday, July 21, 2008

Object Creation in Ext JS

Ext JS has a useful object creation pattern. Most constructors can be passed a hash of configuration parameters. This can be used instead of the prototype creation pattern (not to be confused with Javascript's prototypical inheritance). Rather than create new objects that are copies of an existing object, you create objects based on a set of configuration data.

You want an example? OK. Today, I was creating context menus for items in a tree. There is a global pool of possible actions, and each node responds to a different subset of them. When the user right-clicks on a node, I need to

Create a Menu instance
Add all of the appropriate menu items to it
Show the menu

I had hoped that I could create one menu item instance per possible action (20 or so), and then re-use them in different Menu instances.

var actionMenuMap = {
    addChild: new Ext.menu.Item({text:"Add child", icon:"add.png"}),
    delete: new Ext.menu.Item({text:"Delete", icon:"delete.png"}),
    fireZeMissiles: new Ext.menu.Item({text:"Fire ze Missiles!", icon:"fire.png"})
}

var nodeActions = ["addChild", "delete"];   //in practice, this would come from the node itself

//dangerous nesting ahead!
new Ext.menu.Menu({
    items: nodeActions.map(function(a) {
        return actionMenuMap[a];
    })
}).showAt(event.getXY());

That didn't work - it seemed like I couldn't share a menu item instance between menu instances. I might be able to get it to work by removing menu items after the menu is dismissed, but I don't actually need to. I can simply hold on to the configuration information.

var actionMenuMap = {
    addChild: {text:"Add child", icon:"add.png"},
    delete: {text:"Delete", icon:"delete.png"},
    fireZeMissiles: {text:"Fire ze Missiles!", icon:"fire.png"}
}

var nodeActions = ["addChild", "delete"];   //in practice, this would come from the node itself

//now we're cooking with functional programming!
new Ext.menu.Menu({
    items: nodeActions.map(function(a) {
        return new Ext.menu.Item(actionMenuMap[a]);
    })
}).showAt(event.getXY());

In case you are not familiar, map is a function that is present in every functional language and many dynamic languages. Not every browser provides it natively in Javascript, but Prototype, jQuery, dojo, Mochikit, and probably every other Javascript framework supply some version of it. Here's a sample implementation for reference:

Array.prototype.map = function(f) {
    if (typeof(f) !== "function") {
        throw new Error("map takes a function");
    }
    
    var result = new Array(this.length);
    for (var i = 0; i < this.length; ++i) {
        result[i] = f(this[i]);
    }
    return result;
}

What could possibly make this better? In addition to taking a hash, allow the constructor to take a function. That function manipulates the object after the rest of the construction runs, allowing you to add children or manipulate settings or calculate values. This would be pure icing, of course. Factory functions also feel a lot more lightweight to me than factory objects. See also Rails' version of the K combinator.
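
Here is a small sketch of what I mean. Widget is a hypothetical stand-in class, not an Ext JS API, and the second constructor argument is exactly the wished-for feature, so none of this exists in the library today.

```javascript
// Hypothetical stand-in for a component class; not part of Ext JS.
function Widget(config, init) {
    config = config || {};
    this.text = config.text || "";
    this.icon = config.icon || null;
    this.children = [];

    // Run the optional callback last, once the object is fully configured.
    if (typeof init === "function") {
        init.call(this, this);
    }
}

Widget.prototype.add = function(child) {
    this.children.push(child);
    return this;
};

// The hash covers the declarative part; the function covers the rest.
var menu = new Widget({text: "Actions"}, function(m) {
    m.add(new Widget({text: "Add child", icon: "add.png"}));
    m.add(new Widget({text: "Delete", icon: "delete.png"}));
});
```

The callback receives the new object both as this and as its first argument, so it works equally well with a named function or an inline one.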

Saturday, July 12, 2008

Please encode the patterns that you see

As my software career has advanced, I've steadily become more cynical and bitter. This isn't good, and it's something that I'm trying to work on. It's hard to go a day without having one of those "you've got to be kidding me" moments. "I can't believe they did it like that." "What were they thinking?" "Hasn't anybody ever tried to do this before?" "There has to be an easier way."

Case in point: I recently read that SWFObject was THE way to embed swf files in your HTML. Fine, I thought. I'll give it a try. Within minutes, I ran into something simple that it couldn't do. When you call the function that embeds the swf, it might not actually do the operation immediately - it might wait until the page initialization has progressed beyond a certain point, and then inject the code. I needed to run some code AFTER the swf had been embedded in the page (for sure). There's no way that I could see to do this with the stock SWFObject. Despite the fact that SWFObject inherently depends on event dispatch to work, it doesn't dispatch enough of its own events to be useful to me.

As I started to look at the code to see if I could work it in, I stumbled across the following snippet:

var att = {};
if (attObj && typeof attObj === OBJECT) {
    for (var i in attObj) {
        if (attObj[i] != Object.prototype[i]) { // Filter out prototype additions from other potential libraries
            att[i] = attObj[i];
        }
    }
}

"Wow!" I thought. "That looks like it would be useful to lots of people." As it turns out, it is very useful - even to the SWFObject developers, who used the same pattern twice more in the same 58-line function:

var par = {}; 
if (parObj && typeof parObj === OBJECT) {
    for (var j in parObj) {
        if (parObj[j] != Object.prototype[j]) { // Filter out prototype additions from other potential libraries
            par[j] = parObj[j];
        }
    }
}
if (flashvarsObj && typeof flashvarsObj === OBJECT) {
    for (var k in flashvarsObj) {
        if (flashvarsObj[k] != Object.prototype[k]) { // Filter out prototype additions from other potential libraries
            if (typeof par.flashvars != UNDEF) {
                par.flashvars += "&" + k + "=" + flashvarsObj[k];
            }
            else {
                par.flashvars = k + "=" + flashvarsObj[k];
            }
        }
    }
}

That general pattern appears at least 9 times in the code. It's a good thing they put it in a reusable functio... oh wait, they didn't. Instead, they dutifully pasted the code (comment and all) 8 times.

OK, something's going on. Maybe I've become a much better programmer in the past year; maybe everybody else is getting worse; maybe I'm off my rocker; maybe there's some strange performance or compatibility tweak that prevented them from doing what seems plainly obvious to me. Removing the obvious duplication would have simplified their code, making it more readable (and possibly obviating the need for the accompanying comment).
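
To show what I have in mind, here is one way the pattern could be pulled into a helper. This is only a sketch: eachEntry, copyEntries, and toQueryString are names I made up, and toQueryString is a simplification because it ignores the case where par.flashvars already holds a value.

```javascript
// Hypothetical helper: calls f(key, value) for each entry of obj,
// using the same filter as the snippets above to skip prototype
// additions from other libraries.
function eachEntry(obj, f) {
    if (obj && typeof obj === "object") {
        for (var key in obj) {
            if (obj[key] != Object.prototype[key]) {
                f(key, obj[key]);
            }
        }
    }
}

// Replaces the "copy into a fresh object" loops.
function copyEntries(obj) {
    var copy = {};
    eachEntry(obj, function(key, value) { copy[key] = value; });
    return copy;
}

// Replaces the "build a flashvars string" loop.
function toQueryString(obj) {
    var pairs = [];
    eachEntry(obj, function(key, value) { pairs.push(key + "=" + value); });
    return pairs.join("&");
}
```

The comment about filtering prototype additions now appears once, next to the one loop that needs it.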

It's the little things like this that make me wonder if we're actually moving the state-of-the-art forward at all. If we can't see (or don't care about) the duplication at the micro scale, how will we re-use software at any broader level?

Thursday, June 26, 2008

Planescape: Effortless

I was amazed to learn that Planescape: Torment, a game released in 1999, is still playable under Vista. I installed it the other night (intending to finish it this time), patched it (using the most recent official patch, which is probably 8 years old), and started it without a hitch. I'm sure some combination of Microsoft's fervent dedication to backwards compatibility, nVidia's seemingly rock-solid drivers, and Black Isle's crack team of developers made this possible.

Recently, Scott Hanselman interviewed Steven Frank of Panic (a Mac software developer). In the interview, it was revealed that Mac users are less worried about backwards compatibility. Apple has completely axed support for Classic mode, which pretty much kills any software more than 8 years old. As a Mac user, I agree. I'm not that worried about old applications not working. But that's because I'm talking about applications. If an old application stops working, there's a good chance that somebody has written a more recent, and probably better, replacement.

Games are a different beast, though. Games are closer to movies or novels. How would people feel if they couldn't read Les Misérables anymore? Or if somebody went and mucked with Star Wars, and then threatened to not publish the theatrical release anymore? Well, we know what happens then: thousands of fans get really really angry. People crave old stuff. Really great content is timeless. Games should be treated the same way. Ever since "game designer" has become an official title, games have been steadily becoming more than children's amusements. There's often some great narrative buried in there, wrapped up tight inside a husk of decaying code. Eventually, that code won't run on new computers, and the narrative will be lost forever.

Of course, now we have virtualization, so maybe it's not such a big problem.

On Succinctness

One of the reasons that I've been blogging has been to help me communicate succinctly. I've been trying to keep my posts small without being lifeless. I want to balance terseness with expressiveness. If you read this blog, how am I doing? What could I do better?

On Usability

I've been slowly learning that, in my heart, I'm a usability guy. I'm always concerned for the experience that end users will have with software. I care about making it easier for people to do what they want to do. When it comes to user interfaces, I like bling, but only when it helps the user accomplish a task or understand an interaction. I love Expose. It's such a simple but powerful concept. It's also made for a great demo. I hate Flip 3D. It's pretty, but also pretty pointless.

I've been interested in API design for a long time, too. As a developer, I have used some libraries that were an absolute joy, and other libraries that were a total mess. My recent foray into Flex (specifically, getting the Tree control to lazily load its data) reminded me what it's like to use a bad API.

I mentioned to a coworker the other day that, if you're developing a library, you should be at least as good a developer as the people that will end up using it. To me, it looked like Adobe (or Macromedia - I don't really know who's responsible) put all of their junior developers on the component library project. As a result, we're blessed with the interface IDataDescriptor, whose methods include isBranch() and hasChildren(). What's the difference? Should they ever be different? Who knows?!

I realize that I misspoke to my coworker. I should have said that people developing libraries should have a code aesthetic sense. I don't mean code aesthetics in terms of indentation or brace style or what have you. I mean that API designers need to consider the people that will end up using the library. Just as the hallmark of an Apple Design Award winner is that users "just get it", a good API should also be so easy to understand that it seems obvious to you. As in, "why would anybody make this API any different?"

Design is hard. It takes a conscious effort (well, for most of us). That's why it's important to do. If you're a developer, ask yourself: is this easy enough to understand? Remember, you're writing for an audience - your code does not exist for your eyes only. Even if it did, you're liable to forget everything before you come back to work on it again. If you're a manager, make sure there's enough space for developers to focus on this. If it looks like you're going to miss a deadline, that doesn't give you carte blanche to pressure developers out of finishing their work. Just because the code works doesn't mean it's done.

If you can't make an API grokkable, at least document it well. The standard Java and MSDN docs are full of nonsense.

setFoo(IFurple f) - sets foo to f

Really? I wouldn't have guessed. This tells me nothing. What is foo and why do I care? Why is foo a furple? If anybody ever asks a question about your API, it's a sign that something needs to change. Maybe you just need to add some documentation. Maybe you need a high-level wiki page explaining what the library is trying to do. Maybe you need to rename a method. Maybe you really do need to change the API. Or maybe you just need to smack the developer on the head and tell them to read the Javadocs first. But if you don't change anything, the problem is just going to repeat itself. Who knows - maybe you'll discover a bug or race condition along the way.