Thursday, June 18, 2015

A quick overview of what WebAssembly is and what it is not (yet)

There's been a lot of buzz today about WebAssembly, and that's completely understandable. A bytecode form of Javascript has been on the minds of many web developers for a long time. But the online commentary seems to have been based more on hopes than on released information. I hope to clear up some of that confusion. Note that I'm not involved in the project at all, so some of this information is likely incorrect; feel free to leave a comment if I've gotten something wrong.

WebAssembly isn't a spec yet. It's not even a draft spec. It's an idea and a proof-of-concept. So far, that's all that's in the public. But it has a lot of promise.

The WebAssembly roadmap essentially spells out three phases:

  1. A minimum viable product (MVP) that is roughly analogous to asm.js, specifically targeting C/C++ as the source language
  2. Additional features, such as threads, SIMD, and proper exception handling, all still with a focus on C/C++
  3. "Everything else", including features meant to support more languages. One such feature is access from WebAssembly code to objects on the garbage-collected Javascript heap. This could enable WebAssembly code to access the DOM and web APIs (which would not be supported in either of the previous iterations). This phase also contains support for things like large heaps, coroutines, tail-call optimization, mmapping files, etc. If and when we get to this phase, I would expect it to get further split.

There will also be a polyfill, since browsers will not initially support WebAssembly. In fact, there is already a polyfill, but it's more of a proof-of-concept than an actual implementation. There is no WebAssembly spec yet; this polyfill is merely to test the viability of a binary-encoded AST.

Initially, it's not wrong to think of WebAssembly as binary-encoded asm.js, though that will change over time. It may eventually grow to the point that it could replace Javascript, but in its first two incarnations, you will still need some JS code (to interact with the DOM and with web APIs). A likely use case is to compile low-level, algorithmic code (think image processing, compression, or encryption code) to WebAssembly, but to still write the bulk of your application in plain JS. If you're not familiar with asm.js, it might make sense to look into it; many of the same limitations of that environment also appear to apply to the first incarnation of WebAssembly.

However, even at this early stage, WebAssembly does specify some features that even asm.js doesn't provide. In particular, it talks about 64-bit integer operations, which asm.js can't natively provide. The current polyfill POC doesn't seem to support them, and they may choose for the polyfill to always implement them as floating-point operations, but an actual runtime would need to provide proper support for int64 (and other sizes as well, like int8 and int16).

WebAssembly only deals with the binary format and runtime environment; it doesn't provide any new APIs for dealing with the DOM or the network or anything like that. It may eventually provide alternatives to other web APIs (WebWorkers would be an obvious one, when WebAssembly eventually adds threading support).

The current plan for WebAssembly is to not use bytecode in the same way as JVM or CLR use bytecode. Those VMs represent their bytecode as an instruction stream, similar to instructions for non-virtual machines. WebAssembly, on the other hand, appears to be going more for a "serialized abstract syntax tree" form. The claim seems to be that this can be compressed more efficiently than can be achieved using general-purpose compression routines, and that doesn't seem too crazy. This distinction isn't important for web app developers, though it could mean that "disassembled" WebAssembly is easier to grok than disassembled JVM bytecode.

WebAssembly will have defined binary AND text forms. So, while you won't be able to curl a WebAssembly file and immediately understand it, there will be tooling to convert from the binary form to the text form (which will certainly eventually be built into browsers). It seems likely that the bytecode semantics will make concessions to retain some degree of human readability when converted to text form.

I really like this quote from Peter Kasting in the comments on Ars. Like, seriously, if you have an Ars account, go give him some fake internet points.

Notably, the people working on WebAssembly are the PNaCl (Google) and asm.js (Mozilla) teams. In some sense this can be considered the followon effort to those projects, meant to combine the best attributes from each, and in a way that can be agreed on by all browser vendors.

This is exactly how the system is supposed to work: individual teams try to advance the state of the art, and eventually all those lessons learned are incorporated into a new and better system. See e.g. SPDY -> HTTP2. WebAssembly draws on both the past work and the experience of all those involved, and wouldn't be what it is without them.

I say this partly so the sorts of people who have bemoaned "non-standard" vendor efforts in the past may have reason to pause next time they feel the urge to do so. No one wants a balkanized world forever, but that vendor-specific effort may gain the sort of real-world experience necessary to come back and design a great cross-vendor solution afterwards. All four major browser vendors have taken flak for this sort of thing in the past, and in my mind often unfairly so.

All told, it looks like this is just the first, small step. I'm a little surprised that they went public so early. Either they really plan to develop it in the open, or the tech media got wind of it and blew it well out of proportion. Whatever the case, this looks like it will be a large undertaking. It's very promising that everybody seems to be at the table. And given that the browser vendors appear to want to actively develop their products, we might actually be able to use this within a couple of years.

What a time to be alive!

Tuesday, June 16, 2015

The joys of parsing a toy programming language

In my personal time, I've been playing at building a toy programming language. Progress has been slow, but things are finally starting to come together. Last night, I ended up finding an interesting bug, and I wanted to document it here.

The syntax of my language isn't super complicated. Here's an example of a function:

def foo(a, b) = a + b

Because this language is a so-called modern language, it has to support lambdas as well. Here's an example:

val bar = (a, b) => a + b

You can call functions and lambdas using the same syntax:

foo(5, 7)
bar(5, 7)

Naturally, since lambdas are first-class, we need to support more complicated call syntax as well. In particular, this should be allowed:

((a, b) => a + b)(5, 7)

In order to support that, here's the section of the grammar that describes function calls (you can imagine the rest of the grammer):

FnCall    :=  Callable lparen Args rParen
Callable  :=  identifier
          :=  lParen Expression rParen
Args      :=  identifier MoreArgs
          :=  epsilon
MoreArgs  :=  comma identifier MoreArgs
          :=  epsilon

My compiler was able to successfully compile and run this extremely simple program:

val bar = (a, b) => a + b
bar(5, 7)

But imagine my surprise when it failed to compile this program:

val bar = (a, b) => a + b
(bar)(5, 7)

Looking at the parse tree, it was obvious what had happened. It turns out that my grammar was ambiguous, and the parser had chosen this interpretation:

val bar = (a, b) => a + (b(bar))(5, 7)

Rather than parsing this as a definition and a corresponding expression, it instead parsed it as a single definition. It assumed that b was a function of one parameter, which returned a function of two parameters. Even this simpler grammar has the same problem:

FnCall    :=  identifier lparen Args rParen
Args      :=  identifier MoreArgs
          :=  epsilon
MoreArgs  :=  comma identifier MoreArgs
          :=  epsilon

Given this program:

val bar = (a, b) => a + b
(5 + 7) * 2

This could (and would) get parsed as:

val bar = (a, b) => a + b(5 + 7) * 2

(It's worth noting that, if I had been using a shift / reduce parser, this would have been detected as a S/R conflict. But I'm not using a S/R parser; I'm using the Scala parser combinator library, mostly because it was easy to get started and I haven't outgrown it yet.)

Looking at Scala, from which I stole a lot of my syntax, I found that newline handling is a bit complicated, though the rules are laid out plainly in section 1.2 of the Scala language spec. Essentially, a newline can be treated either as plain whitespace or as a statement terminator, depending on the context. Within a parenthesized expression, newlines are always treated as whitespace. Outside an expression, newlines are treated as statement terminators if they appear between a token that could end a statement and a token that could begin a statement. One one hand, having written a fair amount of Scala, the rules feel pretty natural and intuitive. However, you do end up with strange behavior at the edges, as the following example demonstrates:

(succ
  (5))   => 6

{succ
  (5)}   => 5

On the other hand, Haskell (with which I'm admittedly not very familiar) appears to use indentation level to determine expression grouping. That is, while this is valid:

main = putStrLn "hi"

and this is also valid:

main = putStrLn 
 "hi"

this is not:

main = putStrLn 
"hi"

Context is not considered, so unlike Scala, this still fails:

main = (putStrLn 
"hi")

Both of these approaches have merit. Haskell's approach is natural enough, though there are some unambiguous representations that it would reject. Scala's approach is more permissive, at the cost of more complicated implementation rules and more surprising behavior. I can't tell which is more appropriate for my language.

For now, I took neither approach. I realized that, as far as I know, I have but one case where expressions are ambiguous - when the parameters to a function appear on a separate line from the Callable itself. It was possible for me to simply require that the whitespace between the Callable and its parameters not include any newlines. So an identifier at the end of a line will NEVER be considered to be a function call. I don't think this will remain this way forever, but I think it will get me unstuck.