GFX::Monk - Dealing with non-local control flow in deferred CoffeeScript

…a continuation (hah!) of defer: Taming asynchronous javascript with coffeescript.

In my last post, I outlined the ideas behind defer and its current state. One of the things I mentioned is that it is not yet decided how to deal with return statements in asynchronous code. I’d like to explore a few ideas for that here. If you haven’t read the previous post, I suggest that you do.

The problem

With real continuations, a continuation object represents the rest of a program’s execution. With defer, the compiler can only generate a continuation that includes the rest of the current function. This works just as well, but only as long as the function calls a callback with its result, instead of returning it. Such a rule may be acceptable (it’s still better than writing your own continuations), but it’s certainly worth exploring other possibilities.
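To make the "call a callback instead of returning" rule concrete, here is a rough JavaScript sketch of what goes wrong when an asynchronous function tries to return its result. The db object, its pending queue and flush() are hypothetical stand-ins I made up for an asynchronous db.get; they are not part of defer or of any real library.

```javascript
// Hypothetical stand-in for an asynchronous db.get: callbacks are queued
// and only run when flush() is called, mimicking "the result arrives later".
const pending = [];
const db = {
  get(id, callback) {
    pending.push(() => callback({ id: id }));
  },
};
function flush() {
  while (pending.length > 0) pending.shift()();
}

// Tries to return the result directly instead of calling a callback.
function get_document_returning(id) {
  let document;
  db.get(id, function (result) {
    document = result; // too late: the caller already has its return value
  });
  return document;
}

const returned = get_document_returning(7); // undefined, nothing has arrived yet
flush(); // the result arrives now, but nobody is listening for it
```

The continuation only covers the rest of get_document_returning, so the caller has already moved on by the time the result shows up.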

For reference, here is the sample synchronous function all the examples will be representing:

get_document: (id) ->
	document: db.get(id)
	return document

Here is the same function when db.get is asynchronous, using the currently-implemented defer feature:

get_document: (id, callback) ->
	document: defer db.get(id)
	return callback(document)

The return here is unnecessary, but included for completeness. If you were to return early from a function, you would have to write return callback(result) to ensure both that the result is passed on and that the function ceases execution.

In converting the synchronous case to the asynchronous one, two things were required:

- a callback argument was added to the end of the argument list
- the callback was invoked at every return site, in addition to the return itself
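For readers who think in JavaScript, the asynchronous version above behaves roughly like the hand-written callback code below. This is only a sketch of the idea, not the compiler's actual output, and the queued db stub and flush() are hypothetical helpers added so the example can run on its own.

```javascript
// Hypothetical stand-in for the asynchronous db.get used in the examples.
const pending = [];
const db = {
  get(id, callback) {
    pending.push(() => callback({ id: id }));
  },
};
function flush() {
  while (pending.length > 0) pending.shift()();
}

function get_document(id, callback) {
  // Everything after `defer db.get(id)` becomes the continuation function.
  db.get(id, function (document) {
    return callback(document);
  });
}

const results = [];
get_document(7, (doc) => results.push(doc));
const seen_before_flush = results.length; // 0: the callback has not run yet
flush();
// results now holds the one document, delivered through the callback
```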

This will be the base against which we compare the alternatives. The aim is to require as few changes as possible while maintaining flexibility and obviousness (since an async function must be called differently from a sync one, the programmer should know at a glance which is which).

proposal 1: infer everything

This was the initial approach. Basically, the programmer would not be required to add anything - merely use the defer keyword. If a function contains defer, then it becomes asynchronous. In order to support this, an extra (hidden) callback argument is added, and all return statements call this callback explicitly. For example, the following:

get_document: (id) ->
	document: defer db.get(id)
	return document

is almost identical to the sync code, with the addition of the defer keyword. The generated code would be equivalent to the following, written for the current implementation:

get_document: (id, _a) ->
	document: defer db.get(id)
	return _a(document)

where _a is a generated name known only to the CoffeeScript compiler.

advantages:

you cannot possibly forget to return into your callback - the compiler manages it for you
minimal changes are required to convert a function into asynchronous code - in fact, there is no change aside from the (necessary) defer statement
since the compiler knows which argument is the callback, it could insert a check at run-time to ensure that a callback was provided, and raise a useful error message while the stack is still active, so that you can see exactly what call failed to provide a callback. This lessens the severity of the first disadvantage below, as it will be caught the first time the code is run
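The run-time check mentioned in the last point could look something like the following JavaScript. This is a guess at what a compiler might emit, not anything that exists today; the error message, the queued db stub and flush() are all assumptions of mine.

```javascript
// Hypothetical stand-in for the asynchronous db.get used in the examples.
const pending = [];
const db = {
  get(id, callback) {
    pending.push(() => callback({ id: id }));
  },
};
function flush() {
  while (pending.length > 0) pending.shift()();
}

function get_document(id, _a) {
  // Check the compiler could insert, since it knows that _a is the callback.
  // It fails immediately, while the caller is still on the stack.
  if (typeof _a !== "function") {
    throw new TypeError("get_document: expected a callback as the last argument");
  }
  db.get(id, function (document) {
    return _a(document);
  });
}

let message = null;
try {
  get_document(7); // the caller forgot the callback
} catch (err) {
  message = err.message;
}
```

Because the check runs before db.get is called, the stack trace points at the offending call site rather than at some later point where the result arrives.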

disadvantages:

it’s hard to tell at a glance how many arguments the function takes - the function signature lists only one, and the only way of telling is to scan for the defer keyword anywhere inside the function body
the programmer cannot specify which argument should be the callback
the programmer cannot save the callback object for later use, since its name is known only to the compiler
accepting multiple callback arguments (one for success, one for error) is not possible

proposal 2: designated callback

Instead of automatically adding a callback, how about annotating the argument list? That way a caller can easily see that a callback argument is required, but a programmer does not need to alter return statements. For example, you could use the return modifier like so:

get_document: (id, return callback) ->
	document: defer db.get(id)
	return document

I think this strikes a good balance - to convert from sync to async, one simply uses defer (which is necessary), and adds a special argument to the function signature. The compiler is then responsible for converting all return statements into calls to the callback local variable.
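As a sketch of that conversion, suppose the function also had an early return (my own addition, purely for illustration). The compiler would rewrite both return sites into calls to callback. The JavaScript below is a guess at equivalent output; the queued db stub and flush() are hypothetical helpers.

```javascript
// Hypothetical stand-in for the asynchronous db.get used in the examples.
const pending = [];
const db = {
  get(id, callback) {
    pending.push(() => callback({ id: id }));
  },
};
function flush() {
  while (pending.length > 0) pending.shift()();
}

// Source being imagined (proposed syntax, early return added by me):
//   get_document: (id, return callback) ->
//     return null unless id?
//     document: defer db.get(id)
//     return document
function get_document(id, callback) {
  if (id == null) {
    return callback(null); // early return, rewritten to go via the callback
  }
  db.get(id, function (document) {
    return callback(document); // final return, rewritten the same way
  });
}

const results = [];
get_document(undefined, (doc) => results.push(doc)); // calls back with null
get_document(3, (doc) => results.push(doc));
flush();
```

Unlike proposal 1, callback is an ordinary named argument here, so the function body could still refer to it directly if it needed to.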

advantages:

minimal changes to convert a sync function (aside from the defer, only the function signature)
function signature accurately represents required arguments
the programmer cannot forget to return into the given callback, as the compiler will enforce it

disadvantages:

for the most part, callback will never actually be mentioned in the function body. That could be a little confusing
multiple callback parameters (error, ok) are still not supported

proposal 3: proposal two again, with an error callback

In order to support error and success callbacks, a modification could be made. Consider the following:

get_document: (id, throw error_callback, return callback) ->
	document: defer db.get(id)
	throw new Error("things are looking bad") unless document.validate()
	return document

This code would indicate that callback is the callback that should be used for returning values, but that error_callback should be used for exceptions raised within that function body. That is, it will act just like the following code:

get_document: (id, error_callback, callback) ->
	document: defer db.get(id)
	return error_callback(new Error("things are looking bad")) unless document.validate()
	return callback(document)

It’s important to note that this will only work for exceptions raised in the function body, as opposed to exceptions that bubble up from calls made in the body. Since this is a compiler-level transform, it has no knowledge of runtime exceptions - only exceptions that are thrown by the local syntax.
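The difference between a local throw and a bubbling exception can be sketched in JavaScript as follows. Everything here is hypothetical scaffolding: the queued db stub, flush(), and the "invalid" and "broken" ids exist only to trigger the two cases. I use Error because JavaScript has no built-in Exception.

```javascript
// Hypothetical db: documents carry a validate() that can fail or blow up.
const pending = [];
const db = {
  get(id, callback) {
    pending.push(() => callback({
      id: id,
      validate() {
        if (id === "broken") throw new Error("validate blew up"); // runtime exception
        return id !== "invalid";
      },
    }));
  },
};
function flush() {
  while (pending.length > 0) pending.shift()();
}

function get_document(id, error_callback, callback) {
  db.get(id, function (document) {
    if (!document.validate()) {
      // The local throw from the source is rewritten into this call.
      return error_callback(new Error("things are looking bad"));
    }
    return callback(document);
  });
}

const log = [];
const on_error = (err) => log.push("error: " + err.message);
const on_ok = (doc) => log.push("ok: " + doc.id);

get_document("good", on_error, on_ok);
get_document("invalid", on_error, on_ok);
flush(); // log is now ["ok: good", "error: things are looking bad"]

// An exception thrown inside validate() is not a local throw, so the
// transform never sees it and it escapes instead of reaching on_error.
let escaped = null;
get_document("broken", on_error, on_ok);
try {
  flush();
} catch (err) {
  escaped = err.message;
}
```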

If this were to be used, the omission of a throw callback in the argument list could cause exceptions to be returned into the return callback. I’m doubtful that this is a good idea, though: when no error callback is defined, exceptions should probably just be thrown immediately, up to the browser.

advantages:

minimal changes to convert a sync function (aside from the defer, only the function signature)
function signature accurately represents required arguments
the programmer cannot forget to return into the given callback, as the compiler will enforce it
exceptions can actually be used (locally) if the optional throw callback argument is used
the error_callback can be (manually) provided to defer calls that themselves take error callbacks, in order to cascade errors
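To illustrate the last point, here is a JavaScript sketch of cascading errors by hand. It assumes a db.get that takes an error callback before the success callback; that signature, the "missing" id and the queue helpers are my own inventions for the example.

```javascript
// Hypothetical db whose get takes (id, on_error, on_ok).
const pending = [];
const db = {
  get(id, on_error, on_ok) {
    pending.push(() => {
      if (id === "missing") {
        return on_error(new Error("no such document: " + id));
      }
      return on_ok({ id: id });
    });
  },
};
function flush() {
  while (pending.length > 0) pending.shift()();
}

function get_document(id, error_callback, callback) {
  // error_callback is handed straight to db.get, so failures inside
  // db.get reach the caller's error handler without any extra code.
  db.get(id, error_callback, function (document) {
    return callback(document);
  });
}

const log = [];
get_document("missing", (err) => log.push("error: " + err.message), (doc) => log.push("ok: " + doc.id));
get_document("a1", (err) => log.push("error: " + err.message), (doc) => log.push("ok: " + doc.id));
flush();
```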

disadvantages:

for the most part, callback will never actually be mentioned in the function body. That could be a little confusing

proposal 4: A less intrusive alternative: “return into”

My final idea is less intrusive, and simply ensures that all returns are made into callbacks. For example, return value into callback could be synonymous with return callback(value). It’s not shorter, but it allows for the compiler to check that all returns go via a callback function (of some sort). Any function that used defer and a return statement that didn’t have the into modifier would fail to compile - alerting the programmer to the issue, but without assuming which callback should be used.

The example would then become:

get_document: (id, callback) ->
	document: defer db.get(id)
	return document into callback

which is not much better than the current situation, except that the compiler can now tell you when you forget to return into a callback.
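The compile-time check could be approximated by something like the toy checker below. It is written in JavaScript, works line by line on source text with regular expressions, and only understands the simplest form of return ... into ...; a real compiler would check the parsed syntax tree instead. The function name check_returns is hypothetical.

```javascript
// Toy stand-in for the proposed rule: in a function that uses `defer`,
// every `return` line must end in `into <name>`.
// Returns the 1-based line numbers of offending return statements.
function check_returns(source) {
  const lines = source.split("\n");
  const uses_defer = lines.some((line) => /\bdefer\b/.test(line));
  if (!uses_defer) return []; // synchronous functions are unaffected
  const problems = [];
  lines.forEach((line, index) => {
    const is_return = /^\s*return\b/.test(line);
    const has_into = /\binto\s+[A-Za-z_$][\w$]*\s*$/.test(line);
    if (is_return && !has_into) problems.push(index + 1);
  });
  return problems;
}

const good = [
  "get_document: (id, callback) ->",
  "\tdocument: defer db.get(id)",
  "\treturn document into callback",
].join("\n");
const bad = good.replace(" into callback", ""); // forgot the into modifier

const good_problems = check_returns(good); // []
const bad_problems = check_returns(bad);   // [3]
```

Note that the checker never decides which callback to use; it only insists that some callback is named, which is the whole point of this proposal.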

advantages:

least restrictive and most flexible approach
protects against some programmer errors

disadvantages:

there’s still a lot of typing to convert between a synchronous and asynchronous function
the compiler isn’t helping as much as it could
introduces a new keyword, into

Conclusion

All things considered, I favour proposal 3. I think it affords sufficient flexibility while still being as helpful to the programmer as possible. It’s also the clearest: all changes required to mark a function as asynchronous are on the function signature line, which is the best possible place to put such information.

I’m interested to hear the thoughts of others, as well as any proposals I haven’t thought of. Please let me know in the comments, or add your voice to the CoffeeScript issue discussion.
