Friday, January 19, 2007

Asynchronous?


As I do not have a CS background, I have always been confused
when people use the term asynchronous. This entry was
prompted by Dave Orchard's description of SOA.

HTTP is a network protocol. HTTP is asynchronous. It uses a
Request-Response pattern: a client sends a request and the server
sends a response. However, there is nothing in the protocol about
how long the response will take, so it is an asynchronous
protocol. TCP is a synchronous protocol: there are various timeouts
specified in the standard, and agents are supposed to react in
specified ways to these timeouts.

However, most HTTP libraries use synchronous function calls: a client
makes a call to a function in the HTTP library, the library sends a
request to an HTTP server and returns the response message to the
client using the return mechanism of the function. Most HTTP
libraries allow the client to pass a timeout value to the library.
If the server does not respond within the timeout, the library
reports an error to the client. The timeout is specified by the client,
not by the HTTP standard.
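
To make the synchronous style concrete, here is a minimal sketch using
Python's standard urllib (my choice for illustration, not a library the
discussion above depends on). The function name fetch and its
timeout_seconds parameter are made up for this example. The caller is
blocked until the response arrives or the caller's own timeout expires.

```python
import socket
import urllib.error
import urllib.request


def fetch(url, timeout_seconds):
    """Blocking HTTP GET. Returns (status, body) or raises TimeoutError.

    The timeout is chosen by the caller; nothing in HTTP itself says
    how long a server may take to answer.
    """
    try:
        # urlopen does not return until the response has started to
        # arrive or the client-supplied timeout has expired.
        with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
            return response.status, response.read()
    except urllib.error.URLError as err:
        # Timeouts while connecting or sending are wrapped in URLError.
        if isinstance(err.reason, (socket.timeout, TimeoutError)):
            raise TimeoutError(f"no response within {timeout_seconds}s") from err
        raise
    except socket.timeout as err:
        # Timeouts while waiting for or reading the response arrive unwrapped.
        raise TimeoutError(f"no response within {timeout_seconds}s") from err
```

A caller would write something like status, body = fetch(url, 5.0) and
handle TimeoutError as the "server did not respond in time" case.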

Some HTTP libraries support asynchronous function calls. For example,
the client makes a function call to the HTTP library, the call does
not block but returns immediately, and the client can do some other processing
before making another function call to pick up the HTTP response.
LWP::Parallel is an example of such a library. There are also terms like
blocking and non-blocking which can be used to describe function calls.

So there are two concepts of asynchronous/synchronous at play here: one
that is used by people like Lamport and Lynch to describe distributed systems,
and one used by people to talk about programming models. As I don't have a
background in CS, this confused me for a long time; I was never sure how people
were using the term. The issue is further complicated by the fact that if you
are building an asynchronous system, per Lamport and Lynch, then an
asynchronous programming model is better suited to the task.

So reading Dave Orchard's blog entry on SOA I am wondering what he means
by advocating "asynchronous". He is describing an approach to
distributed computing called SOA, so I guess he must be talking about
the Lynch/Lamport definition of asynchronous. This has consequences,
namely that you won't be able to do very much.

In an asynchronous system it is impossible to guarantee consensus if even one
processor may be faulty, even with a perfect network. Consensus is the problem of
getting a set of processors to agree on a value. Consider the case of submitting
a purchase order in an asynchronous system: you send the request, but the
request can take arbitrarily long to reach the server, the server can take
arbitrarily long to process the message, and the response can take arbitrarily
long to return. How can you tell if the server failed? On the Web this is
equivalent to hitting the submit button and nothing happening. What do you
do? You wait a bit longer, wait for an e-mail, retry the submit button, or
ring or e-mail the server administrator. In each case some internal clock
triggers an action. In an asynchronous system you just wait; if you include
timeouts then it is no longer asynchronous.
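
The purchase order dilemma can be simulated in a few lines of Python.
This is a toy sketch: the "servers" are threads in the same process,
and submit_order, slow_server, crashed_server and processed_orders are
all names invented for the illustration. The point is that a client
that gives up after a timeout sees exactly the same thing from a slow
server as from a dead one.

```python
import queue
import threading
import time

processed_orders = []  # what the "server side" actually did


def slow_server(order, reply_box):
    """Alive but slow: processes the order after a delay, then replies."""
    time.sleep(0.3)
    processed_orders.append(order)
    reply_box.put("ok")


def crashed_server(order, reply_box):
    """Failed: never processes the order and never replies."""
    return


def submit_order(server, order, timeout_seconds):
    """Send an order and wait up to timeout_seconds for a reply.

    Returns "acknowledged" or "unknown". After a timeout the client
    cannot tell a crashed server from a slow one.
    """
    reply_box = queue.Queue(maxsize=1)
    threading.Thread(target=server, args=(order, reply_box), daemon=True).start()
    try:
        reply_box.get(timeout=timeout_seconds)
        return "acknowledged"
    except queue.Empty:
        return "unknown"
```

With a short timeout both servers yield "unknown", yet the slow server
still goes on to process the order. Retrying the submit at that point
risks a duplicate order, which is the same decision a person faces in
front of the unresponsive submit button.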

There are three timing models in distributed systems: synchronous,
partially-synchronous, and asynchronous. For definitions see
Consensus in the Presence of Partial Synchrony:

  • synchronous: In a synchronous system, there is a known fixed upper
    bound on the time required for a message to be sent from one processor
    to another and a known fixed upper bound on the relative speeds of
    different processors.


  • asynchronous: In an asynchronous system no fixed upper bounds exist
    on the time required for a message to be sent from one processor
    to another, nor on the relative speeds of different processors.


  • partially-synchronous: Fixed bounds are known to exist but are not
    known a priori, or, in another version, the fixed bounds are known
    but are only guaranteed to hold starting at some unknown time.
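
One way to picture the first partially-synchronous version (a bound
exists but nobody knows it) is a client that keeps enlarging its
timeout until a round trip fits inside it. The sketch below is an
illustration of that idea only, not an algorithm taken from the paper.
The function find_sufficient_timeout and the responds_within callback
are hypothetical, and times are plain integers in milliseconds.

```python
def find_sufficient_timeout(responds_within, initial_timeout_ms, max_rounds=32):
    """Double a timeout guess until a round trip completes within it.

    responds_within(timeout_ms) is a caller-supplied probe that returns
    True if a reply arrived inside the given timeout.
    """
    timeout_ms = initial_timeout_ms
    for _ in range(max_rounds):
        if responds_within(timeout_ms):
            return timeout_ms
        timeout_ms *= 2  # the guess was too small, so try a larger one
    # If no bound exists at all (a truly asynchronous system), doubling
    # never succeeds, so the search has to be cut off somewhere.
    raise RuntimeError("no sufficient timeout found")
```

If the hidden bound is 700 ms and the first guess is 100 ms, the
guesses run 100, 200, 400, 800 and the function returns 800. In a
synchronous system the caller would simply be told the bound up front.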


Designing algorithms for asynchronous systems is attractive because
they will also work in synchronous and partially-synchronous systems.
So using an asynchronous model for the internet makes sense, but you
are restricted in what you can do. It is interesting to note that
some Web Service specifications are synchronous. For example, WS-Security
uses time ranges to define how long a message is valid for;
messages that arrive outside this range are not processed. This causes
problems when systems have a large clock skew. I have seen many weird
bugs arise due to this, including messages arriving before they were sent!
So choosing an asynchronous timing model for your distributed system
may not be so attractive after all. Perhaps Dave meant an asynchronous
programming model? At first this doesn't make sense: he is talking
about an architecture for a distributed system, and how you write code
for such a system should be an orthogonal issue. However, looking through
the discussions of SOA I can see a pattern.

Remote Procedure Call, RPC, as described in IETF RFC 707, introduced an
abstraction which allowed programmers to think that invoking a remote
service was the same as invoking a local function call. The abstraction
was extended by the concept of distributed objects: the programmer could
think of a remote object as being just like a local object. The idea
behind this thinking is that programmers are familiar with procedure
calls and objects, and so RPC and distributed objects will be an easy
path into distributed computing for a programmer already familiar with
procedure calls and objects. However, this abstraction was shown to be
a bad one, perhaps most famously by Waldo et al. By pretending remote
things are just like local things, the programmer is led to ignore
latency and partial failures. (Though it is hard to imagine people like
Jim Gray and Butler Lampson would fall into this trap.)

There has been a lot of debate in the past with issues like SOA != RPC
and WS != Distributed Objects. However, I think there are two different
issues: one is the architecture of a distributed system, the other is
the programming model. I can take a good architecture, for example REST,
and use a bad programming model, for example RPC, to build a client
library.

A Web services toolkit can attempt to hide the complexity associated
with distributed systems, and it may initially be easy and familiar to use,
but ultimately it will betray the programmer by misleading him, just like RPC.
Or you can design a better programming model, which does not mislead the
programmer but would probably be a more difficult programming environment.
A naive programmer might think the first is better because it is easier
to use. There is a balance to be struck.

An asynchronous programming style forces the programmer to
think about the system more carefully as a distributed system;
perhaps this is why Dave is advocating asynchronous. However, to
me, someone who uses raw sockets a lot, it muddies the discussion
on architectures of distributed systems.

REST doesn't prescribe a programming model; it is focused purely on
the architecture of the system. You can write bad HTTP clients and
services, or you can write good ones like LWP. SOA/WS-* has had its feet
stuck in the debate about programming models, which has caused
me confusion.
