Java non-blocking servers, and what I expect node.js to do if it is to become mature

node.js is getting a lot of attention at the moment.
Its goal is to provide an easy way to build scalable network programs, e.g. web servers.
It’s different in two ways. First of all, it brings Javascript to the server. But more importantly,
it’s event based rather than thread based, relying on the OS to send it events when, say, a connection to the
server is made, so that it can handle that request.

The argument goes that a typical web server “handles each request using a thread, which is relatively inefficient and very
difficult to use.” As an example they state that “Node will show much better memory efficiency under high loads than systems which
allocate 2MB thread stacks for each connection.”

They go on to state that “users of Node are free from worries of dead-locking the process — there are no locks.
Almost no function in Node directly performs I/O, so the process never blocks.
Because nothing blocks, less-than-expert programmers are able to develop fast systems”.

Well, reading those statements makes me think that I have problems in my world which need addressing! But then I look at my world, and realise
that I don’t have problems. How can that be? First of all, because we don’t have a thread per request; rather, we use a thread pool, which reduces
the memory overhead and queues incoming requests until a thread becomes free in the pool.
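The thread-pool approach just described can be sketched with Java’s standard java.util.concurrent API; the pool size and the number of requests here are illustrative values, not anything from my server:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadPoolSketch {
    public static void main(String[] args) throws Exception {
        // a fixed pool: at most 4 threads run concurrently; further
        // submitted requests queue up until a pool thread is free
        ExecutorService pool = Executors.newFixedThreadPool(4);
        CountDownLatch done = new CountDownLatch(10);

        for (int i = 0; i < 10; i++) {
            pool.submit(() -> {
                // ... handle the request here ...
                done.countDown();
            });
        }

        done.await();
        pool.shutdown();
        System.out.println("handled 10 requests with 4 threads");
    }
}
```

The point is that memory is bounded by the pool size, not by the number of concurrent requests.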

The second reason I don’t have threading problems is that I work with Java EE. I write business software as components
which get deployed to an app server. That app server does all the hard work with threads. All I have to do is follow some simple
and common-sense rules to avoid threading issues, like ensuring that shared resources such as a Map (dictionary) are instantiated using the thread-safe API
which Java offers. In more extreme cases, I use the synchronized keyword to protect methods or objects which are shared.
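A minimal sketch of those two rules (the counter example is my own invention, purely to show the idioms):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SharedMapSketch {

    // a thread-safe map from Java's standard API - no explicit locking needed
    private static final Map<String, Integer> hits = new ConcurrentHashMap<>();

    // for the more extreme cases, synchronized protects a whole method
    private static int total = 0;
    private static synchronized void addToTotal(int n) {
        total += n;
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            for (int i = 0; i < 1000; i++) {
                hits.merge("page", 1, Integer::sum); // atomic update
                addToTotal(1);
            }
        };
        Thread a = new Thread(task), b = new Thread(task);
        a.start(); b.start();
        a.join(); b.join();
        // both counters see all 2000 increments, despite two threads
        System.out.println(hits.get("page") + " " + total);
    }
}
```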

All this means that we Java programmers have become very efficient at writing software to solve business problems, rather than writing software to solve
technical problems like concurrency, scalability, security, transactions, resources, object and resource pooling, etc., etc.

So when do I need to solve those problems which lots of threads introduce, namely memory problems?
Any server which needs to maintain an open connection to the client, in order to, say, stream video or Voice over IP (VOIP), or handle instant
messaging, is a server where I’m going to have problems with memory if I have a thread per connection.

Now, the Java EE specifications tend to focus on multithreaded servers. And to my knowledge, there is no Java EE specification which
tells vendors how to make a server which lets people deploy software components to it, which are handled in a non-blocking manner. Sure, you could
write a Servlet compliant server like Tomcat with a non-blocking engine under the hood, but Java EE doesn’t talk about streaming. And probably
99% of websites out there don’t do the kind of streaming which is going to cause memory issues.

So there certainly are times when something like node.js will be useful. Or at least the idea of a non-blocking server. But looking into the
details, node.js isn’t something that I would seriously use if I needed such a server. The main problems are that:

  1. you deploy Javascript to the server
  2. it doesn’t seem to specify a standardised way of writing components so that I can concentrate on writing business software rather than
    solving technical issues
  3. while it is rumoured to be fast, studies such as this one suggest it runs at around half the speed of Java
  4. compared to Java, the node.js API and Javascript libraries available today are immature – think about things like sending email, ORM, etc.
  5. node.js has political problems because it relies on Google. If Google doesn’t want to support Javascript on the server (their V8 engine is designed
    for Chrome, client side), and node.js needs a patch or development on V8 to handle a serverside issue, it might never get it.
    OK, like I have a chance to get a Java bug patched by Oracle 😉

It started for me, with this code snippet, taken from the node.js site:

    var net = require('net');

    var server = net.createServer(function (socket) {
      socket.write("Echo server\r\n");
      socket.pipe(socket);
    });

    server.listen(1337, "127.0.0.1");

Well, if the point of node.js is to make it really easy to create a server, and pass it a function for handling requests, I can do that in
Java too:

    TCPProtocol protocol = new TCPProtocol(){

        public void handleRequest(ServerRequest request,
                                  ServerResponse response) {

            //echo what we just received
            try {
                response.write(request.getData());
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    };

    Configuration config = new Configuration(1337, protocol);
    ServerFactory.createServer(config).listen();

OK, it’s a tiny bit more complicated. But the reason is that I decided to write a non-blocking server which can be configured to handle any kind of request.
The configuration takes a port number for the server to run on, and a protocol object. The protocol object is a subclass of an abstract protocol,
and its job is to look at the data coming in from the wire and decide how to handle it. A TCPProtocol as shown above is the simplest protocol,
in that it does nothing. You need to tell it what to do by overriding the handleRequest() method (analogous to supplying a function
in Javascript), for example as shown above, by returning to the client what it sent to the server.

But I wanted to do something more useful, so I extended what I wrote to handle voice over IP, or at least a client streaming an MP3 to a different
client, to simulate a phone call between the two of them.

The first step was to design the protocol. At the byte level, the first byte in a packet sent to the server contains the command, or action, the
second byte contains the length of the payload, if any, and the subsequent bytes contain the payload.

The protocol then lets a caller login (returning a session ID), start a call (returning an OK/NOK), send data for the call, end a call,
and quit (logout).
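As a sketch, the wire format just described could be marshalled and unmarshalled like this. Note that this is an illustrative reimplementation, not the downloadable code; the names only mirror those in the article:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class PacketSketch {

    static final byte LOGIN = 1; // an example command byte

    // layout: [command][payload length][payload bytes...]
    static byte[] marshal(byte command, byte[] payload) {
        byte[] packet = new byte[2 + payload.length];
        packet[0] = command;
        packet[1] = (byte) payload.length; // one length byte limits payloads to 255 bytes
        System.arraycopy(payload, 0, packet, 2, payload.length);
        return packet;
    }

    static byte[] payloadOf(byte[] packet) {
        int length = packet[1] & 0xFF; // read the length byte as unsigned
        return Arrays.copyOfRange(packet, 2, 2 + length);
    }

    public static void main(String[] args) {
        byte[] wire = marshal(LOGIN, "ant".getBytes(StandardCharsets.US_ASCII));
        System.out.println(wire[0] + " " + (wire[1] & 0xFF) + " "
            + new String(payloadOf(wire), StandardCharsets.US_ASCII));
    }
}
```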

After a few hours I had my laptop playing my favourite MP3s, streamed to my laptop from a different PC. But it looked clunky. I had a think,
and took inspiration from the Servlet and EJB specifications in Java EE. Namely, I wanted to have deployable components whose sole job was to handle
a business requirement.

In .NET and Java you can define a web service or web page (Servlet) by annotating the class. In the annotation, you provide a URL path, which the server uses to
determine when to call your servlet or web service. This is similar to a “command” or “action” as I had in my VOIP protocol.

So I wrote a little container, which sits inside the non-blocking server. The server takes the incoming request and, once it has handled the low
level non-blocking stuff, it passes the request over to the configured protocol. The VOIP protocol then extracts the command (first byte) from
the incoming request. It then uses a “context” object, which the container injects into it, to send the incoming request to a
Handler. A handler is a class with an annotation stating which command it can handle, and is thus analogous to a web
service or servlet. It is a piece of business code which handles an incoming request, based on the command (path) which the client is requesting.

Put together, it looks as follows. The first bit is the protocol’s handler method which determines the command and sends it to a handler:

    public void handleRequest(ServerRequest request,
                              ServerResponse response) {

        String criterion = String.valueOf(request.getData()[0]);
        try{
            //call through to the context for help - it
            //will call the handler for us.
            getContext().handleRequest(criterion, request, response);
        } catch (UnknownHandlerException e) {
            //send NOK to client because of unknown command
            response.write(new Packet(UNKNOWN_COMMAND).marshal());
        }
    }

A Packet object in the above code is simply an encapsulation of the data sent over the wire, but knows that the first byte is the command,
the second is the length and
subsequent bytes are the payload. In an HTTP request, this Packet object would be something which knew about the HTTP header’s attributes and
the HTTP payload, rather than the ServerRequest and ServerResponse objects which only know about byte arrays and
sockets/channels.

The second part of the solution is then the handlers. These are contained in the configuration, either created programmatically as shown
in the TCP example above, or created using XML which the server reads upon startup.
The container then locates, instantiates and calls these handlers
on behalf of the protocol, when the protocol object calls getContext().handleRequest(criterion, request, response) in the above
snippet. A typical handler looks like this:

    @Handler(selector=VoipProtocol.LOGIN)
    public class VoipLoginHandler extends VoipHandler {

        public void service(VoipRequest request,
                            VoipResponse response)
                            throws IOException {

            String name = request.getPacket().getPayloadAsString();
            Participant p = new Participant(
                            name, response.getSocketChannel());
            getProtocol().getModel().addParticipant(p);
            request.getKey().attach(p);

            //ACK
            byte[] sessId = p.getSessId().getBytes(VoipProtocol.CHARSET);
            Packet packet = new Packet(sessId, VoipProtocol.LOGIN);
            response.write(packet);
        }

    }

So, the service method is the only one in a handler,
and is in charge of doing business stuff. There is no techy stuff here. The code works out who is logging in and
creates an instance of them (the Participant object) in the
model. The model is contained in the protocol object, which exists just once in the server. Compared to a servlet, the call to get the model
is analogous to putting data into the application scope. The code then attaches the participant object
to the client connection using the attach(Object) method (see the Java NIO package), so that we can always get straight to the
relevant part of the data model from the connection object when subsequent requests arrive. Finally, the code responds to the client with
the session ID as an acknowledgement. Note that here, I haven’t bothered to authenticate the password – the payload only contains the username.
But if I were writing a complete app, I would have more data in my payload, perhaps even something like XML or JSON, and I would authenticate
the username and password.

The @Handler annotation at the top of the handler class is analogous to the @WebServlet annotation applied to
Java EE Servlets. It has a “selector” attribute which the container compares to the criterion which the protocol extracts from the
request. This is analogous to the urlPatterns attribute in @WebServlet, which tells a Java EE web container to which path
the servlet should be mapped. There is a little bit of hidden magic too – the service method in the
handler already knows VoipRequest and VoipResponse, rather than ServerRequest and
ServerResponse. The superclass does this magic, by implementing
the standard service method and calling the specialised abstract service method, implemented in subclasses.
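That superclass trick can be sketched as an adapter. The types below are simplified stand-ins I invented for illustration; they are not the framework’s real classes:

```java
// simplified stand-ins for the framework's generic request/response types
class ServerRequest { final byte[] data; ServerRequest(byte[] d) { data = d; } }
class ServerResponse { void write(String s) { System.out.println(s); } }

// the specialised, protocol-aware views of the same exchange
class VoipRequest { final ServerRequest raw; VoipRequest(ServerRequest r) { raw = r; } }
class VoipResponse { final ServerResponse raw; VoipResponse(ServerResponse r) { raw = r; } }

// the superclass implements the standard service method once...
abstract class VoipHandler {
    final void service(ServerRequest req, ServerResponse resp) {
        // wrap the generic objects and delegate to the typed method
        serviceVoip(new VoipRequest(req), new VoipResponse(resp));
    }
    // ...so subclasses only ever see the specialised types
    protected abstract void serviceVoip(VoipRequest req, VoipResponse resp);
}

public class AdapterSketch extends VoipHandler {
    protected void serviceVoip(VoipRequest req, VoipResponse resp) {
        resp.raw.write("handled as VOIP");
    }
    public static void main(String[] args) {
        new AdapterSketch().service(new ServerRequest(new byte[0]), new ServerResponse());
    }
}
```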

So, being creative, I added a further attribute to the @Handler annotation. It is called runAsync and defaults to false.
But if it is set to true, then the container sends the handler to a thread pool for execution sometime in the future. I don’t actually use this in the
VOIP example, but I did it to show that it is perfectly feasible for an app server to do such things. The developer doesn’t need to worry
about threads or anything – they simply configure the annotation and the container handles the hard parts. This is typical of Java EE! And it
becomes extremely useful in cases where a request needs a little more time to execute. In a non-blocking single-threaded process,
while connections are concurrently connected to the server, they are serviced sequentially, meaning that they MUST return fast if you don’t want
those in the queue to wait too long. This is something which node.js CANNOT do, because it has no way of starting threads. Its solution is to
send an async request to a different process. But to do that, the developer is spending time working on technical issues, rather than getting
on and trusting the container to do it for them, so that they can spend more time writing cost-effective business code. One of the main
reasons to use Java EE is that the developer can spend more time writing business software and less time handling techie problems.
There is no reason why a handler couldn’t also have other annotations:

    @Handler(selector=VoipProtocol.QUIT)
    @RolesAllowed({"someRole", "someOtherRole"})
    @TransactionManagement(TransactionManagementType.CONTAINER)
    public class VoipQuitHandler extends VoipHandler {

        @PersistenceContext(unitName="persistenceUnitName")
        private EntityManager em;
        .
        .
        .
    }

The @RolesAllowed annotation restricts which security roles may call the handler, and the @TransactionManagement annotation means that transactions will be handled by the container
rather than by the programmer. When the handler is executed, the @PersistenceContext annotation causes the container to
inject a JPA entity manager so that the business code has access to a database. This is exactly the same way that a Servlet or EJB gets resources
from the container. Those resources are created based on the app server configuration, in a standardised way, and are managed by the
container too (pooling, reconnecting, etc.), again relieving the programmer of that burden.
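The container’s job of registering handlers by selector and optionally dispatching them to a thread pool (the runAsync idea) can be sketched as follows. The annotation and types are my own simplified illustration, not the downloadable framework:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ContainerSketch {

    // an illustrative @Handler annotation with selector and runAsync
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Handler {
        String selector();
        boolean runAsync() default false;
    }

    interface Service { String service(String payload); }

    @Handler(selector = "LOGIN")
    static class LoginHandler implements Service {
        public String service(String payload) { return "session for " + payload; }
    }

    // the container registers each handler under the selector in its annotation...
    static Map<String, Service> register(Service... handlers) {
        Map<String, Service> bySelector = new HashMap<>();
        for (Service h : handlers) {
            bySelector.put(h.getClass().getAnnotation(Handler.class).selector(), h);
        }
        return bySelector;
    }

    // ...and dispatches each request to the matching handler, either
    // directly on the event loop thread or via a thread pool
    static String dispatch(Map<String, Service> handlers, String selector,
                           String payload) throws Exception {
        Service h = handlers.get(selector);
        if (h.getClass().getAnnotation(Handler.class).runAsync()) {
            ExecutorService pool = Executors.newSingleThreadExecutor();
            try {
                return pool.submit(() -> h.service(payload)).get();
            } finally {
                pool.shutdown();
            }
        }
        return h.service(payload); // synchronous path
    }

    public static void main(String[] args) throws Exception {
        Map<String, Service> handlers = register(new LoginHandler());
        System.out.println(dispatch(handlers, "LOGIN", "ant"));
    }
}
```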

So what do we have now? Instead of a low level API like node.js provides, we have a high level container for running software components within
a non-blocking server. What we don’t have is Javascript running on the server, even though it might seem like the ideal language for handling
callbacks in a non-blocking environment, because it has an event queue and function pointers.

Could this little exercise be the basis of a JSR? Well, is it really useful? Only really in cases where you have clients which need to keep
connections open to the server for a long time, and you have thousands of clients. There won’t be many people needing to do this, and there
are more important JSRs awaiting acceptance and implementation. But who knows what the future will bring.

To summarise, node.js makes me uneasy, because I don’t like the idea of Javascript and its immature stack of libraries being deployed
on a production server. Whenever I develop with Javascript I spend a lot more time with the debugger than I would like to, because the
language is dynamically (duck) typed and so not entirely checkable using static analysis.
But the idea of building a non-blocking server using Java – now that’s interesting (to me at least).
But like I said, I’m just not sure how often it will be useful.

It seems to me that the revolution that node.js is starting isn’t really about Javascript on the server. It is more about
using a non-blocking server. But the problem is, probably 99% of our needs are already
satisfied with standard multithreaded servers. And we can already write scalable websites, without all those problems which node.js claims we have.
So let’s not rebuild the world based on non-blocking I/O, just because node.js has arrived. Let’s build/rebuild just those very special
cases, where non-blocking I/O will really help us.

Does node.js deserve the hype it is getting? I don’t think that it deserves hype just because you can run Javascript on the server –
that’s a bad thing. Douglas Crockford (senior JavaScript architect at Yahoo!) even hints at this when he
says:

“The big surprise for me in this is we’re about to take maybe the most important step we’ve ever taken in terms of the technology of the web,
and JavaScript is leading the way.”

He seems to be saying that Javascript is leading the way in the most important step we are ever taking, and not that Javascript IS the most important
step we are taking. That is, non-blocking servers are the most important step, and having an event loop on the server is the way forward.
The way I understand it, he is saying that non-blocking is the revolution.
The problem I have in joining in on this revolution is that non-blocking servers are not necessarily the best thing ever.
They only really help, when you have thousands of clients needing to keep their connections to the server open. HTTP (99% of the web) doesn’t need
non-blocking I/O to become the technology leader of the web – it already is. So instead of joining in this revolution and all the hype that
node.js is stirring up, I will ignore it and continue building software the way I have been.

The code for the examples given above is in two Eclipse projects, which you can download
here. The first project is the framework
itself, including the non-blocking server, container and relevant classes and interfaces. The second project is an example of how to
use the framework to build (business) apps. It contains the TCPServerRunner class for running a TCP echo server.
It also contains the VoipServerRunner which starts the VOIP Server.
To run the streaming example, first run the ListeningClient, followed by the SendingClient.
Change line 74 of the SendingClient to use your favourite MP3, and you should hear it stream the first 20 seconds.
The VOIP server itself isn’t 100% reliable – especially once clients have
disconnected. But I guess node.js wasn’t that stable after only a small number of hours of development. Good luck!

PS. Performance: streaming at 192 kilobits per second, the server told me it was running at around half a percent load (i.e. the event loop was
idle 99.5% of the time). GSM, rather than high quality MP3, uses around 30 kbps, so for a real VOIP server
you could probably get away with 1200 simultaneous calls. That doesn’t seem that many, but I have no idea really.
I guess I have 2 CPU cores which I could use,
so that a load balancer could spread the load between two processes, which is the node.js way of using all the cores. But the load balancer
wouldn’t be doing much less work than my server does, so it might not be able to handle any extra load itself. Or I could stick my handlers in a
thread pool, using the runAsync attribute of my @Handler annotation.
So a low cost commodity server could handle nearly 2500 simultaneous calls. Still not many? Again, I really don’t know. But I should say that I
didn’t try optimising the server. My packet sizes are based on sending one packet of sound data every 12 milliseconds.
Perhaps I could get away with sending them less frequently, which might improve throughput and still give good call quality with low latency.
Who knows – anyone wanting to optimise it, let me know the results! One thing is for sure: a server using one thread per connection with
2500 simultaneous connections will struggle. While my
previous blog article
showed it is possible to have many thousand threads open,
the context switching is likely to become the bottleneck. A non-blocking server is certainly the way forward for this use case.
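The capacity figures in that PS can be reproduced as a back-of-the-envelope calculation. This is my own reading of the numbers, so treat them as rough:

```java
public class CapacitySketch {
    public static void main(String[] args) {
        double loadPerStream = 0.005;        // one 192 kbps MP3 stream used ~0.5% of the event loop
        double mp3Kbps = 192, gsmKbps = 30;  // measured stream vs typical GSM voice bit rate

        double mp3Streams = 1 / loadPerStream;              // ~200 MP3 streams per core
        double gsmCalls = mp3Streams * (mp3Kbps / gsmKbps); // ~1280, i.e. roughly 1200 GSM calls
        double twoCores = gsmCalls * 2;                     // ~2560, i.e. roughly 2500 on 2 cores

        System.out.printf("%.0f %.0f %.0f%n", mp3Streams, gsmCalls, twoCores);
    }
}
```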

Download the code here.

Copyright © 2011 Ant Kutschera