<< June 2015 | Home | August 2015 >>

Is Asynchronous EJB Just a Gimmick?

In previous articles (here and here) I showed that creating non-blocking asynchronous applications could increase performance when the server is under a heavy load. EJB 3.1 introduced the @Asynchronous annotation for specifying that a method will return its result at some time in the future. The Javadocs state that either void or a Future must be returned. An example of a service using this annotation is shown in the following listing:

The annotation is on line 4. The method returns a Future of type String and does so on line 10 by wrapping the output in an AsyncResult. At the point that client code calls the EJB method, the container intercepts the call and creates a task which it will run on a different thread, so that it can return a Future immediately. When the container then runs the task using a different thread, it calls the EJB's method and uses the AsyncResult to complete the Future which the caller was given. There are several problems with this code, even though it looks exactly like the code in all the examples found on the internet. For example, the Future class only contains blocking methods for getting at the result of the Future, rather than any methods for registering callbacks for when it is completed. That results in code like the following, which is bad when the container is under load:

This kind of code is bad, because it causes threads to block meaning that they cannot do anything useful during that time. While other threads can run, there needs to be a context switch which wastes time and energy (see this good article for details about the costs, or the results of my previous articles). Code like this causes servers that are already under load to come under even more load, and grind to a halt.

So is it possible to get the container to execute methods asynchronously, but to write a client which doesn't need to block threads? It is. The following listing shows a servlet doing so.

Line 1 declares that the servlet supports running asynchronously - don't forget this bit! Lines 8-10 start writing data to the response but the interesting bit is on line 13 where the asynchronous service method is called. Instead of using a Future as the return type, we pass it a CompletableFuture, which it uses to return us the result. How? Well line 16 starts the asynchronous servlet context, so that we can still write to the response after the doGet method returns. Lines 17 onwards then effectively register a callback on the CompletableFuture which will be called once the CompletableFuture is completed with a result. There is no blocking code here - no threads are blocked and no threads are polled, waiting for a result! Under load, the number of threads in the server can be kept to a minimum, making sure that the server can run efficiently because less context switches are required.

The service implementation is shown next:

Line 7 is really ugly, because it blocks, but pretend that this is code calling a web service deployed remotely in the internet or a slow database, using an API which blocks, as most web service clients and JDBC drivers do. Alternatively, use an asynchronous driver and when the result becomes available, complete the future as shown on line 9. That then signals to the CompletableFuture that the callback registered in the previous listing can be called.

Isn't that just like using a simple callback? It is certainly similar, and the following two listings show a solution using a custom callback interface.

Again, in the client, there is absolutely no blocking going on. But the earlier example of the AsyncServlet2 together with the Service3 class, which use the CompletableFuture are better for the following reasons:
  • The API of CompletableFuture allows for exceptions / failures,
  • The CompletableFuture class provides methods for executing callbacks and dependent tasks asynchronously, i.e. in a fork-join pool, so that the system as a whole runs using as few threads as possible and so can handle concurrency more efficiently,
  • A CompletableFuture can be combined with others so that you can register a callback to be called only when several CompletableFutures complete,
  • The callback isn't called immediately, rather a limited number of threads in the pool are servicing the CompletableFutures executions in the order in which they are due to run.
After the first listing, I mentioned that there were several problems with the implementation of asynchronous EJB methods. Other than blocking clients, another problem is that according to chapter 4.5.3 of the EJB 3.1 Spec, the client transaction context does not propagate with an asynchronous method invocation. If you wanted to use the @Asynchronous annotation to create two methods which could run in parallel and update a database within a single transaction, it wouldn't work. That limits the use of the @Asynchronous annotation somewhat.

Using the CompletableFuture, you might think that you could run several tasks in parallel within the same transactional context, by first starting a transaction in say an EJB, then creating a number of runnables and run them using the runAsync method which runs them in an execution pool, and then register a callback to fire once all were done using the allOf method. But you're likely to fail because of a number of things:
  • If you use container managed transactions, then the transaction will be committed once the EJB method which causes the transaction to be started returns control to the container - if your futures are not completed by then, you will have to block the thread running the EJB method so that it waits for the results of the parallel execution, and blocking is precisely what we want to avoid,
  • If all the threads in the single execution pool which runs the tasks are blocked waiting for their DB calls to answer then you will be in danger of creating an inperformant solution - in such cases you could try using a non-blocking asynchronous driver, but not every database has a driver like that,
  • Thread local storage (TLS) is no longer usable as soon as a task is running on a different thread e.g. like those in the execution pool, because the thread which is running is different from the thread which submitted the work to the execution pool and set values into TLS before submitting the work,
  • Resources like EntityManager are not thread-safe. That means you cannot pass the EntityManager into the tasks which are submitted to the pool, rather each task needs to get hold of it's own EntityManager instance, but the creation of an EntityManager depends on TLS (see below).
Let's consider TLS in more detail with the following code which shows an asyncronous service method attempting to do several things, to test what is allowed.

Line 12 is no problem, you can rollback the transaction that is automatically started on line 9 when the container calls the EJB method. But that transaction will not be the global transaction that might have been started by code which calls line 9. Line 16 is also no problem, you can use the EntityManager to write to the database inside the transaction started by line 9. Lines 4 and 18 show another way of running code on a different thread, namely using the ManagedExecutorService introduced in Java EE 7. But this too fails anytime there is a reliance on TLS, for example lines 22 and 31 cause exceptions because the transaction that is started on line 9 cannot be located because TLS is used to do so and the code on lines 21-35 is run using a different thread than the code prior to line 19.

The next listing shows that the completion callback registered on the CompletableFuture from lines 11-14 also runs in a different thread than lines 4-10, because the call to commit the transaction that is started outside of the callback on line 6 will fail on line 13, again because the call on line 13 searches TLS for the current transaction and because the thread running line 13 is different to the thread that ran line 6, the transaction cannot be found. In fact the listing below actually has a different problem: the thread handling the GET request to the web server runs lines 6, 8, 9 and 11 and then it returns at which point JBoss logs JBAS010152: APPLICATION ERROR: transaction still active in request with status 0 - even if the thread running line 13 could find the transaction, it is questionable whether it would still be active or whether the container would have closed it.

The transaction clearly relies on the thread and TLS. But it's not just transactions that rely on TLS. Take for example JPA which is either configured to store the session (i.e. the connection to the database) directly in TLS or is configured to scope the session to the current JTA transaction which in turn relies on TLS. Or take for example security checks using the Principal which is fetched from EJBContextImpl.getCallerPrincipal which makes a call to AllowedMethodsInformation.checkAllowed which then calls the CurrentInvocationContext which uses TLS and simply returns if no context is found in TLS, rather than doing a proper permission check as is done on line 112.

These reliances on TLS mean that many standard Java EE features no longer work when using CompletableFutures or indeed the Java SE fork-join pool or indeed other thread pools, whether they are managed by the container or not.

To be fair to Java EE, the things I have been doing here work as designed! Starting new threads in the EJB container is actually forbidden by the specs. I remember a test I once ran with an old version of Websphere more than ten years ago - starting a thread caused an exception to be thrown because the container was really strictly adhering to the specifications. It makes sense: not only because the number of threads should be managed by the container but also because Java EE's reliance on TLS means that using new threads causes problems. In a way, that means that using the CompletableFuture is illegal because it uses a thread pool which isn't managed by the container (the pool is managed by the JVM). The same goes for using Java SE's ExecutorService as well. Java EE 7's ManagedExecutorService is a special case - it's part of the specs, so you can use it, but you have to be aware of what it means to do so. The same is true of the @Asynchronous annotation on EJBs.

The result is that writing asynchronous non-blocking applications in a Java EE container might be possible, but you really have to know what you are doing and you will probably have to handle things like security and transactions manually, which does sort of beg the question of why you are using a Java EE container in the first place.

So is it possible to write a container which removes the reliance on TLS in order to overcome these limitations? Indeed it is, but the solution doesn't depend on just Java EE. The solution might require changes in the Java language. Many years ago before the days of dependency injection, I used to write POJO services which passed a JDBC connection around from method to method, i.e. as a parameter to the service methods. I did that so that I could create new JDBC statements within the same transaction i.e. on the same connection. What I was doing was not all that different to what things like JPA or EJB containers need to do. But rather than pass things like connections or users around explicitly, modern frameworks use TLS as a place to store the "context", i.e. connections, transactions, security info, etc. centrally. As long as you are running on the same thread, TLS is a great way of hiding such boilerplate code. Let's pretend though that TLS had never been invented. How could we pass a context around without forcing it to be a parameter in each method? Scala's implicit keyword is one solution. You can declare that a parameter can be implicitly located and that makes it the compilers problem to add it to the method call. So if Java SE introduced such a mechanism, Java EE wouldn't need to rely on TLS and we could build truly asynchronous applications where the container could automatically handle transactions and security by checking annotations, just as we do today! Saying that, when using synchronous Java EE the container knows when to commit the transaction - at the end of the method call which started the transaction. If you are running asynchronously you would need to explicitly close the transaction because the container could no longer know when to do so.

Of course, the need to stay non-blocking and hence the need to not depend on TLS, depends heavily on the scenario at hand. I don't believe that the problems I've described here are a general problem today, rather they are a problem faced by applications dealing with a niche sector of the market. Just take a look at the number of jobs that seem to be currently on offer for good Java EE engineers, where synchronous programming is the norm. But I do believe that the larger IT software systems become and the more data they process, the more that blocking APIs will become a problem. I also believe that this problem is compounded by the current slow down in the growth hardware speed. What will be interesting to see is whether Java a) needs to keep up with the trends toward asynchronous processing and b) whether the Java platform will make moves to fix its reliance on TLS.

Copyright © 2015, Ant Kutschera
Social Bookmarks :  Add this post to Slashdot    Add this post to Digg    Add this post to Reddit    Add this post to Delicious    Add this post to Stumble it    Add this post to Google    Add this post to Technorati    Add this post to Bloglines    Add this post to Facebook    Add this post to Furl    Add this post to Windows Live    Add this post to Yahoo!

Javascript everywhere

I can think of at least two scenarios when you might need to run the same algorithms in both a Javascript and a Java environment.

  • A Javascript client is running in offline mode and needs to run some business logic or complex validation which the server will need to run again later when say the model is persisted and the server needs to verify the consistent valid state of the model,
  • You have several clients which need access to the same algorithms, for example a Javascript based single page application running in the browser and web service which your business partners use which is deployed in a Java EE application server.

You could build the algorithm twice: once in Java and once in Javascript but that isn't very friendly in terms of maintenance. You could build it once in just Java and make the client call the server to run the algorithm, but that doesn't work in an offline application like you can build using HTML 5 and AngularJS, and it doesn't make for a very responsive client.

So why not build the algorithm just once, using Javascript, and then use the javax.script Java package which first shipped with Java SE 7 and was improved with Java SE 8, in order to execute the algorithm when you need to use it from Java? That is precisely what I asked myself, and so I set about building an example of how to do it.

The first thing I considered was deployment and how the browser could access the Javascript. I didn't want anything with complex build processes which for example copy code maintained in a Node.js environment into a repo like Nexus. Rather I wanted to just drop the Javascript into a Java source folder and be able to use it from there. By having Javascript in a source folder of a Java project, it is automatically deployed inside a web archive so I built a little Servlet capable of serving the Javascript to a client over HTTP. Listing 1 shows the Servlet.

@WebServlet("/ScriptLoader.js")
public class ScriptLoader extends HttpServlet {

  protected void doGet(HttpServletRequest request, ...

    String script = request.getParameter("script");

    response.setContentType("text/javascript");

    Classloader cl = this.getClass().getClassLoader();
    try (InputStream is = cl.getResourceAsStream(script)) {
      int curr = -1;
      while ((curr = is.read()) != -1) {
        response.getOutputStream().write(curr);
      }
    }
  }
}

Listing 1: A Servlet capable of reading Javascript deployed in a WAR

Using the Servlet, you can load the Javascript into the browser, using normal HTML:

    <script type="text/javascript" src="ScriptLoader.js?script=rules.js">

Let's take a look at the Javascript algorithm in Listing 2.

;
(function() {
    var _global = this;

    ///////////////////////////////////////////
    // some javascript that we want to be able
    // to run on the server, but also on the
    // client
    ///////////////////////////////////////////
    function rule419(input) {
        return _(input)
        .filter(function(e){ 
            return e.name === "John"; 
        })
        .value().length == 1 ? "OK" : "Scam";
    };

    ///////////////////////////////////////////
    //create and assemble object for exporting
    ///////////////////////////////////////////
    var maxant = {};
    maxant.rule419 = rule419;

    ///////////////////////////////////////////
    //export module depending upon environment
    ///////////////////////////////////////////
    if (typeof (module) != 'undefined' && module.exports) {
        // Publish as node.js module
        module.exports = maxant;
    } else if (typeof define === 'function' && define.amd) {
        // Publish as AMD module
        define(function() {
            return maxant;
        });
    } else {
        // Publish as global (in browsers and rhino/nashorn)
        var _previousRoot = _global.maxant;

        // **`noConflict()` - (browser only) to reset global 'maxant' var**
        maxant.noConflict = function() {
            _global.maxant = _previousRoot;
            return maxant;
        };

        _global.maxant = maxant;
    }
}).call(this);

Listing 2: An example of an algorithm that we want to run in both Java and Javascript environments

Lines 10-16 are the algorithm that we want to run. The rest of the script is standard Javascript boiler plate when you want to create a script that can be run in all kinds of environments, for example, the browser, Rhino/Nashorn, require (e.g. Node.js) or AMD. Lines 27-35 and lines 39-43 aren't really necessary in the context of this article because we only run the code in the browser or Rhino/Nashorn, and we don't really care about being kind enough to provide a function for making our script non-conflicting if some other script is loaded with the same "namespace" (in this case 'maxant'). The algorithm itself, rule419 isn't really that interesting, but notice how it makes use of lodash when it wraps the input by calling the function _(...). That shows that we are able to make use of other modules loaded into the global space who use script patterns similar to that shown in Listing 2. For demo purposes, I have created an algorithm which simply counts the number of times that 'John' is present in the model. The model is an array of objects with the name attribute. In reality, if we are going to go to the extent of writing an algorithm in just Javascript but making it possible to run it both in Java and Javascript environments, I hope that it would be a damn site more complicated that the algorithm shown here :-)

To make the algorithm useful, I have chosen to run it from a mini- AngularJS application. If you aren't familiar with AngularJS then all you need to know is that you can now call rule419 from Javascript like this:

    var result = maxant.rule419(model);

In AngularJS it is recommended to make modules injectable, which can easily be done as shown in Listing 3, when bootstrapping the application. See here for more details.

'use strict';

// Declare app which depends on views, and components
angular.module('app', [
  'ngRoute',
  'app.view1'
])...

//make rules injectable
.factory('maxant', function(){return window.maxant;})
...

Listing 3: app.js - Making rules injectable in AngularJS

The AngularJS controller can then inject the module as shown in Listing 4.

...
.controller('View1Ctrl', ['$scope', '$http', '$routeParams', 'maxant',
  function($scope, $http, $routeParams, maxant) {

    //create a model
    var model = [
                 {name: 'Ant'}, 
                 {name: 'John'}
                ];
...    
    //execute javascript that can also be executed on the server using Java
    $scope.clientResult = maxant.rule419(model);    
...
}]);

Listing 4: Controller code making use of the injected module in order to run the algorithm in question

Lines 2-3 inject the module named maxant and line 12 then uses the rule419 function and puts the result into scope so that it can be displayed to the user.

The harder part of this exercise is getting the Javascript to be runnable in a Java environment like a Java EE application server or a batch program. This Java 7 link and this Java 8 link give examples of how to run Javascript from Java. I've encapsulated code from those examples to provide a very simple API which Java application code can use to run the Javascript.

//instantiate the engine
Engine engine = new Engine(Arrays.asList("lodash-3.10.0.js", "rules.js"));

//prepare data model - the input to the javascript
Person[] people = new Person[]{new Person("John"), new Person("John")};

//invoke engine
String result = engine.invoke(people, "maxant", "rule419");

//evaluate output
assertEquals("Scam", result);

Listing 5: Calling the Javascript algorithm from Java

Listing 5 starts on line 2 by instantiating the abstraction that I have created (see Listing 6). It simply needs a list of Javascript file names which it should load into the scripting engines' space. The two Javascript files used in this example live in the src/main/resources folder so that Maven packs them into the JAR/WAR that is built. That makes them accessible via the Classloader as shown in Listings 1 and 6. Listing 5 then continues on line 5 by creating some kind of data model which is the input to the algorithm. The Person class has an attribute named name which is private, but nonetheless used by the algorithm. In Java we would use getter/setter methods to access that data, but notice line 13 of Listing 2 just references the attribute by its name and not by a bean-style accessor method. More on that shortly... Line 8 of Listing 5 then executes the algorithm, by telling the engine to invoke rule419 of the maxant module, using people as the input. The result of the Javascript execution which happens under the hood can then be used, for example on line 11. It doesn't have to be a String, it could also be a Java object which the rule returns.

The implementation behind that simple API is shown in Listings 6, 7 and 8. That code can be thought of as library code.

public Engine(List<String> javascriptFilesToLoad) {
  ScriptEngineManager engineManager = new ScriptEngineManager();
  engine = engineManager.getEngineByName("nashorn");
  if (engine == null) {
      //java 7 fallback
      engine = engineManager.getEngineByName("JavaScript");
  }

  //preload all scripts and dependencies given by the caller
  for (String js : javascriptFilesToLoad) {
      load(engine, js);
  }

  referenceToJavascriptJSONInstance = engine.eval("JSON");
}

Listing 6: The implementation behind the neat API, part 1

Listing 6 shows the constructor which first instantiates a ScriptEngineManager from the javax.scripting package and uses it to create the real Javascript engine which executes the algorithm in question, rule419. Notice the fallback used for making the code compatible with Java 7 and 8. I didn't try it but probably just getting the engine by the name "JavaScript" suffices... The constructor does two more things: it loads all the scripts which the caller wants in scope and then gets itself a reference to the Javascript JSON object which is used later for converting JSON to Javascript objects.

private void load(ScriptEngine engine, String scriptName) {
  //fetch script file from classloader (e.g. out of a JAR) 
  //and put it into the engine
  ClassLoader cl = this.getClass().getClassLoader();
  try (InputStream script = cl.getResourceAsStream(scriptName)) {
      engine.eval(new InputStreamReader(script));
  }
}

Listing 7: The implementation behind the neat API, part 2

Listing 7 is called from Listing 6 and shows how the Javascript files are loaded using the Classloader and then sent to the ScriptEngine on line 6. Finally, Listing 8 shows the process of invoking the Javascript.

public <T> T invoke(Object input, String module, 
                              String functionNameToExecute) {

  final Invocable invocable = (Invocable) engine;

  //convert
  ObjectMapper om = new ObjectMapper();
  String dataString = om.writeValueAsString(input);
  Object data = invocable.invokeMethod(
                          referenceToJavascriptJSONInstance, 
                          "parse", dataString);

  //execute
  Object moduleRef = engine.get(module);
  Object result = invocable.invokeMethod(moduleRef, 
                                  functionNameToExecute, data);
  return (T) result;
}

Listing 8: The implementation behind the neat API, part 3

First, lines 6-7 use Jackson to convert the Java model into a JSON string. That JSON string is then parsed inside the ScriptEngine on lines 9-11, which is equivalent to calling JSON.parse(dataString) in Javascript. The conversion from Java to JSON and then Javascript isn't strictly necessary since Rhino/Nashorn know how to use Java objects. But I've chosen to do it that way so that you can skip wrapping all the attributes in accessor code like getXYZ(...). This way, the Javascript can simply access fields like this: input[2].name which gets the third elements attribute called name, rather than say input.get(2).getName(), like you would have to do in plain old Java.

There is one thing to note: as far as I can tell, Rhino and Nashorn are not thread-safe, so you might need to think about that in any implementation that you write. I would probably use a pool of Engine objects, each pre-loaded with the necessary scripts, because it is the instantiation and setup that takes the longest amount of time.

So, to conclude, it is indeed possible to write algorithms once and run them in both Javascript and Java environments, but unlike Java and its "write once, run anywhere" slogan, the algorithms are written in Javascript.

All the code for this demo is available at Github/maxant/javascript-everywhere.

Copyright ©2015, Ant Kutschera

Social Bookmarks :  Add this post to Slashdot    Add this post to Digg    Add this post to Reddit    Add this post to Delicious    Add this post to Stumble it    Add this post to Google    Add this post to Technorati    Add this post to Bloglines    Add this post to Facebook    Add this post to Furl    Add this post to Windows Live    Add this post to Yahoo!