Graphical User Interfaces (GUIs) tend to be event driven. A user performs an action and the GUI sends that event to any interested components through a Model-View-Controller mechanism. When the events being sent represent fine-grained actions (for example, a single field on a form changed, as opposed to coarse-grained events like the form being submitted), the performance of the GUI can become an issue. Other situations in which GUI performance might become an issue are:
- when the model in the GUI is large,
- when the GUI needs to process the data a lot in preparation for displaying it (either client or server side),
- when the GUI is poorly implemented and duplicate listeners exist, meaning that components are refreshed multiple times unnecessarily.
The following strategy has been used to help improve GUI performance. This list is ordered with the easiest / most important tasks (least effort, best improvement) at the top:
- ensure a clean MVC implementation, without duplicate listeners,
- add data caches,
- optimise caches (e.g. hold more than one object graph for the same data, one deep and one shallow),
- optimise refreshing of non-active elements.
1) Clean MVC Implementation
First of all, read the MVC article on this blog. Then, using a profiler or debugger, follow your controller as it fires model change events. Are any model listeners fired more than once for any reason? If they are, that is less than optimal and you first need to look for mistakes in the programming logic. If there really is a good reason that the event is fired for a component multiple times, look at getting the component to cache events and process them all on the last event. This can get messy however, and you need to ask yourself whether the event model is coarse grained enough. Why does a component get multiple events in the first place?
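One cheap guard against duplicate listeners is to make double registration harmless in the model itself. The following is only a minimal sketch: `ModelListener`, `SalesModel` and `CountingListener` are hypothetical names invented for illustration, not classes from the application described here. The counting listener is the kind of helper you might use alongside a debugger to see how often a component is notified.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical listener interface; the name is illustrative only.
interface ModelListener {
    void modelChanged(String event);
}

// Hypothetical model that keeps its listeners in a set, so registering
// the same listener instance twice has no effect.
class SalesModel {
    private final Set<ModelListener> listeners = new LinkedHashSet<>();

    // Returns false if the listener was already registered.
    boolean addListener(ModelListener listener) {
        return listeners.add(listener);
    }

    void fire(String event) {
        // Iterate over a copy so listeners may register or deregister
        // themselves during a callback.
        for (ModelListener listener : new ArrayList<>(listeners)) {
            listener.modelChanged(event);
        }
    }
}

// Counts notifications, which makes duplicate firing easy to spot.
class CountingListener implements ModelListener {
    int count;

    @Override
    public void modelChanged(String event) {
        count++;
    }
}
```

This only hides the symptom of a double registration. If a component really is registered twice, the programming logic that does so is still worth finding and fixing.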
2) Add Data Caches
In a good implementation you should be using design patterns such as the Business Delegate. This pattern hides the details of where data is retrieved from. It might come from a service or maybe directly from a database. Its client doesn't actually care. So this is the ideal place to start caching your model data, if not directly within the model (if you have one central model for your entire GUI). Either way, the aim of a cache is to only retrieve data from a slow method call if it's really required. For example, the user saves a sales order. In order to get updated stock levels, you probably need to call a service which will take some time. In this case, as soon as the sales order is saved, the stock cache becomes dirty, meaning that it needs to be updated before it can be used safely. On the other hand, if a user adds a customer to the system, this has no effect on the stock levels, so there is no need to reload the stock cache. It is not marked as dirty.
Since this cache relies on listening to the model to determine whether it is dirty, it needs to be client side. Whether it is stored in the business delegate or separately is the designer's choice.
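The dirty-flag idea can be sketched roughly as follows. All names here (`DirtyCache`, `StockDelegate`, `onSalesOrderSaved`, `onCustomerAdded`) are assumptions made up for the example, and the slow service call is simulated by a method that just counts how often it is invoked.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Generic cache that only calls its (slow) loader when marked dirty.
class DirtyCache<T> {
    private final Supplier<T> loader;
    private T value;
    private boolean dirty = true; // nothing loaded yet

    DirtyCache(Supplier<T> loader) {
        this.loader = loader;
    }

    void markDirty() {
        dirty = true;
    }

    T get() {
        if (dirty) {
            value = loader.get(); // the slow call happens only here
            dirty = false;
        }
        return value;
    }
}

// Hypothetical business delegate that hides where stock data comes from.
class StockDelegate {
    int serviceCalls; // counts simulated service calls, for illustration

    private final DirtyCache<List<String>> stockCache =
            new DirtyCache<>(this::loadStockFromService);

    // Stands in for a slow remote service call.
    private List<String> loadStockFromService() {
        serviceCalls++;
        return Arrays.asList("Book A", "Book B");
    }

    List<String> getStock() {
        return stockCache.get();
    }

    // Saving a sales order changes stock levels, so the cache is dirty.
    void onSalesOrderSaved() {
        stockCache.markDirty();
    }

    // Adding a customer does not affect stock, so the cache stays clean.
    void onCustomerAdded() {
        // intentionally empty
    }
}
```

In this sketch the reload is lazy: marking the cache dirty costs nothing, and the service is only called the next time somebody actually asks for the stock.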
3) Cache Optimisation
Taking the above example of stock and sales, consider the following example. A user wants to add an item to a sales order. To do so, they need a list of products to choose from. That list of products comes from a products cache, however it might be dependent on the stock list, if they can only sell things which are in stock. As soon as they add the item to the sales order, the stock cache becomes dirty, since stock has been removed.
A cache of the stock in a warehouse might hold a lot of details about the stock. For example, each stock item might have a reference to its supplier which might be a fully loaded supplier object containing all the details of that supplier. It would probably be better to look up the supplier in a supplier cache, but that in turn is poor for performance. So reloading this stock cache and all its deep object graph (all the supplier info) might take a long time.
As soon as the user wants to sell something else from stock, the system needs that stock cache to be reloaded, because it is now dirty from the previous sale. But as stated above, that will take a long time. So let's think about the data that is actually required. Do you need supplier information to sell something? Unlikely. So why load it all? A better option is to hold two caches. The first is the original, with references to supplier information. The second is a lightweight cache with a shallow object graph, meaning you simply have a list of stock items with as few details as required during sales. Loading the list of products that are in stock is now much quicker. The trade-off is that you hold the list of stock twice, but at least the second list is lightweight.
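A rough sketch of the two-cache idea follows. The class names (`SupplierDetails`, `StockItem`, `StockSummary`, `StockCaches`) and the hard-coded data are assumptions for illustration only. The load counters stand in for the slow service calls, so you can see that the sales path never triggers the expensive deep load.

```java
import java.util.ArrayList;
import java.util.List;

// Deep object graph: each stock item drags its supplier along.
class SupplierDetails {
    final String name;
    final String address;

    SupplierDetails(String name, String address) {
        this.name = name;
        this.address = address;
    }
}

class StockItem {
    final String product;
    final int quantity;
    final SupplierDetails supplier;

    StockItem(String product, int quantity, SupplierDetails supplier) {
        this.product = product;
        this.quantity = quantity;
        this.supplier = supplier;
    }
}

// Shallow object graph: only what a sale needs.
class StockSummary {
    final String product;
    final int quantity;

    StockSummary(String product, int quantity) {
        this.product = product;
        this.quantity = quantity;
    }
}

class StockCaches {
    private List<StockItem> full;      // null means dirty
    private List<StockSummary> light;  // null means dirty
    int fullLoads;
    int lightLoads;

    // A sale dirties both views of the stock.
    void markDirty() {
        full = null;
        light = null;
    }

    // Used while selling: cheap to reload.
    List<StockSummary> getLight() {
        if (light == null) {
            lightLoads++;
            light = new ArrayList<>();
            light.add(new StockSummary("Book A", 3)); // simulated quick query
        }
        return light;
    }

    // Used by screens that really need supplier details: expensive to reload.
    List<StockItem> getFull() {
        if (full == null) {
            fullLoads++;
            full = new ArrayList<>();
            full.add(new StockItem("Book A", 3,
                    new SupplierDetails("Some Supplier", "Some Street 1"))); // simulated slow query
        }
        return full;
    }
}
```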
4) Optimise Refreshing of Non-active Elements
Have a look at the GUI below:
Each tab is a different view of the overall model. Generally speaking, they all have a table which shows data in rows and a form which drills down to show the details of the selected row. But only the current tab needs to be refreshed if the relevant model change event comes along. Since all tabs inherit from a common class, that common class was changed to decide for itself if it would refresh straight away, or store the event and refresh later, when the view becomes active.
In fact, as well as remembering what to refresh and deferring until later, it also optimises what needs refreshing. First of all, it wouldn't refresh the table twice! Second, there are several levels of refreshing. First, the table can be refreshed without reloading data from the cache (if for example the objects in its model contain the changes that need showing, which is what happens when the user updates data in the form). Second, the table can be refreshed after reloading its model from the cache. Finally, as well as reloading the table, it can change the selected row and update the form. If an event for this final refresh level comes in, while the view is inactive, there is no need to complete refreshes for the other two levels, since this final level already incorporates the other two. Same goes for the second level, in that it already includes the first level. So the code works out what really needs refreshing and defers until the point where you really need that refresh, namely when the view is activated. The trade off is that activating a view (by selecting its tab) is slower, but overall far fewer cache reloads and table/form refreshes take place.
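The deferral logic described above might look something like this sketch. The names `RefreshLevel`, `DeferredRefreshView` and `RecordingView` are invented for the example and are not the actual classes from the application. The three levels are declared in increasing order, so a higher level is assumed to include all the work of the lower ones.

```java
import java.util.ArrayList;
import java.util.List;

// Ordered from cheapest to most complete; each level includes the previous ones.
enum RefreshLevel {
    TABLE_ONLY,          // redraw the table from objects already in its model
    RELOAD_MODEL,        // reload the table's model from the cache, then redraw
    RELOAD_AND_RESELECT  // reload, change the selected row and update the form
}

// Hypothetical common base class for all tabs.
abstract class DeferredRefreshView {
    private boolean active;
    private RefreshLevel pending; // null means nothing is pending

    // Called when a relevant model change event arrives.
    void onModelChanged(RefreshLevel level) {
        if (active) {
            refresh(level);
        } else if (pending == null || level.compareTo(pending) > 0) {
            // Remember only the highest level requested while inactive.
            pending = level;
        }
    }

    // Called when the tab is selected or deselected.
    void setActive(boolean nowActive) {
        active = nowActive;
        if (active && pending != null) {
            RefreshLevel level = pending;
            pending = null;
            refresh(level); // one refresh, however many events came in
        }
    }

    protected abstract void refresh(RefreshLevel level);
}

// Test double that records which refreshes actually ran.
class RecordingView extends DeferredRefreshView {
    final List<RefreshLevel> refreshes = new ArrayList<>();

    @Override
    protected void refresh(RefreshLevel level) {
        refreshes.add(level);
    }
}
```

With this shape, three events arriving at an inactive tab result in a single refresh at the highest requested level when the tab is next activated.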
This optimisation strategy was used on maxant BookStore. The results were a fully usable GUI that stores many thousand rows of data in memory. The GUI is quick and satisfying to use. Before the optimisations, the GUI was very painful to use indeed, and would easily have put off potential customers from buying the product.
A year ago I was the architect of a small project (Project A) building a Client / Server application, based on Eclipse SWT/JFace/RCP, Websphere, Oracle, JMS and an Object-Relational Mapping (ORM) tool, Hibernate. We were 3.5 developers on average, and we finished in 7 months. We used Feature Driven Development (FDD, similar to eXtreme Programming) as our implementation methodology.
This year I worked as a lead developer on a much larger project (Project B), which I joined in its second half. The project as a whole had nearly 50 people on it at times, although our part was again an Eclipse SWT/JFace/RCP Client using a Websphere Server with Oracle and a different ORM tool, Toplink. We were 5 developers on average, finishing in 9 months. Here a kind of waterfall based methodology was used.
In my spare time, I work on a tiny project, BookStore. Again an Eclipse SWT/JFace/RCP Client, but not using Websphere, and instead of Oracle it uses MySQL, although also with Hibernate. It's developed by just me. I do however track time spent on implementation, as well as metrics. The methodology used here is an extremely Agile one, where the documents are in the code (unless it's really major, then it gets its own document), and priority is given to bug fixing, then feature development, then product improvement. The aim is to be able to release a stable version at any time so that bugs can be fixed super quick. The product auto-updates from the web, to enable this.
Last week was interesting when we measured some generic metrics for Project B. I did this as part of every release for Project A and BookStore, since I find these things interesting.
A summary of the results is as follows:
|Metric Name|Project A|Project B|BookStore|
|Total Lines of Code||48,000||46,000||24,000|
|Approximate Man Months||24||38||2|
|Depth of Inheritance Tree||2.7||1.9||2.8|
|Number of Classes||416||223||247|
|Number of Methods per Class||7.3||10.3||6.3|
What does all this mean?
The Total Lines of Code, Approximate Man Months and Productivity (thousand lines of code per month) all help to give a feeling for the size and complexity of the projects. It would be interesting to count the number of features, but that is so subjective that there is no point in even starting. What is interesting is that Projects A and B produced a similar amount of code and, from my experience of both of them, they were of roughly the same complexity. However, the sheer size of Project B, in terms of other activities and applications (for example a web application, integration with systems of other companies, etc.), means that the number of lines of communication is greatly increased. Getting information in order to implement something was much, much harder. So that is a good explanation of why the project was so much less productive. Comparing Project A to BookStore is also interesting. Although BookStore was half the size, it was hugely more productive. Even if you scale its figures up and say that to get it to the same number of lines of code it takes 4 times longer, the productivity index is 6.0, three times higher than for Project A. The reasons are again simple, in that it's a single developer project. There are almost no lines of communication (except with Beta Testers and Customers), so no time is wasted hunting for the right answers. I think it is in Peopleware by Tom DeMarco and Timothy Lister, where they talk about 50% of your time being spent on non-software related activities, like phone calls and email. Well, for BookStore probably only 10% of time was spent on such activities, precisely because the number of lines of communication was so low. Sadly, it might be impossible to reduce the number of lines of communication on a project below a certain level, because of the stakeholders in the project.
If you integrate with external systems and other projects, you have no way to reduce those communications, beyond strategies like having one point of contact per technical area. But if that one point of contact needs time themselves to determine information, it might not help. Simply put, the communication network is hard to control, especially as a techie / non-senior manager.
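For reference, the productivity figures quoted above can be derived from the table, assuming productivity simply means thousand lines of code divided by approximate man months. The class and method names below are made up for this sketch. It reproduces the 2.0 for Project A and the 6.0 for BookStore when its 2 man months are multiplied by 4 to reach 48,000 lines.

```java
import java.util.Locale;

class Productivity {
    // Thousand lines of code (KLOC) produced per man month.
    static double klocPerManMonth(int linesOfCode, double manMonths) {
        return (linesOfCode / 1000.0) / manMonths;
    }

    public static void main(String[] args) {
        // Figures taken from the metrics table.
        print("Project A", klocPerManMonth(48_000, 24));
        print("Project B", klocPerManMonth(46_000, 38));
        print("BookStore", klocPerManMonth(24_000, 2));
        // BookStore scaled up pessimistically: 4 times the effort for
        // the same amount of code as Project A.
        print("BookStore (scaled)", klocPerManMonth(48_000, 2 * 4));
    }

    private static void print(String project, double productivity) {
        System.out.println(String.format(Locale.ROOT, "%s: %.1f", project, productivity));
    }
}
```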
The other metrics, such as Depth of Inheritance Tree, Abstractness, Number of Classes and Number of Methods per Class tell a different story. These relate much more to the design of the software, something that can be controlled by the technical team. My aim when designing software (which incidentally does not need to happen before you code, it can happen during coding, by means of agile refactoring) is NOT to have the perfect design... But I do believe in using design patterns and inheritance when it's useful. I might add it by refactoring, if I am unsure at the start whether it will be of use (see my article on DT, EMV and NPR revisited). Anyway, the results above show that when I am fully in charge (BookStore) a clean design results. When I am mostly in charge (Project A) a clean design results. When I have only a little influence, the design is less clean (shown not only by the figures, but also by code reviews conducted by myself). So what? Well, that might have something to do with the productivity. But it also has a lot to do with maintenance and future releases of the software. For Project B, an entire new team is taking over and it is unlikely to have major changes for 6 months after delivery, so a lot of knowledge will get lost. A clean design would be useful in this case, since ramp up time for new team members would be smaller. One day I might sell BookStore to a company who wants to take it further, and they would want a clean design to do that.
There are of course other factors at play here. Project B had a user interface that was required to be fully usable without a mouse. Implementing radio buttons within a table which are selectable with the space bar is rock hard, compared to just using an SWT or Swing radio button. And all that effort does not really show up as some great feature!
BookStore on the other hand shied away from anything difficult. It was built under the paradigm "Keep it simple, stupid". This was precisely because I know that complex requirements that add little usability are dangerous to productivity. Normally I recommend that my clients steer clear, but Project B was so big, and so politically charged, with a customer who didn't seem to really care about costs, that it was not possible to change the requirements in order to aid productivity.
I also like to think that agile projects, like Project A and BookStore, are more productive because they are agile. Fixing bugs as you find them, developer testing, quick releases to help testers retest bugs, etc. are all good things in modern software development. Refactoring as you go to ensure you have the ideal depth of design means that overall you keep your costs down - you don't spend unnecessarily on design at the start, but you do spend on it when required to keep the code maintainable for the long term.
For me, lessons learned from these results are summed up as follows. If you want to keep productivity high, and make the code maintainable in the long term, then:
- develop using an agile method,
- control requirements changes, to keep costs due to change at a minimum,
- refactor continuously to keep a clean design,
- throw out expensive requirements that give little or no return on investment,
- reduce lines of communication to a minimum - one point of contact per technical area,
- employ decision makers - people who can make good decisions reduce the need for changes later and messing about now.
OK, that last point seems to have come out of nowhere, but it's still valid :-) And all the others are nothing new if you have read a book about agile development. But the results from these three projects seem to back up what the literature says.
Incidentally, the metrics shown here were measured using an Eclipse plugin available from http://metrics.sourceforge.net. That site contains further information about these metrics and their meanings.