Tag: Architecture

Eleven Patterns, Problems & Solutions related to Microservices and in particular Distributed Architectures

The wish to fulfil certain system quality attributes leads us to choose microservice architectures, which by their very nature are distributed, meaning that calls between the services are effectively calls to remote processes. Even without considering microservices, we distribute software in order to meet scalability and availability requirements, which also causes calls to be remote. By choosing to support such quality attributes, we automatically trade off others, typically data consistency (and isolation), because of the CAP theorem. In order to compensate for that, so that data eventually becomes consistent, we introduce mechanisms to retry remote calls in cases where they fail, for example using the transactional outbox pattern (based either on CDC to tail changes to the command table, or on a polling mechanism). This in turn forces us to make our interfaces idempotent, so that duplicate calls do not result in unwanted side effects. Retry mechanisms can also cause problems related to the ordering of calls, for example if the basis for a call has changed, because while the system was waiting to recover before retrying the call, another call from a different context was successful. Ordering problems might also occur simply because we add concurrency to the system to improve performance. Data replication is another mechanism that we might choose in order to increase availability or performance, so that the microservice at hand does not need to make a call downstream, and that downstream service does not even need to be online at the same time. Replication can however…
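To make the point about idempotent interfaces concrete, here is a minimal sketch of a receiver that tolerates retried calls by remembering which message IDs it has already processed. The class name, the `credit` method and the idea of a caller-supplied message ID are assumptions made for illustration; they are not taken from the article.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical receiver that ignores duplicate deliveries of the same message. */
class IdempotentReceiver {
    // IDs of messages whose effect has already been applied
    private final Set<String> processedIds = new HashSet<>();
    private long balance = 0;

    /** Applies the credit once per message ID; returns false for a duplicate. */
    synchronized boolean credit(String messageId, long amount) {
        if (!processedIds.add(messageId)) {
            return false; // a retry of a call that already succeeded: no side effect
        }
        balance += amount;
        return true;
    }

    synchronized long balance() {
        return balance;
    }
}
```

In a real service the set of processed IDs would presumably be stored in the same transaction as the state change, so that a crash between the two cannot lose or repeat the effect.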

Playing with Cache Performance

My current client has a service which connects to an old IBM z/OS application (legacy system). The data centre charges for each message sent to this legacy system, rather than using a processor or hardware based pricing model. The output from this legacy system is always the same for a given input, since the calculations are idempotent. The application calculates prices for travelling along a given route of the train network. Prices are changed only twice a year through an administration tool. So in order to save money (a hundred thousand dollars a year), the service which connects to this legacy system had an in-memory least-recently-used (LRU) cache built into it, which removes the least recently used entries when it gets full in order to make space for new entries. The cache is quite large, thus avoiding making costly calls to the legacy system. To avoid losing the cache content upon server restarts, a background task was later built to periodically persist the latest data inside this cache. Upon starting a server, the cache content is re-read. Entries within the cache have a TTL (time to live) so that stale entries are discarded and re-fetched from the legacy system. This cache was great in the beginning because it saved the customer a lot of money, but in the meantime several similar caches have been added, as well as more general caches for avoiding repeated database reads, causing our nodes to need over 1.5 GB of RAM. Analysis has shown that the caches are…
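The combination of LRU eviction and a TTL described above can be sketched in a few lines of Java. This is only an illustration and not the client's implementation: the class name, the `loader` function standing in for the call to the legacy system, and the explicit `nowMillis` parameter (used so the behaviour is easy to test) are all assumptions. The sketch is not thread-safe and does not cover persisting the cache across restarts.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical LRU cache whose entries also expire after a fixed TTL. */
class LruTtlCache<K, V> {
    /** A cached value together with the time at which it becomes stale. */
    private static final class CacheSlot<T> {
        final T value;
        final long expiresAtMillis;
        CacheSlot(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final long ttlMillis;
    private final Map<K, CacheSlot<V>> slots;

    LruTtlCache(final int maxEntries, long ttlMillis) {
        this.ttlMillis = ttlMillis;
        // accessOrder=true keeps the least recently used entry first
        this.slots = new LinkedHashMap<K, CacheSlot<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, CacheSlot<V>> eldest) {
                return size() > maxEntries; // evict the LRU entry when full
            }
        };
    }

    /** Returns the cached value, calling the loader on a miss or a stale entry. */
    V get(K key, long nowMillis, Function<K, V> loader) {
        CacheSlot<V> slot = slots.get(key);
        if (slot != null && slot.expiresAtMillis > nowMillis) {
            return slot.value;
        }
        V fresh = loader.apply(key); // the costly call to the backing system
        slots.put(key, new CacheSlot<V>(fresh, nowMillis + ttlMillis));
        return fresh;
    }

    int size() {
        return slots.size();
    }
}
```

A caller would typically pass `System.currentTimeMillis()` as `nowMillis`, and a loader that sends the message to the legacy system.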

DCI and Services (EJB)

Data, Context and Interaction (DCI) is a way to improve the readability of object oriented code. But it has nothing specific to say about things like transactions, security, resources, concurrency, scalability, reliability, or other such concerns. Services, in terms of stateless EJBs or Spring Services, have a lot to say about such concerns, and indeed allow the cross-cutting concerns like transactions and security to be configured outside of the code using annotations or deployment descriptors (XML configuration), letting programmers concentrate on business code. The code they write contains very little code related to transactions or security. The other concerns are handled by the container in which the services live. Services, however, constrain developers to think in terms of higher order objects which deliver functionality. Object orientation (OO) and DCI let programmers program in terms of objects; DCI more so than OO. In those cases where the programmer wants to have objects with behaviour, rather than passing the object to a service, they can use DCI. If objects are to provide rich behaviour (i.e. behaviour which relies on security, transactions or resources), then they need to somehow be combined with services, which naturally do this and do it with the help of a container, so that the programmer does not need to write low level boilerplate code, for example to start and commit transactions. The idea here is to explore combining a service solution with a DCI solution. Comparing SOA to DCI, like I did in my white paper, shows…

DCI Plugin for Eclipse

The Data, Context, and Interaction (DCI) architecture paradigm introduces the idea of thinking in terms of roles and contexts. See some of my white papers for a more detailed introduction into DCI, but for this blog article, consider the following example: a human could be modelled in object oriented programming by creating a super huge massive class which encapsulates all of a human's attributes, their behaviours, etc. You would probably end up with something much too complex to be really maintainable. Think about when a human becomes a clown for a kids' party; most of that behaviour has little to do with being a programmer, which is a different role which the human could play. So, DCI looks at the problem differently than OOP and solves it by letting the data class be good at being a data class, and putting the behaviours specific to certain roles into "roles", which in my DCI Tools for Java library are classes. Certain roles interact with each other within a given context, such as a clown entertaining kids at a birthday party. The roles which belong to an interaction are part of the context, and in DCI the context is a class which puts data objects into specific roles, and makes them interact. The context and its roles form the encapsulation of the behaviour. I have updated my library, so that there are two new annotations, namely the @Context and @Role annotations. The @Context annotation is simply a marker to show that a class…
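The clown example can be sketched in plain Java, without the DCI Tools for Java library or its @Context and @Role annotations. All class and method names below are hypothetical, and the roles are simple wrappers around the data object, which is only one of several ways to approximate DCI roles in Java.

```java
import java.util.ArrayList;
import java.util.List;

/** Data class: knows what a human is, not what a human does in a role. */
class Human {
    final String name;
    Human(String name) { this.name = name; }
}

/** Role: behaviour a human has only while being a kid at the party. */
class KidRole {
    private final Human self;
    KidRole(Human self) { this.self = self; }
    String laughAt(String trick) {
        return self.name + " laughs at the " + trick;
    }
}

/** Role: behaviour a human has only while playing the clown. */
class ClownRole {
    private final Human self;
    ClownRole(Human self) { this.self = self; }
    String entertain(KidRole kid) {
        return self.name + " juggles; " + kid.laughAt("juggling");
    }
}

/** Context: casts data objects into roles and makes the roles interact. */
class BirthdayPartyContext {
    private final ClownRole clown;
    private final List<KidRole> kids = new ArrayList<>();

    BirthdayPartyContext(Human entertainer, List<Human> guests) {
        this.clown = new ClownRole(entertainer);
        for (Human guest : guests) {
            kids.add(new KidRole(guest));
        }
    }

    /** Runs the interaction and returns a description of what happened. */
    List<String> run() {
        List<String> events = new ArrayList<>();
        for (KidRole kid : kids) {
            events.add(clown.entertain(kid));
        }
        return events;
    }
}
```

The same `Human` object could be cast into a different role by a different context, while the `Human` class itself stays free of role-specific behaviour.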

User Mental Models

I've spent the last 6 weeks looking into an interesting paradigm called Data, Context & Interaction (DCI). I've written a few introductory papers, and some tools too. DCI has the following goals: To improve the readability of object-oriented code by giving system behavior first-class status; To cleanly separate code for rapidly changing system behavior (what the system does) from that for slowly changing domain knowledge (what the system is), instead of combining both in one class interface; To help programmers reason about system-level state and behavior instead of only object state and behavior; To support an object style of thinking that is close to people's mental models, rather than the class style of thinking that overshadowed object thinking early in the history of object-oriented programming languages. The problem recognised by this paradigm is that system behaviour gets fragmented in traditional OOP, over a large number of classes and sub-classes. The code directly related to a use-case gets fragmented and "lost". The boundaries for the behaviour end up being class boundaries and pieces of behaviour are stuck with the data on which they operate, so that the code has high cohesion. If you want to read the code related to a use-case, you will struggle to extract it and understand it. I agree with these problems, but have not really encountered them personally for a long time, because I do most of my work in the SOA paradigm. When I first read these goals and all the articles I could…

How Brontosaurs kill Raptors

The board of a large organisation appoints a CIO who delegates technical decisions to his employees. In the Enterprise Architecture (EA) department, an expert on Business Intelligence (BI) decides that the strategy for this company is to have a single Enterprise Data Warehouse (EDW). This makes sense, because experience has shown that in the long run, it is cheaper and more efficient to build one Data Warehouse, rather than start by building "islands" and then later try to merge them together. In the meantime, some business people working for the company have rallied the budget holders, and started a project to build a new online sales application. They need to generate some reports based on the sales from this application, in order to set themselves targets and measure their performance. But they are not experts when it comes to reporting. The business sector at which their application is aimed is cutting edge and brand new, so they don't know what to expect, they don't really know what kind of reports they will need, and they will certainly need to adapt to the market very quickly. A project team in the IT department then get the assignment to build the application, including the report generation. Also knowing little about reporting, they start talking to EA, who set up a meeting between themselves, the development team, some people from the Quality Assurance (QA) department and some experts from the BI team. Hours of discussion later, they have reached the following conclusions:…

Taking Advantage of Parallelism

A while ago some colleagues attended a lecture where the presenter introduced the idea that applications may not take full advantage of the multi-core servers which are available today. The idea was that if you have two cores but a process which is running on a single thread, then all the work is done on one single core. Application servers help in this respect, because they handle multiple incoming requests simultaneously, by starting a new thread for each request. So if the server has two cores it can really handle two requests simultaneously, or if it has 6 cores, it can handle 6 requests simultaneously. So multi-core CPUs can help the performance of your server if you have multiple simultaneous requests, which is often the case when your server is running near its limit. But it's not often the case that you want your servers running close to the limit, so you typically scale out, by adding more nodes to your server cluster, which has a similar effect to adding cores to the CPU (you can continue to handle multiple requests simultaneously). So once you have scaled up by adding more cores, and scaled out by adding more servers, how can you improve performance? Some processes can be designed to be non-serial, especially in enterprise scenarios. The Wikipedia article on multi-core processors talks about this. Imagine a process which gathers data from multiple systems while preparing the data which it responds with. An example would be a pricing system. Imagine…
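A process that gathers data from several systems can be made non-serial by starting all the lookups before waiting for any of them. The following is a minimal sketch using `CompletableFuture` from the Java standard library; the class name, the `sumInParallel` method and the three constant "sources" standing in for remote systems are assumptions made for illustration, not the pricing system from the article.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;
import java.util.stream.Collectors;

class ParallelGather {
    /** Starts every source on the pool, then waits for all results and sums them. */
    static int sumInParallel(List<Supplier<Integer>> sources, ExecutorService pool) {
        // Submit all tasks first, so that they can run concurrently
        List<CompletableFuture<Integer>> futures = sources.stream()
                .map(source -> CompletableFuture.supplyAsync(source, pool))
                .collect(Collectors.toList());
        // Only now block, collecting each result as it becomes available
        return futures.stream().mapToInt(CompletableFuture::join).sum();
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            // Hypothetical price components, each of which could be a remote call
            List<Supplier<Integer>> sources =
                    Arrays.asList(() -> 100, () -> 20, () -> 5);
            System.out.println(sumInParallel(sources, pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

With real remote calls, the elapsed time of the gather step would approach that of the slowest source rather than the sum of all of them, provided the pool has enough threads.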

Houses of Cards

I have often heard of software systems being compared to a house of cards, meaning that they were poorly built and are ready to topple at any time. The system I am currently helping to maintain has from time to time also been labeled in that way, yet it always manages to go live with quite a few new functions, and apart from the odd all-nighter to fix a few last bugs, it works to the customer's satisfaction. So while I was doing the washing up today, I noticed something. I am well known amongst family and ex-house mates for building what look like unstable piles of washed up dishes, as they dry. Yet I have never ever had a pile collapse on me. Never once have I lost it all. And remember, there are slippery suds involved in holding up these piles! As I thought a little more about it, it didn't take long to come to the simple conclusion that a pile of washing up is stronger than a house of cards. Cards are uniformly shaped and have no edges or surfaces which help to lock their neighbours in place. Plates, dishes, cups, pans and cutlery on the other hand can be placed tactically so that they lock together forming a strong structure. Notice in the picture below that the heavy pan is placed at the top! It still looks ugly though. But as long as it's just left to dry it will be safe. Just like our…

Idempotency and Two Phase Commit

The requirement for services to be idempotent is often stated as being important for enterprise applications. But what does that mean, and why? Idempotent means that a service can be called multiple times with the same data, and the result will always be the same. For example, if a service call results in a value being written to a database, the same service call made again would result in the same value being written to the database. As such, additive processes where values are incremented cannot be idempotent; for example, an insert statement in a database is not idempotent, whereas an update statement usually is. Imagine the case of purchasing a ticket from a web service offering airline tickets. The process probably includes getting an offer to see the price and tariff, reserving an instance of that ticket and finally when the shopping cart is full, confirming that ticket by booking it. Getting an offer would be an idempotent call, since we are just effectively reading data, not writing it. Reserving the ticket cannot be idempotent because each call should result in an individual seat being temporarily reserved - you don't want to reserve the same seat for two passengers. However, should such a reserved ticket not be booked, a background process would need to cancel the temporary reservation, so idempotency is effectively achieved. In the final call, to book a reservation (to guarantee the seat), the call should be idempotent - setting the status of the ticket to "booked"…
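The difference between setting a value and incrementing one can be shown with a small in-memory sketch. The `TicketStore` class, its methods and the attempt counter are hypothetical and exist only to contrast the two kinds of call; they are not the airline service described above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory store contrasting idempotent and additive calls. */
class TicketStore {
    enum Status { RESERVED, BOOKED }

    private final Map<String, Status> statusByTicket = new HashMap<>();
    private int bookingAttempts = 0;

    void reserve(String ticketId) {
        statusByTicket.put(ticketId, Status.RESERVED);
    }

    /** Idempotent: writes an absolute value, so repeating the call changes nothing. */
    Status book(String ticketId) {
        if (!statusByTicket.containsKey(ticketId)) {
            throw new IllegalArgumentException("unknown ticket " + ticketId);
        }
        statusByTicket.put(ticketId, Status.BOOKED);
        return statusByTicket.get(ticketId);
    }

    /** Not idempotent: additive, so every repeated call leaves a different state. */
    int recordAttempt() {
        bookingAttempts = bookingAttempts + 1;
        return bookingAttempts;
    }
}
```

Calling `book` twice for the same ticket leaves the store in the same state as calling it once, whereas calling `recordAttempt` twice does not, which is why a retried call is only safe against the former.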

Building a Webmail Solution on top of Apache James Mail Server

Part of maxant's offering to small businesses is email hosting. As well as standard POP3/SMTP access, maxant offers webmail access. A quick search on the web shows that there are several open source webmail solutions available. The problem with all of them is that they communicate with the email server through the SMTP protocol. For example, if you wish to preview a list of emails, the web application needs to access the email server and ask for details of each email (while leaving them on the email server, so they can be downloaded at a later time via POP3). Reading all the emails is inefficient and the larger the number of emails in your inbox, the longer it takes to just see a list of emails. The solution built by maxant is based on the Java Mail API from Sun. This API lets you access individual emails in your inbox using an ID. But Apache James Mail Server (James for short) doesn't maintain the index if a new mail is put in the inbox, so if you have a list of all emails and decide to access one, and in the meantime you have received email, the chances are that you won't be able to read that email! The next problem is how to deal with keeping a copy of sent emails for your "sent items" folder. If you just use the Java Mail API, the only solution for getting a mail into your email server so that it…
