In this article I want to talk about an architectural pattern called Command Query Responsibility Segregation (CQRS). I will illustrate the pattern with a recent problem we had at the company I work for and how we solved it. The system in question is an automated payment processor that runs on a schedule through the night and collects payments from our customers. The CQRS pattern was first described by Greg Young, and Martin Fowler has written about it on his blog.
The concepts behind CQRS are very simple, but they give you some powerful advantages. What CQRS essentially does is get you to use a different model to update information (COMMAND) than the model you use to read information (QUERY). If you think of a typical layered application, you may have a client, a user interface or a background process that communicates with a business logic layer, which may consist of a static set of libraries, Web Services or REST based services.
This business logic layer acts as a model of your domain that lets you perform operations against that model. These operations generally allow you to create, retrieve, update, or delete data associated with the model. For most scenarios this age-old way of working is fine, and that is still true nowadays. However, it can cause you problems for systems that need higher performance and better scalability.
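To make the split between the two models concrete, here is a minimal sketch in Python. It is not taken from our system: the class names (`TakePaymentCommand`, `PaymentCommandHandler`, `PaymentQueryService`, `PaymentStore`) and the in-memory store are hypothetical and only show the shape of the idea, with one model that changes state and a separate one that only reads it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TakePaymentCommand:
    """Command: a request to change state. Handling it returns no data."""
    customer_id: int
    amount_pence: int


@dataclass(frozen=True)
class PaymentSummary:
    """Read model: shaped for reporting, not for updates."""
    customer_id: int
    total_paid_pence: int
    payment_count: int


class PaymentStore:
    """Hypothetical in-memory store standing in for a real database."""

    def __init__(self) -> None:
        self.payments = []  # list of (customer_id, amount_pence) tuples


class PaymentCommandHandler:
    """Write side: validates commands and applies them to the store."""

    def __init__(self, store: PaymentStore) -> None:
        self._store = store

    def handle(self, command: TakePaymentCommand) -> None:
        if command.amount_pence <= 0:
            raise ValueError("amount must be positive")
        self._store.payments.append((command.customer_id, command.amount_pence))


class PaymentQueryService:
    """Read side: answers questions and never modifies the store."""

    def __init__(self, store: PaymentStore) -> None:
        self._store = store

    def summary_for(self, customer_id: int) -> PaymentSummary:
        amounts = [a for cid, a in self._store.payments if cid == customer_id]
        return PaymentSummary(customer_id, sum(amounts), len(amounts))
```

In this sketch both sides share one store for brevity. Nothing stops the read side from using its own store that is updated separately.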
To illustrate this pattern I will discuss a payment system developed by two engineers on my team, Graham Johnson and Hugh Hulme. For the purposes of this article we will call it the Automated Payment System (APS); the actual system had a different name. The system would run through the night and collect payments that were due from our customers. The initial system was completely synchronous and would execute in the following steps:
1. APS sends a payment request to a payment gateway service (this was our service interface to a 3rd party payment provider).
2. The payment gateway decrypts the card/customer data and sends a payment request to the 3rd party.
3. The 3rd party attempts to take the payment and returns a result (success, decline, refer, etc.) to the payment gateway.
4. The payment gateway then returns to APS.
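The four steps can be sketched roughly as follows. This is an illustration only, not the original code: `payment_gateway` and `third_party_take_payment` are hypothetical stand-ins, the sleep mimics network latency, and the decline rule is invented so the sketch has more than one outcome.

```python
import time


def third_party_take_payment(customer_id: int, latency_s: float) -> str:
    """Hypothetical stand-in for the 3rd party; sleeps to mimic latency."""
    time.sleep(latency_s)
    # Invented rule so that some payments decline in this sketch.
    return "decline" if customer_id % 5 == 0 else "success"


def payment_gateway(customer_id: int, latency_s: float) -> str:
    """Hypothetical gateway: would decrypt card data, then call the 3rd party."""
    return third_party_take_payment(customer_id, latency_s)


def run_synchronous(customer_ids: list[int], latency_s: float = 0.01) -> dict[int, str]:
    """Attempt every payment in turn, waiting for each one to return."""
    results = {}
    for customer_id in customer_ids:
        # APS blocks here until the gateway returns, so the total run time
        # is roughly len(customer_ids) * latency_s.
        results[customer_id] = payment_gateway(customer_id, latency_s)
    return results
```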
The above process would happen for each payment attempt. This had a number of problems, the main one being that it was slow. The APS system was waiting for each payment to be attempted and to return from the 3rd party. You can speed up the system by either making it multi-threaded or multi-headed. Multi-heading is where the program runs in a single thread but you run multiple instances of the APS client. The client would then need to be clever enough to divvy out the work to each head without making multiple payment requests for the same customer.
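One simple way to divvy out the work between heads is to partition customers by their id, so that every customer belongs to exactly one head. The sketch below assumes integer customer ids and a fixed, known number of heads; `work_for_head` is a hypothetical helper, not how our clients actually did it.

```python
def work_for_head(customer_ids: list[int], head_index: int, head_count: int) -> list[int]:
    """Return the customers owned by this head.

    A customer belongs to the head whose index equals customer_id modulo
    head_count, so no customer can be picked up by two heads.
    """
    if not 0 <= head_index < head_count:
        raise ValueError("head_index must be in range(head_count)")
    return [cid for cid in customer_ids if cid % head_count == head_index]
```

Each head would be started with its own `head_index`. The drawback is visible in the signature: changing `head_count` means reconfiguring and restarting every head.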
There is nothing inherently wrong with the multi-heading process. A number of our systems have used this pattern and have all worked OK, but they are harder to scale and harder to manage. The other option is to use CQRS, and this is exactly what we did for one of our main payment systems.
The diagram above shows the same system but using the CQRS pattern. What this version does is decouple the posting of payment requests from the actual processing of the payments. The system doesn’t need to be synchronous. APS just needs to work out what payments to make and post them, and it doesn’t have to wait for a result from each payment request. The steps are as follows:
1. APS posts a payment request to a request queue.
2. The Payment Request Queue Processor takes the payment request from the queue.
3. The dequeued payment request is attempted by the 3rd party payment processor.
4. The 3rd party payment processor sends a response back to the Payment Request Queue Processor.
5. A payment result message is posted onto the Payment Complete Notification queue.
6. The payment result message is taken off the Payment Complete Notification queue and the results of a customer’s payment are written back into the system.
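The six steps can be sketched with two in-process queues and two worker threads. This is only an illustration under some assumptions: Python's `queue.Queue` stands in for a real queuing technology, a dictionary stands in for the database, and `third_party_take_payment` with its decline rule is hypothetical.

```python
import queue
import threading

STOP = object()  # sentinel that tells a worker to shut down


def third_party_take_payment(customer_id: int) -> str:
    """Hypothetical stand-in for the 3rd party payment processor."""
    return "decline" if customer_id % 5 == 0 else "success"


def request_queue_processor(requests: queue.Queue, notifications: queue.Queue) -> None:
    """Steps 2 to 5: dequeue a request, attempt it, post the result."""
    while True:
        customer_id = requests.get()
        if customer_id is STOP:
            notifications.put(STOP)  # pass the shutdown signal downstream
            break
        notifications.put((customer_id, third_party_take_payment(customer_id)))


def result_writer(notifications: queue.Queue, database: dict) -> None:
    """Step 6: take results off the notification queue and record them."""
    while True:
        message = notifications.get()
        if message is STOP:
            break
        customer_id, outcome = message
        database[customer_id] = outcome


def run_aps(customer_ids: list[int]) -> dict:
    requests, notifications, database = queue.Queue(), queue.Queue(), {}
    workers = [
        threading.Thread(target=request_queue_processor, args=(requests, notifications)),
        threading.Thread(target=result_writer, args=(notifications, database)),
    ]
    for worker in workers:
        worker.start()
    for customer_id in customer_ids:
        requests.put(customer_id)  # step 1: post and move on, no waiting
    requests.put(STOP)
    for worker in workers:
        worker.join()  # only so the sketch can return the final state
    return database
```

The `join` calls exist purely so the example can return a result. In the real design APS is finished as soon as the last request has been posted.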
The box on the left of the picture, the Automated Payment System, now has a reduced set of responsibilities. All it has to do is determine what work needs to be carried out and post the payments to a queue. It doesn’t care how the payment is taken; as soon as it has posted the payment its work is finished.
You can use different technologies to represent the queues. I work on a Microsoft platform, so my choices include a queuing technology like MSMQ or RabbitMQ. I could also use a file backed store where I write request files into a disk location and then process them in “date created” order, or I could use a database table as a queue. This could be done in a more traditional SQL database like SQL Server or Oracle, or in a NoSQL database like RavenDB, MongoDB or Cassandra.
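As an illustration of the database table option, here is a minimal sketch using SQLite from Python's standard library. The table name, columns and status values are assumptions made for the example, and the sketch assumes a single consumer. With several consumers you would need proper locking so that two of them cannot claim the same row.

```python
import sqlite3


def create_queue(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) queue table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS payment_requests ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " customer_id INTEGER NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending')"
    )
    conn.commit()


def enqueue(conn: sqlite3.Connection, customer_id: int) -> None:
    """Post a payment request onto the queue table."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO payment_requests (customer_id) VALUES (?)", (customer_id,)
        )


def dequeue(conn: sqlite3.Connection):
    """Claim the oldest pending request, or return None if there is none."""
    with conn:
        row = conn.execute(
            "SELECT id, customer_id FROM payment_requests"
            " WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute(
            "UPDATE payment_requests SET status = 'processing' WHERE id = ?",
            (row[0],),
        )
        return row[1]
```

Ordering by the auto-incrementing `id` gives first in, first out behaviour, which plays the same role as processing files in creation order.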
The dotted box in the middle of the picture is a separate system that waits for messages on the queue and processes them as they come in. In our system this is implemented as a Windows service that listens on the queue. If anything were to happen to that service, for example if it crashed, the data would still be preserved on the queue. As soon as you restart the service, it will carry on processing.
At steps 5 and 6, where the response from the payment is posted onto a queue, you have two options. You can have the APS system itself pull the message from the queue and write the payments back into the database, or you can have the dotted box pull the message from the queue and write it back into the database (as illustrated in the diagram above), which contains the whole processing and response interaction within the Windows service. With the second option the APS system only cares about posting the payments and doesn’t worry about processing the responses; it can just assume that this will happen elsewhere.
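The restart behaviour is easy to demonstrate with a file backed queue. In the sketch below a request file is deleted only after it has been handled, so anything not yet processed survives a crash. The helper names are hypothetical, and I use zero-padded sequence numbers in the file names as the ordering, which is an assumption made for portability rather than reading file creation dates.

```python
import json
import tempfile
from pathlib import Path


def post_request(queue_dir: Path, sequence: int, customer_id: int) -> None:
    """Write one request file; the zero-padded sequence gives a stable order."""
    path = queue_dir / f"{sequence:08d}.json"
    path.write_text(json.dumps({"customer_id": customer_id}))


def process_pending(queue_dir: Path, handle, max_messages=None) -> int:
    """Process request files oldest first, deleting each only after handling."""
    processed = 0
    for path in sorted(queue_dir.glob("*.json")):
        if max_messages is not None and processed >= max_messages:
            break  # simulates the service stopping part-way through
        handle(json.loads(path.read_text()))
        path.unlink()  # remove the file only once handling has succeeded
        processed += 1
    return processed


def demo() -> list[int]:
    seen = []
    with tempfile.TemporaryDirectory() as tmp:
        queue_dir = Path(tmp)
        for sequence, customer_id in enumerate([11, 12, 13]):
            post_request(queue_dir, sequence, customer_id)
        # First run "crashes" after one message; two files stay on disk.
        process_pending(queue_dir, lambda m: seen.append(m["customer_id"]), max_messages=1)
        # After the restart the remaining requests are picked up in order.
        process_pending(queue_dir, lambda m: seen.append(m["customer_id"]))
    return seen
```

One caveat with this approach: if the service dies after handling a request but before deleting its file, that request is handled again on restart, so the handler needs to cope with seeing the same payment request twice.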
Scaling the System
The design just discussed has just one subsystem processing the payment requests. This may be fine depending on the load you are putting through it, but eventually you will want to scale the system out to handle a higher load. The options I discuss below are:
- Single Endpoint, Single Set of Queues with Multiple Processors
- Multiple Endpoints with Queues, and Processor
Single Endpoint, Single Set of Queues with Multiple Processors
The first example of scaling the solution maintains the single Request Queue and Payment Complete Notification Queue, but has multiple processors feeding from the queues. In the original example there was one subcomponent (represented with the circular arrows) that would pull requests off the queue, make the payment, and place the response notifications on the Payment Complete Notification Queue.
As load on the system increases and the individual processors reach capacity you can add additional processors to feed from the Request Queue. This gives you good horizontal scaling where you can add additional payment processors, but your queues are a single point of failure. If you are using transactional queues that are persisted then this may be fine for your own circumstances. If you really want to go the extra mile you can look at an alternative, but it adds a lot of additional complexity.
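Here is a minimal sketch of several processors feeding from one Request Queue, again using Python's in-process `queue.Queue` as a stand-in for a real queue. The function names and the decline rule are hypothetical. The point it shows is that each request is taken by exactly one processor, and adding capacity just means raising `processor_count`.

```python
import queue
import threading


def take_payment(customer_id: int) -> str:
    """Hypothetical stand-in for the call to the 3rd party."""
    return "decline" if customer_id % 5 == 0 else "success"


def processor(name: str, requests: queue.Queue, notifications: queue.Queue) -> None:
    """Pull requests until told to stop, posting a result for each one."""
    while True:
        customer_id = requests.get()
        if customer_id is None:  # None is the shutdown signal in this sketch
            break
        notifications.put((name, customer_id, take_payment(customer_id)))


def run_with_processors(customer_ids: list[int], processor_count: int) -> list[tuple]:
    requests, notifications = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=processor, args=(f"processor-{i}", requests, notifications))
        for i in range(processor_count)
    ]
    for thread in threads:
        thread.start()
    for customer_id in customer_ids:
        requests.put(customer_id)
    for _ in threads:
        requests.put(None)  # one shutdown signal per processor
    for thread in threads:
        thread.join()
    results = []
    while not notifications.empty():
        results.append(notifications.get())
    return results
```

Which processor handles which request is not deterministic, so the order of the results can differ from run to run.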
Multiple Endpoints with Queues, and Processor
If you do not want to have the single point of failure with the queues, then another option is to encapsulate the processing subsystem behind a service, so that each service endpoint has its own set of Request Queues and Payment Complete Notification queues. You would then place the entry endpoint to the service behind a load balancer as illustrated below.
As the APS payment system makes payment requests, the load balancer will distribute the calls between the different endpoints. Each service endpoint, which would sit on a separate server, would maintain its own instance of the queues and its own Payment Request Queue Processors. This removes the single point of failure that comes from having single queue instances shared by multiple Payment Request Queue Processors. It is a more resilient solution overall, but it does add more complexity, so you need to decide how important the separate queues are to your own particular needs.
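The distribution of calls can be sketched with a simple round-robin balancer. This is only a model of the idea: `Endpoint` and `RoundRobinBalancer` are hypothetical classes, a `deque` stands in for each endpoint's Request Queue, and a real load balancer would usually be a separate piece of infrastructure with health checks rather than code inside APS.

```python
import itertools
from collections import deque


class Endpoint:
    """Hypothetical service endpoint that owns its own request queue."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.request_queue = deque()

    def post(self, customer_id: int) -> None:
        self.request_queue.append(customer_id)


class RoundRobinBalancer:
    """Hands each incoming request to the next endpoint in turn."""

    def __init__(self, endpoints: list) -> None:
        self._cycle = itertools.cycle(endpoints)

    def post(self, customer_id: int) -> str:
        endpoint = next(self._cycle)
        endpoint.post(customer_id)
        return endpoint.name  # returned only so the routing can be inspected
```

Because each endpoint has its own queue, losing one server only affects the requests already sitting on that server's queue.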
This article has discussed an application of the Command Query Responsibility Segregation (CQRS) architectural pattern using an automated payment processor as an example. This is just one use of the pattern. You can use it anywhere you need to decouple the request of a system (COMMAND) from the processing of that request (QUERY). Another example is a system where customers apply for a product, such as a loan. You have a website or point of sale system that captures application details from a customer and places the application in a product application queue. Separate product application processing systems then process the applications. This makes your systems very easy to scale horizontally, as you just need to add extra request processors to cope with the load.