When building distributed systems, one of the core aspects to consider carefully is overall system performance. Scalability, reliability, high availability and security are equally important for any distributed system, but in my view performance should always be on the priority list when designing and implementing one.
So, if you are working on a distributed system or building multi-tenant, cloud-agnostic SaaS applications and are worried about your application's performance, here are 5 design patterns you should know about.
Let's get started and look, at a high level, at what these design patterns are and how they can help improve the performance of a distributed system.
Cache-aside pattern
One of the most common components used to improve performance in distributed systems is a cache. If every request hits the underlying data store to retrieve the same data again and again, it can be costly and hurt performance. Instead, it is a good idea to hold the data in a cache and return it from there when it is available. Retrieval from a cache is typically much faster than a fetch from the underlying data store.
With a caching layer, you also have to think about other factors, such as keeping the cached data consistent with the underlying data store, making effective use of cache memory, choosing an eviction policy for cached content, and so on.
One of the simplest patterns for improving read performance is the cache-aside pattern. First, check whether the data is available in the cache. If it is, return it directly from there. If not, retrieve the data from the underlying data store, add it to the cache, and then return it. Whenever the data is modified, you also have to evict the cached copy so that it is loaded on demand the next time it is requested.
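The read and write paths described above can be sketched in a few lines of Python. This is only a minimal illustration: the class name `CacheAsideRepository`, the `store_reads` counter and the plain dictionaries standing in for the cache and the data store are all assumptions made for the example, not part of any real caching library.

```python
from typing import Dict, Optional


class CacheAsideRepository:
    """Hypothetical repository: cache-aside reads, evict-on-write updates."""

    def __init__(self, store: Dict[str, str]) -> None:
        self.store = store                # stands in for the underlying data store
        self.cache: Dict[str, str] = {}   # stands in for the cache
        self.store_reads = 0              # how many times the data store was hit

    def get(self, key: str) -> Optional[str]:
        # 1. Return from the cache when the data is already there.
        if key in self.cache:
            return self.cache[key]
        # 2. Otherwise read from the data store and populate the cache.
        self.store_reads += 1
        value = self.store.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def update(self, key: str, value: str) -> None:
        # Write to the data store, then evict the stale cache entry so the
        # next read loads the fresh value on demand.
        self.store[key] = value
        self.cache.pop(key, None)


repo = CacheAsideRepository({"user:1": "Alice"})
first = repo.get("user:1")       # cache miss, read from the store
second = repo.get("user:1")      # cache hit, the store is not touched
repo.update("user:1", "Alicia")  # evicts the cached entry
third = repo.get("user:1")       # cache miss again, fresh value loaded
```

In a real system the cache would usually be a separate service with expiry and eviction settings, but the control flow stays the same.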
Choreography pattern
A microservices architecture is one of the preferred choices when building distributed systems: multiple small services collaborate to complete a business transaction end to end. The easiest solution for communication between microservices is the Orchestrator pattern. Although it looks simple at first glance, you may run into performance inefficiencies, and the reason is straightforward. The orchestrator's main role is to accept all incoming requests and delegate them to the corresponding services. When multiple services are involved in a business transaction, one service often has to act on the response of another. The orchestrator therefore has to manage all of these interactions centrally, and may need to carry domain knowledge as well. Because every transaction flows through it, the orchestrator can become a bottleneck and a single point of failure.
An alternative approach to microservices communication is the Choreography pattern. Instead of relying on a central orchestrator to communicate with other microservices, each service takes charge and decides when and how an operation should be processed. Asynchronous messaging through a message broker is one of the most common ways to implement choreography. A request publishes a message to a queue, and the broker delivers it to all the services subscribed to that message. Each service then processes the message and responds to the message queue with success or failure. In case of failure, the message broker may retry the operation; in case of success, other services pick up the resulting message and do their own processing as required. This is a classic example of services choreographing their communication through messages and collaboratively completing the business transaction end to end.
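To make the message flow more concrete, here is a small in-process sketch. It assumes a toy `Broker` class, three made-up services and invented topic names such as `order.placed`; a real deployment would use an actual message broker, and retries are left out to keep the example short.

```python
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, List, Tuple

Handler = Callable[[dict], None]


class Broker:
    """Toy in-memory stand-in for a message broker (illustrative only)."""

    def __init__(self) -> None:
        self.subscribers: Dict[str, List[Handler]] = defaultdict(list)
        self.queue: Deque[Tuple[str, dict]] = deque()

    def subscribe(self, topic: str, handler: Handler) -> None:
        self.subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        self.queue.append((topic, message))

    def run(self) -> None:
        # Deliver queued messages to every subscriber until the queue is empty.
        while self.queue:
            topic, message = self.queue.popleft()
            for handler in self.subscribers[topic]:
                handler(message)


broker = Broker()
log: List[str] = []


def payment_service(order: dict) -> None:
    # Decides on its own what happens next and announces the outcome.
    ok = order["amount"] <= 100
    log.append(f"payment {'ok' if ok else 'failed'} for {order['id']}")
    broker.publish("payment.succeeded" if ok else "payment.failed", order)


def shipping_service(order: dict) -> None:
    log.append(f"shipping {order['id']}")


def notification_service(order: dict) -> None:
    log.append(f"notify failure {order['id']}")


# No orchestrator: each service only declares which messages it cares about.
broker.subscribe("order.placed", payment_service)
broker.subscribe("payment.succeeded", shipping_service)
broker.subscribe("payment.failed", notification_service)

broker.publish("order.placed", {"id": "A1", "amount": 40})
broker.publish("order.placed", {"id": "B2", "amount": 500})
broker.run()
```

Notice that no component knows the whole workflow. Adding another step means subscribing one more service to an existing topic, without touching the others.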
CQRS - Command and Query Responsibility Segregation
This pattern is all about separating the read and write operations on the underlying data store. In distributed systems, read and write workloads are often asymmetrical and can have different scalability and performance requirements. If these requirements are not considered carefully, they can lead to problems such as heavy load on the data store, data contention when operations run in parallel, and so on.
From a performance standpoint, it is a good idea to separate read and write operations. That is what the CQRS pattern does: commands are used for write operations that update the data, queries are used to read the data, and you can use different models for reads and updates. With CQRS, read and write operations can be scaled independently. You can also physically separate the data, so that reads are served efficiently from their own data schema optimized for queries. An event-driven architecture style can be used to keep the data stores used for read and write operations in sync.
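A rough sketch of this separation could look like the following. The `WriteModel`, `ReadModel` and `ItemAdded` names are invented for the example, and the event is delivered by a direct function call where a real system would more likely use a message broker between two physically separate stores.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ItemAdded:
    """Event emitted by the write side (hypothetical name)."""
    order_id: str
    price: float


class WriteModel:
    """Command side: validates input and records changes as events."""

    def __init__(self) -> None:
        self.events: List[ItemAdded] = []
        self.listeners: List[Callable[[ItemAdded], None]] = []

    def add_item(self, order_id: str, price: float) -> None:
        if price <= 0:
            raise ValueError("price must be positive")
        event = ItemAdded(order_id, price)
        self.events.append(event)
        # Publish the event so that read models can stay in sync.
        for listener in self.listeners:
            listener(event)


class ReadModel:
    """Query side: keeps a schema shaped for one question, the order total."""

    def __init__(self) -> None:
        self.totals: Dict[str, float] = {}

    def apply(self, event: ItemAdded) -> None:
        current = self.totals.get(event.order_id, 0.0)
        self.totals[event.order_id] = current + event.price

    def order_total(self, order_id: str) -> float:
        return self.totals.get(order_id, 0.0)


writes = WriteModel()
reads = ReadModel()
writes.listeners.append(reads.apply)  # event-driven sync between the two sides

writes.add_item("A1", 10.0)
writes.add_item("A1", 2.5)
total = reads.order_total("A1")  # answered without touching the write model
```

Because the query side only ever reads its own precomputed totals, it can be scaled or replaced independently of the command side.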
Static Content hosting pattern
There is generally some amount of static content that your distributed system has to deal with. This static content could be images, documents, style sheets, JavaScript files, or even HTML pages. In traditional web applications, the web server generally serves these resources. Although web servers may be optimized with features such as caching and dynamic rendering, serving static resources still consumes processing cycles that could be better used elsewhere. The static content hosting pattern addresses this: you deploy the static content to a cloud-based storage service and deliver it directly to clients from there. This reduces the overall load on the compute resources that serve other web requests. For better performance, a Content Delivery Network can also be considered for caching the content.
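One simple way to picture the idea is a routing decision made before a request reaches the application code. In the sketch below, the base URL `https://static.example.com`, the extension list and the `route` function are all hypothetical. In practice, pages usually link to the storage or CDN URLs directly, so static requests never reach the web server at all.

```python
from posixpath import splitext
from typing import Tuple

# Hypothetical address of the storage service (or CDN) holding static files.
STATIC_BASE_URL = "https://static.example.com"
STATIC_EXTENSIONS = {".css", ".js", ".png", ".jpg", ".html", ".pdf"}


def route(path: str) -> Tuple[str, str]:
    """Decide whether a request path is served by storage or by the app."""
    _, extension = splitext(path)
    if extension.lower() in STATIC_EXTENSIONS:
        # Static content: point the client at the storage service.
        return ("redirect", STATIC_BASE_URL + path)
    # Everything else is dynamic and goes to the compute resources.
    return ("app", path)


css_route = route("/assets/site.css")
api_route = route("/api/orders")
```

Here `css_route` points to the storage service, while `api_route` stays with the application.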
Throttling pattern
In a multi-tenant application that serves requests from multiple tenants, you never know when load from the users of one tenant will start affecting the users of another. As a solution provider, you never want to be in that situation. To deal with it, you may already have thought about auto scaling. However, auto scaling can take some time to add resources, so there may be a period during which resources are scarce. A good strategy for avoiding this scenario is the Throttling pattern: you allow applications to use resources only up to a specific limit and, as soon as the limit is reached, throttle them. This ensures that every tenant is allocated a threshold for resource usage, so the users of one tenant cannot degrade the system for the users of another. That is the throttling pattern in brief.
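As an illustration, a per-tenant limit can be enforced with a simple fixed-window counter. This is just one of several possible throttling strategies. The `TenantThrottle` class, the limit of 2 requests per 60 seconds and the fake clock used to make the example deterministic are all assumptions chosen for the sketch.

```python
import time
from typing import Callable, Dict, Tuple


class TenantThrottle:
    """Fixed-window request limit tracked separately for each tenant."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        # tenant -> (start time of the current window, requests counted in it)
        self.usage: Dict[str, Tuple[float, int]] = {}

    def allow(self, tenant: str) -> bool:
        now = self.clock()
        start, count = self.usage.get(tenant, (now, 0))
        if now - start >= self.window:
            start, count = now, 0   # the old window expired, start a new one
        if count >= self.limit:
            return False            # limit reached: throttle this tenant
        self.usage[tenant] = (start, count + 1)
        return True


fake_now = [0.0]  # controllable clock so the example does not need to sleep
throttle = TenantThrottle(limit=2, window_seconds=60, clock=lambda: fake_now[0])

results = [throttle.allow("tenant-a") for _ in range(3)]  # third call is throttled
other = throttle.allow("tenant-b")         # tenant-b is unaffected by tenant-a
fake_now[0] = 61.0
after_window = throttle.allow("tenant-a")  # allowed again in a new window
```

A throttled request would typically be rejected with an error the client can retry later, or be queued or served in a degraded mode, depending on what the application can tolerate.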
Summary
We have looked at 5 design patterns that can help improve the performance of cloud-based, multi-tenant SaaS distributed systems. I hope the article was useful to you to some extent. Do share your comments and feedback at techfundas.in@gmail.com. Thank you!