Architecture Considerations for Cloud Native Applications
When choosing an architecture for a cloud native application, an engineering team should evaluate factors such as:
- End users
- Input and output process
- Engineering teams
- Engineering resources
- Financial resources
- Internal knowledge
Before an organization delivers a product, the engineering team needs to decide on the most suitable application architecture. In most cases, two distinct models are referenced: monoliths and microservices.
Regardless of the adopted structure, the main goal is to design an application that delivers value to customers and can be easily adjusted to accommodate new functionalities.
Also, each architecture encapsulates the three main tiers of an application:
- UI (User Interface) - handles HTTP requests from the users and returns a response
- Business logic - contains the code that provides a service to the users
- Data layer - implements access and storage of data objects
In a monolithic architecture, application tiers can be described as:
- part of the same unit
- managed in a single repository
- sharing existing resources (e.g. CPU and memory)
- developed in one programming language
- released using a single binary
Diagram: a booking application implemented using a monolithic architecture
Imagine a team develops a booking application using a monolithic approach. In this case, the UI is the website that the user interacts with. The business logic contains the code that provides the booking functionalities, such as search, booking, payment, and so on. These are written using one programming language (e.g. Java or Go) and stored in a single repository. The data layer contains functions that store and retrieve customer data. All of these components are managed as a unit, and the release is done using a single binary.
In a microservice architecture, application tiers are managed independently, as different units. Each unit has the following characteristics:
- managed in a separate repository
- own allocated resources (e.g. CPU and memory)
- well-defined API (Application Programming Interface) for connection to other units
- implemented using the programming language of choice
- released using its own binary
Diagram: a booking application implemented using a microservice architecture
Now, let's imagine the team develops a booking application using a microservice approach.
In this case, the UI remains the website that the user interacts with. However, the business logic is split into smaller, independent units, such as login, payment, confirmation, and many more. These units are stored in separate repositories and are written using the programming language of choice (e.g. Go for the payment service and Python for the login service). To interact with other services, each unit exposes an API. And lastly, the data layer contains functions that store and retrieve customer and order data. As expected, each unit is released using its own binary.
- Monolith: application design where all application tiers are managed as a single unit
- Microservice: application design where application tiers are managed as independent, smaller units
Choosing between monoliths and microservices involves trade-offs. These cover development complexity, scalability, time to deploy, flexibility, operational cost, and reliability.
Development complexity represents the effort required to develop and manage an application.
Monoliths - one programming language; one repository; enables sequential development
Microservice - multiple programming languages; multiple repositories; enables concurrent development
Scalability captures how an application is able to scale up and down, based on the incoming traffic.
Monoliths - replication of the entire stack; hence it's heavy on resource consumption
Microservice - replication of a single unit, providing on-demand consumption of resources
Time to Deploy
Time to deploy encapsulates the build of a delivery pipeline that is used to ship features.
Monoliths - one delivery pipeline that deploys the entire stack; more risk with each deployment leading to a lower velocity rate
Microservice - multiple delivery pipelines that deploy separate units; less risk with each deployment leading to a higher feature development rate
Flexibility implies the ability to adapt to new technologies and introduce new functionalities.
Monoliths - low rate, since the entire application stack might need restructuring to incorporate new functionalities
Microservice - high rate, since changing an independent unit is straightforward
Operational cost represents the cost of necessary resources to release a product.
Monoliths - low initial cost, since only one codebase and one pipeline need to be managed. However, the cost increases exponentially when the application needs to operate at scale.
Microservice - high initial cost, since multiple repositories and pipelines require management. However, at scale, the cost remains proportional to the consumed resources at that point in time.
Reliability captures practices for an application to recover from failure and tools to monitor an application.
Monoliths - in a failure scenario, the entire stack needs to be recovered. Also, the visibility into each functionality is low, since all the logs and metrics are aggregated together.
Microservice - in a failure scenario, only the failed unit needs to be recovered. Also, there is high visibility into the logs and metrics for each unit.
Once an application is released, it is essential to have visibility into how it operates. The practices that provide this visibility are focused on health checks, metrics, logs, tracing, and resource consumption.
Health checks are implemented to showcase the status of an application. These checks report if an application is running and meets the expected behavior to serve incoming traffic. Usually, health checks are represented by an HTTP endpoint such as /healthz or /status. These endpoints return an HTTP response showcasing if the application is healthy or in an error state.
Screenshot: a /status health check returning an "OK - healthy" response
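As a minimal sketch, such a /status endpoint can be built with Python's standard library alone; the handler name, response body, and port below are illustrative assumptions, not tied to any particular framework:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class HealthHandler(BaseHTTPRequestHandler):
    """Serves a /status health check endpoint (illustrative sketch)."""

    def do_GET(self):
        if self.path == "/status":
            # Report that the application is running and able to serve traffic.
            body = json.dumps({"status": "OK - healthy"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, format, *args):
        pass  # silence per-request logging for this sketch

def run(port=8080):
    """Blocks forever, serving health checks on the given port."""
    HTTPServer(("", port), HealthHandler).serve_forever()

# run()  # uncomment to serve http://localhost:8080/status
```

A monitoring system (or a Kubernetes liveness probe) can then poll this endpoint and restart the application when it stops answering with a 200 response.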
Metrics are necessary to quantify the performance of the application. To fully understand how a service handles requests, it is mandatory to collect statistics on how the service operates.
For example, the number of active users, handled requests, or the number of logins. Additionally, it is paramount to gather statistics on resources that the application requires to be fully operational.
For example, the amount of CPU, memory, and network throughput. Usually, the collected metrics are exposed via an HTTP endpoint such as /metrics, which returns internal statistics such as the number of active users, consumed CPU, network throughput, etc.
Screenshot: a /metrics endpoint listing metrics that count handled requests by returned HTTP code
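A minimal sketch of such a request counter, rendered in a Prometheus-style text format (the metric name http_requests_total and the helper functions are illustrative assumptions):

```python
from collections import Counter

# Hypothetical in-process counter of handled requests, keyed by HTTP status code.
requests_total = Counter()

def record_request(status_code):
    """Called by the request handler after each response is sent."""
    requests_total[status_code] += 1

def render_metrics():
    """Render the counters in the Prometheus text exposition format,
    as a /metrics endpoint would return them."""
    lines = ["# TYPE http_requests_total counter"]
    for code, count in sorted(requests_total.items()):
        lines.append(f'http_requests_total{{code="{code}"}} {count}')
    return "\n".join(lines)

record_request(200)
record_request(200)
record_request(500)
print(render_metrics())
# # TYPE http_requests_total counter
# http_requests_total{code="200"} 2
# http_requests_total{code="500"} 1
```

A scraper such as Prometheus can then collect this output on a schedule and graph the request rate per status code over time.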
Log aggregation provides valuable insights into what operations a service is performing at a point in time. It is the nucleus of any troubleshooting and debugging process.
For example, it is essential to record if a user logged in successfully into a service, or encountered an error while performing a payment.
Usually, the logs are collected from STDOUT (standard out) and STDERR (standard error) through a passive logging mechanism. This means that any output or errors from the application are sent to the shell. Subsequently, these are collected by a logging tool, such as Fluentd or Splunk, and stored in backend storage. However, the application can send the logs directly to the backend storage. In this case, an active logging technique is used, as the log transmission is handled directly by the application, without a logging tool required.
There are multiple logging levels that can be attributed to an operation. Some of the most widely used are:
- DEBUG - records fine-grained events of application processes
- INFO - provides coarse-grained information about an operation
- WARN - records a potential issue with the service
- ERROR - notifies that an error has been encountered; however, the application is still running
- FATAL - represents a critical situation, where the application is not operational
As well, it is common practice to associate each log line with a timestamp that records exactly when the operation was invoked.
Screenshot: multiple INFO log lines recorded when a Prometheus service started
Tracing is capable of creating a full picture of how different services are invoked to fulfill a single request. Usually, tracing is integrated through a library at the application layer, where the developer can record when a particular service is invoked. These records for individual services are defined as spans. A collection of spans defines a trace that recreates the entire lifecycle of a request.
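A toy sketch of spans and traces (all class and field names here are hypothetical; production systems use tracing libraries such as Jaeger or OpenTelemetry clients rather than hand-rolled structures):

```python
import time
import uuid
from dataclasses import dataclass, field

@dataclass
class Span:
    """One record of an individual service being invoked."""
    trace_id: str   # shared by all spans that belong to the same request
    name: str       # which service was invoked
    start: float = field(default_factory=time.time)
    end: float = 0.0

    def finish(self):
        self.end = time.time()

class Trace:
    """A collection of spans that recreates the lifecycle of one request."""
    def __init__(self):
        self.trace_id = uuid.uuid4().hex
        self.spans = []

    def span(self, name):
        s = Span(self.trace_id, name)
        self.spans.append(s)
        return s

# One booking request flowing through three services:
trace = Trace()
for service in ("login", "payment", "confirmation"):
    s = trace.span(service)  # record when this service is invoked
    s.finish()               # ...and when it completes
print([s.name for s in trace.spans])
```

Because every span carries the same trace_id, a tracing backend can stitch records emitted by different services back into one end-to-end view of the request.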
Resource consumption encapsulates the resources an application requires to be fully operational. This usually refers to the amount of CPU and memory that is consumed by an application during its execution. Additionally, it is beneficial to benchmark the network throughput, or how many requests an application can handle concurrently. Having awareness of resource boundaries is essential to ensure that the application is up and running 24/7.
Screenshot: a graph showcasing the CPU consumption of the coredns container
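As a sketch, a process can inspect its own CPU time and peak memory with Python's standard resource module (a Unix-only module; note that ru_maxrss is reported in kilobytes on Linux and in bytes on macOS):

```python
import resource

# Cumulative resource usage of the current process.
usage = resource.getrusage(resource.RUSAGE_SELF)

print(f"user CPU time:   {usage.ru_utime:.3f}s")   # time spent in application code
print(f"system CPU time: {usage.ru_stime:.3f}s")   # time spent in kernel calls
print(f"peak RSS:        {usage.ru_maxrss}")       # peak resident memory (KB on Linux)
```

In a containerized deployment, the same numbers are usually collected externally (e.g. by cAdvisor or the kubelet) rather than self-reported, and compared against the CPU and memory limits set for the container.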
Health Checks - explore the core reasons to introduce health checks and implementations examples
Prometheus Best Practices on Metrics Naming - explore how to name, label, and define the type of metrics
Application Logging Best Practices - read more on how to define what logs should be collected by an application
Logging Levels - explore possible logging levels and when they should be enabled
Enabling Distributed Tracing for Microservices With Jaeger in Kubernetes - learn what tools can be used to implement tracing in a Kubernetes cluster
Some of the most encountered operations in the maintenance phase are listed below:
Diagram: application operations that occur in the maintenance phase, including split, merge, replace, and stale operations
A split operation - is applied if a service covers too many functionalities and it's complex to manage. Having smaller, manageable units is preferred in this context.
A merge operation - is applied if units are too granular or perform closely interlinked operations, and it provides a development advantage to merge these together. For example, merging 2 separate services for log output and log format in a single service.
A replace operation - is adopted when a more efficient implementation is identified for a service. For example, rewriting a Java service in Go, to optimize the overall execution time.
A stale operation - is performed for services that are no longer providing any business value, and should be archived or deprecated. For example, services that were used to perform a one-off migration process.
Performing any of these operations increases the longevity and continuity of a project. Overall, the end goal is to ensure the application is providing value to customers and is easy to manage by the engineering team. But more importantly, it can be observed that the structure of a project is not static: it is amorphous, and it evolves based on new requirements and customer feedback.
Modern Banking in 1500 Microservices - watch how Monzo manages thousands of microservices and evolves their ecosystem