Sunday, 6 December 2015

Challenges Faced by Microservice Clients and Their Resolution

The term "Microservice Architecture" has sprung up over the last few years to describe a particular way of designing software applications as suites of independently deployable services. While there is no precise definition of this architectural style, there are certain common characteristics: organization around business capability, automated deployment, intelligence in the endpoints, and decentralized control of languages and data.

Services existed in the monolith world too, but they differ in where the business logic processing happens. Since each microservice looks only at its own business domain, a lot of responsibility is offloaded to the clients.

Let us consider a simple search page which shows items matching your search term in a 4 x 5 grid. The search results page (client) may actually end up talking to 3 - 4 different microservices:

 - search service : to get the actual search results
 - product service : to get the latest product description and assets
 - price service : to get the latest prices
 - stock service : to get the last minute stock details

The client has to compose these calls, and run them concurrently where it can, in order to get any meaningful performance. Imperative programming styles for call composition don't scale. The problems are further compounded by the fact that at any given moment there could be multiple clients talking to these microservices.
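The call composition for the search page can be sketched as follows. This is only an illustration: the four service functions are hypothetical stand-ins that simulate network latency with a short sleep, and the names, return values and the use of Python's asyncio are assumptions, not part of any real service API.

```python
import asyncio

# Hypothetical stand-ins for the four remote services; each sleep simulates
# network latency. A real client would issue HTTP calls here instead.
async def search_service(term):
    await asyncio.sleep(0.05)
    return ["sku-1", "sku-2"]

async def product_service(sku):
    await asyncio.sleep(0.05)
    return {"sku": sku, "title": f"Product {sku}"}

async def price_service(sku):
    await asyncio.sleep(0.05)
    return 9.99

async def stock_service(sku):
    await asyncio.sleep(0.05)
    return 3

async def load_item(sku):
    # The three per-item lookups are independent, so run them concurrently.
    product, price, stock = await asyncio.gather(
        product_service(sku), price_service(sku), stock_service(sku)
    )
    return {**product, "price": price, "in_stock": stock > 0}

async def search_page(term):
    skus = await search_service(term)
    # Fan out over all matching items at once instead of looping serially.
    return await asyncio.gather(*(load_item(sku) for sku in skus))
```

Calling asyncio.run(search_page("book")) returns one merged record per matching item, in the order the search service returned them. With a serial loop the latencies would add up per item and per service; here the per-item work overlaps.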

It is thus our assertion that while building microservices is fun, building clients which remain resilient, performant and meaningful over extended periods of time has become much more challenging.

In this blog post we will talk about the challenges which such microservice clients face and how these problems have been solved.

Communication mechanism for microservice clients: API Gateway Pattern

Let’s assume we are building an e-commerce application product detail page for two types of client – namely
 - desktop based web browser (HTML5) and
 - native mobile app

In addition, the application must expose product details via a REST API for use by 3rd party applications.

A product details UI can display a lot of information about a product. For example:
 - Basic information about the book such as title, author, price, etc.
 - Your purchase history for the book
 - Availability
 - Buying options
 - Other items that are frequently bought with this book
 - Other items bought by customers who bought this book
 - Customer reviews
 - Sellers ranking

Since the application uses the Microservices pattern, the product details data is spread over multiple services. For example:

 - Catalog service - basic information about the product such as title, author
 - Order service - purchase history for product
 - Recommendation service - suggested items
 - Inventory service - product availability
 - Review service - customer reviews
 - Customer service - help chat
 - …

Consequently, the code that displays the product details needs to fetch information from all of these services. So the client ends up calling the services as shown in the figure below:

Problem:

How do the clients of a Microservices-based application access the individual services?

 - The granularity of APIs provided by microservices is often different from what a client needs. Microservices typically provide fine-grained APIs, which means that clients need to interact with multiple services. For example, as described above, a client needing the details for a product needs to fetch data from numerous services.
 - Different clients need different data. For example, the desktop browser version of a product details page is typically more elaborate than the mobile version.
 - Network performance is different for different types of clients. For example, a mobile network is typically much slower and has much higher latency than a non-mobile network. And, of course, any WAN is much slower than a LAN. This means that a native mobile client uses a network that has very different performance characteristics from the LAN used by a server-side web application. The server-side web application can make multiple requests to backend services without impacting the user experience, whereas a mobile client can only make a few.
 - The number of service instances and their locations (host+port) change dynamically.
 - Partitioning into services can change over time and should be hidden from clients.


Solution:

Rather than provide a one-size-fits-all style API, a much better approach is for clients to make a small number of requests per page, perhaps as few as one, over the Internet to a front-end server known as an API gateway, which is shown in the figure below. For example, the Netflix API gateway runs client-specific adapter code that provides each client with an API that's best suited to its requirements.
 

The API gateway might also implement security, e.g. verify that the client is authorized to perform the request.

The API gateway sits between the application’s clients and the microservices. It provides APIs that are tailored to the client. The API gateway provides a coarse-grained API to mobile clients and a finer-grained API to desktop clients that use a high-performance network. In this example, the desktop clients make multiple requests to retrieve information about a product, whereas a mobile client makes a single request.

The API gateway handles incoming requests by making requests to some number of microservices over the high-performance LAN. Netflix, for example, describes how each request fans out to on average six backend services. In this example, fine-grained requests from a desktop client are simply proxied to the corresponding service, whereas each coarse-grained request from a mobile client is handled by aggregating the results of calling multiple services.
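The two kinds of route described above can be sketched as follows. This is a minimal illustration, not a real gateway: the SERVICES table holds hypothetical in-process stand-ins for the backend microservices, and the function names proxy and product_details are made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-process stand-ins for backend microservices. In a real
# gateway each entry would be a network call over the LAN.
SERVICES = {
    "catalog": lambda pid: {"title": f"Book {pid}", "author": "Unknown"},
    "inventory": lambda pid: {"available": True},
    "review": lambda pid: {"reviews": ["Great read"]},
}

def proxy(service, product_id):
    """Fine-grained route (desktop): forward the request to exactly one service."""
    return SERVICES[service](product_id)

def product_details(product_id):
    """Coarse-grained route (mobile): call several services and merge the results."""
    with ThreadPoolExecutor() as pool:
        # Submit every backend call first so they run in parallel.
        futures = {
            name: pool.submit(call, product_id) for name, call in SERVICES.items()
        }
        # Collect one section of the response per backend service.
        return {name: future.result() for name, future in futures.items()}
```

A desktop client would call proxy once per section of the page, while a mobile client would make a single product_details request and receive all sections in one response.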

Not only does the API gateway optimize communication between clients and the application, but it also encapsulates the details of the microservices. This enables the microservices to evolve without impacting the clients. For example, two microservices might be merged. Another microservice might be partitioned into two or more services. Only the API gateway needs to be updated to reflect these changes. The clients are unaffected.

Handling Partial Failure: Circuit Breaker

One issue we have to address when implementing an API Gateway is the problem of partial failure. This issue arises in all distributed systems whenever one service calls another service that is either responding slowly or is unavailable. The API Gateway should never block indefinitely waiting for a downstream service. However, how it handles the failure depends on the specific scenario and which service is failing.

If the API gateway exposes composite services (services that merge the results of several core services), the failure of any core service affects the composite service. In general we call this type of problem a chain of failures, where an error in one component causes errors in other components that depend on it. This needs special attention in a microservice-based system landscape, where a potentially large number of separately deployed microservices communicate with each other.

Solution:

For example, if the recommendation service is unresponsive in the product details scenario, the API Gateway should return the rest of the product details to the client since they are still useful to the user. The recommendations could either be empty or replaced by, for example, a hardwired top ten list. If, however, the product information service is unresponsive then API Gateway should return an error to the client.

The API Gateway could also return cached data if that was available. For example, since product prices change infrequently, the API Gateway could return cached pricing data if the pricing service is unavailable. The data can be cached by the API Gateway itself or be stored in an external cache such as Redis or Memcached. By returning either default data or cached data, the API Gateway ensures that system failures do not impact the user experience.
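The fallback rules from the last two paragraphs can be sketched as follows. Everything here is an assumption made for illustration: the services are hypothetical stand-ins, an outage is simulated by raising ServiceUnavailable, and PRICE_CACHE is a plain dictionary standing in for an external cache such as Redis or Memcached.

```python
class ServiceUnavailable(Exception):
    """Raised by a stand-in service to simulate a timeout or an outage."""

PRICE_CACHE = {"42": 19.99}                    # last known prices (hypothetical)
DEFAULT_RECOMMENDATIONS = ["top-1", "top-2"]   # hardwired fallback list

def catalog_service(pid):
    return {"title": f"Book {pid}"}

def pricing_service(pid):
    raise ServiceUnavailable("pricing")

def recommendation_service(pid):
    raise ServiceUnavailable("recommendation")

def product_details(pid, catalog=catalog_service, pricing=pricing_service,
                    recommendations=recommendation_service):
    # Catalog data is essential: a failure here propagates to the client as an error.
    details = dict(catalog(pid))
    try:
        details["price"] = pricing(pid)
    except ServiceUnavailable:
        # Prices change infrequently, so a cached value is still useful.
        details["price"] = PRICE_CACHE.get(pid)
    try:
        details["recommendations"] = recommendations(pid)
    except ServiceUnavailable:
        # Recommendations are optional: degrade to a default list.
        details["recommendations"] = DEFAULT_RECOMMENDATIONS
    return details
```

With the pricing and recommendation stand-ins both failing, product_details("42") still returns the title, the cached price and the default list. Only a catalog failure reaches the caller as an exception.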

Netflix Hystrix is an incredibly useful library for writing code that invokes remote services. Hystrix times out calls that exceed the specified threshold. It implements a circuit breaker pattern, which stops the client from waiting needlessly for an unresponsive service. A circuit breaker typically moves between three states: closed (calls pass through to the service), open (calls fail immediately), and half-open (a trial call is let through to test whether the service has recovered).

 

If the error rate for a service exceeds a specified threshold, Hystrix trips the circuit breaker and all requests will fail immediately for a specified period of time. Hystrix lets you define a fallback action when a request fails, such as reading from a cache or returning a default value. If you are using the JVM you should definitely consider using Hystrix. And, if you are running in a non-JVM environment, you should use an equivalent library.
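To make the pattern concrete, here is a minimal circuit breaker sketch. It is not Hystrix and does not use the Hystrix API. As a simplification it trips after a fixed number of consecutive failures rather than on an error rate, and the class name, parameters and injectable clock are all assumptions made for this example.

```python
import time

class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying the service."""

class CircuitBreaker:
    def __init__(self, max_failures=3, reset_timeout=30.0, clock=time.monotonic):
        self.max_failures = max_failures    # consecutive failures before tripping
        self.reset_timeout = reset_timeout  # seconds to stay open before a trial call
        self.clock = clock                  # injectable so tests can control time
        self.failures = 0
        self.opened_at = None               # None means the circuit is closed

    @property
    def state(self):
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.reset_timeout:
            return "half-open"
        return "open"

    def call(self, fn, *args, fallback=None):
        if self.state == "open":
            # Fail fast: do not touch the unresponsive service.
            if fallback is not None:
                return fallback(*args)
            raise CircuitOpenError("circuit is open")
        try:
            result = fn(*args)
        except Exception:
            self.failures += 1
            if self.state == "half-open" or self.failures >= self.max_failures:
                self.opened_at = self.clock()  # trip (or re-trip) the breaker
            if fallback is not None:
                return fallback(*args)
            raise
        # A success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A gateway would typically keep one breaker per downstream service and pass a fallback that reads from a cache or returns a default value. While the breaker is open, the service function is never invoked, which is what keeps the caller from waiting on a service that is known to be down.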

Handling Scale-Out: Service Discovery

While using an API Gateway is better than having the clients talk directly to the services, we can do a little better. Here is another problem to consider: what happens when we scale a service in a microservice-based architecture? For example, consider the following diagram, where we have scaled the catalog service. The API Gateway has no idea about the new service instance, so how can it take advantage of the additional instance?


We would have to change the API Gateway so it knows about the second instance. With features like auto-scaling provided by the cloud platform we are deployed to, and the fact that we might have hundreds of services, modifying the API Gateway by hand would just not scale.

Solution:

To solve this problem, microservice applications typically use a Service Discovery application which allows all microservices to register themselves and then broadcast their existence to other services in the application. 

With our Service Discovery application now deployed, each service instance registers itself with the Service Discovery application and in turn also queries it for what other services are available. This solves the scaling problem with our API Gateway. As we bring up more instances of a service, they all register themselves with the Service Discovery application, and the API Gateway periodically queries the Service Discovery application to make sure it has an updated list of services. Once the new instances have been registered, the API Gateway will get them and then be able to use those new instances when making requests to the service. All this can be done dynamically without changing any code in the API Gateway, which is exactly what we want.
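The register-then-refresh cycle can be sketched with an in-memory registry. This is only an illustration of the idea, not the API of any real Service Discovery product: the ServiceRegistry and Gateway classes, their method names and the round-robin selection are all assumptions.

```python
class ServiceRegistry:
    """In-memory stand-in for a Service Discovery application."""

    def __init__(self):
        self._instances = {}  # service name -> list of "host:port" addresses

    def register(self, service, address):
        addresses = self._instances.setdefault(service, [])
        if address not in addresses:
            addresses.append(address)

    def lookup(self, service):
        # Return a copy so callers cannot mutate the registry's own list.
        return list(self._instances.get(service, []))

class Gateway:
    def __init__(self, registry):
        self.registry = registry
        self.known = {}       # service name -> cached list of addresses
        self.next_index = {}  # service name -> round-robin position

    def refresh(self, service):
        """Re-read the instance list; a real gateway would do this periodically."""
        self.known[service] = self.registry.lookup(service)

    def pick_instance(self, service):
        instances = self.known.get(service, [])
        if not instances:
            raise LookupError(f"no known instances of {service}")
        index = self.next_index.get(service, 0) % len(instances)
        self.next_index[service] = index + 1
        return instances[index]
```

When a second catalog instance registers, the gateway keeps routing to the first one until its next refresh, after which requests alternate between both. No gateway code changes are needed; only the data in the registry changes.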

 

There are additional benefits to using a Service Discovery application as well. Consider the case where we have the same microservices deployed across multiple datacenters. Each datacenter has a deployment that looks like the architecture above. The Service Discovery applications in each datacenter can share data with each other about the services running in their own datacenter. This would allow the application to be unaffected by a complete outage of a given service in one datacenter, provided the service is still available in another datacenter. This type of functionality gives our application even more resiliency to failures.
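A rough sketch of the cross-datacenter lookup might look like this. The resolve function, the datacenter names and the shape of the shared registry data (a dictionary per datacenter) are hypothetical; real Service Discovery applications replicate this data in their own ways.

```python
def resolve(service, local_dc, registries):
    """Return (datacenter, instances), preferring the local datacenter.

    registries maps a datacenter name to a {service: [addresses]} dictionary,
    standing in for the data the discovery applications share with each other.
    """
    # Try the local datacenter first, then every other one.
    ordered = [local_dc] + [dc for dc in registries if dc != local_dc]
    for dc in ordered:
        instances = registries.get(dc, {}).get(service, [])
        if instances:
            return dc, instances
    raise LookupError(f"{service} is unavailable in every datacenter")
```

If the review service has no instances left in the local datacenter, the lookup falls through to the next datacenter that still has some, so callers keep working during the outage.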








