Sunday, 3 July 2016

Node.js Event Model: A Paradigm Shift

When I first started working with Node.js a few years back, I found it a little difficult to understand some of the basic concepts of its event model. In particular, the way one function calls another in Node.js is quite different from the traditional approach: in general, a function call that performs I/O is given a callback function. Why is that? What was wrong with the traditional (synchronous) way of calling a method?
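To make the difference concrete, here is a minimal sketch that reads the same file in both styles using the built-in fs module. The temporary file name and the `order` array are only there for illustration; they are not part of any real application.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Create a small temporary file so the example is self-contained.
const file = path.join(os.tmpdir(), `event-model-demo-${process.pid}.txt`);
fs.writeFileSync(file, 'hello from node');

// Traditional (synchronous) style: the call returns the data, and
// nothing else runs on this thread until the read has finished.
const syncData = fs.readFileSync(file, 'utf8');
console.log('sync read:', syncData);

// Node.js style: pass a callback; the call itself returns immediately.
const order = [];
fs.readFile(file, 'utf8', (err, data) => {
  if (err) throw err;
  order.push('callback');
  console.log('async read:', data);
});

// This line runs before the callback above is invoked.
order.push('after call');
console.log('readFile() has returned; its callback runs later');
```

The synchronous call hands back the data directly, while the asynchronous call hands back nothing and delivers the data to the callback at some later point.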

A Quick Look at the Threaded Model

Let's take a step back and see how conventional function calls work. We call it the threaded model because each request that comes to the server starts a new thread, and all subsequent work happens on that same thread until the request is complete and the response is sent to the client (the caller). The figure below illustrates this.

Fig 1: Request in a threaded model

There are two requests: one to get a file (downloadFile) and one to get some user data (getUserDetails). The downloadFile request opens the file, reads the contents, and then sends the data back in a response. All this occurs in sequence on the same thread. The getUserDetails request connects to the database, queries the necessary data, and then sends the data in the response.


The problem arises when a thread is performing an I/O operation and waiting for a response. For example, when Thread 2 tries to connect to the DB, the thread sits idle until the response comes back from the DB. If network latency is high this takes a while, and during that time the thread makes no use of the CPU. Likewise, if there are a lot of concurrent requests and the threads are busy with blocking I/O, the thread pool can eventually run out of free threads, and subsequent requests won't be served until one becomes available. This also leads to high memory usage, because every thread, even an idle one, holds on to its own memory.

How Node.js Approaches the Same Problem

Node.js provides scalability and performance through its powerful event-driven model. Node.js applications run in a single-threaded event-driven model. Although Node.js implements a thread pool in the background to do work, the application itself doesn’t have any concept of multiple threads. “Wait, what about performance and scale?” you might ask. At first a single-threaded server might seem counterintuitive, but once you understand the logic behind the Node.js event model, it all makes perfect sense.

Instead of executing all the work for each request on individual threads, Node.js adds work to an event queue and then has a single thread running an event loop pick it up. The event loop grabs the top item in the event queue, executes it, and then grabs the next item. When executing code that is longer lived or has blocking I/O, instead of calling the function directly, it adds the function to the event queue along with a callback that will be executed after the function completes. When all events on the Node.js event queue have been executed, the Node.js application terminates.
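The queue-and-loop idea described above can be sketched with a toy model. This is not how Node.js is actually implemented internally; `eventQueue`, `enqueue`, and `runEventLoop` are made-up names used only to show the order in which work gets picked up.

```javascript
// Toy model of an event queue served by a single-threaded event loop.
const eventQueue = [];
const log = [];

// Add a named piece of work to the back of the queue.
function enqueue(name, work) {
  eventQueue.push({ name, work });
}

// Keep taking the item at the front of the queue until none are left.
function runEventLoop() {
  while (eventQueue.length > 0) {
    const event = eventQueue.shift();
    event.work();
  }
  log.push('queue empty, loop exits');
}

enqueue('A', () => {
  log.push('run A');
  // Longer-lived work is not finished inline; its callback is queued
  // and runs only after the events already waiting in the queue.
  enqueue('A-callback', () => log.push('run A callback'));
});
enqueue('B', () => log.push('run B'));

runEventLoop();
console.log(log);
// → [ 'run A', 'run B', 'run A callback', 'queue empty, loop exits' ]
```

Notice that B runs before A's callback: the callback joins the back of the queue, so B is not held up by the rest of A's work.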

The figure below illustrates how Node.js handles the downloadFile and getUserDetails requests. It adds the downloadFile and getUserDetails requests to the event queue. It first picks up the downloadFile request, executes it, and completes by adding the Open() callback function to the event queue. Then it picks up the getUserDetails request, executes it, and completes by adding the Connect() callback function to the event queue. This continues until there are no callback functions left to be executed. Notice that the events for each request don’t necessarily follow a direct interleaved order. For example, the Connect request takes longer to complete than does the Read request, so Send(file) is called before Query(db).
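The same flow can be imitated with timers. In the sketch below the step names follow the description above, but the `later` helper and all the delays are made up; setTimeout merely stands in for real file and database I/O.

```javascript
// Simulated requests: each step "completes" after a delay and its
// callback is then placed on the event queue.
const log = [];

// Hypothetical helper: pretend an I/O step takes `ms` milliseconds.
function later(ms, callback) {
  setTimeout(callback, ms);
}

function downloadFile() {
  log.push('downloadFile');
  later(10, function open() {
    log.push('Open(file)');
    later(10, function read() {
      log.push('Read(file)');
      later(10, function send() {
        log.push('Send(file)');
      });
    });
  });
}

function getUserDetails() {
  log.push('getUserDetails');
  // Connecting is assumed to be the slow step here.
  later(100, function connect() {
    log.push('Connect(db)');
    later(10, function query() {
      log.push('Query(db)');
      later(10, function send() {
        log.push('Send(db)');
        console.log(log.join(' -> '));
      });
    });
  });
}

// Both requests are accepted right away; their steps complete later.
downloadFile();
getUserDetails();
```

With these delays, the file steps should all finish while the database connection is still pending, so Send(file) is logged before Query(db). Neither request blocks the other, even though everything runs on one thread.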

Fig 2: Node.js Event-driven Model


Blocking I/O in Node.js

Some examples of blocking I/O are:


  • Reading a file
  • Querying a database
  • Requesting a socket
  • Accessing a remote service

Node.js uses event callbacks to avoid having to wait for blocking I/O. Any request that performs blocking I/O is therefore carried out on a different thread in the background, and Node.js maintains a thread pool for this purpose. When an event that involves blocking I/O is retrieved from the event queue, Node.js takes a thread from the thread pool and executes the function there instead of on the main event loop thread. This prevents the blocking I/O from holding up the rest of the events in the event queue. Once execution completes, the thread returns to the thread pool.
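As a rough illustration, the sketch below uses crypto.pbkdf2, a built-in call that the Node.js documentation describes as running on the background thread pool. The password, salt, and iteration count are arbitrary values chosen for the demo, and `events` is just a local array for recording the order of things.

```javascript
const crypto = require('crypto');

const events = [];

// The asynchronous form hands the expensive hashing work to a
// background thread and returns to the caller immediately.
crypto.pbkdf2('secret', 'salt', 10000, 32, 'sha256', (err, key) => {
  if (err) throw err;
  // This callback runs back on the main event loop thread.
  events.push('hash ready');
  console.log('derived key length:', key.length);
  console.log(events);
});

// The main thread is free to carry on while the hash is computed.
events.push('event loop continues');
```

The "event loop continues" entry is recorded first, because the main thread does not wait for the background work to finish.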

The function that is executed on the blocking thread can still add events back to the event queue to be processed. For example, a database query call is typically passed a callback function that parses the results and may schedule additional work on the event queue before sending a response.
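A sketch of that pattern might look like the following. `queryDatabase` is a hypothetical stand-in for a real database driver (it only fakes a result with a timer), and `sendResponse` and the audit step are likewise invented for the example.

```javascript
const steps = [];

// Hypothetical stand-in for a database driver call: it "returns"
// a JSON string to the callback after a short delay.
function queryDatabase(sql, callback) {
  setTimeout(() => callback(null, '[{"id":1,"name":"Ann"}]'), 10);
}

// Hypothetical stand-in for writing the HTTP response.
function sendResponse(body) {
  steps.push('send ' + body);
  console.log(steps);
}

queryDatabase('SELECT id, name FROM users', (err, raw) => {
  if (err) throw err;

  // Parse the results inside the callback.
  const rows = JSON.parse(raw);
  steps.push('parsed ' + rows.length + ' row(s)');

  // Schedule additional work as a new event instead of doing it inline.
  setImmediate(() => {
    steps.push('audit logged');
    sendResponse(rows[0].name);
  });
});
```

Here the query callback parses the result, queues a follow-up task with setImmediate, and only then is the response sent.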

The figure below illustrates the full Node.js event model, including the event queue, event loop, and thread pool. Notice that the event loop either executes the function on the event loop thread itself or, for blocking I/O, executes the function on a separate thread.

Fig 3: Node.js Event-driven model with thread pool


Conclusion

The advantage of this approach is that the main thread never sits idle waiting on I/O, so no CPU cycles are wasted. At the same time, the memory footprint of the server stays fairly small. The event-driven model is what does the trick for Node.js.

Reference:
Link: https://nodejs.org/en/docs/

Book: Node.js, MongoDB, and AngularJS Web Development by Brad Dayley