Node.js Worker Thread: A Detailed Explanation

Node.js runs JavaScript on a single main thread, so a CPU-intensive task can block that thread. This is where a worker thread comes in.

CPU intensive task

A task is CPU intensive when it needs a lot of processor time to handle a user request or run some logic, for example a function or code block that processes a large amount of data held in memory. While such a task runs, the Node.js call stack stays busy and the main thread is blocked. Any further request to the app or server gets stuck waiting for the call stack to become free.

Here is a sample code block that performs a CPU-intensive operation:

const apiResponses = [];
for (let i = 0; i < 1000; i += 1) {
    // fetchData is a placeholder for whatever call returns one batch of records
    const apiResponse = fetchData(i);
    apiResponses.push(...apiResponse);
}

// Let's say we have 10 million records in apiResponses array
// Now write excel file
ExcelService.write('large_file.xls', apiResponses);

// While the service is writing the file, send another request to the server.
// The request will not be answered until the write has finished.
CPU intensive operation
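The blocking effect can be observed without an Excel file or a server. The following is a minimal sketch, assuming a busy-wait loop is an acceptable stand-in for the heavy write; blockFor and measureTimerDelay are hypothetical helper names used only for illustration. A timer scheduled for 0 ms cannot fire until the loop releases the call stack.

```javascript
// Busy-wait for roughly `ms` milliseconds to stand in for CPU-heavy work.
function blockFor(ms) {
    const end = Date.now() + ms;
    while (Date.now() < end) {
        // spin: the call stack never empties while this loop runs
    }
}

// Schedule a 0 ms timer, block the main thread, and report how late the timer fired.
function measureTimerDelay(busyMs) {
    return new Promise((resolve) => {
        const scheduledAt = Date.now();
        setTimeout(() => resolve(Date.now() - scheduledAt), 0);
        blockFor(busyMs);
    });
}

measureTimerDelay(200).then((delay) => {
    // The delay is at least as long as the blocking work, not 0 ms.
    console.log(`timer fired after ${delay} ms`);
});
```

An incoming HTTP request is delayed in the same way as the timer here: its callback has to wait for the call stack to become free.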

In Node.js, worker threads can be used to handle such CPU-intensive operations without blocking the main thread.

Worker thread

A Node.js worker thread has its own instance of V8 and its own event loop. A worker thread runs in isolation, which is why it doesn't block the main thread and why you can use it to run code in parallel. Node.js provides a core module called worker_threads for working with worker threads.

The Node.js main thread is called the parent thread, and all worker threads started through the worker_threads module are called child threads. Every child thread can communicate with its parent thread through messages.
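The isolation can be made visible with a small experiment. This is only a sketch under stated assumptions: the worker source is passed inline with the eval option to keep everything in one file, and sharedLooking and readGlobalTypeInWorker are made-up names. A global set in the parent is not visible in the worker, because the worker has its own V8 instance; only the posted message crosses the boundary.

```javascript
const { Worker } = require('worker_threads');

// Set a global in the parent thread only.
globalThis.sharedLooking = 'set in parent';

// Inline worker source: report what the worker sees for the same global.
const workerSource = `
    const { parentPort } = require('worker_threads');
    parentPort.postMessage(typeof globalThis.sharedLooking);
`;

function readGlobalTypeInWorker() {
    return new Promise((resolve, reject) => {
        const worker = new Worker(workerSource, { eval: true });
        worker.once('message', resolve);
        worker.once('error', reject);
    });
}

readGlobalTypeInWorker().then((type) => {
    console.log('parent sees:', typeof globalThis.sharedLooking); // string
    console.log('worker sees:', type); // undefined
});
```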

Use the code below to import the worker_threads module:

const {
    Worker, 
    parentPort, 
    isMainThread, 
    workerData,
} = require('worker_threads');
Node.js worker thread module

Here is a description of some of the exports of the worker_threads module:

  • Worker: The Worker class represents an independent JavaScript execution thread. Most Node.js APIs are available inside of it. It is used to start a new thread, terminate the thread, and communicate with the thread.
  • parentPort: An instance of the MessagePort class. It is used inside the child worker script to communicate with the parent thread.
  • isMainThread: A boolean that tells whether the script is running on the main thread or not. It is usually used when the same script both starts the thread and serves as the worker script.
  • workerData: The input for the worker script. Whatever is passed as workerData when the worker thread is initialized is received, as a copy, by the child worker script through this variable.

Initialize a worker thread

Here is how you can initialize a new worker thread:

const {Worker} = require('worker_threads');
const newWorker = new Worker(workerFile, {
    workerData: {},
    resourceLimits: {},
});

// workerFile - Points to the worker script (a .js file). The path must be either absolute or relative to the current working directory (starting with ./ or ../).
// resourceLimits - A very important parameter if you are processing heavy data.
Initiating a new worker thread

When a new Worker object is created, Node.js starts a new thread. This takes noticeable time because Node.js has to create a new environment for the thread. It is a costly operation and should not be repeated every time unless it is necessary.

The resourceLimits parameter can be used to increase or decrease resource limits used by the JS engine for the thread.

resourceLimits.maxOldGenerationSizeMb

This is a very important parameter: it defines the maximum size, in MB, of the main heap used by the worker thread. The default depends on the system. The worker thread is terminated when it reaches the limit, so it is good practice to set this value explicitly to match the amount of data you expect to process.
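As a rough sketch of how the limit is passed and what the worker sees, the example below starts a worker with an explicit maxOldGenerationSizeMb and has the worker report the resourceLimits object exported by worker_threads. The value 64, the inline worker source, and the helper name readWorkerLimits are assumptions made for illustration.

```javascript
const { Worker } = require('worker_threads');

// Inline worker source (eval: true) keeps the sketch in one file.
// The worker reports the limits it is running under and then exits.
const workerSource = `
    const { parentPort, resourceLimits } = require('worker_threads');
    parentPort.postMessage(resourceLimits);
`;

function readWorkerLimits(limits) {
    return new Promise((resolve, reject) => {
        const worker = new Worker(workerSource, {
            eval: true,
            resourceLimits: limits,
        });
        worker.once('message', resolve);
        // A worker that exceeds its heap limit is terminated and reports
        // an error with the code ERR_WORKER_OUT_OF_MEMORY here.
        worker.once('error', reject);
    });
}

readWorkerLimits({ maxOldGenerationSizeMb: 64 }).then((limits) => {
    console.log('old generation limit (MB):', limits.maxOldGenerationSizeMb);
});
```

Listening for the error event matters here: without it, a worker that hits the limit takes the result of the task with it and the parent never learns why.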

How does worker thread communicate?

I am going to define two files to explain the communication.

  1. worker.js: We will use this file to start a worker thread, send a message to the worker script, and receive its reply.
  2. file_writer.js: We will use this file to receive a message from the parent thread that started the worker thread and to send a message back after processing.

worker.js

In the example below, a Worker instance named fileWriteWorker is created. To send a message to the worker script, we use the fileWriteWorker.postMessage method, which sends data to the child worker script. To receive data from the worker script, we add listeners for events such as message, error, messageerror, and exit.

Here is the sample code:

const {Worker} = require('worker_threads');
const workerFile = './file_writer.js';
const fileWriteWorker = new Worker(workerFile);

const workerOnMessage = (message) => {
    // response sent by the child worker
    console.log('worker thread message log', message);
};

const workerOnError = (error) => {
    // receives the error if the worker was terminated due to an error
};

const workerOnExit = (code) => {
    // executed after the worker has exited; code is the exit code
};

fileWriteWorker.on('message', workerOnMessage);
fileWriteWorker.on('error', workerOnError);
fileWriteWorker.on('messageerror', workerOnError);
fileWriteWorker.on('exit', workerOnExit);

// send message to child worker to perform cpu intensive task
fileWriteWorker.postMessage({
    filePath: '/tmp/test.txt',
    data: {
        name: 'Test File',
    },
});
worker.js file

file_writer.js

This is the worker script. We use parentPort, a property of the worker_threads module, to exchange messages with the parent thread that started the child worker thread. To receive messages, we add a listener for the message event on parentPort. To send messages to the parent, we use the parentPort.postMessage method.

Here is the sample code:

const {parentPort} = require('worker_threads');
parentPort.on('message', (message) => {
    // code to write file here

    console.log('worker script message log', message);

    // send message back to main thread
    parentPort.postMessage({
        success: true,
        error: null,
    });
});
file_writer.js file

How to run?

To run the scripts above, put both worker.js and file_writer.js in one folder and run node worker.js in your terminal from that folder. You should see the output below:

worker script message log { filePath: '/tmp/test.txt', data: { name: 'Test File' } }
worker thread message log { success: true, error: null }
log

The first message is logged by file_writer.js and the second by worker.js. This shows that a thread was started and that messages travelled in both directions.
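The same request and reply exchange is often easier to use when it is wrapped in a Promise. The following is a sketch, not part of the two files above: the worker source is inlined with the eval option so the example fits in one file, and sumUpTo is a hypothetical helper that uses a plain summing loop as the CPU-intensive task. The interval in the parent keeps firing while the worker computes, which shows that the main thread is not blocked.

```javascript
const { Worker } = require('worker_threads');

// Inline worker source: wait for one message, do the heavy loop, reply.
const workerSource = `
    const { parentPort } = require('worker_threads');
    parentPort.once('message', ({ limit }) => {
        let total = 0;
        for (let i = 1; i <= limit; i += 1) {
            total += i;
        }
        parentPort.postMessage({ success: true, total });
    });
`;

// Run the loop in a worker and resolve with the worker's reply.
function sumUpTo(limit) {
    return new Promise((resolve, reject) => {
        const worker = new Worker(workerSource, { eval: true });
        worker.once('message', resolve);
        worker.once('error', reject);
        worker.postMessage({ limit });
    });
}

// The ticker runs on the main thread while the worker is busy.
const ticker = setInterval(() => {
    console.log('main thread is still responsive');
}, 50);

sumUpTo(5e7).then((reply) => {
    clearInterval(ticker);
    console.log('worker reply', reply);
});
```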

Terminate worker

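Use the worker.terminate() method to terminate a worker thread. A worker that keeps listening for messages on parentPort, like file_writer.js above, stays alive until it is terminated, and the process does not exit while it is running.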

Starting a new worker thread every time we need one is costly. For better performance, start the thread once, keep a reference to it, and reuse it.
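One way to reuse a single worker for many tasks is to tag each message with an id and match replies to the pending requests. The sketch below assumes a trivial squaring task as a stand-in for real work; createSquareWorker, square, and close are hypothetical names, and error handling is left out to keep it short. The worker is started once and terminated only when no more work is expected.

```javascript
const { Worker } = require('worker_threads');

// Inline worker source: answer every message, echoing the task id back.
const workerSource = `
    const { parentPort } = require('worker_threads');
    parentPort.on('message', ({ id, value }) => {
        parentPort.postMessage({ id, result: value * value });
    });
`;

function createSquareWorker() {
    const worker = new Worker(workerSource, { eval: true }); // started once
    const pending = new Map(); // task id -> resolve callback
    let nextId = 0;

    // Route each reply to the Promise that is waiting for it.
    worker.on('message', ({ id, result }) => {
        pending.get(id)(result);
        pending.delete(id);
    });

    return {
        square(value) {
            return new Promise((resolve) => {
                const id = nextId;
                nextId += 1;
                pending.set(id, resolve);
                worker.postMessage({ id, value });
            });
        },
        // terminate() returns a Promise that settles once the worker has exited.
        close() {
            return worker.terminate();
        },
    };
}

async function main() {
    const squareWorker = createSquareWorker();
    const results = await Promise.all([
        squareWorker.square(2),
        squareWorker.square(3),
        squareWorker.square(4),
    ]);
    console.log(results); // [ 4, 9, 16 ]
    await squareWorker.close();
}

main();
```

All three tasks share one thread, so the startup cost is paid only once.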