
GRPC Node Client high memory usage, likely a memory leak #686

Closed
Vijendra07Kulhade opened this issue Dec 22, 2018 · 15 comments

@Vijendra07Kulhade

Problem description

The gRPC Node.js client is consuming a lot of heap memory and crashing the Node application.

Reproduction steps

Call the gRPC Node client 10,000 times in a loop; heap memory usage becomes very high. I tested this with both simple RPC and client-streaming RPC, and both consume a large amount of heap memory.

Environment

  • OS name, version and architecture: [Mac Mojave]
  • Node version [10.7.0 and 9.11.2]
  • Node installation method [brew, yum]
  • If applicable, compiler version [Apple LLVM version 10.0.0 (clang-1000.11.45.5) ]
  • Package name and version [[email protected] and [email protected] ]

Additional context

I am using the gRPC client to send metrics to a gRPC server; the server is written in Java. The Node application consumes data from Kafka and generates the metrics. When I ran the test with 10,000 records, the heap usage observed for simple RPC was:

Memory usage in the beginning  { rss: 39030784,
  heapTotal: 23445504,
  heapUsed: 12618584,
  external: 82137 }
Memory usage at the end  { rss: 2993180672,
  heapTotal: 1056776192,
  heapUsed: 961288312,
  external: 12063228 }

Client-streaming RPC appears to be even costlier:

Memory usage in the beginning  { rss: 37113856,
  heapTotal: 22396928,
  heapUsed: 14024248,
  external: 600014 }
Memory usage at the end  { rss: 3347972096,
  heapTotal: 926752768,
  heapUsed: 831483600,
  external: 24105468 }
@murgatroid99
Member

Can you show some specific code that experiences this problem?

@Vijendra07Kulhade
Author

@murgatroid99 Thanks for your reply. I did a simple test with the gRPC Node example here:
https://github.com/grpc/grpc/tree/master/examples/node
I started the gRPC server from this code:
https://github.com/grpc/grpc/blob/master/examples/node/dynamic_codegen/greeter_server.js

and made a small change in the client:

function main() {
  var client = new hello_proto.Greeter('localhost:50051',
                                       grpc.credentials.createInsecure());
  var user;
  if (process.argv.length >= 3) {
    user = process.argv[2];
  } else {
    user = 'world';
  }
  console.log('Memory before ', JSON.stringify(process.memoryUsage()));
  for (let i = 0; i < 100000; i++) {
    client.sayHello({name: user}, function(err, response) {
      console.log('Greeting:', response.message);
    });
  }
  console.log('Memory After grpc call ', JSON.stringify(process.memoryUsage()));
}

I observed that RSS grows quickly, and so does the heap, but the two do not grow proportionately.

@murgatroid99
Member

gRPC client requests are asynchronous. What you are doing here is initiating 100,000 requests and then checking the memory usage again without waiting for any of them to complete. Each request uses some memory while it is active, so it is completely normal for memory usage to be higher at that point.

Even accounting for that, increased memory usage alone is not sufficient to conclude that there is a memory leak. If you can show that, after allowing the calls to complete and the garbage collector to run, memory usage remains elevated in proportion to the number of calls made, then there is a memory leak.

@Vijendra07Kulhade
Author

Yes, you are right. I have been observing a memory leak in my production code, where I am using the gRPC client to support logging. Let me come up with a correct test case. Thanks.

@dpkirchner

dpkirchner commented Mar 7, 2019

Here's a modified example that demonstrates the issue:

function main() {
  var client = new hello_proto.Greeter('localhost:50051',
                                       grpc.credentials.createInsecure());
  var user;
  if (process.argv.length >= 3) {
    user = process.argv[2];
  } else {
    user = 'world';
  }

  global.gc();
  console.log('Memory before ',JSON.stringify(process.memoryUsage()));

  var count = 100000;
  var done = 0;
  for(let i=0;i<count;i++){
     client.sayHello({name: user}, function(err, response) {
       //console.log('Greeting:', response.message);
       done++;

       if (done == count) {
         console.log('Memory After grpc call ',JSON.stringify(process.memoryUsage()));
         global.gc();
         console.log('Memory After global gc ',JSON.stringify(process.memoryUsage()));
       }
     });
  }
}

I ran this against the example server, unmodified, and here's the output:

$ node --expose_gc ./dynamic_codegen/greeter_client.js 
Memory before  {"rss":42614784,"heapTotal":19783680,"heapUsed":9054992,"external":722523}
Memory After grpc call  {"rss":1805291520,"heapTotal":290840576,"heapUsed":257489912,"external":941424}
Memory After global gc  {"rss":1575428096,"heapTotal":59629568,"heapUsed":9215912,"external":811472}

The heap stayed about the same after GC, but RSS shot up and stayed up. Perhaps there is a memory leak in the native code (code that doesn't use Node's heap)?

This is with:

$ cat package.json
{
  "name": "grpc-examples",
  "version": "0.1.0",
  "dependencies": {
    "@grpc/proto-loader": "^0.1.0",
    "async": "^1.5.2",
    "google-protobuf": "^3.0.0",
    "grpc": "^1.11.0",
    "lodash": "^4.6.1",
    "minimist": "^1.2.0"
  }
}
$ node --version
v8.15.1

@murgatroid99
Member

The fact that memory increases here does not imply that there is a memory leak. In this example, the client object is still in scope, and it makes sense that the internal objects associated with it will use an approximately fixed amount of memory after making some requests as opposed to before.

If you want to show that calls are leaking memory, you have to show that the increase in memory usage after garbage collection is proportional to the number of calls made. In other words, you need more data points.
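
A minimal sketch of that methodology (not code from this thread; it assumes the client and user variables from the example above and must be run with node --expose_gc): run batches of increasing size, let each batch finish, force a garbage collection, and record the residual heapUsed. A per-call leak would show the residual value growing roughly in proportion to the batch size.

// Hypothetical sketch: measure residual heap after batches of different sizes.
function runBatch(size, callback) {
  let done = 0;
  for (let i = 0; i < size; i++) {
    client.sayHello({name: user}, function(err, response) {
      if (++done === size) {
        global.gc();                               // requires --expose_gc
        callback(process.memoryUsage().heapUsed);  // residual heap after GC
      }
    });
  }
}

const sizes = [1000, 10000, 100000];
(function next(idx) {
  if (idx === sizes.length) return;
  runBatch(sizes[idx], function(residualHeap) {
    console.log(sizes[idx] + ' calls -> residual heapUsed: ' + residualHeap);
    next(idx + 1);
  });
})(0);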

@nicolasnoble
Member

The fact that RSS stays up is normal behavior. Worrying about this value is the wrong thing to do.

RSS isn't physically allocated memory. It's "reserved" memory, and it essentially never goes down once it has gone up. The pages inside the RSS sections are given back to the kernel, but the RSS high watermark doesn't move, because there may still be pages in there that are in use.

You can end up with a machine where the RSS sections of its processes add up to 10 times the total physical memory plus swap space. That's not an indication of bad behavior; it only indicates that the memory usage of those processes went up at some point, and it helps gauge their peak memory usage.
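
As an illustration of that behavior (a minimal standalone sketch, not taken from this thread; run with node --expose_gc): allocate and then release a few hundred megabytes of heap objects, force a collection, and compare heapUsed with rss. heapUsed falls back down, while rss typically stays near its high watermark.

// Hypothetical sketch of the RSS high-watermark effect.
function snapshot(label) {
  const { rss, heapTotal, heapUsed } = process.memoryUsage();
  console.log(label, { rss, heapTotal, heapUsed });
}

snapshot('before');

let junk = [];
for (let i = 0; i < 2000000; i++) {
  junk.push({ id: i, payload: 'x'.repeat(100) });  // roughly a few hundred MB of heap
}
snapshot('after allocation');

junk = null;          // drop the only reference
global.gc();          // requires --expose_gc
snapshot('after gc'); // heapUsed drops back down; rss typically stays near its peak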

Closing this one.

@murgatroid99
Member

murgatroid99 commented Mar 7, 2019

As an additional note, the heapUsed and external values show actual memory usage, and the increases to those values are so small compared to the number of calls that this test almost surely rules out a per-call memory leak.

The much greater increase to RSS comes from the fact that you had 100,000 calls pending at the same time.
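
For comparison, a minimal sketch of capping concurrency (not from this thread; maxInFlight is an illustrative value, and client and user come from the example above): with only a bounded number of calls pending at any time, peak memory stays far lower than when all 100,000 calls are started at once.

// Hypothetical sketch: keep at most maxInFlight calls pending at a time.
const count = 100000;
const maxInFlight = 100;
let started = 0;
let done = 0;

function startNext() {
  if (started >= count) return;
  started++;
  client.sayHello({name: user}, function(err, response) {
    done++;
    if (done === count) {
      console.log('Memory after all calls ', JSON.stringify(process.memoryUsage()));
    } else {
      startNext();  // replace the completed call with a new one
    }
  });
}

// Prime the window with the first maxInFlight calls.
for (let i = 0; i < maxInFlight; i++) startNext();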

@nicolasnoble
Member

For further proof, if I modify the code further to dump memory usage after every RPC, I get the following graph:

[chart: rss, heapTotal, heapUsed, and external plotted for each completed RPC]

If there were a leak demonstrable by your numbers, you'd see linear growth somewhere. That isn't the case here.

Link to the data + live graph:

https://docs.google.com/spreadsheets/d/1jXcZ88EnR15k5QUcmgvFF0BI5BRe7jRaS-hVPOkjwTo/edit?usp=sharing

New code:

  let count = 100000
  for (let i = 0; i < count; i++) {
    client.sayHello({name: user}, function(err, response) {
      let usage = process.memoryUsage()
      console.log(usage.rss + ',' + usage.heapTotal + ',' + usage.heapUsed + ',' + usage.external)
    })
  }

@nicolasnoble
Member

Also, as @murgatroid99 pointed out, the methodology of starting all of the RPCs at once is inherently flawed. If we run them sequentially, however, the measurement is much more relevant; here is the corresponding graph, which shows very clearly that there is no memory leak:

[chart: rss, heapTotal, heapUsed, and external plotted for sequential RPCs]

Link to the data:

https://docs.google.com/spreadsheets/d/12i9MC7Nl1Zkg3gInOhrhk_JRVAA17Rc-6UF5dxRCPAE/edit?usp=sharing

Associated code:

  let count = 100000
  let done = 0
  let startCall
  let cb = function(err, response) {
    let usage = process.memoryUsage()
    console.log(usage.rss + ',' + usage.heapTotal + ',' + usage.heapUsed + ',' + usage.external)
    if (++done != count) startCall()
  }
  startCall = function() {
    client.sayHello({ name: user }, cb)
  }
  startCall()

@nicolasnoble
Member

By comparison, here's what a memory leak looks like:

[image: memory-usage graph showing steady growth over time]

(taken from nodejs/help#1518)

@dpkirchner

OK. Maybe I have a leak somewhere else. I'm getting a lot of gRPC-related errors in my logs that coincide with extremely high and ever-growing RSS (>10 GB), matching the OP's symptoms. I'll poke around some more. Thanks for investigating.

@nicolasnoble
Member

It might still be gRPC-related, I'm not contesting that, but your reproduction case definitely wasn't demonstrating a memory leak. I'm not excluding the possibility that some error path leaks memory, for instance, but you'd need to come up with a reproduction case that triggers that error and demonstrates a leak.

@dpkirchner

Yeah, I was mostly just adjusting the original reproduction case to address the concern raised (the one about the async nature), hoping that it would help highlight the issue.

As for the Node graph, I'm trying to figure out the difference between it and the graph from the reproduction case, other than the fact that the Node graph spans two days while the gRPC graph is immediate (because we are running all 100k requests at once). If I can show slow but steady growth over time using that sample script, just slowed down substantially, would that help?

@nicolasnoble
Member

The graph I produced took samples over a few seconds, and it plateaued within milliseconds. I could let it run forever and it would probably still plateau similarly. Also, the location of the plateau is correlated with the moment V8's garbage collector changed strategy:

[image: the per-RPC memory chart, annotated at the point where the plateau begins]

The other graph took days to gather, and the memory grew linearly over that entire timeframe. It started slowing down after some hours because the machine was thrashing from running low on swap space, but I'm fairly certain that if its X axis were "number of operations", like mine, instead of "time", the RSS line would be growing fully linearly.

So yes, if you can come up with some code sample that shows a linear growth in memory usage, that'd be useful. It should probably look like this over a couple of minutes: (yay paint)

[image: hand-drawn sketch of the expected linear memory growth]
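
A minimal sketch of the kind of test being asked for (not from this thread; it assumes the client and user variables from the example and must be run with node --expose_gc): sequential calls with memory sampled after a forced GC every 10,000 operations. A leak would show the logged values climbing roughly linearly with the operation count.

// Hypothetical sketch: sequential calls, sampling memory every 10,000 operations.
const total = 1000000;
let done = 0;

function startCall() {
  client.sayHello({name: user}, function(err, response) {
    done++;
    if (done % 10000 === 0) {
      global.gc();  // requires --expose_gc
      const u = process.memoryUsage();
      console.log(done + ',' + u.rss + ',' + u.heapTotal + ',' + u.heapUsed + ',' + u.external);
    }
    if (done < total) startCall();
  });
}

startCall();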

The lock bot locked this issue as resolved and limited the conversation to collaborators on Jun 6, 2019.