GRPC Node Client high memory usage, likely a memory leak #686
Comments
Can you show some specific code that experiences this problem?
@murgatroid99 Thanks for your reply. I did a simple test with the grpc node example here, and made a small change in the client.
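The modified client itself is not shown above; a hedged sketch of that kind of change, based on the standard helloworld greeter client rather than the poster's actual code, would fire 100,000 calls and print memory usage before and after without waiting for them to finish:

```js
// Hypothetical sketch, not the original poster's code: start 100,000
// sayHello calls from the helloworld example client and print memory
// usage before and after, without waiting for the calls to complete.
const grpc = require('grpc');
const protoLoader = require('@grpc/proto-loader');

const PROTO_PATH = __dirname + '/helloworld.proto'; // path assumed
const helloProto = grpc.loadPackageDefinition(
    protoLoader.loadSync(PROTO_PATH)).helloworld;
const client = new helloProto.Greeter('localhost:50051',
                                      grpc.credentials.createInsecure());

console.log('before:', process.memoryUsage());
for (let i = 0; i < 100000; i++) {
  client.sayHello({name: 'world'}, () => {});
}
// At this point all 100,000 calls are still pending.
console.log('after :', process.memoryUsage());
```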
I observed that RSS grows quickly, and so does the heap, but the two do not grow proportionally.
gRPC client requests are asynchronous. What you are doing here is initiating 100,000 requests, and then checking the memory usage again without waiting for any of them to complete. Each request uses some memory while it is active, so it is completely normal that the memory usage would be higher at that time. Even accounting for that, increased memory usage alone is not sufficient to conclude that there is a memory leak. If you could show that after allowing calls to complete and the garbage collector to run, there is still increased memory usage proportional to the number of calls, then there is a memory leak.
Yes, you are right. I have been observing a memory leak in my production code, where I use the grpc client to support logging. Let me come up with correct test code. Thanks.
Here's a modified example that demonstrates the issue:
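The snippet itself is not reproduced in this thread; an assumed sketch of a comparable modification (not the commenter's exact code) would wait for all calls to complete and force a garbage collection before sampling memory again, run with `node --expose-gc`:

```js
// Assumed sketch, not the commenter's exact code: run 100,000 calls,
// wait for all of them to finish, force a GC (requires --expose-gc),
// then compare memory usage.
const grpc = require('grpc');
const protoLoader = require('@grpc/proto-loader');

const PROTO_PATH = __dirname + '/helloworld.proto'; // path assumed
const helloProto = grpc.loadPackageDefinition(
    protoLoader.loadSync(PROTO_PATH)).helloworld;
const client = new helloProto.Greeter('localhost:50051',
                                      grpc.credentials.createInsecure());

const TOTAL = 100000;
let completed = 0;

console.log('before:', process.memoryUsage());
for (let i = 0; i < TOTAL; i++) {
  client.sayHello({name: 'world'}, () => {
    if (++completed === TOTAL) {
      global.gc();
      console.log('after GC:', process.memoryUsage());
    }
  });
}
```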
I ran this against the example server, unmodified, and here's the output:
The heap stayed about the same after GC, but RSS shot up and stayed up. Perhaps there is some memory leak in the native code (code not using Node's heap)? This is with:
The fact that memory increases here does not imply that there is a memory leak. In this example, the client object is still in scope, and it makes sense that the internal objects associated with it will use an approximately fixed amount of memory after making some requests as opposed to before. If you want to show that calls are leaking memory, you have to show that the increase in memory usage after garbage collection is proportional to the number of calls made. In other words, you need more data points.
The fact that RSS stays up is normal behavior, and worrying about this value is the wrong thing to do. RSS isn't physically allocated memory. It's "reserved" memory, and it essentially never goes down once it has gone up. The pages inside the RSS sections are given back to the kernel, but the RSS high-water mark doesn't move, because there might be pages there that are still in use. You can end up with a machine that has 10 times more memory in the RSS sections of its processes than its total physical memory plus swap space. That's not an indication of bad behavior; it only indicates that the memory usage of those processes went up at some point, and it helps gauge their peak memory usage.

Closing this one.
As an additional note, the much greater increase in RSS comes from the fact that you had 100,000 calls pending at the same time.
For further proof, if I modify the code further to dump memory usage after every RPC, I get the following graphs. If there were a leak demonstrable with your numbers, you'd see linear growth somewhere; that isn't the case here. Link to the data + live graph: https://docs.google.com/spreadsheets/d/1jXcZ88EnR15k5QUcmgvFF0BI5BRe7jRaS-hVPOkjwTo/edit?usp=sharing

New code:
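The code itself is not preserved here; an assumed sketch of that kind of change (still starting all the calls at once, but logging one data point per completed RPC so it can be plotted) is:

```js
// Assumed sketch: still start all 100,000 calls at once, but log a
// CSV-style data point (completed count, rss, heapTotal, heapUsed)
// after every RPC completes, suitable for plotting.
const grpc = require('grpc');
const protoLoader = require('@grpc/proto-loader');

const PROTO_PATH = __dirname + '/helloworld.proto'; // path assumed
const helloProto = grpc.loadPackageDefinition(
    protoLoader.loadSync(PROTO_PATH)).helloworld;
const client = new helloProto.Greeter('localhost:50051',
                                      grpc.credentials.createInsecure());

let completed = 0;
for (let i = 0; i < 100000; i++) {
  client.sayHello({name: 'world'}, () => {
    const m = process.memoryUsage();
    console.log([++completed, m.rss, m.heapTotal, m.heapUsed].join(','));
  });
}
```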
Also, as @murgatroid99 pointed out, the methodology of running all RPCs at once is inherently flawed. If we run them sequentially, however, the measurement is much more relevant, and here are the corresponding graphs, which show very clearly that there is no memory leak. Link to the data: https://docs.google.com/spreadsheets/d/12i9MC7Nl1Zkg3gInOhrhk_JRVAA17Rc-6UF5dxRCPAE/edit?usp=sharing

Associated code:
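Again the original code is not included in this copy; an assumed sketch of a sequential version issues each RPC only after the previous one completes and logs memory usage per call:

```js
// Assumed sketch of a sequential version: each call is issued only
// after the previous one has completed, logging memory usage per call.
const grpc = require('grpc');
const protoLoader = require('@grpc/proto-loader');

const PROTO_PATH = __dirname + '/helloworld.proto'; // path assumed
const helloProto = grpc.loadPackageDefinition(
    protoLoader.loadSync(PROTO_PATH)).helloworld;
const client = new helloProto.Greeter('localhost:50051',
                                      grpc.credentials.createInsecure());

function run(i) {
  if (i >= 100000) return;
  client.sayHello({name: 'world'}, () => {
    const m = process.memoryUsage();
    console.log([i, m.rss, m.heapTotal, m.heapUsed].join(','));
    run(i + 1);
  });
}
run(0);
```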
By comparison, here's what a memory leak looks like: (taken from nodejs/help#1518)
OK. Maybe I have a leak somewhere else. I'm getting a lot of gRPC-related errors in my logs that coincide with extremely high and ever-growing RSS (>10GB), matching the OP's symptoms. I'll poke around some more. Thanks for investigating.
It might still be grpc-related, I'm not contesting that, but your reproduction case definitely wasn't demonstrating any memory leak. I'm not excluding the possibility that some error leaks memory, for instance, but you'd need to come up with a reproduction case that triggers that error and demonstrates a leak.
Yeah, I was mostly just adjusting the original reproduction case to address the concern raised (the one about the async nature), hoping that it would help highlight some issue. As for the Node graph, I'm trying to figure out the difference between it and the graph from the reproduction case, other than the fact that the Node graph was gathered over two days and the grpc graph was immediate (because we are running all 100k requests at once). If I can show slow but steady growth over time using that sample script, but slowed down substantially, would that help?
The graph I produced took samples over a few seconds, and it plateaued within milliseconds. I could let it run forever and it would probably still plateau similarly. Also, the location of the plateau is correlated with the moment V8's garbage collector changed strategy.

The other graph took days to gather, and it kept growing linearly over that timeframe. It started slowing down after some hours because the machine was thrashing due to having less swap space to work with, but I'm fairly certain that if its X axis were "number of operations" like mine, instead of "time", the RSS section would still be growing linearly.

So yes, if you can come up with some code sample that shows linear growth in memory usage, that would be useful. It should probably look like this over a couple of minutes: (yay paint)
Problem description
The gRPC Node.js client is consuming a lot of heap memory and crashing the Node application.
Reproduction steps
Call the gRPC Node client 10,000 times in a loop; it shows very high heap memory usage.
I tested it with simple RPC and client-streaming RPC; both consume a large amount of heap memory.
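The reproduction code is not attached here; a hedged sketch of the client-streaming case, using a hypothetical metrics proto and service (not the reporter's actual definitions), could look like this:

```js
// Hypothetical sketch only: the metrics.proto file, package, service
// and method names below are invented for illustration, not taken from
// the reporter's application.
const grpc = require('grpc');
const protoLoader = require('@grpc/proto-loader');

const PROTO_PATH = __dirname + '/metrics.proto'; // hypothetical proto
const metricsProto = grpc.loadPackageDefinition(
    protoLoader.loadSync(PROTO_PATH)).metrics;   // hypothetical package
const client = new metricsProto.MetricsService(  // hypothetical service
    'localhost:50051', grpc.credentials.createInsecure());

console.log('before:', process.memoryUsage());

// Client-streaming RPC: open one stream and write 10,000 metric records.
const call = client.recordMetrics((err, summary) => {  // hypothetical method
  console.log('after :', process.memoryUsage());
});
for (let i = 0; i < 10000; i++) {
  call.write({name: 'some_metric', value: i});
}
call.end();
```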
Environment
Additional context
Using the gRPC client to send metrics to the gRPC server; the gRPC server is implemented in Java.
The Node application consumes data from Kafka and generates the metrics.
When the test was run for 10,000 records, the heap usage observed for simple RPC is below:
Client-streaming RPC appears costlier than the previous case: