-
Notifications
You must be signed in to change notification settings - Fork 141
Possible Memory Leak #13
Comments
This is intriguing as we are running the same code for weeks without OOMs, and the profiles look clean. |
This is how I launch the instance |
From the screenshot, it seems those are only using 5GBs or so, with a total of 7GBs consumed. Am I misreading it? |
yes, but this is only after around 30 minutes. The memory consumptions continues to scale linearly, so after 2 or 3 hours around 30gb of memory get's consumed. |
Alright. I think we should take a look at how the db prefetcher is handled and whether it's properly closed on all branches then. |
I have replicated this same issue on another machine. The greedy algotype won't trigger this issue because it doesn't have this line: builder/miner/algo_common.go. Likewise, if you don't have the BUILDER_TX_SIGNING_KEY environment variable the line isn't reached and there is no error. Perhaps thats why other people running this software are not having this issue? |
@cjios thanks for raising this issue I've looked into it. Code handling prefetcher is problematic but we don't see any errors because its never used if:
It works like that because the prefetcher is actually started after the first tx is committed. If the prefetcher was copied it never started and in both of these situations, we copy the state before committing the first tx so it's effectively useless. One solution would be to simply remove the prefetcher startup explicitly and forget about it since we don't use it. But I think that instead, we should fix it so it's never leaked and actually used properly. We don't need it for tx / bundles commit because that part of the code uses snapshots anyway but we need it for block finalization where we calculate the state hash of the block. It takes a considerable amount of time and it hits the disk more than needed. Block finalization is around 30% of the block creation and it can benefit from proper prefetcher. |
System information
Geth version:
1.10.23-stable
Go version:
go1.19.3
OS & Version: Linux
Commit hash : 9ca4d16
builder/miner/algo_common.go
Line 74 in 6cdb430
This line is causing a memory leak for me. My system has 128gb of memory, but after running the builder for around 10-12 hours the node consumes all my available memory and crashes. Commenting out this line results in the builder only using 2-3GB of memory. The following image is the pprof heap output after running the node for around 30 minutes. As you can see the trie being copied is consuming most of the heap space. I'm not sure why the garbage collector isn't clearing the space properly?
The text was updated successfully, but these errors were encountered: