Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x64 V8 builds run out of memory #4027

Closed
targos opened this issue Feb 26, 2025 · 13 comments · Fixed by nodejs/node#57330
Closed

x64 V8 builds run out of memory #4027

targos opened this issue Feb 26, 2025 · 13 comments · Fixed by nodejs/node#57330

Comments

@targos
Copy link
Member

targos commented Feb 26, 2025

Example: https://ci.nodejs.org/job/node-test-commit-v8-linux/6426/nodes=benchmark-ubuntu2204-intel-64,v8test=v8test/console

Here's what I see after some time if I run the build manually:

Image

@targos
Copy link
Member Author

targos commented Feb 26, 2025

It happens while running:

PATH=/home/iojs/build/workspace/node-test-commit-v8-linux/deps/v8/_depot_tools:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin ninja -C out.gn/x64.release/  d8 cctest inspector-test

@richardlau
Copy link
Member

@targos
Copy link
Member Author

targos commented Feb 26, 2025

Oh, I really thought it was the V8 update. I didn't see it also failed with main.

@targos targos changed the title x64 V8 13.4 builds run out of memory x64 V8 builds run out of memory Feb 26, 2025
@targos
Copy link
Member Author

targos commented Feb 26, 2025

apt logs around that date:

Start-Date: 2025-02-11  06:44:41
Commandline: /usr/bin/unattended-upgrade
Upgrade: vim:amd64 (2:8.2.3995-1ubuntu2.22, 2:8.2.3995-1ubuntu2.23), vim-common:amd64 (2:8.2.3995-1ubuntu2.22, 2:8.2.3995-1ubuntu2.23), vim-tiny:amd64 (2:8.2.3995-1ubuntu2.22, 2:8.2.3995-1ubuntu2.23), vim-runtime:amd64 (2:8.2.3995-1ubuntu2.22, 2:8.2.3995-1ubuntu2.23)
End-Date: 2025-02-11  06:44:44

Start-Date: 2025-02-11  06:44:45
Commandline: /usr/bin/unattended-upgrade
Upgrade: xxd:amd64 (2:8.2.3995-1ubuntu2.22, 2:8.2.3995-1ubuntu2.23)
End-Date: 2025-02-11  06:44:46

Start-Date: 2025-02-13  06:01:41
Commandline: /usr/bin/unattended-upgrade
Upgrade: tzdata:amd64 (2024b-0ubuntu0.22.04, 2024b-0ubuntu0.22.04.1)
End-Date: 2025-02-13  06:01:42

Start-Date: 2025-02-18  06:22:15
Commandline: /usr/bin/unattended-upgrade
Upgrade: intel-microcode:amd64 (3.20241112.0ubuntu0.22.04.1, 3.20250211.0ubuntu0.22.04.1)
End-Date: 2025-02-18  06:22:24

@targos
Copy link
Member Author

targos commented Feb 26, 2025

I can only guess that something changed in Chromium's tooling (depot_tools et al).

@richardlau
Copy link
Member

richardlau commented Mar 5, 2025

I saw https://ci.nodejs.org/job/node-test-commit-v8-linux/nodes=benchmark-ubuntu2204-intel-64,v8test=v8test/6438/ was running so ssh'ed into the machine to see what it was doing while it was running.

The build appears to be spawning a lot of python3 processes:

root@test-hetzner-ubuntu2204-x64-2 ~ # ps -ef | grep python3
...
iojs     2611881 2611878  0 10:41 ?        00:00:00 python3 /home/iojs/build/workspace/node-test-commit-v8-linux/deps/v8/_depot_tools/ninja.py -C out.gn/x64.release/ -j 20 d8 cctest inspector-test
iojs     2611884 2611881  0 10:41 ?        00:00:00 python3 /home/iojs/build/workspace/node-test-commit-v8-linux/deps/v8/_depot_tools/ninja.py -C out.gn/x64.release/ -j 20 d8 cctest inspector-test
iojs     2611887 2611884  0 10:41 ?        00:00:00 python3 /home/iojs/build/workspace/node-test-commit-v8-linux/deps/v8/_depot_tools/ninja.py -C out.gn/x64.release/ -j 20 d8 cctest inspector-test
iojs     2611890 2611887  0 10:41 ?        00:00:00 python3 /home/iojs/build/workspace/node-test-commit-v8-linux/deps/v8/_depot_tools/ninja.py -C out.gn/x64.release/ -j 20 d8 cctest inspector-test
iojs     2611893 2611890  0 10:41 ?        00:00:00 python3 /home/iojs/build/workspace/node-test-commit-v8-linux/deps/v8/_depot_tools/ninja.py -C out.gn/x64.release/ -j 20 d8 cctest inspector-test
iojs     2611896 2611893  0 10:41 ?        00:00:00 python3 /home/iojs/build/workspace/node-test-commit-v8-linux/deps/v8/_depot_tools/ninja.py -C out.gn/x64.release/ -j 20 d8 cctest inspector-test
iojs     2611899 2611896  0 10:41 ?        00:00:00 python3 /home/iojs/build/workspace/node-test-commit-v8-linux/deps/v8/_depot_tools/ninja.py -C out.gn/x64.release/ -j 20 d8 cctest inspector-test
iojs     2611902 2611899  0 10:41 ?        00:00:00 python3 /home/iojs/build/workspace/node-test-commit-v8-linux/deps/v8/_depot_tools/ninja.py -C out.gn/x64.release/ -j 20 d8 cctest inspector-test
root@test-hetzner-ubuntu2204-x64-2 ~ # ps -ef | grep python3 | wc -l
13160
root@test-hetzner-ubuntu2204-x64-2 ~ #

all of them with the command line:

python3 /home/iojs/build/workspace/node-test-commit-v8-linux/deps/v8/_depot_tools/ninja.py -C out.gn/x64.release/ -j 20 d8 cctest inspector-test

@targos
Copy link
Member Author

targos commented Mar 5, 2025

Maybe this ninja.py proxy is the new thing?

@richardlau
Copy link
Member

Maybe related to https://chromium-review.googlesource.com/c/chromium/tools/depot_tools/+/6259139? It landed on the day our CI started failing.

@richardlau
Copy link
Member

(I'm in a meeting right now, but my suspicion is the proxy is spawning itself instead of the actual ninja binary.)

@cclauss
Copy link
Contributor

cclauss commented Mar 5, 2025

Would lowering -j 20 give us more visibility or use less memory?

@richardlau
Copy link
Member

richardlau commented Mar 5, 2025

Maybe related to https://chromium-review.googlesource.com/c/chromium/tools/depot_tools/+/6259139? It landed on the day our CI started failing.

It is this change. There's logic to skip recursively calling the ninja proxy, but it was changed in that CL from

        if bin_dir.rstrip(os.sep).endswith("depot_tools"):
            # skip depot_tools to avoid calling ninja.py infinitely.
            continue

to

        bin_dir = bin_dir.rstrip(os.sep)
        if os.path.basename(bin_dir) == "depot_tools":
            # skip depot_tools to avoid calling ninja.py infinitely.
            continue

In our builds we have two depot_tools directories on the path:
i.e.

10:43:02 + PATH=/home/iojs/build/workspace/node-test-commit-v8-linux/deps/v8/_depot_tools:/home/iojs/build/workspace/node-test-commit-v8-linux/depot_tools:/home/iojs/venv/bin:/home/iojs/nghttp2/src:/home/iojs/wrk:/usr/lib/ccache:/usr/lib64/ccache:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin ninja -C out.gn/x64.release/ -j 20 d8 cctest inspector-test
  • /home/iojs/build/workspace/node-test-commit-v8-linux/deps/v8/_depot_tools
  • /home/iojs/build/workspace/node-test-commit-v8-linux/depot_tools

The old code (using endswith) skipped both of these directories, but the new code (==) doesn't skip the one with the leading _ (i.e. _depot_tools).

I don't know/remember why we have two depot_tools directories (seems one of those should be redundant?). The second of those is an explicit git clone in the build steps for node-test-commit-v8-linux:

# setup depot tools
git clone https://chromium.googlesource.com/chromium/tools/depot_tools.git depot_tools
export PATH=$WORKSPACE/depot_tools:$PATH

I'm wondering if the first (_depot_tools) that is problematic and being recursed is coming from us setting DEPOT_TOOLS_DIR in
https://github.com/nodejs/node/blob/4e222aea82abe0d5a4d82b5c975e094dae6c9d61/tools/make-v8.sh#L56-L59

  DEPOT_TOOLS_DIR="$(cd _depot_tools && pwd)"
  # shellcheck disable=SC2086
  PATH="$DEPOT_TOOLS_DIR":$PATH tools/dev/v8gen.py "$BUILD_ARCH_TYPE" $V8_BUILD_OPTIONS
  PATH="$DEPOT_TOOLS_DIR":$PATH ninja -C "out.gn/$BUILD_ARCH_TYPE/" "${JOBS_ARG}" d8 cctest inspector-test

@richardlau
Copy link
Member

richardlau commented Mar 5, 2025

I'm wondering if the first (_depot_tools) that is problematic and being recursed is coming from us setting DEPOT_TOOLS_DIR in https://github.com/nodejs/node/blob/4e222aea82abe0d5a4d82b5c975e094dae6c9d61/tools/make-v8.sh#L56-L59

Looks like it's actually coming from https://github.com/nodejs/node/blob/4e222aea82abe0d5a4d82b5c975e094dae6c9d61/tools/v8/node_common.py#L21

    depot_tools = os.path.join(v8_path, "_depot_tools")

via https://github.com/nodejs/node/blob/4e222aea82abe0d5a4d82b5c975e094dae6c9d61/tools/v8/fetch_deps.py#L71-L72

  # Check out depot_tools if necessary.
  depot_tools = node_common.EnsureDepotTools(v8_path, True)

@richardlau
Copy link
Member

I've opened nodejs/node#57330 to rename the clone we make with _depot_tools to depot_tools so that it gets excluded by the proxy again.

marco-ippolito pushed a commit to nodejs/node that referenced this issue Mar 7, 2025
Recent changes to `depot_tools`'s `ninja.py` proxy is causing infinite
recursion in our V8 CI builds as we checkout `depot_tools` into a
directory with a leading `_` (i.e. `_depot_tools`) and the proxy now
checks for an exact match (i.e. `== "depot_tools"`) instead of
`endswith("depot_tools")`.

Rename our checkout to `depot_tools` (without the leading `_`) so the
`ninja.py` proxy can exclude it when reinvoking `ninja`.

PR-URL: #57330
Fixes: nodejs/build#4027
Refs: https://chromium-review.googlesource.com/c/chromium/tools/depot_tools/+/6259139
Reviewed-By: Michaël Zasso <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Marco Ippolito <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Christian Clauss <[email protected]>
aduh95 pushed a commit to nodejs/node that referenced this issue Mar 9, 2025
Recent changes to `depot_tools`'s `ninja.py` proxy is causing infinite
recursion in our V8 CI builds as we checkout `depot_tools` into a
directory with a leading `_` (i.e. `_depot_tools`) and the proxy now
checks for an exact match (i.e. `== "depot_tools"`) instead of
`endswith("depot_tools")`.

Rename our checkout to `depot_tools` (without the leading `_`) so the
`ninja.py` proxy can exclude it when reinvoking `ninja`.

PR-URL: #57330
Fixes: nodejs/build#4027
Refs: https://chromium-review.googlesource.com/c/chromium/tools/depot_tools/+/6259139
Reviewed-By: Michaël Zasso <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Marco Ippolito <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Christian Clauss <[email protected]>
aduh95 pushed a commit to nodejs/node that referenced this issue Mar 9, 2025
Recent changes to `depot_tools`'s `ninja.py` proxy is causing infinite
recursion in our V8 CI builds as we checkout `depot_tools` into a
directory with a leading `_` (i.e. `_depot_tools`) and the proxy now
checks for an exact match (i.e. `== "depot_tools"`) instead of
`endswith("depot_tools")`.

Rename our checkout to `depot_tools` (without the leading `_`) so the
`ninja.py` proxy can exclude it when reinvoking `ninja`.

PR-URL: #57330
Fixes: nodejs/build#4027
Refs: https://chromium-review.googlesource.com/c/chromium/tools/depot_tools/+/6259139
Reviewed-By: Michaël Zasso <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Marco Ippolito <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Christian Clauss <[email protected]>
richardlau added a commit to nodejs/node that referenced this issue Mar 13, 2025
Recent changes to `depot_tools`'s `ninja.py` proxy is causing infinite
recursion in our V8 CI builds as we checkout `depot_tools` into a
directory with a leading `_` (i.e. `_depot_tools`) and the proxy now
checks for an exact match (i.e. `== "depot_tools"`) instead of
`endswith("depot_tools")`.

Rename our checkout to `depot_tools` (without the leading `_`) so the
`ninja.py` proxy can exclude it when reinvoking `ninja`.

PR-URL: #57330
Fixes: nodejs/build#4027
Refs: https://chromium-review.googlesource.com/c/chromium/tools/depot_tools/+/6259139
Reviewed-By: Michaël Zasso <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Marco Ippolito <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Christian Clauss <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants