Skip to content

test_interactive_remote_function_calls_no_memory_leak fails on python3.7+ for Mac OS #272

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
pierreglaser opened this issue May 21, 2019 · 3 comments · Fixed by #330
Closed

Comments

@pierreglaser
Copy link
Member

Title says it all. Should we add a skipif directive?

@pierreglaser pierreglaser changed the title test_interactive_remote_function_calls_no_memory_leakfails on python3.7+ for Mac OS test_interactive_remote_function_calls_no_memory_leak fails on python3.7+ for Mac OS May 21, 2019
@ogrisel
Copy link
Contributor

ogrisel commented May 21, 2019

Can you please paste the traceback here?

What is the cause of the problem? We should try to fix it rather than ignore it. It's possibly revealing a real bug.

@pierreglaser
Copy link
Member Author

pierreglaser commented May 21, 2019

Traceback

__________________________________________________________________ CloudPickleTest.test_interactive_remote_function_calls_no_memory_leak ___________________________________________________________________

source_code = 'if __name__ == "__main__":\n        from testutils import subprocess_worker\n        import struct\n\n        with su...function call.\n            growth = w.memsize() - reference_size\n            assert growth < 1e7, growth\n\n        '
timeout = 60

    def assert_run_python_script(source_code, timeout=TIMEOUT):
        """Utility to help check pickleability of objects defined in __main__

        The script provided in the source code should return 0 and not print
        anything on stderr or stdout.
        """
        fd, source_file = tempfile.mkstemp(suffix='_src_test_cloudpickle.py')
        os.close(fd)
        try:
            with open(source_file, 'wb') as f:
                f.write(source_code.encode('utf-8'))
            cmd = [sys.executable, source_file]
            cwd, env = _make_cwd_env()
            kwargs = {
                'cwd': cwd,
                'stderr': STDOUT,
                'env': env,
            }
            # If coverage is running, pass the config file to the subprocess
            coverage_rc = os.environ.get("COVERAGE_PROCESS_START")
            if coverage_rc:
                kwargs['env']['COVERAGE_PROCESS_START'] = coverage_rc
            if timeout_supported:
                kwargs['timeout'] = timeout
            try:
                try:
>                   out = check_output(cmd, **kwargs)

tests/testutils.py:204:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

timeout = 60, popenargs = (['/Users/pierreglaser/.virtualenvs/cloudpickle_py38/bin/python', '/var/folders/_t/w7jm81211td6bsjgb15qgtk80000gn/T/tmplhl3xydo_src_test_cloudpickle.py'],)
kwargs = {'cwd': '/Users/pierreglaser/repos/cloudpickle', 'env': {'ASSORTMENT_HOME': '/Users/pierreglaser/verteego/monoprix/ass...launchd.clIrqFP4ke/Render', 'CE_HOME': '/Users/pierreglaser/INRIA/patricio_cerda', 'CLICOLOR': '1', ...}, 'stderr': -2}

    def check_output(*popenargs, timeout=None, **kwargs):
        r"""Run command with arguments and return its output.

        If the exit code was non-zero it raises a CalledProcessError.  The
        CalledProcessError object will have the return code in the returncode
        attribute and output in the output attribute.

        The arguments are the same as for the Popen constructor.  Example:

        >>> check_output(["ls", "-l", "/dev/null"])
        b'crw-rw-rw- 1 root root 1, 3 Oct 18  2007 /dev/null\n'

        The stdout argument is not allowed as it is used internally.
        To capture standard error in the result, use stderr=STDOUT.

        >>> check_output(["/bin/sh", "-c",
        ...               "ls -l non_existent_file ; exit 0"],
        ...              stderr=STDOUT)
        b'ls: non_existent_file: No such file or directory\n'

        There is an additional optional argument, "input", allowing you to
        pass a string to the subprocess's stdin.  If you use this argument
        you may not also use the Popen constructor's "stdin" argument, as
        it too will be used internally.  Example:

        >>> check_output(["sed", "-e", "s/foo/bar/"],
        ...              input=b"when in the course of fooman events\n")
        b'when in the course of barman events\n'

        By default, all communication is in bytes, and therefore any "input"
        should be bytes, and the return value wil be bytes.  If in text mode,
        any "input" should be a string, and the return value will be a string
        decoded according to locale encoding, or by "encoding" if set. Text mode
        is triggered by setting any of text, encoding, errors or universal_newlines.
        """
        if 'stdout' in kwargs:
            raise ValueError('stdout argument not allowed, it will be overridden.')

        if 'input' in kwargs and kwargs['input'] is None:
            # Explicitly passing input=None was previously equivalent to passing an
            # empty string. That is maintained here for backwards compatibility.
            kwargs['input'] = '' if kwargs.get('universal_newlines', False) else b''

>       return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
                   **kwargs).stdout

/usr/local/lib/python3.8/subprocess.py:395:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

input = None, capture_output = False, timeout = 60, check = True
popenargs = (['/Users/pierreglaser/.virtualenvs/cloudpickle_py38/bin/python', '/var/folders/_t/w7jm81211td6bsjgb15qgtk80000gn/T/tmplhl3xydo_src_test_cloudpickle.py'],)
kwargs = {'cwd': '/Users/pierreglaser/repos/cloudpickle', 'env': {'ASSORTMENT_HOME': '/Users/pierreglaser/verteego/monoprix/ass...P4ke/Render', 'CE_HOME': '/Users/pierreglaser/INRIA/patricio_cerda', 'CLICOLOR': '1', ...}, 'stderr': -2, 'stdout': -1}
process = <subprocess.Popen object at 0x105232f00>
stdout = b'Traceback (most recent call last):\n  File "/var/folders/_t/w7jm81211td6bsjgb15qgtk80000gn/T/tmplhl3xydo_src_test_cloudpickle.py", line 32, in <module>\n    assert growth < 1e7, growth\nAssertionError: 10539008\n'
stderr = None, retcode = 1

    def run(*popenargs,
            input=None, capture_output=False, timeout=None, check=False, **kwargs):
        """Run command with arguments and return a CompletedProcess instance.

        The returned instance will have attributes args, returncode, stdout and
        stderr. By default, stdout and stderr are not captured, and those attributes
        will be None. Pass stdout=PIPE and/or stderr=PIPE in order to capture them.

        If check is True and the exit code was non-zero, it raises a
        CalledProcessError. The CalledProcessError object will have the return code
        in the returncode attribute, and output & stderr attributes if those streams
        were captured.

        If timeout is given, and the process takes too long, a TimeoutExpired
        exception will be raised.

        There is an optional argument "input", allowing you to
        pass bytes or a string to the subprocess's stdin.  If you use this argument
        you may not also use the Popen constructor's "stdin" argument, as
        it will be used internally.

        By default, all communication is in bytes, and therefore any "input" should
        be bytes, and the stdout and stderr will be bytes. If in text mode, any
        "input" should be a string, and stdout and stderr will be strings decoded
        according to locale encoding, or by "encoding" if set. Text mode is
        triggered by setting any of text, encoding, errors or universal_newlines.

        The other arguments are the same as for the Popen constructor.
        """
        if input is not None:
            if 'stdin' in kwargs:
                raise ValueError('stdin and input arguments may not both be used.')
            kwargs['stdin'] = PIPE

        if capture_output:
            if ('stdout' in kwargs) or ('stderr' in kwargs):
                raise ValueError('stdout and stderr arguments may not be used '
                                 'with capture_output.')
            kwargs['stdout'] = PIPE
            kwargs['stderr'] = PIPE

        with Popen(*popenargs, **kwargs) as process:
            try:
                stdout, stderr = process.communicate(input, timeout=timeout)
            except TimeoutExpired:
                process.kill()
                stdout, stderr = process.communicate()
                raise TimeoutExpired(process.args, timeout, output=stdout,
                                     stderr=stderr)
            except:  # Including KeyboardInterrupt, communicate handled that.
                process.kill()
                # We don't call process.wait() as .__exit__ does that for us.
                raise
            retcode = process.poll()
            if check and retcode:
>               raise CalledProcessError(retcode, process.args,
                                         output=stdout, stderr=stderr)
E               subprocess.CalledProcessError: Command '['/Users/pierreglaser/.virtualenvs/cloudpickle_py38/bin/python', '/var/folders/_t/w7jm81211td6bsjgb15qgtk80000gn/T/tmplhl3xydo_src_test_cloudpickle.py']' returned non-zero exit status 1.

/usr/local/lib/python3.8/subprocess.py:487: CalledProcessError

During handling of the above exception, another exception occurred:

self = <tests.cloudpickle_test.CloudPickleTest testMethod=test_interactive_remote_function_calls_no_memory_leak>

    @pytest.mark.skipif(platform.python_implementation() == 'PyPy',
                        reason="Skip PyPy because memory grows too much")
    def test_interactive_remote_function_calls_no_memory_leak(self):
        code = """if __name__ == "__main__":
        from testutils import subprocess_worker
        import struct

        with subprocess_worker(protocol={protocol}) as w:

            reference_size = w.memsize()
            assert reference_size > 0


            def make_big_closure(i):
                # Generate a byte string of size 1MB
                itemsize = len(struct.pack("l", 1))
                data = struct.pack("l", i) * (int(1e6) // itemsize)
                def process_data():
                    return len(data)
                return process_data

            for i in range(100):
                func = make_big_closure(i)
                result = w.run(func)
                assert result == int(1e6), result

            import gc
            w.run(gc.collect)

            # By this time the worker process has processed 100MB worth of data
            # passed in the closures. The worker memory size should not have
            # grown by more than a few MB as closures are garbage collected at
            # the end of each remote function call.
            growth = w.memsize() - reference_size
            assert growth < 1e7, growth

        """.format(protocol=self.protocol)
>       assert_run_python_script(code)

tests/cloudpickle_test.py:1507:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

source_code = 'if __name__ == "__main__":\n        from testutils import subprocess_worker\n        import struct\n\n        with su...function call.\n            growth = w.memsize() - reference_size\n            assert growth < 1e7, growth\n\n        '
timeout = 60

    def assert_run_python_script(source_code, timeout=TIMEOUT):
        """Utility to help check pickleability of objects defined in __main__

        The script provided in the source code should return 0 and not print
        anything on stderr or stdout.
        """
        fd, source_file = tempfile.mkstemp(suffix='_src_test_cloudpickle.py')
        os.close(fd)
        try:
            with open(source_file, 'wb') as f:
                f.write(source_code.encode('utf-8'))
            cmd = [sys.executable, source_file]
            cwd, env = _make_cwd_env()
            kwargs = {
                'cwd': cwd,
                'stderr': STDOUT,
                'env': env,
            }
            # If coverage is running, pass the config file to the subprocess
            coverage_rc = os.environ.get("COVERAGE_PROCESS_START")
            if coverage_rc:
                kwargs['env']['COVERAGE_PROCESS_START'] = coverage_rc
            if timeout_supported:
                kwargs['timeout'] = timeout
            try:
                try:
                    out = check_output(cmd, **kwargs)
                except CalledProcessError as e:
>                   raise RuntimeError(u"script errored with output:\n%s"
                                       % e.output.decode('utf-8'))
E                                      RuntimeError: script errored with output:
E                                      Traceback (most recent call last):
E                                        File "/var/folders/_t/w7jm81211td6bsjgb15qgtk80000gn/T/tmplhl3xydo_src_test_cloudpickle.py", line 32, in <module>
E                                          assert growth < 1e7, growth
E                                      AssertionError: 10539008

tests/testutils.py:206: RuntimeError

We should try to fix it rather than ignore it

Sure. It is not going to be easy as it is OS (possibly compiler) specific. I mentioned skipping it because this test is already skipped in for PyPy. But this touches all darwin users, so it may be more important.

The cause is why the test was written for: a memory growth exceeding some threshold.

@pierreglaser
Copy link
Member Author

Reviving this issue: this seems to be a bad interaction with coverage measuring in subprocesses: this bug only happens if i ran ci/install_coverage_subprocess_pth.py at some point before launching the tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants