Deadlock in AsyncConnectionPool #454

f0ff886f · 2021-12-02T11:51:59Z

httpcore==0.14.3
httpx==0.21.1
anyio==3.4.0
python==3.9.6 (asyncio runtime, not trio)

Hello, first of all I would like to apologize for not providing a repro for this bug as it is something I have hit only once but it is a very nasty hang.

I have a tool that uses httpx and issues HTTP POSTs concurrently with something like this:

async def stop(self):
    tasks = []
    for c in self._clients:
        stop_task = asyncio.create_task(c.stop())
        tasks.append(stop_task)

    results = await asyncio.gather(*tasks)

Where c.stop() is:

async def stop(self):
    try:
        response = await self._client.post("/api/stop", timeout=60)
    except Exception as e:
        raise ClientError(
            f"Error while stopping the client: {self._url} - {e}"
        ) from e

And lastly, self._client is simply: self._client = httpx.AsyncClient(base_url=url, timeout=15)

Recently, this stop sequence just stalled forever, and the clients never received a stop POST.

Digging into the running process with pyrasite-ng and the following toy code:

import asyncio

for a in asyncio.tasks._all_tasks:
    print(f"*** CURRENT TASK: {a} ***")
    f = a._coro
    while hasattr(f, "cr_await") and f.cr_await:
        print(f.cr_frame)
        f = f.cr_await

(the actual command was: pyrasite --output localterm 32953 debug.py)

I could see something like this for the stop coroutines:

*** CURRENT TASK: <Task pending name='Task-254' coro=<Client.stop() running at /home/f0ff/tool/client.py:85> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7f0cb5ba3820>()]> cb=[gather.<locals>._done_callback() at .../python-3.9.6/lib/python3.9/asyncio/tasks.py:764]> ***
<frame at 0x2041f00, file '/home/f0ff/tool/client.py', line 85, code stop>
<frame at 0x2516530, file '/home/f0ff/.venv/lib/python3.9/site-packages/httpx/_client.py', line 1827, code post>
<frame at 0x25167a0, file '/home/f0ff/.venv/lib/python3.9/site-packages/httpx/_client.py', line 1513, code request>
<frame at 0x1f0bb50, file '/home/f0ff/.venv/lib/python3.9/site-packages/httpx/_client.py', line 1600, code send>
<frame at 0x256e6c0, file '/home/f0ff/.venv/lib/python3.9/site-packages/httpx/_client.py', line 1628, code _send_handling_auth>
<frame at 0x1fb58d0, file '/home/f0ff/.venv/lib/python3.9/site-packages/httpx/_client.py', line 1665, code _send_handling_redirects>
<frame at 0x7f0cb5af4c40, file '/home/f0ff/.venv/lib/python3.9/site-packages/httpx/_client.py', line 1702, code _send_single_request>
<frame at 0x7f0cb5ba19a0, file '/home/f0ff/.venv/lib/python3.9/site-packages/httpx/_transports/default.py', line 291, code handle_async_request>
<frame at 0x253ee00, file '/home/f0ff/.venv/lib/python3.9/site-packages/httpcore/_async/connection_pool.py', line 222, code handle_async_request>
<frame at 0x7f0cb5ab01e0, file '/home/f0ff/.venv/lib/python3.9/site-packages/httpcore/_synchronization.py', line 15, code __aenter__>
<frame at 0x7f0cb5af4440, file '/home/f0ff/.venv/lib/python3.9/site-packages/anyio/_core/_synchronization.py', line 126, code acquire>
<frame at 0x7f0cb5ab0380, file '/home/f0ff/.venv/lib/python3.9/site-packages/anyio/_backends/_asyncio.py', line 1654, code wait>
<frame at 0x7f0cb5ba1d60, file '.../python-3.9.6/lib/python3.9/asyncio/locks.py', line 226, code wait>

There looks to be a deadlock somewhere in the connection pool and its use of httpcore's AsyncLock -> anyio's Lock... unfortunately I cannot repro this easily and I don't know if there is any other detail I could provide to help.

I will leave the hung context open till the weekend, but if I could share any more info please let me know.

The text was updated successfully, but these errors were encountered:

graffic · 2021-12-03T09:12:03Z

Hello there! Disclaimer: @f0ff886f and I work together.

TL;DR: It seems that the issue is in anyio.Lock. A fix could be to use asyncio.Lock instead.

This code snippet reproduces the deadlock.

import asyncio
import anyio

class Holder:
    def __init__(self):
        # self.lock = asyncio.Lock()
        self.lock = anyio.Lock()

    async def work(self, index):
        print(f"Trying to acquire {index}")
        async with self.lock:
            print(f"Doing index {index}")
            await asyncio.sleep(5)


async def main_anyio():
    holder = Holder()
    t1 = asyncio.create_task(holder.work(1), name="holder 1")
    t2 = asyncio.create_task(holder.work(2), name="holder 2")

    print("Tasks created")
    await asyncio.sleep(0)

    print("Cancelling t1")
    t1.cancel()

    print("Waiting for t2")
    await asyncio.gather(t2)


if __name__ == "__main__":
    asyncio.run(main_anyio())

The explanation is that the first task trying to acquire the anyio.Lock lock is cancelled when the lock is in the __aenter__ of the async context. When a context manager is cancelled in the enter part, the exit part is NOT executed.

So the lock remains acquired and every other task waiting for an unlock.

This does not happen if the lock is an asyncio.Lock(). The asyncio.Lock never yields during the acquire step. Even if its ___aenter__ is async, there is no async code there.

I will fill a bug in the anyio library.

tomchristie · 2021-12-06T14:56:25Z

Okay, good work on this one folks, thanks.

We might be able to switch to using either trio.Lock or asyncio.Lock depending on which context we're running within.

agronholm · 2021-12-11T11:35:50Z

There is a fix: agronholm/anyio#399
The only reason it has not been merged is because I have not been able to get anybody to review or test it.

f0ff886f · 2021-12-11T22:26:02Z

Hi @agronholm thank you for the super fast turnaround! I'll poke @graffic to take a look at the PR as he seems to understand the deadlock well.

tomchristie · 2022-01-18T11:56:48Z

I'll close this off, since I assume it is resolved as of anyio 3.5.0 - https://pypi.org/project/anyio/#history

graffic mentioned this issue Dec 3, 2021

Deadlock in anyio.Lock agronholm/anyio#398

Closed

tomchristie closed this as completed Jan 18, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Deadlock in AsyncConnectionPool #454

Deadlock in AsyncConnectionPool #454

f0ff886f commented Dec 2, 2021 •

edited

Loading

graffic commented Dec 3, 2021

tomchristie commented Dec 6, 2021

agronholm commented Dec 11, 2021

f0ff886f commented Dec 11, 2021

tomchristie commented Jan 18, 2022 •

edited

Loading

Deadlock in AsyncConnectionPool #454

Deadlock in AsyncConnectionPool #454

Comments

f0ff886f commented Dec 2, 2021 • edited Loading

graffic commented Dec 3, 2021

tomchristie commented Dec 6, 2021

agronholm commented Dec 11, 2021

f0ff886f commented Dec 11, 2021

tomchristie commented Jan 18, 2022 • edited Loading

f0ff886f commented Dec 2, 2021 •

edited

Loading

tomchristie commented Jan 18, 2022 •

edited

Loading