Large websocket messages cause error that is silently caught somewhere #3410

Xtreemrus · 2024-07-28T10:56:57Z

Description

When ui.textarea contains more than a million characters, it starts to behave unpredictably without any visible warnings.

Steps to Reproduce

To demonstrate the issue from different perspectives, I wrote the following code:

from nicegui import ui

@ui.page('/')
def main():
    text_area = ui.textarea('Type something').props('rounded outlined input-class=mx-3') \
                .classes('w-full self-center')

    text_area.value = '123' + 'A'*1_000_000

    len_label = ui.label('Length').classes('mt-2')
    
    def text_area_length():
        print('Length:', len(text_area.value))
        len_label.set_text(f'Length: {len(text_area.value)}')
        print('first_character:', text_area.value[0])

    def text_area_delete_first():
        text_area.value = text_area.value[1:]
        text_area_length()

    ui.button('Calculate', on_click=text_area_length)
    ui.button('Delete first character', on_click=text_area_delete_first)

    text_area_length()

ui.run(title='Text Input Max')

When you run this code, it creates a textarea filled with over a million characters, plus three more to make the behavior more evident in different scenarios.

If you click the Calculate button immediately after launching (without interacting with the ui.textarea), everything is calculated correctly, and the console prints the appropriate values.
If you add or delete a character using the ui.textarea (e.g., delete the digit 1 at the beginning or add a new character), the Calculate button will no longer calculate the new value of text_area correctly. The code executes without errors, and you see the print output, but it shows the wrong number of characters and the first character is also incorrect, as if the connection between the frontend and backend is lost.
To further investigate, I created a button to delete the first character without using the cursor to interact with the ui.textarea. In this case, everything works as expected, the counter decreases, and the first characters are read correctly.

Additional Investigation

To see if this is a Quasar issue, I added character counting to their textarea code:
https://codepen.io/Xtreemrus/pen/XWLNPyY?editors=101

<!--
Forked from:
https://quasar.dev/vue-components/input#example--textarea
-->
<div id="q-app" style="min-height: 100vh;">
  <div class="q-pa-md" style="max-width: 300px">
    <q-input
      v-model="text"
      filled
      type="textarea"
    ></q-input>
    <div class="q-mt-md">
      <p>Calculate len: {{ characterCount }}</p>
      <p>first_character: {{ firstCharacter }}</p>
    </div>
  </div>
</div>

JS part:

const { ref, computed } = Vue

const app = Vue.createApp({
  setup () {
    const text = ref('')

    const characterCount = computed(() => text.value.length)
    
    const firstCharacter = computed(() => {
      if (text.value.length > 0) {
        return text.value[0]
      } else {
        return 'No symbols'
      }
    })

    return {
      text,
      characterCount,
      firstCharacter
    }
  }
})

app.use(Quasar, { config: {} })
app.mount('#q-app')

In this case, working with a million characters is handled correctly.

Impact

A couple of years ago, cases requiring a million characters in a textarea were rare. However, with the development of LLMs, users are inserting increasingly large chunks of information into input fields. One million characters are approximately 250,000 tokens, which is now a feasible amount.

Difficulty

The challenge with this bug is that no visible errors occur, neither on the user side nor the developer side. There are no exceptions thrown, making it hard to find a workaround.

Environment

Browser: Chrome
NiceGUI version: 1.4.30

The text was updated successfully, but these errors were encountered:

falkoschindler · 2024-07-31T13:52:43Z

Hi @Xtreemrus,

Thanks for reporting this issue! It looks like Socket.IO (internally using Engine.IO) has a default maximum frame size of 1,000,000 bytes. When we change it in

nicegui/nicegui/nicegui.py

Line 49 in ddb95e1

    
           core.sio = sio = socketio.AsyncServer(async_mode='asgi', cors_allowed_origins='*', json=json)

to something like

core.sio = sio = socketio.AsyncServer(async_mode='asgi', cors_allowed_origins='*', json=json,
                                      max_http_buffer_size=10_000_000)

your example works. So we need to think about whether to somehow detect an overflow and warn the user about it, or to provide a configuration option.

What do you think?

Xtreemrus · 2024-08-03T09:53:03Z

Hi @falkoschindler,

Thank you for your reply. I'm glad that the core issue has been identified.

While I'm not a professional developer, it seems the problem consists of two parts:

Providing developers with a way to catch and identify the issue. An exception should be raised when the buffer overflow occurs.
After notifying the developer about the error, we need to offer a way to fix it.

Possible solutions:
For point 2, it's relatively straightforward and could potentially be addressed by adding an extra parameter to ui.run.

The solution for point 1 is less obvious, at least from my perspective. Since nothing is propagated to the backend during overflow, we could potentially add a JS script to each ui.input field where overflow might occur. This script would check text.value.length, and if it exceeds max_http_buffer_size (which could be passed as a constant to JS at site startup), it would trigger an exception on the Python side.

Simply setting max_http_buffer_size to a very large number and hoping users won't exceed it seems less ideal, as we wouldn't have visibility into how often this event occurs.

In an ideal scenario, it would be great if all input fields where possible had an option to specify the maximum number of characters allowed. If a developer sets an input character count higher than max_http_buffer_size, they would receive a warning during ui.run. This wouldn't be an error, but a caution that it could occur. Additionally, giving developers the ability to call a custom function when buffer overflow happens could help them implement user-friendly messages like "Warning: This field is limited to 1 million characters."

These approaches would provide more robust error handling and improve the developer and end-user experience.

me21 · 2024-08-04T08:57:31Z

According to the engineio sources, AsyncSocket class raises an exception if it receives too much data:

    async def _websocket_handler(self, ws):
        """Engine.IO handler for websocket transport."""
        async def websocket_wait():
            data = await ws.wait()
            if data and len(data) > self.server.max_http_buffer_size:
                raise ValueError('packet is too large')
            return data

This exception is silently caught elsewhere.

me21 · 2024-08-04T09:05:33Z

How about core.sio = sio = socketio.AsyncServer(async_mode='asgi', cors_allowed_origins='*', json=json, max_http_buffer_size=float('inf'))? Would it break something?

falkoschindler · 2024-08-19T09:37:19Z

How about core.sio = sio = socketio.AsyncServer(async_mode='asgi', cors_allowed_origins='*', json=json, max_http_buffer_size=float('inf'))? Would it break something?

I guess this would allow any client to send arbitrarily large packages to the server, which could exhaust the memory and kill the server.

This exception is silently caught elsewhere.

That's bad. But where could this be? Here is the line emitting the socket message:

nicegui/nicegui/outbox.py

Line 99 in b4fb23d

await core.sio.emit(message_type, data, room=target_id)

Any exception caught in loop() should be logged to the console. 🤔

evnchn · 2025-03-29T10:14:16Z

Interesting 🤔

Would say, though: maybe a solution would be to send the difference in string, when the textarea is updated? Then no need to send so many characters across the internet?

But that would be quite difficult and require an intensive rework ...

falkoschindler · 2025-03-29T11:07:29Z

And it might not even solve the problem if the diff is large, e.g. loading a new text from a file, converting to uppercase, replacing certain characters, translating into another language, minifying code, ... Sure, each example can be considered an edge case. But it's hard to draw a line.

Therefore, I think, the main goal should be to at least raise an exception if a message cannot be sent due to size limitations.

evnchn · 2025-03-29T14:04:34Z

Proposal: Every time we do window.socket.emit in https://github.com/zauberzeug/nicegui/blob/6ad424126b4d6d11e5b5b069e545783d262d2fef/nicegui/static/nicegui.js, we check the length, and warn if it is getting to be too long (earlier than an error would show up, maybe 50% of the max length) so that the user and developer know to stop what they are doing or raise the limit themselves.

Thinking about the socket.io max message length: I think it was set in good faith so that one message will not monopolize the entire communication bandwidth, leading to other clients cannot send message. Maybe we can break up one message into several ones, if it is getting to be too long?

falkoschindler · 2025-03-30T19:16:46Z

Yes, warning about reaching the size limit or splitting messages are possible solutions.

Nevertheless, I'm still assessing the situation: Why is the following code silently dropping change messages when editing the value?

ui.textarea(value='x' * 1_000_000, on_change=lambda: ui.notify('changed'))

Shouldn't it show any kind of error? Even registering an error handler like this doesn't help:

window.socket.on("error", (err) => console.log(err));

evnchn · 2025-04-02T23:20:13Z

A hint to anyone trying to debug this:

The WebSocket simply gets disconnected, when the super-long message is sent.

evnchn · 2025-04-02T23:28:48Z

Here I will put my solutions I brainstormed:

An immediate approach would be, upon disconnection, check if the last message is super large. If so, show a warning on screen.
Or, upon reconnection, server reports on which message it has received, and if client sees a message is missing in the list, show a warning on screen.

evnchn · 2025-04-02T23:46:10Z

OK. I came to the conclusion that Python.EngineIO does not have method for setting any extra handler for a too-long WebSocket message. It just throws it silently out the window, terminates the connection, and calls it a day. Zero chance of attaching a handler without modifying upstream.

Look at this:

https://github.com/miguelgrinberg/python-engineio/blob/527e2a944985fc7f63a28169dfe8f4b325287885/src/engineio/async_socket.py#L151-L157

the inner function websocket_wait raises ValueError('packet is too large') if too long.

Then here:

https://github.com/miguelgrinberg/python-engineio/blob/527e2a944985fc7f63a28169dfe8f4b325287885/src/engineio/async_socket.py#L220-L241

You see, if websocket_wait raises an exception, it is caught by the catch-all except: block, which simply breaks the while loop of listening in on WebSocket messages.

Then finally here:

https://github.com/miguelgrinberg/python-engineio/blob/527e2a944985fc7f63a28169dfe8f4b325287885/src/engineio/async_socket.py#L258-L261

It then shuts the WebSocket like usual, as if nothing happened.

The above is confired by strategically placing print statements (idk pdb, sorry)

We then see the following in the terminal when the too-long message is sent:

Found it!
Break here???
Got out of the while loop...

I would call it bad code quality and API design on their part...

evnchn · 2025-04-02T23:50:11Z

This exception is silently caught elsewhere.

That's bad. But where could this be? Here is the line emitting the socket message:

nicegui/nicegui/outbox.py

Line 99 in b4fb23d

await core.sio.emit(message_type, data, room=target_id)
Any exception caught in loop() should be logged to the console. 🤔

Simply put: Now we know. The exception is in the bit bucket. It will never be logged to the console as such.

And yeah, that's indeed bad. Real bad.

evnchn · 2025-04-03T00:08:01Z

Here goes nothing: miguelgrinberg/python-engineio#397

rodja · 2025-04-03T03:25:12Z

For NiceGUI On Air we had a similar problem and implemented data chunking: #4555

falkoschindler · 2025-04-03T21:10:44Z

For reference, here is a demo showing

how the client disconnects when editing the 1MB text (the counter stops),
the client reconnects after a few seconds (the counter continues),
but the text stays out of sync between client and server (which still reports the original text length).

textarea = ui.textarea(value='x' * 1_000_000, on_change=lambda: ui.notify('changed'))
ui.label().bind_text_from(textarea, 'value', lambda x: f'{len(x)} characters')

number = ui.number(value=0)
ui.timer(1.0, lambda: number.set_value(number.value + 1))

falkoschindler · 2025-04-04T10:58:31Z

After reading miguelgrinberg/python-engineio#397 (comment), it seems to come down to the question whether the server is responsible for informing the client about errors (like I would expect from an HTTP server), or if the client needs to comply with certain limits (as it seems to be the case for Socket.IO).

Regarding the issue of lost messages after editing a long text, I don't think handling the disconnect is a good solution. There should be a way notice the problem beforehand, either with help from socket.io or not.

evnchn · 2025-04-04T12:12:17Z

Author of python-engineio seems to be disinterested to be responsible for informing neither the client nor the server about too-long-message errors.

Personally I would not go with that design, but anyways...

Seems like perhaps keeping track of message length on the client side, as well as segmenting, is the way to go so far.

evnchn · 2025-04-04T15:10:11Z

Yeah, so it's our responsibility, engineio is not taking it up.

Let's get to implementing segmenting and prevent the client from ever sending a too-long message in the first place, then.

falkoschindler · 2025-04-04T16:12:12Z

I'm not so sure we're there yet. Engineio might not be responsible, but maybe the Socket.IO client? It is responsible for serializing and submitting messages. Therefore it should be able to judge the total message size and report an error if it exceeds a limit.

Anyway, would we really want to implement segmenting? Do we really want to allow sending MBs of data back an forth? I still think the primary goal is to inform the user about the message not being transmitted. And it would be great to keep the connection alive.

evnchn · 2025-04-04T16:27:34Z

Well, regardless of doing segmenting or show a warning when too much data, the current response from engineio author (he also make the socketio) seems that he means that "responsibility is on us".

At the end of day we have to add something to our codebase, not him adding something to his.

falkoschindler · 2025-04-04T16:43:04Z

He is the author of https://github.com/miguelgrinberg/python-engineio and https://github.com/miguelgrinberg/python-socketio. But I'm talking about https://github.com/socketio/socket.io, the JavaScript client.

rodja · 2025-04-05T04:02:10Z

Anyway, would we really want to implement segmenting? Do we really want to allow sending MBs of data back an forth? I still think the primary goal is to inform the user about the message not being transmitted. And it would be great to keep the connection alive

In my opinion we should make it possible to submit large chunks of text (like it what the this issue describes in the first place), send image data via socketio etc without flooding the websocket connection. Purely increasing the engineio max_http_buffer_size will saturate the channel and block all other messages from passing through. Therefore, implementing segmentation with a maximum chunck size of 0.1 MB or so is required on frontend and backend. Of course we will also need to provide configurable limit for developers to set an upper limit which the server can receive. With #4571 @evnchn already implemented a message which can be shown to the user in such cases.

evnchn · 2025-04-05T04:31:06Z

In my eyes, it is a mixture of techniques:

0-0.1MB: message being sent is fast enough, no need show anything.
0.1-1MB: Chunked message transmission using chunks of 0.1MB: show a "Loading" popup while the event is being transmitted to the server, which may take a significant time duration. Potentially show how many chunks left (Message sending to server: 3/9): TODO!
1MB+: Exceed what is acceptable: show a "Message too long" popup as seen in popup if socket.io message expected to be too long #4571

Though, #4571 is still beneficial without the handling in 0.1-1MB, since at least the websocket isn't closed and re-opened.

And, with the chunked message transmission, we can afford a higher message upper limit (say 50MB) so that we can fix the core issue we are discussing, which is that "updates not going through when ui.textarea contains more than a million characters"?

It won't make 50MB of text show up easily, just so we are clear. It'll probably lag like crazy on any browser, especially mobile, but at least the issue won't be at NiceGUI, then.

evnchn · 2025-04-05T04:38:04Z

And, for the chunked message transmission, preferably the loading popup isn't as obvious as the one in disconnect and #4571.

Perhaps something inspired by streamlit.

falkoschindler added the bug Type/scope: A problem that needs fixing label Jul 29, 2024

falkoschindler added the help wanted label Jul 31, 2024

rodja mentioned this issue Aug 26, 2024

connection time out when websocket saved large image from backend #3591

Closed

rodja mentioned this issue Sep 24, 2024

ui.plotly plotly_selected event not triggered if many points are selected #3762

Closed

falkoschindler mentioned this issue Feb 14, 2025

ui.editor: Big data causes socket reset #4348

Closed

falkoschindler changed the title ~~Unpredictable Behavior of ui.textarea with Over a Million Characters~~ Communication error for input fields with over a million characters Feb 19, 2025

falkoschindler marked this as a duplicate of #4348 Feb 19, 2025

falkoschindler removed the help wanted label Mar 28, 2025

falkoschindler changed the title ~~Communication error for input fields with over a million characters~~ Large websocket messages cause error that is silently caught somewhere Mar 28, 2025

rodja mentioned this issue Mar 29, 2025

NiceGUI Vue Component does not call callback with large parameters #4546

Closed

3 tasks

falkoschindler marked this as a duplicate of #4546 Mar 29, 2025

evnchn mentioned this issue Apr 3, 2025

Zero way to detect dropped websocket connection due to large message miguelgrinberg/python-engineio#397

Closed

falkoschindler added blocked Status: Blocked by another issue or dependency and removed analysis Status: Requires team/community input labels Apr 3, 2025

falkoschindler added analysis Status: Requires team/community input and removed blocked Status: Blocked by another issue or dependency labels Apr 4, 2025

evnchn linked a pull request Apr 5, 2025 that will close this issue

popup if socket.io message expected to be too long #4571

Open

evnchn mentioned this issue Apr 6, 2025

ui.codemirror does not apply changes with special characters like 🔄 #4575

Open

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Large websocket messages cause error that is silently caught somewhere #3410

Large websocket messages cause error that is silently caught somewhere #3410

Xtreemrus commented Jul 28, 2024

falkoschindler commented Jul 31, 2024

Xtreemrus commented Aug 3, 2024

me21 commented Aug 4, 2024

me21 commented Aug 4, 2024

falkoschindler commented Aug 19, 2024

evnchn commented Mar 29, 2025

falkoschindler commented Mar 29, 2025

evnchn commented Mar 29, 2025

falkoschindler commented Mar 30, 2025

evnchn commented Apr 2, 2025

evnchn commented Apr 2, 2025

evnchn commented Apr 2, 2025 •

edited

Loading

evnchn commented Apr 2, 2025 •

edited by falkoschindler

Loading

evnchn commented Apr 3, 2025

rodja commented Apr 3, 2025

falkoschindler commented Apr 3, 2025

falkoschindler commented Apr 4, 2025

evnchn commented Apr 4, 2025

evnchn commented Apr 4, 2025

falkoschindler commented Apr 4, 2025

evnchn commented Apr 4, 2025

falkoschindler commented Apr 4, 2025

rodja commented Apr 5, 2025

evnchn commented Apr 5, 2025 •

edited

Loading

evnchn commented Apr 5, 2025

Large websocket messages cause error that is silently caught somewhere #3410

Large websocket messages cause error that is silently caught somewhere #3410

Comments

Xtreemrus commented Jul 28, 2024

Description

Description

Steps to Reproduce

Additional Investigation

Impact

Difficulty

Environment

falkoschindler commented Jul 31, 2024

Xtreemrus commented Aug 3, 2024

me21 commented Aug 4, 2024

me21 commented Aug 4, 2024

falkoschindler commented Aug 19, 2024

evnchn commented Mar 29, 2025

falkoschindler commented Mar 29, 2025

evnchn commented Mar 29, 2025

falkoschindler commented Mar 30, 2025

evnchn commented Apr 2, 2025

evnchn commented Apr 2, 2025

evnchn commented Apr 2, 2025 • edited Loading

evnchn commented Apr 2, 2025 • edited by falkoschindler Loading

evnchn commented Apr 3, 2025

rodja commented Apr 3, 2025

falkoschindler commented Apr 3, 2025

falkoschindler commented Apr 4, 2025

evnchn commented Apr 4, 2025

evnchn commented Apr 4, 2025

falkoschindler commented Apr 4, 2025

evnchn commented Apr 4, 2025

falkoschindler commented Apr 4, 2025

rodja commented Apr 5, 2025

evnchn commented Apr 5, 2025 • edited Loading

evnchn commented Apr 5, 2025

evnchn commented Apr 2, 2025 •

edited

Loading

evnchn commented Apr 2, 2025 •

edited by falkoschindler

Loading

evnchn commented Apr 5, 2025 •

edited

Loading