
Q: read many messages in batch? #554

Closed
glycerine opened this issue Nov 16, 2019 · 10 comments

@glycerine

glycerine commented Nov 16, 2019

I am having some slow message processing (my TCP receive buffer isn't being emptied) and I wonder if I might be able to speed it up by receiving many websocket messages as a slice, at once.

Is there a way to use the gorilla API to get back all the available messages in one call?

@ghost

ghost commented Nov 16, 2019

If the application uses a reasonably sized read buffer, then it is unlikely that a batch read will make any difference in performance.

Is your application unable to keep up with data coming over the network when reading in a tight loop? What is the read buffer size and typical message size for your application?

@glycerine
Author

glycerine commented Nov 16, 2019

If I have 1M small messages in my TCP buffer, doing one call through the gorilla stack instead of one million calls through the stack will be faster. Not only at the gorilla call, but also downstream in my further processing.

I know websocket messages are very sub-optimal for such situations, but I don't control the server.

Is your application unable to keep up with data coming over the network when reading in a tight loop?

Yes.

@ghost

ghost commented Nov 17, 2019

Gorilla uses bufio.Reader to amortize the cost of read operations on the underlying network connection. A batching API will provide little or no additional benefit at that layer and below.

Because Gorilla's own code does not do much when reading messages as a client, it will be difficult to squeeze more performance out of it.

Is anything preventing you from using batching in your downstream processing?

@glycerine
Author

Ok. It may be that I am just seeing a very bad bug resulting in long delivery delays in gorilla under load. I will swap out the websocket implementation to check.

@ghost

ghost commented Nov 17, 2019

The delivery delay bug is concerning. Please share more information about the bug and how to reproduce it.

Is the request for the batch API separate from the bug, or did you propose the batch API as a fix for the bug?

@glycerine
Author

glycerine commented Nov 17, 2019

I don't have firm conclusions yet. I thought that batching might help with the slowness I'm seeing.

Anyway, are there any existing load tests to determine capacity for gorilla?

@glycerine
Author

Also, if I want to collect all available messages before proceeding, is there a way to not block on a ReadMessage() call if nothing more is available?

@ghost

ghost commented Nov 17, 2019

I don't know of any load tests.

Read on a Gorilla connection is satisfied by returning data from the buffer or by issuing a read on the underlying network connection. With the exception of ping message handling, there are no locks or goroutines that can cause a delay on the read path. The ping handler sends a pong by calling WriteControl, which takes an internal write lock when writing to the underlying network connection. Is the server sending a large number of pings while the application itself is writing to the connection? That could explain the delays.

Go network connections do not have a non-blocking mechanism for determining if data is available. Because Gorilla reads through to the underlying network connection, it follows that Gorilla websocket connections have no non-blocking mechanism for determining if a message is available.

To batch messages in an application, use one goroutine to pump messages from the websocket connection to a queue. Use another goroutine to take everything in the queue as a batch and process it.

@glycerine
Author

glycerine commented Nov 17, 2019

Go network connections do not have a non-blocking mechanism for determining if data is available.

On a regular net.Conn, I usually use a read timeout of 10 milliseconds for this. If a timeout error is returned, we know there wasn't anything available. This seems untenable with gorilla, because of the following strange choice:

// SetReadDeadline sets the read deadline on the underlying network connection.
// After a read has timed out, the websocket connection state is corrupt and
// all future reads will return an error.

@ghost

ghost commented Nov 20, 2019

because of the following strange choice

It would be great if Gorilla did not have this limitation, but the choice is not strange. See #474 for background.
