Cosmovisor hangs node when piping API response. #9875
Comments
We got hit by this issue with cosmwasm because the full contract bytecode was being printed by There is an undocumented setting for this |
Hmm, that really worked. |
I think this should be re-opened as it shouldn't be possible to misconfigure in a way that causes a crash. Also this environment variable only exists in v43 at the moment. We started seeing this issue when upgrading to v0.42.4, only when the log level is set to info (which logs tx / query payloads). |
@jhernandezb should we re-open and perhaps do the following:
|
I agree on both. Especially for cosmwasm chains, it can halt validator nodes just by someone deploying contracts, and it is not an easy bug to detect. Until cosmovisor switches away from reading logs, there should be better default values. |
Just bumped into this, triggered by p2p messages from Tendermint: osmosis-labs/osmosis#431. Assuming the majority of validators are using Cosmovisor these days, it seems like chain halts might be induced this way if a TX can be crafted to produce > 64k of log output. |
@mdyring do you know the offending log line? Where is the log coming from? The mempool? |
Yeah more details in osmosis-labs/osmosis#431. 👍 |
I believe cosmovisor is not the primary software here; the daemon we run through it is number one, so the helper tool must be adjusted to fit the daemon, not vice versa. |
I believe we solved this issue in
Could you install the latest cosmovisor (from current master) and check whether this issue still occurs? |
Hi, it should be working now with the new cosmovisor in the cosmos-sdk master branch. If the new cosmovisor also works for you, I will close this issue. @mrlp4 |
Hi! Just checked and everything seems to be fine now with the latest master branch. |
Summary of Bug
When gaiad is launched through cosmovisor with the API server up (on port 1317), a tx query on path=/cosmos.tx.v1beta1.Service/GetTx
for a transaction of more than 64 KB in total size hangs the node, without any error code or other indication.
Looks like the scanning stops unrecoverably at EOF, the first I/O error, or a token too large to fit in the buffer; here it stops with error bufio.ErrTooLong.
Version
commit 40bb2f4
Steps to Reproduce
Run a gaiad node through cosmovisor, make a tx with more than ~100 messages, and try to query it using the API server path /cosmos.tx.v1beta1.Service/GetTx/<tx_hash>
If the API server response is more than 64 KB in size, the node will hang.