bug: networkReachability=NotReachable log? #1499

dao · 2023-01-17T18:00:07Z

Problem

Running status-im/nwaku:0.14.0 docker image, I observe previously unseen INFO log messages:
Peer reachability status topics="wakunode" tid=1 file=waku_node.nim:225 networkReachability=NotReachable confidence=1.0

Impact

I am uncertain of the impact; the node still appears to have ~2 dozen connected peers, but the log suggests that the node is not reachable.

To reproduce

I'm unaware of any special steps or configurations required to reproduce. The node is running relay protocol only (no store, filter, etc) with rln disabled.

Expected behavior

If the message does not actually mean that my node is unreachable, it should not cause alarm by being logged as INFO.

Screenshots/logs

see attached

nwaku version/commit hash

version / git commit hash: v0.14.0
log.txt


alrevuelta · 2023-01-18T07:59:25Z

> I am uncertain of the impact; the node still appears to have ~2 dozen connected peers but the log suggests that the node is not reachable

Are any of these connections inbound? A node can be NotReachable and still have connected peers (outbound peers).

Note though that this is an experimental feature that was just released both on our side and in nim-libp2p, and it works by asking other peers if they can connect to you.

Does your node/computer have the waku port open? Or, even if it's closed, do you have a router supporting UPnP?

If you do expect your node to be reachable, and you want to give it a shot, feel free to experiment with the autonatservice. I would suggest enabling askNewConnectedPeers and changing numPeersToAsk or maxQueueSize.

alrevuelta · 2023-01-19T10:30:01Z

@dao After some discussions with the nim-libp2p team, this is a known issue in nim-libp2p. This feature works by requesting other nodes to dial you: if they succeed, they flag you as Reachable, otherwise as NotReachable.

The problem is that we currently limit the maximum number of connections a node can have with the same peer to 1. So if you are already connected to the peer that you are requesting to dial you to test your reachability, the test will fail because you will reject that second connection. This should explain why you see NotReachable.
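To make the failure mode concrete, here is a minimal Python sketch of the dial-back check under a one-connection-per-peer limit. It is a toy model, not the nim-libp2p implementation: the `Node` class, `max_conns_per_peer`, `accept_inbound` and `probe_reachability` are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy model of a node (hypothetical, not nim-libp2p code)."""
    name: str
    publicly_dialable: bool          # is the port actually open from outside?
    max_conns_per_peer: int = 1      # the limit described in the comment above
    conns: dict = field(default_factory=dict)  # peer name -> connection count

    def accept_inbound(self, remote: str) -> bool:
        """Accept an inbound connection unless the port is closed or the per-peer limit is hit."""
        if not self.publicly_dialable:
            return False
        if self.conns.get(remote, 0) >= self.max_conns_per_peer:
            return False  # rejected: already connected to this peer
        self.conns[remote] = self.conns.get(remote, 0) + 1
        return True


def probe_reachability(node: Node, helper: str) -> str:
    """Ask `helper` to dial us back and report the resulting status."""
    # To send the request we must already hold a connection to the helper.
    if node.conns.get(helper, 0) == 0:
        node.conns[helper] = 1
    # The helper now dials back over a second, new connection.
    dialed_back = node.accept_inbound(helper)
    return "Reachable" if dialed_back else "NotReachable"


# A node with an open port is still flagged NotReachable when the limit is 1,
# because the dial-back is rejected as a duplicate connection.
print(probe_reachability(Node("a", publicly_dialable=True), "b"))
# With room for a second connection, the same node is flagged Reachable.
print(probe_reachability(Node("a", publicly_dialable=True, max_conns_per_peer=2), "b"))
```

In this model the false NotReachable comes purely from the connection limit, not from the network setup, which matches the explanation above.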

We can also see that this metric is really unstable in our fleets, jumping quite often from NotReachable to Reachable.

For now you can ignore the logs. Thanks for reporting!

alrevuelta · 2023-01-23T17:04:35Z

Work is being done in vacp2p/nim-libp2p#845 and vacp2p/nim-libp2p#846. Once it is fixed in nim-libp2p, we will bump the version and this should be fixed.

alrevuelta · 2023-01-30T15:52:40Z

Just merged the nim-libp2p version bump containing the fixes. Will monitor our fleets (where the issue is also present) over the next 24 hours and check if the issue is resolved.

alrevuelta · 2023-01-31T07:48:38Z

I can verify that the nim-libp2p version bump (67939b), containing nim-libp2p version 4ace70d53b0b0e3b58cd3bead70b967d34bd03f3, was deployed to our wakuv2.test fleets. However, I still see the issue: Reachable and NotReachable statuses are (i) wrongly reported and (ii) unstable.

This needs further investigation.

alrevuelta · 2023-02-20T13:59:21Z

An update on this: the latest nim-libp2p version included a fix, but the status is still a bit unstable. From time to time, NotReachable is still reported on nodes that are Reachable. The issue seems to happen more often when multiple nodes try to dial each other to test reachability at the same time.

This new issue was detected by the nim-libp2p team and a fix should be ready soon. Once it is fixed and we bump to the latest version, this issue can be closed.

See: vacp2p/nim-libp2p#865

alrevuelta · 2023-04-05T08:46:04Z

v0.16.0 should fix this.

I can verify it works in a private network with v0.16.0. Will wait until we deploy this release to our fleets before closing.

Note though that in a network containing a mix of old (v0.15.0) and new (>=v0.16.0) nodes, new nodes will be biased by old nodes (the old nodes will report that dialMe failed when that may not be the case).

alrevuelta · 2023-05-30T09:50:03Z

v0.16.0 indeed fixed this, but if other nodes did not update, they may flag us as NotReachable while we are reachable. Since multiple nodes are queried for our state, it's just a matter of consensus (if more nodes report us as reachable, then we will flag ourselves as reachable).

Closing, since this is fixed as of v0.16.0.
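As a rough illustration of that consensus idea, here is a minimal Python sketch of a majority vote over dial-back answers. It is an assumption about the general shape of the logic, not the actual nim-libp2p algorithm: the function name `aggregate_reachability`, the `"Unknown"` status for ties or missing answers, and the confidence value computed as the winning share of answers are all hypothetical.

```python
def aggregate_reachability(dial_back_ok):
    """Majority vote over dial-back results (True means the peer reached us).

    Returns (status, confidence). Hypothetical sketch, not nim-libp2p code.
    """
    total = len(dial_back_ok)
    if total == 0:
        return ("Unknown", 0.0)  # nobody answered yet
    successes = sum(1 for ok in dial_back_ok if ok)
    failures = total - successes
    if successes == failures:
        return ("Unknown", 0.5)  # tie: no majority either way
    if successes > failures:
        return ("Reachable", successes / total)
    return ("NotReachable", failures / total)


# Mixed network: four updated peers reach us, two outdated peers wrongly
# report a failed dial. The majority still flags us as Reachable.
status, confidence = aggregate_reachability([True, True, True, True, False, False])
print(status, confidence)
```

The more outdated peers are among those queried, the lower the confidence, and if they form the majority the node would wrongly flag itself as NotReachable.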

dao added the bug and track:maintenance labels Jan 17, 2023

oskarth added this to Vac Research and Waku Jan 17, 2023

jm-clius moved this to Todo in Waku Jan 23, 2023

alrevuelta self-assigned this Jan 23, 2023

alrevuelta mentioned this issue Jan 30, 2023

chore: bump nim-libp2p #1518

Merged

jm-clius moved this from Todo to In Progress in Waku Jan 30, 2023

alrevuelta mentioned this issue Feb 10, 2023

chore: bump nim-libp2p 444b837 #1546

Merged

alrevuelta closed this as completed May 30, 2023

github-project-automation bot moved this from In Progress to Done in Waku May 30, 2023

github-project-automation bot moved this to Done in Vac Research May 30, 2023
