We want to measure the health of the Kademlia routing table in the running IPFS network. The measurements described in this page will help us better understand the state of the routing table in practice and will provide hints on how to improve routing in libp2p/IPFS.
Using the excellent Nebula Crawler, we can crawl the network to enumerate all online peers in the IPFS network, along with the peers in their k-buckets. From this data, we are able to measure the following.
Every node has to look up its own identifier when joining the DHT, and it is supposed to receive the identities of the k=20 nodes closest to itself.
We want to monitor whether every node is actually aware of its 20 closest neighbors, and observe churn among these 20 closest peers.
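This check can be sketched as follows. The data layout is an assumption: we suppose the crawl gives us each peer's Kademlia ID as an integer (in IPFS, the SHA-256 of the peer ID) together with the set of IDs in its routing table; `k_closest` and `missing_neighbors` are hypothetical helper names, not part of Nebula.

```python
# Sketch: check whether a peer knows its k=20 closest online neighbors.
# Assumption: Kademlia IDs are available as ints, and the crawl yields
# the set of all online IDs plus each peer's routing-table entries.

K = 20

def k_closest(target_id: int, online_ids: set[int], k: int = K) -> set[int]:
    """The k online peers closest to target_id by XOR distance."""
    others = online_ids - {target_id}
    return set(sorted(others, key=lambda pid: pid ^ target_id)[:k])

def missing_neighbors(peer_id: int, routing_table: set[int],
                      online_ids: set[int]) -> set[int]:
    """Closest online neighbors the peer is not aware of (ideally empty)."""
    return k_closest(peer_id, online_ids) - routing_table
```

Comparing `k_closest` between two successive crawl snapshots (their symmetric difference) would also give a measure of churn among the 20 closest peers.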
Bucket i contains all nodes whose XOR distance from the local node lies between 2^i (inclusive) and 2^(i+1) (exclusive), capped at k=20 peers per bucket. Each bucket should be as full as possible.
For non-full k-buckets, we want to verify whether any online nodes that should be included in these buckets are missing, and understand why.
If there is at least one online peer at a distance between 2^i and 2^(i+1) from the reference peer, then its bucket i should never be empty. An empty bucket that could be populated is a critical case, as it threatens connectivity.
As above, for each empty bucket, verify that no other online peer belongs in it.
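The non-full and empty-bucket checks can share one sketch. The bucket index of a peer at XOR distance d is `d.bit_length() - 1`, since 2^i <= d < 2^(i+1) exactly when d has i+1 bits. The `buckets` dict layout and the `bucket_gaps` helper are assumptions for illustration, not Nebula's actual output format.

```python
# Sketch: for each k-bucket of a reference peer, find the online peers
# that belong in the bucket but are absent from it. An empty bucket with
# a non-empty gap set is the critical case described above.

K = 20

def bucket_index(ref_id: int, peer_id: int) -> int:
    """Bucket i holds peers at XOR distance in [2^i, 2^(i+1))."""
    d = ref_id ^ peer_id
    assert d != 0, "a peer is never in its own routing table"
    return d.bit_length() - 1

def bucket_gaps(ref_id: int, buckets: dict, online_ids: set) -> dict:
    """Map bucket index -> online peers eligible for it but missing.

    buckets: bucket index -> set of peer IDs currently stored there.
    Full buckets (len == K) are skipped: they are correct by definition.
    """
    eligible: dict = {}
    for pid in online_ids - {ref_id}:
        eligible.setdefault(bucket_index(ref_id, pid), set()).add(pid)
    gaps = {}
    for i, candidates in eligible.items():
        members = buckets.get(i, set())
        if len(members) < K:
            missing = candidates - members
            if missing:
                gaps[i] = missing
    return gaps
```

A bucket index that appears in `eligible` but maps to an empty `members` set is exactly the empty-bucket violation: at least one online peer exists at that distance range, yet the bucket holds nobody.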
Offline peers in a routing table are useless and cause lookup requests to time out. Thus we don’t want dead peers in any routing table.
Monitor offline peers in routing tables
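A minimal sketch of this metric, under the assumption that any routing-table entry the crawl did not find online is treated as offline (the crawl's online set is the ground truth here):

```python
# Sketch: flag and quantify stale routing-table entries, i.e. entries
# pointing at peers the crawl found offline or did not find at all.

def offline_entries(routing_table: set, online_ids: set) -> set:
    """Routing-table entries that point at offline peers."""
    return routing_table - online_ids

def offline_ratio(routing_table: set, online_ids: set) -> float:
    """Fraction of dead entries in a routing table; 0.0 if it is empty."""
    if not routing_table:
        return 0.0
    return len(offline_entries(routing_table, online_ids)) / len(routing_table)
```

Tracking `offline_ratio` per peer across crawls would show whether dead entries accumulate over time or get evicted promptly.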
How full is each k-bucket? (How many peers are there in each bucket?)
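Per-peer bucket occupancy can be sketched as a simple histogram, again assuming integer Kademlia IDs; `bucket_occupancy` is a hypothetical helper name.

```python
# Sketch: per-bucket occupancy for one peer's routing table, showing
# how full each k-bucket is (each bucket can hold at most k=20 peers).
from collections import Counter

def bucket_occupancy(ref_id: int, routing_table: set) -> Counter:
    """Map bucket index -> number of peers stored in that bucket."""
    def bucket_index(pid: int) -> int:
        return (ref_id ^ pid).bit_length() - 1
    return Counter(bucket_index(pid) for pid in routing_table)
```

Aggregating these histograms over all crawled peers would show, for each bucket index, the typical fill level across the network.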