DRI: @Guillaume Michel (guissou) (discord handle: @guissou#9964)

✅ Project Completed

Final Report: https://github.com/protocol/network-measurements/blob/master/results/rfm19-dht-routing-table-health.md

Implementation: https://github.com/protocol/network-measurements/tree/master/implementations/rfm19-dht-routing-table-health

Slides CID: bafybeigdjyzepom74haqau7ojzay7zgwula4kss6nxoyqmrrrwwzjphf5q

https://www.youtube.com/watch?v=FU3oIZL0L5E

Motivation

We want to measure the health of the Kademlia routing table in the running IPFS network. The measurements described on this page will help us better understand the state of the routing table in practice and will provide hints on how to improve routing in libp2p/IPFS.

Method

Using the excellent Nebula Crawler, we can crawl the IPFS network to get all online peers and the peers in their k-buckets. Using this data, we are able to measure the following.

Planned measurements: part 1

20 closest peers

Every node has to look for its own identifier when joining the DHT, and it is supposed to receive the k=20 closest node identities to itself.

We want to monitor whether every node is actually aware of its 20 closest neighbors, and observe churn among these 20 closest peers.
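The check described above can be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes each peer is already represented by its Kademlia key as a Python integer, and that a crawl has produced the set of online keys and the set of keys found in the reference peer's routing table. The function names (`xor_distance`, `k_closest`, `missing_closest`) are hypothetical.

```python
K = 20  # bucket size / number of closest peers, as stated in the text


def xor_distance(a: int, b: int) -> int:
    """Kademlia distance between two keys represented as integers."""
    return a ^ b


def k_closest(ref: int, online: set[int], k: int = K) -> list[int]:
    """The k online keys closest to ref by XOR distance, excluding ref itself."""
    others = (p for p in online if p != ref)
    return sorted(others, key=lambda p: xor_distance(ref, p))[:k]


def missing_closest(ref: int, routing_table: set[int],
                    online: set[int], k: int = K) -> list[int]:
    """Keys among ref's k closest online peers that are absent from its routing table."""
    return [p for p in k_closest(ref, online, k) if p not in routing_table]
```

Running `missing_closest` on consecutive crawls and comparing the outputs of `k_closest` would give a rough view of churn among the closest peers.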

Non-full k-buckets

Bucket i contains nodes at a distance between 2^i and 2^(i+1) from the reference peer, up to a cap of k=20 peers per bucket. Each bucket should be as full as possible.

For non-full k-buckets, we want to verify whether nodes that should be included in these buckets are missing, and understand why.
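A minimal sketch of this check, assuming Kademlia keys are plain Python integers and using the bucket definition given above (bucket i covers XOR distances from 2^i up to, but not including, 2^(i+1)). The function names are hypothetical, and real implementations may index buckets differently, for example by common prefix length, which reverses the ordering.

```python
from collections import defaultdict

K = 20  # bucket capacity, as stated in the text


def bucket_index(ref: int, peer: int) -> int:
    """Index i such that 2**i <= (ref XOR peer) < 2**(i + 1)."""
    d = ref ^ peer
    if d == 0:
        raise ValueError("a peer has no bucket for its own key")
    return d.bit_length() - 1


def missing_from_non_full_buckets(ref: int, routing_table: set[int],
                                  online: set[int], k: int = K) -> dict[int, list[int]]:
    """Map each non-full bucket index to the online keys that belong there but are absent."""
    in_table = defaultdict(set)
    for p in routing_table:
        in_table[bucket_index(ref, p)].add(p)

    missing = defaultdict(list)
    for p in sorted(online):
        if p == ref or p in routing_table:
            continue
        i = bucket_index(ref, p)
        if len(in_table[i]) < k:  # the bucket still has room for this peer
            missing[i].append(p)
    return dict(missing)
```

An empty result means every non-full bucket already holds all online peers that fall in its distance range; a non-empty result lists the candidates whose absence needs explaining.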

Empty buckets that shouldn’t be empty

If there is at least 1 peer at a distance between 2^i and 2^(i+1) from the reference peer, then its bucket i should never be empty. This is a critical case, as it threatens connectivity.

As above, for each empty bucket, we want to verify that no online peer belongs in it.
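The empty-bucket check could look like the sketch below. It assumes integer Kademlia keys and the same bucket definition as in the text (bucket i covers XOR distances from 2^i up to, but not including, 2^(i+1)), and it treats any entry in the routing table as occupying its bucket, whether or not that entry is currently online. The function names are hypothetical.

```python
def bucket_index(ref: int, peer: int) -> int:
    """Index i such that 2**i <= (ref XOR peer) < 2**(i + 1)."""
    d = ref ^ peer
    if d == 0:
        raise ValueError("a peer has no bucket for its own key")
    return d.bit_length() - 1


def wrongly_empty_buckets(ref: int, routing_table: set[int],
                          online: set[int]) -> dict[int, list[int]]:
    """Map each empty bucket index to the online keys that should populate it."""
    occupied = {bucket_index(ref, p) for p in routing_table if p != ref}

    result: dict[int, list[int]] = {}
    for p in sorted(online):
        if p == ref:
            continue
        i = bucket_index(ref, p)
        if i not in occupied:  # bucket i is empty although p belongs in it
            result.setdefault(i, []).append(p)
    return result
```

Any key in the returned dictionary is a bucket that is empty even though at least one online peer falls in its range, which is the critical case described above.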