Provider Record Liveness

DRI: @Mikel Cortes (Discord handle: @Cortze#0649)

✅ Project Completed

Read the final report:

network-measurements/rfm17-provider-record-liveness.md at master · protocol/network-measurements

Background Context

DHT Theoretical Record Lifetimes - request by @Adin Schmahmann
~~Open~~ Completed RFM:

network-measurements/RFMs.md at master · protocol/network-measurements

Project Doc and Progress Updates

➡️ ➡️ Work on this project has been carried out here: Provider Record Liveness

Motivation

Provider records are replicated to k=20 peers and re-provided every 12hrs, in the hope that, despite network churn, at least one of those peers will stay online to serve the record throughout the 12hr interval. However, we have not tested whether provider records actually stay alive for 12hrs. In addition, we have found that the network has a very high churn rate (at times on the order of 50% per hour).

A back-of-the-envelope calculation suggests that, at this rate of churn, all 20 peers are likely to have gone offline after only 6hrs. In turn, this suggests that records might be unreachable for the remaining ~6hrs of the interval, which should be considered unacceptable. Of course, peers do not only leave but also come back (our results suggest that the interarrival time is of a similar level to the churn rate), so it is likely that records become available again.
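The estimate above can be reproduced with a short sketch. It is not the calculation used in the final report. It assumes a deliberately simple model: each of the k=20 peers leaves independently, a constant 50% of the remaining peers go offline every hour, and no peer ever comes back. The function names are hypothetical and chosen only for illustration.

```python
def survival_probability(hours: float, churn_per_hour: float = 0.5) -> float:
    """Probability that one peer is still online after `hours`.

    Assumption: a constant fraction `churn_per_hour` of the remaining peers
    leaves every hour, and peers that leave never return.
    """
    return (1.0 - churn_per_hour) ** hours


def expected_alive(hours: float, k: int = 20, churn_per_hour: float = 0.5) -> float:
    """Expected number of the k record holders still online after `hours`."""
    return k * survival_probability(hours, churn_per_hour)


def prob_all_offline(hours: float, k: int = 20, churn_per_hour: float = 0.5) -> float:
    """Probability that all k record holders are offline after `hours`.

    Assumption: peers churn independently of each other.
    """
    return (1.0 - survival_probability(hours, churn_per_hour)) ** k


if __name__ == "__main__":
    for h in (1, 3, 6, 12):
        print(
            f"t={h:>2}h  expected alive={expected_alive(h):7.4f}  "
            f"P(all 20 offline)={prob_all_offline(h):.4f}"
        )
```

Under these assumptions, fewer than one of the 20 peers is expected to remain online after 6hrs, and the probability that all of them are offline is well above one half. This is consistent with the estimate in the text. The model ignores returning peers and hydras, which are exactly the effects this project set out to measure.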

Despite the argument above, the network seems to work fine (although there have been reports of content being unreachable), so one of three things might be happening:

Hydras help tremendously: Hydras should be seen as a supporting component rather than as part of the core architecture, so we should not assume full dependence on them and should configure the settings correctly regardless.
Peers churn, but come back online very often: Peers go offline frequently but also come back online, so if the interarrival time is small enough, this helps keep records alive.
Churn rate calculation is not accurate: The actual churn rate is not what we calculated, and we are missing something.

<aside> 🔥 Impact: It is unacceptable to have unreachable content in the system just a few hours after it is published. If our hypothesis is correct, then we need to revisit these parameters and get them right, as they affect much of the performance of the system.

</aside>

Methodology