2023-01-10 Content Routing WG #3

Agenda

Important reference documents:

Purpose of Meeting 5m:

To agree on the technical implementation of the proposed double hashing, and to raise any questions or concerns regarding future implementations of ambient content routing.

Topics to cover 30m:

Items from last meeting:

HTTP Delegated Routing rollout to Kubo and gateways:
- Content routing discussions underway at colo.
Bifrost implementation:
- Unify everything to v0.17.0 with the latest tested resource manager configuration but without Reframe. Remove the customization from banks 1, 4, 11, 12 and 13, as well as from those hosts (preloads, bootstrappers and collab cluster) that were modified to test 0.17.0-RC. (Double-check this with Cameron; hydras have been turned off at this point.)
- Deploy v0.18.0 “RC2” with HTTP Delegated Routing to some test banks and one host each of preloads, bootstrappers and collab cluster.
  - The team successfully brought everything up to v0.17 yesterday (except the Nitro clusters, which are in a life-support state due to a phase-out exercise).
  - @Torfinn Olsen to add the RC2 release notes here for follow-on readers.
  - @Adin Schmahmann highlighted that there are rollback issues when going from v0.18 back to v0.17, so some additional testing is warranted.
    - Mario’s team’s plan accommodates testing prior to deployment. There’s no production data, so TTFB (time to first byte) will suffer temporarily, but we’re not looking at any serious user experience problems. A wipe is likely the easiest option, but there’s also an IPFS repo command that can be leveraged.
  - Hydra functionality status review: Hydra functionality for bridging indexers and DHTs is still there.
  - The v0.17 unification of hosts is still missing Reframe. Some test nodes have Reframe and some don’t. The Bifrost team is also adding the resource manager, and they’re planning to bring in HTTP delegated routing (previously known as Reframe) during their upgrade to v0.18.
  - When the PR is merged @Torfinn Olsen will post the PR here. That will be completed today or tomorrow.
  - Until we get the HTTP delegated routing capability we won’t know how much traffic we’re indexing, so we need some support to understand the expected growth in indexer traffic. Banks 11, 12 and 13 were fine, but what fraction of total traffic do the remaining banks in the swarm account for?
    - Banks 11, 12 and 13 have 6 nodes each for a total of 18 out of 120 nodes, so we’re looking at a 6x-7x traffic increase (120 / 18 ≈ 6.7). Load balancers use consistent hashing, so it’s possible there’s an imbalance in the source of CIDs.
- Indexer traffic is doubling from hydras and cid.contact. Can we filter or selectively avoid hydra traffic? There’s a config file for swarm filters with CIDR notation, which might save us traffic and cost. Are the hydra PeerIDs supposed to stay static? We think they sit in an IP range.
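The swarm-filter idea above can be sketched as follows. This is a minimal illustration, assuming filters are written in the multiaddr-with-CIDR form that Kubo’s `Swarm.AddrFilters` setting accepts (for example `/ip4/10.0.0.0/ipcidr/8`). The range `203.0.113.0/24` is a documentation-only placeholder, not the actual hydra range, and the helper names are hypothetical.

```python
import ipaddress

# Hypothetical filter list. 203.0.113.0/24 is a reserved documentation range
# standing in for the (unconfirmed) hydra IP range.
ADDR_FILTERS = ["/ip4/203.0.113.0/ipcidr/24"]


def parse_filter(filter_addr):
    """Turn '/ip4/<addr>/ipcidr/<bits>' into an ipaddress network object."""
    parts = filter_addr.strip("/").split("/")
    if len(parts) != 4 or parts[0] not in ("ip4", "ip6") or parts[2] != "ipcidr":
        raise ValueError(f"unsupported filter: {filter_addr}")
    return ipaddress.ip_network(f"{parts[1]}/{parts[3]}")


def is_filtered(peer_ip, filters=ADDR_FILTERS):
    """Return True if peer_ip falls inside any of the filtered CIDR ranges."""
    addr = ipaddress.ip_address(peer_ip)
    return any(addr in parse_filter(f) for f in filters)


if __name__ == "__main__":
    for ip in ("203.0.113.7", "198.51.100.7"):
        print(ip, "filtered" if is_filtered(ip) else "allowed")
```

A filter of this kind only helps if the hydras really do stay inside a stable IP range, which is the open question noted above.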
Privacy support on Autodiscovery (feedback from IPFS at colo)
Probelab hydras turndown and Bitswap delay: 20x reads could potentially flood the indexer service. See Kubo issue 8807: https://github.com/ipfs/kubo/issues/8807
- It is possible to test this on IPFS clients to measure and track the number of queries going to the indexer and the DHT.
- Also a concern: consistent hashing for the nodes. RootCID or full path? It’s based on the full path. Bitswap bridges gaps when you query a full directory, but there’s a risk that in this case every single path in the query becomes a separate indexer query, multiplying the load. We should be using the RootCID to make the best use of the Kubo Bitstore. There are some optimization considerations that need to be worked through.
- Thunderdome test? It would be wise to rerun this test with a subset of 100 peers rather than 800. 1s is a little too significant a travel time; the conclusion of the group’s present analysis is that the 1s travel time is experienced too frequently to accommodate. More testing needed.
The outcome is that we’re OK to release v0.18, just not with any delay eliminations.
- If the delay gives better user performance that’s a win, but if we need to scale the indexer because we’re increasing the traffic, we just need to warn the Indexer team. The Probelab team thinks they can reduce the 400-500ms.
- Cost component: this will increase cost by increasing volume on the Indexer’s CloudFront (Indexstar).
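The RootCID versus full-path hashing concern raised above can be illustrated with a small sketch. It assumes a generic consistent-hash ring built on SHA-256; the notes do not say how the load balancers actually hash, so the ring, the node names and the `bafyExampleRoot` placeholder are all hypothetical.

```python
import bisect
import hashlib


def _point(key):
    """Map a string key to a position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Minimal consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=64):
        self._ring = sorted(
            (_point(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [p for p, _ in self._ring]

    def node_for(self, key):
        # First ring position clockwise from the key, wrapping at the end.
        idx = bisect.bisect_right(self._points, _point(key)) % len(self._ring)
        return self._ring[idx][1]


def root_of(path):
    """'/ipfs/<root>/a/b' -> '<root>'"""
    return path.split("/")[2]


ring = HashRing([f"node-{i}" for i in range(6)])
paths = [f"/ipfs/bafyExampleRoot/dir/file-{i}.txt" for i in range(50)]

# Keyed on the full path, requests under one root can land on many nodes.
nodes_by_path = {ring.node_for(p) for p in paths}
# Keyed on the root CID, every path under that root lands on the same node.
nodes_by_root = {ring.node_for(root_of(p)) for p in paths}

print("nodes touched (full path):", len(nodes_by_path))
print("nodes touched (root CID): ", len(nodes_by_root))
```

Keying on the root means one node sees all requests for a given DAG, so its local blocks can be reused instead of each path triggering its own lookup on a different node.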
Ambient Routing TDD is out. When can this be implemented?
- Design is proposed, implementation timeline outstanding.