Top Level Goals
- P0: Improve the performance and reliability of the data transfer stack in storage and retrieval deals for Estuary
- P1: Scale the ability of multiple team members (and the wider network) to make meaningful improvements to the data transfer stack
- P2: Align the data transfer stack with the design of Markets V2, future retrieval clients, and other uses of go-graphsync
Primary High Level Strategies
- Embed in Estuary, and take input from other graphsync users (Markets V2, Retrieval Client, Provider, etc.)
- Build significant on-demand request introspection into go-graphsync and go-data-transfer, and expose it in Estuary. At any given time, we should be able to collect a detailed history of what has happened with a transfer. We may also want introspection at the peer level. Introspection tools will be available to miners who want to use them.
- Focus initial performance work on a controlled set of miners: Magik + Sofia Miner initially, expanding to MinerX. Prove that improvements can maximize bandwidth with select miners before expanding the pool.
- Revisit the go-data-transfer / go-graphsync boundary. Graphsync is a transport. Go-data-transfer is a control protocol (mostly for facilitating optimistic fair exchange). Currently the boundary is extremely complicated, and go-data-transfer does a lot that is not truly transport independent.
- Over-prioritize ramp-up initially, and recognize that initial progress will be slower. Use smaller refactors as a way to ramp up. We cannot have real discussions about performance and large refactors without a baseline of shared competency across the team. It is OK, and even expected, to spend a couple of weeks on small refactors and bug fixes before making big decisions. Prioritize refactors that help people understand the code first.
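
As a rough illustration of the request introspection strategy above, the sketch below shows one possible shape for a bounded per-transfer event history that can also be queried by peer. Everything here (the `Event` and `History` types, the transfer ID strings, the event kinds) is hypothetical and is not part of go-graphsync or go-data-transfer; it only illustrates the idea of being able to ask, at any time, what has happened with a transfer.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
	"time"
)

// Event is a hypothetical record of one thing that happened during a transfer.
type Event struct {
	At     time.Time
	Peer   string // hypothetical peer identifier
	Kind   string // hypothetical kind, e.g. "request-sent", "block-received"
	Detail string
}

// History keeps the most recent events for each transfer, up to limit per transfer.
type History struct {
	mu     sync.Mutex
	limit  int
	events map[string][]Event // keyed by a hypothetical transfer ID
}

// NewHistory creates a History that retains at most limit events per transfer.
func NewHistory(limit int) *History {
	return &History{limit: limit, events: make(map[string][]Event)}
}

// Record appends an event to a transfer's history, dropping the oldest
// events once the per-transfer limit is exceeded.
func (h *History) Record(transferID string, e Event) {
	h.mu.Lock()
	defer h.mu.Unlock()
	evs := append(h.events[transferID], e)
	if len(evs) > h.limit {
		evs = evs[len(evs)-h.limit:]
	}
	h.events[transferID] = evs
}

// Snapshot returns a copy of the retained events for one transfer, oldest first.
func (h *History) Snapshot(transferID string) []Event {
	h.mu.Lock()
	defer h.mu.Unlock()
	out := make([]Event, len(h.events[transferID]))
	copy(out, h.events[transferID])
	return out
}

// ByPeer returns the retained events involving one peer across all
// transfers, sorted by timestamp.
func (h *History) ByPeer(peer string) []Event {
	h.mu.Lock()
	defer h.mu.Unlock()
	var out []Event
	for _, evs := range h.events {
		for _, e := range evs {
			if e.Peer == peer {
				out = append(out, e)
			}
		}
	}
	sort.SliceStable(out, func(i, j int) bool { return out[i].At.Before(out[j].At) })
	return out
}

func main() {
	h := NewHistory(100)
	h.Record("transfer-1", Event{At: time.Now(), Peer: "peer-a", Kind: "request-sent"})
	h.Record("transfer-1", Event{At: time.Now(), Peer: "peer-a", Kind: "block-received", Detail: "hypothetical detail"})
	for _, e := range h.Snapshot("transfer-1") {
		fmt.Println(e.At.Format(time.RFC3339), e.Peer, e.Kind, e.Detail)
	}
}
```

Bounding the history per transfer keeps memory use predictable, so collection could stay on at all times. A real design would also need to decide how long to retain history for completed transfers and how to expose it to miners.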
Secondary Strategies
- Clear triage process for incoming Estuary issues
- Build regression testing across versions to run in CI for releases
- Possibly refactor go-graphsync toward:
Scope Limits
- Initial duration is 3 months. At 3 months, reassess whether the goals are met, whether other critical goals have emerged, and whether the project should be extended
- Protocol focus is go-graphsync. Implementation of other protocols is primarily to prove out go-data-transfer support for multiple protocols
- The team at its proposed size is NOT also developing the web3 retrieval client (see Team section)
- Core code repos are go-data-transfer and go-graphsync
- Delve into other dependent repos (go-ipld-prime, blockstore impls, etc.) only as needed to deliver on core goals (potential problem to flag: selector DoS, a large effort that is possibly critical to performance)