context

useful information

“IPFS Transport” has been, for some time, a term used to describe transport protocols we implement that are “IPFS aware.”

However, the most common transport for writing into Filecoin storage providers (SPs) and into large providers is HTTP, and the vast majority of reads are requested over HTTP as well.

This view of “IPFS Transports” has come to include a fair amount of graph awareness in the design of transport systems for IPFS data, which has the unfortunate consequence of destroying any hope of reusing transports that are already widely in use.

This document is intended to split “transport” into some obvious and uncontroversial layers, so that we can discuss IPFS data movement more clearly and reuse existing systems and software wherever possible.

One important thing to realize up front: there isn’t a single place or “layer” where IPLD, IPFS, or libp2p go. That’s kind of the point. IPFS protocols can be implemented many ways in different environments. This document starts the discussion from what existing system infrastructure already looks like, so we can decide how best to leverage the protocols to solve specific problems.

transports

🥞	🧐
Byte	Anything you can move bytes over. Sockets, disk, HTTP, between memory spaces.
Verifiable Transaction	Single verifiable request/response cycle.
Block	CAR, also IPFS-Gateway Block API, BitSwap
Graph	IPFS-Gateway File API, anything that parses a block using an IPLD codec, GraphSync (obviously)

Byte

This is probably the most important point that needs to be made from the perspective of operators to implementors: Moving bytes costs different amounts of money depending on how/where the bytes move.

Systems that you throw a CID at, and that just grab data from as many random providers as they can find, are totally unsuitable to operate at scale.

Deciding who/how/where to move bytes from one place to another is how you cut costs. You only do three things in large distributed systems:

compute bytes
store bytes
move bytes

All of these things cost money, and the material reality of how and where they happen can make a 100x difference in cost.

From our experience so far, it’s at least 10x cheaper to use HTTP instead of a custom transport in existing cloud providers. It can get closer to 100x with caching optimizations, which is why “just use WebSockets” is such bad advice.
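One way to see how a 10x saving could approach 100x is the rough arithmetic below. It is only a sketch under an assumption that the text does not state: cache hits cost roughly nothing, and every cache miss is served at the plain HTTP cost. The function name, its parameters, and the 90% hit rate are hypothetical and chosen purely for illustration.

```python
def effective_saving(baseline_saving: float, cache_hit_rate: float) -> float:
    """Cost ratio versus a custom transport once a cache absorbs some requests.

    Assumes cache hits are free and every miss costs the baseline HTTP price.
    """
    if not 0.0 <= cache_hit_rate < 1.0:
        raise ValueError("cache_hit_rate must be in [0, 1)")
    # Only the misses are paid for, so the saving scales by 1 / (miss rate).
    return baseline_saving / (1.0 - cache_hit_rate)


# Hypothetical numbers: 10x baseline saving, 90% of requests served from cache.
print(round(effective_saving(10.0, 0.9)))  # prints 100
```

Real pricing is more complicated than this (cache hits are rarely free), but the shape of the result is the point: the more requests a standard HTTP cache can absorb, the larger the gap to a transport that cannot be cached.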

At some point, you figure out who holds the data and where it lives, and you need to transfer some bytes. We find it’s best not to add opinions or requirements about how those bytes move. Just use whatever the fastest/cheapest means of moving bytes is between the two parties and layer validation on from there.
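A minimal sketch of “move bytes however you like, then validate” might look like the following. It assumes a bare SHA-256 hex digest as a stand-in for a real CID (an actual CID wraps a multihash and codec information, which this does not parse), and `fetch_verified` and its parameters are hypothetical names, not part of any IPFS library.

```python
import hashlib
from typing import Callable


def fetch_verified(fetch: Callable[[], bytes], expected_sha256_hex: str) -> bytes:
    """Fetch bytes over any transport, then check them against a known digest.

    `fetch` can wrap an HTTP GET, a file read, a socket, or an in-memory
    buffer; the validation step does not care how the bytes arrived.
    """
    data = fetch()
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256_hex:
        raise ValueError(f"digest mismatch: expected {expected_sha256_hex}, got {actual}")
    return data


# Usage: the "transport" here is just an in-memory buffer.
payload = b"some block bytes"
digest = hashlib.sha256(payload).hexdigest()
assert fetch_verified(lambda: payload, digest) == payload
```

Because the transport is just a callable that returns bytes, swapping HTTP for a disk read or a peer connection changes nothing in the verification path, which is the separation the byte layer is meant to give.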