GitHub issue: https://github.com/filecoin-station/spark/issues/40

Introduction

In the long term, we want Spark checkers running in Filecoin Station to sample from all active Filecoin deals when picking a retrieval check to perform.

Currently, we have a manual process to scan the active deals and resolve them into task templates defined as (cid, address, protocol) and stored in our Postgres DB. This has several downsides:

Proposal

Step 1: spark checkers

  1. Rework the task-picking code to take only the cid field from the task definition picked from retrievalTasks and ignore providerAddress and protocol fields.
  2. Rework the code executing a task to start with an IPNI query that resolves the CID into the provider address. An IPNI query is a simple HTTP GET request with the CID provided in the request path. IPNI returns 404 when no advertisements are found. Example URL: https://cid.contact/cid/bafybeibhsqlh4phj3r3seetqvbq4xebwzz4tvkpckrc4icoeztdxnipbam
    1. No advertisement found → report result “not advertised at all”
    2. No advertisement offering HTTP protocol → report result “http not advertised”
    3. Pick the first advertisement for the HTTP protocol. In the future, we will look for an advertisement matching the FIL deal we are testing.
    4. Pick the first address provided in this advertisement
    5. Hardcode the protocol to http
  3. When submitting the result of the retrieval check (a new measurement) to SPARK API, include a new field in the request body: indexerResult.
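
The resolution logic in item 2 can be sketched as a pure function over the IPNI response. This is a minimal sketch under several assumptions, not the Spark implementation: the response shape (MultihashResults, ProviderResults, Provider.Addrs) is reduced to the fields used here, the result labels (OK, NO_ADVERTISEMENT, HTTP_NOT_ADVERTISED) are hypothetical names for the outcomes described above, and an advertisement is treated as offering HTTP when one of its multiaddrs contains an /http or /https component. The HTTP GET itself is left out; the function takes the status code and the parsed body.

```typescript
// Shape of the IPNI response, reduced to the fields used here (assumed).
interface IpniProviderResult {
  Provider?: { ID?: string; Addrs?: string[] };
}
interface IpniResponse {
  MultihashResults?: { ProviderResults?: IpniProviderResult[] }[];
}

// Hypothetical labels for the outcomes described in the proposal.
type IndexerResult = 'OK' | 'NO_ADVERTISEMENT' | 'HTTP_NOT_ADVERTISED';

interface Resolution {
  indexerResult: IndexerResult;
  providerAddress?: string;
  protocol?: 'http';
}

// Assumption: an address offers HTTP if it has an /http or /https component.
function isHttpAddress(addr: string): boolean {
  const parts = addr.split('/');
  return parts.indexOf('http') >= 0 || parts.indexOf('https') >= 0;
}

function interpretIpniResponse(status: number, body?: IpniResponse): Resolution {
  // Collect advertisements from all multihash results, preserving order.
  const ads: IpniProviderResult[] = [];
  for (const result of body?.MultihashResults ?? []) {
    for (const ad of result.ProviderResults ?? []) ads.push(ad);
  }

  // 2.1: IPNI returned 404 or an empty result set.
  if (status === 404 || ads.length === 0) {
    return { indexerResult: 'NO_ADVERTISEMENT' };
  }

  for (const ad of ads) {
    const addrs = ad.Provider?.Addrs ?? [];
    if (addrs.some(isHttpAddress)) {
      // 2.3-2.5: first HTTP advertisement, its first address, protocol fixed to http.
      return { indexerResult: 'OK', providerAddress: addrs[0], protocol: 'http' };
    }
  }

  // 2.2: advertisements exist, but none offers HTTP.
  return { indexerResult: 'HTTP_NOT_ADVERTISED' };
}
```

The returned indexerResult is the value that item 3 would include in the measurement submitted to the SPARK API; providerAddress and protocol are only set when the resolution succeeds.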

In the future, we will rework the check to drop Lassie and fetch directly over HTTP(S), but let’s do this one step at a time.

Step 2: spark-evaluate (fraud detection)

When checking whether a measurement is for a valid task in the evaluated round, compare only the field cid. Ignore provider_address and protocol.
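
The relaxed validity check could look roughly like the sketch below. The type and function names (RetrievalTask, Measurement, isMeasurementForValidTask) are hypothetical and only illustrate the comparison; the actual spark-evaluate data structures may differ.

```typescript
// Hypothetical shapes; only `cid` takes part in the comparison.
interface RetrievalTask {
  cid: string;
  providerAddress?: string;
  protocol?: string;
}

interface Measurement {
  cid: string;
  provider_address?: string;
  protocol?: string;
}

// A measurement is accepted when any task in the evaluated round has the same CID.
// provider_address and protocol are intentionally ignored.
function isMeasurementForValidTask(m: Measurement, roundTasks: RetrievalTask[]): boolean {
  return roundTasks.some((task) => task.cid === m.cid);
}
```

With this check, a measurement whose provider address was resolved via IPNI still matches a task that was stored with a different, manually resolved address, as long as the CIDs agree.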

After we implement honest-majority committees, we should add cross-checks for IPNI resolution. Add the following tasks to the relevant GH issue (https://github.com/filecoin-station/roadmap/issues/59):