Large neuroscience datasets require specialized workflows: splitting data into storage-efficient chunks, ensuring persistent availability, validating interoperability specifications, and managing metadata for linked resources.
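To make the chunking step concrete, here is a minimal sketch of content-addressed, fixed-size chunking in Python. The 1 MiB chunk size, function names, and in-memory dict store are illustrative assumptions, not part of the Opscientia implementation; addressing each chunk by its SHA-256 digest means identical chunks deduplicate naturally across datasets.

```python
import hashlib

CHUNK_SIZE = 1 << 20  # 1 MiB per chunk; an illustrative default, not a project setting

def chunk_and_address(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Split a byte stream into fixed-size chunks and give each a
    content-derived address (its SHA-256 hex digest)."""
    chunks = {}   # digest -> chunk bytes; duplicates stored only once
    order = []    # manifest: digests in reassembly order
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        chunks[digest] = chunk
        order.append(digest)
    return order, chunks

def reassemble(order, chunks) -> bytes:
    """Rebuild the original byte stream from the manifest and chunk store."""
    return b"".join(chunks[d] for d in order)
```

A manifest of digests plus a chunk store is enough to verify integrity on retrieval: re-hash each fetched chunk and compare it to its address before reassembly.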

The Opscientia team has begun testing decentralised file storage workflows on small datasets as unit tests, with plans to scale up to 284 TB of mixed-content datasets indexed by the metadata aggregator DataLad.

Resources

GitHub - opscientia/desci-storage

https://miro.com/app/board/o9J_ltSfj8M=/?invite_link_id=921306634511

Problem Statements

Storage design and optimization

Hybrid storage

Retrieval capabilities
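As a sketch of the hybrid-storage and retrieval problems above, the toy class below routes objects between a fast "hot" tier and a cheap "cold" tier by size, and promotes cold objects to the hot tier on access. The class name, the 1 MiB threshold, and the dict-backed tiers are hypothetical stand-ins (e.g. for a local cache and an IPFS/Filecoin-style content-addressed store), not Opscientia's design.

```python
import hashlib

class HybridStore:
    """Toy hybrid-storage sketch: small objects land in a fast hot tier,
    large objects in a cheap cold tier. Thresholds and backends are
    illustrative assumptions only."""

    def __init__(self, hot_limit: int = 1 << 20):
        self.hot_limit = hot_limit
        self.hot = {}   # stand-in for a local SSD cache
        self.cold = {}  # stand-in for a decentralised content-addressed store

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()  # content address
        tier = self.hot if len(data) <= self.hot_limit else self.cold
        tier[key] = data
        return key

    def get(self, key: str) -> bytes:
        if key in self.hot:
            return self.hot[key]
        data = self.cold[key]   # slow-path retrieval from the cold tier
        self.hot[key] = data    # promote on access: a simple read-through cache
        return data
```

The design choice worth noting is that both tiers share one content-addressed keyspace, so an object can migrate between tiers without its identifier changing.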

Experiments