🌷 "We embody, we learn, we release the idea of failure, because it is all data." — adrienne maree brown
An open-source platform where local communities run rigorous geospatial analyses to model climate-resilient futures for their cities. The infrastructure itself is pedagogical — designed for citizen scientists of all levels.
Our first toolkit: urban thermal analysis. Where are the heat islands? How much do parks cool their surroundings? What landscape features drive that cooling? Who bears the heat burden?
| Layer | What it does | Primitives |
|---|---|---|
| soil/ | Prepares raw data | validate raster/vector · repair geometry · reproject · scale units · filter by area · mask water · crop to boundary · fetch from APIs |
| roots/geometry/ | Spatial operations | generate buffer rings · calculate geometry |
| roots/metrics/ | Extract measurements | zonal statistics · landscape metrics · calculate PCI (TPM-M) · land cover proportions via crosswalk |
| roots/statistics/ | Statistical analysis | correlation (bootstrap + partial) · regression (VIF enforcement, train/test split) · Getis-Ord Gi* · cluster classification |
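
As a rough illustration of the "land cover proportions via crosswalk" primitive listed above, the sketch below groups classified pixel codes into broader categories and reports each category's share. It is written in Python for brevity, not as the project's R implementation. The class codes, the `CROSSWALK` mapping, and the function name are hypothetical placeholders.

```python
from collections import Counter

# Hypothetical crosswalk, shaped like what a parsed crosswalk YAML might hold.
# The class codes are placeholders, not taken from any real land cover product.
CROSSWALK = {
    1: "blue",   # e.g. open water
    2: "green",  # e.g. tree canopy
    3: "green",  # e.g. grass
    4: "grey",   # e.g. impervious surface
}

def land_cover_proportions(pixel_classes, crosswalk):
    """Share of pixels per crosswalk category.

    Codes missing from the crosswalk are counted as "unmapped" so they
    stay visible instead of being silently dropped.
    """
    counts = Counter(crosswalk.get(code, "unmapped") for code in pixel_classes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: n / total for category, n in counts.items()}

# Eight illustrative pixels from one buffer ring; code 9 is not in the crosswalk.
pixels = [2, 2, 3, 4, 4, 4, 1, 9]
proportions = land_cover_proportions(pixels, CROSSWALK)
```

Reporting an explicit "unmapped" share is one way to keep a crosswalk gap from quietly skewing the other proportions.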
A Python orchestrator reads experiment YAML files, resolves references to city data and analytical choices, builds a dependency graph, and executes R primitives in sequence — each one breathing through the rewildr package contract.
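
The dependency-graph step can be sketched with Python's standard-library `graphlib`. This is a minimal illustration under assumptions, not the orchestrator's actual code: the step names, the `needs` key, and the shape of the `experiment` dictionary are hypothetical stand-ins for whatever a parsed experiment YAML contains.

```python
from graphlib import TopologicalSorter

# Hypothetical experiment spec, shaped like a parsed YAML file might be.
# Step names and the "needs" key are illustrative, not the project's schema.
experiment = {
    "steps": {
        "validate_raster": {"needs": []},
        "reproject": {"needs": ["validate_raster"]},
        "mask_water": {"needs": ["reproject"]},
        "buffer_rings": {"needs": ["reproject"]},
        "zonal_statistics": {"needs": ["mask_water", "buffer_rings"]},
    }
}

def execution_order(spec):
    """Return step names in an order that respects every 'needs' dependency.

    TopologicalSorter raises graphlib.CycleError if the steps depend on
    each other in a loop, so a malformed experiment fails before anything runs.
    """
    graph = {name: set(step["needs"]) for name, step in spec["steps"].items()}
    return list(TopologicalSorter(graph).static_order())

order = execution_order(experiment)
for step in order:
    # A real orchestrator would invoke the matching R primitive here.
    print("would run:", step)
```

Steps with no dependency between them (here `mask_water` and `buffer_rings`) may come out in either order; only the dependency constraints are guaranteed.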
Every transformation produces a JSON provenance document. Warnings accumulate and never disappear. If data is degraded, the envelope says so. If a park's buffer ate the ocean and produced a negative PCI, the envelope flags it. The system's honesty is infrastructure, not an afterthought.
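
One way to picture an envelope whose warnings only ever accumulate is the sketch below. The field names (`primitive`, `inputs`, `warnings`, `degraded`), the `inherit` method, and the file names are assumptions made for illustration; the real envelope schema may differ.

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class Envelope:
    """Hypothetical provenance record for one transformation."""
    primitive: str
    inputs: list
    warnings: list = field(default_factory=list)
    degraded: bool = False

    def warn(self, message, degraded=False):
        # Warnings are only ever appended, never removed.
        self.warnings.append(message)
        self.degraded = self.degraded or degraded

    def inherit(self, upstream):
        # Carry every upstream warning forward so it cannot disappear downstream.
        self.warnings = list(upstream.warnings) + self.warnings
        self.degraded = self.degraded or upstream.degraded

    def to_json(self):
        return json.dumps(asdict(self), indent=2)

# Illustrative chain: a masking step flags degraded data, the PCI step inherits it.
upstream = Envelope(primitive="mask_raster_by_class", inputs=["lst.tif"])
upstream.warn("high water percentage in buffer", degraded=True)

downstream = Envelope(primitive="calculate_pci", inputs=["lst_masked.tif"])
downstream.inherit(upstream)
downstream.warn("negative PCI")

print(downstream.to_json())
```

Because `degraded` is only ever OR-ed with new values, a downstream step cannot clear a flag that an upstream step raised.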
The soil pipeline has been deployed to AWS Batch (Fargate). The same Docker container runs locally and in the cloud without modification. 2,055 NYC park boundaries were processed in 31 seconds with full provenance tracking.
The pilot study replicated Xiao et al. (2023) but had known issues. Here's what v2 addresses:
| Problem | v1 (pilot) | v2 (current) |
|---|---|---|
| PCI calculation | Fixed 480 m radius, not real TPM-M | True gradient walk: 30 m rings, stop at the first local maximum |
| Water contamination | Coastal parks got negative PCI, no detection | mask_raster_by_class.R removes water pixels before extraction; envelope warns on high water % |
| Land cover | NDVI threshold hack for blue/green/grey | Crosswalk YAML maps classified raster directly |
| Regression | No train/test split, VIF > 10 ignored | Holdout validation, VIF < 7.5 enforced, bootstrap CIs |
| Code structure | Monolithic scripts, hardcoded paths, emoji cats | Atomic primitives, rewildr contract, envelope provenance |
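
The v2 gradient walk in the table above can be sketched as follows, in Python rather than the project's R. It assumes that PCI is the mean temperature at the first local maximum minus the park's mean temperature, and that ring `i` has its outer edge `i * 30` m from the park boundary. Both are assumptions of this sketch, and the function and variable names are hypothetical.

```python
RING_WIDTH_M = 30  # ring width from the table above

def pci_gradient_walk(park_temp, ring_temps, ring_width_m=RING_WIDTH_M):
    """Walk outward ring by ring and stop at the first local temperature maximum.

    park_temp  : mean temperature inside the park
    ring_temps : mean temperature of each successive buffer ring, innermost first
    Returns a dict with the PCI and the distance of the turning ring's outer
    edge, or None if temperature never turns over within the sampled rings.
    """
    profile = [park_temp] + list(ring_temps)
    # Index 0 is the park itself; the last ring has no outer neighbour to compare.
    for i in range(1, len(profile) - 1):
        rising = profile[i] > profile[i - 1]
        turns_over = profile[i] >= profile[i + 1]
        if rising and turns_over:
            return {"pci": profile[i] - park_temp, "distance_m": i * ring_width_m}
    return None

# Illustrative values only, not measured data.
result = pci_gradient_walk(30.0, [30.5, 31.2, 31.8, 31.6, 32.0])
no_turn = pci_gradient_walk(30.0, [30.5, 31.0, 31.5])
```

Returning `None` when no turning point exists leaves the caller to decide how to report it, for example as a warning in the envelope instead of a silently invented radius.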