<aside> 🌷
“We embody, we learn, we release the idea of failure, because it is all data.”
**—adrienne maree brown**
</aside>
This project builds upon a pre-existing body of work for rewildingCities, an open-source initiative promoting the creation of socio-technical systems for climate resilience in urban communities.
Our pilot study successfully replicated a peer-reviewed study of Park Cooling Intensity (PCI) conducted in Nanjing, China, applying its methodology to New York City. We focused on creating a reproducible analysis pipeline in R that processed multiple large geospatial datasets. This initial work ran locally on a single machine and highlighted the significant computational bottlenecks inherent in complex geospatial analysis for large urban areas.
We will prototype a layered, collaborative research environment (CRE) for running complex socio-environmental models by re-architecting the pilot’s logic into a scalable, cloud-native system. The goal of rewildingCities is to create spaces for local communities to democratically model sustainable ecologies, economies, and infrastructure for climate-resilient futures.
Our collaborative research environment is designed around an initial “recipe-ingredient” model that was validated in our pilot:
Recipes: Methodological blueprints that define the semantic data needs of a specific geospatial or scientific experiment. Each recipe consists of curiosity spaces (the specific questions our system can address), methodologies (the explicit choice points in actualizing the exploration of a curiosity space, or question), and experiments (which bind a researcher’s question to the data that exists and the analytical approach, or method, best suited to answering it).
Manifests (manifest.yml): Local configuration files that map each recipe’s abstract ingredients to a specific city’s real-world data sources (in this instance, we chose New York City).
The Simulation Engine: The cloud-based backend that reads a manifest, processes data according to our internal schema, and executes local experiments.
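As a rough sketch of how these three pieces fit together (every class name, field name, and data-source path below is illustrative, not the project’s actual schema), the recipe–manifest binding could look like:

```python
from dataclasses import dataclass, field

# Illustrative sketch of the recipe-ingredient model. All names and
# source URIs are hypothetical placeholders, not real project values.

@dataclass
class Recipe:
    """A methodological blueprint: a question plus its abstract data needs."""
    question: str                                          # the curiosity space
    ingredients: list[str] = field(default_factory=list)   # abstract data needs

@dataclass
class Manifest:
    """Maps a recipe's abstract ingredients to one city's real data sources."""
    city: str
    sources: dict[str, str] = field(default_factory=dict)  # ingredient -> URI

def bind(recipe: Recipe, manifest: Manifest) -> dict[str, str]:
    """Resolve each abstract ingredient to a concrete source, failing
    loudly on anything the manifest does not provide."""
    missing = [i for i in recipe.ingredients if i not in manifest.sources]
    if missing:
        raise KeyError(f"manifest for {manifest.city} missing: {missing}")
    return {i: manifest.sources[i] for i in recipe.ingredients}

# Example: a PCI-style recipe bound to hypothetical NYC sources.
pci = Recipe(
    question="How intense is park cooling across the city?",
    ingredients=["land_surface_temperature", "park_boundaries"],
)
nyc = Manifest(
    city="New York City",
    sources={
        "land_surface_temperature": "s3://example-bucket/nyc/lst.tif",
        "park_boundaries": "s3://example-bucket/nyc/parks.geojson",
    },
)
experiment_inputs = bind(pci, nyc)
```

The design choice worth noting is that the recipe never names a real file: only the manifest is city-specific, so the same recipe can be re-bound to any city that publishes a compatible manifest.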
With this project, we seek to explore several core scalability needs of experimenting with data in this way:
Computational Scaling (Event-Driven, Parallel Processing): The PCI pilot study for one city required processing terabytes of raw satellite and vector data. To support dozens of communities of scientists running complex models, the system must be able to parallelize these massive computational workloads. A single-machine approach is not viable.
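The fan-out pattern this implies can be sketched in a few lines: split the citywide analysis into independent tiles and map a per-tile computation across a worker pool. `cooling_stat` is a hypothetical stand-in for the real PCI math, and a thread pool is used here only to keep the toy self-contained; CPU-bound raster work would use a process pool or distributed workers, but the shape of the code is the same.

```python
from concurrent.futures import ThreadPoolExecutor

def cooling_stat(tile_id: int) -> tuple[int, float]:
    """Hypothetical stand-in: pretend to read one raster tile and
    compute a summary statistic (placeholder arithmetic only)."""
    return tile_id, tile_id * 0.1

def run_parallel(tile_ids: list[int], workers: int = 4) -> dict[int, float]:
    """Fan the per-tile job out across a worker pool and gather results,
    keyed by tile id since completion order is not guaranteed."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(cooling_stat, tile_ids))

results = run_parallel(list(range(8)))
```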
Data Ingestion & Validation Scaling: The platform must be able to ingest and validate a heterogeneous mix of data sources. This requires a robust, decoupled pipeline that can handle diverse formats and gracefully manage failures.
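One common shape for such a pipeline is a per-format validator registry where failures are collected rather than allowed to abort the batch. A minimal sketch (the formats, checks, and file names are illustrative, not our actual validation rules):

```python
import json
from pathlib import Path

# Registry of per-format validators; each raises ValueError on bad input.
VALIDATORS = {}

def validator(ext):
    def register(fn):
        VALIDATORS[ext] = fn
        return fn
    return register

@validator(".geojson")
def check_geojson(raw: bytes) -> None:
    doc = json.loads(raw)  # JSONDecodeError is a ValueError subclass
    if "type" not in doc:
        raise ValueError("GeoJSON missing 'type' member")

@validator(".csv")
def check_csv(raw: bytes) -> None:
    if not raw.strip():
        raise ValueError("empty CSV")

def ingest(files: dict[str, bytes]) -> tuple[list[str], dict[str, str]]:
    """Validate a heterogeneous batch; return (accepted, errors) so that
    one bad file never sinks the whole ingest."""
    accepted, errors = [], {}
    for name, raw in files.items():
        check = VALIDATORS.get(Path(name).suffix)
        if check is None:
            errors[name] = "unsupported format"
            continue
        try:
            check(raw)
            accepted.append(name)
        except ValueError as exc:
            errors[name] = str(exc)
    return accepted, errors

accepted, errors = ingest({
    "parks.geojson": b'{"type": "FeatureCollection", "features": []}',
    "stations.csv": b"id,lat,lon\n1,40.7,-74.0\n",
    "broken.geojson": b"{not json",
    "photo.tiff": b"\x00",
})
```

Because validators are registered independently of the ingest loop, new formats can be added without touching the pipeline itself, which is what "decoupled" buys us here.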
Request Scaling (API & Data Serving): The ultimate goal is for the CRE to serve its results to interactive dashboards. The public-facing API must support a growing number of users querying these complex datasets, ensuring a responsive and interactive experience.
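One standard tactic for keeping repeated dashboard queries responsive is to memoize expensive result lookups. In the sketch below, `pci_summary` is a hypothetical stand-in for a query against the processed results store, its latency is simulated with a sleep, and the returned value is a fake placeholder:

```python
import time
from functools import lru_cache

@lru_cache(maxsize=1024)
def pci_summary(city: str, year: int) -> tuple:
    """Hypothetical stand-in for an expensive datastore query."""
    time.sleep(0.05)  # simulate query latency
    return (city, year, 1.7)  # placeholder values, not real PCI results

# First call pays the full query cost; repeats are served from memory.
t0 = time.perf_counter()
pci_summary("New York City", 2024)
cold = time.perf_counter() - t0

t0 = time.perf_counter()
pci_summary("New York City", 2024)
warm = time.perf_counter() - t0
```

A real deployment would put this caching at the API or datastore layer rather than in-process, but the latency trade-off it demonstrates is the one our request-scaling experiments will measure.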
Our experiments will test the three most critical, user-facing aspects of our prototype’s performance: its raw computational speed for complex analyses, its ability to reliably and quickly ingest user-provided data, and its capacity to serve that data to multiple users with low latency.
→ Focus:
→ Desired Outcome:
→ Method:
→ Evaluation & Metrics: