Subteams

Team	Members	Focus	Spec Doc
🗄️ Data & Features (~7) (red)	Aditya, Ryan T, April, Nikhil, Erika, Will	Kalshi + NOAA + Binance ingestion, feature engineering → `live_features.parquet`	see References section
🧠 Modeling & Intelligence (~9) (green)	Oliver, Austin, Jason, Alex, Justin, Shreyes, Evan, Alina, Vicky, Jollen, Anzhe	FinBERT/VADER NLP sentiment (internal) + XGBoost fusion model + probability calibration → `predictions.json`	see References section
⚡ Execution (~4) (blue)		Kelly sizing, risk checks, order management, dry-run → live trading	see References section

References

https://github.com/nmokey/ACM-AI-Spring-2026-Prediction-Markets
Kalshi API Docs
GNews API — request free academic access with UCLA email
NOAA Weather API
Binance Public REST API — no auth needed for price data
FinBERT on HuggingFace
all-MiniLM-L6-v2 (relevance scoring)
XGBoost Docs
GDELT Project — free news backup if GNews has gaps

Key Metrics

Metric	What it measures	Target
Brier Score	Mean squared error of predicted probabilities against 0/1 outcomes (calibration and accuracy). Always predicting 0.5 scores 0.25; perfect = 0.0	< 0.20
Sharpe Ratio	Risk-adjusted return on backtest	> 1.0
Win Rate	% of trades that close profitably	> 52%
Edge per Trade	Avg (p_model − p_market) on winning trades, where p_market is the contract's `market_price` at entry	> 0.05
Dry-Run Trades	Proof the system is running autonomously	> 50
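
The metrics above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the project's evaluation code: the function names, the `(p_model, p_market, pnl)` trade tuple, and the 252-period annualization default are all assumptions made for the example.

```python
import math
from statistics import mean, stdev


def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return mean((p - o) ** 2 for p, o in zip(probs, outcomes))


def sharpe_ratio(returns, periods_per_year=252, risk_free_per_period=0.0):
    """Annualized Sharpe ratio from per-period returns.

    periods_per_year=252 assumes daily returns; this is an assumption,
    adjust it to the backtest's actual sampling interval.
    """
    excess = [r - risk_free_per_period for r in returns]
    sd = stdev(excess)  # sample standard deviation, needs >= 2 points
    if sd == 0:
        raise ValueError("return series has zero variance")
    return mean(excess) / sd * math.sqrt(periods_per_year)


def win_rate(pnls):
    """Fraction of closed trades with strictly positive PnL."""
    if not pnls:
        raise ValueError("no trades")
    return sum(1 for x in pnls if x > 0) / len(pnls)


def edge_per_trade(trades):
    """Average (p_model - p_market) over winning trades.

    Each trade is a hypothetical (p_model, p_market, pnl) tuple.
    Returns 0.0 when there are no winning trades.
    """
    edges = [p_model - p_market for p_model, p_market, pnl in trades if pnl > 0]
    return mean(edges) if edges else 0.0
```

A constant 0.5 forecast gives a Brier score of 0.25 regardless of outcomes, which is the baseline the target of < 0.20 has to beat.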

Data Contracts

Team 1 (Data & Features) → Team 2 (Modeling & Intelligence) · data/features/live_features.parquet (refreshed every 15 min)

Field	Type	Notes
contract_id	str	Kalshi market ticker e.g. `KXBTC-25APR14-T100000`
timestamp	datetime UTC
market_price	float [0–1]	Normalized from Kalshi 0–100 cents
volume_24h	float
days_to_resolution	float
price_change_1h	float	Delta vs. 1h ago
price_change_6h	float
market_category	str	`"weather"` / `"crypto"` / `"sports"`
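
One way to enforce this contract before writing the parquet file is a per-row check like the sketch below. It uses plain dicts and the standard library only; the helper names `cents_to_prob` and `validate_feature_row` are hypothetical, and the assumption that every numeric field must be a Python `float` is a strict reading of the table rather than a confirmed rule.

```python
from datetime import datetime, timedelta, timezone

ALLOWED_CATEGORIES = {"weather", "crypto", "sports"}
FLOAT_FIELDS = (
    "market_price",
    "volume_24h",
    "days_to_resolution",
    "price_change_1h",
    "price_change_6h",
)


def cents_to_prob(price_cents):
    """Normalize a price quoted in cents (0-100) to the [0, 1] range."""
    if not 0 <= price_cents <= 100:
        raise ValueError(f"price out of range: {price_cents}")
    return price_cents / 100.0


def validate_feature_row(row):
    """Return a list of contract violations for one row (empty list = valid)."""
    errors = []
    contract_id = row.get("contract_id")
    if not isinstance(contract_id, str) or not contract_id:
        errors.append("contract_id must be a non-empty str")
    ts = row.get("timestamp")
    # Require a timezone-aware datetime whose offset from UTC is zero.
    if not isinstance(ts, datetime) or ts.utcoffset() != timedelta(0):
        errors.append("timestamp must be a timezone-aware UTC datetime")
    for name in FLOAT_FIELDS:
        if not isinstance(row.get(name), float):
            errors.append(f"{name} must be a float")
    price = row.get("market_price")
    if isinstance(price, float) and not 0.0 <= price <= 1.0:
        errors.append("market_price must be in [0, 1]")
    if row.get("market_category") not in ALLOWED_CATEGORIES:
        errors.append("market_category must be weather, crypto, or sports")
    return errors


example_row = {
    "contract_id": "KXBTC-25APR14-T100000",
    "timestamp": datetime(2025, 4, 14, 12, 0, tzinfo=timezone.utc),
    "market_price": cents_to_prob(63),
    "volume_24h": 1200.0,
    "days_to_resolution": 0.5,
    "price_change_1h": 0.02,
    "price_change_6h": -0.01,
    "market_category": "crypto",
}
```

Calling `validate_feature_row(example_row)` returns an empty list; the numeric values in `example_row` are made up for illustration.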

💡 Note on NLP signals: nlp/ and models/ are both owned by Team 2 (Modeling & Intelligence). Sentiment scores are an internal Team 2 artifact — they flow from nlp/sentiment.py into models/predict.py at runtime and are cached in nlp/sentiment.json. The only output Team 2 exposes externally is signals/predictions.json.

Team 2 (Modeling & Intelligence) → Team 3 (Execution) · signals/predictions.json (refreshed every 15 min)

Field	Type	Notes
contract_id	str
timestamp	datetime UTC
p_model	float [0–1]	Calibrated probability of YES outcome
confidence	float [0–1]	Model confidence in `p_model` (higher = less uncertain); used for position sizing
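
To show how Team 3 might consume these fields, here is a hedged sketch of Kelly sizing for a binary contract. For a YES contract bought at price c that pays 1 on resolution, the full-Kelly bankroll fraction is (p − c) / (1 − c); for the NO side it is (c − p) / c. Scaling by `confidence`, the quarter-Kelly multiplier, and the 5% cap are assumptions for illustration, not rules from the spec, and fees are ignored.

```python
def kelly_fraction(p_model, market_price):
    """Return (side, full-Kelly bankroll fraction) for a binary contract.

    market_price is the YES price in (0, 1); p_model is P(YES).
    """
    if not 0.0 < market_price < 1.0:
        raise ValueError("market_price must be strictly between 0 and 1")
    if not 0.0 <= p_model <= 1.0:
        raise ValueError("p_model must be in [0, 1]")
    if p_model > market_price:
        # Buy YES at c: win (1 - c) with probability p, lose c otherwise.
        return "YES", (p_model - market_price) / (1.0 - market_price)
    if p_model < market_price:
        # Buy NO at (1 - c): win c with probability (1 - p).
        return "NO", (market_price - p_model) / market_price
    return None, 0.0


def position_size(p_model, market_price, confidence, bankroll,
                  kelly_scale=0.25, max_fraction=0.05):
    """Dollar stake: fractional Kelly, scaled by confidence, then capped.

    kelly_scale and max_fraction are hypothetical risk parameters.
    """
    side, fraction = kelly_fraction(p_model, market_price)
    fraction = min(fraction * kelly_scale * confidence, max_fraction)
    return side, round(bankroll * fraction, 2)
```

With `p_model` equal to `market_price` the function returns no side and a zero stake, so a contract with no modeled edge is never traded.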