Helping moderators make consistent, evidence-based decisions—without relying on opaque model outputs.
🔗 Live Demo: https://ai-review-moderation-2-hdqedtqcmlbagcmiewmebt.streamlit.app
📂 GitHub: https://github.com/itingtseng/ai-review-moderation-2
Moderators often review hundreds of borderline cases per day, under time pressure and policy ambiguity.
Based on observed limitations in classifier-only moderation systems, v2.0 introduces…
| v1.0 | v2.0 |
|---|---|
| Classifier-only decisions | Hybrid (rules + neighbors) |
| Hard-coded keyword list | Data-driven log-odds n-gram |
| No fallback | Graceful degrade to rules |
| No queue | Moderator panel |
| Hard-coded thresholds | UI sliders |
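To illustrate the "data-driven log-odds n-gram" row, here is a minimal sketch of how n-grams could be ranked by how much more often they appear in flagged reviews than in clean ones. It assumes add-one smoothing and whitespace tokenization; the function names `ngrams` and `log_odds` and the toy reviews are hypothetical and are not taken from the repository.

```python
import math
from collections import Counter


def ngrams(text, n=1):
    """Split text on whitespace and return its lowercase n-grams."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def log_odds(flagged, clean, n=1, smoothing=1.0):
    """Smoothed log-odds ratio of each n-gram: flagged vs. clean reviews.

    Positive scores mean the n-gram is more typical of flagged reviews.
    """
    f_counts = Counter(g for t in flagged for g in ngrams(t, n))
    c_counts = Counter(g for t in clean for g in ngrams(t, n))
    vocab = set(f_counts) | set(c_counts)
    if len(vocab) < 2:
        # With fewer than two n-grams the odds are undefined (p would be 1).
        return {}
    f_total = sum(f_counts.values()) + smoothing * len(vocab)
    c_total = sum(c_counts.values()) + smoothing * len(vocab)
    scores = {}
    for g in vocab:
        p_f = (f_counts[g] + smoothing) / f_total
        p_c = (c_counts[g] + smoothing) / c_total
        scores[g] = math.log(p_f / (1 - p_f)) - math.log(p_c / (1 - p_c))
    return scores


# Toy data, for illustration only.
flagged_reviews = ["scam scam fake product", "fake scam seller"]
clean_reviews = ["great product", "nice seller great service"]
scores = log_odds(flagged_reviews, clean_reviews)
top_term = max(scores, key=scores.get)
```

Unlike a hard-coded keyword list, the ranking changes whenever the labeled data changes, so the vocabulary can be regenerated instead of edited by hand.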
final_score = α·rule_score + β·neighbor_conf
If the FAISS index is unavailable, scoring falls back to the rules in rules.yml alone:
neighbor_conf = 0
→ final_score = α·rule_score, which is still deterministic
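The scoring formula and its fallback can be sketched in a few lines. This is an illustration only: the function name `hybrid_score`, the use of `None` to signal a missing index, and the default weights `alpha=0.6` and `beta=0.4` are assumptions, not values from the project.

```python
from typing import Optional


def hybrid_score(
    rule_score: float,
    neighbor_conf: Optional[float],
    alpha: float = 0.6,  # assumed weight for the rule score
    beta: float = 0.4,   # assumed weight for the neighbor confidence
) -> float:
    """Compute final_score = alpha * rule_score + beta * neighbor_conf.

    Passing neighbor_conf=None models an unavailable FAISS index: the
    neighbor term is set to 0, so the result depends on the rules only.
    """
    if neighbor_conf is None:
        neighbor_conf = 0.0
    return alpha * rule_score + beta * neighbor_conf


with_index = hybrid_score(0.5, 1.0)
without_index = hybrid_score(0.5, None)
```

Because the fallback only zeroes one term, the same input always yields the same score whether or not the index is loaded at startup, as long as its availability does not change between calls.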
Designed to reduce cognitive load and support consistent human judgment under uncertainty.
| Layer / Function | Tool / Library | Reason |
|---|---|---|
| Frontend UI | Streamlit | Fast iteration, moderator-friendly UI |
| Backend API | FastAPI | Async, low-latency scoring endpoints |
| Embeddings | SentenceTransformer (all-MiniLM-L6-v2) | Lightweight semantic understanding |
| Vector Search | FAISS | Efficient Top-K similarity retrieval |
| Rules Engine | YAML (rules.yml) | Versionable governance config |
| Explainability | Regex / token stats | Transparent evidence for moderators |
| Fallback Logic | Custom degrade handler | Maintains determinism if index missing |
| Configuration | .env / environment vars | Deployment toggles (thresholds, modes) |
| Logging / Audit | Python logging | Traceable moderation decisions |
| Visualization | Streamlit components | Inline evidence panels |
| Data Processing | Pandas / scikit-learn | Token stats, n-gram extraction |
| Deployment | Streamlit Cloud | Zero-ops hosting |
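As a rough sketch of how the rules engine and the regex-based explainability rows could fit together, the example below applies a small rule set and returns both a rule score and the matched spans as evidence. The rule schema (`id`, `pattern`, `weight`), the two sample rules, and the function `apply_rules` are hypothetical; the actual layout of rules.yml is not shown in this README. In a real setup the list might be loaded from rules.yml (for example with PyYAML's `yaml.safe_load`) rather than defined inline.

```python
import re

# Hypothetical rules, shaped as they might look after loading rules.yml.
RULES = [
    {"id": "phone_number", "pattern": r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b", "weight": 0.5},
    {"id": "external_link", "pattern": r"https?://\S+", "weight": 0.3},
]


def apply_rules(text, rules=RULES):
    """Return (rule_score, evidence) for one review.

    Each rule adds its weight at most once; every regex match is recorded
    with its character span so a moderator can see why the rule fired.
    """
    score = 0.0
    evidence = []
    for rule in rules:
        matches = list(re.finditer(rule["pattern"], text, flags=re.IGNORECASE))
        if matches:
            score += rule["weight"]
        for m in matches:
            evidence.append({"rule": rule["id"], "span": m.span(), "match": m.group(0)})
    return min(score, 1.0), evidence  # cap the score at 1.0


review = "Call 555-123-4567 or visit http://example.com now"
rule_score, evidence = apply_rules(review)
```

Returning spans rather than a bare score is what lets an evidence panel highlight the exact text behind a decision.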
Shown for implementation reference
flowchart TD
A[User Review] --> B[Preprocessing]
B --> C[SentenceTransformer Embedding]
C --> D[FAISS Top-K Neighbors]
B --> E[Rule Engine rules.yml]
E --> F[Rule Score]
D --> G[Neighbor Confidence]
F --> H[Hybrid Decision α·rule + β·neighbor]
G --> H
H --> I[Risk Tier: HIGH MED LOW]
I --> J[Moderator Queue Panel]
J --> K[Human Verdict Logging]
H --> L{FAISS available?}
L -->|No| M[Graceful Fallback rule only]
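The last stages of the flow (risk tier, then moderator queue) can be sketched as follows. The cut-offs `high=0.7` and `med=0.4` are placeholder assumptions standing in for whatever the UI sliders are set to, and `risk_tier` and `build_queue` are hypothetical names.

```python
TIER_ORDER = {"HIGH": 0, "MED": 1, "LOW": 2}


def risk_tier(score, high=0.7, med=0.4):
    """Map a final score to a risk tier using adjustable thresholds."""
    if score >= high:
        return "HIGH"
    if score >= med:
        return "MED"
    return "LOW"


def build_queue(items, high=0.7, med=0.4):
    """Order reviews for moderators: highest tier first, then by score."""
    tiered = [dict(item, tier=risk_tier(item["score"], high, med)) for item in items]
    return sorted(tiered, key=lambda it: (TIER_ORDER[it["tier"]], -it["score"]))


# Toy scored reviews, for illustration only.
scored = [
    {"id": "r1", "score": 0.35},
    {"id": "r2", "score": 0.82},
    {"id": "r3", "score": 0.55},
    {"id": "r4", "score": 0.91},
]
queue = build_queue(scored)
```

Keeping the thresholds as parameters mirrors the move from hard-coded thresholds to sliders: changing a cut-off re-tiers the queue without touching the scoring code.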