Mustafa Nomair: mnomair01@gmail.com or nomair@usc.edu & https://www.linkedin.com/in/mustafa-nomair/

Abdelaziz Abderhman: abdelrhm@usc.edu & https://www.linkedin.com/in/abdelaziz-abdelrhman/

Hamza Wako: hwako@usc.edu & https://www.linkedin.com/in/hwako1/

https://github.com/mustafa-nom/SignSpace

Overview & Problem Statement

Over 500,000 people in the United States use American Sign Language (ASL) as their primary form of communication, yet most ASL learners rely on static YouTube videos, diagrams, or textbooks, and in-person tutors are too expensive for many at $70+/hr. These tools lack what matters most: real-time, personalized feedback. ASL requires precise hand orientation, joint positions, finger curvature, and motion flow, yet no system today tells users what they are doing incorrectly or how to correct it.

SignSpace addresses this gap by using the Apple Vision Pro’s advanced 3D hand tracking to teach ASL interactively through spatial computing. Users learn signs with live 3D hand tracking, instant accuracy feedback, and guided corrections.

Built in under 12 hours at USC’s AI/ML Buildathon, SignSpace demonstrates how immersive hardware can redefine accessibility, language learning, and spatial skill acquisition.

System Architecture

The user signs the letter A, and the on-screen text feedback updates to reflect the sign’s accuracy.

SignSpace is built entirely in visionOS, integrating native Apple Vision Pro 27-joint hand tracking, RealityKit for 3D visualization, SwiftUI for reactive interfaces, a CoreML classifier trained on 100+ samples per sign, and a rule-based gesture validator for high-precision corrections.
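To make the last of those components concrete, here is a minimal, hypothetical sketch of one rule in the gesture validator for the letter A (a closed fist with the thumb alongside); the joint indices and distance thresholds are illustrative assumptions, not the project’s actual rules:

```swift
import simd

struct LetterARule {
    // Assumed joint ordering within the 27-joint pose array.
    let wrist = 0
    let fingertips = [8, 13, 18, 23]   // index, middle, ring, little fingertips
    let thumbTip = 4

    // Returns true when all four fingers are curled into a fist while the
    // thumb tip stays extended alongside it, approximating the ASL letter "A".
    func matches(_ joints: [SIMD3<Float>]) -> Bool {
        let wristPos = joints[wrist]
        let fistClosed = fingertips.allSatisfy { simd_distance(joints[$0], wristPos) < 0.07 }
        let thumbAlongside = simd_distance(joints[thumbTip], wristPos) > 0.07
        return fistClosed && thumbAlongside
    }
}
```

Because each rule is plain geometry over joint positions, a failed check maps directly to a human-readable correction (e.g. “curl your index finger”), which is what makes the high-precision corrections possible.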

The system is organized into a modular architecture built around a hybrid recognition pipeline:

1. Hand Tracking Layer (ARKitSession + HandTrackingProvider)

The Vision Pro tracks all 27 joints of each hand at ~90 Hz. Each joint’s 3D position is captured, transformed into world coordinates, and fed into the recognition engine. A custom HandTrackingManager owns this pipeline, running the ARKit session, extracting joint transforms, and streaming poses to the recognizer, as sketched below.
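As a rough sketch (the manager’s shape and the PoseHandler callback are illustrative assumptions, though ARKitSession, HandTrackingProvider, HandAnchor, and HandSkeleton are the actual visionOS APIs named above), the layer can look like this:

```swift
import ARKit

final class HandTrackingManager {
    typealias PoseHandler = ([SIMD3<Float>]) -> Void

    private let session = ARKitSession()
    private let provider = HandTrackingProvider()

    // Starts hand tracking and forwards each update's world-space joint
    // positions downstream. Permission and error handling are omitted.
    func start(onPose: @escaping PoseHandler) async throws {
        try await session.run([provider])

        for await update in provider.anchorUpdates {
            let anchor = update.anchor
            guard anchor.isTracked, let skeleton = anchor.handSkeleton else { continue }

            // Each joint transform is relative to the hand anchor; composing it
            // with the anchor's origin transform yields world coordinates.
            let worldPositions = skeleton.allJoints.map { joint -> SIMD3<Float> in
                let world = anchor.originFromAnchorTransform * joint.anchorFromJointTransform
                return SIMD3<Float>(world.columns.3.x, world.columns.3.y, world.columns.3.z)
            }
            onPose(worldPositions)
        }
    }
}
```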

2. Hybrid Recognition Engine

SignSpace uses a combined approach to maximize reliability and interpretability:
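At a high level, the combination can be pictured like this (every name and the confidence threshold below are assumptions for illustration, not the project’s exact logic; the two paths are detailed next):

```swift
struct SignPrediction {
    let label: String
    let confidence: Float
    let correction: String?   // human-readable fix suggestion, if any
}

protocol SignRecognizer {
    func evaluate(_ joints: [SIMD3<Float>]) -> SignPrediction?
}

// Sketch of the hybrid dispatch: trust the learned CoreML classifier when it
// is confident, and defer to the interpretable geometric rules otherwise.
struct HybridRecognitionEngine {
    let mlClassifier: any SignRecognizer    // primary path
    let ruleValidator: any SignRecognizer   // high-precision corrections
    let threshold: Float = 0.85             // assumed confidence cutoff

    func recognize(_ joints: [SIMD3<Float>]) -> SignPrediction? {
        if let ml = mlClassifier.evaluate(joints), ml.confidence >= threshold {
            return ml
        }
        // Rules can explain *why* a sign misses, not just score it.
        return ruleValidator.evaluate(joints)
    }
}
```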

CoreML Recognizer (Primary Path)