Mustafa Nomair: mnomair01@gmail.com or nomair@usc.edu & https://www.linkedin.com/in/mustafa-nomair/

Abdelaziz Abderhman: abdelrhm@usc.edu & https://www.linkedin.com/in/abdelaziz-abdelrhman/

Hamza Wako: hwako@usc.edu & https://www.linkedin.com/in/hwako1/

https://github.com/mustafa-nom/SignSpace

Overview & Problem Statement

Over 500,000 people in the United States use American Sign Language (ASL) as their primary form of communication, yet most ASL learners rely on static YouTube videos, diagrams, or textbooks, and in-person tutors are too expensive for many at $70+/hr. These tools lack what matters most: real-time, personalized feedback. ASL requires precise hand orientation, joint positions, finger curvature, and motion flow, yet no system today tells users what they are doing incorrectly or how to correct it.

SignSpace addresses this gap by using the Apple Vision Pro’s advanced 3D hand tracking to teach ASL interactively through spatial computing. Users learn signs with live 3D hand tracking, instant accuracy feedback, and guided corrections.

Built in under 12 hours at USC’s AI/ML Buildathon, SignSpace demonstrates how immersive hardware can redefine accessibility, language learning, and spatial skill acquisition.

System Architecture

The user signs the letter A, and the on-screen text feedback updates to reflect the sign’s accuracy.

SignSpace is built entirely in visionOS, integrating native Apple Vision Pro 27-joint hand tracking, RealityKit for 3D visualization, SwiftUI for reactive interfaces, a CoreML classifier trained on 100+ samples per sign, and a rule-based gesture validator for high-precision corrections.
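To make the last of those components concrete, here is a minimal, hypothetical sketch of one rule in the gesture validator for the letter A (a closed fist with the thumb alongside); the joint indices and distance thresholds are illustrative assumptions, not the project’s actual rules:

```swift
import simd

struct LetterARule {
    // Assumed joint ordering within the 27-joint pose array.
    let wrist = 0
    let fingertips = [8, 13, 18, 23]   // index, middle, ring, little fingertips
    let thumbTip = 4

    // Returns true when all four fingers are curled into a fist while the
    // thumb tip stays extended alongside it, approximating the ASL letter "A".
    func matches(_ joints: [SIMD3<Float>]) -> Bool {
        let wristPos = joints[wrist]
        let fistClosed = fingertips.allSatisfy { simd_distance(joints[$0], wristPos) < 0.07 }
        let thumbAlongside = simd_distance(joints[thumbTip], wristPos) > 0.07
        return fistClosed && thumbAlongside
    }
}
```

Because each rule is plain geometry over joint positions, a failed check maps directly to a human-readable correction (e.g. “curl your index finger”), which is what makes the high-precision corrections possible.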

The system is organized into a modular architecture built around a hybrid recognition pipeline:

1. Hand Tracking Layer (ARKitSession + HandTrackingProvider)

The Vision Pro tracks all 27 joints of each hand at ~90 Hz. Each joint’s 3D position is captured, transformed into world coordinates, and fed into the recognition engine. A custom HandTrackingManager owns this pipeline, running the ARKit session, extracting joint transforms, and streaming poses to the recognizer, as sketched below.
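As a rough sketch (the manager’s shape and the PoseHandler callback are illustrative assumptions, though ARKitSession, HandTrackingProvider, HandAnchor, and HandSkeleton are the actual visionOS APIs named above), the layer can look like this:

```swift
import ARKit

final class HandTrackingManager {
    typealias PoseHandler = ([SIMD3<Float>]) -> Void

    private let session = ARKitSession()
    private let provider = HandTrackingProvider()

    // Starts hand tracking and forwards each update's world-space joint
    // positions downstream. Permission and error handling are omitted.
    func start(onPose: @escaping PoseHandler) async throws {
        try await session.run([provider])

        for await update in provider.anchorUpdates {
            let anchor = update.anchor
            guard anchor.isTracked, let skeleton = anchor.handSkeleton else { continue }

            // Each joint transform is relative to the hand anchor; composing it
            // with the anchor's origin transform yields world coordinates.
            let worldPositions = skeleton.allJoints.map { joint -> SIMD3<Float> in
                let world = anchor.originFromAnchorTransform * joint.anchorFromJointTransform
                return SIMD3<Float>(world.columns.3.x, world.columns.3.y, world.columns.3.z)
            }
            onPose(worldPositions)
        }
    }
}
```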

2. Hybrid Recognition Engine

SignSpace uses a combined approach to maximize reliability and interpretability:
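At a high level, the combination can be pictured like this (every name and the confidence threshold below are assumptions for illustration, not the project’s exact logic; the two paths are detailed next):

```swift
struct SignPrediction {
    let label: String
    let confidence: Float
    let correction: String?   // human-readable fix suggestion, if any
}

protocol SignRecognizer {
    func evaluate(_ joints: [SIMD3<Float>]) -> SignPrediction?
}

// Sketch of the hybrid dispatch: trust the learned CoreML classifier when it
// is confident, and defer to the interpretable geometric rules otherwise.
struct HybridRecognitionEngine {
    let mlClassifier: any SignRecognizer    // primary path
    let ruleValidator: any SignRecognizer   // high-precision corrections
    let threshold: Float = 0.85             // assumed confidence cutoff

    func recognize(_ joints: [SIMD3<Float>]) -> SignPrediction? {
        if let ml = mlClassifier.evaluate(joints), ml.confidence >= threshold {
            return ml
        }
        // Rules can explain *why* a sign misses, not just score it.
        return ruleValidator.evaluate(joints)
    }
}
```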

CoreML Recognizer (Primary Path)