1. Registration & Communication
- Official Registration: All participants must register via the Official Google Form.
- Discord Community: Upon registration, team members will be individually invited to our Discord server. This is the primary hub for technical support, Q&A, and announcements.
- Eligibility: Each individual is permitted to join only one team. Participating in multiple teams under different aliases is strictly prohibited and will lead to disqualification.
2. Data Usage & NDA
- NDA Agreement: By downloading the 10,000-hour dataset, participants automatically agree to the terms of the Non-Disclosure Agreement (NDA). The full NDA text will be published on the website prior to the dataset release.
- Usage Restriction: The provided data is for research purposes within this competition only.
3. Bi-weekly Evaluation Schedule
To help teams track their progress, we will conduct four rounds of evaluation:
- Evaluation Dates: March 1st, March 15th, April 1st, and April 15th (Final).
- Submission Deadline: Models must be submitted to the designated Cloudflare Storage (format and keys to be announced) by 23:59 AoE (Anywhere on Earth, UTC−12) on each evaluation date.
- Submission Limit: Each team may submit only one model per evaluation cycle.
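The AoE (UTC−12) convention above means a deadline has not passed until the date has ended everywhere on the planet. The sketch below, with an illustrative 2025 date (the rules give only month and day), shows how to check a 23:59 AoE deadline:

```python
from datetime import datetime, timedelta, timezone

# Anywhere on Earth (AoE) is UTC-12, the last timezone to finish a calendar day.
AOE = timezone(timedelta(hours=-12))

def deadline_passed(deadline_date: str, now_utc: datetime) -> bool:
    """Return True if 23:59 AoE on deadline_date has already passed at now_utc."""
    d = datetime.strptime(deadline_date, "%Y-%m-%d")
    deadline = d.replace(hour=23, minute=59, second=59, tzinfo=AOE)
    return now_utc > deadline

# A March 1st deadline is still open at 11:00 UTC on March 2nd,
# because 23:59 AoE corresponds to 11:59 UTC the following day.
print(deadline_passed("2025-03-01", datetime(2025, 3, 2, 11, 0, tzinfo=timezone.utc)))
# -> False
```

In practice this means teams in any timezone can safely submit until their local end of the evaluation date.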
4. Technical Specifications & Environment
- Robot Platform: All evaluations will be performed using the Toyota HSR (Human Support Robot).
- Compute Hardware: Submitted VLA models will be executed on a system equipped with an NVIDIA RTX 5070 Ti.
- Deployment: To ensure environment consistency, we will soon release a Docker image and an evaluation script. All submissions must be compatible with this provided environment.
- Submission Content: Please submit only the source code and model checkpoints required for inference.
5. Task Structure & Scoring
- Task Composition: The benchmark consists of 6 tasks in total:
- 3 Public Tasks: Detailed descriptions and training data provided.
- Scoring Metric: The final ranking is determined by the Average Success Rate across all 6 tasks.
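As a minimal illustration of the scoring metric, the Average Success Rate is the unweighted mean of the per-task success rates over all 6 tasks. The numbers below are hypothetical, not actual benchmark results:

```python
# Hypothetical per-task success rates for one team across the 6 benchmark tasks
# (values are illustrative only).
success_rates = [0.90, 0.75, 0.60, 0.40, 0.30, 0.05]

# Final score: unweighted mean across all tasks.
average_success_rate = sum(success_rates) / len(success_rates)
print(f"Average Success Rate: {average_success_rate:.3f}")  # -> 0.500
```

Because every task carries equal weight, a low score on one hidden task pulls the final ranking down as much as a low score on a public task.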