Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow
Virgo: Cluster-level Matrix Unit Integration in GPUs for Scalability and Energy Efficiency
SnakeByte: A TLB Design with Adaptive and Recursive Page Merging in GPUs
WASP: Exploiting GPU Pipeline Parallelism with Hardware-Accelerated Automatic Warp Specialization
Extending GPU Ray-Tracing Units for Hierarchical Search Acceleration