A basic benchmark comparing CPU and GPU (GeForce RTX 4060) run times is in the benchmark.py file.
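For readers who want a sense of what such a benchmark might look like, here is a minimal timing sketch. It is not the contents of benchmark.py: `time_steps`, `report`, `step_fn` and `sync` are hypothetical names introduced for illustration. The `sync` hook is there because CUDA work is queued asynchronously, so a GPU run would likely need something like `torch.cuda.synchronize` passed in (assuming the project uses PyTorch, which the source does not confirm) before the clock is read.

```python
import time
from typing import Callable, Optional


def time_steps(step_fn: Callable[[], None],
               n_steps: int = 1000,
               sync: Optional[Callable[[], None]] = None) -> float:
    """Return the wall-clock seconds taken to call step_fn n_steps times.

    step_fn and sync are hypothetical hooks, not names from benchmark.py.
    """
    if sync is not None:
        sync()  # make sure no earlier device work is still pending
    start = time.perf_counter()
    for _ in range(n_steps):
        step_fn()
    if sync is not None:
        sync()  # wait for queued device work before stopping the clock
    return time.perf_counter() - start


def report(device: str, seconds: float) -> str:
    """Format a result line in the same shape as the output shown in this README."""
    return f"Device: {device} | Time: {seconds:.4f} seconds"


if __name__ == "__main__":
    # Stand-in workload; a real benchmark would run one simulation step here.
    elapsed = time_steps(lambda: sum(range(1000)), n_steps=1000)
    print(report("cpu", elapsed))
```

Timing the whole loop rather than each step keeps the timer overhead out of the measurement, and synchronizing only at the start and end avoids stalling the GPU on every step.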
Here are the results for 1000 steps on each device:
Device: cuda | Time: 4.1586 seconds
Device: cpu | Time: 8.3651 seconds
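To put the two timings in perspective, the speedup can be computed directly from the reported output. The snippet below is a small illustrative sketch; `parse_times` is a hypothetical helper, not part of benchmark.py, and it only assumes the `Device: ... | Time: ... seconds` format shown above.

```python
import re
from typing import Dict


def parse_times(output: str) -> Dict[str, float]:
    """Extract {device: seconds} pairs from benchmark output text."""
    pattern = r"Device:\s*(\w+)\s*\|\s*Time:\s*([\d.]+)\s*seconds"
    return {device: float(seconds) for device, seconds in re.findall(pattern, output)}


# The two result entries reported above.
results = parse_times(
    "Device: cuda | Time: 4.1586 seconds Device: cpu | Time: 8.3651 seconds"
)

# Ratio of CPU time to GPU time for the same 1000 steps.
speedup = results["cpu"] / results["cuda"]
print(f"GPU speedup over CPU: {speedup:.2f}x")  # prints "GPU speedup over CPU: 2.01x"
```

In this run the GPU finishes the same 1000 steps in about half the CPU time.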