Quantization Modifier

MinMax:

| Function | Total Calls | Min (s) | Max (s) | Mean (s) | Std Dev (s) |
| --- | --- | --- | --- | --- | --- |
| _load_model_and_processor | 1 | 4.846553802490234 | 4.846553802490234 | 4.846553802490234 | 0.0 |
| calculate_qparams | 114911 | 0.0002410411834716797 | 0.0061359405517578125 | 0.0013692116306012565 | 0.0009716616353335256 |
| _calibrate | 1 | 247.97612380981445 | 247.97612380981445 | 247.97612380981445 | 0.0 |
| _run_oneshot | 1 | 248.97535705566406 | 248.97535705566406 | 248.97535705566406 | 0.0 |
| _save_compressed_model | 1 | 44.33747839927673 | 44.33747839927673 | 44.33747839927673 | 0.0 |
| _handle_recipe | 1 | 0.003471851348876953 | 0.003471851348876953 | 0.003471851348876953 | 0.0 |
| _run_lm_eval | 1 | 1189.0685720443726 | 1189.0685720443726 | 1189.0685720443726 | 0.0 |

calculate_qparams.png

MSE:

| Function | Total Calls | Min (s) | Max (s) | Mean (s) | Std Dev (s) |
| --- | --- | --- | --- | --- | --- |
| _load_model_and_processor | 1 | 4.908045530319214 | 4.908045530319214 | 4.908045530319214 | 0.0 |
| calculate_qparams | 114911 | 0.03839588165283203 | 0.24267148971557617 | 0.05269274481844343 | 0.0198057810199097 |
| _calibrate | 1 | 6117.812113761902 | 6117.812113761902 | 6117.812113761902 | 0.0 |
| _run_oneshot | 1 | 6150.447900533676 | 6150.447900533676 | 6150.447900533676 | 0.0 |
| _save_compressed_model | 1 | 40.4849693775177 | 40.4849693775177 | 40.4849693775177 | 0.0 |
| _handle_recipe | 1 | 0.002412080764770508 | 0.002412080764770508 | 0.002412080764770508 | 0.0 |
| _run_lm_eval | 1 | 1171.5198910236359 | 1171.5198910236359 | 1171.5198910236359 | 0.0 |

calculate_qparams.png

Comparison:

| calculate_qparams metric | MinMax Observer | MSE Observer |
| --- | --- | --- |
| Mean time (seconds) | 0.0013692116306012565 | 0.05269274481844343 |
| Min time (seconds) | 0.0002410411834716797 | 0.03839588165283203 |
| Max time (seconds) | 0.0061359405517578125 | 0.24267148971557617 |
| Std Dev (seconds) | 0.0009716616353335256 | 0.0198057810199097 |
| Total time (seconds) | ~158 | ~6051 |
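The comparison is easier to read as ratios. The sketch below uses only figures copied from the two tables in this section; the dictionary keys and variable names are illustrative. It also estimates the cumulative `calculate_qparams` time as mean per-call time multiplied by call count. That is an assumption about how a total could be derived, not necessarily how the "Total time" row was produced.

```python
# Figures copied from the MinMax and MSE tables in this section (seconds).
CALLS = 114911  # calculate_qparams call count, identical in both runs

minmax = {
    "qparams_min": 0.0002410411834716797,
    "qparams_max": 0.0061359405517578125,
    "qparams_mean": 0.0013692116306012565,
    "calibrate": 247.97612380981445,
}
mse = {
    "qparams_min": 0.03839588165283203,
    "qparams_max": 0.24267148971557617,
    "qparams_mean": 0.05269274481844343,
    "calibrate": 6117.812113761902,
}

# Slowdown of the MSE observer relative to MinMax, per metric.
ratios = {key: mse[key] / minmax[key] for key in minmax}

# Cumulative time estimated as mean per-call time multiplied by call count.
total_minmax = minmax["qparams_mean"] * CALLS
total_mse = mse["qparams_mean"] * CALLS

for key, ratio in ratios.items():
    print(f"{key}: MSE / MinMax = {ratio:.1f}x")
print(f"estimated total: MinMax {total_minmax:.1f} s, MSE {total_mse:.1f} s")
```

On these numbers the mean `calculate_qparams` call is roughly 38x slower with the MSE observer, and `_calibrate` as a whole is roughly 25x slower. The mean-times-calls estimate comes to about 157 s and 6055 s, close to but not identical with the approximate totals reported in the table.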

MinMax.csv

MSE.csv

MinMax_transposed.csv

MSE_transposed.csv

GPTQ Modifier