Large Model

# hidden_layer_dims = [5000, 5000, 5000, 5000, 5000, 5000, 5000]
# nx = 1000, ny = 1000
# loss_scale = 1.00003466337
# epochs = 2000

params = 160,036,000

Run Time By Operation

Max Memory Allocation

Training Loss (per epoch)

Loss Scaler

https://s3-us-west-2.amazonaws.com/secure.notion-static.com/9fdd5518-86ee-4d56-a8ca-b16fe6489f08/large-16-loss_scale.png


Medium Model

# hidden_layer_dims = [500, 500, 500, 500, 500, 500, 500]
# nx = 1000, ny = 1000
# loss_scale = 1.0003466337
# epochs = 2000

params = 2,504,500
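
The two `params` figures can be cross-checked from the configs. This is a minimal sketch, assuming each model is a plain fully connected network with bias terms, where `nx` is the input width and `ny` is the output width. The helper name `count_mlp_params` is made up for illustration and does not come from the benchmark code.

```python
def count_mlp_params(nx, hidden_layer_dims, ny):
    """Count weights and biases of a dense network nx -> hidden layers -> ny.

    Assumption: every layer is fully connected and has a bias vector.
    """
    dims = [nx, *hidden_layer_dims, ny]
    # Each dense layer contributes fan_in * fan_out weights plus fan_out biases.
    return sum(fan_in * fan_out + fan_out for fan_in, fan_out in zip(dims, dims[1:]))


large = count_mlp_params(1000, [5000] * 7, 1000)
medium = count_mlp_params(1000, [500] * 7, 1000)

print(f"large:  {large:,}")   # large:  160,036,000
print(f"medium: {medium:,}")  # medium: 2,504,500
```

Under that assumption both totals match the values recorded above. The large model has about 64 times as many parameters as the medium one.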