batch_size | learning rate | model | dataset | AP | training time | required memory