Batching and rate-limiting are both important mechanisms for ensuring that work gets done in a system without overwhelming its resources. They matter most when load is high: unless we employ one or both of them, we risk either making too little progress or overloading the resource.

https://s3-us-west-2.amazonaws.com/secure.notion-static.com/3cb281a6-96a2-4e29-a814-674d3ebd7181/Untitled.png

Batching is useful when trying to amortize some fixed cost, such as network latency or disk seek time, across many units of work. It works especially well when the work is throughput-bound rather than latency-bound. Disabling batching, or using smaller batch sizes, can give better latency at the cost of throughput.

Caveat: Large batches can cause spiky load and resource utilization instead of spreading it more uniformly, especially when combined with rate-limiting. How spiky the load is depends on how quickly new batches are created and dispatched. Smaller batches naturally slow the system down and spread the load more evenly.

Lever: Batch size. The resource usually imposes a natural upper limit on the batch size that can be used (e.g. the number of queries in a transaction).
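The amortization argument above can be made concrete with a small sketch. It assumes a simple cost model that is not from any particular system: every dispatch pays one `fixed_cost` (for example a network round trip) plus a `per_item_cost` for each item it carries. The names `batched` and `total_cost` are hypothetical helpers introduced only for illustration.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive chunks of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def total_cost(n_items: int, batch_size: int,
               fixed_cost: float, per_item_cost: float) -> float:
    """Assumed cost model: one fixed cost per batch plus a cost per item."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    n_batches = -(-n_items // batch_size)  # ceiling division
    return n_batches * fixed_cost + n_items * per_item_cost


if __name__ == "__main__":
    print(list(batched(list(range(5)), 2)))   # [[0, 1], [2, 3], [4]]
    # 100 items, fixed cost 10 per dispatch, cost 1 per item.
    print(total_cost(100, 1, 10.0, 1.0))      # 1100.0 (no batching)
    print(total_cost(100, 25, 10.0, 1.0))     # 140.0  (batches of 25)
```

Under this model the fixed cost shrinks from 1000 to 40 as the batch size grows from 1 to 25, while the per-item cost is unchanged. The trade-off is that the first item in a batch has to wait for the whole batch before it is dispatched, which is the latency cost described above.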

Rate-limiting is useful when trying to keep load on a resource within tolerable bounds. The rate-limit is usually determined and enforced by the resource itself (to protect itself from rogue clients) and not by the load generator. Rate-limiting is usually applied to latency-sensitive workloads, and in cases where each call to the resource is counted once regardless of batch size (e.g. API calls). Clients can be programmed to not generate load greater than the rate-limit if it is statically known, or to gracefully handle rate-limit errors if the limit is dynamic.
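One common way for a client to handle rate-limit errors when the limit is dynamic is to retry with exponential backoff. The following is a minimal sketch of that idea, not a prescribed implementation: `RateLimitError` and `call_with_backoff` are hypothetical names, and a real client would map its library's own rate-limit signal (for example an HTTP 429 response) onto the exception.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error: the resource rejected the call as over the limit."""


def call_with_backoff(call: Callable[[], T],
                      max_attempts: int = 5,
                      base_delay: float = 0.1,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Invoke call(), doubling the wait after each rate-limit rejection."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 0
    while True:
        try:
            return call()
        except RateLimitError:
            attempt += 1
            if attempt >= max_attempts:
                raise  # give up and surface the error to the caller
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * (2 ** (attempt - 1)))
```

The `sleep` parameter is injectable only so the behaviour can be exercised without real waiting. In practice, adding random jitter to the delay is a frequent refinement so that many clients do not retry in lockstep.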

Caveat: The granularity at which the limit is applied can change the load characteristics, even for the same aggregate load. For example, a limit of 10 reqs per 100 ms allows the same aggregate rate as 100 reqs per second, but the latter lets a client send all 100 reqs in the first 5 ms and overwhelm the underlying resource, whereas the former admits at most 10 reqs in any single 100 ms window.

Lever: Rate-limit and granularity at which it is applied.
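The granularity caveat can be illustrated with a small simulation. This sketch assumes a fixed-window limiter (one of several possible enforcement schemes; the text does not say which one the resource uses) and a client that fires 100 requests within the first 5 ms. `FixedWindowLimiter` and `admitted_in_burst` are hypothetical names, and time is passed in explicitly as integer milliseconds so the run is deterministic.

```python
class FixedWindowLimiter:
    """Admit at most `limit` requests in each window of `window_ms` milliseconds."""

    def __init__(self, limit: int, window_ms: int) -> None:
        self.limit = limit
        self.window_ms = window_ms
        self._window_index = -1
        self._count = 0

    def allow(self, now_ms: int) -> bool:
        index = now_ms // self.window_ms
        if index != self._window_index:
            # A new window has started, so the counter resets.
            self._window_index = index
            self._count = 0
        if self._count < self.limit:
            self._count += 1
            return True
        return False


def admitted_in_burst(limiter: FixedWindowLimiter,
                      n_requests: int, burst_ms: int) -> int:
    """Spread n_requests evenly over the first burst_ms and count admissions."""
    admitted = 0
    for i in range(n_requests):
        now_ms = i * burst_ms // n_requests
        if limiter.allow(now_ms):
            admitted += 1
    return admitted


if __name__ == "__main__":
    coarse = FixedWindowLimiter(limit=100, window_ms=1000)  # 100 reqs per second
    fine = FixedWindowLimiter(limit=10, window_ms=100)      # 10 reqs per 100 ms
    print(admitted_in_burst(coarse, 100, 5))  # 100: the whole burst gets through
    print(admitted_in_burst(fine, 100, 5))    # 10: the burst is capped
```

Both limiters allow 100 requests over a full second, but only the finer-grained one stops the 5 ms burst from reaching the resource.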

Conclusion:

These are the factors to consider when deciding how to configure batching and rate-limiting: