A common approach to synchronization is to construct keys and values so that the data needed for a computation is naturally brought together by the execution framework.
A common problem is counting co-occurrences of values.
Pairs - keys are pairs of desired ids.
- Pros
- values tend to be simpler
- easy to implement/understand
- Cons
- large number of key-value pairs (quadratic)
- combiners don’t help much
- with $N$ distinct ids there are $N \times N$ potential keys - most keys have few entries, so a combiner rarely finds matching keys to merge
- e.g. word co-occurrence
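
A minimal sketch of the pairs approach, simulated in plain Python rather than on a real MapReduce framework. The function names `pairs_map` and `pairs_reduce` are hypothetical, and treating a whole document as the co-occurrence window is an assumption made for illustration.

```python
from collections import Counter

def pairs_map(doc):
    """Emit ((a, b), 1) for every ordered pair of distinct positions in a document.

    Assumption: two words co-occur if they appear in the same document.
    """
    words = doc.split()
    for i, a in enumerate(words):
        for j, b in enumerate(words):
            if i != j:
                yield (a, b), 1

def pairs_reduce(emitted):
    """Sum the counts per pair key (stands in for shuffle + reduce)."""
    counts = Counter()
    for key, value in emitted:
        counts[key] += value
    return counts

pairs_docs = ["a b c", "a b"]
pairs_emitted = [kv for doc in pairs_docs for kv in pairs_map(doc)]
pair_counts = pairs_reduce(pairs_emitted)

print(len(pairs_emitted))       # number of intermediate key-value pairs
print(pair_counts[("a", "b")])  # how often "a" co-occurs with "b"
```

Each value is a bare integer, which is what keeps the pairs approach simple, but the mapper emits one record per co-occurring pair.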

Stripes - keys are single ids, values are maps holding all values associated with the key.
- Pros
- fewer key-value pairs than with pairs, and fewer and shorter intermediate keys (less sorting)
- combiners can do more work (more likely to have same key)
- Cons
- values are more complex (serialization + deserialization overhead)
- map may not fit in memory
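
A minimal sketch of the stripes approach under the same assumptions as before: plain Python instead of a real framework, hypothetical names `stripes_map` and `stripes_reduce`, and the whole document as the co-occurrence window.

```python
from collections import Counter, defaultdict

def stripes_map(doc):
    """Emit (a, stripe) per word occurrence; the stripe maps each neighbor to its count."""
    words = doc.split()
    for i, a in enumerate(words):
        stripe = Counter(b for j, b in enumerate(words) if j != i)
        yield a, stripe

def stripes_reduce(emitted):
    """Merge all stripes that share a key by element-wise addition."""
    merged = defaultdict(Counter)
    for key, stripe in emitted:
        merged[key].update(stripe)  # Counter.update adds counts
    return merged

stripes_docs = ["a b c", "a b"]
stripes_emitted = [kv for doc in stripes_docs for kv in stripes_map(doc)]
stripes = stripes_reduce(stripes_emitted)

print(len(stripes_emitted))  # number of intermediate key-value pairs
print(dict(stripes["a"]))    # all neighbors of "a" with their counts
```

The mapper emits one record per word occurrence instead of one per pair, at the cost of a heavier value: each merged stripe must be held in memory as a whole.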

Both algorithms benefit from combiners - respective operations in reducers are both commutative and associative.
- combiners with stripes have more opportunities to perform local aggregation - key space is vocabulary
- less opportunity with pairs - need to encounter exact pair match
- this also limits opportunities for in-memory combining - the mapper can run out of memory while storing partial counts
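
The difference in combiner opportunities can be illustrated with a rough simulation of local aggregation over one mapper's input split. The helper names `combine_pairs` and `combine_stripes` are hypothetical, and the whole-document co-occurrence window is again an assumption; each function returns the number of records before and after combining.

```python
from collections import Counter, defaultdict

def combine_pairs(docs):
    """Return (records emitted, records left after local aggregation) for pairs."""
    emitted = 0
    combined = Counter()
    for doc in docs:
        words = doc.split()
        for i, a in enumerate(words):
            for j, b in enumerate(words):
                if i != j:
                    emitted += 1
                    combined[(a, b)] += 1  # merges only on an exact pair match
    return emitted, len(combined)

def combine_stripes(docs):
    """Return (records emitted, records left after local aggregation) for stripes."""
    emitted = 0
    combined = defaultdict(Counter)
    for doc in docs:
        words = doc.split()
        for i, a in enumerate(words):
            emitted += 1  # one stripe per word occurrence
            combined[a].update(b for j, b in enumerate(words) if j != i)
    return emitted, len(combined)

split = ["a b c", "a d e", "a f g"]
print(combine_pairs(split))    # (18, 18): no exact pair repeats, nothing to merge
print(combine_stripes(split))  # (9, 7): the three stripes keyed by "a" merge into one
```

In this toy split the word "a" repeats, but no exact pair does, so only the stripes variant gets to aggregate locally. Both merges are sums, which is why they are safe to run in a combiner.
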
In terms of scalability:
- stripes assumes that each map is small enough to fit in memory; otherwise memory paging will significantly degrade performance.