Storing Relational Data (rows)

One critique of MapReduce holds that it is a step backwards in several respects:

In terms of database access, MapReduce is a poor implementation
- Can only use brute force (e.g. no indexes)
MapReduce is not novel
MapReduce is missing features
- e.g. bulk loader, indexing, updates, transactions
MapReduce is incompatible with DBMS tools
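
To make the "brute force" point concrete, here is a minimal sketch contrasting a full scan with an index lookup. The toy table and the function names (`scan_lookup`, `build_index`, `index_lookup`) are hypothetical illustrations, not part of MapReduce or any DBMS.

```python
# Hypothetical toy table of (user_id, name) rows; an assumption for illustration.
rows = [(i, f"user{i}") for i in range(1000)]


def scan_lookup(rows, key):
    """Brute force: examine rows one by one until the key is found."""
    examined = 0
    for row in rows:
        examined += 1
        if row[0] == key:
            return row, examined
    return None, examined


def build_index(rows):
    """Build a hash index from key to row position (a one-time cost)."""
    return {row[0]: pos for pos, row in enumerate(rows)}


def index_lookup(rows, index, key):
    """Use the index to jump straight to the matching row."""
    pos = index.get(key)
    if pos is None:
        return None, 0
    return rows[pos], 1


index = build_index(rows)
scan_result, scan_examined = scan_lookup(rows, 999)
index_result, index_examined = index_lookup(rows, index, 999)
print(scan_examined, index_examined)
```

In this sketch the scan touches every row to find the last key, while the indexed lookup touches one. Building and maintaining the index has its own cost, which the sketch does not model.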

In response, software like Vertica was made

Regardless of which system is used, some operations are very slow, such as parsing text.
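
As a rough illustration of why text parsing costs extra work, the sketch below stores the same integers once as comma-separated text and once as fixed-width binary. The sample values and the 4-byte little-endian layout are assumptions chosen for the example. Actual speed differences depend on the data and the system, so no timings are claimed here.

```python
import struct

# Hypothetical sample values; an assumption for illustration.
values = [3, 1415, 92653, 58979]

# Text encoding: the reader must find delimiters and convert digits to numbers.
text_blob = ",".join(str(v) for v in values).encode("ascii")
from_text = [int(token) for token in text_blob.split(b",")]

# Binary encoding: fixed-width 4-byte little-endian integers,
# decoded directly from known byte offsets with no digit parsing.
fmt = f"<{len(values)}i"
binary_blob = struct.pack(fmt, *values)
from_binary = list(struct.unpack(fmt, binary_blob))

print(from_text == from_binary)
```

Both paths recover the same values. The binary form skips tokenizing and digit conversion, which is the kind of work a binary storage format avoids.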

To deal with such issues, we arrive at storing data in row stores or column stores:

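Files are 1-dimensional ⇒ higher-dimensional data (e.g. a table with rows and columns) must be projected onto a 1D byte sequence
For row stores (all fields of a record are stored contiguously):
- Easier to modify a record (in-place update)
- Unnecessary data may be read while processing (e.g. a query that needs one column still reads whole rows)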
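
The two row-store properties above can be sketched with a fixed-width record layout. The schema (`id`, `age`, `name`) and the helper names are hypothetical, and real row stores use more elaborate page formats. This is only a minimal sketch of the idea.

```python
import struct

# Hypothetical fixed-width schema: id (int32), age (int32), name (16 bytes).
RECORD = struct.Struct("<ii16s")
AGE_OFFSET = 4  # age follows the 4-byte id within each record


def write_rows(rows):
    """Project the 2D table into a 1D byte sequence, one whole row at a time."""
    buf = bytearray()
    for rid, age, name in rows:
        buf += RECORD.pack(rid, age, name.encode("ascii"))
    return buf


def update_age(buf, row_no, new_age):
    """In-place update: overwrite 4 bytes at a computable offset."""
    struct.pack_into("<i", buf, row_no * RECORD.size + AGE_OFFSET, new_age)


def sum_ages(buf):
    """Summing one column still walks over every byte of every row."""
    total = 0
    bytes_read = 0
    for _rid, age, _name in RECORD.iter_unpack(buf):
        total += age
        bytes_read += RECORD.size
    return total, bytes_read


buf = write_rows([(1, 30, "ann"), (2, 41, "bob")])
update_age(buf, 1, 42)
total, bytes_read = sum_ages(buf)
print(total, bytes_read)
```

Here each record occupies 24 bytes, so the update only needs the row number to find the field. The column sum reads all 48 bytes even though the two `age` fields account for only 8 of them.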