View Aggregation

CouchDB's incremental implementation of MapReduce is not just fast; it also allows complex aggregation of data at different granularities. Even better, every level of aggregation is calculated at the same time, so you can explore your data at precisely the level you need, quick as a flash!


Aggregation over complex keys

This example uses the Creative Commons licensed SimpleGeo "places of interest" dataset. This is a directory of business listings and contains NN locations from MM countries. Your definition of "interest" may vary...

We build a single view whose key is a list. If a view uses a list for its key, your reduce function can be applied to each prefix of that key. For example, if you had a key like [a,b,c], your view would be queryable for the keys:

  • [a,b,c]
  • [a,b]
  • [a]
  • null
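The list above can be reproduced with a few lines of Python. This is only a local sketch of the idea, not CouchDB code: the helper name key_prefixes is made up for illustration, and None stands in for the null key you get when no grouping is applied.

```python
def key_prefixes(key):
    """Return every key a compound key can be queried at, longest first.

    Hypothetical helper for illustration; CouchDB does this grouping
    internally when a reduce view is queried.
    """
    # One prefix per level: the full key, then one element fewer each time.
    prefixes = [list(key[:level]) for level in range(len(key), 0, -1)]
    # No grouping at all: every row is reduced into one row with a null key.
    prefixes.append(None)
    return prefixes


print(key_prefixes(["a", "b", "c"]))
# prints [['a', 'b', 'c'], ['a', 'b'], ['a'], None]
```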

In CouchDB parlance this is accessing the view at a different group_level. By changing the group_level you query at, you can aggregate your data at different granularities, for example aggregating sales by day, month or year.
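To make the sales example concrete, here is a minimal Python simulation of what grouping does to the rows of a view. It assumes a hypothetical view that emits [year, month, day] keys with a sale amount as the value and a reduce that sums them; the sample rows and the function name grouped_sum are invented for illustration, and the real work happens inside CouchDB.

```python
from collections import defaultdict


def grouped_sum(rows, group_level):
    """Sum the values of (key, value) rows, grouped on the first
    `group_level` elements of each key (0 means no grouping)."""
    totals = defaultdict(int)
    for key, value in rows:
        group_key = tuple(key[:group_level]) if group_level > 0 else None
        totals[group_key] += value
    # Return rows in key order, with list keys as they appear in a view.
    return [
        (list(k) if k is not None else None, total)
        for k, total in sorted(totals.items())
    ]


# Made-up sales rows keyed by [year, month, day].
sales = [
    ([2012, 1, 15], 10),
    ([2012, 1, 16], 5),
    ([2012, 2, 1], 7),
    ([2013, 1, 3], 4),
]

print(grouped_sum(sales, 2))  # monthly totals
print(grouped_sum(sales, 1))  # yearly totals
```

With these sample rows the monthly totals are 15, 7 and 4, and the yearly totals are 22 and 4. One view definition serves both questions; only the group level changes.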

The three tables below show the same view queried at three different group levels.

Group level = 3

?stale=ok&limit=5&group_level=3

Group level = 2

?stale=ok&limit=5&group_level=2

Group level = 1

?stale=ok&limit=5&group_level=1
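The three query strings above differ only in group_level, so they are easy to generate. The sketch below builds them with Python's standard library; view_query is a hypothetical helper, and the database and view path that the string would be appended to are not given in this example, so they are left out.

```python
from urllib.parse import urlencode


def view_query(group_level, limit=5):
    """Build the query string for one group level.

    stale=ok and limit=5 mirror the queries shown above.
    """
    params = {"stale": "ok", "limit": limit, "group_level": group_level}
    return "?" + urlencode(params)


for level in (3, 2, 1):
    print(view_query(level))
```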

This example has shown you how view aggregation can be used to pull out data at varying granularity. By using compound keys and an appropriate group_level you can minimise the number of views to compute (saving you time and I/O) while still being able to interrogate your data at the level you need. If you would like to learn more, our Chief Scientist Mike Miller wrote an in-depth blog post about data aggregation with CouchDB MapReduce. If you think this is interesting, please share it with the world!