Chained MapReduce

Cloudant uniquely offers incremental, chained MapReduce. This means you can keep precise, live data, pull out aggregates of that data, and then re-index and query those aggregates in real time. This example shows that in action.

Basic views

CouchDB lets you build sorted secondary key:value indexes, called "views", using JavaScript MapReduce functions. Once built, these views are quick to query and can be searched, sliced and paginated. With vanilla CouchDB, views are updated on access by default. It is possible to query stale views instead, but that forces you to choose between accurate data and a responsive application. Cloudant keeps your views current as your database is modified, so you no longer have to choose between accuracy and access speed.

Example view code
map: function(doc){
    if (doc.rep){ emit({"rep": doc.rep}, doc.amount); }
}
reduce: _sum

Views are created using two JavaScript functions. A map function selects data from your documents and emits that data as rows of key:value pairs. Keys and values can be any valid JSON structure (strings, numbers, booleans, lists, objects). Multiple rows can be emitted per document, but documents are atomic: the map function sees one document at a time and cannot read the contents of any other. Views are ordered by key, and can be accessed in ranges (termed slices) or queried for a specific key.
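To make the map step concrete, here is a minimal sketch that runs the example map function over a few made-up documents in plain JavaScript. The `runMap` helper, the sample documents and the explicit `emit` parameter are assumptions for illustration only: in CouchDB and Cloudant `emit` is provided by the view engine, and keys are ordered by CouchDB's own collation rules rather than the simple string comparison used here.

```javascript
// Hypothetical sample documents; only `rep` and `amount` appear in the tutorial's view.
const sampleDocs = [
  { _id: "1", rep: "Bob", amount: 100 },
  { _id: "2", rep: "Alice", amount: 250 },
  { _id: "3", type: "note" }, // no `rep`, so the map function emits nothing for it
  { _id: "4", rep: "Alice", amount: 50 },
];

// The tutorial's map function, with `emit` passed in explicitly so it can run locally.
function mapByRep(doc, emit) {
  if (doc.rep) { emit({ rep: doc.rep }, doc.amount); }
}

// Hypothetical stand-in for the view engine: call the map function once per
// document, collect the emitted rows, then order them by key.
function runMap(mapFn, docs) {
  const rows = [];
  for (const doc of docs) {
    mapFn(doc, (key, value) => rows.push({ id: doc._id, key, value }));
  }
  // Simplified ordering on key.rep; the real index uses CouchDB collation.
  return rows.sort((a, b) => a.key.rep.localeCompare(b.key.rep));
}

const viewRows = runMap(mapByRep, sampleDocs);
console.log(viewRows);
```

Each document is handled in isolation, which is why the third document simply contributes no rows.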

The reduce function aggregates the values for a given key down to a single value. The reduce function is optional, and views with a reduce defined can still be queried for unreduced data. There are built-in reduce functions for common operations such as sum and count. These are faster than custom JavaScript reduces and should generally be preferred over writing your own.
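As a rough illustration of what a `_sum` reduce produces, the sketch below totals values per key in plain JavaScript. The rows and the `sumByKey` helper are hypothetical, and the code ignores how the database computes and stores reductions incrementally inside the index.

```javascript
// Hypothetical rows, shaped like the output of the tutorial's map function.
const emittedRows = [
  { key: { rep: "Alice" }, value: 250 },
  { key: { rep: "Alice" }, value: 50 },
  { key: { rep: "Bob" }, value: 100 },
];

// Local approximation of a grouped `_sum`: one output row per distinct key.
function sumByKey(rows) {
  const totals = new Map();
  for (const { key, value } of rows) {
    const k = JSON.stringify(key); // serialise so object keys compare by content
    totals.set(k, (totals.get(k) || 0) + value);
  }
  return [...totals].map(([k, value]) => ({ key: JSON.parse(k), value }));
}

const reducedRows = sumByKey(emittedRows);
console.log(reducedRows);

// Without grouping, the reduce collapses every row into a single total.
const grandTotal = emittedRows.reduce((sum, row) => sum + row.value, 0);
console.log(grandTotal);
```

On a real view, the grouped form corresponds to querying with `group=true`, and the unreduced rows to querying with `reduce=false`.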

Monthly Total Sales

Total Sales by Reps

Chained views

Example chained view code

Defining a chained view couldn't be simpler: just add a dbcopy field to your view definition, naming the database that should receive the reduced results.

map: function(doc){
    if (doc.rep){ emit({"rep": doc.rep}, doc.amount); }
}
reduce: _sum
dbcopy: sales

View on result data
map: function(doc){
  if (doc.key && doc.key.rep){ emit(doc.value, doc.key.rep); }
}
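For context, view definitions like these are stored in design documents, with the map function held as a string of JavaScript source. The sketch below shows one plausible layout as JavaScript object literals. The design document IDs and view names (`_design/sales`, `total_by_rep`, `_design/ranking`, `reps_by_total`) are made up for illustration, and placing `dbcopy` alongside `map` and `reduce` is an assumption based on the snippet above.

```javascript
// The two map functions from the tutorial. They are never called here, only
// converted to source strings, so the missing global `emit` is not a problem.
const totalByRepMap = function (doc) {
  if (doc.rep) { emit({ rep: doc.rep }, doc.amount); }
};
const repsByTotalMap = function (doc) {
  if (doc.key && doc.key.rep) { emit(doc.value, doc.key.rep); }
};

// Hypothetical design document for the source database.
const sourceDesignDoc = {
  _id: "_design/sales",
  views: {
    total_by_rep: {
      map: totalByRepMap.toString(),
      reduce: "_sum",
      dbcopy: "sales", // assumed: database that receives the reduced rows
    },
  },
};

// Hypothetical design document for the `sales` database that holds the results.
const chainedDesignDoc = {
  _id: "_design/ranking",
  views: {
    reps_by_total: { map: repsByTotalMap.toString() },
  },
};

console.log(JSON.stringify(sourceDesignDoc, null, 2));
console.log(JSON.stringify(chainedDesignDoc, null, 2));
```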

The two tables above are populated from a standard view over documents. With vanilla CouchDB, this is as far as you can go without making the client work harder or putting additional infrastructure around your database. Cloudant customers benefit from chained MapReduce views, which let them build views on the results of other views. This means you can do simple things like sorting on value instead of key, or do more complex roll-up aggregation of data.
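Sorting on value can be sketched locally as follows. The documents below are hypothetical stand-ins for what the `sales` database might contain, shaped with the `key` and `value` fields that the second map function reads, and `emit` is again passed in explicitly so the code runs outside the database.

```javascript
// Hypothetical documents in the `sales` database, one per reduced row.
const copiedDocs = [
  { _id: "a", key: { rep: "Alice" }, value: 300 },
  { _id: "b", key: { rep: "Bob" }, value: 100 },
  { _id: "c", key: { rep: "Carol" }, value: 450 },
];

// The tutorial's second map function: the total becomes the key.
function mapByTotal(doc, emit) {
  if (doc.key && doc.key.rep) { emit(doc.value, doc.key.rep); }
}

const rankedRows = [];
for (const doc of copiedDocs) {
  mapByTotal(doc, (key, value) => rankedRows.push({ key, value }));
}
// Views are ordered by key, and the key is now the sales total.
rankedRows.sort((a, b) => a.key - b.key);

// Reading the index from the high end gives the top rep.
const topRep = rankedRows[rankedRows.length - 1];
console.log(topRep);
```

Against a real view, the same result would come from querying with `descending=true` and `limit=1` rather than sorting on the client.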

Chained views can write into either the original database or a different one. This allows different levels of access for different parts of your application or organisation. For example, you could have one database per region recording sales, visible only to members of that region, and a chained view aggregating all the sales into a central database where management looks at a coarser, high-level view across all regions.
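The regional roll-up could look something like the sketch below. Everything here is hypothetical: the region names, the choice to include `region` in the emitted key, and the `mapByRegion` function are assumptions made so that a view in the central database can aggregate at a coarser level.

```javascript
// Hypothetical reduced rows written into a central database by the chained
// views of two regional databases.
const centralDocs = [
  { key: { region: "emea", rep: "Alice" }, value: 300 },
  { key: { region: "emea", rep: "Bob" }, value: 100 },
  { key: { region: "apac", rep: "Carol" }, value: 450 },
];

// Hypothetical map function for the central database: drop the rep, keep the region.
function mapByRegion(doc, emit) {
  if (doc.key && doc.key.region) { emit(doc.key.region, doc.value); }
}

// Local approximation of the map step followed by a grouped `_sum`.
const regionTotals = {};
for (const doc of centralDocs) {
  mapByRegion(doc, (region, amount) => {
    regionTotals[region] = (regionTotals[region] || 0) + amount;
  });
}
console.log(regionTotals);
```

Management would query only the central database, so per-rep detail stays in the regional databases.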

Chained MapReduce allows you to build views that examine your data in more complex ways. The tables below are populated from chained views built to find the best month and sales rep for the year.

Highest Monthly Sales

Top Sales Reps

This tutorial has introduced you to Cloudant's unique chained incremental MapReduce. Chaining your MapReduce views together allows you to perform complex analytics on vast datasets in near real time and helps you roll up databases into more streamlined aggregated subsets.

We'd love to know if you're using this technique or have any questions about it. If you do, get in touch!