Quick and basic tip about how you can partition dates in your map and use that as an array key and then use different group-level values to perform aggregation on different granularity.

Given that we have some simple documents representing sales:

POST /sales/_bulk_docs

{"docs":[
  {"_id": "1", "created":"2015-01-01T13:35:01", "by":"adam", "value": 200},
  {"_id": "2", "created":"2015-01-01T13:40:02", "by":"john", "value": 220},
  {"_id": "3", "created":"2015-01-01T13:40:03", "by":"hans", "value": 10},
  {"_id": "4", "created":"2015-01-01T13:50:00", "by":"hans", "value": 190},
  {"_id": "5", "created":"2015-01-02T09:05:00", "by":"hans", "value": 10},
  {"_id": "6", "created":"2015-02-01T13:40:03", "by":"adam", "value": 200},
  {"_id": "7", "created":"2015-03-13T13:40:03", "by":"john", "value": 1333}
]}

Lets say we want to be able to see the value by:

  • Year
  • Year, Month
  • Year, Month, Date

The map should now just return an array as key and the value of the order as the value. It could look something like this:

function(doc) {
  var d = new Date(doc.created);
  emit([d.getFullYear(),d.getMonth() + 1,d.getDate()], doc.value);
}

For reduce, you could e.g. use the built in _sum or _stats function.

This view is stored in db “sales” under desgin document “salesinfo” as “bydates”, hence accessed via: GET /sales/_design/salesinfo/_view/bydates.

Lets say we want on per year basis, that would be the "first dimension", hence group_level=1:

GET /..?reduce=true&group_level=1

[2015] --> {sum: 2163, count: 7, min: 10, max: 1333, sumsqr: 1941589}

For year & month:

GET /..?reduce=true&group_level=2

[2015, 1] --> {sum: 630, count: 5, min: 10, max: 220, sumsqr: 124700}
[2015, 2] --> {sum: 200, count: 1, min: 200, max: 200, sumsqr: 40000}
[2015, 3] --> {sum: 1333, count: 1, min: 1333, max: 1333, sumsqr: 1776889}

For year, month & date:

GET /..?reduce=true&group_level=3

[2015, 1, 1]  --> {sum: 620, count: 4, min: 10, max: 220, sumsqr: 124600}
[2015, 1, 2]  --> {sum: 10, count: 1, min: 10, max: 10, sumsqr: 100}
[2015, 2, 1]  --> {sum: 200, count: 1, min: 200, max: 200, sumsqr: 40000}
[2015, 3, 13] --> {sum: 1333, count: 1, min: 1333, max: 1333, sumsqr: 1776889}

Now, lets say you want to limit the results to e.g. take from February:

GET /..?reduce=true&group_level=2&start_key=[2015, 2]

[2015, 2] --> {sum: 200, count: 1, min: 200, max: 200, sumsqr: 40000}
[2015, 3] --> {sum: 1333, count: 1, min: 1333, max: 1333, sumsqr: 1776889}

Of course, this also works with another group_level:

GET /..?reduce=true&group_level=1&start_key=[2015, 2]

[2015] --> {sum:1533, count:2, min:200, max: 1333, sumsqr: 1816889}

Quick and simple example of working with arrays as keys.

Cheers,

//Daniel

Category:
CouchDb & Cloudant, Development, NoSQL
Tags:
, , ,

Join the conversation! 2 Comments

  1. […] CouchDB – partioned dates and group levels to control aggregation granularity […]

    Reply

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: