danielwertheim

danielwertheim


notes from a passionate developer

Share


Sections


Tags


Disclaimer

This is a personal blog. The opinions expressed here represent my own and not those of my employer, nor current or previous. All content is published "as is", without warranty of any kind and I don't take any responsibility and can't be liable for any claims, damages or other liabilities that might be caused by the content.

CouchDB - partioned dates and group levels to control aggregation granularity

Quick and basic tip about how you can partition dates in your map and use that as an array key and then use different group-level values to perform aggregation on different granularity.

Given that we have some simple documents representing sales:

POST /sales/_bulk_docs

{"docs":[
  {"_id": "1", "created":"2015-01-01T13:35:01", "by":"adam", "value": 200},
  {"_id": "2", "created":"2015-01-01T13:40:02", "by":"john", "value": 220},
  {"_id": "3", "created":"2015-01-01T13:40:03", "by":"hans", "value": 10},
  {"_id": "4", "created":"2015-01-01T13:50:00", "by":"hans", "value": 190},
  {"_id": "5", "created":"2015-01-02T09:05:00", "by":"hans", "value": 10},
  {"_id": "6", "created":"2015-02-01T13:40:03", "by":"adam", "value": 200},
  {"_id": "7", "created":"2015-03-13T13:40:03", "by":"john", "value": 1333}
]}

Lets say we want to be able to see the value by:

The map should now just return an array as key and the value of the order as the value. It could look something like this:

function(doc) {
  var d = new Date(doc.created);
  emit([d.getFullYear(),d.getMonth() + 1,d.getDate()], doc.value);
}

For reduce, you could e.g. use the built in _sum or _stats function.

This view is stored in db “sales” under desgin document “salesinfo” as “bydates”, hence accessed via: GET /sales/_design/salesinfo/_view/bydates.

Lets say we want on per year basis, that would be the "first dimension", hence group_level=1:

GET /..?reduce=true&group_level=1

[2015] --> {sum: 2163, count: 7, min: 10, max: 1333, sumsqr: 1941589}

For year & month:

GET /..?reduce=true&group_level=2

[2015, 1] --> {sum: 630, count: 5, min: 10, max: 220, sumsqr: 124700}
[2015, 2] --> {sum: 200, count: 1, min: 200, max: 200, sumsqr: 40000}
[2015, 3] --> {sum: 1333, count: 1, min: 1333, max: 1333, sumsqr: 1776889}

For year, month & date:

GET /..?reduce=true&group_level=3

[2015, 1, 1]  --> {sum: 620, count: 4, min: 10, max: 220, sumsqr: 124600}
[2015, 1, 2]  --> {sum: 10, count: 1, min: 10, max: 10, sumsqr: 100}
[2015, 2, 1]  --> {sum: 200, count: 1, min: 200, max: 200, sumsqr: 40000}
[2015, 3, 13] --> {sum: 1333, count: 1, min: 1333, max: 1333, sumsqr: 1776889}

Now, lets say you want to limit the results to e.g. take from February:

GET /..?reduce=true&group_level=2&start_key=[2015, 2]

[2015, 2] --> {sum: 200, count: 1, min: 200, max: 200, sumsqr: 40000}
[2015, 3] --> {sum: 1333, count: 1, min: 1333, max: 1333, sumsqr: 1776889}

Of course, this also works with another group_level:

GET /..?reduce=true&group_level=1&start_key=[2015, 2]

[2015] --> {sum:1533, count:2, min:200, max: 1333, sumsqr: 1816889}

Quick and simple example of working with arrays as keys.

Cheers,

//Daniel

View Comments