danielwertheim

danielwertheim


notes from a passionate developer

Share


Sections


Tags


Disclaimer

This is a personal blog. The opinions expressed here represent my own and not those of my employer, nor current or previous. All content is published "as is", without warranty of any kind and I don't take any responsibility and can't be liable for any claims, damages or other liabilities that might be caused by the content.

CouchDB - many-to-many relations

CouchDB is one of those databases that is super simple to get started with and then has its not so obvious “tricks” how to accomplish things that might have felt super natural in another DB. Now, first lets pause right here and think of what I just said:

…might have felt super natural in another DB.

CouchDB is a document oriented database where you query on simple or complex keys, which are produced by you using a secondary-index defined by a map-function (as in map-reduce). It is not a database designed for relational data as graph databases (e.g. Neo4j, OrientDB) or RDBMS. CouchDB is not optimized for handling relations. You can mimic it, but be aware that it can come back and bite you in … ..

The model

Lets say we have a model with Cars. Cars is our main focus in this context. To each car we can tie drivers and owners. Now, we could store this information as one bigger document, optimized for focusing on getting/reading detailed info about a car. But for some reason we want this separated. We even want a specific document tracking the connections between cars and persons. So we have:

Scary right. Having that feeling of old RDBMS design creeping into your document store.

Some sample documents.

{
     "_id": "car1",
     "_rev": "2-5...9",
     "$doctype": "car",
     "make": "volvo"
}

{
    "_id": "person1",
    "_rev": "2-d...0",
    "$doctype": "person",
    "name": "daniel"
}

{
    "_id": "person2",
    "_rev": "2-0...8",
    "$doctype": "person",
    "name": "julia"
}

and the document specifying the connections between cars and persons:

{
   "_id": "f...9",
   "_rev": "3-4...0",
   "$doctype": "car-persons",
   "car": "car1",
   "driver": "person1",
   "owner": "person2"
}

The “trick”

Now lets create a secondary-index, a map:

function(doc) {
  if(doc.$doctype == 'car-persons') {
    emit(doc.car, {_id: doc.car})
    emit(doc.car, {_id: doc.driver})
    emit(doc.car, {_id: doc.owner})
  }
}

you could make further adjustments by using complex keys and e.g. include an integer defining what kind of relation a certain record represents, e.g.

function(doc) {
  if(doc.$doctype == 'car-persons') {
    emit([doc.car, 1], {_id: doc.car})
    emit([doc.car, 2], {_id: doc.driver})
    emit([doc.car, 3], {_id: doc.owner})
  }
}

But lets stick with the sample with a simple key. If I now query this, with key="car1"&include_docs=true lets see what happens:

GET --> /mydb/_design/mydesigndoc/_view/myview?key="car1"&include_docs=true
{
    "total_rows": 3,
    "offset": 0,
    "rows": [
        {
            "id": "f...9",
            "key": "car1",
            "value": {
                "_id": "car1"
            },
            "doc": {
                "_id": "car1",
                "_rev": "2-5...9",
                "$doctype": "car",
                "make": "volvo"
            }
        },
        {
            "id": "f...9",
            "key": "car1",
            "value": {
                "_id": "person1"
            },
            "doc": {
                "_id": "person1",
                "_rev": "2-d...0",
                "$doctype": "person",
                "name": "daniel"
            }
        },
        {
            "id": "f...9",
            "key": "car1",
            "value": {
                "_id": "person2"
            },
            "doc": {
                "_id": "person2",
                "_rev": "2-0...8",
                "$doctype": "person",
                "name": "julia"
            }
        }
    ]
}

That was all for this time. Keep in mind. It’s a “trick”. Evaluate and measure for your domain. If found not performing enough or if it starts to pollute your application layer. Switch DB-engine to something that handles what your specific domain requires.

//Daniel

View Comments