Breedlove: reporting - Schema design for MongoDB pre-aggregated reports -

Saturday, 15 June 2013


I'm following the official MongoDB docs (http://docs.mongodb.org/ecosystem/use-cases/pre-aggregated-reports/) on pre-aggregated reports. According to the tutorial, a pre-aggregated document should look like this:

{
  _id: "20101010/site-1/apache_pb.gif",
  metadata: {
    date: ISODate("2000-10-10T00:00:00Z"),
    site: "site-1",
    page: "/apache_pb.gif"
  },
  hourly: {
    "0": 227850,
    "1": 210231,
    ...
    "23": 20457
  },
  minute: {
    "0": {
      "0": 3612,
      "1": 3241,
      ...
      "59": 2130
    },
    "1": {
      "0": ...,
    },
    ...
    "23": {
      "59": 2819
    }
  }
}
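To make the layout above more concrete, here is a minimal sketch of how a single page hit could be turned into an upsert against such a document. The helper name `buildHitUpsert` is hypothetical, and the choice to set `metadata` through `$setOnInsert` is an assumption of this sketch rather than something taken from the tutorial; only the `_id`, `hourly` and `minute` layout comes from the document shown above.

```javascript
// Hypothetical helper (not from the tutorial): builds the filter and update
// documents for one hit, following the _id / hourly / minute layout above.
function buildHitUpsert(site, page, when) {
  const pad = (n) => String(n).padStart(2, "0");
  // Midnight (UTC) of the day the hit belongs to.
  const day = new Date(Date.UTC(when.getUTCFullYear(), when.getUTCMonth(), when.getUTCDate()));
  const dayKey = `${day.getUTCFullYear()}${pad(day.getUTCMonth() + 1)}${pad(day.getUTCDate())}`;
  const hour = when.getUTCHours();
  const minute = when.getUTCMinutes();
  return {
    // Same shape as the example _id: "<yyyymmdd>/<site><page>".
    filter: { _id: `${dayKey}/${site}${page}` },
    update: {
      // Bump the hourly counter and the per-minute counter for this hit.
      $inc: { [`hourly.${hour}`]: 1, [`minute.${hour}.${minute}`]: 1 },
      // Assumption of this sketch: metadata is written only when the
      // document is first created.
      $setOnInsert: { metadata: { date: day, site: site, page: page } },
    },
  };
}

const hit = buildHitUpsert("site-1", "/apache_pb.gif", new Date(Date.UTC(2010, 9, 10, 13, 7, 30)));
console.log(JSON.stringify(hit));
```

The two resulting objects would then be passed to an update with the upsert option enabled, against whatever collection holds the daily documents.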

The thing is, I'm already using this approach and have my data stored this way. Now I want to add another dimension to the metadata subdocument, and I'm reconsidering the whole thing.

My question is: is there a reason to build the _id attribute from the same information that is stored in the metadata attribute? Wouldn't it be enough to create a (unique) compound index on metadata and use an ObjectId as the _id key?
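For illustration, the alternative described in the question could be sketched roughly as follows. The helper names and the extra dimension `country` are hypothetical; the sketch only builds the index specification and the lookup filter as plain objects, which would then be handed to the driver or shell (for example to a `createIndex` call with the `unique` option).

```javascript
// Hypothetical helper: key spec and options for a unique compound index
// over the chosen metadata dimensions.
function uniqueMetadataIndex(dimensions) {
  const keys = {};
  for (const name of dimensions) {
    keys[`metadata.${name}`] = 1; // ascending on each metadata field
  }
  return { keys: keys, options: { unique: true } };
}

// Hypothetical helper: filter that matches one document by its metadata
// fields instead of by a hand-built _id string.
function metadataFilter(metadata) {
  const filter = {};
  for (const [name, value] of Object.entries(metadata)) {
    filter[`metadata.${name}`] = value;
  }
  return filter;
}

// "country" stands in for the additional dimension; it is an assumption.
const index = uniqueMetadataIndex(["date", "site", "page", "country"]);
const filter = metadataFilter({ site: "site-1", page: "/apache_pb.gif", country: "US" });
console.log(JSON.stringify(index), JSON.stringify(filter));
```

With this layout, adding a dimension means changing the list of indexed fields rather than the format of the _id string.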

Thanks!

Another way ;)

You can create a simple collection:

{ "ts": "unix timestamp", "site": "site-1", "page": "/apache_pb.gif" }

This collection performs well on inserts.

Then use a more complex aggregate query that groups by a time grain:

db.test.aggregate([
  // Keep ts, site and page; derive the hour bucket as floor(ts / 3600).
  {
    "$project": {
      "ts": 1,
      "_id": 0,
      "grain": {
        "$subtract": [
          { "$divide": ["$ts", 3600] },
          { "$mod": [{ "$divide": ["$ts", 3600] }, 1] }
        ]
      },
      "site": 1,
      "page": 1
    }
  },
  // Collapse to one document per distinct (site, page, grain).
  { "$group": { "_id": { "site": "$site", "page": "$page", "grain": "$grain" } } },
  // Count those documents per grain.
  { "$group": { "tsum": { "$sum": 1 }, "_id": { "grain": "$_id.grain" } } },
  // Flatten the result.
  { "$project": { "tsum": "$tsum", "_id": 0, "grain": "$_id.grain" } },
  // Order by time.
  { "$sort": { "grain": 1 } }
])

This aggregates the statistics per hour (3600 seconds in the example).

IMHO this is a simpler and more manageable solution: no complex data model, and it still performs well (don't forget the index).
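To see what the pipeline above computes, here is a small in-memory sketch that mirrors its stages in plain JavaScript. The function names and the sample events are made up for illustration; the arithmetic (`x - (x mod 1)` with `x = ts / 3600`) is the same as in the `$project` stage.

```javascript
// Same arithmetic as the $project stage: x - (x mod 1), with x = ts / 3600,
// which gives the whole number of hours since the epoch.
function hourGrain(ts) {
  const x = ts / 3600;
  return x - (x % 1);
}

// Mirrors the two $group stages, the final $project and the $sort.
function countPerGrain(events) {
  // First $group: one entry per distinct (site, page, grain).
  const distinct = new Set();
  for (const e of events) {
    distinct.add(JSON.stringify([e.site, e.page, hourGrain(e.ts)]));
  }
  // Second $group: count those entries per grain.
  const counts = new Map();
  for (const key of distinct) {
    const grain = JSON.parse(key)[2];
    counts.set(grain, (counts.get(grain) || 0) + 1);
  }
  // $project and $sort: flat { tsum, grain } objects ordered by grain.
  return [...counts.entries()]
    .sort((a, b) => a[0] - b[0])
    .map(([grain, tsum]) => ({ tsum: tsum, grain: grain }));
}

// Made-up sample data: three hits in the first hour, one in the second.
const sample = [
  { ts: 0, site: "site-1", page: "/a" },
  { ts: 10, site: "site-1", page: "/a" },
  { ts: 20, site: "site-1", page: "/b" },
  { ts: 3600, site: "site-1", page: "/a" },
];
console.log(countPerGrain(sample));
```

Note that because of the intermediate grouping, `tsum` counts distinct site/page combinations per hour, so the two hits on "/a" in the first hour count once. If raw hit counts are wanted instead, the count would have to be taken before that grouping collapses the duplicates.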

mongodb reporting data-modeling
