mongodb - Retrieving Mongo docs for a large fixed set of identifiers
I have a Mongo db with 200M+ documents. Each document has a "name" field (indexed), which is a string, and an "items" field (not indexed), which is an array of integers. The size of the array can range from 1 to 100.
Say I have a txt file with 1M names. I need to create a txt file containing the "items" for each of those 1M names.
Options:
1. Just iterate through the names one at a time and extract the items based on _id.
2. Create "batches" of small sets of names (say 100 at a time) and query the db using the $in operator. Later iterate through the documents one by one.
3. Use some sort of map-reduce to break up the 1M names and query them in parallel.
What is the most efficient way to do this?
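For reference, the second option (batches queried with $in) could look roughly like the sketch below. It assumes `collection` is a pymongo-style collection whose `find(filter, projection)` returns an iterable of dicts; the function names, the tab-separated output format and the default batch size of 100 are illustrative choices, not something given in the question.

```python
from itertools import islice


def batched(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def export_items_in_batches(collection, names, out, batch_size=100):
    """Write one 'name<TAB>items' line per matched document; return the count.

    `collection` is assumed to behave like a pymongo Collection.
    """
    written = 0
    for batch in batched(names, batch_size):
        # One query on the indexed "name" field per batch,
        # projecting only the two fields that are needed.
        cursor = collection.find(
            {"name": {"$in": batch}},
            {"_id": 0, "name": 1, "items": 1},
        )
        for doc in cursor:
            items = " ".join(str(i) for i in doc["items"])
            out.write(doc["name"] + "\t" + items + "\n")
            written += 1
    return written
```

Names that are not in the collection are silently skipped, and the output order follows whatever order the server returns within each batch, so the name is written on every line.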
This is hard to answer without trying it and profiling.
Since the arrays are small, and assuming every name is found, a brute-force scan of the database in natural order may be faster than any of the options you suggested.
Using a parallel scan (http://docs.mongodb.org/manual/reference/command/parallelcollectionscan/) you can iterate over the documents. You can hold the 1M names in memory, and roughly once every 200 records (1M names out of 200M documents) you'll find a match to write to the output text file.
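A minimal sketch of the scan-and-filter idea follows. It deliberately uses a single plain cursor rather than the parallel scan command, and it assumes `collection` is a pymongo-style collection whose `find(filter, projection)` returns an iterable of dicts. The function name and the output format are made up for illustration.

```python
def export_items_by_scan(collection, names, out):
    """Scan every document once and keep those whose name is wanted.

    `collection` is assumed to behave like a pymongo Collection.
    Returns the number of lines written.
    """
    # Hold all wanted names in memory for O(1) membership tests.
    wanted = set(names)
    # Empty filter: walk the whole collection, fetching only two fields.
    cursor = collection.find({}, {"_id": 0, "name": 1, "items": 1})
    written = 0
    for doc in cursor:
        if doc.get("name") in wanted:
            items = " ".join(str(i) for i in doc["items"])
            out.write(doc["name"] + "\t" + items + "\n")
            written += 1
    return written
```

Whether this beats indexed lookups depends on disk throughput and document size, so it is worth timing both on a sample before committing to either.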
mongodb mongodb-query