Wednesday 15 May 2013

performance - Handling large transactions: any time/memory tradeoffs?




In our schema there is a (quite common) case in which a user's action triggers an operation that sets or removes labels on nodes and relationships, amounting in total to the order of hundreds of thousands of entities (remove label A from 100k nodes, set label B on 80k nodes, set property [x,y,z] on 20k nodes, and so on). Of course, we can't squeeze them all into one transaction, and since these nodes can be separated into a large number of subsets, we perform these actions within a number of separate transactions, which, of course, breaks ACIDity but is satisfactory in terms of performance. If I, however, try to nest these transactions in a single big one to rule them all, the top-level transaction tries to keep track of all the internal transactions' updates to the DB, which, of course, results in extremely poor performance.

What can you guys recommend to solve this problem?

My config (well, the relevant parts):

"org.neo4j.server.database.mode" : "ha",
"use_memory_mapped_buffers" : "true",
"neostore.nodestore.db.mapped_memory" : "450m",
"neostore.relationshipstore.db.mapped_memory" : "450m",
"neostore.propertystore.db.mapped_memory" : "450m",
"neostore.propertystore.db.strings.mapped_memory" : "300m",
"neostore.propertystore.db.arrays.mapped_memory" : "50m",
"cache_type" : "hpc",
"dense_node_threshold" : "15",
"query_cache_size" : "150"

Any hints and clues are much appreciated :)

You are right that modifying hundreds of thousands of entities in the same transaction as the result of a user action isn't going to be performant. Nested transactions in Neo4j are just "placebo" transactions, as you correctly point out.

I would start by thinking about alternative strategies to accomplish your goal (which I know nil about) without needing to update that many entities.

If an alternative isn't possible, I would ask whether it is OK for the updates to happen a short time after the user action. If the answer is yes, then I would store a message about the user action in a persistent queue and process it asynchronously. That way, the user's call returns quickly and the update happens eventually.
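A minimal sketch of the queue idea, in Python and using only the standard library. Everything here is an assumption made for illustration: the queue is a SQLite table named pending_actions, the message shape ("op", "label", "node_ids") is made up, and apply_batch is a hypothetical callback that would open one database transaction per batch and apply the label change to the nodes in it. None of this is Neo4j API.

```python
import json
import sqlite3


def open_queue(path=":memory:"):
    """Open (or create) the persistent queue. Use a file path in real use."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pending_actions ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "payload TEXT NOT NULL, "
        "done INTEGER NOT NULL DEFAULT 0)"
    )
    conn.commit()
    return conn


def enqueue(conn, action):
    """Called from the user request: record the action and return at once."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO pending_actions (payload) VALUES (?)",
            (json.dumps(action),),
        )
    return cur.lastrowid


def chunks(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def process_pending(conn, apply_batch, batch_size=10_000):
    """Background worker: apply each queued action in bounded batches.

    apply_batch(op, label, node_ids) is a hypothetical callback that is
    expected to run one graph transaction per call.
    """
    rows = conn.execute(
        "SELECT id, payload FROM pending_actions WHERE done = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        action = json.loads(payload)
        for batch in chunks(action["node_ids"], batch_size):
            apply_batch(action["op"], action["label"], batch)
        # Mark the action finished only after every batch was applied.
        with conn:
            conn.execute(
                "UPDATE pending_actions SET done = 1 WHERE id = ?", (row_id,)
            )
    return len(rows)
```

If the worker dies halfway through an action, that action stays marked as not done and is applied again on the next run. That is harmless as long as the operations are idempotent, which setting or removing a label normally is.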

Finally, if it is acceptable for the time between the user action and the big update to be longer, I would consider an "agent" that continuously crawls the graph and updates the labels of the entities it encounters, as opposed to transaction-driven updates. Have a look at GraphAware NodeRank for inspiration.
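To make the "agent" idea concrete, here is a rough Python sketch. It is not how GraphAware NodeRank works internally and it does not talk to Neo4j: the graph is a plain dict mapping a node id to its set of labels, LabelAgent is a made-up name, and desired_labels is a hypothetical rule that says which labels a node should carry. The point it shows is only that each tick does a small, bounded amount of work, so no single transaction ever has to cover the whole update.

```python
class LabelAgent:
    """Visits a bounded number of nodes per tick and fixes their labels."""

    def __init__(self, graph, desired_labels, budget=1000):
        self.graph = graph                    # stand-in: {node_id: set_of_labels}
        self.desired_labels = desired_labels  # hypothetical rule: (id, labels) -> labels
        self.budget = budget                  # max nodes visited per tick
        self._cursor = 0                      # where the previous tick stopped

    def tick(self):
        """Visit up to `budget` nodes; return how many were changed."""
        node_ids = sorted(self.graph)
        if not node_ids:
            return 0
        changed = 0
        for _ in range(min(self.budget, len(node_ids))):
            node_id = node_ids[self._cursor % len(node_ids)]
            self._cursor = (self._cursor + 1) % len(node_ids)
            wanted = self.desired_labels(node_id, self.graph[node_id])
            if wanted != self.graph[node_id]:
                # In a real system this write would be one small transaction.
                self.graph[node_id] = wanted
                changed += 1
        return changed


# Usage: replace label "A" with "B", two nodes per tick.
graph = {i: {"A"} for i in range(5)}
agent = LabelAgent(
    graph,
    lambda node_id, labels: {"B" if name == "A" else name for name in labels},
    budget=2,
)
while agent.tick():
    pass  # a real agent would sleep between ticks and never stop
```

The trade-off is the one described above: the graph is only eventually consistent with the user's action, and how long that takes depends on the budget and on how often the agent ticks.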

performance transactions neo4j cypher
