Cluster computing - Brand new to Cassandra, having trouble understanding replication topology
So I'm taking over our Cassandra cluster after the previous admin left, and I'm busy trying to learn as much as I can about it. I'm going through the documentation on DataStax's site, since we're using their product.
That said, on the replication factor part, I'm having a bit of a problem understanding why I wouldn't set the replication factor to the number of nodes I have. I have 4 nodes and 1 datacenter, and all nodes are located in the same physical location as well.
What, if any, benefit would there be to having a replication factor of less than 4?
I'm thinking it would be beneficial from a fault tolerance standpoint if each node had its own copy/replica of the data, so I'm not sure why I would want fewer replicas than the number of nodes I have. Are there performance tradeoffs or other reasons? Am I missing a concept here (entirely possible)?
There are a few reasons why you might not want to increase your RF from 3 to 4:
Increasing your RF multiplies your original data volume by that amount. Depending on your data volume and data density, you may not want to incur the additional storage hit. Keeping RF below the number of nodes helps you scale beyond one node's capacity.
Depending on your consistency level, you may experience a performance hit. For example, when writing at QUORUM consistency level (CL) with an RF of 3, you wait for 2 nodes to come back before confirming the write to the client. With an RF of 4, you would be waiting for 3 nodes to come back.
Regardless of the CL, with an RF of 4 on 4 nodes every write is going to every node. That means more activity on your cluster, and it may not perform as well if your nodes aren't scaled for that workload.
You mentioned fault tolerance. With an RF of 4 and reads at CL ONE, you can absorb 3 of your servers being down simultaneously and your app will still be up. From a fault tolerance perspective this is pretty impressive, but also unlikely to matter. My guess is that if you have 3 nodes down at the same time in the same DC, the 4th is probably down as well (natural disaster, flood, who knows...).
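To put rough numbers on the points above, here is a minimal Python sketch, not anything shipped with Cassandra. The function names and the 100 GB raw data size are made up for illustration, and it assumes a single datacenter, evenly distributed data, and no compaction or snapshot overhead. QUORUM is taken as floor(RF / 2) + 1.

```python
def quorum(rf: int) -> int:
    """Replicas that must respond at QUORUM: floor(rf / 2) + 1."""
    return rf // 2 + 1


def summarize(rf: int, nodes: int, raw_gb: float) -> dict:
    """Rough numbers for one keyspace in a single datacenter.

    raw_gb is the size of one copy of the data (hypothetical input).
    """
    if not 1 <= rf <= nodes:
        raise ValueError("this sketch expects 1 <= rf <= nodes")
    return {
        # Every row is stored rf times across the cluster.
        "total_gb": raw_gb * rf,
        # Assumes replicas are spread evenly over all nodes.
        "per_node_gb": raw_gb * rf / nodes,
        # Acknowledgements a QUORUM read or write waits for.
        "quorum_acks": quorum(rf),
        # Worst-case node failures that still leave enough replicas.
        "down_ok_at_one": rf - 1,
        "down_ok_at_quorum": rf - quorum(rf),
    }


# Compare RF 3 and RF 4 on the 4-node cluster from the question.
for rf in (3, 4):
    print(rf, summarize(rf, nodes=4, raw_gb=100.0))
```

Under these assumptions, going from RF 3 to RF 4 raises the per-node footprint from 75% to 100% of the data set and adds one more acknowledgement to every QUORUM operation, yet QUORUM still tolerates only one node being down in both cases. The extra failure tolerance only shows up at CL ONE.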
At the end of the day, it all depends on your needs, and C* is nothing if not configurable. An RF of 3 is very common among Cassandra implementations.
Check out this deck by Joe Chu.