Topics in Database Systems: Data Management in Peer-to-Peer Systems STRUCTURED OVERLAYS BASICS (Week 2 - 26/10/2006) NOTES ================================================================== Basic operations for all structured overlays: Join: a. Bootstrap must know at least one other node in the system b. Find the right position in the overlay (by routing) c. Update the overlay: build own routing table and update tables at a number of other nodes that are affected * important that the routing tables are small and involve only a small number of other neigboring nodes d. Publish own data (keys) e. Get assigned the corresponding data partition: data (keys) are transferred from neigboring nodes to it * important that both the number of affected neigborhs and the data volume is kept small Step c may be done incrementally: first, ensure connectivity - that is, correct operation by updating immediate neigborhs - and then do the other updates (eg Chord (on demand), CAN (periodic refreshes)) Leave: a. Find out that the node is unreachable (failed, voluntary left) by periodic life beats b. Update the overlay: 1. find a node to "take over" the data space hold by the departing node or "replace it" 2. update the affected routing tables Lookup: a. Follow the routing table entries towards the node that is closer to the node that holds the data * The key point is to decrease the distance to the destination at each step, if you halve this distance then you get logarithmic lookup as in CHORD. In general, most structured overlays seek to keep the minimum distance between any two nodes (this is called diameter) small. However, since in many cases, there are more than one path between any two nodes, lookup must find the minimum path between the node that issues the lookup and the destination. DHTs vs trees: DHTs (in their basic form) cannot support range queries There is a subtle difference between virtual nodes and actual nodes: the mapping between them is not necessarily a one-to-one mapping An actual node may host more than one virtual node. ------------------------------------------------------------------------- CAN * importance of *equal* size partitions CHORD * virtual ring: nodes fill in positions in this ring BATON * search trees give logarithmic lookup time: but must keep them balanced and avoid the fact that the higher in the tree, the more the load you get -- To keep the tree balanced, BATON places new nodes as leaves (as a child of a node at level L-1, or L (if all L-1 nodes have two children) and removes leaf nodes (if a non leaf node or a leaf node at level L-1 is removed, it is replaced by a leaf at level L) -- To balance the load, BATON adds links between siblings (at logarithmic distances similarly to CHORD), upwards links to the parent node and links to the left and right adjacent nodes in an in-order traversal ---------------------------------------------------------------------------- Further issues: 1. Load-balance a. Each node must be assigned similar-sized partition of the data b. Each node must receive the same query and update load (including routing load) 2. Physical network awareness a. When building the network, take care so that neigborhs in the overlay are also neigborhs in the physical network b. While routing, if you have a choice, move to the neigbor that is physically closer 3. Resilience to failures When a node fails (departs) ensure that a. the overlay operates correctly b. The content of nodes is not lost 4. Concurrency Updates and lookups appear concurrently, ensure correct operation (consistency)