Degree-Optimal Deterministic Routing for P2P Systems Dipartimento di Informatica e Applicazioni “R.M. Capocelli” Università di Salerno, 84081, Baronissi (SA) - Italy Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno Università di Salerno - GL7 Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno Outline • • • • • • • P2P e DHT DHT performance metrics Greedy Routing vs Non-Greedy Routing Neighbor of Neighbor Routing algorithm The Small World Phenomena Our Proposal: H-Networks Conclusions Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno Distributed Hash Table (DHT) • Distributed version of a hash table data structure • Stores (key, value) pairs • • The key is like a filename The value can be file contents • Goal: Efficiently insert/lookup/delete (key, value) pairs • Each peer stores a subset of (key, value) pairs in the system • Core operation: Find node responsible for a key • Map key to node • Efficiently route insert/lookup/delete requests to this node Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno DHT performance metrics • Three performance metrics: • Routing table size (degree) • Storage cost • Measure the cost of self-stabilization for adapting to node joins/leaves • Diameter and Average path length • Time cost • Fault tolerance Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno Chord • Chord uses a one-dimensional circular key space (ring) of N=2m identifiers • The node responsible for the key is the node whose identifier most closely follows the key Routing • Chord maintains two sets of neighbors: correctness • A successor list of k nodes that immediately follows it in the key space • A finger list of m = log N nodes spaced exponentially around the key space • Routing consists in forwarding to the node closest, but not past, the key Routing efficiency • Performance: • Diameter: log N (O(log n) whp) where n denote the number of nodes present in the network • Routing table size: log N (O(log n) whp) • Average path length: ½ log N Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno Successors ID Resp. indice Nod o 1 14 8+1=9 14 2 8+2=11 21 14 3 8+4=12 4 8+8=16 5 24 14 32 21 38 8+16=24 6 24 42 8+32=40 42 Predecessor Nodo 1 Meeting Firb - Genova, 5-6 luglio 2004 Meeting WEBMINDS 2005 m=6 Salerno, 20/21/22 giugno Greedy routing: move to the neighbor that minimizes the distance to the target. t s Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno Greedy Routing in the Considered Networks Simple – to understand and to implement. Local – routing occurs inside the portion of ring that is delimited by source and destination In some cases – (Hypercube, Chord) – the best we can do. Not optimal with respect to the degree. Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno Greedy Routing in the Considered Networks •Degree is (log n) •Greedy routing needs (log n) hops. •Lower bound is Ω(log n / log log n) Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno Non Greedy Routing Routing is not local • Viceroy Network • Degree: O(1) • Average Path Length: O(log N) • De Bruijn graphs • Degree: O(log N) logN • Average Path Length: O loglogN Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno Neighbor of Neighbor (NoN) Greedy Routing Let d(x,y) be a metric for the nodes in the network. 1. Assume the message is currently at node u ≠ target. 2. Let N = {v1, v2, …, vk} be the neighbors of u. 3. For each 1 ≤ i ≤ k, let wi1, wi2, …, wik be the neighbors of vi and let N'= { wij 1 ≤ i, j ≤ k}. 4. Among these k2+k nodes, assume that z is the one closest to the target (with respect to metric d). 5. If z N route the message from u to z else z = wij, for some i and j, and we route the message from u via vi to z. • Manku, Naor, Wieder [2004] NoN-routing within O(log n / loglog n) hops in Small World Networks. • Manku, Bawa, Ragahavan [2003]: a heuristic routing algorithm in Symphony – a Small World P2P network. • Coppersmith, Gamarnik and Sviredenko [2002]: proved an upper bound on the diameter of a Small World graph. Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno Neighbor of Neighbor (NoN) Greedy Routing t Meeting WEBMINDS 2005 s Salerno, 20/21/22 giugno The Small World Phenomena • The “six degree of separation” experiment S. Milgram [M67]. • The sociological experiment relied on social networks to transmit a letter from a person to unfamiliar targets by passing the letter only via acquaintances. • Only a small number (around 6) of steps was needed. • Problem: Locate a resource in a ‘natural’ network based on partial information • Question: How do people find short paths? • Recent work [DRW03], shows that, in the first steps the message was forwarded to a person P by using a guess on who P knew or, in other words, on his/her neighbors. Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno Small World • Nodes points in a two dimensional grid • Grid edge short range • Each edge (x, y) appears independently with probability 1/d(x,y)2 •Degree of each node (log N) Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno R-Schemes • R-Chord N=2m [MNW04] • For each 0 ≤ i < m, let r(i) denote an integer chosen uniformly at random from the interval [0,2i), node x is connected by edges to the nodes x+2i+r(i); x 2i 2i+1 • R-Hypercube [MNW04] • For each 0 ≤ i ≤ m, node x is connected with y where y is defined as follows: the top i-1 bits of y are identical to those of x. The ith is flipped. The remaining m - i bits are chosen uniformly at random. Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno NoN-Greedy Routing [MNW04] R-Chord, R-Hypercube, Skip Graphs Symphony Degree Greedy NoN Greedy log N log N k < log N Meeting WEBMINDS 2005 log 2 N k log N log log N log 2 N k log k Salerno, 20/21/22 giugno Neighbor of Neighbor (NoN): Degree • Cost of Neighbor of Neighbor lists: • Memory: O(log2n) • Maintenance: O(log n) must be updated • Neighbor lists should be maintained (open connection, pinging, etc.) “In practice, a Chord ring will never be in a stable state; instead, joins and departures will occur continuously, interleaved with the stabilization algorithm. The ring will not have time to stabilize before new changes happen.” [SMLKKDB03] Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno H-Networks • H-Chord • Let N=2m and H() denote a good hash function. For each 0 ≤ i ≤ m, node x is connected by edges to the nodes x+2i+ H(x) mod 2i; x 2i 2i+1 • H-Hypercube • Let H() denote a good hash function, for each 0 ≤ i ≤ m, node x is connected with y where y is defined as fallows: the top i-1 bits of y are identical to those of x. The ith is flipped. The remaining m - i bits are identical to those of H(x). Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno H-Networks H-Networks Degree Greedy NoN Greedy H-Chord, H-Hypercube, log N log N log N log log N H-Skip-graphs log N log N log N k < log N log 2 N k log 2 N k log k H-Symphony Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno H-Chord Lemma The average path length is O(log n / loglog n) hops for the NoN Greedy algorithm on H-Chord with n=2m nodes. n<2m Chernoff Bound Proof s t d(s,t)=d the distance decrease at last with a factor of ¾ for each step Phase I: d < n1/loglog n O(log n / loglog n) step to reach the destination Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno H-Chord I s t d(s,t)=d Phase II: d > n1/loglog n • |I|=d’=d / log d • Goal: The probability that s can reach the interval I in two hops is equal to a constant c Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno H-Chord p - 1 neighbors I s t |I|=d’=d / log d d > n1/loglog n d(s,t)=d p=log d • s has at least p - 1 neighbors s1,… ,sp-1 Claim: The probability that si can reach the interval I is at least d’/2p The probability that s can reach the interval I in two hops is equal to a constant 1-e-1 APL=O(log n / loglogn) Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno Skip – Graphs • Each node (resource) has a name. • Nodes are arranged on a line sorted by name. b a 0 0 1 c 1 0 1 d 0 1 0 f e 0 0 1 1 1 1 1 0 0 • Each node x chooses a random string m(x) of bits. • An edge is established if two nodes share a prefix which is not shared by the nodes between them. • Allows prefix search. • ??? Load balancing ??? Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno Routing in Skip – Graphs • Greedy Routing – use longest edge possible. • Path length and degree are (log N) w.h.p. 0 0 1 1 0 1 Meeting WEBMINDS 2005 0 1 0 0 0 1 1 1 1 1 0 0 Salerno, 20/21/22 giugno H-Skip-graphs Let H() denote a good hash function, H-Skip graphs are identical to Skip-graphs but with m(x)=H(x) b a 0 0 1 c 1 0 1 Meeting WEBMINDS 2005 d 0 1 0 f e 0 0 1 1 1 1 1 0 0 Salerno, 20/21/22 giugno H-Skip-graphs • A node in H-Skip-graphs, in spite of using a deterministic hash function, has no way of estimate its neighbor’s neighbors. • Nevertheless, by using a deterministic hashing function the membership vector m(t) of the target become available to the source, and a more efficient search is now possible. • d(x,y)=(y+2m-x) mod 2m • d ' ( x, y ) ( y 2 m x) mod 2 m 2 b a 0 0 1 lcs ( m ( x ), m ( y )) c 1 0 1 Meeting WEBMINDS 2005 d 0 1 0 f e 0 0 1 1 1 1 1 0 0 Salerno, 20/21/22 giugno Chord: N IDs, N nodes Chord 10 9 8 7 6 5 4 3 2 1 0 R-Chord Greedy R-Chord NoN R-Chord SP H-Chord Greedy H-Chord NoN 18 16 14 12 10 8 6 H-Chord SP 4 2 hops Lower is better log N Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno Lower is better Chord: 232 IDs, N nodes 9 Chord 8 6 R-Chord Greedy R-Chord NoN 5 R-Chord SP hops 7 4 2 H-Chord Greedy H-Chord NoN 1 H-Chord SP 3 0 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 log N Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno Lower is better Skip-graphs: N nodes 14 Skip-graphs Greedy Skip-graphs NoN Skip-graphs SP H-Skip-graphs Greedy H-Skip-graphs NoN H-Skip-graphs SP 12 hops 10 8 6 4 2 0 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 log N Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno Skip-graphs: N nodes Lower is better 14 H-Skip-graphs Greedy 12 hops 10 H-Skip-graphs NoN 8 6 H-Skip-graphs Greedy (New Metric) 4 2 H-Skip-graphs NoN (New Metric) 16 14 12 10 8 6 4 2 0 log N Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno Conclusions • H-Networks: • Deterministic P2P networks • No additional information is transmitted nor stored: • Each node x, knowing y, can compute H(y) and then can estimate y’s neighbors. • Asymptotically optimal with respect to average path length and degree (No hidden constant) • Allows a trade-off between efficiency and maintenance • No overhead with respect to greedy routing system Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno Bibliography "Degree-Optimal Deterministic Routing for P2P Systems”. G.Cordasco, L.Gargano, M.Hammar, and V.Scarano. In Proc. of 10th IEEE Symposium on computers and communications (ISCC 2005) La Manga del Mar Menor, Cartagena, SPAIN June 27-30, 2005. Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno Dipartimento di Informatica e Applicazioni “R.M. Capocelli” Università di Salerno, 84081, Baronissi (SA) - Italy Questions? Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno • Fine Diapositive Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno Motivation • Peer to Peer Systems (P2P) • • • • • • • • File sharing system; File storage system; Distributed file system; Redundant storage; Availability; Performance; Permanence; Anonymity; Meeting WEBMINDS 2005 Scalability Salerno, 20/21/22 giugno Uniform Routing Algorithm • We consider a ring of N identifiers labeled from 0 to N-1 • A routing algorithm is uniform if for each identifier x, x is connected to y iff x+z is connected to y+z (i.e. : all the connection are symmetric). • Advantages • Easy to implement • Greedy algorithm is optimal • Simple – to understand and implement • Local – routing occurs inside the portion of ring that is delimited by source and destination • No node congestion • Fast Bootstrap • Do not need to estimate n • Drawback Routing is not greedy • Less powerful (De Bruijn Graph and Neighbor of Neighbor Greedy routing are more powerful) Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno Asymptotic tradeoff curve Diameter Ring Uniform Routing algorithm n -1 Totally connected graph Chord et al. LB O(log n) O(log n/ log(log n)) 1 Non-Uniform Routing algorithm 1 O(log n) n -1 Routing table size Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno Classification…. Pure P2P Systems Uniform Systems Non Uniform Systems Chord, CAN, Pastry, Tapestry… F-Chord Greedy Routing Non Greedy Routing Randomized Networks and Neighbor of Neighbor Routing Viceroy, De Bruijn graphs Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno F-Chord() [1/2,1] m=log n=1.44 log n F-Chord(1) 12 3 5 8 13 21 34 89 55 • F-Chord() Fib(2i), for i = 1,2, …,(1-)(m-2) Fib(i), for i = 2 (1-)(m-2) +2, …, m-1 even jumps all jumps • Degree: F-Chord() use (m-2) jumps • Diameter: For any value of , the diameter of F-Chord() is m/2 0.72021 log N • Average Path Length: The average path length of the F-Chord() scheme is bounded by 0.39812 log N + (1-)0.24805 log N+1 Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno Graphical results Lower is better 1.60 1.40 Degree hops x log n 1.20 Diameter 1.00 Average Path Length 0.80 0.60 Chord APL 0.40 Chord Degree & Diameter 0.20 0.00 0.50 0.55 0.60 0.65 0.70 0.75 0.80 0.85 0.90 0.95 1.00 Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno H-Networks • We denote by j1, j2 , …, jd all the jumps of our schemes (ordered by their size); • Let H() a good hash function that map an id on a sequence of m bits, for each 1 ≤ i ≤ d, node x is connected by edges to node x + ji + (H(x)/2m)*(ji+1 - ji) [0,1) ji+1 ji Meeting WEBMINDS 2005 Salerno, 20/21/22 giugno