Special Aspects in Communication Networks
Structured Peer-to-Peer Systems
26.01.2012
PD Dr. Oliver Waldhorst
Based on lectures by Dominic Battré and Thorsten Strufe
Peer-to-Peer and Overlay Networks
Prof. Dr.-Ing. habil. Andreas Mitschele-Thiel, Integrated Communication Systems Group, www.tu-ilmenau.de/ics

The "Lookup" Problem
• Central problem in P2P systems:
  – "I want to use resource X. Who is responsible for it?"
Source: D. Battré, VL P2P Netzwerke, 2010

Locating Resources
• Given
  – A set of nodes (devices, peers)
  – A set of resources distributed among the nodes
• Basic questions
  – Where is a resource stored?
  – How can it be located?
• Examples of resources
  – Files (e.g. in Gnutella)
  – References to files (e.g. in Napster)
  – Tables of a database
  – Entries of database tables
  – Any other kind of information

Napster vs. Gnutella
• Napster
  – Resource = metadata for a file (including a reference to the file)
    • Stored on the (central) index server
    • Located by searching the index
  – Problems: reliability, scalability
• Gnutella
  – Resource = file
    • Stored on peers
    • Located by flooding
  – Problems
    • Flooding can cause network congestion
    • Limiting the TTL may yield incomplete results (especially for unpopular files)

Structured P2P Networks
• Use protocols for consistent storage and location of resources in P2P systems
  – Efficiency independent of resource popularity
• Typical measures of efficiency
  – Per-node amount of information for organizing storage and locating resources ("routing table")
  – Number of steps through the network ("hops") required to locate a resource
  – Number of operations for adding / removing peers
  – …
• In general measured with respect to the number of nodes N

Comparison of Approaches
[Figure: communication overhead plotted against state information per node (routing table + resources), both axes scaled O(1), O(log N), O(N); labeled points: Flooding, Structured P2P]

Hash Tables
• Mapping of resources to peers inspired by hash tables
  – Allow insertion, deletion, and lookup in O(1)
• A hash table is a fixed-size array
  – Elements of the array are also called (hash) buckets
• A hash function maps keys to elements of the array
• Properties of good hash functions
  – Fast to compute
  – Good distribution of keys over the hash table
  – Example: SHA-1 algorithm [3]

Hash Table Example
• Hash function for key x: hash(x) = x mod 10
• Insert the numbers 0, 1, 4, 9, 16, and 25
• Easy to check whether a given key is present in the table

  Bucket  0  1  2  3  4  5   6   7  8  9
  Key     0  1  -  -  4  25  16  -  -  9
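
The hash table example can be reproduced in a few lines. This is a minimal sketch that assumes at most one key per bucket (no collision handling, as in the example); the names `TABLE_SIZE`, `hash_key`, `insert`, and `contains` are illustrative and not part of the lecture.

```python
TABLE_SIZE = 10  # fixed-size array of buckets, as in the example


def hash_key(x: int) -> int:
    """Hash function from the example: hash(x) = x mod 10."""
    return x % TABLE_SIZE


def insert(table: list, x: int) -> None:
    """Store x in its bucket; this sketch assumes keys do not collide."""
    table[hash_key(x)] = x


def contains(table: list, x: int) -> bool:
    """Lookup in O(1): exactly one bucket is inspected."""
    return table[hash_key(x)] == x


table = [None] * TABLE_SIZE
for key in (0, 1, 4, 9, 16, 25):
    insert(table, key)

print(table)
print(contains(table, 16), contains(table, 36))
```

Keys 16 and 25 land in buckets 6 and 5. A lookup for 36 also inspects bucket 6 but finds 16 there, so 36 is reported as absent.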

Distributed Hash Tables (DHTs)
• Idea: Treat each node as a bucket in a hash table
  – Each resource has a key
  – Hash function maps the resource with key k to node n: hash(k) = n
• Problem: Node arrival / departure requires changing the hash table
  – Number of buckets and hash function must be adjusted
  – Resources must be moved between nodes

Consistent Hashing
• How to build a hash table that requires little movement when the table size changes?
  – Idea: Use a hierarchy of hash tables
    • Large hash table that is never changed
    • Small hash table: each bucket consists of a subset of the buckets of the large hash table
    • Resizing the small hash table requires moving only a few buckets of the large hash table (with few resources)
  – Movement cost: O(X/N) for X resources and N buckets in the small hash table

Consistent Hashing in DHTs
• Approach:
  – Assign an identifier (ID) to each resource
  – Assign each node responsibility for a closed subset of the ID space
  – Organize the overlay so that the node responsible for a given ID can be located quickly
[Figure: resources and nodes mapped onto the ID space, before and after a node arrival]

Consistent Hashing in DHTs
• What does the ID space look like?
  – Typically natural numbers, e.g., 0 to 2^128 - 1
    • Often considered as a logical ring
  – Each node has an ID from the ID space (node ID)
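
The difference between a plain hash table and consistent hashing on node arrival can be illustrated numerically. This is a minimal sketch, not the lecture's construction: it assumes a 32-bit ID space, hypothetical node and resource names, and the convention that a resource belongs to the first node ID at or after its own ID on the ring.

```python
import hashlib
from bisect import bisect_left

ID_BITS = 32  # assumption: small ID space to keep the numbers readable


def ident(name: str) -> int:
    """Map a name to an ID in [0, 2^ID_BITS - 1] using SHA-1."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** ID_BITS)


def naive_node(key: str, num_nodes: int) -> int:
    """Plain hash table: the bucket index depends on the number of nodes."""
    return ident(key) % num_nodes


def ring_node(key: str, node_ids: list) -> int:
    """Consistent hashing: first node ID at or after the key ID, wrapping around."""
    idx = bisect_left(node_ids, ident(key))
    return node_ids[idx % len(node_ids)]


keys = [f"resource-{i}" for i in range(10_000)]       # hypothetical resource names
old_ring = sorted(ident(f"node-{i}") for i in range(20))
new_id = ident("node-20")                              # the arriving node
new_ring = sorted(old_ring + [new_id])

moved_naive = sum(naive_node(k, 20) != naive_node(k, 21) for k in keys)
moved_ring = sum(ring_node(k, old_ring) != ring_node(k, new_ring) for k in keys)
print(f"moved with mod-N hashing:      {moved_naive} of {len(keys)}")
print(f"moved with consistent hashing: {moved_ring} of {len(keys)}")
```

With mod-N hashing almost every resource changes its node when N grows from 20 to 21. On the ring, only the resources in the interval taken over by the new node move, and all of them move to that node.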

Consistent Hashing in DHTs
• A node is responsible for the subset of the ID space "close to" its node ID

Chord [1], [2]
• Each node applies a hash function (SHA-1) to its current IP address
  – Node ID from [0, 2^m - 1] (usually m = 128; with SHA-1, m = 160)
  – Value space of the hash function is treated as a ring
• Properties of the hash function
  – (Pseudo-)uniform distribution of peers in the ID space [0, 2^m - 1]
  – Each peer has a unique successor and predecessor

Chord
• Resource IDs are also determined by a SHA-1 hash
  – E.g., hash over the content of an MP3 file
  – Each resource has a unique ID from [0, 2^m - 1]
• Rule: A peer is responsible for the resource IDs between the ID of its predecessor (exclusive) and its own ID (inclusive)

Routing in Chord
• 1st approach: hand the query from successor to successor until it reaches the responsible peer
• Advantages
  – Small routing tables
  – Each peer only needs to store a pointer to its successor
• Disadvantages
  – O(N) hops to the destination (for N peers)
  – High response times
  – Low robustness
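
The ID assignment, the responsibility rule, and the first routing approach can be sketched together. The following is an illustration under assumptions: a tiny ring with m = 7, a hypothetical IP address, node IDs taken from the lecture's later examples, and a SHA-1 value reduced modulo 2^m (the reduction is a choice of this sketch). The helper names are not from the lecture.

```python
import hashlib

M = 7  # assumption: tiny ring with IDs 0..127 so the example stays readable


def chord_id(data: bytes, m: int = M) -> int:
    """SHA-1 hash reduced to an m-bit ring ID (reduction is an assumption of this sketch)."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big") % (2 ** m)


def in_half_open(x: int, a: int, b: int) -> bool:
    """True if x lies in the ring interval (a, b], walking clockwise from a."""
    if a < b:
        return a < x <= b
    return x > a or x <= b  # interval wraps around 0 (whole ring if a == b)


def route_via_successors(start: int, key: int, ring: list) -> tuple:
    """Hand the query from successor to successor; return (responsible node, hops)."""
    nodes = sorted(ring)
    pos = nodes.index(start)
    hops = 0
    while True:
        current = nodes[pos]
        predecessor = nodes[pos - 1]
        # Responsibility rule: a peer owns the IDs in (predecessor, own ID].
        if in_half_open(key, predecessor, current):
            return current, hops
        pos = (pos + 1) % len(nodes)
        hops += 1


ring = [18, 40, 58, 76, 105, 120]            # node IDs from the lecture's later examples
print(chord_id(b"192.0.2.17"))                # hypothetical IP address of a joining node
print(route_via_successors(18, 110, ring))    # query visits 40, 58, 76, 105, 120
```

A query for ID 110 that starts at node 18 has to pass every node on the way to node 120, which is the O(N) behaviour listed as the disadvantage of this approach.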

Routing in Chord
• 2nd approach: Each peer knows all other peers
• Advantages
  – Each query can be processed within a single hop
• Disadvantages
  – Very large routing tables, size O(N)
  – Not appropriate for systems with many peers
    • High memory consumption
    • High maintenance traffic

Routing in Chord
• Routing in Chord is a compromise between approach 1 and approach 2
  – Each peer knows its successor and predecessor
  – Additionally, each peer maintains a finger table
    • Contains O(log N) entries for other peers
    • The peer in entry i has (approximate) distance 2^i from the peer's own ID (such a peer exists with high probability due to the SHA-1 hash function)
• Routing via the appropriate finger cuts the distance to the destination at least in half in each step
  – Routing table size O(log N)
  – Distance to destination O(log N) hops

Chord Routing Table Example
• Example for node 105

  Offset   ID    Real Node
  1        106   120
  2        107   120
  4        109   120
  8        113   120
  16       121   18
  32       9     18
  64       41    58
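
The routing table of node 105 can be recomputed from the ring. This sketch assumes m = 7 (IDs 0 to 127) and the node set 18, 40, 58, 76, 105, 120, which is taken from the lecture's replication example; the function names are illustrative.

```python
from bisect import bisect_left

M = 7                                  # ring with IDs 0..127
NODES = [18, 40, 58, 76, 105, 120]     # assumption: node set from the replication example


def successor(ident: int, nodes: list = NODES) -> int:
    """First node whose ID is equal to or follows ident on the ring."""
    ordered = sorted(nodes)
    return ordered[bisect_left(ordered, ident) % len(ordered)]


def finger_table(n: int, m: int = M) -> list:
    """Rows (offset, target ID, real node) for offsets 1, 2, 4, ..., 2^(m-1)."""
    rows = []
    for i in range(m):
        offset = 2 ** i
        target = (n + offset) % (2 ** m)   # IDs wrap around at 2^m
        rows.append((offset, target, successor(target)))
    return rows


for row in finger_table(105):
    print(row)
```

The offsets 32 and 64 wrap around the ring (105 + 32 = 137 becomes 9, 105 + 64 = 169 becomes 41), which is why the last two fingers point to nodes 18 and 58.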

Routing in Chord
• Idea: Search the finger table for the node that is closest to, but not behind, the destination ID
• Algorithms:

    // ask node n to find the successor of id
    n.find_successor(id)
      if id in (n, successor]
        return successor;
      else
        n' = closest_preceding_node(id);
        return n'.find_successor(id);

    // search the local table for the highest predecessor of id
    n.closest_preceding_node(id)
      for i = m downto 1
        if finger[i] in (n, id)
          return finger[i];
      return n;

Node Arrival
• Every node has a predecessor pointer
• Integration of a new node with ID n
  – Node n must know a node o that is part of the network
  – n asks o for n' = successor(n)
    • Can be done by the Chord routing algorithm
  – n contacts n' and becomes predecessor(n')

Stabilization Procedure
• Periodic check of successor / predecessor pointers

    // called periodically. verifies p's immediate
    // successor, and tells the successor about p.
    p.stabilize()
      n = successor.predecessor;
      if n in (p, successor)
        successor = n;
      successor.notify(p);

    // n' thinks it might be our predecessor.
    n.notify(n')
      if (predecessor is nil or n' in (predecessor, n))
        predecessor = n';

Stabilization of Finger Pointers

    // called periodically. refreshes finger table entries.
    // next stores the index of the next finger to fix.
    n.fix_fingers()
      next = next + 1;
      if (next > m)
        next = 1;
      finger[next] = find_successor(n + 2^(next-1));

Potential Sources of Errors

    // ask node n to find the successor of id
    n.find_successor(id)
      if id in (n, successor]
        return successor;              // Problem 1: outdated successor pointer
      else
        n' = closest_preceding_node(id);
        return n'.find_successor(id);

    // search the local table for the highest predecessor of id
    n.closest_preceding_node(id)
      for i = m downto 1
        if finger[i] in (n, id)
          return finger[i];            // Problem 2: outdated finger pointer
      return n;

Wrong Fingers, Correct Successors
[Figure: old finger pointer of node k for distance 2^i, pointing to node p; new nodes (unknown to k) lie between k + 2^i, the correct position of the finger pointer, and p; the destinations for which the finger pointer to p can still be used correctly are marked]
• Routing works even with an incorrect finger table!

Wrong Successors
• Routing does not work with wrong successor pointers!
• Data might not be transferred from n' to n!
• Thanks to periodic stabilization, errors are temporary; they must be handled by the application layer
  – E.g. repeated request / insertion
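
The lookup and stabilization pseudocode can be tried out in a single process. The following is a minimal sketch and not a reference implementation: nodes are plain Python objects instead of networked peers, m = 7 is assumed, fingers are indexed from 0, `find_successor` additionally counts forwarding hops, and `closest_preceding_node` falls back to the successor pointer when no finger qualifies (an addition of this sketch so that unrefreshed fingers cannot cause an endless loop).

```python
M = 7  # assumption: 7-bit ID space (IDs 0..127)


def between(x, a, b, closed_right=False):
    """True if x lies in the ring interval (a, b), or (a, b] if closed_right is set."""
    if closed_right and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b  # interval wraps around 0


class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self        # a lone node is its own successor
        self.predecessor = None
        self.finger = [self] * M     # finger[i] should hold successor(id + 2^i); starts out wrong
        self.next = -1

    def find_successor(self, ident, hops=0):
        """Return (responsible node, number of forwarding hops)."""
        if between(ident, self.id, self.successor.id, closed_right=True):
            return self.successor, hops
        return self.closest_preceding_node(ident).find_successor(ident, hops + 1)

    def closest_preceding_node(self, ident):
        for f in reversed(self.finger):
            if between(f.id, self.id, ident):
                return f
        return self.successor        # fallback added in this sketch

    def join(self, known):
        self.predecessor = None
        self.successor, _ = known.find_successor(self.id)

    def stabilize(self):
        x = self.successor.predecessor
        if x is not None and between(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, candidate):
        if self.predecessor is None or between(candidate.id, self.predecessor.id, self.id):
            self.predecessor = candidate

    def fix_fingers(self):
        self.next = (self.next + 1) % M
        target = (self.id + 2 ** self.next) % 2 ** M
        self.finger[self.next], _ = self.find_successor(target)


ring = [Node(18)]
for ident in (40, 58, 76, 105, 120):
    newcomer = Node(ident)
    newcomer.join(ring[0])
    ring.append(newcomer)
    for _ in range(3):               # a few stabilization rounds after each arrival
        for n in ring:
            n.stabilize()

stale = ring[0].find_successor(110)  # fingers never refreshed, successors correct
for _ in range(M):                   # refresh every finger of every node once
    for n in ring:
        n.fix_fingers()
fresh = ring[0].find_successor(110)
print(stale[0].id, stale[1], fresh[0].id, fresh[1])
```

Both lookups return node 120. With unrefreshed fingers the query is simply passed along the successor pointers, and with refreshed fingers it needs fewer forwarding steps. This matches the point that wrong fingers cost performance while wrong successors cost correctness.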

Size of Interval per Node
• IDs chosen uniformly at random
  – Ideal case: for N nodes, MAXID / N IDs per node
• In reality?
• Normalize ring size to 1
  – Ideal size of the interval a node is responsible for is 1/N
  – The probability that an interval is
    • larger by a factor of log N or
    • smaller by a factor of 1/N
    is very small

Size of Interval per Node
• What is a small / high probability?
  – Definition:
    • Low probability: 1 / N^c, c ≥ 1
    • High probability: 1 - 1 / N^c, c ≥ 1
• The larger the network, the closer a low (resp. high) probability is to 0 (resp. 1)

Size of Interval per Node
1. The probability that an arbitrary interval is smaller than Ω(1 / N^2) is low.
2. The probability that an arbitrary interval is larger than O(log N / N) is low.

Size of Interval per Node
• Consider an interval I of size 1 / N^2 behind an arbitrary node
• E_i: event "Node i is located in interval I"
• E: event "Any node is located in interval I"

Size of Interval per Node
• Consider an interval I of size log(N) / N behind a node
• E_i: event "Node i is located in interval I"
• Ē_i: event "Node i is not located in interval I"
• E: event "No node is located in interval I"
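
The formulas belonging to the last two slides are not part of the transcript. One standard way to obtain both bounds is sketched below as a reconstruction, not necessarily the lecture's exact derivation. It assumes that the N node IDs are independent and uniformly distributed on the normalized ring, takes log as the natural logarithm, and uses the union bound for the first claim and the inequality 1 - x ≤ e^{-x} for the second.

```latex
% Claim 1: interval I of size 1/N^2 behind a node; E = "any node is located in I"
P(E_i) = \frac{1}{N^2}, \qquad
P(E) = P\Bigl(\bigcup_{i=1}^{N} E_i\Bigr) \le \sum_{i=1}^{N} P(E_i)
     = N \cdot \frac{1}{N^2} = \frac{1}{N}

% Claim 2: interval I of size \ln(N)/N behind a node; E = "no node is located in I"
P(\bar{E}_i) = 1 - \frac{\ln N}{N}, \qquad
P(E) = \prod_{i=1}^{N} P(\bar{E}_i) = \Bigl(1 - \frac{\ln N}{N}\Bigr)^{N}
     \le e^{-\ln N} = \frac{1}{N}
```

Under these assumptions both events have probability at most 1/N, i.e. a low probability in the sense of the definition with c = 1.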

Number of Fingers per Node
• Consider the normalized ring of size 1
  – Fingers of a node point to +1/2, +1/4, +1/8, +1/16, …
• Starting with the first finger whose distance is smaller than the interval size, all further fingers are identical
  – There is no node between the node and its successor
• We are searching for fingers with distance 1/2^i ≥ 1/N^2, i.e. i ≤ 2 log N (smaller intervals occur only with low probability)
• Thus, we have O(log N) different finger pointers (w.h.p.)

Routing Performance
• O(log N) hops with high probability
• Consider a message from node n to ID k
  – Node z is responsible for k
  – Node p is the predecessor of node z: how many hops to p?
• n ≠ p: n sends the message to the closest predecessor of p in its finger table
• Consider i with p in [n + 2^(i-1), n + 2^i)
  – Interval is not empty (contains p)
  – n sends to the first node in the interval via its finger table
  – Message bridges a gap larger than or equal to 2^(i-1)

Routing Performance
• Both the new "current" node (which holds the message now) and p are in [n + 2^(i-1), n + 2^i)
  – Length of the interval is 2^(i-1)
  – Remaining distance to p ≤ 2^(i-1)
  – Progress in this step is at least half the distance to k
  – Remaining distance is at least halved in each step

  Steps                1     2       3       …   log N                2 log N
  Remaining distance   1/2   1/2^2   1/2^3   …   1/2^(log N) = 1/N    1/2^(2 log N) = 1/N^2

• After 2 log N steps the remaining distance is 1/N^2
  – Recall: With high probability there is no more than one node in this interval
• Routing in O(log N) steps with high probability

Node Failures
• Node recognizes failure of its successor
  – E.g. by missing keep-alive messages
  – Where to get a new successor?
• Solution: store multiple successor pointers
  – Pass the list counterclockwise periodically
• Recall: Routing is correct even if fingers are wrong!

Heavy Churn
• What happens if nodes are joining and leaving frequently?
• Example:

    p.stabilize()
      n = successor.predecessor;
      if n in (p, successor)
        successor = n;
      successor.notify(p);

    n.notify(n')
      if (predecessor is nil or n' in (predecessor, n))
        predecessor = n';

Data Replication
• Successor of a failing node takes over its ID space
  – Thus, data is replicated on the successors

  Node   Root        Replica 1   Replica 2
  18     [121,18]    [106,120]   [77,105]
  40     [19,40]     [121,18]    [106,120]
  58     [41,58]     [19,40]     [121,18]
  76     [59,76]     [41,58]     [19,40]
  105    [77,105]    [59,76]     [41,58]
  120    [106,120]   [77,105]    [59,76]

Storing Data
• Each node possesses local data (e.g. music files)
  – (Meta-)data is published in the network by the origin node
  – (Meta-)data is stored by the root node selected by the hash function
• Soft state
  – Data on a remote node is subject to a timeout
  – Origin node must refresh data periodically
  – Data of failed nodes will vanish after a while
• Replication on replica nodes is performed by the root node
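
The replication table follows mechanically from the ring. The sketch below assumes a 7-bit ID space and two replicas per interval, as in the table; `root_interval` and `replication_table` are illustrative names, not part of the lecture.

```python
NODES = [18, 40, 58, 76, 105, 120]   # ring from the replication table
RING_SIZE = 128                       # assumption: 7-bit ID space


def root_interval(index: int, nodes: list) -> tuple:
    """IDs a node is root for: (predecessor, node], written as [predecessor + 1, node]."""
    pred = nodes[index - 1]           # index 0 wraps around to the last node
    return ((pred + 1) % RING_SIZE, nodes[index])


def replication_table(nodes: list, replicas: int = 2) -> dict:
    """For each node: its root interval, then the intervals it stores as replica 1, 2, ..."""
    ordered = sorted(nodes)
    table = {}
    for i, node in enumerate(ordered):
        # Replica k held by a node is the root interval of its k-th predecessor.
        table[node] = [root_interval((i - k) % len(ordered), ordered)
                       for k in range(replicas + 1)]
    return table


for node, intervals in replication_table(NODES).items():
    print(node, intervals)

# If node 40 fails, its successor 58 becomes root for [19, 58]; it already
# held [19, 40] as replica 1, so no data has to be fetched first.
print(replication_table([18, 58, 76, 105, 120])[58][0])
```

Because every interval is replicated on the next two nodes clockwise, the successor of a failed node already stores the data for the ID space it takes over.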

Data Flow
[Figure: nodes 18, 40, 58, 76, 105, and 120, each holding Local, Root, and Replica data stores, with the data flow between them]

Node Arrival
• Problem: A new node becomes responsible for an interval of IDs
  – How does it get the data for those IDs?
• Possible solutions
  – Forward queries to the formerly responsible node until the data is copied
  – Integrate the node into the network only after the data is copied
  – Ignore the problem and wait for soft-state updates

Chord - Summary
• Structured P2P system based on a distributed hash table
  – Resources and nodes mapped into the same key space
  – Overlay: ring structure with fingers to enable efficient searching
  – Successor & predecessor pointers to increase robustness
• Advantages
  – Small state: O(log N)
  – Queries resolved with few hops: O(log N)
• Disadvantages
  – Requires explicit stabilization
  – Nevertheless sensitive to churn
• Outlook: Kademlia uses a self-stabilizing approach

References
[1] I. Stoica, R. Morris, D. Liben-Nowell, D. Karger, F. Kaashoek, F. Dabek, and H. Balakrishnan, "Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications", IEEE/ACM Trans. on Networking, Vol. 11, No. 1, pp. 17-32, February 2003.
[2] I. Stoica, R. Morris, D. Karger, F. Kaashoek, and H. Balakrishnan, "Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications", MIT TR 819.
[3] D. Eastlake and T. Hansen, "US Secure Hash Algorithms (SHA and SHA-based HMAC and HKDF)", RFC 6234, May 2011.

Contact
Integrated Communication Systems Group
Ilmenau University of Technology
PD Dr. rer. nat. habil. Oliver Waldhorst
phone: +49 (0)3677 69 2788
fax: +49 (0)3677 69 1226
e-mail: oliver.waldhorst@tu-ilmenau.de
Visitors address: Technische Universität Ilmenau, Helmholtzplatz 5, Zuse Building, room 1066, D-98693 Ilmenau
www.tu-ilmenau.de/ics