Special Aspects in Communication Networks

Structured Peer-to-Peer Systems
26.01.2012
PD Dr. Oliver Waldhorst
Based on lectures by Dominic Battré and Thorsten Strufe
Prof. Dr.-Ing. habil. Andreas Mitschele-Thiel
Integrated Communication Systems Group
www.tu-ilmenau.de/ics
Peer-to-Peer and Overlay Networks
PD Dr. Oliver Waldhorst
The “Lookup”-Problem
• Central problem in P2P systems:
– “I want to use resource X – who is responsible for it?”
Source: D. Battré, VL P2P Netzwerke, 2010
Locating Resources
• Given
– A set of nodes (devices, peers)
– A set of resources distributed among the nodes
• Basic questions:
– Where is a resource stored?
– How can it be located?
• Examples for resources
– Files (e.g. in Gnutella)
– References to files (e.g. in Napster)
– Tables of a database
– Entries of database tables
– Any other kind of information
Napster vs. Gnutella
• Napster
– Resources = metadata for files (including reference to file)
• Stored on the (central) index server
• Located by searching the index
– Problems: Reliability, Scalability
• Gnutella
– Resource = file
• Stored on peers
• Located by flooding
– Problems
• Flooding can cause network congestion
• Limiting TTL may result in incomplete results
(especially for unpopular files)
Structured P2P Networks
• Use protocols for consistent storage and location of
resources in P2P systems
– Efficiency independent of resource popularity
• Typical measures for efficiency
– Per node amount of information for organizing storage and
locating resources (“routing table”)
– Number of steps through the network (“hops”) required for
locating resource
– Number of operations for adding / removing peers
– …
• In general measured with respect to number of nodes N
Comparison of Approaches
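[Figure: communication overhead (vertical axis) versus state information per node, i.e. routing table + resources (horizontal axis); both axes are marked O(1), O(log N), O(N)]
• Flooding: communication overhead O(N), state information O(1)
• Structured P2P: communication overhead O(log N), state information O(log N)
• Central index server (e.g. Napster): communication overhead O(1), state information O(N)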
Hash Tables
• Mapping of resources to peers inspired by hash tables
– Allow insertion, deletion, lookup in O(1)
• Hash table is a fixed-size array
– Elements of array also called (hash) buckets
• Hash function maps keys to elements in the array
• Properties of good hash functions
– Fast to compute
– Good distribution of keys into hash table
– Example: SHA-1 algorithm [3]
Hash Table Example
• Hash function for key x:
hash(x) = x mod 10
• Insert numbers 0, 1, 4, 9, 16, and 25
• Easy to find if a given key is present in the table
Bucket | Key
0 | 0
1 | 1
2 | (empty)
3 | (empty)
4 | 4
5 | 25
6 | 16
7 | (empty)
8 | (empty)
9 | 9
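The example table can be reproduced with a few lines of Python. This is a minimal sketch: the helper names `bucket_of`, `insert`, and `contains` are hypothetical, and chaining inside a bucket is an assumption, since the slide does not discuss collisions.

```python
# Fixed-size hash table with 10 buckets and hash(x) = x mod 10.
# Each bucket is a list (chaining), an assumption not made in the slides.
NUM_BUCKETS = 10

def bucket_of(key: int) -> int:
    """Hash function from the example: maps a key to a bucket index."""
    return key % NUM_BUCKETS

table = [[] for _ in range(NUM_BUCKETS)]

def insert(key: int) -> None:
    bucket = table[bucket_of(key)]
    if key not in bucket:
        bucket.append(key)

def contains(key: int) -> bool:
    # Only one bucket has to be inspected, independent of the table size.
    return key in table[bucket_of(key)]

for key in (0, 1, 4, 9, 16, 25):
    insert(key)
```

A lookup such as `contains(16)` inspects only bucket 6, which is why insertion and lookup take O(1) on average.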
Distributed Hash Tables (DHTs)
• Idea: Treat node as bucket in a hash table
– Each resource has a key
– Hash function maps resource with key k to node n
hash(k) = n
[Figure: resources mapped to nodes]
Source: D. Battré, VL P2P Netzwerke, 2010
• Problem:
Node arrival / departure requires changing hash table
– Number of buckets & hash function must be adjusted
– Resources must be moved between nodes
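How many resources have to move when the number of buckets changes can be estimated with a short experiment. The sketch below assumes hash(k) = SHA-1(k) mod N and hypothetical resource names; it counts the keys whose node changes when an 11th node joins a 10-node system.

```python
import hashlib

def key_id(key: str) -> int:
    """Interpret the SHA-1 digest of the key as a large integer."""
    return int.from_bytes(hashlib.sha1(key.encode()).digest(), "big")

def node_for(key: str, num_nodes: int) -> int:
    """Naive DHT mapping: node index = hash(key) mod number of nodes."""
    return key_id(key) % num_nodes

# Hypothetical resource names, used only for this illustration.
keys = [f"resource-{i}" for i in range(10_000)]

# A key stays in place only if both mappings happen to agree.
moved = sum(1 for k in keys if node_for(k, 10) != node_for(k, 11))
fraction_moved = moved / len(keys)
```

With a well-mixing hash function, a key keeps its node only with probability 1/11, so roughly 10 out of 11 resources are moved by a single node arrival.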
Consistent Hashing
• How to build a hash table that requires little data movement
when changing table size?
– Idea: Use hierarchy of hash tables
• Large hash table that is never changed
• Small hash table: Buckets consist of subset of buckets of
large hash table
• Resizing small hash table requires only movement of a
few buckets of large hash table (with few resources)
– O(X/N) for X resources, N buckets in small hash table
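The two-level idea can be sketched as follows. All names and sizes are assumptions for illustration: a large table with 1024 fixed buckets, a small table that assigns each large bucket to one of N nodes, and a new node that takes over every (N+1)-th large bucket.

```python
import hashlib

LARGE_SIZE = 1024  # number of buckets in the large table (assumed value)

def large_bucket(key: str) -> int:
    """Mapping into the large table; this function never changes."""
    digest = hashlib.sha1(key.encode()).digest()
    return int.from_bytes(digest, "big") % LARGE_SIZE

def initial_assignment(num_nodes: int) -> list:
    """owner[b] = node (bucket of the small table) holding large bucket b."""
    return [b % num_nodes for b in range(LARGE_SIZE)]

def add_node(owner: list, num_nodes: int) -> list:
    """The new node (index num_nodes) takes over every (N+1)-th large bucket."""
    new_owner = list(owner)
    for b in range(0, LARGE_SIZE, num_nodes + 1):
        new_owner[b] = num_nodes
    return new_owner

keys = [f"resource-{i}" for i in range(10_000)]  # hypothetical resources
before = initial_assignment(10)
after = add_node(before, 10)
moved = sum(1 for k in keys if before[large_bucket(k)] != after[large_bucket(k)])
fraction_moved = moved / len(keys)
```

Only the resources in the reassigned large buckets move, which is about X/(N+1) of the X resources here, in line with the O(X/N) bound.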
Consistent Hashing in DHTs
• Approach:
– Assign identifier (ID) to each resource
– Assign nodes responsibility for closed subset of ID space
– Organize overlay in a way that enables quick location of node
responsible for a given ID
[Figure: resources and nodes mapped into the ID space, before and after a node arrival]
Source: D. Battré, VL P2P Netzwerke, 2010
Consistent Hashing in DHTs
• What does the ID space look like?
– Typically natural numbers, e.g., 0 to 2^128 - 1
– Each node has an ID out of the ID space (node ID)
• Often considered as a logical ring
Source: D. Battré, VL P2P Netzwerke, 2010
Consistent Hashing in DHTs
• Node is responsible for a subset of ID space
“close to” its node id
Source: D. Battré, VL P2P Netzwerke, 2010
Chord [1], [2]
• Each node applies (SHA1) hash function to current IP address
– Node ID from [0, 2^m - 1] (usually m = 128, with SHA1 m = 160)
– Value space of hash function treated as ring
• Properties of the hash function
– (Pseudo-)uniform distribution of peers in ID space [0, 2^m - 1]
– Peer has unique successor and predecessor
Source: D. Battré, VL P2P Netzwerke, 2010
Chord
• Resource IDs also determined by SHA1 hash
– E.g., hash over content of MP3 file
→ Resource has unique ID from [0, 2^m - 1]
• Rule: A peer is responsible for the resource IDs between the ID of its
predecessor (exclusive) and its own ID (inclusive)
Source: D. Battré, VL P2P Netzwerke, 2010
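The ID assignment and the responsibility rule can be sketched in Python. The function names `chord_id` and `responsible_peer` are hypothetical, the IP addresses are documentation addresses, and a global sorted list of peer IDs stands in for the overlay (a real peer does not have this global view).

```python
import hashlib
from bisect import bisect_left

M = 160  # SHA-1 yields 160-bit IDs

def chord_id(data: bytes, m: int = M) -> int:
    """Map arbitrary bytes (IP address, file content) into [0, 2^m - 1]."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big") % (2 ** m)

def responsible_peer(resource_id: int, peer_ids: list) -> int:
    """First peer ID >= resource_id on the ring, wrapping around at the end."""
    ring = sorted(peer_ids)
    index = bisect_left(ring, resource_id)
    return ring[index % len(ring)]

# Node IDs from (assumed) IP addresses, resource ID from file content.
peers = [chord_id(ip.encode()) for ip in ("192.0.2.1", "192.0.2.2", "192.0.2.3")]
owner = responsible_peer(chord_id(b"content of some MP3 file"), peers)
```

On a small ring with peers 18, 40, 58, 76, 105, and 120, `responsible_peer(106, ...)` yields 120 and `responsible_peer(121, ...)` wraps around to 18.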
Routing in Chord
• 1st approach: hand query to successor until it reaches
responsible peer
• Advantages
– Small routing tables
– Each peer must only store
pointer to successor
• Disadvantage
– O(N) hops to destination
(for N peers)
– High response times
– Low robustness
Source: D. Battré, VL P2P Netzwerke, 2010
Routing in Chord
• 2nd approach: Each peer knows all other peers
• Advantages
– Each query can be
processed within a
single hop
• Disadvantage
– Very large routing tables,
size O(N)
– Not appropriate for systems
with many peers
• High memory consumption
• High maintenance traffic
Source: D. Battré, VL P2P Netzwerke, 2010
Routing in Chord
• Routing in Chord is a compromise between Approach 1
and Approach 2
– Each peer knows successor and predecessor
– Additionally each peer maintains finger table
• Contains O(log N) entries for other peers
• Entry i points to a peer at (approximate) distance 2^i from the peer's own ID
(Such a peer exists with high probability due to SHA1
hash function)
• Routing to appropriate successor divides distance to
destination in half in each step
– Routing table size O(log N)
– Distance to destination O(log N)
Chord Routing Table Example
• Example for node 105
Offset | ID | Real node
1 | 106 | 120
2 | 107 | 120
4 | 109 | 120
8 | 113 | 120
16 | 121 | 18
32 | 9 | 18
64 | 41 | 58
Source: D. Battré, VL P2P Netzwerke, 2010
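The table for node 105 can be recomputed with a short sketch. It assumes a 7-bit ID space (m = 7, IDs 0 to 127) and the node IDs 18, 40, 58, 76, 105, and 120 used in the replication example of these slides; `successor` and `finger_table` are hypothetical helper names.

```python
from bisect import bisect_left

M = 7                                # assumed: ID space [0, 2^7 - 1]
NODES = [18, 40, 58, 76, 105, 120]   # sorted node IDs on the ring

def successor(ident: int) -> int:
    """First node with ID >= ident, wrapping around the ring."""
    index = bisect_left(NODES, ident % 2 ** M)
    return NODES[index % len(NODES)]

def finger_table(n: int) -> list:
    """Rows (offset, target ID, real node) for offsets 1, 2, 4, ..., 2^(M-1)."""
    rows = []
    for i in range(M):
        offset = 2 ** i
        target = (n + offset) % 2 ** M
        rows.append((offset, target, successor(target)))
    return rows

for row in finger_table(105):
    print(row)
```

Several entries point to the same real node (120 and 18), which is why a node effectively stores only O(log N) distinct fingers.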
Routing in Chord
• Idea: Search for node in finger table that is closest but
not behind destination ID
• Algorithms:
// ask node n to find the successor of id
n.find_successor(id)
  if id in (n, successor]
    return successor;
  else
    n' = closest_preceding_node(id);
    return n'.find_successor(id);

// search the local table for the highest predecessor of id
n.closest_preceding_node(id)
  for i = m downto 1
    if finger[i] in (n, id)
      return finger[i];
  return n;
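A runnable version of the two procedures might look as follows. This is a sketch under assumptions: a 7-bit ID space, the example nodes 18 to 120, zero-based finger indices, and a helper `build_ring` that fills all pointers correctly from global knowledge (which a real peer does not have).

```python
M = 7
RING = 2 ** M

def in_open(x, a, b):
    """True if x lies in the ring interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b          # interval wraps around 0

def in_half_open(x, a, b):
    """True if x lies in the ring interval (a, b]."""
    return x == b or in_open(x, a, b)

class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self
        self.finger = []           # finger[i] = successor(id + 2^i)

    def find_successor(self, ident):
        if in_half_open(ident, self.id, self.successor.id):
            return self.successor
        return self.closest_preceding_node(ident).find_successor(ident)

    def closest_preceding_node(self, ident):
        for f in reversed(self.finger):      # highest finger first
            if in_open(f.id, self.id, ident):
                return f
        return self

def build_ring(ids):
    """Create nodes with correct successor and finger pointers."""
    ids = sorted(ids)
    nodes = {i: Node(i) for i in ids}
    def succ(x):
        x %= RING
        return nodes[next((i for i in ids if i >= x), ids[0])]
    for n in nodes.values():
        n.successor = succ(n.id + 1)
        n.finger = [succ(n.id + 2 ** i) for i in range(M)]
    return nodes

nodes = build_ring([18, 40, 58, 76, 105, 120])
result = nodes[105].find_successor(50).id
```

In this ring, the query for ID 50 issued at node 105 is forwarded via node 18 and node 40 and ends at node 58.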
Node Arrival
• Every node has a predecessor pointer
• Integration of new node with ID n
– Node n must know a node o that is part of the network
– n asks o for n’ = successor(n)
• Can be done by Chord routing algorithm
– n contacts n’ and becomes predecessor(n’)
Source: D. Battré, VL P2P Netzwerke, 2010
Stabilization Procedure
• Frequent check of successor / predecessor pointer
// called periodically. verifies p's immediate
// successor, and tells the successor about p.
p.stabilize()
  n = successor.predecessor;
  if n in (p, successor)
    successor = n;
  successor.notify(p);

// n' thinks it might be our predecessor.
n.notify(n')
  if (predecessor is nil or n' in (predecessor, n))
    predecessor = n';
Source: D. Battré, VL P2P Netzwerke, 2010
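The interplay of join, stabilize, and notify can be tried out in a small simulation. The sketch below is a simplification under assumptions: `join` walks the ring along successor pointers instead of using finger-based routing, and stabilization rounds are run in a fixed order rather than by timers.

```python
def in_open(x, a, b):
    """True if x lies in the ring interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b

class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self
        self.predecessor = None

    def join(self, known):
        # Find successor(self.id) by walking the ring (simplified lookup).
        node = known
        while not (self.id == node.successor.id
                   or in_open(self.id, node.id, node.successor.id)):
            node = node.successor
        self.successor = node.successor

    def stabilize(self):
        candidate = self.successor.predecessor
        if candidate is not None and in_open(candidate.id, self.id, self.successor.id):
            self.successor = candidate
        self.successor.notify(self)

    def notify(self, other):
        if self.predecessor is None or in_open(other.id, self.predecessor.id, self.id):
            self.predecessor = other

a, b, c = Node(18), Node(58), Node(40)
b.join(a)
for _ in range(3):
    for node in (a, b):
        node.stabilize()
c.join(a)                      # node 40 arrives between 18 and 58
for _ in range(3):
    for node in (a, b, c):
        node.stabilize()
```

After the join, node 40 only knows its successor 58; the following stabilization rounds repair the successor pointer of 18 and the predecessor pointers of 40 and 58.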
Stabilization of Finger Pointers
// called periodically. refreshes finger table entries.
// next stores the index of the next finger to fix.
n.fix_fingers()
  next = next + 1;
  if (next > m)
    next = 1;
  finger[next] = find_successor(n + 2^(next-1));
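The round-robin refresh can be sketched as follows. As an assumption for illustration, `find_successor` is replaced by a lookup in a global sorted list of the example nodes (m = 7), and the class name `FingerFixer` is hypothetical.

```python
from bisect import bisect_left

M = 7
NODES = [18, 40, 58, 76, 105, 120]   # example ring, assumed m = 7

def find_successor(ident):
    """Stand-in for Chord routing: answer from a global sorted list."""
    index = bisect_left(NODES, ident % 2 ** M)
    return NODES[index % len(NODES)]

class FingerFixer:
    def __init__(self, n):
        self.n = n
        self.next = 0
        self.finger = [None] * (M + 1)   # index 0 unused, as in the pseudocode

    def fix_fingers(self):
        """Refresh one finger per call, cycling through indices 1..M."""
        self.next += 1
        if self.next > M:
            self.next = 1
        self.finger[self.next] = find_successor(self.n + 2 ** (self.next - 1))

node = FingerFixer(105)
for _ in range(M):                       # M calls refresh the whole table once
    node.fix_fingers()
```

Refreshing a single entry per call spreads the maintenance traffic over time; a complete refresh of the table takes m periods.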
Potential Sources of Errors
// ask node n to find the successor of id
n.find_successor(id)
  if id in (n, successor]
    return successor;        // Problem 1: Outdated successor pointer
  else
    n' = closest_preceding_node(id);
    return n'.find_successor(id);

// search the local table for the highest predecessor of id
n.closest_preceding_node(id)
  for i = m downto 1
    if finger[i] in (n, id)
      return finger[i];      // Problem 2: Outdated finger pointer
  return n;
Wrong Fingers, Correct Successors
[Figure labels:]
– Old finger pointer of node k for distance 2^i (points to node p)
– k + 2^i (correct position of finger pointer)
– New nodes (unknown to k)
– Destinations for which finger pointer to p can be used correctly
• Routing works even with an incorrect finger table!
Source: D. Battré, VL P2P Netzwerke, 2010
Wrong Successors
• Routing does not work with wrong successor pointers!
• Data might not be transferred from n’ to n!
• Due to periodic stabilization, errors are temporary; they
must be handled by the application layer
– E.g. repeated request / insertion
Source: D. Battré, VL P2P Netzwerke, 2010
Size of Interval per Node
• IDs chosen uniformly at random
– Ideal case: for N nodes MAXID / N IDs per node
• In reality?
• Normalize ring size to 1
– Ideal size of interval a node is responsible for is 1/N
– Probability for an interval
• larger by factor log N or
• smaller by factor 1/N
is very small
Size of Interval per Node
• What is a small / high probability?
– Definition:
• Low probability: 1 / N^c, c ≥ 1
• High probability: 1 − 1 / N^c, c ≥ 1
The larger the network, the closer a low (resp. high) probability is to 0 (resp. 1)
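A few numbers make the definition concrete. The sketch below simply evaluates both expressions for some assumed network sizes with c = 1; the function names are hypothetical.

```python
def low_probability(n: int, c: int = 1) -> float:
    """Upper bound 1 / N^c for an event that occurs with low probability."""
    return 1 / n ** c

def high_probability(n: int, c: int = 1) -> float:
    """Lower bound 1 - 1 / N^c for an event that occurs with high probability."""
    return 1 - low_probability(n, c)

for n in (10, 1_000, 1_000_000):
    print(n, low_probability(n), high_probability(n))
```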
Size of Interval per Node
1. The probability for an arbitrary interval to be smaller
than Ω(1 / N^2) is low.
2. The probability for an arbitrary interval to be larger than
O(log N / N) is low.
Size of Interval per Node
• Consider interval I of size 1 / N^2 behind an arbitrary node
• E_i: Event “Node i is located in interval I”
• E: Event “Any node is located in interval I”
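The bound these events lead to can be written out as follows. This is a sketch of the standard argument, assuming node IDs are chosen independently and uniformly at random on the ring normalized to size 1, and using the union bound as the (assumed) proof step:

```latex
P(E_i) = \frac{1}{N^2}, \qquad
P(E) = P\Bigl(\bigcup_{i=1}^{N} E_i\Bigr)
     \le \sum_{i=1}^{N} P(E_i)
     = N \cdot \frac{1}{N^2}
     = \frac{1}{N}
```

Hence the probability that the interval behind a node is smaller than 1/N^2 is at most 1/N, which is a low probability with c = 1.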
Size of Interval per Node
• Consider interval I of size log(N) / N behind a node
• E_i: Event “Node i is located in interval I”
• Ē_i: Event “Node i is not located in interval I”
• E: Event “No node is located in interval I”
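The corresponding bound can be sketched as follows, again assuming independent, uniformly distributed node IDs on the normalized ring, the natural logarithm, and the inequality 1 - x ≤ e^(-x):

```latex
P(\overline{E_i}) = 1 - \frac{\log N}{N}, \qquad
P(E) = \prod_{i=1}^{N} P(\overline{E_i})
     = \Bigl(1 - \frac{\log N}{N}\Bigr)^{N}
     \le e^{-\log N}
     = \frac{1}{N}
```

Hence the probability that an interval of size log(N) / N behind a node contains no other node is at most 1/N, which is again a low probability with c = 1.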
Number of Fingers per Node
• Consider normalized ring of size 1
– Fingers of a node point to +1/2, +1/4, +1/8, +1/16, …
• Starting with the finger to a distance smaller than interval size
all further fingers are identical
– No node in between → finger points to successor
• We are searching for fingers with a distance of at least the minimum interval size
• Thus, we have O(log N) different finger pointers (w.h.p)
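The count of distinct fingers can be made explicit. As a sketch, assume the minimum interval size 1/N^2 that holds with high probability, and let finger i point to distance 1/2^i on the normalized ring:

```latex
\frac{1}{2^{i}} \ge \frac{1}{N^{2}}
\iff 2^{i} \le N^{2}
\iff i \le 2 \log_2 N
```

So at most 2 log_2 N fingers can point to different nodes, which gives the O(log N) bound.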
Routing Performance
• O(log N) with high probability
• Consider a message from node n to ID k
– Node z responsible
– Node p is predecessor of node z → how many hops to p?
• n ≠ p: n sends message to closest predecessor of p in
finger table
• Consider i with p in [n + 2^(i-1), n + 2^i)
– Interval is not empty (contains p)
– n sends to first node in interval via finger table
– Message bridges a gap larger or equal to 2^(i-1)
Routing Performance
• Both new “current” node (that holds the message now) and p
are in [n + 2^(i-1), n + 2^i)
– Length of interval is 2^(i-1) → Remaining distance to p ≤ 2^(i-1)
→ Progress in this step at least half the distance to k
– Remaining distance is halved in each step
Steps | Remaining distance
1 | 1/2
2 | 1/2^2
3 | 1/2^3
… | …
log N | 1/2^(log N) = 1/N
2 log N | 1/2^(2 log N) = 1/N^2
• After 2 log N steps remaining distance is 1/N^2
– Recall: With high probability there is no more than one node in
this interval
• Routing in O(log N) steps with high probability
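The hop count can also be checked empirically. The following sketch assumes a 16-bit ID space, random node IDs, and perfectly correct finger tables built from global knowledge; `simulate` is a hypothetical helper that returns the number of hops for each random query.

```python
import random
from bisect import bisect_left

M = 16
RING = 2 ** M

def in_open(x, a, b):
    """True if x lies in the ring interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b

def simulate(num_nodes, num_queries, seed=1):
    rng = random.Random(seed)
    ids = sorted(rng.sample(range(RING), num_nodes))

    def successor(x):
        index = bisect_left(ids, x % RING)
        return ids[index % len(ids)]

    # fingers[n][i] = successor(n + 2^i); fingers[n][0] is n's successor.
    fingers = {n: [successor(n + 2 ** i) for i in range(M)] for n in ids}
    hops_per_query = []
    for _ in range(num_queries):
        node, key, hops = rng.choice(ids), rng.randrange(RING), 0
        # Forward until key falls into (node, successor(node)].
        while not (key == fingers[node][0] or in_open(key, node, fingers[node][0])):
            node = next(f for f in reversed(fingers[node]) if in_open(f, node, key))
            hops += 1
        hops_per_query.append(hops + 1)   # final hop to the responsible node
    return hops_per_query

hops = simulate(num_nodes=1000, num_queries=2000)
average_hops = sum(hops) / len(hops)
```

Because every forwarding step at least halves the remaining distance, no query can take more than m + 1 hops in this setting; the average is typically well below log_2 N.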
Node Failures
• Node recognizes failure of successor
– E.g. by missing keep-alive messages
– Where to get a new successor?
• Solution: store multiple successor pointers
– Pass list counterclockwise periodically
• Recall: Routing is correct even if fingers are wrong!
Heavy Churn
What happens if nodes are joining and leaving frequently?
Example:
p.stabilize()
  n = successor.predecessor;
  if n in (p, successor)
    successor = n;
  successor.notify(p);

n.notify(n')
  if (predecessor is nil or n' in (predecessor, n))
    predecessor = n';
Source: D. Battré, VL P2P Netzwerke, 2010
Data Replication
• Successor of a failing node takes over ID space
– Thus, data is replicated on the successor
Node | Root | Replica 1 | Replica 2
18 | [121,18] | [106,120] | [77,105]
40 | [19,40] | [121,18] | [106,120]
58 | [41,58] | [19,40] | [121,18]
76 | [59,76] | [41,58] | [19,40]
105 | [77,105] | [59,76] | [41,58]
120 | [106,120] | [77,105] | [59,76]
Source: D. Battré, VL P2P Netzwerke, 2010
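The table follows a simple rule: each node stores its own root interval plus the root intervals of its two predecessors. A sketch that recomputes it, assuming a ring of size 128 and hypothetical helper names:

```python
NODES = [18, 40, 58, 76, 105, 120]   # sorted node IDs from the table
RING = 128                           # assumed size of the ID space

def root_interval(index):
    """Interval (predecessor + 1, own ID) the node at NODES[index] is root for."""
    pred = NODES[index - 1]          # index -1 wraps to the last node
    return ((pred + 1) % RING, NODES[index])

def replication_row(index, replicas=2):
    """Own root interval followed by the root intervals of the predecessors."""
    return [root_interval((index - k) % len(NODES)) for k in range(replicas + 1)]

table = {NODES[i]: replication_row(i) for i in range(len(NODES))}
```

If node 105 fails, its successor 120 already holds [77,105] as Replica 1 and can take over that part of the ID space.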
Storing Data
• Each node possesses local data (e.g. music files)
– (Meta-)data is published in the network by origin node
– (Meta-)data is stored by root node selected by hash
function
• Soft-state
– Data on remote node subject to timeout
– Origin node must refresh data periodically
– Data of failing nodes will vanish after a while
• Replication on replica nodes performed by root node
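The soft-state principle can be sketched with a small store on the root node. The class name `SoftStateStore`, the timeout value, and the explicit `now` parameter (used instead of a real clock to keep the example deterministic) are assumptions for illustration.

```python
class SoftStateStore:
    """Entries expire unless the origin node refreshes them in time."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.entries = {}                     # key -> (value, expiry time)

    def publish(self, key, value, now):
        """Insert or refresh an entry; called periodically by the origin node."""
        self.entries[key] = (value, now + self.timeout)

    def lookup(self, key, now):
        entry = self.entries.get(key)
        if entry is None or entry[1] <= now:
            self.entries.pop(key, None)       # expired data vanishes
            return None
        return entry[0]

store = SoftStateStore(timeout=30)
store.publish("song.mp3", "peer 105", now=0)
store.publish("song.mp3", "peer 105", now=20)  # refresh extends lifetime to t = 50
```

A lookup at t = 40 still succeeds because of the refresh; once the origin node stops refreshing (for example after a failure), a lookup at t = 60 finds nothing.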
Data Flow
[Figure: each of the nodes 18, 40, 58, 76, 105, and 120 holds three data stores: Local, Root, and Replica]
Node Arrival
• Problem: New node is responsible for an interval of IDs
– How does it get data for those IDs?
• Possible solutions
– Forward queries to formerly responsible node until data is
copied
– Integrate into network after data is copied
– Ignore problem and wait for soft-state updates
Chord - Summary
• Structured P2P system based on distributed hash table
– Resources and nodes mapped into the same key space
– Overlay: Ring structure with fingers to enable efficient searching
– Successor & predecessor pointers to increase robustness
• Advantages
– Small state O(log N)
– Queries resolved with few hops O(log N)
• Disadvantages
– Requires explicit stabilization
– Nevertheless sensitive to churn
• Outlook: Kademlia uses self-stabilizing approach
References
[1] I. Stoica, R. Morris, D. Liben-Nowell, D. Karger, F. Kaashoek, F. Dabek, and H. Balakrishnan, "Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications", IEEE/ACM Trans. on Networking, Vol. 11, No. 1, pp. 17-32, February 2003.
[2] I. Stoica, R. Morris, D. Karger, F. Kaashoek, and H. Balakrishnan, "Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications", MIT TR 819.
[3] D. Eastlake and T. Hansen, "US Secure Hash Algorithms (SHA and SHA-based HMAC and HKDF)", RFC 6234, May 2011.
Contact
Integrated Communication Systems Group
Ilmenau University of Technology
PD Dr. rer. nat. habil. Oliver Waldhorst
fon: +49 (0)3677 69 2788
fax: +49 (0)3677 69 1226
e-mail: oliver.waldhorst@tu-ilmenau.de
Visitors address:
Technische Universität Ilmenau
Helmholtzplatz 5
Zuse Building, room 1066
D‐98693 Ilmenau
www.tu‐ilmenau.de/ics