Peer-to-Peer Systems

Winter semester 2014
Jun.-Prof. Dr.-Ing. Kalman Graffi
Heinrich Heine University Düsseldorf
Peer-to-Peer Systems
Unstructured P2P Overlay Networks
– Unstructured Heterogeneous Overlays
This slide set is based on the lecture "Communication
Networks 2" of Prof. Dr.-Ing. Ralf Steinmetz at TU Darmstadt
Unstructured Heterogeneous P2P Overlays
Unstructured P2P: Centralized P2P
1.  All features of Peer-to-Peer included
2.  Central entity is necessary to provide the service
3.  Central entity is some kind of index/group database
Examples:
§  Napster
Unstructured P2P: Homogeneous P2P
1.  All features of Peer-to-Peer included
2.  Any terminal entity can be removed without loss of functionality
3.  → No central entities
Examples:
§  Gnutella 0.4
§  Freenet
Unstructured P2P: Heterogeneous P2P
1.  All features of Peer-to-Peer included
2.  Any terminal entity can be removed without loss of functionality
3.  → Dynamic central entities
Examples:
§  Gnutella 0.6
§  Fasttrack
§  eDonkey
Structured P2P: DHT-Based
1.  All features of Peer-to-Peer included
2.  Any terminal entity can be removed without loss of functionality
3.  → No central entities
4.  Connections in the overlay are “fixed”
Examples:
§  Chord
§  CAN
§  Kademlia
Structured P2P: Heterogeneous P2P
1.  All features of Peer-to-Peer included
2.  Peers are organized in a hierarchical manner
3.  Any terminal entity can be removed without loss of functionality
Examples:
§  AH-Chord
§  Globase.KOM
from R.Schollmeier and J.Eberspächer, TU München
HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html
Principles – Hierarchical / Heterogeneous
Approach: to combine the best of both worlds
§  Robustness by distributed indexing
§  Fast searches by server queries
Components
§  Supernodes
•  Mini servers / super peers
•  Used as servers for queries
–  Build a sub-network between supernodes
–  Queries are distributed in the sub-network between supernodes
§  “Normal” peers
•  Have only overlay connections to supernodes
++ Advantages
§  More robust than centralized solutions
§  Faster searches than in pure P2P systems
-- Disadvantages
§  Need for algorithms to choose reliable supernodes
Picture from R.Schollmeier and J.Eberspächer, TU München
Peer-to-Peer Filesharing
History of P2P Filesharing Networks
Decentralized File Sharing with Distributed Servers
For example: eDonkey
see e.g.
•  http://www.overnet.org/
•  http://www.emule-project.net/
•  http://savannah.gnu.org/projects/mldonkey/
eDonkey file-sharing protocol
§  Most successful/used file-sharing protocol in
•  e.g. Germany & France in 2003 [see sandvine.org]
–  52% of generated P2P file-sharing traffic
–  KaZaA accounted for only 44% in Germany
§  Stopped by law
•  February 2006: largest server „Razorback 2.0“ disconnected by
Belgian police
–  http://www.heise.de/newsticker/eDonkey-Betreiber-wirft-endgueltig-dasHandtuch--/meldung/78093
The eDonkey Network - Principle
Distributed server(s)
§  Set up and run by power users
§  → nearly impossible to shut down all servers
§  Exchange their server lists with other servers
•  using UDP as transport protocol
§  Manage file indices
Client application
§  Connects to one random server and
stays connected
§  Using a TCP connection
§  Searches are directed to the server
§  Clients can also extend their search
•  by sending UDP search messages to additional servers
eDonkey functionality
eDonkey hash can be used for several queries
§  eDonkey server
•  Search for peers
–  Servers block requests if too many requests are sent
§  Kad network → additional structured P2P overlay
•  Search for peers (including peers behind a firewall)
–  Very efficient queries to peers (10 requests per second)
–  Finds more peers than found using servers
•  Ratings and comments for all Kad peers
–  Not used very widely
§  Directly from the peer (requests to a specific file)
•  Query for the filename
–  About 65 % of all peers answer with filename
•  Ratings and comments of the peer
•  Search for further peers
The eDonkey Network
[Figure: eDonkey network overview. Legend: supernode, node; TCP and UDP links; labelled interactions: search, extended search, server list exchange, download.]
The eDonkey Network
Procedure
§  New servers send
•  their port + IP to other servers (UDP)
§  Servers send
•  server lists (other servers they know) to the clients
§  Server lists can also be downloaded from various websites
[Figure: eDonkey network diagram. Legend: supernode, node; TCP and UDP links; search, extended search, server list exchange, download.]
The eDonkey Network
eDonkey
●  Filesharing network with most files
●  Centralized P2P network with many eDonkey servers
●  Additional DHT: Kad
●  eDonkey hash is created directly from file content
Files are identified by
§  Unique MD4 file hashes
•  Message-Digest Algorithm 4, RFC 1186
•  16 bytes long
§  Are not identified by filenames
§  This helps in
•  Resuming a download from a different source
•  Downloading the same file from multiple sources at the same time
•  Verification that the file has been correctly downloaded
The eDonkey Network
à the SEARCH consists of two steps
1. Full text search to
•  Connected server (TCP) or
•  Extended search with UDP to other known servers.
§  Search result are the hashes of matching files
2. Query Sources
•  Query servers for clients offering a file with a certain hash
Later
•  Download from these sources
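The two search steps can be illustrated with a toy, in-memory index. This is only a sketch under stated assumptions: the class `ToyServer`, its method names, the client addresses and the file names are hypothetical, and SHA-1 truncated to 16 bytes merely stands in for the MD4-based eDonkey hash.

```python
import hashlib

def file_id(content: bytes) -> str:
    # Stand-in for the 16-byte MD4 file hash (assumption: truncated SHA-1).
    return hashlib.sha1(content).digest()[:16].hex()

class ToyServer:
    """Hypothetical index server: keyword -> hashes, hash -> sources."""

    def __init__(self):
        self.names = {}    # file hash -> set of file names seen for it
        self.sources = {}  # file hash -> set of client addresses

    def publish(self, client, name, content):
        h = file_id(content)
        self.names.setdefault(h, set()).add(name)
        self.sources.setdefault(h, set()).add(client)
        return h

    def search(self, keyword):
        # Step 1: full-text search returns the hashes of matching files.
        kw = keyword.lower()
        return {h for h, names in self.names.items()
                if any(kw in n.lower() for n in names)}

    def query_sources(self, h):
        # Step 2: ask which clients offer the file with this hash.
        return self.sources.get(h, set())

server = ToyServer()
# Two clients offer the same content under different file names.
server.publish("10.0.0.1:4662", "lecture_p2p.pdf", b"same bytes")
server.publish("10.0.0.2:4662", "P2P-slides.pdf", b"same bytes")

hits = server.search("p2p")
sources = {h: server.query_sources(h) for h in hits}
print(sources)
```

Because the identifier is derived from the content, both file names collapse into one hash with two sources, which is what makes multi-source download possible.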
Status: 1,229,568 users, 37,399,014 files (30.08.2012)
The eDonkey Network
à the alternate SEARCH consists of two steps
0. Participate in the KAD network
1.  Know MD4 – hash of file
2. Query Sources in KAD
§  Send lookup to node responsible for file hash
§  Query responsible node for clients offering the
Later
•  Download from these sources
Status: 600k-2M users, 200M-600M files (30.08.2012)
Testing the Content in eDonkey Networks
Forensic test set
§  Consists of about 1500 files
§  Images, music, videos, documents and miscellaneous
§  Images: Fraunhofer, Windows 7, KDE, 4chan
§  Music: mainly three big music collections
§  Videos: YouTube, Open Source Films, P2P, ...
§  Documents: diverse PDFs, Fraunhofer, BitTorrent
§  Miscellaneous: Zips, executables, Malware, ...
Hits
§  Hit rate: 385 / 1479 (26 %)
KaZaA: Decentralized File Sharing with Super Nodes
see
§  www.kazaa.com, gift.sourceforge.net, http://www.my-k-lite.com/
System
§  Developer: Fasttrack
§  Clients: KaZaA
Properties:
§  Most successful P2P network in USA in 2002/3
Architecture: neither completely central nor decentralized
§  Supernodes to reduce communication overhead
P2P system   #users        #files        terabytes   #downloads (from download.com)
Fasttrack    2.6 million   472 million   3550        4 million
eDonkey      230,000       13 million    650-2600    600,000
Gnutella     120,000       28 million    105         approx. 525,000
Numbers are from 10/2002
Decentralized File Sharing with Super Nodes
Examples: KaZaA, Gnutella 0.6 (Morpheus, mldonkey)
Peers
§  Connected only to some super nodes
§  Send IP address and file names only to super peers
Super nodes - super peers:
§  Peers with high-performance network connections
§  Take the role of the central server and proxy for simple peers
§  Answer search messages for all peers (reduction of comm. load)
§  One or more supernodes can be removed without problems
Additionally, the communication between nodes is encrypted
[Figure: peers and superpeers. Legend: search, service delivery, download.]
Example for KaZaA
Decentralized File Sharing with Complete Files
Queen Bee has
§  100 MB file
§  50 KB/s upload rate in total
Drone 1 receives
§  25% of the file
§  at 12.5 KB/s rate
Drone 1 has
§  50 KB/s upload rate
§  not utilized until it has the whole file
[Figure: two panels, “At the beginning” and “Later”.]
From www.wtata.com
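The numbers on this slide can be checked with a few lines of arithmetic. This is a rough sketch under assumptions that are not stated on the slide: four drones share the Queen Bee's upload equally (inferred from 50 / 12.5), every drone has the same 50 KB/s upload rate as Drone 1, and 1 MB is taken as 1000 KB.

```python
FILE_KB = 100 * 1000        # 100 MB file, assuming 1 MB = 1000 KB
SEED_UPLOAD_KBPS = 50.0     # Queen Bee's total upload rate
DRONES = 4                  # assumption: 50 KB/s split into 12.5 KB/s shares
DRONE_UPLOAD_KBPS = 50.0    # assumption: every drone matches Drone 1

# Each drone only receives its share of the single uploader's capacity.
rate_per_drone = SEED_UPLOAD_KBPS / DRONES

# Time until a drone holds the whole file and may start uploading.
seconds_to_complete = FILE_KB / rate_per_drone

# Upload capacity that sits idle while drones wait for the full file.
idle_upload = DRONES * DRONE_UPLOAD_KBPS

print(rate_per_drone, "KB/s per drone")
print(seconds_to_complete / 3600, "hours until the first complete copy")
print(idle_upload, "KB/s of drone upload capacity unused meanwhile")
```

Under these assumptions the drones together hold four times the seeder's upload capacity, yet none of it is used for more than two hours.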
Issues with KaZaA / Gnutella 0.6
Keyword-based search
§  You do not know what you get
§  Pollution is a problem
•  Music companies flooded the network with false files
•  Chance to get a “good” file ~ 10%
•  Problem for “small” files
Full file download before uploading
§  Users go offline after the download has finished
§  Only few uploaders online
§  Problem for “large” files
Google Trends for KaZaA, Limewire, Torrent, Emule
http://www.google.com/trends?q=kazaa,+limewire,+torrent,+emule
Unstructured Hybrid Resource Sharing: Skype
Offered Services
§  IP Telephony features
§  File exchange
§  Instant Messaging
Features
§  KaZaA technology
§  Encrypted high media quality
§  Support for teleconferences
§  Multi-platform
Further Information
§  Very popular, low-cost IP telephony
§  SkypeOut extension to call regular phone numbers (not free)
§  Great business potential if combined with free WIFIs
Oct. 2011 bought by Microsoft
§  Super nodes are now servers
From www.skype.com
Skype
Network Architecture
§  formerly KaZaA based
[Figure: Skype network. Legend: super node, regular node, Skype login server, message exchange at login.]
Exercise
[Figure: exercise diagrams; only the numeric labels 1 to 10 survive in the transcription.]
Problem 2.2 - Super Hyper Hierarchical Networks
Let us imagine a hierarchical overlay with three hierarchy
steps: there are normal peers, super peers, and hyper
peers.
a) Number of hyper peers needed
§  Assume that a super peer cares for 100 normal peers, and a
hyper peer is responsible for 100 super peers. How many hyper
peers would we need in a network of 999 000 peers in total?
b) Querying in the hyper-super-overlay
§  Suggest a way how such a network could handle search queries.
Which information should a super peer maintain? What should a
hyper peer know? How would a search query be processed?
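One possible way to approach part a) is to count how many peers a single hyper peer covers when its subtree is full. The sketch below assumes that the three roles are disjoint and that the total of 999 000 counts normal, super and hyper peers together; the function name is hypothetical, and other readings of the task lead to different counts.

```python
import math

def hyper_peers_needed(total_peers, normal_per_super=100, super_per_hyper=100):
    # A full subtree holds 1 hyper peer, 100 super peers and 100 * 100
    # normal peers, i.e. 10101 peers in total.
    subtree = 1 + super_per_hyper + super_per_hyper * normal_per_super
    # Enough subtrees are needed to cover every peer in the network.
    return math.ceil(total_peers / subtree)

needed = hyper_peers_needed(999_000)
print(needed)
```

With 98 hyper peers at most 98 * 10101 = 989 898 peers fit, so under these assumptions one more hyper peer is required.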
Peer-to-Peer Systems
Structured Homogeneous P2P Overlay Networks
– Distributed Indexing and DHTs
This slide set is based on the lecture "Communication
Networks 2" of Prof. Dr.-Ing. Ralf Steinmetz at TU Darmstadt
Structured P2P Overlays: Principles
Unstructured P2P: Centralized P2P
1.  All features of Peer-to-Peer included
2.  Central entity is necessary to provide the service
3.  Central entity is some kind of index/group database
Examples:
§  Napster
Unstructured P2P: Pure P2P
1.  All features of Peer-to-Peer included
2.  Any terminal entity can be removed without loss of functionality
3.  → No central entities
Examples:
§  Gnutella 0.4
§  Freenet
Unstructured P2P: Hybrid P2P
1.  All features of Peer-to-Peer included
2.  Any terminal entity can be removed without loss of functionality
3.  → Dynamic central entities
Examples:
§  Gnutella 0.6
§  Fasttrack
§  eDonkey
Structured P2P: DHT-Based
1.  All features of Peer-to-Peer included
2.  Any terminal entity can be removed without loss of functionality
3.  → No central entities
4.  Connections in the overlay are “fixed”
Examples:
§  Chord
§  CAN
§  Kademlia
Structured P2P: Hybrid P2P
1.  All features of Peer-to-Peer included
2.  Peers are organized in a hierarchical manner
3.  Any terminal entity can be removed without loss of functionality
Examples:
§  AH-Chord
§  Globase.KOM
from R.Schollmeier and J.Eberspächer, TU München
Structured Overlay Networks: Interconnection Networks
Structured Overlay Networks
§  Give peers and objects (unique) identifiers
•  PeerIDs and ObjectIDs shall be from the SAME key set
•  Each peer is responsible for a specific range of ObjectIDs
§  Indexing (knowledge on location of resources) to be distributed
§  No search needed anymore (local indexing)
§  No server knowing all (global indexing) available
New challenge: to find peer(s) with specific ID in overlay
§  Lookup:
•  “Route” queries across the overlay network to peer with specific ID
§  Once peer is found
•  Initiate direct communication
•  Upload / download resources
Functions in a Structured P2P Overlay (all)
IsMyKey(K) → true if node is responsible for key K
Route(K, M, hint) → send message M to node responsible for K
§  Hint: optional first hop
GetNodeHandle(K, hint) → get contact details of responsible node
Send(M, q) → send message M to node q
Schematic View on Distributed Hash Table
Additional Functions in a Distributed Hash Table
Put(Data D, Key K) → copies data to node responsible for K
GetData(Key K) → gets data stored under the key K
Optional further functions:
§  Replication
§  Secure Communication
§  Access Control
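The interface can be made concrete with a small in-process model. This is a minimal sketch, not a real overlay: the class `ToyDHT` is hypothetical, routing is collapsed into a direct lookup of the responsible node, the rule "the first node ID at or after the key is responsible" is an assumption borrowed from Chord-style successor assignment, and the identifier space of 4096 values is assumed so that the example IDs from the accompanying figure fit.

```python
import bisect

class ToyDHT:
    """In-process model; real nodes would exchange network messages."""

    def __init__(self, node_ids, id_space=4096):
        self.ids = sorted(node_ids)
        self.id_space = id_space
        self.store = {n: {} for n in self.ids}  # local storage per node

    def responsible(self, key):
        # Assumption: first node ID >= key is responsible, wrapping around.
        idx = bisect.bisect_left(self.ids, key % self.id_space)
        return self.ids[idx % len(self.ids)]

    def is_my_key(self, node, key):
        # IsMyKey(K) evaluated at the given node.
        return self.responsible(key) == node

    def put(self, data, key):
        # Put(D, K): copy the data to the node responsible for K.
        self.store[self.responsible(key)][key] = data

    def get_data(self, key):
        # GetData(K): fetch the data stored under K, or None.
        return self.store[self.responsible(key)].get(key)

dht = ToyDHT([611, 709, 1008, 1622, 2011, 2207, 2906, 3485])
dht.put("my data", 3107)
print(dht.responsible(3107), dht.get_data(3107))
```

A usage note: keys above the largest node ID wrap around to the smallest one, so `dht.responsible(4000)` lands on node 611 in this model.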
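[Figure: DHT with node IDs 611, 709, 1008, 1622, 2011, 2207, 2906 and 3485, hosted on 12.5.7.31, berkeley.edu, planet-lab.org, peer-to-peer.info, 89.11.20.15, 95.7.6.10, 86.8.10.18 and 7.31.10.25. The key H("my data") = 3107 has to be mapped to the responsible node.]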
Look up in Structured P2P Systems
Principle
§  Location of the objects is found via routing
•  1. Node A (provider) advertises object at responsible peer B
»  Advertisement is routed to B.
•  2. Node C looking for object sends query
»  Query is routed to responsible node.
•  3. Node B replies to C by sending contact information of A
[Figure: Nodes A, B and C with the steps “1. Publish link at responsible peer”, “2. Routing to / lookup of desired object” and “3. P2P communication. Get link to object.”]
Strategies for Data Retrieval: Distributed Indexing
Goal is scalable complexity for
§  Communication effort: O(log(N)) hops
§  Node state: O(log(N)) routing entries
§  Routing in O(log(N)) steps to the node storing the data
§  Nodes store O(log(N)) routing information to other nodes
[Figure: DHT with node IDs 611, 709, 1008, 1622, 2011, 2207, 2906 and 3485, hosted on 12.5.7.31, berkeley.edu, planet-lab.org, peer-to-peer.info, 89.11.20.15, 95.7.6.10, 86.8.10.18 and 7.31.10.25; the lookup of H("my data") = 3107 is routed across the nodes.]
The content of this slide has been adapted from “Peer-to-Peer Systems and Applications”, ed. by Steinmetz, Wehrle
Recall Hash Function & Hash Table
Hash function H(x)
§  maps
•  large input domain
§  onto
•  smaller target domain/range
(most often subset of integer)
§  such that
•  we get few collisions
•  i.e.
–  it would be possible to uniquely identify most of these strings using this hash
Hash table
§  data structure that provides fast lookup
§  of a record indexed by a key
§  where
•  the domain of the key is too large for simple indexing, as would occur if an array were used
Like arrays, hash tables can provide O(1) lookup
§  with respect to the number of records in the table.
And .. Question
§  IF H(x) ≠ H(y)
•  THEN (implies) x ≠ y ?
(yes)
§  IF H(x) = H(y)
•  THEN (implies) x = y ?
(no)
Recall Hash Tables & Hash Functions
Hash tables are a well-known data structure
§  A fixed-size array
§  Elements of array also called “hash buckets”
Properties
§  allow insertions
§  allow deletions
§  allow to find entry in constant (average) time
Hash functions
§  map keys onto elements of the array
Properties of good hash functions:
§  Fast to compute
§  Good distribution of keys into hash table
§  Example: SHA-1 algorithm
•  SHA = Secure Hash Algorithm
Hash Tables: An Example
Assume a network of N=10 nodes
Hash function:
§  hash(x) = x mod 10 (=N)
Example
§  Insert numbers 0, 1, 4, 9,
§  16, and 25
Properties
§  Easy to find if a given key
is present in the table
Hash Table, an example
Keys     0   1   2   3   4   5    6    7   8   9
Values   0   1   -   -   4   25   16   -   -   9
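The example table can be reproduced in a few lines. This is a small sketch: the bucket lists and the helper `contains` are illustrative choices, and 36 is an extra value (not on the slide) used only to show a collision.

```python
N = 10  # number of nodes / buckets in the example

def h(x):
    # Hash function from the slide: hash(x) = x mod N
    return x % N

table = [[] for _ in range(N)]   # one bucket per key 0..9
for x in (0, 1, 4, 9, 16, 25):
    table[h(x)].append(x)

def contains(x):
    # Only the single bucket h(x) has to be inspected.
    return x in table[h(x)]

for key, bucket in enumerate(table):
    print(key, bucket)

# 36 is not stored, although it hashes to the same bucket as 16:
# equal hash values do not imply equal inputs.
print(contains(16), contains(36), h(36) == h(16))
```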
Hash Tables: An Example
Drawback of the example
§  Collisions are likely to happen
§  Time to search grows linearly with the number of peers
§  Inserting and removing a peer also scales linearly
§  Hash function must be adapted to the number of available peers, which is extremely time-consuming
Distributed Hash Table DHT
§  Huge hash table (2^160 entries)
§  Assigns contiguous input RANGES to peers
•  (instead of individual numbers)
[Figure: the hash table example with keys 0 to 9 and values 0, 1, 4, 25, 16, 9.]
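Why adapting the hash function to the number of peers is costly, and why range assignment helps, can be seen by counting how many keys change their owner when an eleventh peer joins. This is a sketch with made-up numbers: 100 keys, ten equal ranges, and a new peer that takes over the keys from 45 upward in one range.

```python
import bisect

def moved_mod(keys, old_n, new_n):
    # Keys whose bucket changes when hash(x) = x mod N is adapted to a new N.
    return sum(1 for k in keys if k % old_n != k % new_n)

def owner(key, starts):
    # Range assignment: the peer at starts[j] owns [starts[j], starts[j+1]).
    return starts[bisect.bisect_right(starts, key) - 1]

def moved_ranges(keys, old_starts, new_starts):
    return sum(1 for k in keys
               if owner(k, old_starts) != owner(k, new_starts))

keys = range(100)
old_starts = list(range(0, 100, 10))    # ten peers with ten keys each
new_starts = sorted(old_starts + [45])  # an eleventh peer splits [40, 50)

print("mod N:  ", moved_mod(keys, 10, 11), "of 100 keys move")
print("ranges: ", moved_ranges(keys, old_starts, new_starts), "of 100 keys move")
```

With the modulo scheme almost every key is reassigned, while with ranges only the keys handed over to the new peer move.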
Design Aspects for Distributed Hash Tables
1. Choice of an identifier space
2. Mapping of resources and peers to the identifier space
3. Management of the identifier space by the peers
4. Graph embedding (structure of the logical network)
5. Routing strategy
6. Maintenance strategy
Overlay Network: Design Decisions
Group of peers P
Group of resources R
Overlay maps resources R and peers P onto identifier space I
§  F_P : P → I
§  F_R : R → I
Example I:
§  Chord: [0, 2^160[
§  Pastry: [0, 2^128[
§  CAN: multidimensional
(1) Choice of Identifier Space
Importance of Identifier Space:
§  Identifier space needed for addressing resources and peers
•  Often: ID(object) = hash(object content)
§  Identifier space should be large to support large systems
§  Identifier space independent of physical location of peer → mobility of peers
§  Clustering of resources due to closeness metric of identifier space
§  Message routing uses identifier space
(1) Choice of Identifier Space
à Main addressing ID space
The identifier space must posses closeness metric d:
d :I×I →R
Which MUST satisfy the following conditions:
∀x, y ∈ I : d (x, y ) ≥ 0
∀x ∈ I : d (x, x) = 0
∀x, y ∈ I : d (x, y ) = 0 → x = y
And SHOULD satisfy the following conditions:
∀x, y ∈ I : d (x, y ) = d ( y, x)
∀x, y, z ∈ I : d (x, z ) ≤ d (x, y ) + d ( y, z )
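The conditions can be checked by brute force on a small identifier space. The sketch below compares two well-known choices: the XOR distance used by Kademlia and the clockwise ring distance used by Chord. The 4-bit identifier space and the function names are assumptions made for illustration.

```python
from itertools import product

M = 4
SIZE = 2 ** M  # small identifier space I = {0, ..., 15}

def d_xor(x, y):
    # Kademlia-style distance: bitwise XOR interpreted as an integer.
    return x ^ y

def d_ring(x, y):
    # Chord-style distance: clockwise steps from x to y on the ring.
    return (y - x) % SIZE

def check(d):
    ids = range(SIZE)
    pairs = list(product(ids, repeat=2))
    return {
        # MUST conditions
        "non_negative": all(d(x, y) >= 0 for x, y in pairs),
        "zero_on_self": all(d(x, x) == 0 for x in ids),
        "zero_only_on_self": all(x == y for x, y in pairs if d(x, y) == 0),
        # SHOULD conditions
        "symmetric": all(d(x, y) == d(y, x) for x, y in pairs),
        "triangle": all(d(x, z) <= d(x, y) + d(y, z)
                        for x, y, z in product(ids, repeat=3)),
    }

xor_result = check(d_xor)
ring_result = check(d_ring)
print("xor :", xor_result)
print("ring:", ring_result)
```

In this check the clockwise ring distance fails only the symmetry condition, which is one of the SHOULD conditions rather than a MUST.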
(2) Mapping to the Identifier Space
à Assigning ID addresses to peers (à Peer ID)
Possible design decisions:
Completeness: FP may be complete or partial
Identifier space should be injective
∀p, q ∈ P : p ≠ q ⇒ FP ( p ) ≠ FP (q )
Dynamicity: FP may be fixed or change dynamically over time
(3) Management of Identifier Space
Peers are responsible for resource identifiers
Identifier space I is managed by peers P:
Responsibility function: M : I → 2^P
Which associates
§  the identifier of a resource (i = F_R(r) ∈ I, r ∈ R)
§  with a set of peers managing the resource
Through M
§  each peer p is assigned responsibility
§  for a set of identifiers M^{-1}(p)
Locating a resource r corresponds to finding a peer p ∈ M(F_R(r))
Responsibility Function
Basic properties of M:
Completeness: ∀i ∈ I : ∃p ∈ P : p ∈ M(i)
Cardinality:
§  One or more peers are responsible for a given identifier
§  OFTEN induced by proximity: identifiers are associated with peers that are numerically closest
p ∈ M(i) ⇒ d(F_P(p), i) = min_{q ∈ P} d(F_P(q), i)
Dynamicity:
§  the responsibility function typically changes as peers join and leave
Uniformity / non-uniformity of replication
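The proximity rule can be written down directly. The sketch below computes M(i) for one identifier under two different distance functions; the peer IDs are the ones from the DHT figure earlier in the slides, while the identifier space of 4096 values, the two distance functions and the assumption that F_P(p) is simply the peer's numeric ID are illustrative choices.

```python
SIZE = 4096  # assumed size of the identifier space
PEER_IDS = [611, 709, 1008, 1622, 2011, 2207, 2906, 3485]

def responsible(i, peer_ids, d):
    # M(i): all peers whose distance to i equals the minimum over all peers.
    best = min(d(p, i) for p in peer_ids)
    return {p for p in peer_ids if d(p, i) == best}

def d_abs(p, i):
    # Numerical closeness on a line.
    return abs(p - i)

def d_clockwise(p, i):
    # Steps from identifier i clockwise to peer p: the closest peer under
    # this distance is the first one at or after i on the ring.
    return (p - i) % SIZE

by_abs = responsible(3107, PEER_IDS, d_abs)
by_ring = responsible(3107, PEER_IDS, d_clockwise)
print(by_abs, by_ring)
```

The same identifier ends up at different peers depending on the metric, which is why the choice of d belongs to the overlay design.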
(4) Graph Embedding
à Creating a graph / network with the peers
Overlay can be modeled as a graph G=(P,E)
P
A neighborhood function N : P → 2
defines the neighbor set N(p) of the peer p
Which means that p ∈ P maintains all connected nodes as
neighbors: ∃q ∈ N ( p ) : ( p, q ) ∈ E
Typically: Here the overlays differ in the implementation
§  Which nodes to connect to
§  How to set up the routing table
(4) Graph Embedding – Desired properties
Desired properties of the graph are:
Graph diameter:
§  a small graph diameter should provide lower bounds for latencies during
routing
Connectivity:
§  the graph should be connected at any time
Local connectivity:
§  a peer should be connected to a subset of its immediate neighbors
Long-range connectivity:
§  overlay connectivity should be structurally similar to small-world graphs and should satisfy the condition:
P[q ∈ N(p)] ≈ 1 / d(F_P(p), F_P(q))^d
§  with the exponent d denoting the dimension of the identifier space
§  (Many “close” contacts, few “far away” contacts)
(5) Routing strategy
à Routing in the network to the queried identifier
Routing is modeled as asynchronous message passing
§  which forwards a message m with identifier i to peer p
route(p,i,m)
A routing strategy defined as a non-deterministic function:
R : P × I → 2P
Which selects at
§ 
§ 
§ 
§ 
A given element of I as destination ID
a given peer p
with neighborhood N(p)
the set of next peers R ( p, i ) ∈ N ( p )
Routing strategy: Greedy routing
Mostly used in structured overlays
For a routing step from peer p to peer q with q ∈ R(p, i)
the following condition holds:
d(F_P(q), i) ≤ d(F_P(p), i)
Which means that
§  the distance to the target
§  after one routing step is
§  less or equal to the distance before
•  (Ideally: distance is halved in order to reach O(log(N)) routing steps)
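Greedy routing can be sketched on a small ring. The setup is an assumption chosen for illustration: every identifier of a 6-bit space is occupied by a peer, each peer knows neighbors at clockwise distances 1, 2, 4, ..., 32 (similar in spirit to Chord fingers), and distance is measured clockwise.

```python
M = 6
SIZE = 2 ** M  # 64 identifiers, all occupied by a peer (assumption)

def clockwise(a, b):
    # Clockwise distance from a to b on the ring.
    return (b - a) % SIZE

def neighbors(p):
    # N(p): contacts at exponentially growing distances 1, 2, 4, ..., 32.
    return [(p + 2 ** k) % SIZE for k in range(M)]

def greedy_route(start, target):
    path = [start]
    p = start
    while p != target:
        # Forward to the neighbor with the smallest remaining distance.
        p = min(neighbors(p), key=lambda q: clockwise(q, target))
        path.append(p)
    return path

path = greedy_route(0, 37)
print(path)  # 37 = 32 + 4 + 1, so the route is [0, 32, 36, 37]
longest = max(len(greedy_route(s, t)) - 1
              for s in range(SIZE) for t in range(SIZE))
print("longest route:", longest, "hops for", SIZE, "peers")
```

In this setup the remaining distance shrinks with every hop and no route needs more than M = log2(64) hops.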
(6) Maintenance Strategy
à Keeping the routing information up-to-date
Maintenance strategies can be classified into:
§  Proactive correction (e.g. using heartbeat messages)
§  Reactive mechanisms
•  Correction on use
•  Correction on failure
•  Correction on change
Goal for maintenance strategy:
§  Sufficient level of consistency
§  Minimize effort
Design Concepts for P2P Overlays
Key design concepts:
•  choice of an identifier space
•  mapping of resources and peers to the identifier space
•  management of the identifier space by the peers
•  graph embedding (structure of the logical network, selection of contacts)
•  routing strategy
•  maintenance strategy
Unstructured vs. structured
§  Structured: object IDs and peer IDs share same ID space
•  Every object ID is assigned to one single peer
•  Lookup possible := routing to peer being responsible for desired
object ID
Motivation Distributed Indexing
[Figure: communication overhead (no. of hops) plotted against node state (i.e. number of routing entries stored in a node, e.g. a server).]
§  Flooding: communication overhead O(N), node state O(1)
•  Bottlenecks: communication overhead, false negatives (many false reports)
§  Central server: communication overhead O(1), node state O(N)
•  Bottlenecks: memory, CPU, network; availability
§  Scalable solution between both extremes? Communication overhead O(log N), node state O(log N)
The content of this slide has been adapted from “Peer-to-Peer Systems and Applications”, ed. by Steinmetz, Wehrle
Motivation Distributed Indexing
Communication overhead vs. node state
[Figure: communication overhead plotted against node state.]
§  Flooding: communication overhead O(N), node state O(1)
•  Bottlenecks: communication overhead, false negatives
§  Central server: communication overhead O(1), node state O(N)
•  Bottlenecks: memory, CPU, network; availability
§  Distributed Hash Table: communication overhead O(log N), node state O(log N)
•  Scalability: O(log N)
•  No false negatives, i.e. never answers NO if the object IS there
•  More resistant against changes: failures, attacks, short-time users
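To get a feeling for the three operating points, the orders of magnitude can be tabulated for a concrete network size. This is only a back-of-the-envelope sketch: constant factors are ignored, the logarithm is taken to base 2, and the network size of one million peers is an arbitrary example.

```python
import math

def orders_of_magnitude(n):
    log_n = math.ceil(math.log2(n))  # assumption: base-2 logarithm
    return {
        "flooding":       {"hops": n,     "state": 1},
        "central server": {"hops": 1,     "state": n},
        "DHT":            {"hops": log_n, "state": log_n},
    }

table = orders_of_magnitude(1_000_000)
for name, cost in table.items():
    print(f"{name:15s} hops ~ {cost['hops']:>9,d}   state ~ {cost['state']:>9,d}")
```

For a million peers the DHT needs on the order of 20 hops and 20 routing entries, compared with a million messages for flooding or a million index entries on a central server.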
The content of this slide has been adapted from “Peer-to-Peer Systems and Applications”, ed. by Steinmetz, Wehrle
Fundamentals of Distributed Hash Tables
Challenges for designing DHTs:
1. Desired Characteristics
§  Flexibility
§  Reliability
§  Scalability
2. Load balancing
§  Equal content “load” for all nodes
§  vs. content load proportional to node capacity
§  vs. content load proportional to content consumption
3. Permanent adaptation to faults, join, leave of nodes
§  Assignment of responsibilities to new nodes
§  Re-assignment and re-distribution of responsibilities
in case of node failure or departure