RES224 Architecture des applications Internet

Transcript

Dario Rossi
mailto:[email protected]
http://www.enst.fr/~drossi
Version: 19 September 2014
Contents

1 Course unit overview
2 Lectures (CM)
  2.1 Introduction
  2.2 Client-server applications
  2.3 Addressing: DNS and DHCP
  2.4 Data access: HTTP and FTP
  2.5 eMail: SMTP, POP and IMAP
  2.6 P2P applications: introduction and DHTs
  2.7 P2P data exchange: BitTorrent
  2.8 P2P multimedia and VoIP: Skype
3 Tutorials (TD)
  3.1 Exercises
  3.2 Solutions
4 Lab sessions (TP)
  4.1 Handout
5 Mandatory readings (LO)
  5.1 SPDY (Google whitepaper)
  5.2 SPDY (CACM'12)
  5.3 SPDY (CoNEXT'13)
  5.4 SPDY (NSDI'14)
6 Sample exam (CC)
  6.1 Exam of 25/11/2011
RES 224
Architecture des applications Internet
Course unit overview

Dario Rossi
http://www.enst.fr/~drossi
v250811

Plan
• Teaching staff
• Organization
• Evaluation
  • Exam (CC)
  • Labs (TPs) and the TPRES web interface
• Material
• Structure of the course

Teaching staff
• Instructors
  – [email protected][email protected]
    LINCS (av. Italie), C234 (Barrault)
• Teaching assistants (TPs and TDs)
  – Instructors + PhD students (e.g., F. Diaz, R. Nassar, G. Rossini,
    C. Testa, ...)
  – Contact the instructors for questions concerning the organization of
    the course
  – Contact the UE coordinator of the period for information about the
    exam (cf. Evaluation)

Organization
• Lectures
  – Given by the teaching staff
• Lab sessions (in two groups)
  – Assisted by the teaching assistants
  – Attendance is mandatory; justify your absences in advance (by mail)
  – Optional lab report to hand in (cf. Evaluation)
    • Deadline: at the latest 15 days after the lab
    • In PDF format only, via the web interface
• Tutorials (in two groups)
  – Corrected by the teaching assistants during the TD
  – Electronic solutions available for most exercises
    • Look at the solution of an exercise only after having tried to
      solve it!

Organization
• Groups (for the labs):
  – Split yourselves evenly across the lab rooms
    (e.g., in alphabetical, social, or random order)
  – If > 30 students: two groups are mandatory
  – If <= 30 students: two groups are optional
• In any case
  – Each group is split into pairs
    • (or, at most, groups of three)
    • (no bonus for working alone)
  – The pairs/triples must be stable
    (i.e., do not change during the course)

Syllabus
• Lectures
  – Introduction
  – The Internet
  – Applications
  – Client-server
  – Data access: Web, FTP
  – Messaging: SMTP, POP, IMAP
  – Addressing: DHCP, DNS
  – Peer-to-peer
  – Lookup/dissemination
  – File sharing: BitTorrent
  – VoIP: Skype
• TDs
  – Domain Name System
  – Other applications
• TPs
  – Application sessions
    (traffic analysis with Wireshark)
• Reading material
  – Mandatory readings
    (papers included in the handout)

Evaluation
• Final grade
  – Grade = CC if no TP report
  – Grade = the average ½ TP + ½ CC
  – Note: if CC < 6, the average with the TP is capped below 10
  – Real case: TP=20, CC=0, arithmetic mean = 10, grade = 9.5
    (half validation)
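
As a rough illustration, here is a minimal Python sketch of the grading rule above (the 9.5 cap is inferred from the "real case" shown; the authoritative rule is the one the instructors apply):

    # Sketch of the final-grade computation (assumption: cap at 9.5 when CC < 6).
    def final_grade(cc, tp=None):
        """cc, tp: grades out of 20; tp is None when no report was handed in."""
        if tp is None:
            return cc                  # Grade = CC if no TP report
        avg = 0.5 * tp + 0.5 * cc      # plain arithmetic mean otherwise
        if cc < 6:
            avg = min(avg, 9.5)        # average capped below 10 when CC < 6
        return avg

    print(final_grade(cc=0, tp=20))    # -> 9.5, the "real case" above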
• Exam (contrôle de connaissances, CC)
  – No documents, no calculator, no iPhone, ...
    • the necessary information will be provided
      (e.g., TCP, IP, Eth header formats)
    • pay attention to the exam schedule!
  – Format:
    • Multiple-choice questions (QCM) on the whole course content
      (lecture, TP and TD slides)
    • Questions (open or QCM) on the reading material

Evaluation
• Clarification: the following will not be part of the exam
  • Any slide deck explicitly marked as such (e.g., 2010/S1P1)
  • The optional readings
• Purpose of the out-of-scope material
  • To give you a glimpse of research, and of the most recent evolutions
    of the field
  • To show you that networks can be "fun"

Evaluation: QCM
• Multiple-choice questions (QCM)
  • 3 to 5 choices per question
  • when there are open questions:
    ~20 questions, worth 2/3 of the CC grade
  • when there are no open questions:
    ~30 questions, worth 3/3 of the CC grade
• Beware
  • Some wrong answers may carry a penalty
  • Goal: to discourage systematically random answers
  • Note: if you hesitate between two answers on a limited number of
    questions (e.g., 3-4), it is unlikely that all of them carry a penalty

Evaluation: open questions
• Open questions
  • Questions on the "mandatory readings"
  • Open questions, or integrated into the QCM
  • Simple questions on difficult papers
  • Papers included in the provided material
  • Papers to be read autonomously
    • Start reading them early!
    • Take advantage of the lectures/TPs/TDs if you have questions
  • E.g., for 2010/S1P1:
    • Peer-2-Peer: "Chord: A Scalable Peer-to-peer Lookup Service for
      Internet Applications"
    • Client-Server: "Wow, That's a Lot of Packets!"

Evaluation: TP
• Lab reports
  – Reminder: justify your absences in advance
  – Reminder: 15-day maximum deadline
  – The report is optional (if none, final grade = exam)
    • PDF format only
    • TPRES web interface
• Report format
  – Complete but concise! Answer the questions!
  – Do not rewrite the theory; motivate and comment on the results you
    obtained

Material
• Reference site:
  – http://www.enst.fr/~drossi
  – Linked from the pedagogical site
• Material on the site
  – Electronic lecture notes (PDF)
  – TP and TD handouts
  – TD solutions
  – Software for the TPs
  – Papers to read
  – (a few examples of) past exams

TPRES (https://tpres.enst.fr/tpweb)
Web interface for uploading lab reports
[screenshots: log in with your unix/mail account; UE selection and
registration to new UEs; your list of TPs for the selected UE]

TPRES (https://tpres.enst.fr/tpweb)
Web interface for uploading lab reports
[screenshot: uploading and verifying reports]
• A single responsible student per TP enters his teammates and performs
  the upload

TPRES (https://tpres.enst.fr/tpweb)
Accessibility
• Directly from the school network
• Via PPTP from outside (see the DSI instructions)
Remarks
• Register ASAP
• You must register by yourselves
• The student responsible for the upload can add you only if you are
  already registered
• Reports in PDF format only

References
• Books (at the ENST library)
  – James F. Kurose & Keith W. Ross, Computer Networking: A Top-Down
    Approach Featuring the Internet, (4th ed.) Addison Wesley
  – W. R. Stevens, TCP/IP Illustrated, Volume 1: The Protocols,
    Addison Wesley
  – A. Tanenbaum, Computer Networks, Pearson Education, Prentice Hall
• Other resources (on the Internet)
  – IETF RFCs, pointers given in the slides
  – http://www.ietf.org/rfc.html
• In each lecture
  • Mandatory readings (part of the exam)
  • Optional readings (general culture)
2 Lectures (CM)

This unit complements RES223 (which focuses on Internet transport and routing) with a complete overview of the application layer. After a general introduction to the Internet (Sec. 2.1) and to the problems addressed at the application layer (Sec. 2.2), we study both client-server protocols (from Sec. 2.3 to Sec. 2.5) and peer-to-peer protocols (Sec. 2.6 to Sec. 2.8).

The client-server part focuses, in particular, on
• application-level addressing (DNS and DHCP in Sec. 2.3),
• data access (HTTP and FTP in Sec. 2.4) and
• messaging (SMTP, POP and IMAP in Sec. 2.5).

The peer-to-peer part then focuses on
• content lookup (DHTs in Sec. 2.6),
• content dissemination (BitTorrent in Sec. 2.7) and
• multimedia (Skype in Sec. 2.8).

The course is constantly updated, so as to give a current picture of the latest trends in Internet applications. In particular, the following lectures first appeared in:
• BitTorrent in 2008
• Skype in 2009
• P2P-TV in 2010
• LEDBAT in 2011
• YouTube in 2012
• SPDY in 2014

Of course, the course content does not only grow: new lectures replace old material as it becomes obsolete. For example, the YouTube lecture replaced, in 2012-2014, the P2P-TV lecture given during 2010-2012, owing to the growing success of YouTube and of CDNs.

This 2014 edition sees an important addition, namely SPDY, proposed by Google as a replacement for HTTP and currently under standardization at the IETF. Since such cutting-edge, very recent material is not covered in the reference books, its study relies on in-depth readings (available in English only, in Sec. 5).

Note that the in-depth readings reported in Sec. 5 are part of the exam, just like the lectures. At the end of each section, the interested reader will find pointers to optional further readings (which in turn served as mandatory readings in previous editions of this course); these, however, are not part of the exam.

As a complement, the course also gives a glimpse of recent research results in this area. This "window on research" has the double goal of showing that research work is intellectually stimulating and potentially fun, and of giving an example of the research topics addressed in the department (for students interested in free projects or research internships in the lab or with industrial partners). Note that this complement is not part of the exam.
RES 224
Architecture des applications Internet
Introduction

Dario Rossi
http://www.enst.fr/~drossi
v250811

Plan
• Definition of the Internet
• History of the Internet
• Architecture of the Internet
  • Protocol layers
  • Datagram networks
  • Transport, forwarding, routing
  • Structure of the Internet
• References
2.1 Introduction

Objective
• Objective of this introduction
  – A panorama of the Internet, beyond RES224
• The Internet and TCP/IP networks
  – Fundamental principles of IP data networks
  – The problem of transporting data with TCP
  – The architecture of the Internet
• Remark
  – The Internet is not the only existing world-wide network
    • Other networks: GSM, PSN (=RTC/PSTN), ...
  – In this course, we focus on TCP/IP
    • At most, we will draw a few comparisons

Internet
• A definition of the Internet?
[figure: the Internet in 2001, http://www.caida.org/]

Internet
Internet = its components
• Millions of hosts
  – PCs, workstations, servers, PDAs, telephones, toasters (!)
    http://www.livinginternet.com/i/ia_myths_toast.htm
  – Equipped with communication devices
• Numerous communication links
  – optical fiber, copper, wireless, ...
  – carrying frames of bits
• Routers (IP)
  – Interconnect the hosts, and forward data
• The exchange of segments (TCP)
  – the emission and reception of data are controlled by transport
    protocols
• The exchange of messages
  – Between hosts, to build application services

Internet
Internet = applications + users
• The list keeps getting longer...
  – Remote login, file transfer, e-mail, WWW, e-commerce, audio and video
    streaming, videoconferencing, networked games, social networking
• The number of users keeps growing...
  – ~2 billion users (03/2011)
    http://www.internetworldstats.com/stats.htm

Internet
Internet = its standards
• Internet Engineering Task Force (IETF)
  – http://www.ietf.org
• IETF Requests For Comments (RFCs)
  – Detailed technical documents defining protocols such as HTTP, TCP,
    IP, ...
  – Informal at the start; today, de facto standards
  – More than 6000 RFCs (08/2011)
• Other forums exist
  – W3C for the WWW, BEPs for BitTorrent, ...
    http://javvin.com/map.html
• No standard for some protocols
  – Proprietary protocols: KaZaA, Skype, etc.

Internet
Internet = an architecture
• Offers application services...
• controlling the transport of information...
• carried end to end...
• over several point-to-point links
• End-to-end (end-2-end)
  – Applications, information transport, packet forwarding
• Point-to-point (point-2-point)
  – Communication at the link and physical layers
• V. Cerf and R. Kahn, "A protocol for packet network
  intercommunication", IEEE Transactions on Communications, May 1974

History
• A few dates and figures

History
1961-1972: the principles of packet networks
• 1961: L. Kleinrock
  – queueing theory (efficiency of packet networks)
• 1964: P. Baran
  – packet switching in military networks
• 1967: Advanced Research Projects Agency (ARPAnet)
• 1969: first ARPAnet node
• 1972:
  – ARPAnet demonstrated publicly
  – NCP (Network Control Protocol), the first host-to-host protocol
  – First e-mail program
  – ARPAnet: 15 nodes

History
1972-1980: new networks and proprietary networks
• 1970: ALOHAnet, a CSMA network in Hawaii
• 1973: B. Metcalfe, Ethernet
• 1974: V. Cerf and R. Kahn, an architecture for interconnecting networks
• Late 70's: proprietary architectures: DECnet, SNA, XNA
• 1979: ARPAnet has 200 nodes
• Cerf and Kahn's principles:
  – Minimalism, autonomy
  – Best-effort model
  – Stateless routers
  – Decentralized control
• This is the current architecture of the Internet

History
1980-1990: new protocols, proliferation of networks
• 1983: deployment of TCP/IP
• 1983: SMTP (e-mail)
• 1983: P. Mockapetris defines DNS for naming
• 1985: FTP
• 1988: TCP congestion control
• National networks: CSNET, BITNET, NSFnet, Cyclades
• 100,000 interconnected hosts

History
1990-2010's: commercialization, Web, P2P, VoD and social networks
• Early 1990's: end of ARPAnet
• 1990s: the Web
  – hypertext
  – HTML, HTTP: Berners-Lee
  – 1994: Mosaic, then Netscape
  – Late 1990's: commercialization of the Web
  – About 50 million hosts, more than 100 million users
• 2000s
  – killer apps: instant messaging, peer-2-peer (P2P) gnutella,
    eDonkey, Skype, BitTorrent, ...
  – About 1 billion users
  – Backbones > Gbps
• 2010's
  – Video: YouTube is the 2nd search engine after Google
    http://www.tgdaily.com/trendwatch-features/39777-youtube-surpassesyahoo-as-world%E2%80%99s-2search-engine
  – Social networking: Facebook would be the 3rd largest country in the
    world, with more than 400 million users
    http://www.techxav.com/2010/03/19/if-facebook-were-a-country/

History
• The backbone in 2007
  – Source: http://www.spectrum.ieee.org (Jun 07)
  – London – NY = 387 Gbps (about 5 times > Paris – NY)

The protocol layers
"To each its own role!"

The protocol layers
• Networks are complex and heterogeneous:
  – hosts
  – routers
  – different media
  – applications
  – protocols
  – hardware, software
• Hence
  – How to organize the structure of the network?
  – How to organize the communication between hosts?

The protocol layers
• Different needs
  – The two end-points of an application must exchange messages
  – Each end-2-end (global) exchange is made up of several
    point-2-point (local) exchanges
  – The global (end-to-end) communication needs guarantees; the local
    (point-2-point) one must be optimized
• A layered architecture
  – Everyone does their own job: separation of roles
  – Move complexity towards the edge of the network
  – Keep the core as simple, scalable and general as possible

The 7 layers of the OSI model
  Application   (L7)  \
  Presentation  (L6)   > Applications    (RES240/RES224)
  Session       (L5)  /
  Transport     (L4)  \
  Network       (L3)  /  End-to-end     (RES240/RES223)
  Link          (L2)  \
  Physical      (L1)  /  Point-to-point

TCP/IP: a subset of OSI
  Application      (L7)   Applications    (RES240/RES224)
  Transport        (L4)   End-to-end      (RES240/RES223)
  Internet         (L3)
  Host-to-network  (L2)   Point-to-point

TCP/IP: a subset of OSI
  Application      (L7)  <->  Message
  Transport        (L4)  <->  Segment
  Internet         (L3)  <->  Datagram
  Host-to-network  (L2)  <->  Frame (trame)
• Remark
  – An Internet "packet" goes by several names
  – At each level, the "packet" has a very precise name!

TCP/IP: a subset of OSI
  Application      (L7): HTTP, FTP, SMTP, POP, IMAP, DNS, RIP,
                         BitTorrent, Skype, etc.
  Transport        (L4): TCP, UDP, SCTP, RTP, DCCP, etc.
  Internet         (L3): IP, ICMP, IGMP, OSPF, BGP, etc.
  Host-to-network  (L2): PPP, Ethernet, WiFi, etc.

Communication between layers
[figure: two full OSI stacks side by side, peer layers facing each other]
• A layer communicates
  – logically with the layer of the same level on the other side
  – using the services of the layer below
  – to offer a service to the layer above

Logical vs physical communication
[figure: a full application/transport/network/link/physical stack on each
end host, and network/link/physical stacks on the routers in between;
the transport layers talk logically end to end, while each hop below is
a physical, point-to-point exchange]
• Logical: end to end
• Physical: point to point

Layers close to the user
• Application layer
  – requests for Web pages and responses with the content; voice, video
    and interaction (pause, play); security
• Presentation layer
  – Data encoding, security, compression, ...
• Session layer
  – Synchronization between applications, logical connections, error
    recovery

End-to-end layers
• Transport layer
  – end-to-end reliability
  – end-to-end congestion control
  – multiplexing/demultiplexing of applications
• Network layer
  – end-to-end routing
  – global addressing
  – multiplexing/demultiplexing of transport protocols

Point-to-point layers
• Link layer
  – Sharing of the medium
  – Detection of frame boundaries
  – Error detection, correction and retransmission
• Physical layer
  – Transmission and coding of the signal

Layering
• Encapsulation/decapsulation manages the interplay of the different
  layers
[figure: at the source, the message M goes down the stack and grows one
header per layer: M (message), Ht|M (segment), Hn|Ht|M (datagram),
Hl|Hn|Ht|M (frame); at the destination it goes back up the stack and is
stripped symmetrically]

Layering: encapsulation
– Each layer n receives data (the protocol data unit PDU(n+1)) from the
  upper layer n+1; at level n, this is a service data unit SDU(n)
– It adds a header (PCI(n), protocol control information) carrying the
  control information proper to its own level
– It passes the new data unit PDU(n) to the lower layer
[figure: the source-side encapsulation diagram, as above]

Layering: decapsulation
– Each layer n receives a data unit (PDU(n), protocol data unit) from
  the lower layer n-1
– It interprets the header (PCI(n), protocol control information) and
  the control information proper to its own level
– It passes the new data unit (SDU(n), service data unit) to the upper
  layer
[figure: the destination-side decapsulation diagram, as above]
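
To make the PDU/SDU/PCI bookkeeping concrete, here is a minimal, purely illustrative Python sketch of encapsulation and decapsulation (string "headers" stand in for real binary PCIs; the layer names are assumptions for the example):

    # Illustration only: real headers are binary structures, not strings.
    LAYERS = ["transport", "network", "link"]    # top to bottom, below the app

    def encapsulate(message: bytes) -> bytes:
        pdu = message                            # PDU(n+1) seen as SDU(n)
        for layer in LAYERS:
            pci = f"[{layer}-hdr]".encode()      # PCI(n): this layer's header
            pdu = pci + pdu                      # PDU(n) = PCI(n) + SDU(n)
        return pdu                               # frame given to the physical layer

    def decapsulate(frame: bytes) -> bytes:
        sdu = frame
        for layer in reversed(LAYERS):           # strip the link header first
            pci = f"[{layer}-hdr]".encode()
            assert sdu.startswith(pci)           # interpret and remove PCI(n)
            sdu = sdu[len(pci):]                 # SDU(n) passed to the layer above
        return sdu

    assert decapsulate(encapsulate(b"M")) == b"M"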

Remark
• Protocol layers: in principle
  – Division of tasks, avoiding duplicated functionality
  – Each layer offers well-defined functions to the layers above
    (using the functions of the layers below)
• Protocol layers: in practice
  – Functionality may be replicated:
    • integrity checks at levels 2 (data link), 3 (network) and
      4 (transport)
  – Layering violations occur (cross-layer):
    • the TCP checksum covers fields of the IP header (pseudo-header),
    • the IP checksum may be computed by the Ethernet card,
    • PHY-coding violations are used to delimit frames,
    • ...

Internet: what type of network?
[taxonomy: networks divide into circuit switching (FDM, TDM) and packet
switching; packet switching divides into virtual-circuit networks and
datagram networks. The Internet is a datagram network at the network
layer (IP), while at the transport layer TCP offers a virtual-circuit
service and UDP a datagram service]

Internet: what type of network?
Definitions:
• Signaling
  – The exchange of information concerning the control and management of
    a telecommunication network
• Switching
  – The process of interconnecting resources for the time necessary for
    the communication
• Transmission
  – The process of transferring information from one point of the
    network to one (or more) other point(s)

Circuit switching
– Communication requires a preliminary signaling phase (circuit
  establishment through signaling)
– In this phase, resources (frequencies, time slots, etc.) are allocated
  in every piece of equipment traversed by the data flow
– The switching stays the same throughout the transmission, and is
  performed on the basis of a circuit identifier
– The data follow the same path (the circuit), fixed at connection
  establishment, for the whole communication
– The resources are used exclusively by the two partners until the
  service ends (circuit release through signaling)
• Examples
  – The access segment of the Global System for Mobile communication
    (GSM)
  – The Public Switched Telephone Network (PSTN), known in France as the
    Réseau Téléphonique Commuté (RTC)
  – ISDN, SONET/SDH
[figure: call walk-through between two stacks: 1. initiate call,
2. incoming call, 3. accept call, 4. call connected, 5. data flow
begins, 6. receive data]

Packet switching
– Datagram mode
  • Each packet carries the control information needed for its
    forwarding
  • The switching function is instantaneous, for each packet
  • No resource is reserved, and no signaling is needed
  • In return, no performance is guaranteed
– Virtual-circuit mode
  • Connection-oriented transmission in packet networks
• Examples
  – datagram mode: the IP network layer and the UDP transport layer
  – virtual-circuit mode: the TCP transport layer; X.25, Frame Relay,
    ATM, MPLS at the link layer
[figure: two stacks: 1. send data, 2. receive data]

Packet switching: delay
• Packets incur delay at each hop
• Four sources of delay:
  – Processing (< µs)
  – Queueing (µs to s)
  – Transmission = P/C (µs to ms), where P = packet size and
    C = link capacity
  – Propagation = D/V (LAN: < ms, WAN: > 10 ms), where D = distance and
    V = signal propagation speed
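
A worked example of the two formulas above, with illustrative values (assumptions for the example, not measurements):

    # Per-hop delay components for a 1500-byte packet on a 100 Mb/s WAN link.
    P = 1500 * 8          # packet size in bits
    C = 100e6             # link capacity: 100 Mb/s
    D = 1000e3            # distance: 1000 km
    V = 2e8               # signal speed in fiber/copper: ~2/3 of light speed (m/s)

    transmission = P / C  # time to push all bits onto the wire: 120 us
    propagation = D / V   # time for one bit to travel the link: 5 ms
    print(f"transmission {transmission*1e6:.0f} us, "
          f"propagation {propagation*1e3:.1f} ms")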

Recap: circuits vs packets
• Circuit switching:
  – the switching function allocates resources to the users who wish to
    be put in communication
  – one (or more) circuit is set up for them through a signaling phase
  – releasing the resources requires explicit signaling
  – Guaranteed performance
  – No queueing
  – Potential waste of resources
• Packet switching:
  – the switching function allocates no resources
  – all resources are always available for packet transmission
  – packets wait in queues for the resources to free up
  – No guarantees, variable delay
  – More efficient and flexible use of resources

Recap: circuits vs virtual circuits
• Similarities
  – Connection-oriented transmission mode
  – In both cases, the data are delivered in the same order in which
    they were transmitted
  – Signaling is needed during connection establishment and release
• Differences
  – Circuit switching guarantees performance (e.g., throughput, delay),
    since the resources are allocated for exclusive use
  – A virtual circuit does not guarantee performance, since the
    resources are shared to benefit from statistical multiplexing

Internet: who does what?
[figure: host, router and host stacks side by side]
• Applications: starting from the next RES224 lecture
• Transport and network: overview here, details in RES223, RES343, ...

Transport and network layers
• Transport layer
  – TCP, UDP, SCTP, DCCP, ...
• Network layer
  – Routing protocols
    • path selection, at several levels
    • RIP, OSPF, BGP
  – The IP protocol
    • addressing, forwarding
    • datagram format, packet handling
  – The routing table
  – The ICMP protocol
    • error reporting, signaling

Transport: TCP and UDP
• TCP
  – Virtual-circuit service: connection-oriented, reliable and in-order
    delivery
  – Imposes the emission rate (cf. next slide) and the message size
    (preferably MSS + TCP header = MTU)
  – Introduces an extra initial delay tied to circuit establishment
    (3-way handshake)
• UDP
  – Datagram service: unreliable, connectionless
  – Flexibility in the bitrate and in the size of the messages to send
    (limited by the MTU of the IP layer)
[figure: TCP timeline: connection establishment (SYN, SYN+ACK), taking
one round-trip time (RTT); data transfer (ACK + GET, Data + FIN);
connection close (FINACK)]

Transport: TCP and UDP
• TCP data transfer
  – At any time, the emission rate is tied to the sliding window
    w = min(cwnd, rwnd)
  – rwnd = receiver window: flow control (avoid transmitting more than
    the receiver can store)
  – cwnd = congestion window: congestion control (avoid transmitting
    more than the network can sustain)
• Dynamics of cwnd
  – Algorithms: slow start, congestion avoidance, fast recovery,
    fast retransmit, timeout, etc.
  – Slow start: exponential growth, starting from 1 segment per RTT
    (up to a threshold ssthresh); cwnd++ at each ACK
  – Congestion avoidance: linear growth, starting from ssthresh;
    cwnd += 1/cwnd at each ACK
[figure: cwnd over time (in RTTs): slow start, then congestion
avoidance; packet losses trigger fast recovery or, after a timeout, a
new slow start]
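
A minimal sketch of these two growth rules, under idealized assumptions (a single loss-free flow, the window counted in segments, one full window ACKed per RTT):

    # Idealized cwnd evolution: doubling in slow start, +1 segment per RTT
    # in congestion avoidance (ssthresh = 16 is an arbitrary example value).
    def cwnd_evolution(rtts: int, ssthresh: float = 16.0):
        cwnd = 1.0
        for _ in range(rtts):
            yield cwnd
            if cwnd < ssthresh:
                cwnd *= 2        # slow start: cwnd++ per ACK doubles it per RTT
            else:
                cwnd += 1        # congestion avoidance: cwnd += 1/cwnd per ACK,
                                 # i.e. about one extra segment per RTT

    print(list(cwnd_evolution(8)))   # [1, 2, 4, 8, 16, 17, 18, 19]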

IP forwarding
• For each packet
  – Decide on which link it must be forwarded
  – Based exclusively on the destination address
  – Lookup in forwarding tables
    • Longest prefix matching
    • Ternary Content Addressable Memory (TCAM)
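
A toy illustration of longest-prefix matching (the table entries are invented for the example; real routers use tries or TCAMs rather than a linear scan):

    import ipaddress

    TABLE = [                                        # (prefix, next hop)
        (ipaddress.ip_network("0.0.0.0/0"), "default-gw"),
        (ipaddress.ip_network("137.194.0.0/16"), "if-campus"),
        (ipaddress.ip_network("137.194.160.0/19"), "if-lab"),
    ]

    def lookup(dst: str) -> str:
        addr = ipaddress.ip_address(dst)
        matches = [(net.prefixlen, hop) for net, hop in TABLE if addr in net]
        return max(matches)[1]                       # longest matching prefix wins

    print(lookup("137.194.164.1"))                   # -> if-lab (/19 beats /16)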

IP routing
• Exchange of information between routers
  – Goal: building the routing tables
  – Taxonomy:
    • Link state: propagation of local topological information to all
      nodes of the network
    • Distance vector: propagation of global information to the
      neighbors only
[figure: distance vector vs link state]

Inter-AS and intra-AS routing
• AS = Autonomous System (cf. Structure of the Internet)
[figure: three interconnected ASes A, B, C, with border routers A.a,
A.c, B.a, C.b acting as gateways]
• Gateways:
  – run inter-AS routing among themselves
  – run intra-AS routing with the other routers of their AS

Structure of the Internet
• Taxonomy by geographic span of the networks
  – Wide Area Networks (WAN): countries, continents
  – Metropolitan Area Networks (MAN): cities, local authorities
  – Local Area Networks (LAN): from a single room to a campus
  – Personal Area Networks (PAN): around the individual (within their
    sphere of action), sensors (limited communication)
• Taxonomy by interconnection hierarchy
  – Entities = Internet Service Providers (ISPs)
  – 3-tiered architecture (unofficial but common)
    • Tier-1 ISPs: connect to all the other Tier-1 ISPs (full-mesh
      topology)
    • Tier-2 ISPs: connect (typically) to several Tier-2 ISPs
    • Tier-3 ISPs: connect to a Tier-2 ISP

Structure of the Internet
• Tier-3 (local ISPs)
  – Connect to the regional ISPs
• Tier-2 (regional ISPs)
  – Connect to the NSPs
• Tier-1 (National/International Service Providers, NSPs)
  – aka NBPs (National Backbone Providers), e.g. BBN/GTE, Sprint, AT&T,
    IBM, UUNet
  – Exchange traffic as equals (peering relationship, without generating
    transmission costs)
  – Interconnect their networks privately or via a public facility
  – NSPs must be interconnected through NAPs (Network Access Points)
[figure: local ISPs attached to regional ISPs, themselves attached to
NSP A and NSP B, which interconnect at NAPs]

Structure of the Internet
• Tier-2 ISPs connect to Tier-1 ISPs
  – A customer-provider relationship, which generates forwarding costs
[figure: Tier-2 ISPs around three Tier-1 ISPs and a NAP; a Tier-2 pays
to access the Internet via a Tier-1; Tier-2 ISPs also peer with each
other]

Structure of the Internet
• ... and so on
[figure: local (Tier-3, possibly virtual) ISPs attached to Tier-2 ISPs,
as customers of the higher-level ISPs for their connectivity to the
Internet]

Structure of the Internet
On its end-to-end journey, a packet thus traverses several ASes, and
several routers within each AS
[figure: the same tiered topology, from local ISPs through Tier-2 and
Tier-1 ISPs]

References
• Organizations
  – http://www.ietf.org
    • IETF (Internet Engineering Task Force)
  – http://www.isoc.org
    • Internet Society
  – http://www.w3c.org
    • World Wide Web Consortium
  – http://www.ieee.org
    • IEEE (Institute of Electrical and Electronics Engineers)
  – http://www.acm.org
    • ACM (Association for Computing Machinery)
• Optional reading
  – "A brief history of the Internet", by those who made the history
    http://www.isoc.org/internet/history/brief.shtml
2.2 Client-server applications
RES 224
Architecture des applications Internet
Internet applications

Dario Rossi
http://www.enst.fr/~drossi
v250811

Plan
• Terminology and scope of RES224
  • Applications and protocols
• Application architectures vs network architectures
  • Application-level: client-server, content distribution networks,
    peer-to-peer
  • Network-level: IP multicast, content-centric networking
• The application layer and the TCP/IP layers
  • Interactions
  • Interface (sockets)
• Application types and performance measures
  • Data-oriented vs multimedia applications (voice, video)
  • Quality of Service (QoS) vs Quality of Experience (QoE)
  • Choice of the transport protocol
• References

What are applications?
[figure: a user, routers across AS1 and AS2, a billing server and a
video server, each running the full protocol stack]
• The raison d'être of computer networks
• They come in many types:
  – remote access, email, file transfer, newsgroups, discussion forums,
    Web, multimedia, telephony, videoconferencing, audio and video on
    demand, etc.
• Waves of popularity:
  – Text mode (1980s), Web (1990s), P2P (2000s), Social (2010s)
• Often: software distributed across several systems
  – Communication between the applications

Applications: an overview
• "Application" is an abused or imprecise term
  – Is the messaging application your favorite mail client?
• We will define several aspects that are common to all applications
  – Architecture
  – Protocol
  – Processes: agents and daemons
  – Inter-process communication: sockets
[figure: a host running a user agent and a server running a daemon, each
behind a socket, linked by the protocol within an architecture]

Network applications: processes
• The communicating entities are processes, i.e.
  – application programs that execute on a host
  – defined by the operating system
• Two processes on the same host communicate through inter-process
  communication (INF courses)
• Two processes executing on two different hosts communicate by
  exchanging application messages
  – with an application-layer protocol (RES224)
• These processes can play different roles, depending on the
  architecture
• The sending and receiving processes assume that a transport
  infrastructure exists underneath (RES223)

Protocols
Humans use protocols all the time... an example with Alice and Bob:
  "Bonjour" / "Bonjour" / "Got a light?" / ...
Human protocols:
• Emission of specific messages
  – "What time is it?", "Please", "Yours sincerely", ...
• Specific actions taken after receiving particular messages or events

Application protocols
• Define:
  – The types of messages
    • request, response, ...
  – The syntax of the message types:
    • the format of the message fields
  – The semantics of the fields
    • the meaning of the fields
  – The rules of communication
    • determine when and how a process sends messages and answers them
• Application protocols can be
  – public domain
    • defined in IETF RFCs (e.g., HTTP, SMTP, client-server)
    • defined in other forums (e.g., Gnutella, BitTorrent BEPs, etc.)
  – proprietary
    • e.g., KaZaA, Skype, IP telephony

Application architectures
• Terminology
[figure: service consumers, reached through local/regional networks,
communicate logically at L4/L7 with the service producer's servers over
the physical infrastructure of the Internet]

Architectures: client-server
• The client-server paradigm
  – Scalability problems

Architectures: client-server
• The client side of one end system communicates with the server side of
  another system
• The server must serve all the requests of many clients (load problems)
• Examples
  – Web:
    • Web browser = client side of HTTP
    • Web server = server side of HTTP
  – Email:
    • sending server = client side of SMTP
    • receiving server = server side of SMTP
Internet applications typically rely on two entities: the client and the
server

Architectures: client-server
Client:
• Requests a service
• Initiates the contact with the server (active open of the connection)
• Implemented in user software
• Web browser, mail reader, ...
Server:
• Offers services
• Listens for service requests
• Implemented in a daemon
• The Web server, the SMTP server, ...
Remarks:
• Client-server applications can be centralized
• The same machine can implement the client and the server sides at the
  same time
Internet applications typically rely on two entities: the client and the
server

Architectures: CDN
• Content distribution network (CDN)
  – A costly infrastructure

Architectures: CDN
• Content distribution networks
  – The business of replicating content as close as possible to the user
  – Goal: avoid the customer-provider transit cost between ASes, or pay
    it only once
• Examples
  – YouTube videos (pre-Google era), Microsoft patches, antivirus
    updates
  – 28 commercial CDN services in 2011; among the best known: Akamai,
    Limelight
[figure: a client served by the nearest of several CDN replicas]

Architectures: multicast, CCN
• IP multicast, Content-Centric Networking (CCN)
  – Complex (IP multicast is used for TV over IP)
  – See RES223, RES343, ...

Architectures: peer-to-peer
• The P2P paradigm
  – Application layer (easy deployment)
  – Uses all the resources (scalable)
  – Resources at the periphery (shifts costs)

Architectures: peer-to-peer
• Peer
  – Plays the role of client and server at the same time
    • as a client, it requests services from other peers
    • as a server, it provides services to other peers
• Remarks:
  – Servers are dedicated machines, set up by a service provider
    • A client-server application can assume that the servers are always
      available (barring failures)
  – Peers are user machines, which therefore provide no service
    guarantee
    • Applications must assume that peers are not always available
    • Peer-2-peer applications are hence strongly distributed

Architectures: peer-to-peer
• The P2P paradigm
[figure: the P2P overlay graph built on top of the IP underlay]

L7 overlay vs IP underlay interaction
• Internet applications (L7 in the OSI model) sit on top of the
  underlying TCP/IP infrastructure
[figure: the L7 overlay on top of the IP underlay; the L7/L3 interaction
between applications and the TCP/IP layers]

L7 overlay vs IP underlay interaction
• Control of the emitted traffic ("network friendly" when it competes
  fairly with TCP)
  – An L4-type problem, with solutions other than TCP (e.g., Skype,
    BitTorrent, etc.)
[figure: the L7 overlay on top of the IP underlay]

L7 overlay vs IP underlay interaction
• Application-level routing of the traffic ("network aware" when the
  traffic is kept as local as possible within the ISP)
  – An L3-type problem, with solutions other than IP, RIP, OSPF and BGP
[figure: overlay links crossing between AS1 and AS2, where transit costs
($$) arise]

Sockets: the TCP/IP application interface
• Socket: an application program interface (API)
  – The entry point of network data into an application process
  – The interface between the application layer and the TCP/IP layers
• The developer controls the application side of the socket
  – they have little control over the transport side: choosing the
    protocol and tuning a few parameters
[figure: process / socket / TCP/IP / driver / NIC on both hosts, linked
through the Internet; the application part is controlled by the
developer (software), TCP/IP by the OS (software), and the NIC (network
interface card) by its manufacturer (hardware)]
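
A minimal sketch of the socket API seen from the application side (a one-shot exchange on localhost in a single Python process; the port number 5224 is an arbitrary choice for the example):

    import socket

    # Server side (daemon): passive open, listens for requests.
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # transport = TCP
    srv.bind(("127.0.0.1", 5224))                            # IP address + port
    srv.listen(1)

    # Client side (user agent): active open towards the server.
    cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    cli.connect(("127.0.0.1", 5224))       # 3-way handshake done by the kernel
    conn, peer = srv.accept()

    cli.sendall(b"hello")                  # the application only sees bytes;
    print(conn.recv(1024))                 # everything below is handled by the OS
    cli.close(); conn.close(); srv.close()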

Sockets: the TCP/IP application interface
• Addresses
  – Human point of view: a URL, e.g. http://www.google.com
    • translated by a directory service (the DNS); a focus of RES224,
      details later
  – Network point of view (focus of RES223, recalled later):
    • L3: IP addresses
    • L4: port numbers (rules for port numbers: /etc/services)
• A socket is identified by:
  – the type of transport protocol
  – the IP address of the machine
  – the TCP address of the process (port number)
• Remark
  – The 5-tuple (IPs, Ps, IPd, Pd, proto) is unique in the network
[figure: http://www.google.com resolved to 74.125.39.180; a client
socket at 137.192.164.1:37562 talks to the server socket at
74.125.39.180:80]

Addressing: the transport point of view
• Port number: a 16-bit integer (=65K values), with a double function
• Multiplexing (ephemeral numbers > 32K)
  – Distinguishes among the local processes to which the message
    contained in a received IP datagram must be delivered
  – Example: a client downloading two files in parallel from the same
    HTTP server does not want the packets to get mixed up at the TCP
    level!
• Resolution (reserved numbers < 32K)
  – Allows simple and deterministic addressing at the transport level
  – Well-known port numbers: RFC 1700, /etc/services
    • e.g., 80 for the Web server, 53 for the DNS, etc.
  – Example: once the IP address of a Web server is resolved, the port
    of the HTTP service is already known (thus avoiding a second
    resolution)
• Remarks
  – A new port number is assigned to each new application by IANA (an
    assignment that is generally followed)
  – A Web server may run on another port (e.g., 8080), and a P2P
    application may run on an arbitrary port (e.g., 80, 53)
  – P2P applications do not/no longer use standard port numbers (e.g.,
    to hide themselves)
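
On most systems, the well-known mapping of /etc/services can be queried through the standard socket API, for example:

    import socket

    print(socket.getservbyname("http", "tcp"))    # -> 80
    print(socket.getservbyname("domain", "udp"))  # -> 53 (the DNS)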

Processes: agents and daemons
User agent
• Implements the client side of one or more application protocols
  – Implementation of the client side of the HTTP protocol: the process
    that sends and receives HTTP messages via a socket
  – Implementation of the user interface (display + navigation +
    plugins + etc.), which goes beyond the protocol and the scope of
    RES224
• Performs the active open of the TCP connection (cf. RES223)
• (Necessarily) uses an ephemeral port
Examples:
• DNS
  – nslookup, dig
• Web
  – Browsers: Mozilla, Chrome, IE
  – Crawlers: wget, ...
    • fetch a Web site, or part of a site, build a mirror, ...
Daemon
• Implements the server side of an application protocol
• Listens for requests coming from user agents (passive open)
• (Generally) uses a reserved port
Examples
• DNS
  – Berkeley Internet Name Domain (BIND)
• Web
  – Apache, Microsoft Internet Information Server
  – Squid (proxy)

Sockets: which transport protocol?
• TCP and UDP provide very different services
  – (cf. RES223/RES240)
• Which one to choose?
  – Study the services provided by each protocol
  – Study the type of content of the application
  – Select the protocol that best matches the needs of the application
• Many transport protocols exist
  – TCP: Transmission Control Protocol
  – UDP: User Datagram Protocol
  – RSVP: Resource reSerVation Protocol
  – SCTP: Stream Control Transmission Protocol
  – DCCP: Datagram Congestion Control Protocol
  – T/TCP: Transaction TCP
  – RUDP: Reliable UDP
  – etc.

Sockets: which transport protocol?
• The type of flow is tied to the type of content...
  – Data: Web, e-mail, downloads, ...
  – Multimedia: video (on demand, broadcast), telephony, ...
• ... with different requirements for its transport!
[figure: packet size (bytes) vs packet number. HTTP flow: size
determined by TCP, trains of MSS-sized packets followed by ACKs. Skype
flow: size determined by the (variable-bitrate) voice encoder]

Sockets: which transport protocol?
• The type of flow is tied to the type of content...
  – Data: Web, e-mail, downloads, ...
  – Multimedia: video (on demand, broadcast), telephony, ...
• ... with different requirements for its transport!
[figure: inter-arrival time (s) vs packet number. HTTP flow: time
determined by the RTT (~20 ms) between two trains, back-to-back packets
(~0 ms) within a train. Skype flow: time determined by the
(variable-bitrate) voice encoder]

Sockets: which transport protocol?
Quality of Service (QoS) (network-centric)
• Bandwidth (throughput)
  – bits carried per unit of time
• Loss probability (loss rate)
  – probability that a packet arrives with detectable but uncorrectable
    errors (wireless)
  – probability that a packet is dropped at a queue (wired)
• Delay
  – time before the packet arrives
• Jitter
  – variability of the delay
These are easy to measure and objective, but say little about the
applications.
Quality of Experience (QoE) (user-centric)
• Data-oriented
  – Completion time
  – Reactivity for interactive applications
  – Reliability
  – ...
• Multimedia
  – Peak Signal-to-Noise Ratio (PSNR)
  – Mean Opinion Score (MOS)
  – Structural Similarity (SSIM)
  – Perceptual Evaluation of Video Quality (PEVQ)
  – ...
These are complex and costly, with no worldwide agreement.

Data-oriented flows
• Examples
  – Web, downloads, interactive sessions, e-mail
• Primary need: reliability
  – The information must be transmitted without errors
  – In case of error, it must be retransmitted
  – The information must arrive in the same order!
  – They use the available bandwidth (elastic applications)
• Other characteristics:
  – No real-time constraint
    • A 3-second delay before the display starts is the typical limit
      for a Web-type service
    • Remote text editing is much more interactive (terminals, Google
      documents)
  – Downloading a file can take long
    • ... but it should be as short as possible

Streaming flows
• Example:
  – Multimedia content
  – Audio (radio) or video (television) generated in real time
• Primary need: temporal coherence
  – Neither speed up nor slow down!
  – Keep the bandwidth constant
• Remark:
  – Estimate or know the network (end-to-end) delay
  – Introduce a delay before starting playback (playout buffer)
[figure: packet sequence number vs time at the source and at the
receiver; packets arriving after their deadline freeze the stream; the
playout buffer shifts the deadlines by the playout delay]
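
A minimal sketch of the fixed playout-buffer logic (all the timing values are illustrative assumptions): packet i is generated at i*T, is due for playback at i*T + playout_delay, and is lost for playback if it arrives later than that.

    T = 0.020                # one packet every 20 ms at the source
    playout_delay = 0.100    # playback runs 100 ms behind the source

    network_delay = [0.040, 0.080, 0.130, 0.060]   # per-packet one-way delays

    for i, d in enumerate(network_delay):
        arrival = i * T + d
        deadline = i * T + playout_delay
        status = "played" if arrival <= deadline else "late (stream freezes)"
        print(f"packet {i}: arrival {arrival*1e3:.0f} ms, "
              f"deadline {deadline*1e3:.0f} ms -> {status}")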

Voice flows
• Digital transmission
  – Sampling, quantization and coding before transmitting
• Pulse Code Modulation (PCM)
  – Sampling of the signal over a 4 kHz spectrum (by Nyquist, 8000
    samples per second)
  – Quantization over 256 levels (8 bits per sample)
  – A flow of 1 byte every 125 µs (64 kbit/s of bandwidth)
• Primary need
  – avoid long and/or highly variable delay
  – The signal is slow, and fairly robust to losses
• Possible improvements
  – Silence suppression (or silence description, to reduce unnatural
    silence)
  – Difference encoding (to reduce the quantization bits)
  – "Lossy" compression (suppression of the less important frequencies)
[figure: a sampled waveform, with samples coded as 01001100,
11001111, ...]
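
The PCM arithmetic above, spelled out (standard telephony figures):

    bandwidth_hz = 4000                  # voice band
    sample_rate = 2 * bandwidth_hz       # Nyquist: 8000 samples/s
    bits_per_sample = 8                  # 256 quantization levels
    bitrate = sample_rate * bits_per_sample
    print(bitrate)                       # 64000 bit/s = 64 kbit/s
    print(1 / sample_rate)               # 0.000125 s = 125 us per 1-byte sample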

Voice flows
• Type of flow
  – No need for large bandwidth
    • PCM: 64 kbit/s; iLBC (Skype): 13.3 kbit/s; GSM: 13.3 kbit/s
  – Small packet size (80 bits to 1 kbit)
  – Equally small inter-packet time (< 100 ms)
• Characteristics
  – Need to bound the end-to-end delay
    • "Buffering" techniques cannot be used
    • Maximum tolerable "mouth-to-ear" delay: 300 ms
      (45 ms if there is no echo suppression)
  – Robust against losses
    • quality is more than acceptable with 15% loss
    • Less robust when compression is used
      (no redundant information is left)

Video flows
• Constraints similar to voice, but with a significant bitrate
  – TV: 720 x 576 pixels
  – 16 bits of color depth
  – 25 frames per second
  – a 166 Mb/s flow if uncompressed
• Compression is therefore very important
  – MPEG1: VCD quality, 1-2 Mbps
  – MPEG2: DVD quality, 5-10 Mbps
  – MPEG4: HDTV quality, 20-60 Mbps
• Idea: exploit spatial and temporal redundancy
• MPEG coding
  – I-frames: "intra-coded", a complete JPEG picture (~1 per second)
  – P-frames: "predictive", a delta with respect to the previous frame
  – B-frames: "bidirectional", differences from the previous and the
    next frame!
  – D-frames: "DC-coded", used for fast-forward (~1 per second)
[figure: frame sequence I BB P BB P BB I]
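
The uncompressed TV bitrate above, spelled out:

    width, height = 720, 576    # pixels
    depth = 16                  # bits of color depth per pixel
    fps = 25                    # frames per second
    bitrate = width * height * depth * fps
    print(bitrate / 1e6)        # ~165.9 Mb/s: the "166 Mb/s if uncompressed"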

Other types of flows exist
• Online gaming
  – Very delay-sensitive (a few hundred ms, end to end)
  – Typically modest bitrate
• Multimedia flows
  – Carry different types of streams (video, voice in several languages,
    subtitles, ...)
  – Also require managing synchronization
• Distributed jam sessions
  – Real-time playback (50 ms maximum, preferably < 20)
  – Potentially low bitrate (MIDI) or high bitrate (6 Mbps sequencing
    @ 192 kHz, 32 bit)
  – Example applications: Musigy, eJamming
• Other flows
  – Not associated with a specific service
    • yet all applications depend on their operation!
  – Network control flows
    • control information, routing, etc.
    • operation and management, link verification, ...
  – Application services that are "transparent"
    • DNS name translation
    • call signaling and notification
    • security setup

Applications, QoS and transport

Application             Losses         Bandwidth              Delay-sensitive
File transfer           loss-free      elastic                No
e-mail                  loss-free      elastic                No
Web                     loss-free      elastic                No
Real-time audio/video   loss-tolerant  audio: 5Kb-1Mb,        Yes, < 100's ms
                                       video: 10Kb-5Mb
Stored audio/video      loss-tolerant  same                   Yes, a few s
Interactive games       loss-tolerant  a few Kbps             Yes, 100's ms
Instant messaging       loss-free      elastic                Yes and no

Application             Application protocol        Transport protocol
e-mail                  SMTP [RFC 821]              TCP
Remote access           telnet [RFC 854]            TCP
Web                     HTTP [RFC 2068]             TCP
File transfer           FTP [RFC 959]               TCP
Streaming multimedia    proprietary (RealPlayer)    TCP or UDP
IP telephony            proprietary (Skype)         generally UDP

But TCP is used more and more, to get around the security restrictions
(e.g., firewalls) that are put in place.

In the rest of the course...

Service          Protocol         Transport                   Remark
Addressing       DHCP, DNS        data-oriented, yet UDP      speed before
                                                              reliability
eMail            SMTP/POP+IMAP    TCP
Data access      HTTP, FTP        TCP
VoD              YouTube          bang-bang over TCP          L7/L4 interaction
P2P VoIP         Skype            UDP, TCP if necessary (*)   L7/L4 interaction
P2P BitTorrent   BitTorrent       UDP since 2010 (*)          L7 control
(*) often both
RES 224
Architecture des applications Internet
Addressing: DNS and DHCP

Dario Rossi
http://www.enst.fr/~drossi
v250811

Plan
• Introduction
• The DNS name space
  – Names and labels
  – Root and TLDs
  – Zones and domains
• The DNS information base
  – Resources
  – Example database
• DNS architecture and protocol
  – Name resolution
  – Query/response format
  – Resolution examples
• Overview of the DHCP protocol
  – Finite state machine
  – Message format
• References
2.3 Addressing: DNS and DHCP

Domain Name System (DNS)
Introduction
• Layer 2 physically links equipment, on the basis of a physical (MAC)
  address
• Layer 3 logically links equipment, on the basis of a logical (IP)
  address
• Layer 4 links applications to one another, on the basis of an
  identifier (the port)
• Layer 7, and hence the users, reference equipment by a name
  – A symbolic name, "intuitive or easy"
  – Simpler to remember than an address

Introduction
• The Internet, a large, dynamic network
  – Many addresses in the network
  – Dynamic addresses and static addresses
• A symbolic name is associated with each IP address
  – More "pleasant" to memorize than a number
  – Names must be unique
  – Correspondence tables are needed (example: /etc/hosts, HOST.TXT)
• (IP, name) resolution
  – ARPANET: periodic, centralized updates of the base (every night),
    then re-distribution to all hosts
  – Problem: the Internet grows
  – Solution: a distributed table
• A distributed table
  – Extension of the notion of address to that of resource
    • resources corresponding to addresses
  – Organization of the name space into domains
    • each domain manages its own mappings

Organizations
• Internet Corporation for Assigned Names and Numbers (ICANN)
  – Manages names, addresses, port identifiers, etc.
• Internet Assigned Numbers Authority (IANA)
  – Carries out ICANN's technical work
• Regional Internet Registries (RIRs)
  – Delegation of IANA's work
  – RFC 1466 defines the rules for allocating IP addresses to the RIRs
    and LIRs
• The RIRs
  – APNIC (Asia Pacific Network Information Centre)
  – ARIN (American Registry for Internet Numbers), in charge of the
    Americas and part of sub-Saharan Africa
  – RIPE NCC (Réseaux IP Européens - Network Coordination Centre), in
    charge of Europe, the Middle East, the ex-USSR and the part of
    Africa not covered by ARIN
  – AfriNIC (African Regional Network Information Centre)
  – LACNIC (Latin American and Caribbean IP address Regional Registry)

Domain Name System (DNS)
• The components of the DNS
  – A name space
  – A database (specification of resources)
  – A protocol to query the database
  – An architecture: resolver (client) and server
  – Client and server configuration
• Several RFCs describe the Internet directory:
  – RFC 883 and 884, later 1034 and 1035
    • 1984: Paul Mockapetris defines the DNS
  – RFC 1032, 1033, 1034, 1035, 1101, 1122, 1123, 1183, 1713, 1794,
    1912, 1995, 1996, 2010, 2136, 2137, 2181, 2308, 2317, 2535-2541,
    2606, etc.
• Optional readings: RFC 1034 and RFC 1035

Name space
• A tree structure with a root
  – The root is represented by a dot "."
• A node represents a domain
  – A label is associated with each node of the tree
  – The information associated with each node of the domain is contained
    in a database and managed by that node
    (this is actually simplistic... more details later)
  – Each node can have children, whose labels must differ (to guarantee
    uniqueness)
• Label
  – Made of alphanumeric characters and "-"
    • This is in fact a practicality limitation: 255 values would be
      possible... but they are not on the keyboard!
  – The maximum size of a label is 63 characters
  – Case-insensitive
    • Again, common practice rather than a technological limitation
[figure: example tree: root "." with children it and fr; under fr,
labels x and z; under z, label y]

Name space
• Domain name
  – The concatenation of the labels on a given path
    • separated by dots "." from the human point of view
    • a sequence of (length, label) pairs from the point of view of
      network messages
  – The size of a domain name is at most 255 characters
    • The depth of the tree is limited by this size
• Complete domain name
  – Fully Qualified Domain Name (FQDN)
  – must be unique in the network (no ambiguity in resolution)
  – is composed of all the labels from a leaf of the tree up to the root
    • By default, the DNS appends the local domain name (e.g., www
      becomes www.enst.fr.)
    • The final "." is often omitted (but www.enst.fr becomes
      www.enst.fr.)
• Except for the first one, the choice of labels is left to the needs of
  the users
  – www.enst.fr. is chosen so that the name of the ENST machine hosting
    a Web application can be found intuitively
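
A minimal sketch of the (length, label) wire encoding mentioned above (cf. RFC 1035; compression pointers are omitted for simplicity):

    def encode_name(fqdn: str) -> bytes:
        out = b""
        for label in fqdn.rstrip(".").split("."):
            assert len(label) <= 63                  # maximum label size
            out += bytes([len(label)]) + label.encode("ascii")
        return out + b"\x00"                         # the root: an empty label

    print(encode_name("www.enst.fr.").hex(" "))
    # -> 03 77 77 77 04 65 6e 73 74 02 66 72 00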

Name space
[figure: the name tree: root "."; TLDs edu, mil, gov, com, net, int,
org, fr, ca, ch, ...; e.g. ddn and nic under mil, ibm and hp under com,
enst under fr; under enst: infres, rms, tsi, siav; www leaves under
several domains]

Root
• Root servers
  – By definition, at the top of the hierarchy
  – The entry points into the global resolution system
  – They play a "rerouting" role in resolutions
  – They know all the other root servers
  – They know all the TLDs
  – [a-m].root-servers.net
  – Try it:
    • dig @a.root-servers.net b.root-servers.net
• Reference sites
  – http://www.root-servers.org/
  – http://www.publicroot.net/hint.txt
[figure: location of the root servers: a NSI Herndon VA; b USC-ISI
Marina del Rey CA; c PSInet Herndon VA; d U Maryland College Park MD;
e NASA Mt View CA; f Internet Software C. Palo Alto CA; g DISA Vienna
VA; h ARL Aberdeen MD; i NORDUnet Stockholm; j NSI (TBD) Herndon VA;
k RIPE London; l ICANN Marina del Rey CA; m WIDE Tokyo]
Root

Lettre  Ancien nom         Organisation             Ville / État                                    Région (continent)
A                          VeriSign                 Dulles (*), Virginie                            États-Unis
B       ns1.isi.edu        USC-ISI                  Marina del Rey, Californie                      États-Unis
C                          Cogent                   Distribution par anycast                        États-Unis
D       terp.umd.edu       University of Maryland   College Park, Maryland                          États-Unis
E                          NASA                     Mountain View, Santa Clara County, Californie   États-Unis
F                          ISC                      Nairobi, distribution par anycast               Kenya
G       ns.nic.ddn.mil     U.S. DoD NIC             Columbus, Ohio                                  États-Unis
H       aos.arl.army.mil   U.S. Army Research Lab   Aberdeen Proving Ground, Maryland               États-Unis
I       nic.nordu.net      Autonomica               Stockholm, Suède, distribution par anycast      Union européenne
J                          VeriSign                 Distribution par anycast                        États-Unis
K                          RIPE NCC                 Londres, Royaume-Uni, distribution par anycast  Union européenne
L                          ICANN                    Los Angeles, Californie                         États-Unis
M                          WIDE Project             Tokyo, distribution par anycast                 Japon
Top-Level Domain
• Les Top-Level Domains (TLD):
– Domaines de premier niveau (RFC1591)
– Connaissent tous les root servers
– Connaissent les serveurs délégués des sous-domaines
• Deux types
– ccTLD: country code TLD
• ex. fr, ca, ma, us, de, …
• http://www.icann.org/cctld/cctld.html
– gTLD: generic TLD
• ex. com, org, net, … + 7 nouveaux: biz, aero, name, pro, museum, info et coop, et encore récemment mobi
• http://www.icann.org/gtld/gtld.html
Domaine vs Zone
• Domaines et zones
– Un domaine identifie une sous-arborescence de l'espace de nommage (domaine = lieu topologique)
– Une zone contient une base avec l'ensemble des informations associées à ce nœud (zone = borne administrative)
• La délégation
– Un nœud délègue la gestion à un nœud fils, celui-ci devient une zone
• Ce qui confère la distribution des données au sein du DNS
• Ceci permet aux TLD de déléguer toutes les nombreuses sous-zones (e.g., plusieurs millions de site.com.)
– Le nœud père doit être en possession des adresses où se trouve la base d'information des fils auxquels il a délégué des zones
• « Glue records » pour éviter les références circulaires dans la résolution
Espace de nommage
[Figure: exemple sur l'arbre fr → enst → {infres, tsi, siav, rms} et fr → edf → {rd, prod}:
le domaine fr est un sous-arbre; la zone fr, la zone enst et la zone infres sont des découpages
administratifs de forme arbitraire]
Domaine = sous-arbre, lieu topologique
Zone = forme arbitraire, bornes administratives
Domaine vs Zone
• Zone
– Administrée séparément
– Une zone peut être subdivisée en plusieurs zones
– L'autorité d'une zone est déléguée à une personne qui est chargée de créer des serveurs de domaine
– Remarque: il existe des sociétés dont le business est de gérer des bases de données DNS pour des tiers
• À chaque fois qu'une zone est créée,
– l'administrateur alloue un nom à cette zone (vérifié avec le RIR),
– notifie les adresses IP des serveurs DNS de sa zone à la zone père,
– les noms de machines de cette zone sont insérés dans la base.
• Remarque: pour une zone on a obligatoirement
– un serveur primaire et un serveur secondaire
– Le serveur secondaire copie la base du primaire (zone transfer)
– La différence n'est que dans le nom, les deux serveurs sont pareils
Service DNS
• Le service principal du DNS consiste dans la résolution de noms:
– Associer une adresse IP à un nom
– Associer un nom à une adresse IP (résolution inverse)
– Associer un nom à une ressource
• La résolution de nom est basée sur un modèle client/serveur
– Le serveur est celui qui détient les informations
– Chaque serveur est autoritaire sur sa propre zone
– Il consulte d'autres serveurs DNS au-delà du périmètre de sa zone
• Notion de serveur autoritaire
– Autoritaire: serveur responsable de l'information, concernant sa zone, qu'il est en train de communiquer (le secondaire est aussi autoritaire)
– Non autoritaire: serveur qui détient l'information mais qui n'est pas à l'origine de celle-ci (information retenue temporairement)
Ressources et base d’information du DNS
• La base d’information:
– Contient des enregistrements ou Ressources Records (RR)
– Un enregistrement représente une ressource de l’annuaire
et une association entre plusieurs objets
• Enregistrement de type Start Of Authority (SOA) :
– indique l'autorité sur la zone.
– contient toutes les informations sur le domaine:
• le délai de mise à jour des bases de données entre
serveurs de noms primaires et secondaires,
• le nom du responsable du site
• Enregistrements de type Name Server (NS) :
– les adresses des serveurs DNS responsables de la zone
Ressources et base d'information du DNS
• Enregistrement de type Adresse (A, A6, AAAA):
– définit les nœuds fixes du réseau, ceux qui ont des adresses IP statiques
• Enregistrements de type Mail eXchanger (MX):
– identifie les serveurs de messagerie (récepteurs)
• Enregistrements de type Canonical Name (CNAME):
– définit des alias sur des nœuds existants
• Enregistrement de type Pointeur (PTR):
– résolution de noms inverse (dans le domaine in-addr.arpa.)
• Pour chaque RR, le DNS garde:
– le nom, le type, la classe, la durée de vie, la valeur
Example
enst.fr.  IN SOA ns1.enst.fr. hstr.enst.fr. (
          20001210011 ; numéro de série
          10800       ; rafraîchissement (REFRESH)
          3600        ; nouvel essai (RETRY)
          604800      ; obsolescence (EXPIRE)
          86400 )     ; TTL minimal de 1 jour
enst.fr.  IN NS ns1.enst.fr. ; commentaire
enst.fr.  IN NS ns2.enst.fr. ; enst.fr peut être remplacé par @
@         IN NS ns1
          IN NS ns2
Example
ns1.enst.fr.        IN A      137.194.192.1
ns2.enst.fr.        IN A      137.194.160.1
smtp.enst.fr.       IN A      137.194.200.1
localhost.enst.fr.  IN A      127.0.0.1
www                 IN CNAME  ns1.enst.fr.
ftp                 IN CNAME  ns2.enst.fr.
enst.fr.            IN MX 10  smtp.enst.fr.
1.0.194.137.in-addr.arpa.  IN PTR  ns1.enst.fr.
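
À titre d'illustration, ces enregistrements s'interrogent avec dig (les noms reprennent l'exemple ci-dessus):

dig A ns1.enst.fr        # enregistrement A
dig MX enst.fr           # serveur de messagerie (récepteur)
dig -x 137.194.0.1       # résolution inverse via in-addr.arpa. (PTR)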
Architecture/Protocole DNS
• Norme (assez) stable
– RFC 1034: concepts et facilités du DNS
– RFC 1035: spécification de l'implémentation
• Protocole
– Paradigme client-serveur, fortement distribué
– Messages de type requête/réponse, format binaire
– Besoins primaires: rapidité (éviter la lourdeur), fiabilité
– Transport: UDP et TCP
• UDP plus rapide (évite le three-way handshake)
• La taille des messages UDP est limitée à 512 octets
• Si la réponse est supérieure à 512 octets, elle est tronquée, et la requête est retransmise une deuxième fois mais sur TCP
• TCP pour le transfert de zone
Architecture/Protocole DNS
• Transfert de zone
– Implémenté comme une query DNS (type AXFR)
– Le serveur secondaire copie la base de données du serveur primaire avec TCP (fiabilité impérative)
– Début et fin marqués par le SOA
• Numéro de série incrémental pour comparer les versions
• Policy
– Polling périodique effectué par le serveur secondaire
– Effectué toutes les REFRESH secondes
– En cas de problème, ré-essaye toutes les RETRY secondes
– Si les problèmes persistent, la base est jetée après EXPIRE secondes
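
Esquisse minimale (hypothétique, en Python) de cette policy côté secondaire; get_soa_serial() et do_axfr() sont des fonctions fictives qui interrogeraient le primaire (query SOA, puis AXFR sur TCP):

# Politique REFRESH/RETRY/EXPIRE d'un serveur secondaire (simplifiée)
import time

REFRESH, RETRY, EXPIRE = 10800, 3600, 604800   # valeurs de l'exemple enst.fr.

def secondary_loop(get_soa_serial, do_axfr):
    local_serial, last_ok = None, time.time()
    while True:
        try:
            remote_serial = get_soa_serial()   # query SOA vers le primaire
            if local_serial is None or remote_serial > local_serial:
                do_axfr()                      # copie de la zone sur TCP
                local_serial = remote_serial
            last_ok = time.time()
            time.sleep(REFRESH)                # prochain poll périodique
        except OSError:
            if time.time() - last_ok > EXPIRE:
                local_serial = None            # la base est jetée
            time.sleep(RETRY)                  # nouvel essai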
Architecture/Protocole DNS
• Modalités d'interrogation
– Mode itératif
• Obligatoire
• Complexité sur le client
– Mode récursif
• Optionnel
• Complexité sur le serveur
• Difficile pour le troubleshooting en cas de panne
• Bénéfique pour le cache
• En pratique
– Récursif jusqu'au serveur DNS local
– Itératif depuis le serveur DNS local
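
À titre d'illustration, dig permet d'observer les deux modes depuis un terminal (le nom interrogé est un exemple):

dig www.enst.fr                                 # récursif (flag RD) via le resolver local
dig +trace www.enst.fr                          # émule la résolution itérative depuis la racine
dig +norecurse @a.root-servers.net www.enst.fr  # requête non récursive vers un root server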
Resolution
[Figure: résolution récursive de type A pour y.x.com — (1) le client interroge son resolver local (ns1);
(2)-(5) chaque serveur relaie la requête au serveur suivant (racine, TLD com, serveur de x.com);
(6)-(8) la réponse redescend de serveur en serveur jusqu'au resolver, puis au client.
Liens logiques vs liens physiques; flèches requête/réponse]
Remarque: la réponse dans le cache du resolver pourra servir les prochaines requêtes
Resolution
[Figure: résolution itérative de type A pour y.x.com — (1) le client interroge son resolver local (ns1);
(2)-(3) le resolver interroge la racine, qui renvoie l'adresse du TLD com; (4)-(5) le resolver interroge
le TLD, qui renvoie l'adresse du serveur de x.com; (6)-(7) le resolver interroge ce dernier et obtient
la réponse; (8) réponse au client]
Remarque: la réponse dans le cache du resolver pourra servir les prochaines requêtes
Format des requêtes/réponses
[En-tête DNS, mots de 32 bits:]
IDENTIFICATION (16 bits)        | FLAGS (16 bits)
NBR. DE QUESTIONS               | NBR. DE REPONSES
NBR. DE RRs AUTORITAIRES        | NBR. DE RRs SUPPLEMENTAIRES
QUESTIONS
REPONSES
AUTORITAIRES
INFORMATIONS SUPPLEMENTAIRES
Format des requêtes/réponses
• Identification (16 bits): pour associer une réponse à une requête
• Flags (16 bits):
– bit 1 QR: Question = 0, Réponse = 1
– bits 2-5 OPCODE (4 bits): 0 question standard / 1 question inverse / 2 requête de statut du serveur
– bit 6 AA: = 1 Authoritative Answer
– bit 7 TC: = 1 Truncated Response
– bit 8 RD: = 1 Recursion Desired, sinon question itérative
– bit 9 RA: = 1 Recursion Allowed, indique le support de la récursion
– bit 10 Z: réservé, = 0
– bit 11 AD: Authentic Data
– bit 12 CD: Checking Disabled
– bits 13-16 RCODE (4 bits): Response Code
• 0 Pas d'erreur
• 1 Erreur de format, question non interprétable
• 2 Problème sur le serveur
• 3 Le nom dans la question n'existe pas
• 4 Le type de la question n'est pas supporté
• 5 Question refusée
• 6-15 Réservés
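
Esquisse (Python) du décodage de ce champ FLAGS; les positions des bits suivent la liste ci-dessus:

def parse_flags(flags: int) -> dict:
    return {
        "QR":     (flags >> 15) & 0x1,   # 0 = question, 1 = réponse
        "OPCODE": (flags >> 11) & 0xF,   # 4 bits
        "AA":     (flags >> 10) & 0x1,
        "TC":     (flags >> 9)  & 0x1,
        "RD":     (flags >> 8)  & 0x1,
        "RA":     (flags >> 7)  & 0x1,
        "AD":     (flags >> 5)  & 0x1,
        "CD":     (flags >> 4)  & 0x1,
        "RCODE":  flags & 0xF,
    }

print(parse_flags(0x8180))  # flags de la réponse analysée plus loin: QR=1, RD=1, RA=1, RCODE=0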
Format des requêtes/réponses
• Question: Nom / Type / Classe
• Le nom (FQDN) est encodé en paires (longueur, label), terminées par 0; exemple pour verdi.enst.fr.:
5 'v' 'e' 'r' 'd' 'i'  4 'e' 'n' 's' 't'  2 'f' 'r'  0
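
Esquisse (Python) de cet encodage en paires (longueur, label):

def encode_qname(fqdn: str) -> bytes:
    out = b""
    for label in fqdn.rstrip(".").split("."):
        assert len(label) <= 63                  # taille max d'un label
        out += bytes([len(label)]) + label.encode("ascii")
    return out + b"\x00"                         # racine = longueur 0

print(encode_qname("verdi.enst.fr").hex(" "))
# 05 76 65 72 64 69 04 65 6e 73 74 02 66 72 00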
Format des requêtes/réponses
Types (x = utilisable comme type d'enregistrement / comme type de demande):

Symbole  Valeur  Description                          type  type de demande
A        1       Adresse IP                           x     x
NS       2       Nom du serveur de nom autoritaire    x     x
CNAME    5       Nom canonique                        x     x
PTR      12      Pointeur                             x     x
HINFO    13      Information sur le host              x     x
MX       15      Serveur de messagerie                x     x
AXFR     252     Requête pour transfert de zone             x
ANY / *  255     Requête pour tous enregistrements          x
Format des requêtes/réponses
Classe:
1 = (IN) Internet address
2 = Non attribué ou non supporté
3 = Réseau Chaos du MIT
4 = Attribué au MIT pour Hesiod
Format des requêtes/réponses
• Réponse (format d'un RR): Nom / Type / Classe / Durée de vie de la réponse (TTL) / Longueur de la donnée / Donnée
Example
• Démonstration et analyse de trafic
Question:
00 07 cb 4c 40 e1 08 00 46 60 d6 42 08 00 45 00
00 38 46 32 00 00 80 11 28 b9 c0 a8 00 0a d4 1b
36 fc 04 18 00 35 00 24 35 ab a2 b3 01 00 00 01
00 00 00 00 00 00 03 77 77 77 03 6c 63 6c 02 66
72 00 00 01 00 01
Example
• Démonstration et analyse de trafic
Question:
00 07 cb 4c 40 e1 08 00 46 60 d6 42 08 00 45 00
00 38 46 32 00 00 80 11 28 b9 c0 a8 00 0a d4 1b
36 fc 04 18 00 35 00 24 35 ab a2 b3 01 00 00 01
00 00 00 00 00 00 03 77 77 77 03 6c 63 6c 02 66
72 00 00 01 00 01
Ethernet / IP / UDP / DNS
Example
• Démonstration et analyse de trafic
Question:
00 07 cb 4c 40 e1 08 00 46 60 d6 42 08 00 45 00
00 38 46 32 00 00 80 11 28 b9 c0 a8 00 0a d4 1b
36 fc 04 18 00 35 00 24 35 ab a2 b3 01 00 00 01
00 00 00 00 00 00 03 77 77 77 03 6c 63 6c 02 66
72 00 00 01 00 01
ID / QR OPCODE AA TC RD / RA Z RCODE
Example
• Démonstration et analyse de trafic
Question:
00 07 cb 4c 40 e1 08 00 46 60 d6 42 08 00 45 00
00 38 46 32 00 00 80 11 28 b9 c0 a8 00 0a d4 1b
36 fc 04 18 00 35 00 24 35 ab a2 b3 01 00 00 01
00 00 00 00 00 00 03 77 77 77 03 6c 63 6c 02 66
72 00 00 01 00 01
QDCOUNT / ANCOUNT / NSCOUNT / ARCOUNT
Example
• Démonstration et analyse de trafic
Question:
00 07 cb 4c 40 e1 08 00 46 60 d6 42 08 00 45 00
00 38 46 32 00 00 80 11 28 b9 c0 a8 00 0a d4 1b
36 fc 04 18 00 35 00 24 35 ab a2 b3 01 00 00 01
00 00 00 00 00 00 03 77 77 77 03 6c 63 6c 02 66
72 00 00 01 00 01
3 www 3 lcl 2 fr 0 / TYPE / CLASS
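
Esquisse (Python, stdlib uniquement) qui reconstruit et envoie la même question; l'ID 0xa2b3, les compteurs et l'adresse du resolver (212.27.54.252) sont repris de la trace:

import socket, struct

header = struct.pack("!6H",
    0xa2b3,        # ID
    0x0100,        # flags: RD=1
    1, 0, 0, 0)    # QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
qname = b"\x03www\x03lcl\x02fr\x00"
question = qname + struct.pack("!2H", 1, 1)   # QTYPE=A, QCLASS=IN

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(header + question, ("212.27.54.252", 53))
print(sock.recv(512).hex(" "))                # réponse (max 512 octets en UDP)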
Example
• Démonstration et analyse de trafic
Réponse:
08 00 46 60 d6 42 00 07 cb 4c 40 e1 08 00 45 00
00 56 00 00 40 00 38 11 76 cd d4 1b 36 fc c0 a8
00 0a 00 35 04 18 00 42 17 48 a2 b3 81 80 00 01
00 02 00 00 00 00 03 77 77 77 03 6c 63 6c 02 66
72 00 00 01 00 01 c0 0c 00 05 00 01 00 00 01 e1
00 02 c0 10 c0 10 00 01 00 01 00 00 01 e1 00 04
c1 6e 98 37
Example
• Démonstration et analyse de trafic
Réponse:
08 00 46 60 d6 42 00 07 cb 4c 40 e1 08 00 45 00
00 56 00 00 40 00 38 11 76 cd d4 1b 36 fc c0 a8
00 0a 00 35 04 18 00 42 17 48 a2 b3 81 80 00 01
00 02 00 00 00 00 03 77 77 77 03 6c 63 6c 02 66
72 00 00 01 00 01 c0 0c 00 05 00 01 00 00 01 e1
00 02 c0 10 c0 10 00 01 00 01 00 00 01 e1 00 04
c1 6e 98 37
Ethernet / IP / UDP / DNS
Example
• Démonstration et analyse de trafic
Réponse:
08 00 46 60 d6 42 00 07 cb 4c 40 e1 08 00 45 00
00 56 00 00 40 00 38 11 76 cd d4 1b 36 fc c0 a8
00 0a 00 35 04 18 00 42 17 48 a2 b3 81 80 00 01
00 02 00 00 00 00 03 77 77 77 03 6c 63 6c 02 66
72 00 00 01 00 01 c0 0c 00 05 00 01 00 00 01 e1
00 02 c0 10 c0 10 00 01 00 01 00 00 01 e1 00 04
c1 6e 98 37
ID / QR OPCODE AA TC RD / RA Z RCODE
Example
• Démonstration et analyse de trafic
Réponse:
08 00 46 60 d6 42 00 07 cb 4c 40 e1 08 00 45 00
00 56 00 00 40 00 38 11 76 cd d4 1b 36 fc c0 a8
00 0a 00 35 04 18 00 42 17 48 a2 b3 81 80 00 01
00 02 00 00 00 00 03 77 77 77 03 6c 63 6c 02 66
72 00 00 01 00 01 c0 0c 00 05 00 01 00 00 01 e1
00 02 c0 10 c0 10 00 01 00 01 00 00 01 e1 00 04
c1 6e 98 37
QDCOUNT / ANCOUNT / NSCOUNT / ARCOUNT
Example
• Démonstration et analyse de trafic
Réponse:
08 00 46 60 d6 42 00 07 cb 4c 40 e1 08 00 45 00
00 56 00 00 40 00 38 11 76 cd d4 1b 36 fc c0 a8
00 0a 00 35 04 18 00 42 17 48 a2 b3 81 80 00 01
00 02 00 00 00 00 03 77 77 77 03 6c 63 6c 02 66
72 00 00 01 00 01 c0 0c 00 05 00 01 00 00 01 e1
00 02 c0 10 c0 10 00 01 00 01 00 00 01 e1 00 04
c1 6e 98 37
3 www 3 lcl 2 fr 0 / TYPE / CLASS
Example
• Démonstration et analyse de trafic
Réponse:
08 00 46 60 d6 42 00 07 cb 4c 40 e1 08 00 45 00
00 56 00 00 40 00 38 11 76 cd d4 1b 36 fc c0 a8
00 0a 00 35 04 18 00 42 17 48 a2 b3 81 80 00 01
00 02 00 00 00 00 03 77 77 77 03 6c 63 6c 02 66
72 00 00 01 00 01 c0 0c 00 05 00 01 00 00 01 e1
00 02 c0 10 c0 10 00 01 00 01 00 00 01 e1 00 04
c1 6e 98 37
C0 0C = 11000000 00001100 => Offset Ptr +12 (to 0x03)
Example
• Démonstration et analyse de trafic
Réponse:
08 00 46 60 d6 42 00 07 cb 4c 40 e1 08 00 45 00
00 56 00 00 40 00 38 11 76 cd d4 1b 36 fc c0 a8
00 0a 00 35 04 18 00 42 17 48 a2 b3 81 80 00 01
00 02 00 00 00 00 03 77 77 77 03 6c 63 6c 02 66
72 00 00 01 00 01 c0 0c 00 05 00 01 00 00 01 e1
00 02 c0 10 c0 10 00 01 00 01 00 00 01 e1 00 04
c1 6e 98 37
CNAME / Internet / TTL=0x1e1 (481 s, ~8 min) / Data Len=2 / Ptr
Example
• Démonstration et analyse de trafic
Réponse:
08 00 46 60 d6 42 00 07 cb 4c 40 e1 08 00 45 00
00 56 00 00 40 00 38 11 76 cd d4 1b 36 fc c0 a8
00 0a 00 35 04 18 00 42 17 48 a2 b3 81 80 00 01
00 02 00 00 00 00 03 77 77 77 03 6c 63 6c 02 66
72 00 00 01 00 01 c0 0c 00 05 00 01 00 00 01 e1
00 02 c0 10 c0 10 00 01 00 01 00 00 01 e1 00 04
c1 6e 98 37
C0 10 = 11000000 00010000 => Offset Ptr +16 (to 0x03, label 'lcl': le nom du RR A est lcl.fr.)
Example
• Démonstration et analyse de trafic
Réponse:
08 00 46 60 d6 42 00 07 cb 4c 40 e1 08 00 45 00
00 56 00 00 40 00 38 11 76 cd d4 1b 36 fc c0 a8
00 0a 00 35 04 18 00 42 17 48 a2 b3 81 80 00 01
00 02 00 00 00 00 03 77 77 77 03 6c 63 6c 02 66
72 00 00 01 00 01 c0 0c 00 05 00 01 00 00 01 e1
00 02 c0 10 c0 10 00 01 00 01 00 00 01 e1 00 04
c1 6e 98 37
TYPE/ CLASS/ TTL / Data Len / 193.110.152.55
Example
drossi@nonsns:~$ dig www.lcl.fr

;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 23398
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;www.lcl.fr.                  IN      A

;; ANSWER SECTION:
www.lcl.fr.          417      IN      CNAME   lcl.fr.
lcl.fr.              417      IN      A       193.110.152.55

;; Query time: 50 msec
;; SERVER: 212.27.54.252#53(212.27.54.252)
;; WHEN: Thu Dec 20 22:21:20 2007
;; MSG SIZE  rcvd: 58
Résolution de nom inverse
Conclusions et remarques
• Protocole simple et flexible
– Déploiement et mise en œuvre rapides
– Adapté à plusieurs réseaux / usages (voir SPF dans le cours SMTP)
• Plusieurs attaques structurelles possibles
– Nécessité de laisser passer le trafic DNS (pas de filtrage)
– Exposé aux attaques mais suffisamment robuste
• Talon d'Achille de l'Internet
– 9/11 twin towers (cfr. TD)
– Ne supporte pas tous les types de ressources
– Évolutions: RFC4423 (Host Identity Protocol, HIP)
– Internet of things ?
Dynamic Host Configuration Protocol (DHCP)
Protocole DHCP
• DHCP
– Basé sur BOOTP + extensions (backward compatibility)
– RFC 2131 (updated by 3396, 4361, 5494)
– Client/serveur, requête/réponse
– Format PDU binaire sur UDP (ports 67/68)
• Fonctionnalités
– Gestion centralisée de l'allocation de ressources (adresses IP)
– Mise en place automatique des paramètres de configuration
• Adresse IP, netmask, default GW, DNS resolver
Protocole DHCP
• En bref (voir l'esquisse après cette liste):
– Le client émet en broadcast un paquet de type DHCPDISCOVER, pour identifier les serveurs DHCP disponibles
– Le(s) serveur(s) répond(ent) par un paquet DHCPOFFER (broadcast), qui contient les premiers paramètres
– Le client établit sa configuration et envoie un DHCPREQUEST pour valider son adresse IP
– Le serveur répond par un DHCPACK avec l'adresse IP pour confirmer l'attribution
[Figure: échange DISCOVER → OFFER(s) → REQUEST → ACK]
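
Esquisse (hypothétique) du premier échange avec scapy, à exécuter en root; l'interface "eth0" et la MAC sont des exemples à adapter:

from scapy.all import Ether, IP, UDP, BOOTP, DHCP, srp1, conf

conf.checkIPaddr = False                     # l'OFFER arrive en broadcast
mac = "02:00:00:00:00:01"                    # MAC d'exemple
discover = (Ether(src=mac, dst="ff:ff:ff:ff:ff:ff") /
            IP(src="0.0.0.0", dst="255.255.255.255") /
            UDP(sport=68, dport=67) /
            BOOTP(chaddr=bytes.fromhex(mac.replace(":", ""))) /
            DHCP(options=[("message-type", "discover"), "end"]))

offer = srp1(discover, iface="eth0", timeout=3)   # attend le DHCPOFFER
if offer:
    offer.show()                             # netmask, gw, DNS, lease time, …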
Protocole DHCP
[Figure: contenu des quatre messages, client C ↔ serveur S]

DHCP Discover (C → S):
  Eth: macC → FF:FF:FF:FF:FF:FF | IP: 0.0.0.0 → 255.255.255.255 | UDP: 68 → 67
  DHCP: Type: Discover, Transaction ID: x, Opts: hostname, preferred IP

DHCP Offer (S → C):
  Eth: macS → macC | IP: IPs → IPoffered | UDP: 67 → 68
  DHCP: Type: Offer, Transaction ID: x, Params: netmask, gw, DNS, lease time, …

DHCP Request (C → S):
  Eth: macC → FF:FF:FF:FF:FF:FF | IP: 0.0.0.0 → 255.255.255.255 | UDP: 68 → 67
  DHCP: Type: Request, Transaction ID: x, Opts: hostname, IPoffered, DHCP server IPs

DHCP Ack (S → C):
  Eth: macS → macC | IP: IPs → IPc | UDP: 67 → 68
  DHCP: Type: Ack, Transaction ID: x, Params: mask, gw, DNS, lease
Protocole DHCP
• Baux DHCP (DHCP lease)
– Prêt d'une adresse IP donnée pour une durée limitée (soft state)
– Demande (par le client) de prolongation du bail: DHCPREQUEST
– Optimisation des adresses IP en jouant sur la durée des baux
• Courte durée pour les réseaux où les ordinateurs se branchent et se débranchent souvent
• Longue durée pour les réseaux constitués en majorité de machines fixes
• Mode Client/Serveur
– Les clients: machines disposant du protocole TCP/IP et d'une application DHCP (pump, dhclient)
– Les serveurs: machines ou routeurs configurés manuellement, disposant du service serveur DHCP (dhcpd)
– Fonctionne au-dessus d'UDP; les échanges sont à l'initiative des clients
Protocole DHCP
[Figure: machine à états du client DHCP —
Init → (Discover, collecte des messages Offer) → Selecting → (Request, rejet des Offer non valides) → Requesting;
Requesting → (ACK) → Bound, → (NACK ou Decline) → Init;
Bound (rejet des messages Offer/Ack/Nack): expiration du délai → Renewing → (ACK) → Bound, (NACK) → Init;
expiration du bail → Rebinding → (ACK) → Bound, (NACK) → Init;
Init/Reboot → (Request) → Rebooting → (ACK) → Bound, (NACK) → Init]
Protocole DHCP
Allocation d'adresses IP:
– Statique / manuelle: adresse fixée par l'administrateur
– Statique / automatique: adresse fixée définitivement
– Dynamique (DHCP): adresse prêtée pour une certaine durée (bail)
References
• Optional reading:
– D. Wessels and M. Fomenkov, "Wow, That's a Lot of Packets", Passive and Active Measurement (PAM), 2003. http://www.caida.org/publications/papers/2003/dnspackets/wessels-pam2003.pdf
?? || //
RES 224
Architecture des applications Internet
Accès aux données: HTTP et FTP
Dario Rossi
http://www.enst.fr/~drossi
v250811
Plan
• Web
• Histoire du Web
• Architecture du Web
• Le protocole HTTP
• Performance
• FTP
• Aperçu du protocole
• Connectivité
HTTP
Histoire du World Wide Web
• 1945, Vannevar Bush's memex
• 1965, Ted Nelson invente le mot "Hypertext"
• 1989, Tim Berners-Lee (CERN), début du Web
• 1991, première démonstration publique
• 1993, Marc Andreessen (Univ. of Illinois at Urbana-Champaign), Mosaic
• 1994, World Wide Web Consortium agreement (http://www.w3c.org), signé par MIT et CERN
• 1994, Java (SUN Microsystems) rencontre le Web
• 1995, Netscape entre en bourse
• 1995, JavaScript
• 1996, Internet Explorer commence la guerre des browsers
• 2001, estimation de la taille du "deep Web"
• 2004, Web 2.0
• 2005, AJAX
• 2008, YouTube deuxième moteur de recherche après Google
• 2009, Twitter et l'Iran
• 2010, facebook, ou comment inviter 20000 personnes à ton anniversaire
Architecture du Web
[Figure: client HTTP (browser) ↔ HTTP proxies ↔ serveur HTTP; les pages HTML sont côté serveur;
le protocole HTTP relie les composants; l'interaction HTTP/TCP est le focus de RES224 (TCP: cfr. RES240)]
Architecture du Web
• Hébergement (Web hosting)
– Hébergement de plusieurs petits sites sur une seule machine (e.g., GeoCities)
– Les services d'hébergement existent encore (e.g., lately, virtual HTTP servers)
– GeoCities was shut down 26 Oct 2009. There were at least 38 million user-built pages.
• Server farms
– Énormément de machines pour faire face à la charge due au nombre de clients
– Cela marche pour des bottlenecks de CPU, mais pas de capacité
– [Photo: un des centres Google]
Architecture du Web
• CDN (Content Delivery Network)
– Spécialisés dans la diffusion du contenu
– Redirection des requêtes
• via DNS (cfr. optional readings)
• via HTTP redirection (plus loin dans le cours)
– Multi-B$ business
• ~30 market players
• Akamai, Limelight
[Figure: un utilisateur dans AS1 est redirigé vers un nœud CDN proche plutôt que vers le video server d'origine dans AS3]
Vocabulaire du Web
• Page Web: ensemble d'objets
– Page HTML de base
– Objets (images, fichiers, …) référencés par des URLs
• Uniform Resource Locator (URL)
proto://nom.de.domaine:porte/chemin/d/acces.htm
• Agents utilisateur
– Browser: MS Internet Explorer, Firefox, Chrome, …
– Crawler, spider: Wget, DeeperWeb, WebCrawler, …
• Daemon
– Serveur Web: Apache, Google Web Server, MS Internet Information Server, …
– Proxy server: Squid cache, Apache Traffic Server, …
[Figure: browser → Internet → serveur y.x.com, via http://y.x.com]
Universal Resource Locators
• Une précision: non seulement les objets HTTP sont désignés par des URLs
• Les URLs contiennent protocole, machine, port, répertoire et fichier…
– …mais aussi des paramètres (cfr. plus tard), etc.; ils peuvent donc devenir arbitrairement compliqués…
HTML – HyperText Markup Language
• Page HTML stockée sur le serveur Web, interprétée par le browser
[Figure: (a) rendu dans le browser; (b) source HTML correspondante]
Documents dynamiques
• HTML pas seulement statique, mais créé dynamiquement en fonction des demandes de l'usager
• Plusieurs composants de l'architecture Web:
– côté serveur (e.g., CGI, PHP, SHTML)
– côté client (JavaScript, Ajax, …)
• Remarque: HTTP intervient seulement dans quelques cas
Le protocole HTTP
• Paradigme client/serveur
– Client: le browser, qui demande, reçoit, affiche les objets Web
– Serveur: le serveur Web, qui envoie les réponses aux requêtes des browsers
• Simple protocole requête/réponse
– Importance des normes: interopérabilité (IE sous Win95, Firefox sous Linux, Netscape sous Unix parlent au même serveur)
– HTTP/1.0: RFC 1945
– HTTP/1.1: RFC 2068
Le protocole HTTP
• Service de transport (simplifié)
– Orienté données, donc TCP
– Le client initie une connexion TCP (socket) avec le serveur, port 80
– Le serveur accepte la connexion TCP (quelle est la porte du client ?)
– Des messages HTTP (protocole applicatif) sont échangés entre le browser (client) et le serveur Web
– La connexion TCP est fermée (remarque: cela dépend des options et/ou de la version du protocole)
• HTTP est « sans état »
– Le serveur ne maintient aucune information au sujet des requêtes précédentes des clients
– Pour: simplicité d'implémentation, passage à l'échelle
– Contre: manque d'information
– Les protocoles « avec état » sont complexes !
• L'histoire passée doit être gardée
• Si le serveur ou le client crashe, les états peuvent être incohérents et il faut les resynchroniser
Format des messages HTTP: requête
• Deux types de messages HTTP: requête, réponse
• Format ASCII
• Message de requête HTTP:

GET /~drossi/index.html HTTP/1.0          ← ligne de requête (méthode, ressource, version du protocole)
Host: www.enst.fr                         ← lignes d'en-tête
Connection: close
User-agent: Mozilla/4.0
Accept: text/html, image/gif, image/jpeg
Accept-language: fr
                                          ← le retour chariot (ligne vide) indique la fin du message
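
Esquisse (Python): la même requête envoyée « à la main » sur une socket TCP:

import socket

req = (b"GET /~drossi/index.html HTTP/1.0\r\n"
       b"Host: www.enst.fr\r\n"
       b"Connection: close\r\n"
       b"\r\n")                               # ligne vide = fin de la requête

s = socket.create_connection(("www.enst.fr", 80))
s.sendall(req)
resp = b""
while chunk := s.recv(4096):                  # Connection: close => lire jusqu'à EOF
    resp += chunk
print(resp.split(b"\r\n\r\n", 1)[0].decode()) # ligne d'état + en-têtes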
Format de message HTTP: réponse
• Message de réponse HTTP:

HTTP/1.0 200 OK                           ← ligne d'état (version, code et message d'état)
Connection: close                         ← lignes d'en-tête
Date: Thu, 06 Aug 2002 11:00:15 GMT
Server: Apache/1.3.0 (Unix)
Last-Modified: Mon, 22 Apr 2001 …
Content-Length: 621
Content-Type: text/html
                                          ← ligne vide
data data data data data ...              ← données (e.g., HTML, image)
Méthodes
• HTTP/1.0
– GET: rapatrie des objets
– POST: envoie des objets
– HEAD: requête d'information d'en-tête concernant l'objet; le serveur laisse l'objet hors de la réponse
• HTTP/1.1
– GET, POST, HEAD
– PUT: charge le fichier contenu dans le corps vers le chemin spécifié dans l'URL
– DELETE: efface le fichier indiqué dans le champ URL
(Charger du contenu de formulaire)
• Méthode POST
– Une page web peut contenir des inputs de formulaire
– Input envoyé au serveur dans le corps du message POST
• Méthode GET
– Input envoyé directement dans le champ URL du message GET:
www.somesite.com/q?animal=monkeys&fruit=banana
Exemples d'en-têtes
[Figure: captures d'en-têtes réels]
Codes de réponse HTTP
200 OK
– La requête a réussi et l’objet demandé est à la suite dans le corps du message
301 Moved Permanently
– L’objet demandé a changé définitivement de place (voir corps du message)
400 Bad Request
– La requête est erronée
404 Not Found
– Le document demandé n’est pas disponible sur le serveur
505 HTTP Version Not Supported
Mécanismes HTTP
• Interaction avec le niveau transport
– Connexions persistantes
– Pipelining des requêtes
• Contourner l'absence d'état de HTTP
– Authorization
– Cookies
• Architecture et performance
– GET conditionnel
– Redirection
– Proxy HTTP
HTTP non persistant
Côté client:
1a. Le client HTTP initie une connexion TCP au serveur HTTP sur le port 80 (open active)
2. Le client HTTP envoie les requêtes HTTP (contenant des URLs) sur la connexion TCP
5. Le client HTTP reçoit la réponse contenant le fichier HTML, l'affiche, et trouve les URLs référencées
Côté serveur:
1b. Le serveur HTTP attend une connexion TCP sur le port 80. Il accepte la connexion, et l'annonce au client (open passive)
3. Le serveur HTTP reçoit le message de requête, génère le message de réponse contenant l'objet requis, et l'envoie sur la connexion TCP
4. Le serveur ferme la connexion TCP (half close)
6. Les étapes 1-5 sont répétées pour chaque URL référencée (e.g., images à télécharger), potentiellement en parallèle
[Figure: échange SYN / SYN+ACK / GET / Data / FIN]
Interaction avec TCP
• Temps de réponse: 2 RTT
– Un RTT pour l'établissement de la connexion TCP (SYN / SYN+ACK)
– Un RTT pour la requête HTTP (ACK + GET) et le premier segment TCP de la réponse HTTP (~1460 octets)
• Temps de complètement: 2 RTT + tTX
– tTX dépend de l'état de la connexion
– Slow start: 1, 2, 4, 8, … segments par RTT au début
• World Wide Wait!
– Le browser ne peut commencer à afficher les données qu'après 2 RTT
– Slow start pour tous les objets
[Figure: établissement de connexion, transfert de données (Data + FIN), fermeture (FINACK)]
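
Petit calcul d'illustration (modèle très simplifié et hypothétique: 1 HTML + 10 petits objets d'un segment chacun, requêtes en série, RTT = 100 ms supposé, tTX négligé):

RTT, n_obj = 0.100, 1 + 10             # 1 HTML + 10 petits JPEG

t_non_persistant = n_obj * 2 * RTT     # 2 RTT par objet (handshake + GET)
t_persistant_pipeline = 2 * RTT + RTT  # handshake + GET HTML, puis 1 RTT
                                       # pour les 10 GET pipelinés
print(t_non_persistant, t_persistant_pipeline)   # 2.2 s vs 0.3 s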
HTTP persistant
Côté client:
1a. Le client HTTP initie une connexion TCP au serveur HTTP sur le port 80 (open active)
2. Le client HTTP envoie la requête HTTP sur la connexion TCP
4. Le client HTTP reçoit la réponse HTML, trouve les URLs référencées et envoie >1 requêtes (pipelining)
6. Une fois la page complétée, le browser ferme la connexion (ou sinon le serveur, après un timeout de k×300 s)
Côté serveur:
1b. Le serveur HTTP attend une connexion TCP sur le port 80. Il accepte la connexion, et l'annonce au client (open passive)
3. Le serveur HTTP reçoit le message de requête, génère le message de réponse contenant l'objet requis, et l'envoie sur la connexion TCP
5. Le serveur envoie les réponses sur la même connexion dès que l'algorithme de contrôle de congestion de TCP le permet (cwnd)
[Figure: échange SYN / SYN+ACK / GET / Data / GETs pipelinés / Data]
Connexions persistantes et pipelining
• Connexion non persistante (HTTP/1.0)
– Le serveur interprète les requêtes, répond et ferme la connexion TCP
– Problème (cfr. slide précédent):
• Au moins 2 RTT pour lire chaque objet (handshake)
• Chaque transfert doit subir le slow start de TCP
– Exemple: page contenant 1 HTML + 10 petits JPEG
– Remarque: les navigateurs HTTP/1.0 utilisent plusieurs connexions en parallèle !
• Connexion persistante
– Par défaut dans HTTP/1.1, introduite ensuite dans HTTP/1.0
– Une seule connexion TCP
– Pipelining: le client envoie les requêtes de tous les objets requis dès qu'ils sont référencés dans le HTML => pas obligé d'attendre une réponse pour envoyer une nouvelle requête
• Performance
– Gain pour le client: moins de RTTs en début de connexion => moins de délai; moins de slow start (maintien des paramètres) => plus de bande passante
– Gain pour le serveur: moins de ressources employées (sockets) => plus de clients servis
Mécanismes HTTP
• Interaction avec le niveau transport
– Connexions persistantes
– Pipelining des requêtes
• Contourner l'absence d'état de HTTP
– Authorization
– Cookies
• Architecture et performance
– GET conditionnel
– Redirection
– Proxy HTTP
Contrôle d'accès
• Intérêt: accès restreint
• HTTP fournit des codes et des en-têtes d'état pour permettre l'authentification
– Serveur: 401 Authorization Required, avec en-tête WWW-Authenticate:
– Client: en-tête Authorization: (user name + password, en clair !)
• HTTP est sans état
– Le client doit être autorisé à chaque requête
– Nécessaire d'utiliser l'en-tête Authorization: dans chaque requête
– Autrement, erreur 401
• Totalement insecure
– sauf si utilisé avec SSL/TLS (Secure Socket Layer / Transport Layer Security)
[Figure: GET → 401 + WWW-Authenticate:; GET + Authorization: <cred> → 200 OK; chaque GET suivant porte Authorization: <cred>]
Cookies
• Le cookie, soit
– Un identifiant unique, transporté dans l'en-tête HTTP
– Introduit de l'état dans un protocole sans état
– Élégant, simple, scalable, espion
– Flexible: interprétation arbitraire
– RFC 2109
• Quatre composantes
1) Cookie dans la requête HTTP
2) Cookie dans la réponse HTTP
3) Fichier cookie chez l'utilisateur, géré par le browser
4) Database derrière le site Web
• Pros/Cons
– Autorisation implicite
– Caddies (e-commerce)
– État de session (Web e-mail)
– Publicité/offres personnalisées
– Privacy issues
• Propriétés
– Transportés par HTTP, gérés au-delà de HTTP
– ≥300 cookies par browser, ≥4096 bytes par cookie, ≥20 cookies par domaine
– Remarque: 1 cookie = plusieurs segments TCP possibles
Cookies
GET /index.php?id=122
HTTP/1.1
Accept: */*
Referer:
http://www.test.fr/index.
php?id=19
Accept-Language: fr
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0
(compatible; MSIE 7.0;
Windows NT 5.1; .NET
CLR 1.1.4322; .NET CLR
2.0.50727; InfoPath.2)
Host: www.test.fr
Connection: Keep-Alive
Cookie: user=8441e98f05;
TestCookieAlone=ok;
Nombre_visite=6;
Derniere_visite=20%2F12
%2F2006+%E0+09%3A20
HTTP/1.1 200 OK
Date: Wed, 20 Dec 2006
08:20:17 GMT
Server: Apache/1.3.31 (Unix)
PHP/4.4.4 mod_ssl/2.8.18
OpenSSL/0.9.6b
X-Powered-By: PHP/4.4.4
Set-Cookie: user=8441e98f05;
Nombre_visite=6;
expires=Wednesday, 27-Dec-06
08:20:18 GMT
Set-Cookie:
Derniere_visite=20%2F12%2F2
006+%E0+09%3A20;
expires=Wednesday, 27-Dec-06
08:20:18 GMT
Keep-Alive: timeout=15,
Connection: Keep-Alive
[Figure: GET → 200 OK + Set-Cookie:; GET suivant avec Cookie: → opérations spécifiques au cookie
dans la DB du serveur → 200 OK + contenu personnalisé]
Espionnage avec les Cookies
• img-1 et img-2 représentent des images 1x1 pixel, de couleur transparente
• img-1 et img-2 ne sont pas hébergées sur site1.com, … mais sur spy.com
• Site1.com vous renvoie vers Spy.com pour img-1 (<img src=spy.com/img-1.png> dans son HTML)
• Spy.com reçoit une requête pour img-1, vous envoie un cookie C, et sait que C a visité Site1
• Quand vous visitez Site2.com, votre browser doit utiliser le même cookie C pour obtenir img-2, etc.
• Spy.com peut alors connaître le comportement de C
• En moyenne, environ 80 sites comme spy.com par session
[Figure: GET vers spy.com → Set-Cookie: C; les GET suivants portent le Cookie: C]
Mécanismes HTTP
• Interaction avec le niveau transport
– Connexions persistantes
– Pipelining des requêtes
• Contourner l'absence d'état de HTTP
– Authorization
– Cookies
• Architecture et performance
– GET conditionnel
– Redirection
– Proxy HTTP
GET conditionnel
• Objectif: ne pas envoyer un objet que le client a déjà dans son cache
• Problème: les objets contenus dans le cache peuvent être obsolètes
• Solution: GET conditionnel
– client: spécifie la date de la copie cachée dans l'en-tête If-Modified-Since: <date>
– serveur: la réponse est vide si la copie cachée est à jour (HTTP/1.0 304 Not Modified); sinon HTTP/1.0 200 OK suivi des données
• Remarque
– Plus efficace que d'effectuer une requête HEAD, vérifier la date de l'objet et télécharger s'il a été modifié
– HEAD aurait besoin d'un RTT en plus
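
Esquisse (Python, stdlib): un GET conditionnel; l'URL et la date sont des exemples:

import http.client

conn = http.client.HTTPConnection("www.enst.fr", 80)
conn.request("GET", "/~drossi/index.html",
             headers={"If-Modified-Since": "Mon, 22 Apr 2001 10:00:00 GMT"})
resp = conn.getresponse()
print(resp.status, resp.reason)   # 304 Not Modified si la copie est à jour
body = resp.read()                # corps vide dans ce cas, sinon 200 + données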
HTTP Redirection
[Figure: le client envoie GET get_video?video_id=XYZ HTTP/1.1 au Web frontend de YouTube;
celui-ci géolocalise l'hôte et choisit un serveur vidéo proche et peu chargé, puis répond
HTTP/1.1 303 See Other, Location: http://sjc-v110.sjc.youtube.com/get_video?video_id=XYZ;
le client rejoue le GET vers ce serveur vidéo, qui répond 200 OK + Video data]
Proxy server (Cache Web)
• Intérêt: ne contacter le serveur d'origine que si nécessaire
• Deux types de proxy
– Explicite: configuration du browser pour qu'il pointe vers le proxy server
– Transparent: intercepte et modifie les paquets
• Fonctionnement
– Si l'objet est dans le cache, le proxy le renvoie tout de suite
– Sinon il demande au serveur d'origine, met en cache, et répond ensuite
– Proxy = client et serveur
• Remarque
– Le cache local HTTP permet aux browsers de garder les pages lues (~/.mozilla/Cache)
– Ce cache local n'est pas partagé
Proxy server (Cache Web)
• Principe
– Le cache est proche du client
– Cache partagé par tous les clients du même réseau
• Coût
– Réduction du débit à l'accès, économie de bande passante dans le cœur
– Réduction de l'Opex (facture ISP) avec un investissement Capex (serveur proxy)
• Performance
– Réduction du temps de réponse
• Délai plus faible en LAN (<1 ms) que sur Internet (parfois >100 ms)
• Capacité plus importante en LAN (Gbps) que sur le lien d'accès
• Privacy? …
[Figure: N clients partagent le proxy sur leur LAN, devant le lien d'accès vers Internet]
FTP
File Transfer Protocol (FTP)
• Service: transfert de fichiers entre hôtes
• Caractéristiques
– RFC 959
– Serveur FTP: port 21
– Deux connexions: contrôle et données
– Protocole à état (répertoire courant, authentification, ...)
• Paradigme client/serveur
– Requête/réponse; transfert en format binaire, textuel, ou avec compression
– Éventuellement, le client peut instruire une communication entre plusieurs serveurs
FTP: commandes, réponses
• Exemples de commandes (envoyées comme du texte ASCII sur le canal de contrôle):
– USER username
– PASS password
– LIST: renvoie la liste des fichiers du répertoire courant
– RETR filename: rapatrie le fichier (get)
– STOR filename: stocke le fichier sur l'hôte distant (put)
• Exemples de réponses (status code et explication, similaire à HTTP):
– 331 Username OK, password required
– 125 Data connection already open; transfer starting
– 425 Can't open data connection
– 452 Error writing file
FTP: commandes, réponses
% ftp hostname
Connected to hostname
Name (hostname:moi):
331 Password required for moi
331 Guest login ok, send e-mail address as password
Password:
230 user moi logged in
ftp>                  ← commandes internes du logiciel ftp, qui implémente le protocole FTP
• Commandes internes: ?, cd, lcd, ls, dir, pwd, open, close, bin, ascii, get, put, prompt, hash, mget, mput, del, mkdir, quit.
Active FTP
• Connexion de contrôle
– Rôle: échange des commandes et des réponses entre le client et le serveur
– Contrôle hors bande (mais capacité physique partagée avec les données)
– Client: open active depuis le port non privilégié N vers le port 21 du serveur
– Client: envoie PORT N+1 au serveur et open passive sur le port N+1
• Connexion de données
– Rôle: échange des fichiers de données vers/depuis l'hôte distant
– Serveur: open active vers le port N+1 du client à partir du port 20
– Souci de connectivité si l'hôte est derrière un NAT ou un firewall (SYN vers le port N+1 bloqué) => passive FTP
[Figure: connexion de contrôle N→21 (SYN / SYN+ACK / ACK, puis PORT N+1); connexion de données 20→N+1 (SYN / SYN+ACK / ACK)]
Passive FTP
• Connexion de contrôle
– Rôle: échange des commandes et des réponses entre le client et le serveur
– Contrôle hors bande (mais capacité physique partagée avec les données)
– Client: open active depuis le port non privilégié N vers le port 21 du serveur
– Client: envoie PASV au serveur
– Serveur: envoie PORT P (P>1023) au client et open passive sur le port P
• Connexion de données
– Rôle: échange des fichiers de données vers/depuis l'hôte distant
– Client: open active vers le port P du serveur à partir du port N+1
– « Sens interdit » respecté dans le cas du firewall
– Table d'association du NAT respectée en cas d'adresse intranet du client
[Figure: connexion de contrôle N→21 (PASV; réponse PORT P); connexion de données N+1→P (SYN / SYN+ACK / ACK)]
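
Esquisse (Python): la stdlib ftplib permet de choisir le mode; l'hôte et le compte sont des exemples:

from ftplib import FTP

ftp = FTP("ftp.example.com")              # connexion de contrôle vers le port 21
ftp.login("anonymous", "moi@example.com")
ftp.set_pasv(True)                        # passive FTP (défaut); False => active
ftp.retrlines("LIST")                     # LIST sur le canal de contrôle, listing
                                          # reçu sur la connexion de données
ftp.quit()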
References
• Mandatory readings
– S. Alcock, R. Nelson, "Application flow control in YouTube video streams", ACM SIGCOMM CCR, Vol. 41, No. 2, April 2011. http://dl.acm.org/citation.cfm?id=1971166
– A. Finamore et al., "YouTube everywhere: Impact of Device and Infrastructure Synergies on User Experience", ACM IMC'11, Nov 2011. http://docs.lib.purdue.edu/ecetr/418/
• Optional readings
– J.C. Mogul, "The case for persistent-connection HTTP", ACM SIGCOMM 1995. http://dl.acm.org/citation.cfm?id=217465
?? || //
2.5 eMail: SMTP POP et IMAP
RES 224
Architecture des applications Internet
eMail: SMTP/POP3+IMAP4
Dario Rossi
http://www.enst.fr/~drossi
Plan
• eMail
• Architecture
• SMTP + POP/IMAP
• Le protocole SMTP
• Commandes
• Format de messages
• Aperçu de POP/IMAP
• SPAM et email forging
• References
eMail: les principes
• Principes similaires au courrier postal
– Acheminement en fonction de l'adresse de destination
– Incapacité d'authentifier l'adresse de l'expéditeur
– Système asynchrone: décorrélation entre l'émetteur et le récepteur
• L'émetteur n'attend pas la disponibilité du récepteur
• Le récepteur consulte sa boîte à son rythme
– Pas de garantie de remise des messages
• Acquittement sous le contrôle du récepteur
• Pour/contre
– Plus rapide que le courrier papier
– Même pas besoin d'un timbre (mais d'un PC et d'un forfait ISP)
– SPAM (début 2000, plus de SPAM que d'eMail légitime)
– Usage excessif (cause d'improductivité)
eMail: Architecture
[Figure: Message Handling System (MHS) — les User Agents (UA) soumettent les messages via SMTP
au Message Transport System (MTS), composé de Message Transfer Agents (MTA); les messages
aboutissent dans le Mailbox Storage (MS), consulté par les UA via POP3/IMAP4]
eMail: Architecture
• Agents utilisateurs
– Composition, édition de mail
– Envoi du message vers le serveur (protocole SMTP)
– Lecture des mails (POP/IMAP)
– Thunderbird, Outlook, elm, pine, etc.
• Daemons
– Serveur de mail SMTP
– File d'attente des messages sortants
– Boîte aux lettres des messages entrants pour les usagers locaux
eMail: Architecture
1) Alice utilise son mailer pour composer un message; le mailer d'Alice envoie le message à son serveur de mail, le message est mis dans la file de sortie
2) Le serveur de mail effectue un DNS MX lookup pour trouver le serveur destinataire
3) Le côté client du serveur SMTP ouvre une connexion TCP avec le serveur mail de Bob, et lui envoie le message d'Alice
4) Le serveur mail de Bob place le message dans la boîte aux lettres de Bob, qui consulte la boîte avec POP/IMAP
Simple Mail Transfer Protocol (SMTP)
• Simple Mail Transfer Protocol
– Défini dans les RFC 821 et 822; upgradés par les RFC 2821, 2822, … 5321 (octobre 2008)
– Utilisation du port TCP 25
– Exemples d'implémentation: sendmail, postfix
• Intérêt
– Transfert fiable du message depuis un client vers un serveur: non seulement au niveau TCP, mais au niveau application
– Transfert direct entre le serveur émetteur et le serveur récepteur (historiquement, transfert en plusieurs étapes)
• Caractéristiques
– Paradigme client-serveur: client = émetteur de mail (PUSH), serveur = récepteur de mail; à chaque étape, le serveur devient client
– Protocole à état: handshake (établissement de la connexion), transfert des messages, fermeture de la connexion
– Interaction commande/réponse: commande en texte ASCII, réponse = code d'état + phrase
– Format de messages: ASCII 7 bits
Envoi en plusieurs étapes
• Dans quels cas a-t-on besoin de plusieurs étapes ? Après tout, grâce au DNS on connaît le destinataire final…
– Source routing (deprecated)
– Relay: serveurs SMTP peu puissants (e.g., qui ne ré-essayent pas la transmission, mais délèguent par défaut à un relay SMTP)
– Gateway: protocoles différents (e.g., il faut passer par un gateway qui connaît les deux protocoles, SMTP et XYZ)
• À chaque passage, il y a une délégation de responsabilité pour la fiabilité de bout en bout
Protocole SMTP: requêtes
• Les requêtes (en ASCII) se terminent par CR LF
• Requêtes minimales supportées par toutes les implantations:
– HELP: renvoie les commandes disponibles
– HELO domaine: identification du domaine
– MAIL FROM expéditeur: identifie l'expéditeur par son adresse
– RCPT TO récepteur: identifie le récepteur par son adresse
– DATA: début du corps du message (se termine par un '.' seul sur une ligne, en première colonne)
– RSET: reset
– VRFY: vérifier l'adresse d'une personne
– QUIT: fin
Protocole SMTP: réponses
• Les réponses sont composées de trois digits:
– 2XX: réponse positive
– 3XX: réponse positive intermédiaire
– 4XX: réponse négative transitoire
– 5XX: réponse négative définitive
• Les deux derniers digits précisent le code retour; exemples:
– 220 Service ready
– 251 User not local; the message will be forwarded
– 354 Start mail input
– 452 Command aborted; insufficient storage
– 500 Syntax error; unrecognized command
– 551 User not local
Example de session SMTP
S: 220 hamburger.edu
C: HELO crepes.fr
S: 250 Hello crepes.fr, pleased to meet you
C: MAIL FROM: <alice@crepes.fr>
S: 250 alice@crepes.fr... Sender ok
C: RCPT TO: <bob@hamburger.edu>
S: 250 bob@hamburger.edu ... Recipient ok
C: DATA
S: 354 Enter mail, end with "." on a line by itself
C: Do you like ketchup?
C: How about pickles?
C: .
S: 250 Message accepted for delivery
C: QUIT
S: 221 hamburger.edu closing connection
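
Esquisse (Python): la même session via smtplib (stdlib), qui gère HELO/EHLO, MAIL FROM, RCPT TO et DATA; serveur et adresses reprennent l'exemple:

import smtplib

msg = "Subject: Ketchup?\r\n\r\nDo you like ketchup?\r\nHow about pickles?\r\n"
with smtplib.SMTP("hamburger.edu", 25) as s:        # HELO/EHLO implicite
    s.sendmail("alice@crepes.fr", ["bob@hamburger.edu"],
               msg)                                  # MAIL FROM / RCPT TO / DATA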
Escape
• Si on écrit un mail avec des lignes qui ne contiennent qu'un seul point ".", comment peut-on faire ?
– SMTP va interpréter cela comme la fin du message
• Escape !
– En émission: si la ligne éditée par l'émetteur commence par ".", on y ajoute un "." supplémentaire au début
– En réception: si la ligne commence par un ".", on l'enlève; s'il ne reste que "CRLF", c'est bien la fin de DATA
• Trick commun avec les protocoles de niveau 2 (HDLC, PPP)
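
Esquisse (Python) de cet échappement (« dot-stuffing »):

def stuff(lines):      # en émission
    return ["." + l if l.startswith(".") else l for l in lines]

def unstuff(lines):    # en réception (après détection de la fin de DATA)
    return [l[1:] if l.startswith(".") else l for l in lines]

assert unstuff(stuff([".", "..", "texte"])) == [".", "..", "texte"]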
Format de mail
• Format du courriel échangé par SMTP, défini dans le RFC 822:
– Lignes d'en-tête (différentes des commandes SMTP), ex.: To:, From:, Subject:
– Ligne vide
– Corps du message: le texte (et les pièces jointes) en caractères ASCII uniquement
• Pas besoin de CRC, checksum… pourquoi ?
Formats de mail – RFC 822
• Pour le protocole de mail, pas de différence entre To:/Cc:
– Différence psychologique pour les usagers
– Différence entre To:/Cc: et Bcc: (ce dernier est caché aux premiers)
• From:/Sender: peuvent différer (e.g., Boss/Secrétaire)
Formats de mail – RFC 822
• X- « custom » headers
– X-Confirm-reading-to: « the sender wishes to be notified… »
• Maintenant, beaucoup de header lines pour le spam et les virus
– X-Spam-Assassin, X-Virus-Guardian, …
Exemples d'en-têtes
[Figure: captures d'en-têtes réels]
Comme commande ou dans l'en-tête ?
• "MAIL FROM" vs "From:", "RCPT TO" vs "To:" — pourquoi en double ?!
• Destinataires multiples (user@a.com, @b.com, @c.com)
– Le même message est envoyé à plusieurs destinataires par différentes commandes RCPT TO vers différents serveurs
– Comment user@a.com pourrait-il savoir pour b.com et c.com ?
• Mailing list
– On veut par exemple qu'en cas d'erreur, un message soit adressé à l'administrateur de la liste et non pas à l'émetteur du message
– Le serveur SMTP en réception substitue la commande MAIL FROM: avec l'administrateur; les erreurs en phase de transmission lui seront adressées
– L'en-tête From: reste inchangé; les membres de la liste pourront répondre à celui qui a écrit le message
Format de mail: envoi de données non ASCII
• Le corps du mail ne contient que des caractères ASCII; or:
– Alphabets avec accents (français, allemand)
– D'autres alphabets (hébreu, russe)
– Pas d'alphabet du tout (chinois, japonais)
– Qu'est-ce qu'un « alphabet » pour de l'audio, des images, du binaire ?
• Pour envoyer des données multimédia ?
– Multi-purpose Internet Mail Extensions (MIME)
– Défini et mis en service en 1993, RFCs 1341, 2045, 2049
– Le but de MIME est de coder tout « attachement » en caractères ASCII
Format de message: MIME
• Lignes supplémentaires dans l'en-tête du message pour déclarer un contenu de type MIME:

From: alice@crepes.fr
To: bob@hamburger.edu
Subject: Picture of crepes.
MIME-Version: 1.0                    ← version MIME
Content-Transfer-Encoding: base64    ← méthode utilisée pour coder les données
Content-Type: image/jpeg             ← type, sous-type des données multimédia (déclaration de paramètres)

base64 encoded data .....            ← données codées
.........................
......base64 encoded data
30/08/20
122
D. Rossi – RES224
Format de message: MIME
[Tableau: types MIME définis à l'origine — text, image, audio, video, application, message, multipart]
Remarque: lorsqu'un nouveau format de fichier apparaît (e.g., Flash, 3DTV), il suffit de déclarer un nouveau type/sous-type MIME
Format de message: MIME
From: [email protected]
To: [email protected]
Subject: Picture of yummy crepe.
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=StartOfNextPart
Dear Bob, Please find a picture of a crepe.
--StartOfNextPart
Content-Transfer-Encoding: base64
Content-Type: image/jpeg
base64 encoded data .....
.........................
......base64 encoded data
--StartOfNextPart
Do you want the reciple?
--StartOfNextPart--
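
Esquisse (Python): le même message multipart construit avec le paquet email de la stdlib; crepe.jpg est un fichier d'exemple:

from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.image import MIMEImage

msg = MIMEMultipart("mixed")
msg["From"], msg["To"] = "alice@crepes.fr", "bob@hamburger.edu"
msg["Subject"] = "Picture of yummy crepe."
msg.attach(MIMEText("Dear Bob, Please find a picture of a crepe."))
with open("crepe.jpg", "rb") as f:                  # pièce jointe d'exemple
    msg.attach(MIMEImage(f.read(), _subtype="jpeg"))  # base64 automatique
print(msg.as_string()[:400])                        # en-têtes + boundary générés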
Problèmes de SMTP
• Mail storm
– Référence circulaire dans les mailing lists: l'adresse de la mailing list B est dans la mailing list A, et l'adresse de la mailing list A est dans B…
– Solution: limiter le nombre de « passages » de serveur en serveur (max 100 champs "Received:" dans l'en-tête)
– Ça vous rappelle quelque chose ? (le TTL d'IP ?)
• Différences d'implémentation (timeouts)
– Considérez les cas A→B et A→Relay→B
– En cas de problèmes transitoires à B, si A ré-essaye pendant un temps TA, et le Relay essaye pendant TR < TA, on a des niveaux de fiabilité différents
SMTP vs HTTP
• Comparaison avec HTTP
– HTTP client: PULL; SMTP client: PUSH
– Les deux ont des interactions commande/réponse en ASCII et des codes d'état
– Encapsulation: HTTP encapsule chaque objet dans son propre message de réponse; SMTP envoie un message contenant plusieurs objets dans un seul message "multipart"
• Caractéristiques SMTP
– Interaction requête/réponse
– Connexion: utilise des connexions persistantes (plusieurs messages entre serveurs); moins de sockets, moins de slow start, plus de débit sur de gros volumes (imaginez AOL & Gmail)
– Messages: demande que les messages (en-tête ET corps) soient en ASCII; certaines chaînes de caractères ne sont pas autorisées dans les messages (ex.: CRLF.CRLF); le serveur SMTP utilise CRLF.CRLF pour reconnaître la fin du message; les messages doivent alors être codés (généralement en base64)
eMail: Protocoles d'accès
• SMTP: livraison/stockage chez le serveur en réception
• POP/IMAP/HTTP: protocoles d'accès au mail depuis le destinataire
– Post Office Protocol (POP3, RFC 1939)
• Simple: authentification et téléchargement
• Protocole avec état, mais peu d'état associé aux messages (lu/non lu)
– Internet Message Access Protocol (IMAP4, RFC 1730)
• Plus de fonctionnalités (plus complexe)
• Manipulation sophistiquée des messages stockés sur le serveur
– HTTP
• Services Webmail type Gmail, Hotmail, Yahoo! Mail, etc.
Protocole POP3
• POP3: Post Office Protocol (version 3)
– Retrait de messages sur un serveur de messagerie
– RFC 1939, TCP port 110
– Position du côté de l'UA (Thunderbird, Outlook, …)
– Supporté par tous les serveurs de messagerie
• Caractéristiques
– Paradigme client/serveur de type requête/réponse
– Direction des données: du MTA vers l'UA (PULL)
– Format PDU en ASCII
– Les messages sont gérés et manipulés localement côté UA
– Au plus, on peut laisser une copie des messages dans le MS du serveur
– Peu flexible: un message ne peut être lu que s'il a été récupéré entièrement
Protocole POP3
• Protocole à état (3 phases): Authorization, Transaction, Update
• Format requête: cmd arg1 arg2 …; format réponse: +OK arg1 arg2 … / -ERR arg1 arg2 …
• Commandes — Authorization: USER nom, PASS passwd, QUIT; Transaction: STAT, LIST [msg], RETR msg, DELE msg, NOOP, RSET, QUIT

Phase d'autorisation:
S: +OK POP3 server ready
C: user alice
S: +OK
C: pass hungry
S: +OK user logged on

Phase de transaction:
C: list
S: 1 498
S: 2 912
S: .
C: retr 1
S: <message 1 contents>
S: .
C: dele 1
C: retr 2
S: <message 2 contents>
S: .
C: dele 2
C: quit
S: +OK POP3 server off
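
Esquisse (Python): la même session avec poplib (stdlib); l'hôte est un exemple:

import poplib

pop = poplib.POP3("pop.example.com", 110)
pop.user("alice")
pop.pass_("hungry")
for line in pop.list()[1]:                 # LIST: "num taille" par message
    print(line.decode())
resp, lines, octets = pop.retr(1)          # RETR 1: contenu du message
pop.dele(1)                                # DELE 1: marqué pour suppression
pop.quit()                                 # phase Update (suppressions appliquées)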
Protocole IMAP4
• IMAP4: Internet Message Access Protocol (version 4)
– Accès au serveur de messagerie
– TCP port 143, RFC 2060
– Position du côté de l'UA (Thunderbird, Outlook, …)
– De plus en plus supporté par les serveurs de messagerie (Gmail depuis ~2009)
• Caractéristiques
– Paradigme client/serveur de type requête/réponse
– Direction des données: du MTA vers l'UA (PULL)
– Format PDU en ASCII
– Initialement conçu pour les stations ayant peu de ressources => maintenant bien adapté aux usagers en mobilité
– Gestion, stockage, état du côté serveur
– État associé aux messages (vu, effacé, réponse faite, dossiers, tags)
– Flexible: IMAP4 donne accès aux en-têtes des messages
• e.g., en-têtes téléchargés d'abord; corps lorsqu'on sélectionne un message; les pièces jointes ne sont téléchargées que si demandé par l'usager
Protocole IMAP4

Commande                          Réponse attendue        Description
LOGIN compte mot_de_passe         OK LOGIN                Se connecte à la boîte aux lettres
SELECT dossier                    Mode du dossier &       Sélectionne un dossier à afficher
                                  OK SELECT
FETCH n°_de_message               Texte du message &      Récupère le message à l'aide du n° de message
                                  OK FETCH
STORE n°_de_message \indicateur   OK STORE                Marque un message en vue de sa suppression ou pour lui affecter l'état Lu/Non lu
EXPUNGE                           OK                      Supprime tous les messages marqués
LOGOUT                            OK                      Termine la session

IMAP4 vs POP3
[Tableau comparatif]
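
Esquisse (Python): une session IMAP4 équivalente avec imaplib (stdlib); hôte et compte d'exemple:

import imaplib

imap = imaplib.IMAP4("imap.example.com", 143)
imap.login("alice", "hungry")
imap.select("INBOX")                              # SELECT
typ, data = imap.fetch("1", "(BODY[HEADER])")     # en-têtes d'abord
print(data[0][1].decode())
imap.store("1", "+FLAGS", r"\Deleted")            # STORE: marque le message
imap.expunge()                                    # EXPUNGE
imap.logout()                                     # LOGOUT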
SPAM
• Attaques sur la messagerie: messages non sollicités ou SPAM
• Types d'attaques
– Email bombing: émission du même message n fois vers un récepteur
– Email spamming: émission du même message vers n récipiendaires
– Email spoofing (forging): émission d'un message avec un identifiant usurpé
– Email phishing: spamming + spoofing + fausse URL (note: faux URL parfois bien caché !)
– Email infection: spamming + spoofing + malware (virus/backdoor/trojan…)
SPAM
• Quel émetteur ?
– Usurpation des adresses (IP, usagers, serveurs)
– Usurpation des comptes utilisateurs
– Usage des maileurs officiels et gratuits (Yahoo, Voila, Hotmail, …)
– Usurpation des ressources de serveurs relais
– Outils de scan de plages d'adresses pour trouver un service sur le port 25
– Configuration de comptes en forwarding
– Mise en place de serveur et domaine officiels pour une courte durée
• Quel récepteur ?
– Récolte des adresses email par Web, news, IRC, salons, exploration des domaines, robots, listes de discussion
– Achat/échange de listes
– Sous-traitance, marketing officiel
SPAM
• Conséquences
– Pour le réseau:
• Saturation et occupation de ressources (bande passante réseau)
• Établissement de connexions TCP pour le serveur de messagerie
• Encombrement serveur (beaucoup de messages inutiles à traiter)
• Encombrement de la mailbox cliente
– Pour les usagers:
• Nuisance
• Détournement des informations et code malicieux
• Non-conformité aux mœurs
• Impact économique
– Au début des années 2000, SPAM > mail normal
– En 2010, SPAM énormément réduit (~100 botnets)
SPAM
• Efficacité du SPAM ?
– SPAM au large (e.g., botnets): pas rentable
– Social SPAM (e.g., postcards envoyées sur des comptes hackés, address books personnalisés): plus efficace
• Rentabilité du business model ? Cfr. additional readings
SPAM
• Solutions
– Filtrage
• Filtrage sur l'en-tête (blacklist, whitelist, greylist, …)
• Filtrage sur le contenu (filtres bayésiens, de Bloom, Support Vector Machines ou techniques d'apprentissage/reconnaissance d'images)
• Inconvénient: les faux positifs
– Authentification
• DomainKeys, S/MIME
• Inconvénient: infrastructure de distribution de clés
– Identification
• Sender Policy Framework (usage de DNS RR, cfr. slide suivant)
• Inconvénient: nécessite une application globale
– Induire un coût à l'émetteur
• Coût en $ par message
• Inconvénient: pénalise tout le monde et nécessite de modifier les protocoles sous-jacents
– Bloquer les comptes bancaires des spammers
• 3 banques utilisées par >90% des spammers
• Inconvénient: problème de régulation (pas technique)
Email forging
• Don't try this at home
– Btw, the server knows who you are :)
• Sender Policy Framework (SPF)
– RFC 4408 (avril 2006)
– Le DNS est déjà utile pour trouver le serveur responsable de recevoir les mails @example.com:
• example.com. MX mail.example.com
– Le DNS peut être utilisé aussi pour définir qui est autorisé à envoyer les mails de @example.com:
• example.com. TXT "v=spf1 +mx a:mail.example.com -all"
– En réception, query TXT d'example.com.
• Rejet d'un message d'un expéditeur @example.com envoyé par un autre serveur que mail.example.com !
• Pas besoin d'outils compliqués… il suffit d'un terminal
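
À titre d'illustration, la vérification côté réception se réduit à une query TXT (domaine d'exemple):

dig +short TXT example.com
# "v=spf1 +mx a:mail.example.com -all"   ← l'enregistrement d'exemple ci-dessus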
Conclusions
• eMail
– La plus vieille « killer application » d'Internet
– Architecture plus compliquée, 3 protocoles
– Encore d'actualité (cependant Facebook et similia)
– Comme les autres applications Internet, évolution continue (e.g., SPAM comes and goes) et usages inattendus (IMAP)
References
• Optional readings
– J. Caballero, C. Grier, C. Kreibich and V. Paxson, "Measuring Pay-per-Install: The Commoditization of Malware Distribution", USENIX Security Symposium, Aug 2011. http://www.icir.org/vern/papers/ppi-usesec11.pdf
– C. Kanich et al., "Show Me the Money: Characterizing Spam-advertised Revenue", USENIX Security Symposium, Aug 2011. http://www.icir.org/vern/papers/ppair-usesec11.pdf
?? || //
2.6 Les applications P2P: Introduction et DHT
RES 224
Architecture des applications Internet
Peer-to-peer
Dario Rossi
http://www.enst.fr/~drossi
Agenda
• Introduction on P2P
–
–
–
–
Recap on client-server vs P2P paradigms
Interest of P2P paradigm
P2P networks and Overlay graphs
(vanilla) taxonomy of P2P applications
• Finding content (today)
– Napster, Gnutella, DHTs (Chord)
• Diffusing content
– BitTorrent, P2P-TV
• Transmission strategies:
– Skype and BitTorrent congestion control
Client-server paradigm
• Clients:
– Run on end-hosts
– On/off behavior
– Service consumers
– Issue requests, receive services
– Do not communicate directly among them
– Need to know the server address
• Server:
– Runs on end-hosts
– Always on
– Service provider
– Satisfies requests from many clients
– Needs a fixed address (or DNS name)
Client-server paradigm
[Figure: server S distributes content to clients 1..N]
Client-server paradigm
[Figure: the server S has to provide all the necessary upload bandwidth]
Peer-to-peer paradigm
• Peers:
– Run on end-hosts
– On/off behavior
– Service providers and consumers
– Communicate directly among them
– Need to discover other peers
– Need to define communication rules
– Need to handle peer arrival and departure (churn)
Peer-to-peer paradigm
[Figure: peers may assist S using their upload bandwidth]
Peer-to-peer paradigm
[Figure: a peer-to-peer network does not necessarily need a server]
Notice that:
• Servers are typically needed for bootstrap
• Servers aren't needed for resource sharing
Client-server vs Peer-2-peer
• Interest of P2P
[Figure: server S serving N clients vs N peers also exchanging among themselves]
What is the minimum download time under either paradigm?
Client-server
• F-bits long file
• 1 server, with upload rate Us
• N clients; Di download rate of the i-th client
• Dmin = min(Di), slowest client
• Ti download time of the i-th client; T = max(Ti), system completion time
Assuming a simple fluid model where the server policy is to give each of the N clients an equal share Us/N of its rate, what is T equal to?
(1) T >= F / (Us/N) = NF / Us, i.e., the download cannot be faster than the share of the server upload capacity allows
(2) T >= F / Dmin, i.e., the download cannot be faster than the downlink capacity of the slowest client allows
Hence T >= F / min(Us/N, Dmin)
Peer-2-peer
• F-bits long file
• 1 source peer (having the content at time t=0), with upload rate Us
• N sink peers; Ui, Di upload & download rate of the i-th peer
• Dmin = min(Di), slowest peer
• Ti download time of the i-th peer; T = max(Ti), system completion time
Assuming a simple fluid model, the source gives bits to the peers, and each peer replicates the received data toward the other N-1 peers, bounded by its upload rate:
(1) T >= F / Us, i.e., no peer can receive faster than the source can send
(2) T >= NF / (Us + ΣUi), i.e., the overall data cannot be downloaded faster than the aggregated system capacity (with peers) allows
(3) T >= F / Dmin, i.e., the download cannot be faster than the downlink capacity of the slowest peer allows
Hence T = F / min(Us, (Us + ΣUi)/N, Dmin)
Client-server vs Peer-2-peer
• Interest of P2P
• P2P protocols can offload server capacity, allowing better scaling
• Example with file diffusion; the conclusion holds for many services
[Figure: file diffusion time Tmin (s) vs number of peers/clients N (10..100), client-server vs peer-2-peer, for file size F=10MB, peer upload Ui=500 Kbps, source upload Us=10 Mbps, peer download Di >> Ui]
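The bounds above can be turned into the numbers of the plot directly; a minimal sketch (parameters taken from the figure, Dmin terms omitted since Di >> Ui):

    # File diffusion time under both paradigms (fluid-model lower bounds)
    F = 10 * 8e6     # file size [bits] (10 MB)
    Us = 10e6        # source/server upload rate [bit/s]
    Ui = 500e3       # per-peer upload rate [bit/s]

    for N in range(10, 101, 10):
        t_cs = N * F / Us                          # T >= NF/Us
        t_p2p = F / min(Us, (Us + N * Ui) / N)     # T = F/min(Us, (Us+N*Ui)/N)
        print(f"N={N:3d}  client-server={t_cs:6.1f}s  peer-2-peer={t_p2p:6.1f}s")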
P2P Overlays
• A P2P network is a graph, where edges represent logical connections between peers
• Logical connection:
– Ongoing communication via TCP, UDP sockets
– Reachability with application-layer routing tables
P2P Overlays
• P2P networks are commonly called Overlays, as they are laid over the IP infrastructure
Notice that:
• Not all logical links are also physical
• Not all physical links are used
[Figure: application overlay (peers A, B, C) laid over the IP network (IPa, IPb, IPc)]
P2P Overlays
• P2P overlay graphs are very much different, and depend on implementation choices
[Figure: two different overlay graphs, App1 and App2]
P2P Overlays
• Peers
– Come and go, due to their will or due to failures
– Have/have not the resources you are looking for
– May/may not be willing to give the resource
• Challenges
– Effectively locate and share resources
– Be resilient in the face of churn and failures
– Scale to possibly several million users
– Incentivate peers' participation in the system
P2P services
• P2P software spans a fair range of services; the structure is not tied to a particular service
– File-sharing and content distribution: BitTorrent, eDonkey
– Voice/video call (VoIP/chat): Skype, Gtalk
– Television/VoD (live TV / VoD): SopCast, TVAnts, PPLive, Joost
– Indexing and search: Kademlia
– SW updates: RapidUpdate, DebTorrent, Apt-p2p, …
– CPU sharing
• Structure
– Structured
– Unstructured
– Hierarchical
Unstructured Overlay
• Peers
– arbitrarily connected
• Lookup
– typically implemented
with flooding
• Pros/Cons
– Easy maintenance, diffusion
– Highly resilient
– High lookup cost
• Examples
– BitTorrent, P2P-TV systems, Gnutella (up to v0.4)
Hierarchical Overlay
• Peers and super-peers
– Ps connect to super-Ps
– Super-Ps know Ps resources
• Lookup
– Flooding restricted to super Ps
• Pros/cons
– Lower lookup cost,
– Improved scalability
– Less resilient to super-Ps churn
• Examples
– Skype, eDonkey, Gnutella (v0.6 on)
Structured Overlay
• Peers
– Carefully connected, shaped as a precise overlay topology
• Lookup
– Overlay routing algorithms are not based on flooding
– Takes advantage of regularity
• Pros/cons
– Effective lookup & great scalability
– Complex maintenance
• Examples
– Distributed Hash Tables (DHTs)
P2P problems
– Lookup and routing
• Find the peers having the resource of interest (e.g., a file, or a contact for messaging/calls)
– Diffusion and scheduling
• Ensure timely and effective diffusion of a resource (e.g., a file, or a TV stream)
– Transport and congestion control
• Interactive, near-real-time applications (Skype)
• Background applications (e.g., BitTorrent)
P2P Lookup
P2P Lookup
• Lookup strategies
– Centralized indexing (Napster)
– Query via flooding (Gnutella)
– Distributed Hash Tables (Chord)
• Note
– For the time being, we do not deal with the problem of
what to do after the resource is located on the overlay
Lookup via Centralized Indexing
• Centralized server
– keeps a list L of the peers' resources
• Peers
– keep the resources R
– notify their resources to the server
• at startup
• anytime a new resource is added or deleted
Lookup via Centralized Indexing
• Example
– peer A searches for resource R, stored at peer B
– A asks server S for the location of R (1: R?)
– S replies to A with the address of B (2: B!)
– A contacts B and fetches R (3: R?, 4: R)
– A notifies S that it now owns R (5: R!)
– S updates the resource list
Lookup via Centralized Indexing
• Example (2)
– peer C searches for resource R, now stored at peers A and B
– C asks server S for the location of R (1: R?)
– S replies to C with the addresses of A and B (2: A,B)
– C selects the best peer among A, B
• probes delay, bandwidth, ...
– C gets R, then notifies S, which updates list L
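As a minimal sketch of the centralized scheme above (hypothetical names; Napster-like), the server is just a dictionary from resources to peer addresses:

    # Centralized index: the server keeps list L (resource -> set of peers)
    index: dict[str, set[str]] = {}

    def notify(peer: str, resource: str) -> None:
        # called at startup and whenever a resource is added
        index.setdefault(resource, set()).add(peer)

    def lookup(resource: str) -> set[str]:
        # "R?" query: returns the addresses of the peers owning R
        return index.get(resource, set())

    notify("B", "R")
    print(lookup("R"))   # {'B'}: A fetches R from B, then notifies the server
    notify("A", "R")
    print(lookup("R"))   # {'A', 'B'}: C can now select the best among A and B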
Lookup via Centralized Indexing
• Pros
– Simple architecture
– Only limited traffic to the server (queries)
– Peer-centric selection of the best candidates
• Cons
– Central database not scalable
– Single point of failure
– Single legal entity (Napster was actually shut down)
Flooding on Unstructured Overlay
• Peers
– each peer only stores its own content
• Lookup
– Queries forwarded to all neighbors
– Flooding can be dangerous
• generate a traffic storm on the overlay
– Need to limit flooding
• drop duplicated queries for the same element
• limit flooding depth (application level Time To Live field)
Flooding on Unstructured Overlay
• Example
– A looks for R, stored at D and F
• sets TTL=4, sends the query to B, C
– B, C don't have the resource R
• they forward the query to D, E respectively
– D has the resource
• the reply is routed back on the overlay
– E doesn't have the resource
• it forwards to G, H, which drop the query (TTL exceeded); the query does not reach F
Flooding on Unstructured Overlay
• Example (continued)
– A then requests R from D ("R, please"), and D returns it ("Voilà R!")
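A minimal sketch of TTL-limited flooding (hypothetical helper names; exact TTL accounting varies across systems):

    # TTL-limited flooding on an unstructured overlay.
    # graph: adjacency dict; holders: node -> set of resources it stores.
    def flood(graph, holders, start, resource, ttl=4):
        hits, seen = [], {start}
        frontier = [(start, ttl)]
        while frontier:
            node, t = frontier.pop(0)
            if resource in holders.get(node, set()):
                hits.append(node)            # reply routed back on the overlay
            if t == 0:
                continue                     # TTL exceeded: stop forwarding
            for nb in graph[node]:
                if nb not in seen:           # drop duplicated queries
                    seen.add(nb)
                    frontier.append((nb, t - 1))
        return hits                          # may miss holders beyond the TTL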
Flooding on Unstructured Overlay
• Pros
– Query lookup is greedy
– Simple maintenance
– Resilience to faults, due to flooding
• Cons
– Flooding sends (too) many messages
– Scalability is compromised
– Lookup may not even find the resource (due to the TTL)
Flooding on Hierarchical Overlay
• Peers
– Each peer only stores its own content
• Super-peers
– Index the content of the peers attached to them
• Lookup
– Peers contact their super-peers
– Flooding is restricted to super-peers
Flooding on Hierarchical Overlay
• Pros
– Lookup is still greedy and simple
– Efficient resource consumption
– Hierarchy increases scalability
• Cons
– Increased application complexity
– Less resilient to super-peer churn
– Need to carefully select super-peers
Lookup on Structured Overlay
• Peers
– Arranged on very specific topologies
– Implement a topology-dependent lookup to exploit the overlay regularity
• Indexing
– Peers and resources are indexed
– Structured overlays implement a hash semantic
• They offer insertion and retrieval of keys
• For this reason, they are called Distributed Hash Tables (DHTs)
30/08/20
150
D. Rossi – RES224
Lookup on Structured Overlay
• Indexing idea
– take an arbitrary key space, e.g., reals in [0,1]
– take an arbitrary map function, e.g., colorof(x)
– map peers to colors: λpeer = colorof(peer)
– map resources to colors: λresource = colorof(resource)
– assign each resource to the closest peer (in terms of λ)
Lookup on Structured Overlay
• Indexing in practice (e.g., the Chord DHT)
– Secure hash SHA1 (B bits): ID = SHA1(x) in [0, 2^B-1]
– Peers are assigned peerID = SHA1(peer)
– Resources are assigned fileID = SHA1(file)
– A file is inserted at the peer whose peerID is closest to (but not greater than) its fileID
– Peers are responsible for a portion of the ring
[Figure: Chord ring [0, 2^m-1] with peers P1, P8, P14, P32, P38, P48, P51, P56 and files F10, F15, F38, F50]
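A minimal sketch of this indexing (toy 8-bit ring, obtained here by reducing SHA1 modulo 2^B; note that real Chord stores a key at its successor, while this sketch follows the slide's "closest but not greater" rule):

    import hashlib

    B = 8                                   # toy ID space: [0, 2^B - 1]

    def chord_id(name: str) -> int:
        return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** B)

    peers = sorted(chord_id(p) for p in ["peerA", "peerB", "peerC", "peerD"])

    def responsible_peer(file_name: str) -> int:
        fid = chord_id(file_name)
        below = [p for p in peers if p <= fid]
        return max(below) if below else max(peers)   # wrap around the ring

    print(responsible_peer("song.mp3"))     # peerID in charge of this fileID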
Lookup on Structured Overlay
• Lookup for a fileID
– Simplest lookup possible
• Every node uses its successor
• The successor pointer is needed for lookup correctness
– Highly inefficient
• Linear scan of the ring: e.g., P8.Lookup(F56) traverses P14, P21, P32, P38, P42, P48, P51 before reaching P56
– Longer-reach pointers
• can be used to improve lookup efficiency
Lookup on Structured Overlay
– Keep a list of log(N) ``fingers''
• pointers to peers at exponential ID-space distance
• 1st finger is at peerID + 2^1
• 2nd finger is at peerID + 2^2
• k-th finger is at peerID + 2^k
• if peerID + 2^k does not exist, take the closest following peer
Lookup on Structured Overlay
– Finger table of P8 on the example ring (P1, P8, P14, P21, P32, P38, P42, P48, P51, P56):
k=1: +1 -> N14
k=2: +2 -> N14
k=3: +4 -> N14
k=4: +8 -> N21
k=5: +16 -> N32
k=6: +32 -> N42
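The table above can be reproduced mechanically; a minimal sketch (toy 6-bit ring; note the table uses offsets 2^(k-1), and each missing ID maps to the closest following peer):

    def fingers_of(peer: int, peers: list[int], m: int = 6) -> list[int]:
        ring = 2 ** m
        def successor(x: int) -> int:
            x %= ring
            # closest following peer, wrapping around the ring
            return min((p for p in peers if p >= x), default=min(peers))
        return [successor(peer + 2 ** (k - 1)) for k in range(1, m + 1)]

    peers = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]
    print(fingers_of(8, peers))   # [14, 14, 14, 21, 32, 42], as in the table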
Lookup on Structured Overlay
• Lookup for a fileID
– Idea: make as much progress as possible at each hop (greedy algorithm)
– So, choose the finger whose ID is closest to (but strictly less than) the fileID, e.g., P8.Lookup(F56)
– The next hops do the same
– Intuitively, the distance to the fileID halves at each step (dichotomic search)
– Consequently, the mean lookup length is logarithmic in the number of nodes
– Lookup can be recursive or iterative
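A minimal sketch of the greedy hop selection (hypothetical 'fingers' map; the toy rule below returns the node whose fingers make no further progress, glossing over real Chord's successor handling):

    RING = 2 ** 6

    def dist(a: int, b: int) -> int:
        return (b - a) % RING          # clockwise distance on the ring

    def lookup(node: int, file_id: int, fingers: dict[int, list[int]]) -> int:
        while True:
            # fingers preceding file_id, i.e., making progress without overshooting
            preceding = [f for f in fingers[node]
                         if 0 < dist(node, f) <= dist(node, file_id)]
            if not preceding:
                return node            # no closer finger: this region owns the key
            node = max(preceding, key=lambda f: dist(node, f))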
Lookup on Structured Overlay
• Pros
– ``Flat'' key semantic
– Highly scalable
– Tunable performance
• State vs lookup-length tradeoff
• Cons
– Much more complex to implement
– The structure needs maintenance (churn, join, refresh)
– Difficult to make complex queries (e.g., wildcards)
References
• Optional readings
– I. Stoica et al. Chord: A scalable peer-to-peer lookup service
for Internet applications, ACM SIGCOMM 2001
http://dl.acm.org/citation.cfm?id=964723.383071
Conclusions
• Peer-2-peer systems
– Class of massively distributed, hence scalable, applications
– Peers play both client and server roles
– Application-level network, overlaid on IP
– May offer a number of different services (lookup, download, VoIP, video, social networking, etc.)
– Significant differences between different systems!
– Very fast-paced, hardly controllable evolution: Skype was yesterday! Today it is PPLive… tomorrow!?
– Operators are now interested (not only for lawsuits!)
?? || //
2.7 Echange de données en P2P: BitTorrent
RES 224
Architecture des applications Internet
P2P: BitTorrent
dario.rossi
Dario Rossi
http://www.enst.fr/~drossi
RES224
Plan
• Acteurs et fonctionnement général
– Swarm, torrent, tracker, seed, leecher, chunks, ...
• Algorithmes et performance
– Strict priority, Rarest first, Tit-for-tat, Anti-snubbing, End-game mode, Choking/Unchoking, …
– Note: on ne regardera ni le type ni le format des messages
• Bilan
– Efficacité vs complexité
• Références
• LEDBAT: contrôle de congestion BitTorrent
– Pas objet du contrôle
Principes du partage en P2P
• Plusieurs méthodes pour localiser une ressource:
– Méthode distribuée
• Contact direct entre chaque utilisateur
• Gnutella, eDonkey, Skype, ...
– Méthode centralisée
• Contact établi par une entité
• Napster, BitTorrent, ...
• Plusieurs méthodes pour rapatrier la ressource:
– Méthode atomique
• Peu efficace: ressource rapatriée en entier à partir d'un seul peer
• Napster, Gnutella (initialement), …
– Méthode parcellisée
• Très efficace: ressource rapatriée de plusieurs peers (en parallèle)
• BitTorrent, eDonkey, P2P-TV, …
Principes de BitTorrent
[Figure: des leechers s'échangent des chunks du fichier (chunk transmission); messages transportés sur TCP (ancien) ou UDP/LEDBAT (nouveau)]
Terminologie
• Torrent
– Point de départ
– Fichier descriptif de la ressource, contenant aussi l'adresse IP et le numéro de port TCP d'un tracker
• Tracker
– Bootstrap server
– Élément central qui permet aux peers de prendre contact les uns avec les autres (il ne possède pas la ressource)
• Swarm
– Ensemble des peers participant au partage d'une même ressource
– Constitué de plusieurs types de peers:
• Leecher: peer qui télécharge
• Seed: peer qui upload seulement (possède une copie complète)
Terminologie
• Parcellisation
– Permet plusieurs échanges simultanés:
• Upload des chunks dès qu'on les possède
• Download des chunks qu'on ne possède pas
– Le fichier partagé est divisé en plusieurs parties égales
• Appelées chunks (typiquement 256 Ko)
– Divisées à leur tour en plusieurs morceaux
• Appelés pieces (15 Ko)
.torrent
• Fichier .torrent
– Nécessaire à un peer pour participer au partage d'une ressource (e.g., fichier de données) avec BitTorrent
– Localisé avec une méthode indépendante de BitTorrent (en général stocké sur un serveur Web)
• Ce fichier contient:
– La liste des chunks (pour savoir ce qu'il faut télécharger)
– Le mapping entre les différents chunks (pour reconstruire le fichier ensuite)
– Le checksum des chunks (pour en vérifier l'intégrité)
– L'identifiant d'un tracker (adresse IP / numéro de port)
• Multi-tracking
• Trackerless (Kad DHT, Peer Exchange PEX gossiping)
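À titre d'illustration, une esquisse minimale (valeurs hypothétiques) du contenu d'un .torrent, vu comme un dictionnaire bencodé:

    # Champs principaux d'un fichier .torrent (format BitTorrent)
    torrent = {
        "announce": "http://tracker.example.com:6969/announce",  # tracker (IP/port)
        "info": {
            "name": "fichier.iso",
            "length": 734003200,        # taille totale en octets
            "piece length": 262144,     # taille d'un chunk (256 Ko)
            "pieces": b"...",           # concaténation des SHA1 de chaque chunk
        },
    }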
Tracker
• Rôle: connaître les adresses IP et numéros de port des peers:
– Restituer une liste partielle de peers pour un torrent
– Aucune préférence: environ 50 peers choisis aléatoirement
• De meilleurs choix sont possibles: e.g., sélection basée sur la proximité des peers, ou la distance inter-AS (IETF ALTO)
• Ne possède pas d'information relative:
– Aux chunks que chaque peer possède
– À la capacité d'upload des peers
• Les mises à jour avec le tracker se font par les peers:
– Périodiquement (environ 30 min)
– Au premier contact (bootstrap)
– Quand un peer quitte le système
– Quand un peer a besoin de nouveaux peers
Swarm
• Le swarm a un but:
– améliorer les performances de téléchargement
• Pour cela, deux objectifs précis:
– Optimiser le temps de téléchargement du fichier
• plus de seeds = plus de sources potentielles
– Contraindre les leechers au partage
• Obliger au partage: pas d'upload, pas de download!
• Pour cela, deux moyens:
– Le choix des chunks (et des pièces) à télécharger
– Le choix des peers avec lesquels on échange
Algorithmes
• Pipelining:
– Utilisation de requêtes multiples pour des raisons de performance: ouverture de plusieurs connexions en parallèle (environ 5)
– Chaque connexion télécharge des pieces
• Sélection des pieces:
– augmenter la disponibilité (Rarest First) et la rapidité d'obtention des chunks (Strict Priority, Random First Piece)
• Sélection des peers:
– Tit-for-tat = résoudre le dilemme du prisonnier
• Travailler pour le bien de tous pour améliorer ses propres performances (optimistic unchoking)
• Pénaliser ceux qui ne respectent pas cette règle (choke, anti-snubbing)
Pieces: Strict Priority
• But:
– Obtenir le plus vite possible des chunks complets
• Fonctionnement:
– Si on envoie une requête pour une pièce d'un chunk,
– alors on devra envoyer des requêtes pour les autres pièces du même chunk, en priorité par rapport aux autres chunks
• Intérêt:
– L'upload n'est possible qu'une fois le chunk totalement téléchargé
– En fait, c'est seulement une fois le chunk téléchargé qu'on peut vérifier son intégrité et sa validité (checksum dans le .torrent)
Chunks: Rarest First
• But:
– minimiser les chevauchements (overlap) des chunks
[Figure: non-chevauchement vs chevauchement des chunks entre peers]
• Intérêt:
– Égalisation des ressources
• pas de « hotspot », i.e., de ressource peu disponible
– Meilleure utilisation de la capacité disponible
• faire en sorte qu'il y ait toujours quelque chose à télécharger
• il est moins probable qu'à un certain instant tous les peers soient intéressés par le même chunk peu (ou pas) disponible
Chunks: Rarest First
• Fonctionnement
– Quand un peer veut envoyer une requête pour un nouveau chunk,
– il demande à ses voisins les chunks qu'ils possèdent,
– et envoie une requête pour celui qui est le moins présent
• Remarque
– Le choix Rarest First est local, car un peer ne contacte que ses voisins
– En même temps, comme le voisinage des peers change, les choix locaux ont tendance à converger vers des choix globaux
Chunks: Rarest First
• Exemple (voir l'esquisse ci-dessous):
– On contacte les peers pour connaître les chunks qu'ils possèdent et que nous n'avons pas déjà (1, 2, 3, 4)
– On comptabilise les disponibilités de chaque chunk
– On choisit aléatoirement parmi les plus rares: Random(2, 3)
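Une esquisse minimale de ce choix (noms hypothétiques):

    import random
    from collections import Counter

    def rarest_first(chunks_des_voisins: list[set[int]], mes_chunks: set[int]) -> int:
        # disponibilité de chaque chunk manquant chez les voisins
        compte = Counter(c for chunks in chunks_des_voisins
                           for c in chunks if c not in mes_chunks)
        plus_rare = min(compte.values())
        # tirage aléatoire parmi les plus rares
        return random.choice([c for c, n in compte.items() if n == plus_rare])

    # Exemple de la figure: les chunks 2 et 3 sont les moins présents -> Random(2, 3)
    print(rarest_first([{1, 2, 4}, {1, 3, 4}, {1, 4}], mes_chunks=set()))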
Pieces: Random First Piece
• Exception aux algorithmes précédents
– Quand on vient juste de rentrer dans le système, on ne possède encore aucun chunk
– Problème: on ne peut pas uploader; c'est pourquoi on doit obtenir rapidement un chunk
• Fonctionnement
– Tant qu'on ne possède pas un chunk complet,
– on envoie des requêtes pour des pièces aléatoires de chunks
Pieces: Endgame Mode
• Intérêt:
– Aider les leechers à devenir des seeds: plus de sources dans le système
• Problèmes:
– Malgré Rarest First, certains chunks peuvent rester globalement plus rares que d'autres (Rarest First est appliqué localement)
– Aussi, en fin de téléchargement, on est plus contraint sur le choix des chunks à télécharger
• Fonctionnement:
– Pour éviter de rester bloqué, même si toutes les pièces qu'il nous reste à télécharger sont hautement demandées,
– on envoie une requête pour toutes les pièces manquantes à tous les peers avec lesquels on communique
Peers: Choke/Unchoke
• Problème
– Si aucun peer n'upload, alors le partage est impossible
• Solution
– Imposer la réciprocité des échanges, en arrêtant l'upload vers les peers dont on télécharge peu
• Fonctionnement (toutes les 10 secondes; cf. l'esquisse ci-dessous)
– calculer le taux de téléchargement de chaque voisin
– conserver les 4 meilleurs
– rétablir l'upload vers ceux qui en font à nouveau partie (unchoke)
– couper l'upload vers ceux qui ne figurent plus dans le classement (choke)
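Une esquisse minimale du rechoke périodique (noms hypothétiques):

    def rechoke(taux_download: dict[str, float], unchokes: set[str]):
        # conserver les 4 voisins dont on télécharge le plus
        meilleurs = set(sorted(taux_download, key=taux_download.get,
                               reverse=True)[:4])
        a_unchoker = meilleurs - unchokes    # rétablir l'upload
        a_choker = unchokes - meilleurs      # couper l'upload
        return meilleurs, a_unchoker, a_choker

    # à appeler toutes les 10 secondes
    etat, plus, moins = rechoke({"A": 50.0, "B": 10.0, "C": 80.0,
                                 "D": 5.0, "E": 30.0}, {"A", "B", "D"})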
Peers: Optimistic Unchoking
• Problèmes
– Empêcher qu'on n'échange que dans un groupe réduit
– Ne pas exclure les nouveaux arrivants
– Permettre de découvrir d'autres peers (possibilité d'obtenir un meilleur taux d'upload)
• Solution
– Processus de choix aléatoire (avec une fréquence plus lente)
• Fonctionnement (toutes les 30 secondes)
– choix aléatoire d'un peer parmi les 4 vers qui on upload
– on lui coupe l'upload (choke)
– on choisit un peer aléatoirement parmi ceux à qui on upload le moins
– on lui rétablit l'upload (unchoke)
Peer: Anti-snubbing
• Problème
– Snub = uploader sans télécharger (inverse du free-riding)
– Lié au choke: on ne fait pas partie des 4 meilleurs uploaders des peers avec qui on échange
• Solution
– Se créer les conditions pour l'optimistic unchoke (uploader très peu)
• Fonctionnement:
– Si un peer, à qui on permet l'upload, nous bloque pendant plus de 1 min (choke),
– on lui coupe l'upload (choke),
– et on attend un optimistic unchoke en notre faveur avant de rétablir l'upload (unchoke)
Peer: Seeding
• Seeding
– Le seul but d'un seed est d'uploader
• Fonctionnement
– Un seul algorithme, proche du choke:
– uploader vers ceux vers qui on upload déjà le plus
– Idée: maximiser le transfert vers ceux qui peuvent devenir seeds le plus rapidement
• Remarque
– Il existe d'autres algorithmes liés au seeding,
– e.g., au début, quand il n'existe qu'un seul seed (super-seeding)
Bilan BitTorrent: Efficacité
• Localisation
– Connaissance simple des peers qui partagent, grâce:
• au réseau DHT (trackerless)
• à une entité centralisée (tracker)
• à la possibilité d'avoir plusieurs trackers en même temps (multi-tracker)
• à la possibilité d'utiliser les méthodes multi-tracker et trackerless en même temps
• Diffusion
– Diffusion rapide grâce:
• aux chunks, qui permettent le pipelining et le téléchargement en parallèle
• aux politiques de choix des chunks et des peers
• à la dissuasion du free-riding (réciprocité et incitation à l'upload)
Bilan BitTorrent: Complexité
• Beaucoup d'extensions
– Plusieurs trackers
– Table de hachage (trackerless)
– Super-seeding
– Encryption
– LEDBAT congestion control
• Introduit en décembre 2008
• Par défaut depuis avril 2010
• Beaucoup de clients
– µTorrent, Azureus, BitTornado, BitTyrant, Xunlei, …
• Matrice clients vs extensions
– Un seul protocole, mais beaucoup d'implémentations
– Importance de la normalisation (standards)
– BitTorrent Enhancement Proposal (BEP)
[Figure: matrice client × extension (supporté / pas supporté / n/a)]
References
• Mandatory readings
– P. Marciniak et al., Small is not always beautiful, USENIX IPTPS 2008,
Tampa Bay, FL Feb 2008 http://hal.inria.fr/inria-00246564/en/
– A. Legout et al., Clustering and sharing incentives in BitTorrent, ACM
SIGMETRICS, San Diego CA, Jun 2007,
http://hal.inria.fr/inria-00137444/en/
• Optional readings
– C. Testa and D. Rossi, The Impact of uTP on BitTorrent completion time
IEEE P2P 2011, Kyoto Aug 2011
http://www.enst.fr/~drossi/paper/rossi11p2p.pdf
Pas objet du contrôle
A crash course on LEDBAT, the new BitTorrent congestion control protocol
Dario Rossi
Joint work with Claudio Testa, Silvio Valenti, Luca Muscariello, Giovanna Carofiglio, ...
(PAM'10, Zurich, April 2010; ICCCN'10, Zurich, August 2010; LCN'10, Denver, October 2010; Globecom'10, Miami, December 2010; P2P'11, Kyoto, August 2011; ...)
News from congestion control world
• BitTorrent announces
closed source code,
data transfer over UDP
• BitTorrent + UDP
= Internet meltdown!
• After BitTorrent denial
and some discussion...
• Everybody agreed that
Internet not gonna die
• But is UTP/LEDBAT
the best approach ?
BitTorrent and LEDBAT
• BitTorrent had to explain itself
– Positive side effect of the Internet meltdown buzz
• BitTorrent co-chairs the LEDBAT IETF WG
– Low Extra Delay Background Transport protocol
– delay-based protocol
– designed for low-priority transfers
• Protocol goals
– Efficiently use the available bandwidth
– Keep the delay low on the network path
– Quickly yield to regular TCP traffic
• Novel ingredient: congestion control to relieve self-induced congestion at the access
• Open questions
– Does LEDBAT really achieve its goals?
– What about the BitTorrent implementation of LEDBAT?
LEDBAT evolution
Flavor timeline (open source up to TCP, closed source afterwards):
– TCP, v5.2.2, Oct '08
– α1, v1.9-13485, Dec '08 (first LEDBAT draft)
– α2, v1.9-15380, Mar '09
– β1, v1.9-16666, Aug '09 (draft as WG item)
– RC1, v2.0.1, Apr '10 (after IETF 77)
Methodology: simulation for α1; testbed for α2, β1, RC1.
[Figure: packet size [Bytes] vs time for TCP, α1, α2, β1]
• TCP transfers, full payload
• α1: small packet overkill!!
• α2: variable framing (not in draft)
• β1: finer bytewise cwnd control
• 25 October 2010: IETF draft v3
Outline
• Analysis of LEDBAT with a twofold methodology:
• Controlled simulation
– LEDBAT algorithm
– TCP vs LEDBAT
– LEDBAT vs LEDBAT
– Fairness issue
– Future Work
• Experimental measurements
– Testbed description
– Implementation evolution
– Bandwidth/delay impairments
– ADSL experiments
– Multiflow experiments
– Future Work
Pas objet du contrôle
LEDBAT: the new BitTorrent congestion control protocol
Dario Rossi
Joint work with Claudio Testa, Silvio Valenti and Luca Muscariello
ICCCN'10, Zurich, August 2010
TCP: Loss-based congestion control
• Objective of congestion control
– Limit the sender rate to avoid overwhelming the network with packets
• TCP detects congestion by means of packet losses
– Increment the congestion window (cwnd) by one packet each RTT
– Halve the congestion window upon losses
[Figure: sawtooth cwnd vs time, with losses at the peaks]
– The buffer always fills up -> high delay for interactive applications
LEDBAT: Delay-based congestion control
• Use delay as the congestion signal (as in TCP Vegas)
– Increasing delay = increasing queue -> need to decrease the rate
• LEDBAT monitors the delay on the forward path
– sender and receiver timestamp packets
– the sender maintains a minimum over all delays (base delay)
– LEDBAT tries to add just a small fixed delay (TARGET)
– a linear controller adjusts the cwnd
[Figure: cwnd vs time; once the TARGET is reached, only TARGET packets are enqueued: low extra delay and no loss]
LEDBAT operations
• Pseudocode in the draft:

    @RX:
        # 1) one-way delay estimation
        remote_timestamp = data_packet.timestamp
        ack.delay = local_timestamp() - remote_timestamp
        ack.send()

    @TX:
        delay = ack.delay
        update_base_delay(delay)          # 2) base delay = min(delays)
        update_current_delay(delay)
        # 3) queuing delay estimation
        queuing_delay = current_delay() - base_delay()
        off_target = TARGET - queuing_delay
        cwnd += GAIN * off_target / cwnd  # 4) linear controller

• TARGET = 25 ms ("magic number" fixed in the draft)
• GAIN = 1/TARGET (our choice, compliant with the draft)
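A minimal runnable sketch of the same controller (assumptions: toy one-way delay samples in ms, cwnd in packets):

    TARGET = 25.0          # ms, fixed in the draft
    GAIN = 1.0 / TARGET    # compliant with the draft

    base_delay = float("inf")
    cwnd = 2.0             # initial congestion window [packets]

    def on_ack(one_way_delay: float) -> float:
        """Update cwnd from one one-way delay sample carried by an ACK."""
        global base_delay, cwnd
        base_delay = min(base_delay, one_way_delay)   # base delay = min(delays)
        queuing_delay = one_way_delay - base_delay    # queuing delay estimate
        cwnd += GAIN * (TARGET - queuing_delay) / cwnd
        return cwnd

    for d in [40, 42, 45, 50, 60, 64, 66]:            # toy samples
        print(f"delay={d} ms -> cwnd={on_ack(d):.2f}")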
Simulation preliminaries
• We implemented LEDBAT as a new TCP flavor in ns2
• Simulation scenario
[Figure: dumbbell topology, bottleneck buffer of size B, flows starting ∆T apart]
– The link between the routers is the bottleneck (10 Mb/s); the others are 100 Mb/s
• Parameters
– buffer size B
– time between flow starts ∆T
Simulations: LEDBAT vs TCP
[Figure: cwnd [pkts] of TCP, LEDBAT and their total, and queue [pkts], vs time]
1. TCP and LEDBAT start together
2. As soon as q = 20 pkts -> delay = 25 ms -> LEDBAT decreases
3. TCP causes a loss and halves its rate
4. The queue empties, LEDBAT restarts
5. TCP is more aggressive, LEDBAT always yields
6. Cyclic losses
• LEDBAT is lower priority than TCP
• Total link utilization increases (w.r.t. TCP alone)
Simulations: LEDBAT vs LEDBAT
[Figure: cwnd [pkts] of two LEDBAT flows and their total, and queue [pkts], vs time]
1. The LEDBAT flows start together
2. The queue builds up
3. As soon as q = 20 pkts -> delay = 25 ms -> the linear controller settles
4. The queue keeps stable
5. No changes until some other flow arrives
• LEDBAT flows efficiently use the available bandwidth
• The bandwidth share is fair between LEDBAT flows
Fairness issues
• What if the second flow starts after the first one?
• Different view of the base delay -> different TARGET -> unfairness
[Figure: cwnd [pkts] vs time in three scenarios]
– ∆T = 5 s, B = 40: the 2nd flow starts before the queue builds up; same TARGET, but the latecomer gets a smaller share
– ∆T = 10 s, B = 40: the 2nd flow starts when the first has reached its TARGET, so TARGET_2 = 2 * TARGET_1; after a loss, fairness is restored, as the base delay estimate is corrected
– ∆T = 10 s, B = 100: same as before but with a larger buffer; the second gets all, the first is in starvation
Simulation wrap-up
• LEDBAT is a promising protocol
– efficient, but yields to TCP
– ... but affected by the latecomer advantage
– ... and tuned with magic numbers
• Further work (not in today's talk)
– Sensitivity analysis, i.e., how to select the magic numbers (LCN'10)
– LEDBAT modification to solve the latecomer advantage (Globecom'10)
– Overall swarm completion time performance (P2P'11)
– LEDBAT fluid model (submitted to IEEE Transactions on Networking)
– Experiments on real swarms (ongoing)
Pas objet du contrôle
Yes, we LEDBAT: Playing with the new BitTorrent congestion control algorithm
Dario Rossi
Joint work with Claudio Testa and Silvio Valenti
PAM'10, Zurich, April 2010
Agenda
• Active testbed measurements
– Single-flow LAN experiments
• Timeline of LEDBAT evolution
• Emulated capacity and delay
– ADSL experiments
• In the wild
• Interaction with cross-traffic
– Multi-flow LAN experiments
• Level of low priority?
• Conclusions + future work
LAN experiments (single-flow)
• Application setup
– uTorrent, the official BitTorrent client, closed source
• on native Windows (XP)
• on Linux with Wine
– Private torrent between the PCs
• Network setup
– Leecher and seed connected through a 10/100 switch and a Linux-based router with Netem, emulating network conditions on the forward/backward paths
– Capture traffic on both sides, analyze the traces
• Experiment setup
– Flavors comparison
– Delay impairment (forward path, backward path)
– Bandwidth limitation
LAN: LEDBAT evolution (1)
Flavor timeline and packet-size behavior, as in the previous deck:
– TCP, v5.2.2, Oct '08; α1, v1.9-13485, Dec '08; α2, v1.9-15380, Mar '09; β1, v1.9-16666, Aug '09; RC1, v2.0.1, Apr '10
[Figure: packet size [Bytes] vs time for TCP, α1, α2, β1]
• TCP transfers, full payload
• α1: small packet overkill!!
• α2: variable framing (not in draft)
• β1: finer bytewise cwnd control
• RC1? The evolution continues!
LAN: LEDBAT evolution (2)
• Throughput evolution
– Different LEDBAT versions, each curve in a separate experiment
– Also TCP BitTorrent (Linux + Windows)
[Figure: throughput [Mb/s] vs time for α1, α2, β1, TCP Linux, TCP WinXP]
• Observations
– α1 unstable, small packets (small packets recently broke β1 too)
– α2, β1: stable, smooth throughput (4 and 7 Mbps respectively)
• Window limits
– LEDBAT and TCP are influenced by the default maximum receive window:
• TCP WinXP = 17 KB
• TCP Linux = 108 KB
• LEDBAT α2 = 30 KBytes
• LEDBAT β1 = 45 KBytes
LAN: Constant delay
• LEDBAT is delay based
– it measures the one-way delay
– sensitive to forward delay
– independent from the backward delay
• Experiment setup
– constant delay for all packets
– delay = k·20 ms, k in [1,5]
– on the forward (or backward) path
[Figure: throughput [Mb/s] of α2 and β1, and delay profile [ms], vs time, for the forward and backward paths]
• Observations
– same behavior in both directions
– due to the upper bound of the maximum receiver window (α2=20, β1=30)
– not a constraint, since the bottleneck is shared among many flows
LAN: Variable delay
• LEDBAT is delay based
• Experiment setup
– random, uniformly distributed delay for each packet
– reordering is possible!
– range = 20 ± {0,5,10,20} ms
– on the forward (or backward) path
[Figure: throughput [Mb/s] of α2 and β1, and delay profile [ms], vs time, for the forward and backward paths]
• Observations
– α2 greatly affected by varying delay on the forward path
– β1 probably implements some mechanism to detect reordering
– Minor impact of varying delay on the backward path (ack delay only affects when decisions are taken)
LAN experiments (multiple flows)
• Application setup
– uTorrent β1 flavor only
– Private torrents
– Each seeder/leecher couple shares a different torrent
• Network setup
– Simple LAN environment: leechers and seeds connected through a 10/100 switch and a router
– No emulated conditions
• Experiment setup
– Varying ratios of TCP/LEDBAT flows
– Different TCP network stacks
• native Linux stack
• Windows settings emulated over Linux (to measure losses)
LAN: Multiple TCP and LEDBAT flows
[Figure: bandwidth breakdown, efficiency and fairness for X+Y competing TCP+LEDBAT flows (from 4+0 to 0+4), with TCP Windows vs TCP Linux stacks]
Observations
• Throughput breakdown between X+Y competing TCP+LEDBAT flows
• Efficiency is always high
• Fairness among flows of the same type (intra-protocol) is preserved
• TCP-friendliness depends on the TCP parameters
– TCP Linux is stronger than LEDBAT; LEDBAT is stronger than TCP WinXP
ADSL experiments
• Application setup
– uTorrent β1 flavor only
– native Windows (XP) only
– private torrent
• Network setup
– real ADSL modems, wild Internet
• uncontrolled delay, bandwidth
• interfering traffic
• Experiment setup
– LEDBAT alone in the wild
– LEDBAT with cross TCP traffic, on the forward or on the backward path
ADSL experiments
[Figure: β1 throughput [Mb/s] and RTT [s] vs time, in the scenarios: LEDBAT alone, TCP on the forward path, TCP on the backward path]
• LEDBAT alone
– Stable throughput, closely matching the nominal DSL capacity
• LEDBAT connection competing with TCP on the forward and backward paths
– LEDBAT actually yields to TCP on the forward path
– TCP on the backward path has a nasty influence on LEDBAT throughput:
• due to the shaky RTT (>1 s, due to the ADSL buffer!)
• affecting the time at which decisions are taken
Measurement wrap-up
• Main messages
– Behavior evolved:
• β1 addressed many issues (has RC1 fixed the others?)
• The draft tells only part of the story
– Efficiency goal:
• Met, provided there are enough flows on the bottleneck (or the window limit is increased)
• Compromised by interfering traffic on the unrelated backward path
– Low-priority goal:
• Reasonably met, but is the LEDBAT low-priority level tunable?
• Depends on the TCP settings as well: the meaning of low priority itself is fuzzy
• Future work
– Interaction with other BitTorrent mechanisms (e.g., peer selection, tit-for-tat, etc.)
– Impact on the users' QoE (i.e., torrent completion time?)
– More heterogeneous settings (e.g., larger-scale experiments, access technology, BitTorrent vs our implementation, etc.)
?? || //
2.8 Multimedia et VoIP en P2P: Skype
RES 224
Architecture des applications Internet
P2P: Skype
Based on joint work with the following colleagues:
dario.rossi
Dario Bonfiglio
Marco Mellia
Michela Meo
Nicolo’ Ritacca
Paolo Tofanelli
Dario Rossi
http://www.enst.fr/~drossi
RES224
Agenda
• What is Skype,
• What services it offers
• Very brief overview of how we think it works
• Details on
• Skype service traffic
• congestion control (bandwidth, loss, etc.)
• usage in real networks (call duration, etc.)
• Skype signaling traffic
• Normal period (volume, type and spatial properties)
• Breakdown event (overlay message storm)
• References
Why study Skype?
• Skype is very popular
– Successful VoIP service over the unreliable Internet
– More than 100M users, 5% of all VoIP traffic (in 2007)
– Easy to use, many free services
• voice / video / chat / data transfer over IP
• Understanding Skype is a challenging task
– Closed design, proprietary solutions
– Almost everything is encrypted or obfuscated
– Uses a P2P architecture
– Lots of different flavors
Skype for Dummies
• Architecture
– P2P design
– nodes may act as relays when peers are behind a NAT
• Service traffic
– Voice calls
– Video calls
– Chat
– Data transmission
– SkypeOut/SkypeIn
• Signaling traffic
– Login & auth.
– Look for buddies
– ….
Well hidden traffic
• Encryption
– Skype adopts the AES (Rijndael) cipher to encrypt information
– 256-bit long keys, i.e., 1.1×10^77 different keys
– 1536- or 2048-bit RSA is used to negotiate the AES symmetric key
– Public keys are authenticated by the login server
• Obfuscation
– The encrypted payload is also obfuscated using arithmetic compression
– Headers that are not encrypted are obfuscated using the RC4 algorithm
(State-of-the-art encryption + obfuscation mechanisms)
Available Skype Codecs
• E2E calls preferentially use
– SVOPC – starting ~2008
– iSAC – bit rate: 10-32 kbps (adaptive, variable; 30 ms frames)
– iLBC – bit rate: 13.3 kbps (30 ms frames), 15.2 kbps (20 ms frames)
• E2O calls preferentially use
– G729 – bit rate: 11.5 kbps (20 ms frames), 30.2 kbps
• Other codecs are available
– iPCM-wb, G711A, G711U, PCM-A, PCM-U, …
• Skype framing alters the original codec framing
– Redundancy may be added to mitigate packet loss
– The frame size may be changed to reduce the overhead impact
Skype Source Model
[Figure: protocol stack — Skype message over TCP/UDP over IP]
Types of Skype messages
• Depending on the transport protocol:
– Use TCP when forced to, with ciphered payload
• Login, lookup and data: everything is encrypted
– Use UDP whenever possible; the payload is encrypted
• But some headers MUST be exposed
• Otherwise, in case of a packet loss, the receiver could not properly align and decrypt the rest of the flow (as in SSL/TLS)
[Figure: TX/RX data, AES-encrypted, over an unreliable transport]
Skype service traffic
Agenda
• Investigate Skype service traffic
• Methodology
• Service traffic (voice & video)
– Service multiplexing
– Skype reactions to the transport layer
– Skype congestion control
– Skype loss recovery
• Users' behavior
– Call volume and duration
• Further details in the papers
Preliminary Definitions
• Useful information
– At installation, Skype chooses a port at random
– The port is never changed (unless forced by the user)
– All traffic is multiplexed over the same socket (UDP preferably)
• Skype peer
– A Skype peer can be identified by its L3/L4 endpoint (IP addr, UDP port)
– Consider only peers that were ever observed making a call
• Skype flow
– UDP is connectionless (no SYN/SYNACK sequence to detect the start of a flow, nor FIN at the end)
– Sequence of packets originated from a Skype peer (and destined to another Skype peer)
– A flow starts when the first packet is observed
– A flow ends when no packet is observed for a given inactivity timeout (200 s)
– Skype sends packets every ~120 s to keep NAT entries open
Methodology
• Service traffic
– Small-scale active testbed
– Measure Skype's response to controlled perturbations
– Control bandwidth, packet loss, transport protocol, etc.
• User behavior
– Passive measurement technique on real networks (LAN and ISP)
– Own classification framework (see refs)
– Inspect and quantify the unperturbed Skype traffic of real users
[Figure: campus network with 7000 hosts, 1700 peers, 300·10^3 external peers]
Service Traffic: Normal Condition
• Unperturbed conditions
– Two peers on the same switched LAN
– 100 Mbps, <1 ms delay
– Use different codecs
• Metrics
– Inter packet gap [ms] (time between two consecutive Skype messages)
– Message size [Bytes]
– Bitrate [kbps]
Service Traffic: Normal Condition
[Figure: bitrate [kbps] vs time for ISAC, iLBC, iPCM-WB, PCM, G729 — aggressive startup, smooth transient, then normal behavior]
Service Traffic: Normal Condition
[Figure: message payload [Bytes] vs time for ISAC, G729, iLBC, iPCM-WB, PCM]
Service Traffic: Normal Condition
[Figure: inter packet gap IPG [ms] vs time for ISAC, iLBC, iPCM-WB, PCM, E2O G729]
Recall…
Under unknown network conditions (e.g., at the beginning of a call), Skype aggressively sends voice blocks twice, to counter potential losses.
Service Traffic: Video Source
• Message size
– Typical of voice calls
– Larger video messages
• Inter packet gap
– Typical of voice calls
– Back-to-back video messages
• Variable bitrate
• Multiplexing
– Voice and video are multiplexed over the same UDP socket
[Figure: bitrate B [kbps], IPG [ms] and message size L [Bytes] vs time]
Service Traffic: TCP vs UDP
• Transport scenario
– Two peers on the same switched LAN
– Blocking UDP with a firewall
• Metrics
– Inter packet gap [ms] (time between two consecutive Skype messages)
– Message size [Bytes]
– Bitrate [kbps]
Service Traffic: TCP vs UDP
Under TCP:
• No initial transient
– Cautious with bandwidth: in case of a loss, TCP goes to slow-start (cwnd = 1 segment); the call would likely be stopped, so avoid slow-start!
• IPG unmodified
– TCP PSH, URG flags are used
• TCP recovers losses
– Old blocks are not multiplexed
[Figure: bitrate B [kbps], IPG [ms] and message size L [Bytes] vs time, UDP vs TCP]
Service Traffic: Network impact
• Network impact
– Two peers on the same switched LAN
– Artificially controlled:
• bottleneck capacity
• packet losses
• Metrics
– Inter packet gap [ms] (time between two consecutive Skype messages)
– Message size [Bytes]
– Bitrate [kbps]
Service Traffic: Congestion control
[Figure: average throughput [kbps] vs bandwidth limit, framing [ms], and Skype message size [Bytes], vs time]
Skype performs congestion control, adapting its coding rate and framing.
Service Traffic: Loss recovery
[Figure: inter-packet gap [ms] and payload [Bytes] vs time, under a controlled loss profile (loss %)]
Skype performs loss recovery... by multiplexing old and new voice blocks.
Service Traffic: User behavior
• Network scenario
– >1000-peer campus LAN (7000 hosts, 1700 peers, 300·10^3 external peers)
– Unperturbed environment
• Sniffing traffic at the edge router
• Metrics
– Volume of calls
– Duration of calls
– Geolocalization, location of Skype PSTN gateways, etc. (see refs.)
30/08/20
194
D. Rossi – RES224
User Behavior: Volume
[Figure: number of UDP E2E, UDP E2O and TCP E2E flows over a typical week (Mon–Sun)]
• Asymmetry in the number of calls per protocol type
• Explanation: one direction UDP, the other TCP
Skype call duration & arrival process
• Free calls are longer; arrivals are Poisson
[Figure: experimental quantiles of call duration [min] vs negative exponential quantiles, for paying $kypeout, free voice and free video calls]
Skype signaling traffic
Agenda / Methodology
• Passive observation on real networks
– Typical week — signaling: volume, traffic pattern, geolocalization
– Skype breakdown (Aug '07): message storm (flooding storm!)
[Figure: Skype classifier at the edge between internal and external peers — typical week: 1700 internal peers, 300,000 external peers; breakdown: 100 internal peers, 40,000,000 external peers, 2,500,000 flows, 33,000,000 packets]
How much signaling traffic?
• Typically, a limited amount of signaling bandwidth
• Yet, rather active signaling exchanges
[Figure: CDF of the average per-peer signaling bitrate [bps]; CDF of the number of peers contacted in 300 s]
Skype spatial pattern
[Figure: peer ID vs time; each dot is a packet, each line is a flow; the ID is incremented on each new peer; positive IDs = outgoing traffic, negative IDs = incoming traffic. Short flows correspond to Skype network probing (80% of flows, 5% of bytes); long-lived flows to calls, buddy updates, etc. (20% of flows, 95% of bytes)]
• Probes
– A single packet toward unknown peers; a reply possibly follows
– No further traffic is exchanged between the same peer pair
– Most of the flows (80%) are probes
• Dialogs
– More than a single packet sent toward the same peer; buddy status & overlay maintenance
– Persistent activity; carries most of the signaling bytes (95%)
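A minimal sketch of this probe/dialog split (hypothetical representation: a flow summarized by its packet count):

    def classify(flow_packet_count: int) -> str:
        # a probe is a single packet toward an unknown peer,
        # possibly followed by a single reply; anything longer is a dialog
        return "probe" if flow_packet_count <= 2 else "dialog"

    for n in (1, 2, 17):
        print(n, "->", classify(n))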
Skype peers location
[Figure: geographic location of the contacted peers, from the Paris and Turin vantage points]
• RTT distance
– Measured between the first pair of (request, reply) packets
• Two threads shape the Skype overlay
– Probes favor the discovery of nearby hosts
– Buddy signaling and calls are driven by the social network
[Figure: probability distribution function (pdf) of the round-trip time RTT [ms], for probes vs other traffic]
Flooding storm
• August 2007 (& 2010): massive Skype outage!
[Figure: incoming (-) and outgoing (+) LAN traffic — peers, flows, packets and bytes — before, during and after the storm, Thu 9th to Sun 25th August 2007]
Flooding storm
• Probing traffic increases considerably (during vs before/after, August 2007):
– 4x more flows
– 3x more packets
– 10x more external peers
– 2x fewer bytes
• In the LAN,
– Skype is the predominant portion of UDP traffic (70% of bytes, 94% of flows)
• The 10 most active LAN peers
– receive almost all Skype traffic (94% of bytes, 97% of flows)
• The single most active peer
– 50% of bytes, 75% of flows
– contacts 25% of all external peers seen
– namely 11 million, a 30x increase
Everybody is looking for super-peers; significant volumes of traffic are handled by some peers.
Summarizing this course on Skype
• Skype protocol and format
– Proprietary, complex, well hidden
– Not broken so far
• Service traffic
– Active testbed
– Application-layer multiplexing
– Transport-layer UDP/TCP usage
– Skype congestion and loss control
• Aggressive with losses
• Conservative with bottlenecks
• User behavior
– Call duration per service
– Call arrivals still Poisson
– Free services preferred
• Signaling traffic
– Passive measurement
– Two different threads shape the Skype overlay
• Probes: selection driven by proximity in the IP network (the closer the better)
• Dialogs: social-network driven (friends are everywhere)
– Signaling rate and spatial spread
• Large number of contacted peers
• Typically, very limited bitrate
• Huge broadcast storm in case of bugs/problems
References
• This course based on
– D. Bonfiglio, M. Mellia, M. Meo, D. Rossi and P. Tofanelli, Revealing
Skype Traffic: When Randomness Plays with You . ACM SIGCOMM,
Kyoto, Japan, August 2007.
– D. Bonfiglio, M. Mellia, M. Meo and D.Rossi, Detailed Analysis of Skype
Traffic . IEEE Transactions on Multimedia, 11(1):117-127, January
2009. (preliminary version appeared at Infocom08)
– D. Rossi, M. Mellia and M. Meo, Understanding Skype Signaling .
Elsevier Computer Networks, 53(2):130-140, February 2009.
– D. Rossi, M. Mellia and M. Meo, Evidences Behind Skype Outage. In IEEE International Conference on Communications (ICC'09), Dresden, Germany, June 2009.
• (all on my webpage)
References
• Optional readings
– P. Biondi, F. Desclaux, Silver Needle in the Skype. Black Hat Europe'06, Amsterdam, the Netherlands, Mar. 2006.
– S. A. Baset, H. Schulzrinne, An Analysis of the Skype Peer-to-Peer Internet Telephony Protocol. IEEE Infocom'06, Barcelona, Spain, Apr. 2006.
– S. Guha, N. Daswani and R. Jain, An Experimental Study of the
Skype Peer-to-Peer VoIP System, USENIX IPTPS, Santa Barbara, CA,
Feb. 2006
– K. Ta Chen, C. Y. Huang, P. Huang, C. L. Lei Quantifying Skype User
Satisfaction, ACM Sigcomm'06, Pisa, Italy, Sep. 2006.
?? || //
3 Travaux dirigés (TD)
3.1 Enoncé
[Le texte de l'énoncé du TD est illisible dans cette transcription: l'extraction du PDF en a corrompu les caractères.]
3.2 Correction
4 Travaux pratiques (TP)
4.1 Enoncé
5 Lectures obligatoires (LO)
The attached documents focus on SPDY, which the French-language version of Wikipedia defines as:
SPDY (pronounced "Speedy") is an experimental network protocol operating at the application layer, created to transport Web content. SPDY is a proposal, designed by Google, that aims to extend the capabilities of the HTTP protocol without replacing it.
The primary goal of SPDY is to reduce Web page download time by prioritizing and multiplexing the transfer of several files (those composing a Web page) so that only a single connection is required.
As is easy to see, the French and English Wikipedia pages often differ widely in volume and in the quality of their contributions and references. But even though the English Wikipedia proves a better source for technical articles, it remains only a starting point, which must be complemented with additional documents. Using external documents thus serves several purposes:
1. answering the efficiency questions that one may raise
2. showing a rigorous engineering approach to reaching these results
3. showing different types of sources (popularization articles as well as technically in-depth ones)
4. encouraging you to use English as a source of information
For reasons of space, we did not consider it useful to include the normative IETF documents, which are still evolving, all the more so as the situation is for now far from completely settled. Specifically, the mandatory readings (for which no reference is found from Wikipedia!) focus this year on:
1. the Google whitepaper on SPDY (Sec. 5.1), which very succinctly describes the protocol and its results
"SPDY: An experimental protocol for a faster web," Whitepaper available at http://dev.chromium.org/spdy/spdy-whitepaper
2. a popularization article (Sec. 5.2), useful for understanding the stakes behind SPDY and devoted to the protocol itself
Bryce Thomas, Raja Jurdak, and Ian Atkinson, "SPDYing up the web," Communications of the ACM, Vol. 55, No. 12, pp. 64-73, December 2012
3. a first technical article (Sec. 5.3) that digs into SPDY's results in a mobile context
Jeffrey Erman, Vijay Gopalakrishnan, Rittwik Jana, K. K. Ramakrishnan, "Towards a SPDY'ier mobile web?," Proceedings of the 9th ACM Conference on Emerging Networking Experiments and Technologies (CoNEXT'13), December 2013
4. a second technical article (Sec. 5.4) that digs into SPDY's results in an Internet context
Xiao Sophia Wang, Aruna Balasubramanian, Arvind Krishnamurthy, and David Wetherall, "How Speedy is SPDY?," Proceedings of the 11th USENIX Conference on Networked Systems Design and Implementation (NSDI'14), April 2014
Reading these documents should:
1. crystallize your view of the current limitations of HTTP/1.0 and HTTP/1.1
2. inform you on whether SPDY is effective (or not?) at solving these problems
The multiple-choice questions of the exam on these LO will explore these two aspects.
5.1 SPDY (Google whitepaper)
SPDY: An experimental protocol for a faster web
http://dev.chromium.org/spdy/spdy-whitepaper
Executive summary
As part of the "Let's make the web faster" initiative, we are experimenting with alternative protocols to help reduce the latency of web pages. One of these experiments is
SPDY (pronounced "SPeeDY"), an application-layer protocol for transporting content over the web, designed specifically for minimal latency. In addition to a specification of
the protocol, we have developed a SPDY-enabled Google Chrome browser and open-source web server. In lab tests, we have compared the performance of these applications
over HTTP and SPDY, and have observed up to 64% reductions in page load times in SPDY. We hope to engage the open source community to contribute ideas, feedback, code,
and test results, to make SPDY the next-generation application protocol for a faster web.
Background: web protocols and web latency
Today, HTTP and TCP are the protocols of the web. TCP is the generic, reliable transport protocol, providing guaranteed delivery, duplicate suppression, in-order delivery, flow
control, congestion avoidance and other transport features. HTTP is the application level protocol providing basic request/response semantics. While we believe that there may
be opportunities to improve latency at the transport layer, our initial investigations have focussed on the application layer, HTTP.
Unfortunately, HTTP was not particularly designed for latency. Furthermore, the web pages transmitted today are significantly different from web pages 10 years ago and
demand improvements to HTTP that could not have been anticipated when HTTP was developed. The following are some of the features of HTTP that inhibit optimal
performance:
Single request per connection. Because HTTP can only fetch one resource at a time (HTTP pipelining helps, but still enforces only a FIFO queue), a server delay of 500
ms prevents reuse of the TCP channel for additional requests. Browsers work around this problem by using multiple connections. Since 2008, most browsers have
finally moved from 2 connections per domain to 6.
Exclusively client-initiated requests. In HTTP, only the client can initiate a request. Even if the server knows the client needs a resource, it has no mechanism to inform
the client and must instead wait to receive a request for the resource from the client.
Uncompressed request and response headers. Request headers today vary in size from ~200 bytes to over 2KB. As applications use more cookies and user agents
expand features, typical header sizes of 700-800 bytes is common. For modems or ADSL connections, in which the uplink bandwidth is fairly low, this latency can be
significant. Reducing the data in headers could directly improve the serialization latency to send requests.
Redundant headers. In addition, several headers are repeatedly sent across requests on the same channel. However, headers such as the User-Agent, Host, and Accept*
are generally static and do not need to be resent.
Optional data compression. HTTP uses optional compression encodings for data. Content should always be sent in a compressed format.
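To make the single-request-per-connection and redundant-header shortcomings above concrete, here is a minimal Python sketch (not part of the original whitepaper; the host name is illustrative): two sequential requests on one HTTP/1.1 connection, where the second request cannot be sent before the first response is fully read, and the same uncompressed headers are resent verbatim each time.

# Two sequential HTTP/1.1 requests on a single connection: responses come
# back strictly FIFO, and every request repeats the same uncompressed headers.
import http.client

HEADERS = {
    "User-Agent": "Mozilla/5.0 (compatible; example)",
    "Accept-Encoding": "gzip, deflate",
    "Cookie": "session=0123456789abcdef",  # resent on every request
}

conn = http.client.HTTPConnection("example.com", 80)
for path in ("/", "/style.css"):
    conn.request("GET", path, headers=HEADERS)  # same header bytes each time
    resp = conn.getresponse()
    resp.read()  # must drain this response before the next request can go out
    print(path, resp.status, len(resp.headers.as_string()), "response header bytes")
conn.close()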
Previous approaches
SPDY is not the only research to make HTTP faster. There have been other proposed solutions to web latency, mostly at the level of the transport or session layer:
Stream Control Transmission Protocol (SCTP) -- a transport-layer protocol to replace TCP, which provides multiplexed streams and stream-aware congestion control.
HTTP over SCTP -- a proposal for running HTTP over SCTP. Comparison of HTTP Over SCTP and TCP in High Delay Networks describes a research study comparing the
performance over both transport protocols.
Structured Stream Transport (SST) -- a protocol which invents "structured streams": lightweight, independent streams to be carried over a common transport. It
replaces TCP or runs on top of UDP.
MUX and SMUX -- intermediate-layer protocols (in between the transport and application layers) that provide multiplexing of streams. They were proposed years ago at
the same time as HTTP/1.1.
These proposals offer solutions to some of the web's latency problems, but not all. The problems inherent in HTTP (compression, prioritization, etc.) should still be fixed,
regardless of the underlying transport protocol. In any case, in practical terms, changing the transport is very difficult to deploy. Instead, we believe that there is much
low-hanging fruit to be gotten by addressing the shortcomings at the application layer. Such an approach requires minimal changes to existing infrastructure, and (we
think) can yield significant performance gains.
Goals for SPDY
The SPDY project defines and implements an application-layer protocol for the web which greatly reduces latency. The high-level goals for SPDY are:
To target a 50% reduction in page load time. Our preliminary results have come close to this target (see below).
To minimize deployment complexity. SPDY uses TCP as the underlying transport layer, so requires no changes to existing networking infrastructure.
To avoid the need for any changes to content by website authors. The only changes required to support SPDY are in the client user agent and web server applications.
To bring together like-minded parties interested in exploring protocols as a way of solving the latency problem. We hope to develop this new protocol in partnership with
the open-source community and industry specialists.
Some specific technical goals are:
To allow many concurrent HTTP requests to run across a single TCP session.
To reduce the bandwidth currently used by HTTP by compressing headers and eliminating unnecessary headers.
To define a protocol that is easy to implement and server-efficient. We hope to reduce the complexity of HTTP by cutting down on edge cases and defining easily
parsed message formats.
To make SSL the underlying transport protocol, for better security and compatibility with existing network infrastructure. Although SSL does introduce a latency
penalty, we believe that the long-term future of the web depends on a secure network connection. In addition, the use of SSL is necessary to ensure that
communication across existing proxies is not broken.
To enable the server to initiate communications with the client and push data to the client whenever possible.
SPDY design and features
SPDY adds a session layer atop of SSL that allows for multiple concurrent, interleaved streams over a single TCP connection.
The usual HTTP GET and POST message formats remain the same; however, SPDY specifies a new framing format for encoding and transmitting the data over the wire.
Streams are bi-directional, i.e. can be initiated by the client and server.
SPDY aims to achieve lower latency through basic (always enabled) and advanced (optionally enabled) features.
Basic features
Multiplexed streams
SPDY allows for unlimited concurrent streams over a single TCP connection. Because requests are interleaved on a single channel, the efficiency of TCP is much
higher: fewer network connections need to be made, and fewer, but more densely packed, packets are issued.
Request prioritization
Although unlimited parallel streams solve the serialization problem, they introduce another one: if bandwidth on the channel is constrained, the client may block
requests for fear of clogging the channel. To overcome this problem, SPDY implements request priorities: the client can request as many items as it wants from the
server, and assign a priority to each request. This prevents the network channel from being congested with non-critical resources when a high priority request is
pending.
HTTP header compression
SPDY compresses request and response HTTP headers, resulting in fewer packets and fewer bytes transmitted.
Advanced features
In addition, SPDY provides an advanced feature, server-initiated streams. Server-initiated streams can be used to deliver content to the client without the client needing to ask
for it. This option is configurable by the web developer in two ways:
Server push.
SPDY experiments with an option for servers to push data to clients via the X-Associated-Content header. This header informs the client that the server is pushing a
resource to the client before the client has asked for it. For initial-page downloads (e.g. the first time a user visits a site), this can vastly enhance the user
experience.
Server hint.
Rather than automatically pushing resources to the client, the server uses the X-Subresources header to suggest to the client that it should ask for specific resources,
in cases where the server knows in advance of the client that those resources will be needed. However, the server will still wait for the client request before sending
the content. Over slow links, this option can reduce the time it takes for a client to discover it needs a resource by hundreds of milliseconds, and may be better for
non-initial page loads.
For technical details, see the SPDY draft protocol specification.
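As an illustration of the server-hint idea, a hypothetical response carrying the X-Subresources header might look as follows (the whitepaper names the header, but the exact value syntax shown here is an assumption for illustration):

HTTP/1.1 200 OK
Content-Type: text/html
X-Subresources: /style.css, /logo.png

On seeing such a hint, a SPDY-capable client can request /style.css and /logo.png immediately, instead of discovering them only after parsing the HTML.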
SPDY implementation: what we've built
This is what we have built:
A high-speed, in-memory server which can serve both HTTP and SPDY responses efficiently, over TCP and SSL. We will be releasing this code as open source in
the near future.
A modified Google Chrome client which can use HTTP or SPDY, over TCP and SSL. The source code is at http://src.chromium.org/viewvc/chrome/trunk/src/net
/spdy/. (Note that code currently uses the internal code name of "flip"; this will change in the near future.)
A testing and benchmarking infrastructure that verifies pages are replicated with high fidelity. In particular, we ensure that SPDY preserves origin server headers,
content encodings, URLs, etc. We will be releasing our testing tools, and instructions for reproducing our results, in the near future.
Preliminary results
With the prototype Google Chrome client and web server that we developed, we ran a number of lab tests to benchmark SPDY performance against that of HTTP.
We downloaded 25 of the "top 100" websites over simulated home network connections, with 1% packet loss. We ran the downloads 10 times for each site, and calculated
the average page load time for each site, and across all sites. The results show a speedup over HTTP of 27% - 60% in page load time over plain TCP (without SSL), and
39% - 55% over SSL.
Table 1: Average page load times for top 25 websites

                                             DSL (2 Mbps down, 375 kbps up)    Cable (4 Mbps down, 1 Mbps up)
                                             Average ms    Speedup             Average ms    Speedup
HTTP                                         3111.916      -                   2348.188      -
SPDY basic multi-domain* connection / TCP    2242.756      27.93%              1325.46       43.55%
SPDY basic single-domain* connection / TCP   1695.72       45.51%              933.836       60.23%
SPDY single-domain + server push / TCP       1671.28       46.29%              950.764       59.51%
SPDY single-domain + server hint / TCP       1608.928      48.30%              856.356       63.53%
SPDY basic single-domain / SSL               1899.744      38.95%              1099.444      53.18%
SPDY single-domain + client prefetch / SSL   1781.864      42.74%              1047.308      55.40%
* In many cases, SPDY can stream all requests over a single connection, regardless of the number of different domains from which requested resources originate. This
allows for full parallelization of all downloads. However, in some cases, it is not possible to collapse all domains into a single domain. In this case, SPDY must still open a
connection for each domain, incurring some initial RTT overhead for each new connection setup. We ran the tests in both modes: collapsing all domains into a single
domain (i.e. one TCP connection); and respecting the actual partitioning of the resources according to the original multiple domains (= one TCP connection per domain).
We include the results for both the strict "single-domain" and "multi-domain" tests; we expect real-world results to lie somewhere in the middle.
The role of header compression
Header compression resulted in an ~88% reduction in the size of request headers and an ~85% reduction in the size of response headers. On the lower-bandwidth DSL
link, in which the upload link is only 375 Kbps, request header compression in particular, led to significant page load time improvements for certain sites (i.e. those that
issued large number of resource requests). We found a reduction of 45 - 1142 ms in page load time simply due to header compression.
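A minimal Python sketch of the mechanism (not from the whitepaper; the preset dictionary below is a stand-in, not SPDY's real one): SPDY deflate-compresses header blocks, seeding the compressor with a dictionary of common header strings so that even the first request compresses well.

# zlib compression of a typical header block, with a preset dictionary of
# common header strings (SPDY uses the same idea with its own dictionary).
import zlib

headers = (b"GET /pub/WWW/picture.jpg HTTP/1.1\r\n"
           b"Host: www.w3.org\r\n"
           b"User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1)\r\n"
           b"Accept-Encoding: gzip, deflate, sdch\r\n\r\n")
preset = b"GET HTTP/1.1 Host: User-Agent: Accept-Encoding: gzip, deflate"

comp = zlib.compressobj(level=9, zdict=preset)
blob = comp.compress(headers) + comp.flush()
print(len(headers), "header bytes ->", len(blob), "compressed bytes")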
The role of packet loss and round-trip time (RTT)
We did a second test run to determine if packet loss rates and round-trip times (RTTs) had an effect on the results. For these tests, we measured only the cable link, but
simulated variances in packet loss and RTT.
We discovered that SPDY's latency savings increased proportionally with increases in packet loss rates, up to a 48% speedup at 2%. (The increases tapered off above the
2% loss rate, and completely disappeared above 2.5%. In the real world, packet loss rates are typically 1-2%, and RTTs average 50-100 ms in the U.S.) The reasons that
SPDY does better as packet loss rates increase are several:
SPDY sends ~40% fewer packets than HTTP, which means fewer packets affected by loss.
SPDY uses fewer TCP connections, which means fewer chances to lose the SYN packet. In many TCP implementations, this delay is disproportionately expensive (up
to 3 s).
SPDY's more efficient use of TCP usually triggers TCP's fast retransmit instead of using retransmit timers.
We discovered that SPDY's latency savings also increased proportionally with increases in RTTs, up to a 27% speedup at 200 ms. The reason that SPDY does better as
RTT goes up is because SPDY fetches all requests in parallel. If an HTTP client has 4 connections per domain, and 20 resources to fetch, it would take roughly 5 RTs to
fetch all 20 items. SPDY fetches all 20 resources in one RT.
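In worked form (numbers taken from the example above):

rounds_HTTP = ceil(resources / parallel connections) = ceil(20 / 4) = 5 round trips
rounds_SPDY = 1 round trip (all 20 requests issued back-to-back on one multiplexed connection)

so HTTP's page-load penalty grows with the RTT roughly five times faster than SPDY's in this example.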
Table 2: Average page load times for top 25 websites by packet loss rate

Packet loss rate    HTTP (avg ms)    SPDY basic / TCP (avg ms)    Speedup
0%                  1152             1016                         11.81%
0.5%                1638             1105                         32.54%
1%                  2060             1200                         41.75%
1.5%                2372             1394                         41.23%
2%                  2904             1537                         47.7%
2.5%                3028             1707                         43.63%
Table 3: Average page load times for top 25 websites by RTT

RTT in ms    HTTP (avg ms)    SPDY basic / TCP (avg ms)    Speedup
20           1240             1087                         12.34%
40           1571             1279                         18.59%
60           1909             1526                         20.06%
80           2268             1727                         23.85%
120          2927             2240                         23.47%
160          3650             2772                         24.05%
200          4498             3293                         26.79%
SPDY next steps: how you can help
Our initial results are promising, but we don't know how well they represent the real world. In addition, there are still areas in which SPDY could improve. In particular:
Bandwidth efficiency is still low. Although dialup bandwidth efficiency rate is close to 90%, for high-speed connections efficiency is only about ~32%.
SSL poses other latency and deployment challenges. Among these are: the additional RTTs for the SSL handshake; encryption; difficulty of caching for some proxies.
We need to do more SSL tuning.
Our packet loss results are not conclusive. Although much research on packet-loss has been done, we don't have enough data to build a realistic model for
packet loss on the Web. We need to gather this data to be able to provide more accurate packet loss simulations.
SPDY single connection loss recovery sometimes underperforms multiple connections. That is, opening multiple connections is still faster than losing a single
connection when the RTT is very high. We need to figure out when it is appropriate for the SPDY client to make a new connection or close an old connection and
what effect this may have on servers.
The server can implement more intelligence than we have built in so far. We need more research in the areas of server-initiated streams, obtaining client network
information for prefetching suggestions, and so on.
To help with these challenges, we encourage you to get involved:
Send feedback, comments, suggestions, ideas to the chromium-discuss discussion group.
Download, build, run, and test the Google Chrome client code.
Contribute improvements to the code base.
SPDY frequently asked questions
Q: Doesn't HTTP pipelining already solve the latency problem?
A: No. While pipelining does allow for multiple requests to be sent in parallel over a single TCP stream, it is still but a single stream. Any delays in the processing of anything
in the stream (either a long request at the head-of-line or packet loss) will delay the entire stream. Pipelining has proven difficult to deploy, and because of this remains
disabled by default in all of the major browsers.
Q: Is SPDY a replacement for HTTP?
A: No. SPDY replaces some parts of HTTP, but mostly augments it. At the highest level of the application layer, the request-response protocol remains the same. SPDY still uses
HTTP methods, headers, and other semantics. But SPDY overrides other parts of the protocol, such as connection management and data transfer formats.
Q: Why did you choose this name?
A: We wanted a name that captures speed. SPDY, pronounced "SPeeDY", captures this and also shows how compression can help improve speed.
Q: Should SPDY change the transport layer?
A: More research should be done to determine if an alternate transport could reduce latency. However, replacing the transport is a complicated endeavor, and if we can
overcome the inefficiencies of TCP and HTTP at the application layer, it is simpler to deploy.
Q: TCP has been time-tested to avoid congestion and network collapse. Will SPDY break the Internet?
A: No. SPDY runs atop TCP, and benefits from all of TCP's congestion control algorithms. Further, HTTP has already changed the way congestion control works on the
Internet. For example, HTTP clients today open up to 6 concurrent connections to a single server; at the same time, some HTTP servers have increased the initial congestion
window to 4 packets. Because TCP independently throttles each connection, servers are effectively sending up to 24 packets in an initial burst. The multiple connections
side-step TCP's slow-start. SPDY, by contrast, implements multiple streams over a single connection.
Q: What about SCTP?
A: SCTP is an interesting potential alternate transport, which offers multiple streams over a single connection. However, again, it requires changing the transport stack, which
will make it very difficult to deploy across existing home routers. Also, SCTP alone isn't the silver bullet; application-layer changes still need to be made to efficiently use the
channel between the server and client.
Q: What about BEEP?
A: While BEEP is an interesting protocol which offers a similar grab-bag of features, it doesn't focus on reducing the page load time. It is missing a few features that make this
possible. Additionally, it uses text-based framing for parts of the protocol instead of binary framing. This is wonderful for a protocol which strives to be as extensible as
possible, but offers some interesting security problems as it is more difficult to parse correctly.
5.2 SPDY (CACM’12)
SPDYing Up the Web
By Bryce Thomas, Raja Jurdak, and Ian Atkinson
Improved performance and a proven deployment strategy make SPDY a potential successor to HTTP.
Today's Web bears little resemblance to the Web of a decade ago. A Web page today encapsulates tens to hundreds of resources pulled from multiple domains. JavaScript is the technological staple of Web applications, not a tool for frivolous animation. Users access the Web from diverse device form factors, while browsers have improved dramatically. A constant throughout this evolution is the underlying application-layer protocol—HTTP—which provided fertile ground for Web growth and evolution but was designed at a time of far less page complexity. Moreover, HTTP is not optimal, with pages taking longer to load. Studies over the past five years suggest even 100 milliseconds
additional delay can have a quantifiably negative effect on Web use,9 spurring interest in improving Web performance. One such effort is SPDY, a
potential successor to HTTP championed by Google that requires both client and server changes, a formidable
hurdle to widespread adoption. However, early client support from major
browsers Chrome and Firefox suggests
SPDY is a protocol being taken seriously. Though server support for SPDY
is growing through projects like mod_
spdy12 truly widespread server adoption is likely to take time. In the interim, SPDY gateways are emerging as a
compelling transition strategy, promising to accelerate SPDY adoption by
functioning as a translator between
SPDY-enabled clients and non-SPDYenabled servers (see Figure 1). A variety
of incentives motivate organizations to
deploy and users to adopt SPDY gateways, as described here.
SPDY (pronounced SPeeDY) is an
experimental low-latency application-layer protocol27 designed by Google
and introduced in 2009 as a drop-in
replacement for HTTP on clients and
servers. SPDY retains the semantics
of HTTP, allowing content to remain
unchanged on servers while adding request multiplexing and prioritization,
header compression, and server push
of resources. Since 2009, SPDY has undergone metamorphosis from press release written and distributed by Google
to production protocol implemented
by some of the highest-profile players
on the Web.
Key insights:
• SPDY seeks to improve Web page load times by making fewer round trips to the server.
• Client-side browser support for SPDY has grown rapidly since 2009, and SPDY gateways offer a transition strategy that does not rely on server-side support.
• Open SPDY gateways are an opportunity for organizations to capitalize on the behavioral browsing data they produce.

doi: 10.1145/2380656.2380673

Figure 1. SPDY gateway translates between SPDY-capable clients and conventional HTTP servers; the gateway, situated on the high-speed Internet backbone, speaks SPDY toward the client and plain HTTP toward servers such as nytimes.com, facebook.com, and netflix.com.

Figure 2. Timeline of SPDY-related development milestones: Nov. 2009, SPDY announced as "early-stage research project"; Sept. 2010, SPDY implemented in Google Chrome browser; Feb. 2011, Google begins serving Search, Gmail, Maps, and other major services over SPDY to compatible clients; Feb. 2011, SPDY implemented in Google Android Honeycomb browser; Sept. 2011, Amazon announces Kindle Fire tablet using SPDY between client devices and Amazon servers to accelerate browsing; Jan. 2012, open-source iOS SPDY client library commissioned by SPDY author Mike Belshe; Mar. 2012, SPDY implemented in Mozilla Firefox browser.

Figure 2 is a timeline of SPDY milestones, first appearing publicly in a
November 2009 post6 on the Google
Chromium blog (http://blog.chromium.org) describing the protocol as
“an early-stage research project,” part
of Google’s effort to “make the Web
faster.” By September 2010, SPDY
had been implemented in the stable
version of the Google Chrome browser,14 and by February 2011, Google
quietly flipped the server-side switch
on SPDY, using the protocol to serve
major services (such as Gmail, Maps,
and search) to Chrome and other
SPDY-enabled clients.4 In February
2011, the Android Honeycomb mobile Web browser received client-side
SPDY support, also with little publicity.13 In September 2011, Amazon announced its Kindle Fire tablet, along
with the Silk Web browser that speaks
SPDY to Amazon cloud servers.3 In
January 2012, Mike Belshe, SPDY coauthor, commissioned development
of an open-source Apple iOS SPDY client library.20 In March 2012, Firefox
11 implemented the SPDY protocol,19
which, by June 2012, was enabled by
default in Firefox 13,22 bringing combined client-side support (Chrome +
Firefox) to approximately 50% of the
desktop browser market25 (see Figure
3). SPDY is currently an Internet Engineering Task Force (IETF) Internet
draft in its third revision.7
SPDY’s rapid client adoption is impressive, though server adoption lags
considerably. SPDY gateways offer a
promising transition strategy, bringing many of SPDY’s benefits to clients
without requiring server support.
The incentive for clients to use SPDY
gateways is simple: a faster and more
secure browsing experience; SPDY is
generally deployed over SSL for reasons discussed later. There are also
commercial incentives for companies
worldwide to deploy SPDY gateways
on the high-speed Internet. Content-delivery networks have begun offering SPDY gateway services to Web
site owners as a means of accelerating the performance (as experienced
by users) of their HTTP Web sites.1,26
Vendors of mobile devices might deploy SPDY gateways to accelerate the
notoriously slow high-latency mobile
browsing experience, a marketable
feature. Even more intriguing are
the incentives for large Web companies to deploy open (publicly available) SPDY gateways to collect and
mine rich information about users’
Web-browsing behavior, a lucrative
commodity in the business of online
advertising. Interestingly, the SPDY
gateway’s ability to aggregate certain
critical resources may provide benefits above and beyond regular SPDY,
as described later.
SPDY Protocol
SPDY is designed primarily to address
performance inhibitors inherent in
HTTP, including HTTP’s poor support
for pipelining and prioritization, the
inability to send compressed headers,
and lack of resource push capabilities.
SPDY’s hallmark features—request
multiplexing/prioritization,
header
compression, and server push—are described in the following sections:
Request multiplexing and prioritization. SPDY multiplexes requests and responses over a single TCP connection
in independent streams, with request
multiplexing inspired by HTTP pipelining while removing several limitations. HTTP pipelining allows multiple
HTTP requests to be sent over a TCP
connection without waiting for corresponding responses (see Figure 4).
Though pipelining has been specified
since the 1999 HTTP 1.1 RFC,11 Opera
is the only browser that both implements the feature and enables it by default. Other major browsers either do
not implement pipelining or disable
pipelining by default, as compatibility
with older Web proxies and servers is
problematic. Besides a lack of widespread adoption, HTTP pipelining also
suffers from head-of-line blocking, as
the specification mandates resources
be returned in the order they are requested, meaning a large resource, or
one associated with a time-consuming
back-end process, delays all other resources (see Figure 5).

Figure 4. HTTP pipelining allows multiple concurrent requests, reducing the number of round trips to the server.

Figure 5. Head-of-line blocking in HTTP pipelining; a large resource blocks subsequent smaller resources (left), and a slow-to-generate resource blocks subsequent resources (right).

SPDY implements pipelining without HTTP's head-of-line blocking limitation. Resources transferred through a SPDY connection are carried in annotated "streams," or independent sequences of bidirectional data divided into frames; annotated streams allow SPDY to not only return resources in any order but interleave resources over a single TCP connection7 (see Figure 6).

Figure 6. Request multiplexing and prioritization in SPDY; resources are sent in chunks over independent streams that can be interleaved and prioritized; the server can respond in any order and interleave higher-priority response frames before completing previous transfers.

SPDY also includes request prioritization, allowing the client to specify that certain resources be returned with a higher priority than others. Unlike many quality-of-service mechanisms that work on prioritizing packets in queues at lower layers in the network stack, SPDY prioritization works at the application layer, designed to allow the client to specify what is important. One use of prioritization is to request that resources that block progressive page rendering (such as cascading style sheets and JavaScript) be returned with higher priority. Another use of prioritization is to increase the priority of resources being downloaded for the currently visible browser tab while decreasing priority of resources belonging to a currently loading but hidden tab. A further implication of implementing priority at the application layer is that a server can, at least theoretically, prioritize not only the order in which resources are transmitted over the wire but also the order in which resources are generated on the back-end if the task is time intensive.
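The interleaving-plus-priority idea of Figure 6 can be captured in a few lines. The following Python sketch is illustrative only (it is not the real SPDY framing): fixed-size data frames from several streams share one connection, and the scheduler always drains the most urgent stream first.

# Illustrative frame scheduler: interleave chunks of several streams over
# one connection, always serving the highest-priority (lowest number) first.
import heapq

def interleave(streams, chunk=4):
    """streams: dict stream_id -> (priority, payload); lower value = more urgent."""
    heap = [(prio, sid, 0) for sid, (prio, _) in streams.items()]
    heapq.heapify(heap)
    while heap:
        prio, sid, off = heapq.heappop(heap)
        data = streams[sid][1]
        yield sid, data[off:off + chunk]            # one DATA frame on the wire
        if off + chunk < len(data):                 # stream not finished: requeue
            heapq.heappush(heap, (prio, sid, off + chunk))

# A high-priority stylesheet (stream 5) overtakes a bulky image (stream 1):
for sid, frame in interleave({1: (3, b"imageimageimage"), 5: (0, b"css-bytes")}):
    print("stream", sid, frame)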
Header compression. HTTP requests and responses all include a set of HTTP headers that provide additional information about the request or response (see Figure 7). There is significant redundancy in these headers across requests and responses; for example, the "User-Agent" header describing the user's browser (such as Mozilla/5.0 compatible, MSIE 9.0, Windows NT 6.1, WOW64, and Trident/5.0 for Internet Explorer 9) is sent to the server many times over. Likewise, cookie headers, which describe state information about the client, are repeated many times over across requests. Such redundancy means HTTP headers tend to compress relatively effectively. To further improve compression, SPDY seeds an out-of-band compression dictionary based on a priori knowledge of common HTTP headers.29

Server push. This advanced feature of SPDY allows the server to initiate resource transfer, rather than having to wait until a corresponding client request is received; Figure 8 outlines a server using it to push a resource to the client that would be requested imminently regardless. Server push saves superfluous round trips by relying on the server's knowledge of resource dependencies to determine what should be sent to the client.

Performance engineers have employed a range of techniques to try to achieve push-like behavior over HTTP, though each involves certain shortcomings; for example, data URIs allow inlining resources (such as images) into the main HTML but are not cacheable and increase resource size by approximately 33%. Another technique, Comet, opens a long-held connection to the server8 through which arbitrary resources may be pushed but requires an additional initial round trip. Bleeding-edge technologies (such as Web Sockets28 and resource prefetching15,21) also enable push-like behavior but, like Comet, require an additional round trip to establish such behavior. A universal limitation of current techniques is they break resource modularity by inserting JavaScript, links, or inlined content into the main HTML document. SPDY push does not require modification of content to support push functionality.

Figure 3. Global browser usage share (top five browsers, Feb. 2011 to Feb. 2012), as recorded by StatCounter (http://gs.statcounter.com/#browser-ww-monthly-201103-201202); in February 2012, Chrome had 29.84% and Firefox 24.88%.
SPDY Security
The SPDY protocol can be run over
either a secure (SSL encrypted) or insecure (non-encrypted) transport. In
practice, both philosophical views
on the role of Web encryption and
pragmatism in handling real-world
deployment constraints have led to
primarily SSL-encrypted implementations. Mike Belshe, SPDY co-author,7 and Patrick McManus, principal SPDY implementer for Firefox,
have expressed their interest in seeing the Web transition to a “secure
by default” model.5,18 Proponents of
encrypted SPDY say SSL is no longer
computationally expensive16 and its
security benefits outweigh its communication overhead.
The pragmatic reason for deploying SPDY over SSL (port 443) rather
than HTTP (port 80) is that transparent HTTP proxies between the client
and the server handle HTTP upgrades
unreliably.13 Transparent HTTP proxies do not modify encrypted traffic on
SSL port 443 (as they do on port 80)
and so should not interfere with newer
protocols like SPDY. The rationale for
choosing port 443 over an arbitrary
port number is that port 443 appears to
traverse firewalls more effectively13 (see
Figure 9).
SPDY relies on the next-protocol negotiation (NPN)17 SSL extension to upgrade the SSL connection to the SPDY
protocol. NPN is currently an Internet
Engineering Task Force Internet Draft
and implemented in OpenSSL23; NPN
also allows negotiation of other competing future protocols. SSL implementations that do not currently support NPN simply ignore the request to
upgrade the protocol, retaining backward compatibility.
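A minimal sketch of this negotiation pattern in Python (an assumption for illustration: the standard ssl module exposes ALPN, the successor of NPN, which follows the same advertise-and-select pattern; host and protocol strings are illustrative):

# Advertise protocol preferences during the TLS handshake; a server that
# ignores the extension simply leaves the negotiation empty and the client
# falls back, which is what gives NPN/ALPN its backward compatibility.
import socket, ssl

ctx = ssl.create_default_context()
ctx.set_alpn_protocols(["spdy/3", "http/1.1"])   # preference order

with socket.create_connection(("www.google.com", 443)) as raw:
    with ctx.wrap_socket(raw, server_hostname="www.google.com") as tls:
        print("negotiated:", tls.selected_alpn_protocol() or "none (fallback)")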
SPDY Performance
Only a handful of (publicly available)
studies quantitatively benchmark
SPDY against HTTP, all from the same
source—Google. Corroborating the
following results across different sites
represents important future work:
Google Chrome live A/B study. In
2011, Google benchmarked SPDY
against HTTP in real-life A/B tests
conducted through the Chrome
browser. From March 22 to April 5,
2011, Google configured “in the wild”
deployments of the Chrome 12 browser to randomly assign 95% of browser
instantiations to use SPDY for SPDYcapable sites; the other 5% of instantiations used HTTPS. Google researchers observed a 15.4% improvement in
page-load time across browser instantiations using SPDY,13 though a caveat
was that domain names in the study
were neither recorded nor weighted.
Google itself is thought by Google developers to be the current dominant
consumer of server-side SPDY technology so is likely overrepresented
in these results. Google sites were, in
2011, already heavily optimized, suggesting the stated improvement was
likely conservative, though further
data is needed for confirmation.
Google’s second result from the
study was that (encrypted) SPDY is
faster than (unencrypted) HTTP for
Google’s AJAX search; Google researchers provided no further detail.
Google lab tests set one. Google performed a series of laboratory benchmarks of SPDY vs. HTTP under various
conditions, though unencrypted SPDY,
which would be expected to be faster
than encrypted SPDY, was compared
against HTTP, despite SPDY deployments being predominantly encrypted.
For simulated downloads of the top
45 pages on the Web (as recorded by
Alexa), Google in 2011 reported a 51%
reduction in uploaded bytes, 4% reduction in downloaded bytes, and 19%
reduction in total packets vs. HTTP.13
Uploaded bytes were significantly reduced due to SPDY’s header compression and the fact that HTTP headers
are amenable to strong compression.
Google reported that download bytes
were only marginally reduced, as most
downloaded content in its tests was
not headers and in many cases already
compressed. The reduction in total
packets is due to both a reduction in
overall bytes and the fact that SPDY
uses only a single connection, resulting in more “full” packets.
Google lab tests set two. A 2009
Google SPDY white paper14 described
simulated page-load time of SPDY
vs. HTTP for the Alexa top 25 Web
sites. The first test simulated SPDY
vs. HTTP with 1% packet loss over
simulated home-network connections. Unencrypted SPDY exhibited
27%–60% improvement in page-load
time, and encrypted (SSL) SPDY exhibited 39%–55% improvement in
page-load time. A second test determined how packet-loss affected SPDY
(unencrypted) vs. HTTP; at 0% packet
loss, SPDY was 11% faster, and at 2%
packet loss SPDY was up to 47% faster.

Figure 7. HTTP request and response with example headers; header keys (such as "User-Agent") and header values (such as a particular user agent string) are repeated many times over on a typical connection so make good candidates for compression.

GET /pub/WWW/picture.jpg HTTP/1.1
Host: www.w3.org
…
Accept-Encoding: gzip, deflate, sdch
User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)

HTTP/1.1 200 OK
Date: Thu, 26 Jan 2012 04:16:13 GMT
…
Server: Apache/2.2.3 (Red Hat)
Content-Type: image/jpeg

Figure 8. HTTP is unable to push resources to the client, even when it knows the client will require them soon (left); SPDY server push allows the server to initiate the transfer of a resource it believes the client will need soon (right).
A third test simulated SPDY vs. HTTP
as a function of round-trip time; for
example, for a 20ms round trip, SPDY
was 12% faster than HTTP, and for
a 200ms round trip, SPDY was 26%
faster; for a full exposition of these results, see Google.14
Questions on SPDY performance.
There is a conspicuous lack of results
describing how SPDY performs on
mobile devices. SPDY’s dominant performance improvements are due in
theory to reduced round trips between
client and server. Many mobile-carrier
technologies today exhibit latency several times that of their fixed-line and
Wi-Fi counterparts. By some projections, the majority of the developing
world will have its first Internet experi-
ence through a mobile carrier, proliferating high-latency connections. In
theory, SPDY is ideally suited to these
high-latency mobile environments,
though real-world results are needed
for confirmation.
SPDY push-and-request prioritization is also underexplored. For push,
work is needed toward determining
how aggressive a server should preemptively push resources to the client.
For prioritization, no studies exist on
SPDY’s efficacy in tabbed browser environments where the currently visible
tab’s downloading resources could be
assigned higher priority.
SPDY Effect on TCP Connections
Though SPDY is an application-layer
protocol, it involves broader implica-
tions and interesting interactions with
the TCP transport layer:
TCP connection proliferation in HTTP.
Prior to standardization of HTTP/1.1 in
1999, HTTP/1.0 permitted download-
ing only a single resource over a TCP
connection and only four concurrent
TCP connections to any given server.
HTTP/1.1 introduced persistent connections, allowing connection reuse
for multiple resources.

Figure 9. Real-world success rates of upgrading to newer protocols over various port numbers, as measured by Google Chrome's WebSocket team: on an arbitrary new port (dest-port 61985) 86% of traffic works, as firewalls can drop traffic on arbitrary new ports; on the standard HTTP port (dest-port 80) 67% of traffic works, as transparent proxies often don't handle protocol upgrades correctly; on the standard SSL port (dest-port 443) 95% of traffic works, as SSL traverses transparent proxies unaltered and most firewalls permit port 443.

Figure 10. A SPDY gateway offers security between client and gateway, regardless of the security of the destination server; the leg from the SPDY gateway to an HTTP server (such as unencrypted.com, plaintext.org, or eavesdroppable.net) remains insecure.

HTTP/1.1 concomitantly reduced maximum concurrent TCP connections from four to two,
helping reduce server load and alleviate Internet congestion11 induced by
proliferation of short-lived TCP connections at the time. Unfortunately,
halving concurrent connections had
the adverse effect of reducing download parallelism. HTTP/1.1 envisaged
that the newly introduced HTTP pipelining would remedy the problem, but,
as described earlier, pipelining proved
difficult to implement and suffers from
head-of-line blocking, as in Figure 5.
Having only two concurrent connections creates a serious performance
bottleneck for modern high-speed
Internet connections and complex
Web sites. First, TCP slow-start, slowly
ramping up usable-connection bandwidth based on number of successfully received packets, is often overly
conservative in its initial bandwidth
allocation. Several round trips are
needed before the connection is saturated, by which time much of the content may have been downloaded already (at a slower-than-necessary rate).
Second, a typical modern Web page
encapsulates 10s or 100s of resources,
only two of which may be requested at
any given time. Without HTTP pipelining, requests cannot be queued on the
server, so each new request incurs an
additional round trip. Because most
Web resources are small, the roundtrip time to the server often dominates
over the time to receive the resource
from first to last byte.
Modern browsers break from the
HTTP/1.1 standard by allowing six or
more concurrent TCP connections
to a server. This allocation largely circumvents both previously outlined
problems—effective initial bandwidth
becoming TCP slow-start constant * 6
(rather than * 2) and fewer round trips
incurred due to higher request concurrency. A common practice among large
Web properties (such as Facebook,
Google, and Twitter) is to “shard”24 a
Web page’s resources across multiple
domains (such as img1.facebook,
img2.facebook, img3.facebook, and
img4.facebook) to subvert browser policy and obtain greater concurrency. In
a modern browser, a Web page sharded
across four domains can receive 4 * 6 =
24 concurrent TCP connections.
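A minimal sketch of such sharding (domain names and hash choice are illustrative, not any particular site's scheme): each resource path is deterministically mapped to one of four host aliases, so repeated page loads keep resources cacheable under a stable URL while the browser opens up to 4 * 6 = 24 connections.

# Deterministic domain sharding: hash each path onto one of four aliases.
import hashlib

SHARDS = ["img1.example.com", "img2.example.com",
          "img3.example.com", "img4.example.com"]

def shard_url(path):
    digest = int(hashlib.md5(path.encode()).hexdigest(), 16)
    return "http://%s%s" % (SHARDS[digest % len(SHARDS)], path)

print(shard_url("/photos/cat.jpg"))   # always the same shard for this path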
TCP connection proliferation. Increasing concurrent connections
through browser policy and sharding
can improve page-load time but create
other problems. Though highly concurrent connections circumvent an overly
conservative slow start on a single TCP
connection, they may (in aggregate) exceed total available bandwidth, inducing packet loss. Moreover, the likelihood of losing a critical control packet
increases with the number of concurrent connections; for example, the TCP
SYN packet, which initiates a TCP connection, has a retransmission timeout
measured on the order of seconds if no
acknowledgment is received.
Highly concurrent TCP connections also decrease the likelihood of
fast retransmit being invoked under
packet loss. Fast retransmit is a TCP
enhancement that immediately resends a packet without waiting for
a fixed timeout delay if acknowledgments for several packets subsequent
to the lost packet are received. Highly concurrent connections obtain
less bandwidth individually than a
single long-lived connection and are
therefore less likely to receive and acknowledge enough packets in a short
enough duration to trigger fast re-
transmit. There is also less “body” in
each short-lived connection, increasing the likelihood that any packet loss
would occur near the end of a connection where too few acknowledgments
exist to trigger fast retransmit.
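A minimal sketch of the fast-retransmit trigger just described (RFC 5681 style, using the usual three-duplicate-ACK threshold; SACK and window dynamics are ignored):

    DUP_ACK_THRESHOLD = 3   # duplicate ACKs needed to trigger fast retransmit

    def process_acks(acks):
        """Yield sequence numbers that fast retransmit would resend."""
        last_ack, dup_count = None, 0
        for ack in acks:
            if ack == last_ack:
                dup_count += 1
                if dup_count == DUP_ACK_THRESHOLD:
                    yield ack          # resend the segment starting at `ack`
            else:
                last_ack, dup_count = ack, 0

    # A segment at 3000 is lost; the receiver keeps acking 3000 as later
    # data arrives, producing duplicate ACKs.
    print(list(process_acks([1000, 2000, 3000, 3000, 3000, 3000])))  # [3000]

A short-lived connection may simply never accumulate three duplicate ACKs before it ends, leaving the retransmission timeout as the only recovery path.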
Finally, highly concurrent TCP connections create more connection states
to be maintained at various points in
the network, including at networkaddress-translation boxes, as well as
state binding processes to TCP port
numbers. In some instances, this state
can even cause poorly implemented
hardware and software to fail or misidentify the highly concurrent connection openings as a SYN flood (a type of
denial-of-service attack) [10].
SPDY elephant vs. HTTP mice. The
highly concurrent short-lived TCP flows
of modern HTTP fall into the category
of connections colloquially known as
“mice” flows. In contrast, SPDY is able
to use a single long-lived “elephant”
flow, as it can multiplex and prioritize
all requests over a single connection.
SPDY therefore retains the benefits of
highly concurrent HTTP connections,
without detrimental side effects.
A short-term disadvantage of SPDY’s single-connection approach is
inequitable TCP “backoff” compared
to competing applications still using
multiple TCP connections; for example, a backoff algorithm that responds
to packet loss by reducing the available
bandwidth of a connection by 50% will
likewise halve the total bandwidth
available to an application using a single SPDY TCP connection. The same
backoff algorithm applied to an application using 12 concurrent TCP connections would reduce the total bandwidth available to the application by
only about 4% (1/24), since just one of the 12 connections halves its rate. Connection proliferation should not be encouraged over the long term, though a
short-term mitigation strategy would
involve using a small number of concurrent SPDY connections. Long-term
research may look to address this issue
through smarter backoff algorithms
providing equitable treatment to applications, independent of the number of TCP connections.
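The 50%-versus-4% comparison can be reproduced with a one-line model (a sketch assuming all connections share the bandwidth equally and exactly one suffers a 50% multiplicative decrease):

    # Aggregate bandwidth reduction when one of n equal-share TCP
    # connections halves its rate after a loss event.
    def aggregate_reduction(n_conns: int) -> float:
        return 0.5 * (1.0 / n_conns)

    print(f"{aggregate_reduction(1):.1%}")   # single SPDY connection -> 50.0%
    print(f"{aggregate_reduction(12):.1%}")  # 12 HTTP connections    -> 4.2%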
Transitioning to SPDY
SPDY has been implemented in several
popular client browsers, most notably
Chrome and Firefox. Though server
support for SPDY continues to grow,
it has yet to reach the maturity and
adoption of client implementations.
Figure 11. A client can delegate DNS lookup to a SPDY gateway, helping minimize round trips. Without a gateway, the client first asks the DNS name server to translate example.com (to 192.0.43.10) and only then sends GET example.com/home to the server; with a SPDY gateway, the client sends the request without translating the domain name, and the gateway resolves example.com and forwards the request on the client's behalf.

SPDY gateways are one way to accelerate SPDY adoption, providing many
SPDY performance and security advantages without requiring SPDY support
on the server. A SPDY gateway is an
explicit proxy that translates between
SPDY-enabled clients and HTTP-only
servers. By situating such a gateway on
the high-speed Internet, SPDY is used
over the slow “last mile” link between
the client and the Internet core. The
HTTP portion of the connection is in
turn isolated to the very-low-latency,
very-high-bandwidth link between the
gateway and the server, largely mitigating HTTP’s dominant performance
inhibitors. In addition to providing a
practical, viable SPDY transition solution, SPDY gateways also offer several
performance-enhancing features:
Secure connection to gateway, regardless of server-side SSL support. Because
SPDY operates over SSL, the client-to-gateway connection is secure, regardless of whether SSL is supported on
the destination server (see Figure 10).
Though the gateway-to-server connection could remain insecure, clients are
protected from common attacks (such
as eavesdropping on insecure Wi-Fi access points).
Single client-side connection across
all domains. As described earlier,
SPDY request multiplexing results in
dramatically fewer TCP connections
than HTTP browsers in use today.
However, clients still require at least
one new TCP connection for each new
server they contact. A SPDY gateway
can achieve even greater efficiency
than regular SPDY by multiplexing all
of a client’s requests to the gateway
over a single TCP connection covering
all servers.
A SPDY gateway might still create
multiple connections to a given HTTP
server to emulate pipelining and avoid
head-of-line blocking but isolate these
connections to the high-speed/low-latency Internet core. A SPDY gateway
may also retain a small pool of TCP
connections to popular servers, allowing new client requests to be forwarded
immediately without incurring a new
TCP connection handshake or slow-start "warm-up." Likewise, the client
needs to perform a single TCP connection handshake only with the gateway
and go through the slow-start warm-up
only once (as opposed to every time a
new server is contacted).
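A keep-warm pool of the kind described might look like the following sketch (hypothetical helper class, not code from any real gateway; TLS, eviction, and concurrency control are omitted):

    import socket
    from collections import defaultdict

    class ConnectionPool:
        """Keep idle connections to popular origin servers warm for reuse."""
        def __init__(self):
            self._idle = defaultdict(list)   # (host, port) -> [sockets]

        def get(self, host: str, port: int = 80) -> socket.socket:
            # Reuse a warmed-up connection if one is available...
            if self._idle[(host, port)]:
                return self._idle[(host, port)].pop()
            # ...else pay the handshake (and slow-start) cost of a new one.
            return socket.create_connection((host, port))

        def put(self, host: str, port: int, conn: socket.socket) -> None:
            self._idle[(host, port)].append(conn)   # park it for next request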
Delegated DNS lookup. This performance enhancement specific to SPDY
gateways entails the gateway performing DNS translations from domain
names to server IP addresses on behalf
of the client, allowing the client to immediately send a request for a resource
to the gateway without knowing the
IP address of the server on which it
is hosted (see Figure 11). Being situated on the high-speed Internet, the
gateway is better positioned to quickly
translate the domain name to an IP address; moreover, a gateway that serves
a large number of users can cache the
IP addresses of popular domains.
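A minimal sketch of gateway-side DNS delegation with a shared cache (hypothetical helper functions; a real gateway would honor DNS TTLs and handle failures):

    import socket

    _dns_cache: dict[str, str] = {}   # popular domains stay hot across users

    def resolve(host: str) -> str:
        """Translate a name on the gateway, amortized over many clients."""
        if host not in _dns_cache:
            _dns_cache[host] = socket.gethostbyname(host)
        return _dns_cache[host]

    def forward_request(host: str, path: str) -> socket.socket:
        """Forward a client request that arrived with an unresolved name."""
        conn = socket.create_connection((resolve(host), 80))
        conn.sendall(f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n".encode())
        return conn

    # Example usage: conn = forward_request("example.com", "/home")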
Intelligent push. A SPDY gateway
can exploit its large user base to infer
resource dependencies, even across
domains. A regular SPDY-capable
server has a limited view of a user’s
browsing behavior, isolated to the
server itself. A gateway sees a user’s
requests for all servers so it can infer
complex patterns of cross-domain
navigation; for example, the gateway
could determine that 95% of users issuing a Google search query for “Twitter” proceed to twitter.com, and, given
this knowledge, the gateway then preemptively pushes resources from the
twitter.com homepage to the user.
In 2011, Amazon reported the Silk
browser on the Kindle Fire tablet already performed intelligent push of
this nature, called by Amazon "predictive rendering" [2].
Caching. Like a transparent proxy,
a SPDY gateway can cache resources
such that subsequent requests for the
same resource are served without contacting the origin server.
SPDY gateways, a permanent fixture?
This description of SPDY gateways
highlights that in some respects gateways offer more attractive features than
SPDY directly between clients and servers, including four notable functions:
further reduction in TCP connections
over the last mile; pre-warming of TCP
connections; delegation of DNS translations to the fast Internet core; and
intelligent push and resource caching.
We suggest that gateways may have
a persistent role on the Web, beyond
mere transition strategy.
Future SPDY Gateways
Several companies have deployed
large SPDY gateways. Perhaps most
notable is the gateway used by the default Silk browser on the Amazon Kindle
Fire tablet [2]; Silk proxies much of a
user’s Web traffic through an Amazon
SPDY gateway deployed on the Amazon Web Services cloud infrastructure.
Other examples are content-deliverynetwork/Web-acceleration providers
Contendo [1] and Strangeloop [26], both
offering SPDY gateways as a service to
HTTP content providers.
Device-specific SPDY gateways. Amazon’s decision to couple the Kindle
Fire Silk browser to its own proprietary SPDY-based gateway begs the
question: Could, and will, other major
providers do the same? Could there
be, say, an Apple SPDY gateway for
iPhones and iPads or a Google SPDY
gateway for Android devices in the future? Could such gateways be in the
works already? The potential performance advantage of SPDY gateways
is particularly intriguing on such resource-constrained mobile devices.
The controlled “appliancized” nature
of the devices and their operating
systems would also simplify vendor
implementation. Aside from offering faster browsing as a selling point,
Amazon and other potential vendors
are likely interested in the data mining and advertising opportunities that
come with controlling the gateway.
Open SPDY gateways. Beyond device-specific gateways lies uncharted
though potentially lucrative territory—open SPDY gateways—that, like
an open proxy, are usable by anyone,
independent of device or platform.
Major Web companies have demonstrated that free and universal services
can be made profitable through related targeted advertising opportunities.
So, could SPDY gateways be turned
into another free, universal service
rendered profitable through bettertargeted advertising?
A limitation Web advertisers face
today is a restricted view of user activity on domains beyond their direct control. A SPDY gateway provides a vantage point from which to observe all
of a user’s Web activity, not just on domains under the advertiser’s control.
Major Web companies like Facebook
and Google track users across the Web
on third-party sites through partner
advertising scripts and other embeddable features (such as the “Like” but-
ton), but the picture is incomplete.
An open SPDY gateway would provide
advertisers missing pieces from the
browsing-behavior puzzle that could
be fed back into targeted-advertising
algorithms. While much the same
could be done using device-specific
SPDY gateways, an open SPDY gateway would provide insight into a much
larger user population. Interesting to
consider therefore is whether SPDY
gateways (much like search) could
become a universal service accessible
through a broad range of devices.
Conclusion
SPDY is a high-performance application-layer protocol and potential
successor to HTTP. Clients have been
quick to adopt it, though server implementations lag. SPDY gateways are
helping accelerate SPDY adoption by
removing the need for SPDY support
on the server. A range of compelling
incentives exists for deploying SPDY
gateways that are only beginning to
be explored. Beyond just a transition
strategy, SPDY gateways have performance characteristics that make
them attractive for longer-term use.
Whether such long-term advantages
compared to SPDY support on the
server are substantial enough to warrant retaining SPDY gateways is an
open question.
Acknowledgments
This work is supported in part by an
Australian Government Australian
Postgraduate Awards scholarship and
Commonwealth Scientific and Industrial Research Organisation Office of
the Chief Executive scholarship. The
authors would also like to thank the
anonymous reviewers for their valuable comments and suggestions.

References
1. Akamai. Akamai Acquires Contendo. Press Release, Mar. 2012; http://www.akamai.com/cotendo
2. Amazon. Amazon Silk FAQs; http://www.amazon.com/gp/help/customer/display.html/?nodeId=200775440
3. Amazon. Introducing Amazon Silk; http://amazonsilk.wordpress.com/2011/09/28/introducing-amazon-silk
4. Belshe, M. SPDY on Google servers? Jan. 2011; https://groups.google.com/forum/?fromgroups#!searchin/spdy-dev/SPDY$20on$20Google$20servers?$20/spdy-dev/TCOW7Lw2scQ/INuev2A-ixAJ
5. Belshe, M. SSL: It's a matter of life and death. Mike's Lookout blog, May 28, 2011; http://www.belshe.com/2011/05/28/ssl-its-a-matter-of-life-and-death/
6. Belshe, M. and Peon, R. A 2x faster Web. The Chromium Blog, Nov. 11, 2009; http://blog.chromium.org/2009/11/2x-faster-web.html
7. Belshe, M. and Peon, R. SPDY Protocol. Chromium Projects, Feb. 2012; http://dev.chromium.org/spdy/spdy-protocol/spdy-protocol-draft3
8. Bozdag, E., Mesbah, A., and van Duersen, A. A comparison of push and pull techniques for AJAX in Web site evolution. In Proceedings of the Ninth IEEE International Workshop (Paris, Oct. 5-6). IEEE Computer Society, Washington, D.C., 2007, 15-22.
9. Brutlag, J. Speed Matters for Google Web Search. Technical Report, 2009; http://services.google.com/fh/files/blogs/google_delayexp.pdf
10. Eddy, W. TCP SYN Flooding Attacks and Common Mitigations. Internet Engineering Task Force, Aug. 2007; http://tools.ietf.org/html/rfc4987
11. Fielding, R. et al. Hypertext Transfer Protocol—HTTP/1.1: Connections. World Wide Web Consortium, June 1999; http://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html
12. Google Inc. mod-spdy: Apache SPDY module. May 2012; http://code.google.com/p/mod-spdy/
13. Google Inc. SPDY essentials. Dec. 2011; http://www.youtube.com/watch?feature=player_detailpage&v=TNBkxA313kk#t=2179s
14. Google Inc. The Chromium Projects. SPDY: An Experimental Protocol for a Faster Web. White Paper, 2009; http://www.chromium.org/spdy/spdy-whitepaper
15. Komoroske, A. Prerendering in Chrome. The Chromium Blog, June 2011; http://blog.chromium.org/2011/06/prerendering-in-chrome.html
16. Langley, A. Overclocking SSL. Imperial Violet Blog, June 25, 2010; http://www.imperialviolet.org/2010/06/25/overclocking-ssl.html
17. Langley, A. Transport Layer Security Next Protocol Negotiation Extension. Internet Engineering Task Force, Mar. 30, 2011; http://tools.ietf.org/html/draft-agl-tls-nextprotoneg-02
18. McManus, P. Maturing Web transport protocols with SPDY and friends. Video of SPDY Talk at Codebits.eu, Nov. 2011; http://bitsup.blogspot.com.au/2011/11/video-of-spdy-talk-at-codebitseu.html
19. McManus, P. SPDY brings responsive and scalable transport to Firefox 11. Mozilla Hacks blog, Feb. 2012; http://hacks.mozilla.org/2012/02/spdy-brings-responsive-and-scalable-transport-to-firefox-11/
20. Morrison, J. SPDY for iPhone. GitHub, Inc., Jan. 2012; https://github.com/sorced-jim/SPDY-for-iPhone
21. Mozilla Developer Network. Link prefetching FAQ. Mar. 2003; https://developer.mozilla.org/en/Link_prefetching_FAQ
22. Nyman, R. Firefox Aurora 13 is out—SPDY on by default and a list of other improvements. Mar. 19, 2012; http://hacks.mozilla.org/2012/03/firefox-aurora-13-is-out-spdy-on-by-default-and-a-list-of-other-improvements/
23. OpenSSL. OpenSSL Cryptography and SSL/TLS Toolkit. Mar. 2012; http://www.openssl.org/news/changelog.html
24. Souders, S. Sharding dominant domains. Steve Souders blog, May 12, 2009; http://www.stevesouders.com/blog/2009/05/12/sharding-dominant-domains/
25. StatCounter. StatCounter GlobalStats, Feb. 2012; http://gs.statcounter.com/#browser-ww-monthly-201102-201202
26. Strangeloop Networks. Strangeloop. Mar. 2012; http://www.strangeloopnetworks.com/products/overview/
27. The Chromium Projects. SPDY, Mar. 2012; http://dev.chromium.org/spdy
28. World Wide Web Consortium. The WebSocket API: Editor's Draft 29 August 2012; http://dev.w3.org/html5/websockets/
29. Yang, F., Amer, P., Leighton, J., and Belshe, M. A Methodology to Derive SPDY's Initial Dictionary for Zlib Compression. University of Delaware, Newark, DE, 2012; http://www.eecis.udel.edu/~amer/PEL/poc/pdf/SPDY-Fan.pdf
Bryce Thomas ([email protected]) is a Ph.D.
candidate in the Discipline of Information Technology at
James Cook University, Townsville, Queensland, Australia.
Raja Jurdak ([email protected]) is a researcher in
the Commonwealth Scientific and Industrial Research
Organisation and a professor in the University of
Queensland, Brisbane, Australia.
Ian Atkinson ([email protected]) is a professor
and director of the eResearch Centre of James Cook
University, Townsville, Queensland, Australia.
© 2012 ACM 0001-0782/12/12
5.3 SPDY (CoNEXT’13)
Towards a SPDY’ier Mobile Web?
Jeffrey Erman, Vijay Gopalakrishnan, Rittwik Jana, K.K. Ramakrishnan
AT&T Labs – Research
One AT&T Way, Bedminster, NJ, 07921
{erman,gvijay,rjana,kkrama}@research.att.com
ABSTRACT

Despite its widespread adoption and popularity, the Hypertext Transfer Protocol (HTTP) suffers from fundamental performance limitations. SPDY, a recently proposed alternative to HTTP, tries to address many of the limitations of HTTP (e.g., multiple connections, setup latency). With cellular networks fast becoming the communication channel of choice, we perform a detailed measurement study to understand the benefits of using SPDY over cellular networks. Through careful measurements conducted over four months, we provide a detailed analysis of the performance of HTTP and SPDY, how they interact with the various layers, and their implications on web design. Our results show that unlike in wired and 802.11 networks, SPDY does not clearly outperform HTTP over cellular networks. We identify, as the underlying cause, a lack of harmony between how TCP and cellular networks interact. In particular, the performance of most TCP implementations is impacted by their implicit assumption that the network round-trip latency does not change after an idle period, which is typically not the case in cellular networks. This causes spurious retransmissions and degraded throughput for both HTTP and SPDY. We conclude that a viable solution has to account for these unique cross-layer dependencies to achieve improved performance over cellular networks.

Categories and Subject Descriptors

C.2.2 [Computer-Communication Networks]: Network Protocols—Applications; C.4 [Performance of Systems]: Measurement techniques

Keywords

SPDY, Cellular Networks, Mobile Web

1. INTRODUCTION

As the speed and availability of cellular networks grows, they are rapidly becoming the access network of choice. Despite the plethora of 'apps', web access remains one of the most important uses of the mobile internet. It is therefore critical that the performance of the cellular data network be tuned optimally for mobile web access.

The Hypertext Transfer Protocol (HTTP) is the key building block of the web. Its simplicity and widespread support have catapulted it into being adopted as the nearly 'universal' application protocol, such that it is being considered the narrow waist of the future internet [11]. Yet, despite its success, HTTP suffers from fundamental limitations, many of which arise from the use of TCP as its transport layer protocol. It is well established that TCP works best if a session is long lived and/or exchanges a lot of data. This is because TCP gradually ramps up the load and takes time to adjust to the available network capacity. Since HTTP connections are typically short and exchange small objects, TCP does not have sufficient time to utilize the full network capacity. This is particularly exacerbated in cellular networks, where high latencies (hundreds of milliseconds are not unheard of [18]) and packet loss in the radio access network are common. These are widely known to be factors that impair TCP's performance.

SPDY [7] is a recently proposed protocol aimed at addressing many of the inefficiencies of HTTP. SPDY uses fewer TCP connections by opening one connection per domain. Multiple data streams are multiplexed over this single TCP connection for efficiency. SPDY supports multiple outstanding requests from the client over a single connection. SPDY servers transfer higher-priority resources faster than lower-priority resources. Finally, by using header compression, SPDY reduces the amount of redundant header information sent each time a new page is requested. Experiments show that SPDY reduces page load time by as much as 64% on wired networks, with an estimated improvement of as much as 23% on cellular networks (based on an emulation using Dummynet) [7].

In this paper, we perform a detailed and systematic measurement study on real-world production cellular networks to understand the benefits of using SPDY. Since most websites do not support SPDY – only about 0.9% of all websites use SPDY [15] – we deployed a SPDY proxy that functions as an intermediary between the mobile devices and web servers. We ran detailed field measurements using 20 popular web pages. These were performed across a four month span to account for the variability in the production cellular network. Each of the measurements was instrumented and set up to account for and minimize factors that can bias the results (e.g., cellular handoffs).
Our main observation from the experiments is that, unlike
in wired and 802.11 WiFi networks, SPDY does not outperform HTTP. Most importantly, we see that the interaction
between TCP and the cellular network has the most impact
on performance. We uncover a fundamental flaw in TCP
implementations where they do not account for the high
variability in the latency when the radio transitions from
idle to active. Such latency variability is common in cellular
networks due to the use of a radio resource state machine.
The TCP Round-Trip Time (RTT) estimate and thus the
time out value is incorrect (significantly under-estimated)
after an idle period, triggering spurious retransmissions and
thus lower throughput.
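The mismatch can be made concrete with the standard RTO estimator of RFC 6298 (a sketch using the RFC's constants and its 1-second lower bound; the RTT samples are assumed values):

    ALPHA, BETA, K, G = 1/8, 1/4, 4, 0.1   # RFC 6298 gains; G = clock granularity

    class RtoEstimator:
        def __init__(self, first_rtt: float):
            self.srtt = first_rtt
            self.rttvar = first_rtt / 2

        def update(self, rtt: float) -> float:
            self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - rtt)
            self.srtt = (1 - ALPHA) * self.srtt + ALPHA * rtt
            return max(1.0, self.srtt + max(G, K * self.rttvar))  # floor of 1 s

    est = RtoEstimator(0.20)
    for rtt in [0.18, 0.19, 0.21]:          # steady ~200 ms RTTs while active
        rto = est.update(rtt)
    print(f"RTO settles near {rto:.2f} s")

With steady samples of around 200 ms the RTO sits near its floor, so an idle-to-active radio promotion delay of roughly 2 seconds comfortably exceeds it and is misread as loss.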
The TCP connection and the cellular radio connection for the end-device become idle because of users' web browsing patterns (with a "think time" between pages [9]) and how websites exchange data. Since SPDY uses a single long-lived connection, the TCP parameter settings at the end of a download from one web site are carried over to the next site
accessed by the user. HTTP is less affected by this because
of its use of parallel connections (isolates impact to a subset
of active connections) and because the connections are short
lived (isolates impact going across web sites). We make the
case that a viable solution has to account for these unique
cross-layer dependencies to achieve improved performance
of both HTTP and SPDY over a cellular network.
The main contributions of this paper include:

• We conduct a systematic and detailed study over more than four months on the performance of HTTP and SPDY. We show that SPDY and HTTP perform similarly over cellular networks.

• We show that the interaction between the cellular network and TCP needs further optimization. In particular, we show that the RTT estimate, and thus the retransmission time-out computation in TCP, is incongruous with how the cellular network radio state machine functions.

• We show that the design of web sites, where data is requested periodically, also triggers TCP timeouts, and that there exist dependencies in web pages today that prevent the browser from fully utilizing SPDY's capabilities.

2. BACKGROUND

We present a brief background on how the HTTP and SPDY protocols work in this section. We use the example in Figure 1 to aid our description.

2.1 The HTTP Protocol

The Hypertext Transfer Protocol (HTTP) is a stateless, application-layer protocol for transmitting web documents. It uses TCP as its underlying transport protocol. Figure 1(a) shows an example web page which consists of the main HTML page and four objects referred to in that page. When requesting the document, a browser goes through the typical TCP 3-way handshake, as depicted in Figures 1(b) and (c). Upon receiving the main document, the browser parses the document and identifies the next set of objects needed for displaying the page. In this example there are four more objects that need to be downloaded.

With the original versions of HTTP, a single object was downloaded per connection. HTTP version 1.1 introduced the notion of persistent connections, which have the ability to reuse established TCP connections for subsequent requests, and the concept of pipelining. With persistence, objects are requested sequentially over a connection, as shown in Figure 1(b). Objects are not requested until the previous response has completed. However, this introduces the problem of head-of-line (HOL) blocking, where subsequent requests get significantly delayed waiting for the current response to come back. Browsers attempt to minimize the impact of HOL blocking by opening multiple concurrent connections to each domain — most browsers today use six parallel connections — with an upper limit on the number of active connections across all domains.

With pipelining, multiple HTTP requests can be sent to a server together, without waiting for the corresponding responses, as shown in Figure 1(c). The client then waits for the responses to arrive in the order in which they were requested. Pipelining can improve page load times dramatically. However, since the server is required to send its responses in the same order that the requests were received, HOL blocking can still occur with pipelining. Some mobile browsers have only recently started supporting pipelining.

2.2 The SPDY Protocol

Even though HTTP is widely adopted and used today, it suffers from several shortcomings (e.g., sequential requests, HOL blocking, short-lived connections, lack of server-initiated data exchange, etc.) that impact web performance, especially on the cellular network.

SPDY [7] is a recently proposed application-layer protocol for transporting content over the web with the objective of minimizing latency. The protocol works by opening one TCP connection per domain (or just one connection if going via a proxy). SPDY then allows for unlimited concurrent streams over this single TCP connection. Because requests are interleaved on a single connection, the efficiency of TCP is much higher: fewer network connections need to be made, and fewer, but more densely packed, packets are issued.

SPDY implements request priorities to get around one object request choking up the connection. This is described in Figure 1(d). After downloading the main page and identifying the objects on the page, the client requests all four objects in quick succession, but marks objects 3 and 4 as higher priority. As a result, the server transfers these objects first, thereby preventing the connection from being congested with non-critical resources (objects 2 and 5) when high-priority requests are pending. SPDY also allows multiple responses to be transferred as part of the same packet (e.g., objects 2 and 5 in Figure 1(d) fit in a single response packet and are served altogether). Finally, SPDY compresses request and response HTTP headers and supports server-initiated data exchange. All of these optimizations have been shown to yield up to a 64% reduction in page load times with SPDY [7].
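The head-of-line blocking difference between pipelining and SPDY multiplexing can be illustrated with a toy completion-time model (the per-object service times are made up; this is not a protocol simulation):

    # Four responses, the first of which is slow. Pipelining must deliver
    # responses in request order; SPDY streams complete independently.
    service_time = {1: 2.0, 2: 0.1, 3: 0.1, 4: 0.1}   # seconds per object

    def pipelined_finish_times():
        finish, t = {}, 0.0
        for obj in sorted(service_time):     # in-order delivery
            t = max(t, service_time[obj])    # blocked behind the slow head
            finish[obj] = t
        return finish

    def multiplexed_finish_times():
        return dict(service_time)            # each stream finishes on its own

    print(pipelined_finish_times())    # {1: 2.0, 2: 2.0, 3: 2.0, 4: 2.0}
    print(multiplexed_finish_times())  # {1: 2.0, 2: 0.1, 3: 0.1, 4: 0.1}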
3. EXPERIMENTAL SETUP
We conducted detailed experiments comparing the performance of HTTP and SPDY on the 3G network of a commercial, production US cellular provider over a four month
period in 2013.
Figure 1: Example showing how HTTP and SPDY work. (a) An example web page with a main HTML document and four embedded objects; (b) HTTP with persistent connections: objects are requested sequentially after the TCP handshake; (c) HTTP with pipelining: requests are sent back to back, but responses must arrive in order; (d) SPDY: after the TCP and SSL/SPDY setup, all requests are multiplexed on one connection, with objects 3 and 4 marked higher priority.
Figure 2: Our test setup. Client devices reach the SPDY and HTTP proxies, hosted in a compute cloud, over the cellular network; the proxies fetch content from the test server and other web servers over the Internet.

Figure 2 provides an overview of our test setup. Clients in our setup connect over the cellular network using HTTP or SPDY to proxies that support the corresponding protocol. These proxies then use persistent HTTP to connect to the different web servers and fetch requested objects. We run a SPDY and an HTTP proxy on the same machine for a fair comparison. We use a proxy as an intermediary for two reasons: (a) we could not compare SPDY and HTTP directly: there are relatively few web sites that support SPDY and, moreover, a web server running SPDY would not support HTTP and vice versa, so we would be evaluating connections to different servers, which could affect our results (depending on their load, number of objects served, etc.); (b) most cellular operators in the US already use HTTP proxies to improve web performance, and running a SPDY proxy would allow operators to support SPDY over the cellular network even if the web sites do not.

Test Devices: We use laptops running Windows 7 and equipped with 3G (UMTS) USB cards as our client devices. We ran experiments with multiple laptops simultaneously accessing the test web sites to study the effect of multiple users loading the network. There are several reasons we use a laptop for our experiments. First, tablets and cellular-equipped laptops are on the rise, and these devices request the regular web pages, unlike smart phones. Second, and more importantly, we wanted to eliminate the effects of a slow processor, as that could affect our results. For example, studies [16] have shown that HTML, Javascript, and CSS processing and rendering can delay the request of required objects and significantly affect the overall page load time. Finally, it has been observed [13] that having a slow processor increases the number of zero window advertisements, which significantly affects throughput.

Test Client: We used a default installation of the Google Chrome browser (ver 23.0) as the test client, as it supported traffic traversing a SPDY proxy. Depending on the experiment, we explicitly configured Chrome to use either the HTTP or the SPDY proxy. When using a HTTP proxy, Chrome opens up to 6 parallel TCP connections to the proxy per domain, with a maximum of 32 active TCP connections across all domains. With SPDY, Chrome opens one SSL-encrypted TCP connection and re-uses this connection to fetch web objects. The connection is kept persistent, and requests for different websites re-use the connection.

Test Location: Cellular experiments are sensitive to a lot of factors, such as signal strength, location of the device in a cell, the cell tower's backhaul capacity, load on the cell tower, etc. For example, a device at a cell edge may frequently get handed off between towers, thereby contributing to added delays. To mitigate such effects, we identified a cell tower that had sufficient backhaul capacity and minimal interference from other cell sites. For most of our experiments, we chose a physical location with an unobstructed view of the tower and received a strong signal (between -47 and -52 dBm). We configured the 3G modem to remain connected to that base station at that sector on a particular channel frequency and used a diagnostic tool to monitor the channel on that sector.

Proxies Used: We used a virtual machine running Linux in a compute cloud on the east coast of the US to host our proxies. At the time of our experiments, there were no proxy implementations that supported both HTTP and SPDY. Hence we chose implementations that are purported to be widely used and the most competent implementations for the corresponding protocols. We used Squid [2] (v3.1) as our HTTP proxy. Squid supports persistent connections to both the client and the server. However, it only supports a rudimentary form of pipelining. For this reason, we did not run experiments of HTTP with pipelining turned on; our comparisons are restricted to HTTP with multiple persistent connections. For SPDY, we used a SPDY server built by Google and made available as part of the Chromium source tree. This server was used in the comparison [7] of SPDY and HTTP and has since had extensions built in to support proxying.¹ We ran tcpdump to capture network-level packet traces and the tcp_probe kernel module to capture TCP congestion window values from the proxy to the mobile device.

¹ We also tested performance with a SOCKS proxy, but found the results to be worse than both HTTP and SPDY.
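The browser connection caps mentioned above (6 per domain, 32 across all domains for Chrome) can be mimicked with a toy allocator (illustrative only; a real browser also queues requests and reuses idle connections):

    from collections import Counter

    MAX_PER_DOMAIN, MAX_TOTAL = 6, 32

    def allocate(requests):
        """requests: iterable of domain names; returns connections per domain."""
        conns = Counter()
        for domain in requests:
            if conns[domain] < MAX_PER_DOMAIN and sum(conns.values()) < MAX_TOTAL:
                conns[domain] += 1      # open a new connection
            # else: the request waits to reuse an existing connection
        return conns

    demo = ["img1.example"] * 10 + ["img2.example"] * 10
    print(allocate(demo))   # each domain capped at 6 connections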
Website         Total   Avg. Size  Avg. No. of  Avg. Text  Avg.     Avg. Imgs/
                Objs    (KB)       Domains      Objs       JS/CSS   Other
Finance         134.8   626.9      37.6         28.6       41.3     64.9
Entertainment   160.6   2197.3     36.3         16.5       28.0     116.1
Shopping        143.8   1563.1     15.8         13.3       36.8     93.7
Portal          121.6   963.3      27.5         9.6        18.3     93.7
Technology      45.2    602.8      3.0          2.0        18.0     25.2
ISP             163.4   1594.5     13.2         13.2       36.4     113.8
News            115.8   1130.6     28.5         9.1        49.5     57.2
News            157.7   1184.5     27.3         29.6       28.3     99.8
Shopping        5.1     56.2       2.0          3.1        2.0      0.0
Auction         59.3    719.7      17.9         6.8        7.0      45.5
Online Radio    122.1   1489.1     17.9         24.1       21.0     77.0
Photo Sharing   29.4    688.0      4.0          2.3        10.0     17.1
Technology      63.4    895.1      9.0          4.1        15.0     44.3
Baseball        167.8   1130.5     12.5         19.5       94.0     54.3
News            323     1722.7     84.7         73.4       73.6     176.0
Football        267.1   2311.0     75.0         60.3       56.9     149.9
News            218.5   4691.3     37.0         19.0       56.3     143.2
Photo Sharing   33.6    1664.8     9.1          3.3        6.7      23.6
Online Radio    68.7    2908.9     15.5         5.2        23.8     39.7
Weather         163.2   1653.8     48.7         19.7       45.3     98.2

Table 1: Characteristics of tested websites. Numbers are averaged across runs.

Web Pages Requested: We identified the top web sites visited by mobile users to run our tests (in the top Alexa sites). Of these, we eliminated web sites that are primarily landing pages (e.g., the Facebook login page) and picked the remaining 20 most requested pages. These 20 web pages have a good mix of news websites, online shopping and auction sites, photo and video sharing, as well as professionally developed websites of large corporations. We preferred the "full" site instead of the mobile versions, keeping in mind the increasing proliferation of tablets and large-screen smartphones. These websites contain anywhere from 5 to 323 objects, including the home page. The objects in these sites were spread across 3 to 84 domains. Each web site had HTML pages, Javascript objects, CSS and images. We tabulate important aspects of these web sites in Table 1.

Test Execution: We used a custom client that talks to Chrome via the remote debugging interface and got Chrome to load the test web pages. We generated a random order in which to visit the 20 web sites and used that same order across all experiments. Each website was requested 60 seconds apart. A page may take a much shorter time to load; in that case the system would be idle until the 60 second window elapsed. We chose 60 seconds both to allow web pages to load completely and to reflect a nominal think time that users take between requests.

We used page load time as the main metric to monitor performance. Page load time is defined as the time it takes the browser to download and process all the objects associated with a web page. Most browsers fire a javascript event (onLoad()) when the page is loaded. The remote debugging interface provided us the time to download the different objects in a web page. We alternated our test runs between HTTP and SPDY to ensure that temporal factors do not affect our results. We ran each experiment multiple times during the typically quiet periods (e.g., 12 AM to 6 AM) to mitigate effects of other users using the base station.

4. EXPERIMENTAL RESULTS

We first compare the performance of SPDY and HTTP using data collected from a week's worth of experiments. Since there was a lot of variability in the page load times, we use a box plot to present the results in Figure 3. The x-axis shows the different websites we tested; the y-axis is the page load time in milliseconds. For each website, the (red) box on the left shows the page load times for HTTP, while the (blue) box on the right shows the times for SPDY. The box plot gives the standard metrics: the 25th percentile, the 75th percentile, and the black notch in the box is the median value. The top and bottom of the whiskers show the maximum and minimum values respectively. Finally, the circle in these boxes shows the mean page load time across all the runs.

Figure 3: Page Load Time for different web sites with HTTP and SPDY (box plots per website; y-axis: page load time in msec).

The results from Figure 3, interestingly, do not show a convincing winner between HTTP and SPDY. For some sites, the page load time with SPDY is lower (e.g., 3, 7), while for others HTTP performs better (e.g., 1, 4). But for a large number of sites there isn't a significant difference.² This is in sharp contrast to existing results on SPDY, where it has been shown to have between 27-60% improvement [7]. Importantly, previous results have shown an average of 23% reduction over emulated cellular networks [17].

² HTTP seems to perform quite poorly with site 2. Upon investigation, we found that the browser would occasionally stall on this site. These stalls happened more often with HTTP than with SPDY, resulting in increased times.

4.0.1 Performance over 802.11 Wireless Networks

As a first step in explaining the result in Figure 3, we wanted to ensure that the result was not an artifact of our test setup or the proxies used. Hence, we ran the same experiments using the same setup, but over an 802.11g wireless network connected to the Internet via a typical residential broadband connection (15 Mbps down / 2 Mbps up).

Figure 4 shows the average page load times and the 95% confidence intervals. Like previous results [7], this result also shows that SPDY performs better than HTTP consistently, with page load time improvements ranging from 4% for website 4 to 56% for website 9 (ignoring website 2). Since the only difference between the two tests is the access network, we conclude that our results in Figure 3 are a consequence of how the protocols operate over the cellular network.

5. UNDERSTANDING THE CROSS-LAYER INTERACTIONS

We look at the different components of the application and the protocols that can affect performance. In the process we observe that there are significant interdependencies between the different layers (from browser behavior and web page design, to TCP protocol implementations, to the intricacies of the cellular network) that affect overall performance.
Figure 4: Average Page Load Time over an 802.11g/Broadband network (per website, with 95% confidence intervals).

Figure 5: Split of average download times of objects by constituent components (init, send, wait, and receive times, for HTTP and SPDY, per web site).

5.1 Object download times

The first result we study is the breakdown of the page load time. Recall that, by default, the page load time is the time it takes the browser to process and download all the objects required for the web page. Hence, we look into the average download time of objects on a given page. We split the download time of an object into 4 steps: (a) the initialization step, which includes the time from when the browser realizes that it requires the object to when it actually requests the object; (b) the send step, which includes the time to actually send the request over the network; (c) the wait time, which is the time between sending the request and the first byte of the response; and finally (d) the receive time, which is the time to receive the object.

We plot the average time of these steps for the different web sites in Figure 5. First, we see that the trends for average object download time are quite similar to those of page load times (in Figure 3). This is not surprising, given that page load time is dependent on the object download times. Next, we see that the send time is almost invisible for both HTTP and SPDY, indicating that sending the request happens very quickly. Almost all HTTP requests fit in one TCP packet. Similarly, almost all SPDY requests also fit in a single TCP packet, even when the browser bundles multiple SPDY requests in one packet. Third, we see that receive times with HTTP and SPDY are similar, with SPDY resulting in slightly better average receive times. We see that the initialization time is much higher with HTTP because the browser has to either open a new TCP connection to download the object (and add the delay of a TCP handshake), or wait until it can re-use an existing TCP connection.

SPDY incurs very little initialization time because the connection is pre-established. On the other hand, it incurs a significant wait time. Importantly, this wait time is significantly higher than the initialization time for HTTP. This negates any advantages SPDY gains by reusing connections and avoiding connection setup. The wait times for SPDY are much greater because multiple requests are sent together or in close succession to the proxy. This increases delay as the proxy catches up in serving the requests to the client. Figure 7, discussed in the next section, shows this behavior.

5.2 Web Page design and object requests

We now look at when the different objects for a website are requested by the browser. One of the performance enhancements SPDY allows is for all objects to be requested in parallel without waiting for the response of outstanding objects. In contrast, HTTP has only one outstanding request per TCP connection unless pipelining is enabled.

We plot the request time (i.e., the time the browser sends out a request) for both HTTP and SPDY for four websites (due to space considerations) in Figure 6. Two of these are news websites and two contain a number of photos and videos. SPDY, unlike what was expected, does not actually request all the objects at the same time. Instead, for three of the four web sites, SPDY requests objects in steps. Even for the one website where all the objects are requested in quick succession, we observe a delay between the first request and the subsequent requests. HTTP, on the other hand, requests objects continuously over time. The number of objects it downloads in parallel depends on the number of TCP connections the browser opens to each domain and across all domains.

We attribute this sequence of object requests to how the web pages are designed and how the browsers process them to identify constituent objects. Javascript and CSS files introduce interdependencies by requesting other objects. Table 1 highlights that websites make heavy use of JavaScript and CSS and contain anywhere from 2 to 73 different scripts and stylesheets. The browser does not identify these further objects until these files are downloaded and processed. Further, browsers process some of these files (e.g., Javascripts) sequentially, as these can change the layout of the page. This results in further delays. The overall impact on page load speeds depends on the number of such files in a web page, and on the interdependencies in them.

To validate our assertion that SPDY is not requesting all the objects at once because of these interdependencies, and also to better understand the higher wait time of objects, we built two test web pages that consist of only a main HTML page and images, which we placed on a test server (see Fig. 2). There were a total of 50 objects that
needed to be downloaded as part of the web page. We controlled the effect of domains by testing the two extremes: in one web page, all the objects came from different domains, while in the second extreme all the objects came from the same domain. Figure 7 shows the results of these two tests. Since there are no interdependencies in the web page, we see that the browser almost immediately identifies all the objects that need to be downloaded after downloading the main HTML page (shown using red dots). SPDY then requests all the images on the page in quick succession (shown in green dots) in both cases. HTTP, on the other hand, is affected by these extremes. When all objects are on different domains, the browser opens one connection to each domain, up to a maximum number of connections (32 in the case of Chrome). When all the objects are on the same domain, browsers limit the number of concurrent connections (6 in the case of Chrome) but reuse the connections.

Figure 6: Object request patterns for different websites (cumulative objects requested over time for HTTP and SPDY, on two news websites and two photo/video websites; x-axis: time from start in msec, log scale).

Figure 7: Object request and download with test web pages (HTTP and SPDY, with all objects on the same domain vs. on different domains; x-axis: time in sec; y-axis: object ID in the order received at the proxy).

Note that while the requests for SPDY are sent out earlier (green dots) than HTTP, SPDY has a much more significant delay until the first byte of data is sent back to the client (start of the blue horizontal line). Moreover, we also observe, especially in the different-domain case, that if multiple objects are downloaded in parallel, the time to receive the objects (length of the blue line) is increased. We find in this experiment that removing all the interdependencies for SPDY does not significantly improve the performance. In our tests, HTTP had an average page load time of 5.29s and 6.80s with single vs. multiple domains respectively. Conversely, SPDY averages 7.22s and 8.38s with single or multiple domain tests. Consequently, prioritization alone is not a panacea for SPDY's performance in cellular networks.

5.3 Eliminating Server-Proxy link bottleneck

Figures 6 and 7 show that while today's web pages do not take full advantage of SPDY's capabilities, that is not a reason for the lack of performance improvements with SPDY in cellular networks. So as the next step, we focus on the proxy and see if the proxy-server link is a bottleneck.

In Figure 8 we plot the sequence of steps at the proxy for a random website from one randomly chosen sample execution with SPDY. The figure shows the objects in the order of requests by the client. There are three regions in the plot for each object. The black region shows the time between when the object was requested at the proxy to when the proxy receives the first byte of the response from the web server. The next region, shown in cyan, represents the time it takes the proxy to download the object from the web server, starting from the first byte that it receives. Finally, the red region represents the time it takes the proxy to transfer the object back to the client.

Figure 8: Queuing delay at the proxy (per object: time between request and first byte, data download, and data transfer back to the client).

It is clear from Figure 8 that the link between the web server and proxy is not the bottleneck. We see that in most cases, the time between when the proxy receives the request from the client to when it has the first byte of data from the web server is very short (average of 14 msec with a max of 46 msec). The time to download the data, at an average of 4 msec, is also quite short. Despite having the data, however, we observe that the proxy is unable to send the data quickly to the client device. There is a significant delay between when the data was downloaded to the proxy and when it begins to send the data to the client.

This result shows that SPDY has essentially moved the bottleneck from the client to the proxy. With HTTP, the client does not request objects until the pending ones are downloaded. If these downloads take a while, the overall download process is also affected. In essence, this is like admission control at the client. SPDY gets rid of this by requesting all the objects in quick succession. While this works well when there is sufficient capacity on the proxy-client link, the responses get queued up at the proxy when the link between the proxy and the client is a bottleneck.

5.4 Throughput between client and proxy

The previous result showed that the proxy was not able to transfer objects to the client quickly, resulting in long wait times for SPDY. Here, we study the average throughput achieved by SPDY and HTTP during the course of our experiments. Since each website is requested exactly one minute apart, in Figure 9 we align the start times of each experiment, bin the data transferred by SPDY and HTTP each second, and compute the average across all the runs.

Figure 9: Average data transferred from proxy to device every second (MB, over the 20-minute run).
The figure shows the average amount of data that was transferred during that second. The vertical lines seen every minute indicate the time when a web page was requested. We see from the graph that HTTP, on average, achieves higher data transfers than SPDY. The difference sometimes is as high as 100%. This is a surprising result because, in theory, the network capacity between the client and the proxy is the same in both cases. The only difference is that HTTP uses multiple connections, each of which shares the available bandwidth, while with SPDY the single connection uses the entire capacity. Hence, we would expect the throughput to be similar; yet they are not. Since network utilization is determined by how TCP adapts to available capacity, we shift our attention to how TCP behaves in the cellular network.

Figure 10: The number of unacknowledged bytes (data in flight, in Kbytes) for a random run with HTTP and SPDY, with zoomed views of four individual websites.
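The aggregation behind Figure 9, as described in the text, can be sketched as follows (an assumed reconstruction: align each run's start time, sum bytes into 1-second bins, then average bins across runs):

    from collections import defaultdict

    def average_per_second(runs):
        """runs: list of [(timestamp_sec, nbytes), ...] with absolute times."""
        buckets = defaultdict(list)
        for samples in runs:
            t0 = min(t for t, _ in samples)       # align start times
            per_sec = defaultdict(int)
            for t, nbytes in samples:
                per_sec[int(t - t0)] += nbytes    # 1-second bins
            for sec, total in per_sec.items():
                buckets[sec].append(total)
        return {sec: sum(v) / len(runs) for sec, v in sorted(buckets.items())}

    run_a = [(100.0, 5000), (100.4, 7000), (101.2, 3000)]
    run_b = [(200.1, 4000), (201.9, 8000)]
    print(average_per_second([run_a, run_b]))     # {0: 8000.0, 1: 5500.0}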
5.5 Understanding TCP performance

To understand the cause of the lower average throughput with SPDY, we look at how TCP behaves when there is one connection compared to when there are multiple connections. We start by looking at the outstanding bytes in flight between the proxy and the client device with HTTP and SPDY. The number of bytes in flight is defined as the number of bytes the proxy has sent to the client that are awaiting acknowledgment. We plot the data from one random run of the experiment in Figure 10.

Figure 10 shows that there are instances where HTTP has more unacknowledged bytes, and other instances where SPDY wins. When we looked at the correlation between page load times and the number of unacknowledged bytes, we found that whenever the outstanding bytes are higher, the page load times are lower. To illustrate this, we zoom into four websites (1, 7, 13 and 20) from the same run and plot them in the lower half of Figure 10. For the first two websites, HTTP has more unacknowledged data and hence the page load times were lower (by more than one second), whereas for 13 and 20, SPDY has more outstanding data and hence lower page load times (faster by 10 seconds and 2 seconds respectively). We see that the trend applied for the rest of the websites and other runs. In addition, we see in websites 1 and 20 that the growth in outstanding bytes (i.e., the growth of throughput) is quite slow for SPDY. We have already established in Figure 8 that the proxy is not starved for data. Hence, the possible reasons for limiting the amount of data transferred could be either limits in the sender's congestion window or the receiver window.

5.5.1 Congestion window growth

We processed the packet capture data and extracted the receive window (rwin) advertised by the client. From the packet capture data, it was pretty clear that rwin was not the bottleneck for these experimental runs. So instead we focused on the proxy's congestion window and its behavior. To get the congestion window, we needed to tap into the Linux kernel and ran a kernel module (tcp_probe) that reports the congestion window (cwnd) and slow-start threshold (ssthresh) for each TCP connection.

Figure 11 shows the congestion window, ssthresh, the amount of outstanding data and the occurrence of retransmissions during the course of one random run with SPDY. First, we see that in all cases the cwnd provides the ceiling on the outstanding data, indicating that it is the limiting factor in the amount of data transferred. Next, we see that both the cwnd and the ssthresh fluctuate throughout the run. Under ideal conditions, we would expect them to initially grow and then stabilize to a reasonable value. Finally, we see many retransmissions (black circles) throughout the duration of the run (in our plot, the fatter the circle, the greater the number of retransmissions).

To gain a better understanding, we zoom into the interval between 40 seconds and 190 seconds in Figure 12. This represents the period when the client issues requests to websites 2, 3, and 4. The vertical dashed lines represent time instances where there are retransmissions. From Figure 12 we see that, at time 60, when accessing website 2, both the cwnd and ssthresh are small. This is a result of multiple retransmissions happening in the time interval 0-60 seconds (refer to Figure 11). From 60 to 70 seconds, both the cwnd and ssthresh grow as data is transferred. Since the cwnd is higher than the ssthresh, TCP stays in congestion avoidance and does not grow as rapidly as it would in slow-start. The pattern of growth during the congestion avoidance phase is also particular to TCP-Cubic (because it first probes and then has an exponential growth).

After about 70 seconds, there isn't any data to transfer, and the connection goes idle until about 85 seconds. This is the key period of performance loss.
258
D. Rossi – RES224
160
CWnd
Number of Segments
140
quent retransmissions (refer Figure 11). As a consequence,
cwnd is reduced and the ssthresh is set to a value based on
the cwnd (the specific values depend on the flavor of TCP).
TCP then enters slow start and cwnd and ssthresh grow
back quickly to their previous values (again this depends on
the version of TCP, and in this case depends on the behavior of TCP-Cubic). As a result of an idle and subsequent
retransmission, a similar process repeats itself twice, at 90
and 120 seconds with the cwnd and ssthresh. Interestingly,
at 110 seconds, we do not see retransmissions even though
there was an idle period. We attribute this to the fact that
the RTO value is grown large enough to accommodate the
increased round trip time after the idle time.
When website 3 is requested at time 120, the cwnd and
ssthresh grow as data is transferred. The website also
transfers small amounts of data at around 130 seconds, after a short idle period. That causes TCP to reduce its cwnd
to 10. However the idle period is short enough that the
cellular network does not go idle. As a result, there are
no retransmissions and the ssthresh stays at 65 segments.
The cwnd remains at 10 as no data was transferred after that
time. When website 4 is requested at 180 seconds, however,
the ssthresh falls dramatically because there is a retransmission (TCP as well as the cellular network become idle).
Moreover, there are multiple retransmissions as the RTT
estimates no longer hold.
Retransmission
Outstanding Data
SSThresh
120
100
80
60
40
20
0
0
200
400
600
Time (in sec)
800
1000
1200
Figure 11: The cwnd, ssthresh, and outstanding data
for one run of SPDY. The figure also shows times at
which there are retransmissions.
Figure 12: The cwnd, ssthresh, and outstanding data for three consecutive websites.
5.5.2 Understanding Retransmissions
One of the reasons for both SPDY's and HTTP's performance issues is the occurrence of TCP retransmissions. Retransmissions cause the TCP congestion window to collapse, which in turn hurts throughput. We analyze the occurrence of retransmissions and their causes in this section.

There are on average 117.3 retransmissions per experiment for HTTP and 67.3 for SPDY. We observed in the previous section that most of the TCP retransmissions were spurious, due to an overly tight RTO value. Upon close inspection of one HTTP run, we found that all (442) retransmissions were in fact spurious. On a per-connection basis, HTTP has fewer retransmissions (2.9), since there are 42.6 concurrent TCP connections open on average. Thus, the 67.3 retransmissions for SPDY, all on a single connection, result in much lower throughput. We also note from our traces that the retransmissions are bursty in nature and typically affect a few (usually one) TCP connections. Figure 13 shows that even though HTTP has a higher number of retransmissions, when one connection's throughput is compromised, the other TCP connections continue to perform unaffected. Since HTTP uses a 'late binding' of requests to connections (by allowing only one outstanding request per connection), it is able to avoid affected connections and maintain utilization of the path between the proxy and the end device. On the other hand, since SPDY opens only one TCP connection, all these retransmissions affect its throughput.
5.6 Cellular network behavior

5.6.1 Cellular State Machine
In this experiment, we analyze the performance improvement gained by the device staying in the DCH state. Since there is a delay between each website request, we run a continual ping process that transfers a small amount of data every few seconds. We choose a payload that is small enough not to interfere with our experiments, but large enough that the state machine keeps the device in DCH mode.
Figure 13: Retransmission bursts affecting a single TCP stream.
Figure 14: Impact of the cellular RRC state machine.
Figure 14 shows the CDF of the page load times for the different websites across the different runs. Unsurprisingly, the
result shows that having the cellular network in DCH mode
through continuous background ping messages significantly
improves the page load time of both HTTP and SPDY. For
example, more than 80% of the instances load in less than 8
seconds when the device sends continual ping messages, but
only between 40% (SPDY) and 45% (HTTP) complete loading without the ping messages. Moreover, SPDY performs
better than HTTP for about 60% of the instances with the
ping messages. We also looked into the number of retransmissions with and without ping messages; not surprisingly,
we observed that the number of retransmissions reduced by
∼91% for HTTP and ∼96% for SPDY indicating that TCP
RTT estimation is no longer impacted by the cellular state
machine. While this result is promising, it is not practical to
keep the device in DCH state as it wastes cellular resources
and drains device battery. Hence, mechanisms need to be
built into TCP that account for the cellular state machine.
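For concreteness, the background traffic that holds the radio in DCH can be approximated with a tiny keep-alive loop such as the sketch below. The paper uses a ping (ICMP) process; we use a UDP stand-in to avoid raw sockets, and the server address, payload size and interval are made-up knobs that would have to be tuned so the radio stays in DCH without perturbing the measurements:

```python
import socket
import time

SERVER = ("192.0.2.1", 9999)   # placeholder address (documentation range)
PAYLOAD = b"x" * 64            # small enough not to disturb the experiment
INTERVAL = 3.0                 # seconds; below the RRC demotion timers

# Periodically send a small datagram so the RRC state machine never
# demotes the radio out of the DCH state.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
while True:
    sock.sendto(PAYLOAD, SERVER)
    time.sleep(INTERVAL)
```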
Figure 15: Page load times with and without tcp_slow_start_after_idle.
Figure 16: Page load time of HTTP and SPDY over LTE.
5.6.2 Performance over LTE
We analyze the performance of HTTP and SPDY over
LTE in this section. LTE adopts an improved RRC state machine with a significantly smaller promotion delay. On the
other hand, LTE also has lower round-trip times compared
to 3G, which has the corresponding effect of having much
smaller RTO values. We perform the same experiments using the same setup as in the previous 3G experiments, but
connect to an LTE network with LTE USB laptop cards.
Figure 16 shows the box plot of page load times for HTTP
and SPDY over LTE. As expected, we see that both HTTP
and SPDY have considerably smaller page load times compared to 3G. We also see that HTTP performs just as well
as SPDY, if not better, for the initial few pages. However, SPDY’s performance is better than HTTP after the
initial set of web pages. We attribute this to the fact that
LTE’s RRC state machine addresses many of the limitations
present in the 3G state machine, thereby allowing TCP’s
congestion window to grow to larger values and thus allowing SPDY to transfer data more quickly. We also looked at
the retransmission data for HTTP and SPDY – the number of retransmissions reduced significantly with an average
of 8.9 and 7.52 retransmissions per experiment with HTTP
and SPDY (as opposed to 117 and 63 with 3G) respectively.
While the modified state machine of LTE results in better performance, we also wanted to see whether it eliminated the issue of retransmissions caused by the state promotion delay. We focus on a short duration of a particular, randomly selected run with SPDY in Figure 17. The figure shows the congestion window of the TCP connection (in red), the amount of data in flight (in cyan), and the times when there are retransmissions (in black); thicker retransmission lines indicate multiple retransmissions. We see from the figure that retransmissions occur after an idle period in LTE as well. For example, at around 600 seconds, the proxy tries to send data to the device after an idle period; timeouts occur after the transmission of data, leading to retransmissions, and the congestion window collapses. This result leads us to believe that the problem persists even with LTE, albeit less frequently than with 3G.

Figure 17: SPDY's congestion window and retransmissions over LTE.
5.7 Summary and Discussion
We see from these results how the interaction between the different layers affects performance. First, we see websites sending and/or requesting data periodically (ads, tracking cookies, web analytics, page refreshes, etc.). We also observe that a key factor affecting performance is the independent reaction of the transport protocol (i.e., TCP) and the cellular network to inferred network conditions.

TCP implementations assume their cwnd statistics do not hold after an idle period, as the network capacity might have changed. Hence, they drop the cwnd to its initial value. That in itself would not be a problem in wired networks, as the cwnd would grow back up quickly. But in conjunction with the cellular network's idle-to-active promotion delay, it results in unintended consequences. Spurious retransmissions occurring due to the promotion delay cause the ssthresh to fall to the cwnd value. As a result, when TCP tries to recover, it goes through slow start only for a short duration, and then switches to congestion avoidance, even for a small number of segments. From a TCP standpoint, this defeats the design intent that short transfers, which do not have the potential of causing congestion (and loss), should be able to rapidly acquire bandwidth, thus reducing transfer time. This difficulty of transport protocols 'shutting down' after an idle period at just the time when applications wake up and seek to transfer data (and therefore require higher throughput) is not new and has been observed before [8]. However, the process is further exacerbated in cellular networks by the existence of a large promotion delay. These interactions thus degrade performance, including causing multiple (spurious) retransmissions that have significant undesirable impacts on individual TCP connection behavior.

Our results also point to a fundamental flaw in TCP implementations. Existing implementations discard the congestion window value after an idle period to account for potential changes in the bandwidth during the idle period. However, information about the latency profile (i.e., the RTT estimates) is retained. With the cellular state machine, the latency profile also changes after an idle period; since the estimates are inaccurate, the result is spurious retransmissions. We notice that LTE, despite an improved state machine, is still susceptible to retransmissions when coming out of the idle state. When we keep the device in active mode continuously, we transform the cellular network to behave more like a traditional wired (or WiFi) network in terms of latency profile. Consequently, we see results similar to the ones seen over wired networks.
6. POTENTIAL IMPROVEMENTS

Having identified the interactions between TCP and the cellular network as the root cause of the problem, we propose in this section steps that can minimize their impact.

6.1 Using multiple TCP connections

The observation that using a single TCP connection causes SPDY to suffer because of retransmissions suggests a need to explore the use of multiple TCP connections. We explore this option by having the browser use 20 SPDY connections to a single proxy process listening on 20 different ports.⁴ However, the use of multiple TCP connections did not help in improving the page load times for SPDY. This is primarily because, with SPDY, requests are issued to each connection up front. As a result, if a connection encounters retransmissions, pending objects requested on that connection are delayed. What is required is a late binding of the response to an 'available' TCP connection (meaning one that has an open congestion window and can transmit data packets from the proxy to the client at that instant), avoiding connections that are currently suffering from the effects of spurious timeouts and retransmissions. Such a late binding would allow the response to come back on any available TCP connection, even if the request was sent out on a different connection. This takes advantage of SPDY's capability to send the requests out in a 'burst', and allows the responses to be delivered to the client as they arrive back, avoiding any 'head-of-line blocking'.

⁴ On the browser, we made use of a proxy auto-config (PAC) file that dynamically allocates the proxy address and one of the 20 ports for each object requested.

6.2 TCP Implementation Optimizations

6.2.1 Resetting RTT Estimate after Idle
There is a fundamental need to decay the estimate of the available capacity of a TCP connection once it goes idle. The typical choice made by implementations today is to just reset the cwnd to its initial value. The round-trip time (RTT) estimate, however, is left untouched. The RTT estimate drives the retransmission timeout (RTO) value and hence controls when a packet is retransmitted. Not resetting the RTT estimate may be acceptable in networks that have mostly 'stable' latency characteristics (e.g., a wired or WiFi network), but as we see in our observations, with the cellular network this leads to substantially degraded performance. The cellular network has vastly varying RTT values. In particular, the idle-to-active transition (promotion) can take a few seconds. Since the previous RTT estimate, derived when the cellular connection was active, may have been on the order of tens or hundreds of milliseconds, there is a high probability of a spurious timeout and retransmission of one or more packets after the idle period. These retransmissions have the cascading effect of reducing the cwnd further, and also reducing the ssthresh. Therefore, when the cwnd starts growing, it grows in congestion avoidance mode, which further reduces throughput. Thus, the interaction of TCP with the RRC state machine of the cellular network has to be properly factored in to achieve the best performance. Our recommended approach is to reset the RTT estimate as well, to the initial default value (of multiple seconds). This causes the RTO value to be larger than the promotion delay of the 3G cellular network, thus avoiding spurious timeouts and unnecessary retransmissions. This, in turn, allows the cwnd to grow rapidly, ultimately reducing page load times.
6.2.2 Benefit of Slow Start after Idle?
One approach we also considered was whether avoiding the 'slow start after idle' behavior would improve performance. We examined the benefit or drawback of the TCP connection transitioning to slow start after idle: we disabled the slow start parameter and studied the change in page load time. Figure 15 plots the relative difference between the average page load time of the different websites with and without this parameter enabled. A negative value on the Y-axis indicates that disabling the parameter is beneficial, while a positive value indicates that enabling it is beneficial.
We see that the benefits vary across different websites. Our packet traces indicate that the amount of outstanding data (and hence the throughput) is quite similar in both cases. The number of retransmitted packets seems similar under good conditions, but disabling the parameter runs the risk of causing many retransmissions under congestion or poor channel conditions, since the cwnd value is inaccurate after an idle period. In some instances, the cwnd grows so large with the parameter disabled that the receive window becomes the bottleneck and negates the benefit of a large congestion window at the sender.
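For reference, on Linux this behavior is controlled by the net.ipv4.tcp_slow_start_after_idle sysctl. A minimal sketch of toggling it for such an experiment (illustrative only, requires root) could be:

```python
# Toggle Linux's slow-start-after-idle behavior for an experiment.
# 1 = default (reset cwnd after idle), 0 = keep the pre-idle cwnd.
KNOB = "/proc/sys/net/ipv4/tcp_slow_start_after_idle"

def set_slow_start_after_idle(enabled: bool) -> None:
    with open(KNOB, "w") as f:
        f.write("1" if enabled else "0")

def get_slow_start_after_idle() -> bool:
    with open(KNOB) as f:
        return f.read().strip() == "1"
```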
6.2.3 Impact of TCP variants
We replaced TCP Cubic with TCP Reno to see if modifying the TCP variant has any positive impact on performance. We find in Table 2 that there is little to distinguish
between Reno and Cubic for both HTTP and SPDY over
3G. We see that the average page load time across all the
runs of all pages is better with Cubic. Average throughput
is quite similar with Reno and Cubic, with SPDY achieving
the highest value with Cubic. While this seemingly contradicts the result in Figure 9, note that this result is the
average across all times (ignoring idle times), while the result in Figure 9 considers the average at that one second
instant. Indeed the maximum throughput result confirms
this: HTTP with Cubic achieves a higher throughput than
SPDY with Cubic. SPDY with Reno does not grow the congestion window as much as SPDY with Cubic. This probably results in SPDY with Reno having the worst page load
time across the combinations.
                          Reno                Cubic
                          HTTP      SPDY      HTTP      SPDY
  Avg. Page Load (msec)   9690.84   9899.95   9352.58   8671.09
  Avg. Throughput (KBps)   121.88    119.55    115.36    129.79
  Max. Throughput (KBps)  1024.74    528.88    889.33    876.98
  Avg. cwnd (# segments)    10.45     24.16     10.59     52.11
  Max. cwnd (# segments)       22        48        22       197

Table 2: Comparison of HTTP and SPDY with different TCP variants.
6.2.4 Cache TCP Statistics?
The Linux implementation of TCP caches statistics such
as the slow start threshold and round trip times by default
and reuses them when a new connection is established. If
the previous connection had statistics that are not currently
accurate, then the new connection is negatively impacted.
Note that since SPDY uses only one connection, the only
time these statistics come into play is when the connection
is established. It can potentially impact HTTP, however, because HTTP opens a number of connections over the course
of the experiments. We conducted experiments where we
disabled caching. Interestingly, we find from our results
that both HTTP and SPDY experience reduced page load
times. For example, for 50% of the runs, the improvement
was about 35%. However, there was very little to distinguish
between HTTP and SPDY.
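On Linux, this caching corresponds to the kernel's per-destination TCP metrics, which can be disabled for experiments through the net.ipv4.tcp_no_metrics_save sysctl; a sketch in the same illustrative style:

```python
# Disable Linux's caching of per-destination TCP metrics (ssthresh,
# RTT, etc.) so each new connection starts from defaults.
# Illustrative only; requires root.
KNOB = "/proc/sys/net/ipv4/tcp_no_metrics_save"

def set_metrics_caching(enabled: bool) -> None:
    # tcp_no_metrics_save=1 means "do NOT save metrics", i.e. caching off.
    with open(KNOB, "w") as f:
        f.write("0" if enabled else "1")
```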
7. RELATED WORK

Radio resource management: There have been several attempts to improve the performance of HTTP over cellular networks (e.g., [10, 12]). Specifically, TOP and TailTheft study efficient ways of utilizing radio resources by optimizing timers for state promotions and demotions. [5] studies the use of caching at different levels (e.g., nodeB, RNC) of a 3G cellular network to reduce the download latency of popular web content.

TCP optimizations: With regard to TCP, several proposals have tried to tune TCP parameters to improve its performance [14] and to address issues like head-of-line (HOL) blocking and multi-homing. Recently, Google proposed in an IETF draft [4] to increase the TCP initial congestion window to 10 segments, to show how web applications would benefit from such a policy. As a rebuttal, Gettys [6] demonstrated that changing the initial TCP congestion window can in fact be very harmful to other real-time applications that share the broadband link, and attributed this problem to 'buffer bloat'. As a result, Gettys proposed the use of HTTP pipelining to provide improved TCP congestion behavior. In this paper, we investigate in detail how congestion window growth affects download performance for HTTP and SPDY in cellular networks. In particular, we demonstrate how idle-to-active transitions at different protocol layers result in unintended consequences such as retransmissions. Ramjee et al. [3] recognize how challenging it can be to optimize TCP performance over 3G networks exhibiting significant delay and rate variations. They use an ACK regulator to manage the release of ACKs to the TCP source so as to prevent undesired buffer overflow. Our work inspects in detail how SPDY and HTTP, and thereby TCP, behave in cellular networks. Specifically, we point out a fundamental insight with regard to avoiding spurious timeouts. In conventional wired networks, bandwidth changes but the latency profile does not change as significantly. In cellular networks, we show that spurious timeouts are caused by the fact that TCP stays with its original estimate of the RTT: a tight retransmission timeout (RTO) derived over multiple round trips during the active period of a TCP connection is not only invalid after an idle period, but has a significant performance impact. Thus, we suggest a more conservative way to manage the RTO estimate.

8. CONCLUSION

Mobile web performance is one of the most important measures of users' satisfaction with their cellular data service. We have systematically studied, through field measurements on a production 3G cellular network, two of the most prominent web access protocols used today, HTTP and SPDY. In cellular networks, there are fundamental interactions across protocol layers that limit the performance of both SPDY and HTTP. As a result, there is no clear performance improvement with SPDY in cellular networks, in contrast to existing studies on wired and WiFi networks. Studying these unique cross-layer interactions when operating over cellular networks, we show that there are fundamental flaws in the implementation choices of aspects of TCP when a connection comes out of an idle state. Because of the high variability in latency when a cellular end device goes from idle to active, retaining TCP's RTT estimate across this transition results in spurious timeouts and a corresponding burst of retransmissions. This particularly punishes SPDY, which depends on the single TCP connection that is hit with the spurious retransmissions and thereby all the cascading effects of TCP's congestion control mechanisms, such as the lowering of the cwnd. This ultimately reduces throughput and increases page load times. We proposed a holistic approach that considers all the TCP implementation features and parameters to improve mobile web performance and thereby fully exploit SPDY's advertised capabilities.
REFERENCES
[1] 3GPP TS 36.331: Radio Resource Control (RRC).
http://www.3gpp.org/ftp/Specs/html-info/36331.htm.
[2] Squid Caching Proxy. http://www.squid-cache.org.
[3] Chan, M. C., and Ramjee, R. TCP/IP performance over
3G wireless links with rate and delay variation. In ACM
MobiCom (New York, NY, USA, 2002), MobiCom ’02,
ACM, pp. 71–82.
[4] Chu, J., Dukkipati, N., Cheng, Y., and Mathis, M.
Increasing TCP’s Initial Window. http://tools.ietf.org/
html/draft-ietf-tcpm-initcwnd-08.html, Feb. 2013.
[5] Erman, J., Gerber, A., Hajiaghayi, M., Pei, D., Sen, S.,
and Spatscheck, O. To cache or not to cache: The 3g
case. IEEE Internet Computing 15, 2 (2011), 27–34.
[6] Gettys, J. IW10 Considered Harmful.
http://tools.ietf.org/html/
draft-gettys-iw10-considered-harmful-00.html, August
2011.
[7] Google. SPDY: An experimental protocol for a faster web.
http://www.chromium.org/spdy/spdy-whitepaper.
[8] Kalampoukas, L., Varma, A., Ramakrishnan, K. K.,
and Fendick, K. Another Examination of the
Use-it-or-Lose-it Function on TCP Traffic. In ATM
Forum/96-0230 TM Working Group (1996).
[9] Khaunte, S. U., and Limb, J. O. Statistical
characterization of a world wide web browsing session.
Tech. rep., Georgia Institute of Technology, 1997.
[10] Liu, H., Zhang, Y., and Zhou, Y. Tailtheft: leveraging
the wasted time for saving energy in cellular
communications. In MobiArch (2011), pp. 31–36.
[11] Popa, L., Ghodsi, A., and Stoica, I. HTTP as the
narrow waist of the future internet. In Hotnets-IX (2010),
pp. 6:1–6:6.
[12] Qian, F., Wang, Z., Gerber, A., Mao, M., Sen, S., and
Spatscheck, O. TOP: Tail Optimization Protocol For
Cellular Radio Resource Allocation. In IEEE ICNP (2010),
pp. 285–294.
[13] Sanadhya, S., and Sivakumar, R.
Adaptive Flow Control for TCP on Mobile Phones. In
IEEE Infocom (2011).
[14] Stone, J., and Stewart, R. Stream Control Transmission
Protocol (SCTP) Checksum Change.
http://tools.ietf.org/html/rfc3309.html, September
2002.
[15] W3techs.com. Web Technology Surveys. http://w3techs.
com/technologies/details/ce-spdy/all/all.html, June
2013.
[16] Wang, X. S., Balasubramanian, A., Krishnamurthy,
A., and Wetherall, D. Demystifying Page Load
Performance with WProf. In Usenix NSDI’13 (Apr 2013).
[17] Welsh, M., Greenstein, B., and Piatek, M. SPDY
Performance on Mobile Networks. https://developers.
google.com/speed/articles/spdy-for-mobile, April 2012.
[18] Winstein, K., Sivaraman, A., and Balakrishnan, H.
Stochastic Forecasts Achieve High Throughput and Low
Delay over Cellular Networks. In Usenix NSDI’13 (Apr
2013).
APPENDIX
A. CELLULAR STATE MACHINES

The radio state of every device in a cellular network follows a well-defined state machine. This state machine, defined by 3GPP [1] and controlled by the radio network controller (in 3G) or by the base station (in LTE), determines when a device can send or receive data. While the details of the states, how long a device remains in each state, and the power it consumes in each state differ between 3G and LTE, the main purpose is similar: the occupancy of these states controls the number of devices that can access the radio network at a given time. It enables the network to conserve and share the available radio resources amongst the devices, and to save the device battery at times when the device has no data to send or receive.

Figure 18: The RRC state machines for 3G UMTS and LTE networks.
3G state machine: The 3G state machine, as shown in Figure 18, typically consists of three states: IDLE, the forward access channel (CELL_FACH) and the dedicated channel (CELL_DCH). When the device has no data to send or receive, it stays in the IDLE state; the device has no radio resources allocated to it in IDLE. When it wants to send or receive data, it has to be promoted to CELL_DCH, where the device is allocated dedicated transport channels in both the downlink and uplink directions. The delay for this promotion is typically ∼2 seconds. In CELL_FACH, the device does not have a dedicated channel, but can transmit at a low rate. This is sufficient for applications with small amounts of data or intermittent data. A device can transition between CELL_DCH and CELL_FACH based on data transmission activity. For example, if a device is inactive for ∼5 seconds, it is demoted from CELL_DCH to CELL_FACH. It is further demoted to IDLE if there is no data exchange for another ∼12 seconds. Note that these state transition timer values are not universal and vary across vendors and carriers.
LTE state machine: LTE employs a slightly modified state machine with two primary states: RRC_IDLE and RRC_CONNECTED. If the device is in RRC_IDLE and sends or receives a packet (regardless of size), a state promotion from RRC_IDLE to RRC_CONNECTED occurs in about 400 msec. LTE makes use of three sub-states within RRC_CONNECTED. Once promoted, the device enters the Continuous Reception state, where it uses considerable power (about 1000 mW) but can send and receive data at high bandwidth. If there is a period of inactivity (e.g., 100 msec), the device enters the Short Discontinuous Reception (Short DRX) state. If data arrives, the radio returns to the Continuous Reception state in ∼400 msec; if not, the device enters the Long Discontinuous Reception (Long DRX) state. In the Long DRX state, the device prepares to switch to the RRC_IDLE state, but is still using high power and waiting for data. If data does arrive within ∼11.5 seconds, the radio returns to the Continuous Reception state; otherwise it switches to the low-power (< 15 mW) RRC_IDLE state. Thus, compared to 3G, LTE has significantly shorter promotion delays. This helps reduce the number of instances where TCP experiences a spurious timeout and hence unnecessary retransmissions.
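To make the timing concrete, here is a small illustrative simulation (our own sketch, using the approximate 3G timer values quoted above: ∼2 s promotion, ∼5 s DCH-to-FACH demotion, ∼12 s FACH-to-IDLE demotion) that reports which state the radio is in after an idle interval and the promotion delay the next packet would pay:

```python
# Minimal sketch of the 3G RRC state machine timing described above.
# Timer values are the approximate ones quoted in the text and vary
# across vendors and carriers.
PROMOTION_DELAY = 2.0   # IDLE -> CELL_DCH, seconds
DCH_TO_FACH = 5.0       # inactivity before CELL_DCH -> CELL_FACH
FACH_TO_IDLE = 12.0     # further inactivity before CELL_FACH -> IDLE

def state_after_idle(idle_time: float) -> str:
    if idle_time < DCH_TO_FACH:
        return "CELL_DCH"
    if idle_time < DCH_TO_FACH + FACH_TO_IDLE:
        return "CELL_FACH"
    return "IDLE"

def extra_delay_for_next_packet(idle_time: float) -> float:
    """Promotion delay the next send/receive pays after `idle_time`."""
    return PROMOTION_DELAY if state_after_idle(idle_time) == "IDLE" else 0.0

for t in (3, 10, 20):
    print(t, state_after_idle(t), extra_delay_for_next_packet(t))
```

The 20-second case is exactly the one that trips TCP: the radio has fallen back to IDLE, so the next packet pays a ∼2 s promotion delay while the RTO still reflects sub-second active-period RTTs.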
5.4 SPDY (NSDI’14)
How Speedy is SPDY?
Xiao Sophia Wang, Aruna Balasubramanian, Arvind Krishnamurthy, and David Wetherall
University of Washington
https://www.usenix.org/conference/nsdi14/technical-sessions/wang

This paper appeared in the Proceedings of the 11th USENIX Symposium on Networked Systems Design and Implementation (NSDI '14), April 2–4, 2014, Seattle, WA, USA. ISBN 978-1-931971-09-6.
Abstract
SPDY is increasingly being used as an enhancement to HTTP/1.1. To understand its impact on performance, we conduct a systematic study of Web page load time (PLT) under SPDY and compare it to HTTP. To identify the factors that affect PLT, we proceed from simple, synthetic pages to complete page loads based on the top 200 Alexa sites. We find that SPDY provides a significant improvement over HTTP when we ignore dependencies in the page load process and the effects of browser computation. Most SPDY benefits stem from the use of a single TCP connection, but the same feature is also detrimental under high packet loss. Unfortunately, the benefits can be easily overwhelmed by dependencies and computation, reducing the improvements with SPDY to 7% for our lower bandwidth and higher RTT scenarios. We also find that request prioritization is of little help, while server push has good potential; we present a push policy based on dependencies that gives comparable performance to mod_spdy while sending much less data.
1 Introduction
HTTP/1.1 has been used to deliver Web pages using multiple, persistent TCP connections for at least the past
decade. Yet as the Web has evolved, it has been criticized for opening too many connections in some settings
and too few connections in other settings, not providing
sufficient control over the transfer of Web objects, and
not supporting various types of compression.
To make the Web faster, Google proposed and deployed a new transport for HTTP messages, called
SPDY, starting in 2009. SPDY adds a framing layer for
multiplexing concurrent application-level transfers over
a single TCP connection, support for prioritization and
unsolicited push of Web objects, and a number of other
features. SPDY is fast becoming one of the most important protocols for the Web; it is already deployed by
many popular websites such as Google, Facebook, and
Twitter, and supported by browsers including Chrome,
Firefox, and IE 11. Further, the IETF is standardizing an HTTP/2.0 proposal that is heavily based on SPDY [10].
Given the central role that SPDY is likely to play in
the Web, it is important to understand how SPDY performs relative to HTTP. Unfortunately, the performance
of SPDY is not well understood. There have been several studies, predominantly white papers, but the findings often conflict. Some studies show that SPDY improves performance [20, 14], while others show that it
provides only a modest improvement [13, 19]. In our own study [25] of page load time (PLT) for the top 200 Web pages from Alexa [1], we found that either SPDY or HTTP could provide better performance by a significant margin, with SPDY performing only slightly better than HTTP in the median case.

As we have looked more deeply into the performance of SPDY, we have come to appreciate why it is challenging to understand. Both SPDY and HTTP performance depend on many factors external to the protocols themselves, including network parameters, TCP settings, and Web page characteristics. Any of these factors can have a large impact on performance, and to understand their interplay it is necessary to sweep a large portion of the parameter space. A second challenge is that there is much variability in page load time (PLT). The variability comes not only from random events like network loss, but also from browser computation (i.e., JavaScript evaluation and HTML parsing). A third challenge is that dependencies between network activities and browser computation can have a significant impact on PLT [25].

In this work, we present what we believe to be the most in-depth study of page load time under SPDY to date. To make it possible to reproduce experiments, we develop a tool called Epload that controls the variability by recording and replaying the process of a page load at fine granularity, complete with browser dependencies and deterministic computational delays; in addition, we use a controlled network environment. The other key to our approach is to isolate the different factors that affect PLT with reproducible experiments that progress from simple but unrealistic transfers to full page loads. By looking at results across this progression, we can systematically isolate the impact of the contributing factors and identify when SPDY helps significantly and when it performs poorly compared to HTTP.

Our experiments progress as follows. We first compare SPDY and HTTP simply as a transport protocol (with no browser dependencies or computation) that transfers Web objects from both artificial and real pages (from the top 200 Alexa sites). We use a decision tree analysis to identify the situations in which SPDY outperforms HTTP and vice versa. We find that SPDY improves PLT significantly in a large number of scenarios that track the benefits of using a single TCP connection. Specifically, SPDY helps for small object sizes and under low loss rates by: batching several small objects in a TCP segment; reducing congestion-induced retransmissions; and reducing the time when the TCP pipe is idle. Conversely, SPDY significantly hurts performance under high packet loss for large objects. This is because a set of TCP connections tends to perform better under high packet loss; it is necessary to tune TCP behavior to boost performance.
Next, we examine the complete Web page load process by incorporating dependencies and computational
delays. With these factors, the benefits of SPDY are reduced, and can even be negated. This is because: i) there
are fewer outstanding objects at a given time; ii) traffic is
less bursty; and iii) the impact of the network is degraded
by computation. Overall, we find SPDY benefits to be
larger when there is less bandwidth and longer RTTs. For
these cases SPDY reduces the PLT for 70–80% of Web
pages, and for shorter, faster links it has little effect, but
it can also increase PLT: the worst 20% of pages see an
increase of at least 6% for long RTT networks.
In search of greater benefits, we explore SPDY mechanisms for prioritization and server push. Prioritization helps little because it is limited by load dependencies, but server push has the potential for significant improvements. How to obtain this benefit depends on the
server push policy, which is a non-trivial issue because of
caching. This leads us to develop a policy based on dependency levels that performs comparably to mod_spdy's
policy [11] while pushing 80% less data.
Our contributions are as follows:
• A systematic measurement study using synthetic
pages and real pages from 200 popular sites that identifies the combinations of factors for which SPDY
improves (and sometimes reduces) PLT compared to
HTTP.
• A page load tool, Epload, that emulates the detailed
page load process of a target page, including its dependencies, while eliminating variability due to browser
computation. With a controlled network environment,
Epload enables reproducible but authentic page load
experiments for the first time.
• A SPDY server push policy based on dependency
information that provides comparable benefits to
mod_spdy while sending much less data over the network.
In the rest of this paper, we first review SPDY background (§2) and then briefly describe our challenge and
approach (§3). Next, we extensively study TCP’s impact on SPDY (§4) and extend to Web page’s impact on
SPDY (§5). We discuss in §6, review related work in §7,
and conclude in §8.
2 Background

In this section, we review issues with HTTP performance and describe how the new SPDY protocol addresses them.

2.1 Limitations of HTTP/1.1
When HTTP/1.1, or simply HTTP, was designed in the
late 1990s, Web applications were fairly simple and
rudimentary. Since then, Web pages have become more
complex and dynamic, making it difficult for HTTP to
meet the increasingly demanding user experience. Below, we identify some of the limitations of HTTP:
i) Browsers open too many TCP connections to load
a page. HTTP improves performance by using parallel
TCP connections. But if the number of connections is
too large, the aggregate flow may cause network congestion, high packet loss, and reduced performance [9]. Further, services often deliver Web objects from multiple domains, which results in even more TCP connections and
the possibility of high packet loss.
ii) Web transfers are strictly initiated from the client.
Consider the loading of embedded objects. Theoretically, the server can send embedded objects along with
the parent object when it receives a request for the parent object. In HTTP, because an object can be sent only
in response to a client request, the server has to wait for
an explicit request which is sent only after the client has
received and processed the parent page.
iii) A TCP segment cannot carry more than one HTTP
request or response. HTTP, TCP and other headers could
account for a significant portion of a packet when HTTP
requests or responses are small. So if there are a large
number of small embedded objects in a page, the overhead associated with these headers is substantial.
2.2 SPDY
SPDY addresses several of the issues described above.
We now review the key ideas in SPDY’s design and implementation and its deployment status.
Design: There are four key SPDY features.
i) Single TCP connection. SPDY opens a single
TCP connection to a domain and multiplexes multiple
HTTP requests and responses (a.k.a., SPDY streams)
over the connection. The multiplexing here is similar to
HTTP/1.1 pipelining but is finer-grained. A single connection also helps reduce SSL overhead. Besides client-side benefits, using a single connection helps reduce the
number of TCP connections opened at servers.
ii) Request prioritization. Some Web objects, such as
JavaScript code modules, are more important than others
and thus should be loaded earlier. SPDY allows the client
to specify a priority level for each object, which is then
used by the server in scheduling the transfer of the object.
iii) Server push. SPDY allows the server to push embedded objects before the client requests them. This
improves latency but could also increase transmitted data
if the objects are already cached at the client.
iv) Header compression. SPDY supports HTTP header compression, since experiments suggest that the HTTP headers of a single session contain duplicate copies of the same information (e.g., User-Agent).
Implementation: SPDY is implemented by adding a framing layer to the network stack between HTTP and the transport layer. Unlike HTTP, SPDY splits HTTP headers and data payloads into two kinds of frames. SYN_STREAM frames carry request headers and SYN_REPLY frames carry response headers. When a header exceeds the frame size, one or more HEADERS frames follow. HTTP data payloads are sliced into DATA frames. There is no standardized value for the frame size, and we find that mod_spdy caps the frame size at 4KB [11]. Because the frame size is the granularity of multiplexing, too large a frame decreases the ability to multiplex, while too small a frame increases overhead. SPDY frames are encapsulated in one or more consecutive TCP segments. A TCP segment can carry multiple SPDY frames, making it possible to batch up small HTTP requests and responses.
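To illustrate the batching, here is a minimal sketch (ours, not the paper's code, following the SPDY/3 DATA-frame layout: a 31-bit stream ID, an 8-bit flags field, a 24-bit length, then the payload) that packs several small frames for different streams into one buffer that could ride in a single TCP segment:

```python
import struct

def data_frame(stream_id: int, payload: bytes, fin: bool = False) -> bytes:
    """Build a SPDY/3 DATA frame: 31-bit stream id, 8-bit flags,
    24-bit payload length, then the payload itself."""
    flags = 0x01 if fin else 0x00                   # FLAG_FIN: last frame
    header = struct.pack(">IB", stream_id & 0x7FFFFFFF, flags)
    header += struct.pack(">I", len(payload))[1:]   # 24-bit big-endian length
    return header + payload

# Three small responses multiplexed on one connection (made-up bodies):
segment = b"".join(
    data_frame(sid, body, fin=True)
    for sid, body in [(1, b"{}"), (3, b"ok"), (5, b"<html/>")]
)
print(len(segment))   # well under a typical 1460-byte MSS
```

The three framed responses together total a few dozen bytes, so they all fit in one segment; this is exactly the batching that HTTP/1.1, with one message per connection, cannot perform.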
Deployment: SPDY is deployed over SSL and TCP. On the client side, SPDY is enabled in Chrome, Firefox, and IE 11. On the server side, popular websites such as Google, Facebook, and Twitter have deployed SPDY. Another popular use of SPDY is between a proxy and a client, as with the Amazon Silk browser [16] and Android Chrome Beta [2]. SPDY version 3 is the most recent specification and is widely deployed [21].
3 Pinning SPDY down

We would like to experimentally evaluate how SPDY performs relative to HTTP, because SPDY is likely to play a key role in the Web. But understanding SPDY performance is hard. Below, we identify three challenges in studying the performance of SPDY and then provide an overview of our approach.

3.1 Challenges

We identify the challenges on the basis of previous studies and our own initial experimentation. As a first step, we extensively load two Web pages a thousand times each using a measurement node at the University of Washington. One page displays fifty world flags [12] and is advertised by mod_spdy [11] to demonstrate the performance benefits of SPDY; the other is the Twitter home page. The results are depicted in Figure 1.

First, we observe that SPDY helps the flag page but not the Twitter page, and it is not immediately apparent why that is the case. Further experimentation in emulated settings also revealed that both the magnitude and the direction of the performance differences vary significantly with network conditions. Taken together, this indicates that SPDY's performance depends on many factors, such as Web page characteristics, network parameters, and TCP settings, and that measurement studies will likely yield different, even conflicting, results if they use different experimental settings. Therefore, a comprehensive sweep of the parameter space is necessary to evaluate under what conditions SPDY helps, what kinds of Web pages benefit most from SPDY, and what parameters best support SPDY.

Second, we observed in our experiments that the measured page load times have high variance, and this often overwhelms the differences between SPDY and HTTP. For example, in Figure 1(b), the variance of the PLT for the Twitter page is 0.5 second, but the PLT difference between HTTP and SPDY is only 0.02 second. We observe high variance even when we load the two pages in a fully controlled network. This indicates that the variability likely stems from browser computation (i.e., JavaScript evaluation and HTML parsing). Controlling this variability is key to reproducing experiments so as to obtain meaningful comparisons.

Third, prior work has shown that the dependencies between network operations and computation have a significant impact on PLT [25]. Interestingly, page dependencies also influence the scheduling of network traffic and affect how much SPDY helps or hurts performance (§4 and §5). Thus, on one hand, ignoring browser computation can reduce PLT variability; on the other hand, dependencies need to be preserved in order to obtain accurate measurements under realistic offered loads.

Figure 1: Distributions of PLTs of SPDY and HTTP for (a) the flag page and (b) the Twitter home page. We performed a thousand runs for each curve, without caching.
3.2 Approach
Our approach is to separate the various factors that affect
SPDY and study them in isolation. This allows us to control and identify the extent to which these factors affect
SPDY.
First, we extensively sweep the parameter space of all
the factors that affect SPDY including RTT, bandwidth,
loss rate, TCP initial window, number of objects on a
page, and object sizes. We initially ignore page load
dependencies and computation in order to simplify our
analysis. This systematic study allows us to identify
when SPDY helps or hurts and characterize the importance of the contributing factors. Based on further analysis of why SPDY sometimes hurts, we propose some simple modifications to TCP.
Second, before we perform experiments with page
load dependencies, we address the variability caused by
computation. We develop a tool called Epload that emulates the process of a page load. Instead of performing
real browser computation, Epload records the process
of a sample page load, identifies when computations happen, and replays the page load by introducing the appropriate delays associated with the recorded computations.
After emulating a computation activity, Epload performs real network requests to dependent Web objects.
This allows us to control the variability of computation
while also modeling page load dependencies. In contrast to the methodology that statistically reduces variability by obtaining a large amount of data (usually from
production), our methodology mitigates the root cause
of variability and thus largely reduces the amount of required experiments.
Third, we study the effects of dependencies and computation by performing page loads with Epload. We are
then able to identify how much dependencies and computation affect SPDY, and to identify the relative importance of other contributing factors. To mitigate the negative impact of dependencies and computation, we explore the use of prioritization and server push that enable
the client and the server to coordinate the transfers. Here,
we are able to evaluate the extent to which these mechanisms can improve performance when used appropriately.
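The record-and-replay idea can be sketched as follows (an illustrative model of ours, not Epload's actual code, and a deliberately sequential simplification of a tool that also preserves parallelism among independent activities). A recorded page load is reduced to a list of activities, each either a computation with a measured duration or a network fetch depending on earlier activities; replay sleeps for the recorded computation times and performs the fetches for real:

```python
import time
import urllib.request
from dataclasses import dataclass, field

@dataclass
class Activity:
    kind: str                                  # "compute" or "fetch"
    duration: float = 0.0                      # recorded computation delay (s)
    url: str = ""                              # object to fetch (for "fetch")
    deps: list = field(default_factory=list)   # indices of prerequisites

def replay(activities):
    """Replay a recorded page load: deterministic delays stand in for
    browser computation, while network requests are issued for real."""
    done = set()
    for i, a in enumerate(activities):
        assert all(d in done for d in a.deps), "dependency order violated"
        if a.kind == "compute":
            time.sleep(a.duration)             # deterministic stand-in
        else:
            urllib.request.urlopen(a.url).read()
        done.add(i)
```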
4 TCP and SPDY

In this section, we extensively study the performance of SPDY as a transfer protocol on both synthetic and real pages, ignoring page load dependencies and computation. This allows us to measure SPDY performance without other confounding factors such as browser computation and page load dependencies. Here, SPDY differs from HTTP only in the use of a single TCP connection, header compression, and a framing layer.

4.1 Experimental setup

We conduct the experiments by setting up a client and a server that can communicate over both HTTP and SPDY. Both the server and the client are connected to the campus LAN at the University of Washington. We use Dummynet [6] to vary network parameters. Below we detail the experimental setup.

Server: Our server is a 64-bit machine with a 2.4GHz 16-core CPU and 16GB of memory. It runs Ubuntu 12.04 with Linux kernel 3.7.5, using the default TCP variant Cubic. We use a TCP initial window size of ten as the default setting, as suggested by SPDY best practices [18]. HTTP and SPDY are enabled on Apache 2.2.2 with the SPDY module, mod_spdy 0.9.3.3-386, installed. We use SPDY version 3 without SSL, which allows us to decode the SPDY frames in TCP payloads. To control the exact size of Web objects, we turn off gzip encoding.

Client: Because we issue requests at the granularity of Web objects and not pages, we do not work with browsers, and instead develop our own SPDY client by following the SPDY/3 specification [21]. Unlike other wget-like SPDY clients such as spdylay [22] that open a TCP connection per request, our SPDY client allows us to reuse TCP connections. Similarly, we also develop an HTTP client for comparison. We set the maximum number of parallel TCP connections for HTTP to six, as used by all major browsers. As the receive window is auto-tuned, it is not a bottleneck in our experiments.

Web pages: To experiment with synthetic pages, we create objects with pre-specified sizes and numbers. To experiment with real pages, we download the home pages of the Alexa top 200 websites to our own server. To avoid the negative impact of domain sharding on SPDY [18], we serve all embedded objects from the same server, including those that are dynamically generated by JavaScript.

We ran the experiments presented in this paper from June to September 2013. We repeat each experiment five times and present the median, to exclude the effects of random loss. We collect network traces at both the client and the server. We define page load time (PLT) as the elapsed time between when the first object is requested and when the last object is received. Because we do not experiment within a browser, we do not use the W3C load event [24].
4.2 Experimenting with synthetic pages
In experimenting with synthetic pages, we consider a
broad range of parameter settings for the various factors
that affect performance. Table 1 summarizes the parameter space used in our experiments. The RTT values include 20ms (intra-coast), 100ms (inter-coast), and 200ms
(3G link or cross-continent). The bandwidths emulate
a broadband link with 10Mbps [4] and a 3G link with
1Mbps [3]. We inject random packet loss rates from zero
to 2%, since studies suggest that Google servers experience a loss rate between 1% and 2% [5]. At the server, we vary the TCP initial window size from 3 (used by earlier Linux kernel versions) to 32 (used by Google servers). We also consider a wide range of Web object sizes (100B to 1M) and object numbers (2 to 512). For simplicity, we choose one value for each factor, which means that there is no cross traffic. When we sweep this large parameter space, we find that SPDY improves performance under certain conditions, but degrades it under others.
  Categ   Factor     Range                        High
  Net     rtt        20ms, 100ms, 200ms           ≥ 100ms
  Net     bw         1Mbps, 10Mbps                ≥ 10Mbps
  Net     pkt loss   0, 0.005, 0.01, 0.02         ≥ 0.01
  TCP     iw         3, 10, 21, 32                ≥ 21
  Page    obj size   100B, 1K, 10K, 100K, 1M      ≥ 1K
  Page    # of obj   2, 8, 16, 32, 64, 128, 512   ≥ 64

Table 1: Contributing factors to SPDY performance. We define a threshold for each factor, so that we can classify a setting as being high or low in our analysis.
Figure 2: The decision tree that tells when SPDY or HTTP helps. A leaf pointing to SPDY (HTTP) means SPDY (HTTP) helps; a leaf pointing to EQUAL means SPDY and HTTP are comparable. Table 1 shows how we define a factor as being high or low.

Figure 3: Performance trends for three factors, (a) packet loss rate, (b) object number, and (c) object size, with a default setting: rtt=200ms, bw=10Mbps, loss=0, iw=10, obj size=10K, obj number=64.
4.2.1 When does SPDY help or hurt
There have been many hypotheses as to whether SPDY helps or hurts, based on analytical inference about parallel versus single TCP connections. For example, one hypothesis is that SPDY hurts because a single TCP connection increases its congestion window more slowly than multiple connections do; another hypothesis is that SPDY helps with stragglers because HTTP has to balance its communications across parallel TCP connections. However, it is unclear how much each hypothesis contributes to SPDY performance. Here, we sort out the most important findings; the hypotheses discussed below contribute more to SPDY performance than those that are not shown.

Methodology: To understand the conditions under which SPDY helps or hurts, we build a predictive model based on decision tree analysis. In the analysis, each configuration is a combination of values for all the factors listed in Table 1. For each configuration, we add an additional variable s, which is the PLT of SPDY divided by that of HTTP. We run the decision tree to predict the configurations under which SPDY outperforms HTTP (s < 0.9) and under which HTTP outperforms SPDY (s > 1.1). The decision tree analysis generates the likelihood that a configuration works better under SPDY (or HTTP). If this likelihood is over 0.75, we mark the branch as SPDY (or HTTP); otherwise, we say that SPDY and HTTP perform equally.

We obtain the decision tree in Figure 2 as follows. First, we produce a decision tree based on all the factors. To populate the branches, we also generate supplemental decision trees based on subsets of factors. Each supplemental decision tree has a prediction accuracy of 84% or higher. Last, we merge the branches from the supplemental decision trees into the original decision tree.
Results: The decision tree shows that SPDY hurts when
packet loss is high. However, SPDY helps under a number of conditions, for example, when there are:
• Many small objects, or small objects under low loss.
• Many large objects under low loss.
• Few objects under good network conditions and a
large TCP initial window.
The decision tree also depicts the relative importance
of contributing factors. Intuitively, factors close to the
root of the decision tree affect SPDY performance more
than those near the leaves. This is because the decision
tree places the important factors near the root to reduce
the number of branches. We find that object size and loss
rate are the most important factors in predicting SPDY
performance. However, RTT, bandwidth, and TCP initial
window play a less important role.
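As a concrete illustration of this methodology (our own sketch, not the authors' code; the factor names and the s thresholds of 0.9/1.1 come from the text above, while the measurements shown are made up), one could fit such a tree with scikit-learn on the swept configurations:

```python
# Illustrative decision-tree analysis over swept configurations.
# Each row: (rtt_ms, bw_mbps, loss, iw, obj_size_bytes, n_obj, s)
# where s = PLT_SPDY / PLT_HTTP measured for that configuration.
from sklearn.tree import DecisionTreeClassifier, export_text

def label(s):
    if s < 0.9:
        return "SPDY"    # SPDY clearly faster
    if s > 1.1:
        return "HTTP"    # HTTP clearly faster
    return "EQUAL"

rows = [  # made-up example measurements, one row per configuration
    (200, 10, 0.00, 10, 1_000,   64, 0.6),
    (200, 10, 0.02, 10, 100_000,  8, 1.4),
    (20,  1,  0.00, 3,  100,      2, 1.0),
]
X = [r[:6] for r in rows]
y = [label(r[6]) for r in rows]

tree = DecisionTreeClassifier(max_depth=4).fit(X, y)
print(export_text(tree,
      feature_names=["rtt", "bw", "loss", "iw", "obj_size", "n_obj"]))
```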
Figure 4: SPDY reduces the number of retransmissions.
How much SPDY helps or hurts: We present three
trending graphs in Figure 3. Figure 3(a) shows that
HTTP outperforms SPDY by half when loss rate increases to 2%, Figure 3(b) shows the trend that SPDY
performs better as the number of objects increases, and
Figure 3(c) shows the trend that SPDY performs worse
as the object size increases. We publish the results, trends, and network traces at http://wprof.cs.washington.edu/spdy/.
4.2.2 Why does SPDY help or hurt

While the decision tree informs us of the conditions under which SPDY helps or hurts, it does not explain why. To this end, we analyze the network traces we collected to explain SPDY's performance. We discuss our findings below.

SPDY helps on small objects. Our traces suggest that TCP implements congestion control by counting outstanding packets, not bytes. Thus, sending a few small objects with HTTP will promptly use up the congestion window, even though the outstanding bytes are far below the window limit. In contrast, SPDY batches small objects and thus eliminates this problem. This explains why the flag page [12], which mod_spdy advertised, benefits from SPDY.

SPDY benefits from having a single connection. We find several reasons why SPDY benefits from a single TCP connection. First, a single connection results in fewer retransmissions. Figure 4 shows the retransmissions in SPDY and HTTP across all configurations except those with zero injected loss. SPDY helps because packet loss occurs more often when concurrent TCP connections are competing with each other. There are additional explanations for why SPDY benefits from using a single connection. In our previous study [25], our experiments showed that SPDY significantly reduced the contribution of the TCP connection setup time to the critical path of a page download. Further, our experiments in §5 will show that a single pipe reduces the amount of time the pipe is idle due to delayed client requests.

SPDY degrades under high loss due to the use of a single pipe. We discussed above that a single TCP connection helps under several conditions. However, a single connection hurts under high packet loss, because it aggressively reduces the congestion window, compared to HTTP, which reduces the congestion window on only one of its parallel connections.
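The packet-counting point can be made concrete with a back-of-the-envelope sketch (our own illustration, with made-up object sizes): when the cwnd is accounted in segments, ten 200-byte HTTP responses sent unbatched consume ten cwnd slots, while SPDY frames and batches them into two full segments' worth:

```python
# cwnd is accounted in segments (packets), not bytes: many small
# objects sent unbatched each burn a full cwnd slot.
MSS = 1460                      # typical TCP payload per segment, bytes
objects = [200] * 10            # ten small responses (made-up sizes)

# HTTP: each small object goes out in its own (mostly empty) segment.
http_segments = len(objects)

# SPDY: objects are framed and batched back-to-back into segments.
total = sum(objects)
spdy_segments = -(-total // MSS)   # ceiling division

print(http_segments, spdy_segments)   # 10 vs 2
```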
4.3 Experimenting with real pages
In this section, we study the effects of varying object
sizes and number of objects based on the distributions
observed in real Web pages. We continue to vary other
factors such as network conditions and TCP settings
based on the parameter space described in Table 1. Due
to space limits, we show results only under a 10Mbps
bandwidth.
First, we examine the page characteristics of real
pages because they can explain why SPDY helps or hurts
when we relate them to the decision tree. Figure 5 shows
the characteristics of the top 200 Alexa Web pages [1].
The median number of objects is 30 and the median page
size is 750KB. We find high variability in the size of objects within a page. The standard deviation of the object
size within a page is 31KB (median), even more than the
average object size 17KB (median).
Figure 6 shows PLT of SPDY divided by that of HTTP
across the 200 Web pages. It suggests that SPDY helps
on 70% of the pages consistently across network conditions. Interestingly, SPDY shows a 2x speedup over
half of the pages, likely due to the following reasons.
First, SPDY almost eliminates retransmissions (as indicated in Figure 7). Compared to a similar analysis for artificial pages (see Figure 4), SPDY’s retransmission rate
is even lower. Second, we find in Figure 5(b) that 80% of
the pages have small objects, and that half of the pages
have more than ten small objects. Since SPDY helps
with small objects (based on the decision tree analysis),
it is not surprising that SPDY has lower PLT for this set
of experiments. In addition, we hypothesize that SPDY
could help with stragglers since it multiplexes all objects
onto a single connection and thus reduces the dynamics of congestion windows. To check this hypothesis, we
ran a set of experiments with the overall page size and the number of objects drawn from the real pages, but with equal object sizes embedded inside the pages. When we perform this experiment, HTTP's performance improves only marginally, indicating that there is very little straggler effect.
While the decision tree informs the conditions under
which SPDY helps or hurts, it does not explain why. To
this end, we analyze the network traces we collected to
explain SPDY performance. We discuss below our findings.
SPDY helps on small objects. Our traces suggest that
TCP implements congestion control by counting outstanding packets not bytes. Thus, sending a few small
objects with HTTP will promptly use up the congestion window, though outstanding bytes are far below the
window limit. In contrast, SPDY batches small objects
and thus eliminates this problem. This explains why the
flag page [12], which mod spdy advertised, benefits from
SPDY.
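Since the transcription drops the figures, a little arithmetic may help make the packet-counting argument concrete. The sketch below is our own toy illustration, not code from the paper; the MSS, window size, and object sizes are assumed values.

    # Toy model of a congestion window counted in PACKETS, not bytes.
    MSS = 1460               # bytes per full-sized packet (assumed)
    CWND_PKTS = 10           # congestion window, in packets (assumed)
    obj_sizes = [200] * 10   # ten small objects, 200 bytes each (assumed)

    # HTTP/1.x: each small object goes out in (at least) its own packet,
    # so ten tiny objects consume the entire 10-packet window.
    http_pkts = len(obj_sizes)                 # 10 packets -> window full
    http_bytes_in_flight = sum(obj_sizes)      # only 2000 bytes in flight

    # SPDY: objects are multiplexed on one stream, so the same bytes are
    # batched into as few full-sized packets as possible.
    spdy_pkts = -(-sum(obj_sizes) // MSS)      # ceil(2000/1460) = 2 packets

    print(f"byte budget the window would allow: {CWND_PKTS * MSS} bytes")
    print(f"HTTP uses {http_pkts}/{CWND_PKTS} window slots "
          f"for {http_bytes_in_flight} bytes")
    print(f"SPDY uses {spdy_pkts}/{CWND_PKTS} window slots for the same bytes")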
SPDY benefits from having a single connection. We find several reasons as to why SPDY benefits from a single TCP connection. First, a single connection results in fewer retransmissions. Figure 4 shows the retransmissions in SPDY and HTTP across all configurations except those with zero injected loss. SPDY helps because packet loss occurs more often when concurrent TCP connections are competing with each other. There are additional explanations for why SPDY benefits from using a single connection. In our previous study [25], our experiments showed that SPDY significantly reduced the contribution of the TCP connection setup time to the critical path of a page download. Further, our experiments in §5 will show that a single pipe reduces the amount of time the pipe is idle due to delayed client requests.

SPDY degrades under high loss due to the use of a single pipe. We discussed above that a single TCP connection helps under several conditions. However, a single connection hurts under high packet loss, because it aggressively reduces the congestion window, whereas HTTP reduces the congestion window on only one of its parallel connections.

[Figure 5 omitted: CDFs of page characteristics; panels (a) # of objects, (b) # of objects < 1.5KB, (c) Page size (KB), (d) Mean object size (KB)]
Figure 5: Characteristics of top 200 Alexa Web pages.

4.3 Experimenting with real pages

In this section, we study the effects of varying object sizes and numbers of objects based on the distributions observed in real Web pages. We continue to vary other factors, such as network conditions and TCP settings, based on the parameter space described in Table 1. Due to space limits, we only show results under a 10Mbps bandwidth.

First, we examine the page characteristics of real pages, because relating them to the decision tree can explain why SPDY helps or hurts. Figure 5 shows the characteristics of the top 200 Alexa Web pages [1]. The median number of objects is 30 and the median page size is 750KB. We find high variability in the size of objects within a page: the standard deviation of object sizes within a page is 31KB (median), even more than the average object size of 17KB (median).

Figure 6 shows the PLT of SPDY divided by that of HTTP across the 200 Web pages. It suggests that SPDY helps on 70% of the pages consistently across network conditions. Interestingly, SPDY shows a 2x speedup over half of the pages, likely for the following reasons. First, SPDY almost eliminates retransmissions (as indicated in Figure 7). Compared to a similar analysis for artificial pages (see Figure 4), SPDY's retransmission rate is even lower. Second, we find in Figure 5(b) that 80% of the pages have small objects, and that half of the pages have more than ten small objects. Since SPDY helps with small objects (based on the decision tree analysis), it is not surprising that SPDY has lower PLT for this set of experiments. In addition, we hypothesize that SPDY could help with stragglers, since it multiplexes all objects onto a single connection and thus reduces the dynamics of congestion windows. To check this hypothesis, we ran a set of experiments with the overall page size and the
number of objects drawn from the real pages, but with equal object sizes embedded inside the pages. When we perform this experiment, HTTP's performance improves only marginally, indicating that there is very little straggler effect.

[Figure 6 omitted: CDFs of PLT of SPDY divided by PLT of HTTP under 0 and 2% loss; panels (a) rtt=20ms, bw=10Mbps and (b) rtt=200ms, bw=10Mbps]
Figure 6: SPDY performance across 200 pages with object sizes and numbers of objects drawn from real pages. SPDY helps more under a 1Mbps bandwidth.

[Figure 7 omitted: CDF of the # of retransmits for HTTP and SPDY w/ TCP]
Figure 7: SPDY helps reduce retransmissions.

[Figure 8 omitted: CDFs of PLT of SPDY divided by PLT of HTTP with TCP and TCP+, under 0 and 2% loss]
Figure 8: TCP+ helps SPDY across the 200 pages. RTT=20ms, BW=10Mbps. Results on other network settings are similar.

[Figure 9 omitted: CDF of the # of retransmits for HTTP and SPDY w/ TCP+]
Figure 9: With TCP+, SPDY still produces few retransmissions.

4.4 TCP modifications

Previously, we found that SPDY hurts mainly under high packet loss, because a single TCP connection reduces the congestion window more aggressively than HTTP's parallel connections. Here, we demonstrate that this negative impact can be mitigated by simple TCP modifications.

Our modification (a.k.a. TCP+) mimics the behavior of concurrent connections with a single connection. Let the number of parallel TCP connections be n. First, we propose to multiply the initial window by n, to reduce the effect of slow start. Second, we suggest scaling the receive window by n, to ensure that the SPDY connection has the same amount of receive buffer as HTTP's parallel connections. Third, when packet loss occurs, the congestion window (cwnd) backs off with a rate β′ = 1 − (1 − β)/n, where β is the original backoff rate. In practice, the number of concurrent connections changes over time. Because we are unable to pass this value to the Linux kernel in real time, we assume that HTTP uses six connections and set n = 6. We use six here because it is found to be optimal and is used by major browsers [17].

We perform the same set of SPDY experiments with both synthetic and real pages using TCP+. Figure 8 shows that SPDY performs better with TCP+, and the decision tree analysis for TCP+ suggests that the loss rate is no longer a key factor determining SPDY performance. To evaluate the potential side effects of TCP+, we look at the number of retransmissions produced by TCP+. Figure 9 shows that SPDY still produces far fewer retransmissions with TCP+ than HTTP does, meaning that TCP+ does not abuse the congestion window under the conditions that we experimented with. Here, we aim to demonstrate that SPDY's negative impact under high random loss can be mitigated by tuning the congestion window. Because the loss patterns in real networks are likely more complex, a solution for real networks requires further consideration and extensive evaluation, and is out of the scope of this paper.
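The three TCP+ tweaks are easy to express. The following Python sketch is our own paraphrase of the idea, not the authors' kernel patch; the initial window, receive window, and β values are assumptions (n = 6 follows the text, and we interpret β as the fraction of cwnd kept after a loss, e.g. a CUBIC-like 0.7).

    n = 6                # emulated number of parallel connections [17]
    init_cwnd = 10       # initial congestion window in packets (assumed)
    rwnd = 65535         # receive window in bytes (assumed)
    beta = 0.7           # fraction of cwnd kept after one loss (assumed)

    # 1) scale the initial window to reduce the effect of slow start
    init_cwnd_plus = n * init_cwnd

    # 2) scale the receive window so the single SPDY connection gets the
    #    same receive buffer as n parallel HTTP connections
    rwnd_plus = n * rwnd

    # 3) on loss, only 1 of n emulated connections backs off, so the
    #    aggregate keeps ((n-1) + beta)/n of its window, i.e.:
    beta_plus = 1 - (1 - beta) / n   # = 0.95 for n=6, beta=0.7

    def on_loss(cwnd: float) -> float:
        """Congestion window after one loss event under TCP+."""
        return cwnd * beta_plus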
5 Web pages and SPDY

This section examines how SPDY performs for real Web pages. Real page loads incur dependencies and computation that may affect SPDY's performance. To incorporate dependencies and computation while controlling variability, we develop a page load emulator, Epload,
that hides the complexity and variations in browser computation while performing authentic network requests (§5.1). We use Epload to identify the effect of page load dependencies and computation on SPDY's performance (§5.2). We further study SPDY's potential by examining prioritization and server push (§5.3).

[Figure 10 omitted: activities (loading/parsing HTML; loading and evaluating CSS, JS1, JS2; loading an Image) laid out on a network/computation timeline and linked by dependency arrows]
Figure 10: A dependency graph obtained from WProf.

[Figure 11 omitted: CDF of page load time (seconds) for Chrome vs. Epload]
Figure 11: Page loads using Chrome vs. Epload.

5.1 Epload: emulating page loads

Web objects in a page are usually not loaded at the same time, because loading an object can depend on loading or evaluating other objects. Therefore, not only network conditions, but also page load dependencies and browser computation affect page load times. To study how much SPDY helps the overall page load time, we need to evaluate SPDY's performance while preserving the dependencies and computation of real page loads. Dependencies and computation are naturally preserved by loading pages in real browsers. However, this procedure incurs high variance in page load times, stemming from both network conditions and browser computation. We have conducted controlled experiments to control the variability of the network, and here we introduce the Epload emulator to control the variability of computation.

Design: The key idea of Epload is to decouple network operations and computation in page loads. This allows Epload to simplify computation while scheduling network requests at the appropriate points during the page load.

Epload records the process of a page load by capturing the dependency graph using our previous work, WProf [25]. WProf captures the dependency and timing information of a page load. Figure 10 shows an example of a dependency graph obtained from WProf, where activities depend on each other. This Web page embeds a CSS, a JavaScript, an image, and another JavaScript. A bar represents an activity (i.e., loading objects, evaluating CSS and JavaScript, parsing HTML), while an arrow represents that one activity depends on another. For example, evaluating JS1 depends on both loading JS1 and evaluating the CSS. Therefore, evaluating JS1 can only start after the other two activities complete. There are other dependencies, such as layout and painting. Because they do not occur deterministically and significantly, we exclude them here.

Using the recorded dependency graph, Epload replays the page load process as follows. First, Epload starts the activity that loads the root HTML. When the activity is finished, Epload checks whether it should trigger a dependent activity, based on whether all activities that the dependent activity depends on are finished. For example, in Figure 10, the dependent activity is parsing the HTML, and it should be triggered. Next, Epload starts the activity that parses the HTML. Instead of performing the HTML parsing, it waits for the amount of time that parsing takes (based on the recorded information) and checks dependent activities upon completion. This proceeds until all activities are finished. The actual replay process is more complex, because a dependent activity can start before an activity is fully completed. For example, parsing an HTML starts after the first chunk of the HTTP response is received, and loading the CSS starts after the first chunk of HTML is fully parsed. Epload models all of these aspects of a page load.
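To make the replay loop concrete, here is a condensed Python sketch of the process just described. It is our own reconstruction under simplifying assumptions (activities complete atomically, sequential execution, no partial-chunk triggering), not the Epload sources; the fetch callback and activity names are placeholders.

    import time

    class Activity:
        def __init__(self, name, kind, duration, deps):
            self.name = name          # e.g., "parse-html" (hypothetical)
            self.kind = kind          # "network" or "computation"
            self.duration = duration  # recorded delay in seconds
            self.deps = deps          # names of activities it depends on
            self.done = False

    def replay(graph, fetch):
        """graph: dict name -> Activity; fetch issues a real HTTP/SPDY request."""
        pending = dict(graph)
        while pending:
            # fire every activity whose dependencies are all finished
            ready = [a for a in pending.values()
                     if all(graph[d].done for d in a.deps)]
            if not ready:
                raise ValueError("dependency cycle or missing activity")
            for a in ready:
                if a.kind == "network":
                    fetch(a.name)            # authentic network request
                else:
                    time.sleep(a.duration)   # replay recorded computation delay
                a.done = True
                del pending[a.name]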
Implementation: The Epload recorder is implemented on top of WProf, to generate a dependency graph that specifies activities and their dependencies. Epload records the computational delays while performing the page load in the browser, whereas the network delays are realized independently for each replay run. We implement the Epload replayer using node.js. The output of the Epload replayer is a series of throttled HTTP or SPDY requests that perform a page load. The Epload code is available at http://wprof.cs.washington.edu/spdy/.

Evaluation: We validate that Epload controls the variability of computation. We compare the differences between two runs across 200 pages loaded by Epload and by Chrome. The network is tuned to a 20ms RTT, a 10Mbps bandwidth, and zero loss. Figure 11 shows that Epload produces at most 5% differences for over 80% of the pages, which is a 90% reduction compared to Chrome.

5.2 Effects of dependencies and computation

We use Epload to measure the impact of dependencies and computation. We set up the experiments as follows. The Epload recorder uses a WProf-instrumented Chrome to
obtain the dependency graphs of the top 200 Alexa Web pages [1]. Epload runs on a Mac with a 2GHz dual-core CPU and 4GB of memory. We vary other factors based on the parameter space described in Table 1. Due to space limits, we only show figures under a 10Mbps bandwidth.

[Figure 12 omitted: CDFs of PLT of SPDY divided by PLT of HTTP for 0 and 2% loss, with TCP and TCP+; panels (a) rtt=20ms, bw=1Mbps; (b) rtt=200ms, bw=1Mbps; (c) rtt=20ms, bw=10Mbps; (d) rtt=200ms, bw=10Mbps]
Figure 12: SPDY performance using emulated page loads. Compared to Figure 6, it suggests that dependencies and computation reduce the impact of SPDY and that RTT and bandwidth become more important.

Figure 12 shows the performance of SPDY versus HTTP after incorporating dependencies and computation. Compared to Figure 6, dependencies and computation largely reduce the amount that SPDY helps or hurts. We make the following observations, along with supporting evidence. First, computation and dependencies increase the PLTs of both HTTP and SPDY, reducing the network load. Second, SPDY reduces the amount of time a connection is idle, lowering the possibility of slow start (see Figure 13). Third, dependencies help HTTP by making traffic less bursty, resulting in fewer retransmissions (see Figure 14). Fourth, having fewer outstanding objects diminishes SPDY's gains, because SPDY helps more when there is a large number of outstanding objects (as suggested by the decision tree in Figure 2). Here, we see that dependencies and computation reduce, and can easily nullify, the benefits of SPDY, implying that speeding up computation or breaking dependencies might be necessary to improve the PLT using SPDY.

[Figure 13 omitted: CDF of the fraction of idle RTTs for HTTP and SPDY]
Figure 13: Fractions of RTTs when a TCP connection is idle. Experimented under a 2% loss rate.

Interestingly, we find that RTT and bandwidth now play a more important role in the performance of SPDY. For example, Figure 12 shows that SPDY helps up to 80% of the pages under low bandwidths, but only 55% of the pages under high bandwidths. This is because RTT and bandwidth determine the amount of time page loads spend in the network relative to computation, and thus the amount of impact that computation has on SPDY. This explains why SPDY provides minimal improvements under good network conditions (see Figure 12(c)).

To identify the impact of computation, we scale the time spent in each computation activity by factors of 0, 0.5, and 2. Figure 15 shows the performance of SPDY versus HTTP, both with scaled computation and under high bandwidths, suggesting that speeding up computation increases the impact of SPDY. Surprisingly, speeding up computation to the extreme is sometimes no better than a 2x speedup. This is because computation delays the requesting of dependent objects, which allows previously requested objects to be loaded faster, and therefore possibly lowers the PLT.

5.3 Advancing SPDY

SPDY provides two mechanisms, i) prioritization and ii) server push, to mitigate the negative effects of the dependencies and computation of real page loads. However, little is known about how to best use these mechanisms. In this section, we explore advanced policies to speed up page
loads using these mechanisms.

[Figure 14 omitted: CDFs of the # of retransmits, w/o dep. & comp. vs. w/ dep. & comp.]
Figure 14: SPDY helps reduce retransmissions.

[Figure 15 omitted: CDFs of PLT of SPDY divided by PLT of HTTP with computation scaled by x0, x0.5, x1, and x2]
Figure 15: Results by varying computation when bw=10Mbps, rtt=200ms.

[Figure 16 omitted: a WProf dependency graph over activities H, C, J1, I1, J2, I2, converted to an object dependency graph with per-object depths d:1 to d:4]
Figure 16: Converting the WProf dependency graph to an object-based graph, and calculating a depth for each object in the object-based graph.

[Figure 17 omitted: PLT_w/_priority / PLT_w/o_priority at the 10th, 50th, and 90th percentiles for chrome-priority and dependency-priority; panels (a) rtt=20ms, bw=10Mbps and (b) rtt=200ms, bw=10Mbps]
Figure 17: Results of priority (zero packet loss) when bw=10Mbps. bw=1Mbps results are similar to (b).

5.3.1 Basis of advancing

To better schedule objects, both prioritization and server push provide mechanisms to specify the importance of each object. Thus, the key issue is to identify the importance of objects in an automatic manner. To highlight the benefits, we leverage the dependency information obtained from a previous load of the same page. This information gives us ground truth as to which objects are critical for reducing PLT. For example, in Figure 10, all the activities depend on loading the HTML, making the HTML the most important object; but no activity depends on loading the image, suggesting that the image is not an important object.

To quantify the importance of an object, we first look at the time required to finish the page load starting from the load of this object. We denote this as the time to finish (TTF). In Figure 10, the TTF of the image is simply the time to load the image alone, while the TTF of JS2 is the time to both load and evaluate it. Because the TTF of the image is longer than the TTF of JS2, the image is more important than JS2. Unfortunately, in practice it is not clear how long it would take to load an object before we make the decision to prioritize or push it.

Therefore, we simplify the definition of importance. First, we convert the activity-based dependency graph to an object-based graph by eliminating computation while preserving dependencies (Figure 16). Second, we calculate the longest path from each object to the leaf objects; this process is equivalent to calculating the node depths of a directed acyclic graph. Figure 16 (right) shows an example of assigned depths. Note that the depth here equals the TTF if we ignore computation and suppose that the load of each object takes the same amount of time.

We use this depth information to prioritize and push objects. This implies that the browser or the server should know this information beforehand. We provide a tool that lets Web developers measure the depth information for the objects transported by their pages.
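The depth computation is a standard longest-path-to-leaf pass over a DAG. A minimal Python sketch follows; the example edge map is hypothetical (loosely modeled on the HTML/CSS/JS/image page of Figure 10), not taken from the paper's tool.

    from functools import lru_cache

    # object -> objects whose load depends on it (hypothetical example)
    children = {
        "html": ["css", "js1", "img"],
        "css":  ["js1"],
        "js1":  ["js2"],
        "js2":  [],
        "img":  [],
    }

    @lru_cache(maxsize=None)
    def depth(obj: str) -> int:
        """Longest path from obj down to a leaf; leaves get depth 1."""
        kids = children[obj]
        return 1 if not kids else 1 + max(depth(k) for k in kids)

    depths = {o: depth(o) for o in children}
    # -> {"html": 4, "css": 3, "js1": 2, "js2": 1, "img": 1}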
5.3.2 Prioritization

SPDY/3 allows eight priority levels for clients to use when requesting objects. The SPDY best practices website [18] recommends prioritizing HTML over CSS/JavaScript, and CSS/JS over the rest (chrome-priority). Our priority levels are obtained by linearly mapping the depth information computed above (dependency-priority).

We compare the two prioritization policies to baseline SPDY in Figure 17. Interestingly, we find that there is almost no benefit to using chrome-priority, while dependency-priority marginally helps under a 20ms RTT. The impact of explicit prioritization is minimal because the dependency graph has already implicitly prioritized objects. Implicit prioritization results from browser policies, independent of the Web pages themselves. For example, in Figure 10, no other object can be loaded before the HTML, and the Image and JS2 cannot be loaded before the CSS and JS1. As dependencies limit the impact of SPDY and prioritization cannot break dependencies, prioritization is unlikely to improve SPDY's PLT.
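The text only says that the depths are mapped linearly onto SPDY/3's eight priority levels; one plausible reading (an assumption on our part, not the paper's code) is the following, where the deepest object receives the most urgent priority 0 and leaves receive priority 7.

    def to_priority(depth: int, max_depth: int) -> int:
        """Linearly map depth in [1, max_depth] onto SPDY/3 priority 7..0."""
        if max_depth <= 1:
            return 0
        return round((max_depth - depth) * 7 / (max_depth - 1))

    depths = {"html": 4, "css": 3, "js1": 2, "js2": 1, "img": 1}  # assumed
    priorities = {o: to_priority(d, 4) for o, d in depths.items()}
    # -> html: 0 (most urgent), css: 2, js1: 5, js2/img: 7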
[Figure 18 omitted: (a) CDF of the % of pushed bytes, plus CDFs of PLT with server push divided by SPDY PLT for the Push all, By embedding, and By dependency policies; panels (b) rtt=20ms, bw=10Mbps and (c) rtt=200ms, bw=10Mbps]
Figure 18: Results of server push when bw=10Mbps.

5.3.3 Server push

SPDY allows servers to push objects to save round trips. However, server push is non-trivial, because there is a tension between making page loads faster and wasting bandwidth. In particular, one should not overuse server push if pushed objects are already cached. Thus, the key goal is to speed up page loads while keeping the cost low. We find no standard or best-practices guidance from Google on how to do server push. mod_spdy can be configured to push up to an embedding level, which is defined as follows: the root HTML page is at embedding level 0; objects at embedding level i are those whose URLs are embedded in objects at embedding level i − 1. An alternative policy is to push based on the depth information.
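The embedding-level definition above translates directly into a breadth-first traversal. The sketch below is our own illustration; the embeds map and file names are hypothetical, and we keep the smallest level when an object is embedded in several places (one plausible reading of the definition).

    from collections import deque

    embeds = {                     # object -> URLs embedded in it (assumed)
        "index.html": ["style.css", "app.js"],
        "style.css":  ["bg.png"],
        "app.js":     [],
        "bg.png":     [],
    }

    def embedding_levels(root: str) -> dict:
        """Root HTML is level 0; level i objects are embedded in level i-1."""
        level = {root: 0}
        queue = deque([root])
        while queue:
            obj = queue.popleft()
            for child in embeds.get(obj, []):
                if child not in level:        # keep the first (smallest) level
                    level[child] = level[obj] + 1
                    queue.append(child)
        return level

    # push policy "one embedding level": push every object at level 1
    to_push = [o for o, l in embedding_levels("index.html").items() if l == 1]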
Figure 18 shows server push performance (i.e., push all objects, one embedding level, and one dependency level) compared to baseline SPDY. We find that server push helps, especially under high RTT. We also find that pushing by dependency yields speedups comparable to pushing by embedding, while benefiting from an 80% reduction in pushed bytes (Figure 18(a)). Note that server push does not always help, because pushed objects share bandwidth with more important objects. In contrast to prioritization, server push can help because it breaks the dependencies which limit the performance gains of SPDY.

5.4 Putting it all together

We now pool together the various enhancements (i.e., TCP+ and server push by one dependency level). Figure 19 shows that this improves SPDY by 30% under high RTTs. But this improvement largely diminishes under low RTTs, where computation dominates page load times.

[Figure 19 omitted: CDFs of PLT of SPDY divided by PLT of HTTP for None, TCP+, and TCP+; push; panels (a) rtt=20ms, bw=10Mbps and (b) rtt=200ms, bw=10Mbps]
Figure 19: Putting it all together when bw=10Mbps.

[Figure 20 omitted: CDF of PLT of SPDY divided by PLT of HTTP for No sharding, By domain, and By TLD]
Figure 20: Results of domain sharding when bw=10Mbps and rtt=20ms.

6 Discussions

SPDY in the wild: To evaluate SPDY in the wild, we place clients in Virginia (US-East), Northern California (US-West), and Ireland (Europe) using Amazon EC2 micro-instances. We add explanatory power by periodically probing network parameters between the clients and the server, and find that RTTs are consistent: 22ms (US-East), 71ms (US-West), and 168ms (Europe). For all vantage points, bandwidths are high (10Mbps to 143Mbps) and loss rates are extremely low. These network parameters explain well our SPDY evaluations in the wild (not shown due to space limits), which are similar to the synthetic ones under high bandwidths and low loss rates. The evaluations here are preliminary, and covering a complete set of scenarios would be future work.

Domain sharding: As suggested by the SPDY best practices [18], we used a single connection to fetch all the objects of a page, to eliminate the negative impact of domain sharding. In practice, migrating objects to one domain suffers from deployment issues, given the popular use of third parties (e.g., CDNs, Ads, and Analytics). To this
end, we evaluate situations in which objects are distributed to multiple servers that cooperatively use SPDY. We distribute objects by full domain, to represent the state of the art of domain sharding. We also distribute objects by top-level domain (TLD). This demonstrates the situation in which websites have eliminated domain sharding but still use third-party services. Figure 20 compares SPDY performance under these object distributions. We find that domain sharding hurts as expected, but hosting objects by TLD is comparable to using one connection, suggesting that SPDY's performance does not degrade much when some portions of the page are provided by third-party services.

SSL: SSL adds overhead to page loads, which can degrade the impact of SPDY, but SPDY keeps the handshake overhead low by using a single connection. We conduct our experiments using SSL and find that the overhead of SSL is too small to affect SPDY's performance.

Mobile: We perform a small set of SPDY measurements under mobile environments. We assume large RTTs, low bandwidths, high losses, and large computational delays, as suggested by the related literature [3, 26]. Results with simulated slow networks suggest that SPDY helps more but also hurts more. They also show that prioritization and server push by dependency help less (not shown due to space limits). However, large computational delays on mobile devices reduce the benefits provided by SPDY. This means that the benefits of SPDY under mobile scenarios depend on the relative changes in the performance of the network and of computation. Further studies on real mobile devices and networks would advance the understanding in this space.

Advanced SPDY mechanisms: There are no recommended policies on how to use the server push mechanism. We find that mod_spdy [11] implements server push by embedding levels. However, we find that this push policy wastes bandwidth. We provide a server push policy based on dependency levels that performs comparably to mod_spdy's while pushing 80% less data.

Limitations: Our work does not consider a number of aspects. First, we did not evaluate the effects of header compression, because it is not expected to provide significant benefits. Second, we did not evaluate dynamic pages, which take more time in server processing. Similar to browser computation, server processing will likely reduce the impact of SPDY. Last, we are unable to evaluate SPDY under production servers where the network is heavily used.

7 Related Work

SPDY studies: Erman et al. [7] studied SPDY in the wild on 20 Web pages, using cellular connections and SPDY proxies. They found that SPDY performed poorly while interacting with radios, due to a large body of unnecessary retransmissions. We used more reliable connections, enabled SPDY on servers, and swept a more complete parameter space. Other SPDY studies include the SPDY white paper [20] and measurements by Microsoft [14], Akamai [13], and Cable Labs [19]. The SPDY white paper shows a 27% to 60% speedup for SPDY, but the other studies show that SPDY helps only marginally. While providing invaluable measurements, these studies look at a limited parameter space. The studies by Microsoft [14] and Cable Labs [19] only measured single Web pages, and the other studies consider only a limited set of network conditions. Our study extensively swept the parameter space, including network parameters, TCP settings, and Web page characteristics. We are the first to isolate the effect of dependencies, which are found to limit the impact of SPDY.

TCP enhancements for the Web: Google has proposed and deployed several TCP enhancements to make the Web faster. TCP Fast Open eliminates the TCP connection setup time by sending application data in the SYN packet [15]. Proportional rate reduction smoothly backs off the congestion window to transmit more data under packet loss [5]. Tail loss probe [23] and other measurement-driven enhancements described in [8] mitigate or eliminate loss recovery by retransmission timeout. Our TCP modifications are specific to SPDY and are orthogonal to Google's proposals.

8 Conclusion

Our experiments and prior work show that SPDY can either help or sometimes hurt the load times of real Web pages by browsers, compared to using HTTP. To learn which factors lead to performance improvements, we start with simple, synthetic page loads and progressively add key features of the real page load process. We find that most of the performance impact of SPDY comes from its use of a single TCP connection: when there is little network loss, a single connection tends to perform well, but when there is high loss, a set of connections tends to perform better. However, the benefits of a single TCP connection can easily be overwhelmed by dependencies in real Web pages and by browser computation. We conclude that further benefits in PLT will require changes that restructure the page load process, such as the server push feature of SPDY, as well as careful configuration at the TCP level to ensure good network performance.

Acknowledgements

We thank Will Chan, Yu-Chung Cheng, and Roberto Peon from Google, our shepherd, Sanjay Rao, and the anonymous reviewers for their feedback. We thank Ruhui Yan for helping analyze packet traces.
References

[1] Alexa - The Web Information Company. http://www.alexa.com/topsites/countries/US.
[2] Data compression in Chrome Beta for Android. http://blog.chromium.org/2013/03/data-compression-in-chrome-beta-for.html.
[3] A. Balasubramanian, R. Mahajan, and A. Venkataramani. Augmenting Mobile 3G Using WiFi. In Proc. of the International Conference on Mobile Systems, Applications, and Services (MobiSys), 2010.
[4] National Broadband Map. http://www.broadbandmap.gov/.
[5] N. Dukkipati, M. Mathis, Y. Cheng, and M. Ghobadi. Proportional Rate Reduction for TCP. In Proc. of the SIGCOMM Internet Measurement Conference (IMC), 2011.
[6] Dummynet. http://info.iet.unipi.it/~luigi/dummynet/.
[7] J. Erman, V. Gopalakrishnan, R. Jana, and K. Ramakrishnan. Towards a SPDYier Mobile Web? In Proc. of the International Conference on emerging Networking EXperiments and Technologies (CoNEXT), 2013.
[8] T. Flach, N. Dukkipati, A. Terzis, B. Raghavan, N. Cardwell, Y. Cheng, A. Jain, S. Hao, E. Katz-Bassett, and R. Govindan. Reducing Web Latency: the Virtue of Gentle Aggression. In Proc. of ACM SIGCOMM, 2013.
[9] T. J. Hacker, B. D. Noble, and B. D. Athey. The Effects of Systemic Packet Loss on Aggregate TCP Flows. In Proc. of the IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2002.
[10] HTTP/2.0 Draft Specifications. https://github.com/http2/http2-spec.
[11] mod_spdy. https://code.google.com/p/mod-spdy/.
[12] World Flags mod_spdy Demo. https://www.modspdy.com/world-flags/.
[13] Not as SPDY as you thought. http://www.guypo.com/technical/not-as-spdy-as-you-thought/.
[14] J. Padhye and H. F. Nielsen. A comparison of SPDY and HTTP performance. MSR-TR-2012-102.
[15] S. Radhakrishnan, Y. Cheng, J. Chu, A. Jain, and B. Raghavan. TCP Fast Open. In Proc. of the International Conference on emerging Networking EXperiments and Technologies (CoNEXT), 2011.
[16] Amazon Silk browser. http://amazonsilk.wordpress.com/.
[17] Chapter 11. HTTP 1.X. http://chimera.labs.oreilly.com/books/1230000000545/ch11.html.
[18] SPDY best practices. http://dev.chromium.org/spdy/spdy-best-practices.
[19] Analysis of SPDY and TCP Initcwnd. http://tools.ietf.org/html/draft-white-httpbis-spdy-analysis-00.
[20] SPDY whitepaper. http://www.chromium.org/spdy/spdy-whitepaper.
[21] SPDY protocol - Draft 3. http://www.chromium.org/spdy/spdy-protocol/spdy-protocol-draft3.
[22] Spdylay - SPDY C Library. https://github.com/tatsuhiro-t/spdylay.
[23] Tail Loss Probe (TLP): An Algorithm for Fast Recovery of Tail Losses. http://tools.ietf.org/html/draft-dukkipati-tcpm-tcp-loss-probe-01.
[24] W3C DOM Level 3 Events Specification. http://www.w3.org/TR/DOM-Level-3-Events/.
[25] X. S. Wang, A. Balasubramanian, A. Krishnamurthy, and D. Wetherall. Demystifying Page Load Performance with WProf. In Proc. of the USENIX Conference on Networked Systems Design and Implementation (NSDI), 2013.
[26] X. S. Wang, H. Shen, and D. Wetherall. Accelerating the Mobile Web with Selective Offloading. In Proc. of the ACM SIGCOMM Workshop on Mobile Cloud Computing (MCC), 2013.
6 Sample exam (contrôle de connaissances, CC)

Note. During the term of the exam attached as an example, the mandatory readings were the following. The multiple-choice (QCM) questions concerning the mandatory readings (LO) are therefore not necessarily relevant for the current term, and are given for reference only.

• S. Alcock, R. Nelson, "Application flow control in YouTube video streams," ACM SIGCOMM CCR, Vol. 41, No. 2, April 2011
• A. Finamore et al., "YouTube everywhere: Impact of Device and Infrastructure Synergies on User Experience," ACM IMC'11, Nov 2011
• P. Marciniak et al., "Small is not always beautiful," USENIX IPTPS'08, Tampa Bay, FL, Feb 2008
• A. Legout et al., "Clustering and sharing incentives in BitTorrent," ACM SIGMETRICS'07, San Diego, CA, Jun 2007
6.1 Exam of 25/11/2011

RES224 Exam
25/11/2011

First name: ____________    Last name: ____________
Cycle/ID: ____________      QCM No.: 43

Instructions:
– No documents or electronic devices allowed (e.g., calculator, iPhone, etc.).
– All the information needed to solve this exam is provided in the text: in particular, an ASCII code table and the protocol headers can be found at the end of the questionnaire (in case they prove useful for solving the exercises).
– All sheets must be handed back.
– Make sure to fill in your personal details (e.g., first name, last name, ID, etc.) in the space provided at the top of this page and on the exercise pages.
– This exam consists of multiple-choice questions on the course content (QCM, about 2/3 of the exam grade) and of exercises, open questions, or multiple-choice questions on the mandatory readings (about 1/3 of the exam grade).
– Each QCM question has exactly one correct answer (note that some QCM answers may carry a negative weight).
– Report at most one answer per QCM question in the table below, using CAPITAL LETTERS (only the answers reported in the table on the first page will be counted).

Question | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14
Answer   |   |   |   |   |   |   |   |   |   |    |    |    |    |
Question | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28
Answer   |    |    |    |    |    |    |    |    |    |    |    |    |    |
Question 1. Which of the following protocols normally operates in "PUSH" mode?
A) User Datagram Protocol
B) Post-Office Protocol
C) Hyper-Text Transfer Protocol
D) Simple Mail Transfer Protocol

Question 2. The HTTP protocol:
A) always verifies that the HTML code of the pages it transports conforms to W3C standards
B) never verifies whether the HTML code of the pages it transports conforms to W3C standards
C) the server side of HTTP can verify that the HTML code of the pages it sends conforms to W3C standards, provided this option is specified at the beginning of the request
D) the client side of HTTP can verify that the HTML code of the received pages conforms to W3C standards, provided this option is specified at the beginning of the response

Question 3. In a P2P system such as BitTorrent, the file to be shared is initially served by a single "seed" (also called the "source"). The authors of [1] study how the completion time of the system varies as a function of the upload rate US of the initial "seed", relating it to the download rate DL of the "leechers" (the other peers interested in the resource). What is the necessary condition indicated in [1] for the file sharing to be efficient?
[1] A. Legout et al., "Clustering and sharing incentives in BitTorrent," ACM SIGMETRICS, San Diego, CA, Jun 2007.
A) US ≥ DL
B) US > 10DL
C) DL > 10US
D) DL > US

Question 4. The DNS protocol:
A) uses only the UDP protocol
B) uses the TCP protocol only if the secondary server is down
C) uses the TCP protocol only for the zone transfer procedure
D) may use the TCP protocol depending on the size of the response

Question 5. Using an experimental methodology, the authors of [1] analyze the performance of BitTorrent. What is the "clustering" phenomenon observed in [1], and what are its consequences?
[1] A. Legout et al., "Clustering and sharing incentives in BitTorrent," ACM SIGMETRICS, San Diego, CA, Jun 2007.
A) "clustering" is the phenomenon whereby, in a heterogeneous system with peers having different upload rates, the peer selection algorithm (i.e., choking) favors exchanges between peers with similar upload rates; consequently, file completion times stratify according to upload rate, with the peers having the highest upload rates obtaining the shortest completion times
B) "clustering" is the phenomenon whereby, in a heterogeneous system with peers having different upload rates, the peer selection algorithm (i.e., choking) favors exchanges between peers with different upload rates; consequently, file completion times equalize regardless of upload rate, since the peers with the highest upload rates help reduce the completion time of the peers with the lowest upload rates
C) "clustering" is the phenomenon whereby, in a heterogeneous system with peers having different access delays, the peer selection algorithm (i.e., choking) favors exchanges between peers with different access delays; consequently, file completion times equalize regardless of access delay, since the peers with the lowest access delays help reduce the completion time of the peers with the highest access delays
D) "clustering" is the phenomenon whereby, in a heterogeneous system with peers having different access delays, the peer selection algorithm (i.e., choking) favors exchanges between peers with similar access delays; consequently, file completion times stratify according to access delay, with the peers having the lowest access delays obtaining the shortest completion times

Question 6. In the Skype application, potentially unnecessary retransmissions of voice content are observed:
A) at the end of a call, during connection termination through application-layer signaling
B) at the beginning of a call, when network conditions are unknown
C) at the beginning of a connection, during call setup through application-layer signaling

Question 7. Concerning the Post Office Protocol POP3 and message storage folders:
A) POP imposes no limit on the number of folders (provided the total size of the messages does not exceed the total storage available to the user)
B) POP can manage the same number of folders as an IMAP server
C) POP generally limits to 64 the number of subfolders within each folder (whereas this limit does not apply to IMAP servers)
D) POP cannot manage folders
Question 8. In the Skype application, compare the volume of signaling traffic (e.g., overlay network management and maintenance, contact lookup and notification, etc.) against the volume of traffic generated by the services (voice calls, video, chat, etc.):
A) the services generally dominate over the signaling
B) the signaling generally dominates over the services
C) the two volumes are generally comparable

Question 9. Consider a peer-to-peer (P2P) system where resource location is managed by a central server S: when peer A wants to retrieve a resource R, which is stored at peers B and C:
A) only the responses concerning the location of R, but not the messages carrying the resource R, transit through S
B) both the requests/responses concerning the location of R and the messages carrying the resource R transit through S
C) only the requests/responses concerning the location of R, but not the messages carrying the resource R, transit through S
D) only the requests concerning the location of R, but not the messages carrying the resource R, transit through S

Question 10. Which of the following is not an algorithm implemented by BitTorrent?
A) Optimistic Unchoking
B) Rarest First
C) Restrained Flooding
D) Anti-snubbing
E) Choking

Question 11. File distribution with BitTorrent relies on splitting the file to be transmitted (of size F) into pieces (of size M). According to [1], how is the size of these pieces chosen? What is its typical value?
[1] P. Marciniak et al., "Small is not always beautiful," USENIX IPTPS 2008, Tampa Bay, FL, 2008
A) the piece size is specified in a normative BEP document; it is generally 256 KBytes
B) the piece size is not specified by the protocol; it is generally 16 KBytes
C) the piece size is specified in a normative IETF RFC document; it is generally 256 KBytes
D) the piece size determines the number of pieces F/M for which the .torrent file must report a checksum to verify their integrity: the piece size is chosen so as to limit the resulting size of the .torrent file; it is generally 256 KBytes

Question 12. Consider the "conditional GET" mechanism of the Web application, which makes it possible to check whether the content of a Proxy's cache, and in particular an object addressed by a specific URL, is up to date: which of the following statements is true?
A) this mechanism does not use a dedicated HTTP method but the HTTP GET method, together with an option specified by the Proxy in the lines following the first one (i.e., in the header lines) of the HTTP request message; the server replies to this request in all cases, but its response contains the object in question only if the latter has changed
B) this mechanism does not use the HTTP GET method but a dedicated HTTP method, which is therefore specified by the Proxy in the first line (i.e., the request line) of the HTTP request message; the server replies to this request in all cases, but its response contains the object in question only if the latter has changed
C) this mechanism does not use a dedicated HTTP method but the HTTP GET method, together with an option specified by the Proxy in the lines following the first one (i.e., in the header lines) of the HTTP request message; the server replies only if the object in question has changed, and the Proxy can therefore implicitly deduce whether the object in question has changed
D) this mechanism does not use the HTTP GET method but a dedicated HTTP method, which is therefore specified by the Proxy in the first line (i.e., the request line) of the HTTP request message; the server replies only if the object in question has changed, and the Proxy can therefore implicitly deduce whether the object in question has changed

Question 13. The DISCOVERY messages of the DHCP protocol:
A) are periodically sent by DHCP servers to check whether new hosts need to be configured, and contain the currently available address range
B) are periodically sent by DHCP servers to check whether new hosts need to be configured, but do not contain the currently available address range
C) are sent by DHCP clients that do not yet have an IP address, in order to discover the DHCP servers present in the network, and may contain the preferred address that the client wishes to receive
D) are sent by DHCP clients that do not yet have an IP address, in order to discover the DHCP servers present in the network, but do not contain a preferred address that the client wishes to receive

Question 14. The Start of Authority (SOA):
A) is a mandatory field of the DNS database that delimits a zone and specifies timing parameters (e.g., caching time, time between zone transfers, etc.)
B) is an optional field of the DNS database that specifies timing parameters (e.g., caching time, time between zone transfers, etc.) when these differ from the DNS defaults
C) is a command of the DNS application that allows the database to be transferred between the primary and secondary servers
D) is a command of the DNS application that allows the primary and secondary servers to swap roles

Question 15. What are the advantages and drawbacks of the peer-to-peer paradigm compared to the client-server paradigm?
A) scalability and robustness, at the expense of implementation complexity
B) scalability and implementation simplicity, at the expense of robustness
C) implementation simplicity and robustness, at the expense of scalability

Question 16. In a BitTorrent P2P system, the file to be shared (of size F) is split into pieces (of size M). Paper [1] illustrates how download performance depends on the size M. In particular, when distributing very large files (F >> M), what happens when very small pieces are chosen?
[1] P. Marciniak et al., "Small is not always beautiful," USENIX IPTPS 2008, Tampa Bay, FL, 2008
A) system performance becomes optimal, since the smaller the piece size, the more dynamic (reducing rare pieces) and fast (since the probability of remaining stuck on a rare piece is reduced) the distribution is.
B) the very small piece size creates a communication overhead (e.g., an increase in the number of "HAVE" messages, an increase in the length of "Bitfield" messages) which, by itself, reduces the efficiency of the distribution (since the amount of data to transfer increases enormously)
C) when the piece size becomes too small, the distribution time increases, on the one hand because of the reduced efficiency of each TCP connection (e.g., since at each new exchange, or after periods of inactivity, the connection parameters are reset), and on the other hand because of the reduced efficiency of request "pipelining" across the set of TCP connections (since, if pieces are downloaded quickly, one pays the price of the delay for issuing new requests).
Question 17. Consider a DNS message corresponding to a response for a name resolution that carries multiple answers. Which of the following domain names, written in hexadecimal, cannot be found in a valid DNS application message?
A) 03 77 77 77 06 67 6F 6F 67 6C 65 02 66 72
B) 03 77 77 77 11 74 65 6C 65 63 6F 6D 2D 70 61 72 69 73 74 65 63 68 02 66 72 00
C) 03 6E 73 33 c0 10
D) c0 10

Question 18. The HTTP protocol:
A) is a protocol that manages only the state represented by cookies
B) is a protocol that manages only the state necessary for user authentication
C) is a stateless protocol
D) is a stateful protocol

Question 19. YouTube servers implement an application-layer flow control algorithm, called "block sending" in [1]. Denoting by RTT the round-trip time between the YouTube server and the user's PC (in seconds), by R the video encoding rate (in bits per second), and by P the segment size (in bytes), how does this algorithm operate?
[1] S. Alcock, R. Nelson, Application flow control in YouTube video streams, ACM SIGCOMM CCR, Vol. 41, No. 2, April 2011
A) several times per RTT, the application generates a block of RT/(8P) messages that are passed to TCP for segmentation and transmission; the transmission time of a block (i.e., the time between the first and the RT/(8P)-th packets) is very short (and in general << RTT), and since T << RTT, several blocks are thus sent every RTT.
B) every T > RTT, a block of bytes of length RT is passed from the application to TCP, which sends it as a burst of RT/(8P) packets; the transmission time of a block (i.e., the time between the first and the RT/(8P)-th packets) is very short (and in general << RTT).
C) every T > RTT, a block of bytes of length RT is passed from the application to TCP, which splits it into RT/(8P) packets; TCP sends these packets spaced out in time (i.e., the interval between two packets of the same block is thus about 8P·RTT/(RT)); the transmission time of a block is therefore about one RTT.

Question 20. What is the role of the "tracker" entity in BitTorrent?
A) the tracker assists Peers when they join the system, by sending them a list of Peers to contact
B) the tracker assists Peers when they join the system, by sending them the list of Chunks into which the file has been split
C) the tracker assists Peers when they join the system, by randomly sending them a few Chunks into which the file has been split

Question 21. In the client-server paradigm,
A) the client is the host that takes the initiative of contacting the server
B) the server is the host that initiates the communication (i.e., the one that performs an "active open" of the connection)
C) a host can be either a client or a server, but not both at the same time

Question 22. YouTube is a multimedia content viewing service (e.g., videos encoded at an average rate R) that implements an application-layer flow control algorithm [1]. This algorithm is superimposed on the elastic flow and congestion controls implemented by TCP: how many phases does this algorithm consist of, and what is the purpose of each phase?
[1] S. Alcock, R. Nelson, Application flow control in YouTube video streams, ACM SIGCOMM CCR, Vol. 41, No. 2, April 2011
A) The algorithm consists of two phases: in the first, the application hands data to TCP at a rate R' >> R; in the second, the rate drops to about R. The first phase tries to reduce the initial delay by filling the user's playback buffer as fast as possible (i.e., at TCP's speed). The second phase tries not to send more than necessary (i.e., at the video encoding rate) to the user, since the latter could stop watching the video at any moment.
B) The algorithm consists of two phases: in the first, the application hands data to TCP at a rate R' >> R; in the second, the rate drops to about R. The first phase tries to reduce the initial delay by filling the user's playback buffer as fast as possible (i.e., much faster than TCP would, hence R' >> R_TCP > R). The second phase tries not to send more than necessary (i.e., at the video encoding rate) to the user, since the latter could stop watching the video at any moment.
C) The algorithm consists of a single phase, which limits the data sending rate to R. Its goal is not to send more than necessary (i.e., at the video encoding rate) to the user, since the latter could stop watching the video at any moment

Question 23. YouTube videos are downloaded differently when the user accesses the portal from a PC compared to access via a mobile terminal (e.g., a smartphone). In general terms, [1] explains that:
[1] A. Finamore et al., "YouTube everywhere: Impact of Device and Infrastructure Synergies on User Experience," ACM IMC'11, Nov 2011
A) in the case of PC access, the video is split into several "chunks", which are requested via HTTP GET requests (specifying the bytes to download by exploiting the "Range" header of the HTTP protocol) and which are then downloaded over the same TCP connection
B) in the case of PC access, the video is split into several "chunks", which are downloaded over several parallel TCP connections opened towards different servers
C) in the case of PC access, the transfer takes place in several phases, notably including a redirection phase (potentially triggering a chain of HTTP redirections towards different content servers) before the video transfer starts

Question 24. Which of the following statements concerning DNS zones and domains is false?
A) a domain is a subtree of the naming space
B) zone and domain are synonyms, since there is no semantic difference between the two terms
C) a subdomain is entirely contained within a domain
D) a zone is an administrative unit

Question 25. The decoupling of the mail architecture into SMTP and POP/IMAP servers:
A) avoids the need for synchronization between the sender (who uses SMTP to send the message) and the receiver (who uses POP/IMAP)
B) is actually fictitious, since these different services (SMTP, POP, IMAP) are typically offered by the same server
C) is related to the so-called "backward compatibility" mechanism, that is, the compatibility of new systems (SMTP) with existing ones (POP/IMAP)
D) is a consequence of the evolution of the mail service, and in particular of the possibility of sending various "attachments" along with the textual body of the message

Question 26. Which of the following statements is false?
A) The developer has full control over the application side of sockets, but only little control over the transport side
B) The socket is sometimes implemented in hardware for performance reasons
C) The socket is the interface between the application layer and the transport layer of a host
D) The socket is the entry/exit door for data going from the network to an application process running on a host
Question 27. Concerning the Post Office Protocol POP3 and message storage folders:
A) POP imposes no limit on the number of folders (provided the total size of the messages does not exceed the total storage available to the user)
B) POP generally limits to 64 the number of subfolders within each folder (whereas this limit does not apply to IMAP servers)
C) POP cannot manage folders
D) POP can manage the same number of folders as an IMAP server

Question 28. Downloading YouTube videos can lead to the transmission of useless data, for example (i) when users stop watching the videos, or (ii) when the same data are downloaded several times (e.g., a total of X bytes downloaded for a video of length Y < X). Regarding these two possibilities, the authors of [1] point out that:
[1] A. Finamore et al., "YouTube everywhere: Impact of Device and Infrastructure Synergies on User Experience," ACM IMC'11, Nov 2011
A) (i) users often stop watching videos (more than half of the videos are watched for less than 1/5th of their length)
(ii) an excess of transferred data is never encountered (i.e., either X = Y, or X < Y when the video is interrupted)
B) (i) users exceptionally stop watching videos (only 1/10th of the videos are watched for at most half of their length)
(ii) an excess of transferred data is never encountered (i.e., either X = Y, or X < Y in the rare cases where the video is interrupted)
C) (i) users often stop watching videos (more than half of the videos are watched for less than 1/5th of their length)
(ii) in the case of YouTube on mobile, the data may be downloaded several times due to a bad interaction between TCP and application-level requests (which happens in at most one third of the cases, sometimes with X > 4Y).
First name: ____________    Last name: ____________

ASCII Code Table
(rows: low-order hex digit; columns: high-order hex digit)

      0    1    2      3   4   5   6   7
 0   NUL  DLE  space   0   @   P   `   p
 1   SOH  DC1  !       1   A   Q   a   q
 2   STX  DC2  "       2   B   R   b   r
 3   ETX  DC3  #       3   C   S   c   s
 4   EOT  DC4  $       4   D   T   d   t
 5   ENQ  NAK  %       5   E   U   e   u
 6   ACK  SYN  &       6   F   V   f   v
 7   BEL  ETB  '       7   G   W   g   w
 8   BS   CAN  (       8   H   X   h   x
 9   HT   EM   )       9   I   Y   i   y
 A   LF   SUB  *       :   J   Z   j   z
 B   VT   ESC  +       ;   K   [   k   {
 C   FF   FS   ,       <   L   \   l   |
 D   CR   GS   -       =   M   ]   m   }
 E   SO   RS   .       >   N   ^   n   ~
 F   SI   US   /       ?   O   _   o   del

Example: the ASCII character "A" corresponds to "41" in hexadecimal notation