Cloud Computing - Technische Universität Darmstadt

Transcription

Telecooperation, Prof. Dr. Max Mühlhäuser
Chapter 2.4: Distributed Program Development - Cloud Computing
Prof. Dr. Max Mühlhäuser, Dr. Erwin Aitenbichler
Copyrighted material; for TUD student use only
Technische Universität Darmstadt
Hosting
§ How to make services available on the Internet?
§ In-process server
  § Embedded web server used in the server process
  § Practically only useful for development (see the minimal sketch after the comparison below)
§ In-house hosting or external root servers
§ Servlet containers (e.g., Apache Tomcat)
  § Originally designed for web applications
  § Generic servlets (e.g., jaxws-ri) enable hosting of Web Services
§ Java Business Integration (e.g., Apache ServiceMix)
  § Specifically designed for Web Services
  § To date, not as mature as servlet containers
§ Also: IIS, .NET Services, IBM WebSphere, ...
§ Cloud Computing
  § Provides scalable infrastructure for hosting

In-House vs. External Hosting
Cow vs. supermarket (bottle of milk):
§ Big upfront investment / Small continuous expense
§ Produces more than you need / Buy what you need
§ Uses up resources / Fewer resources needed
§ Maintenance needed / No maintenance
§ Unpleasant waste product / Waste is not my problem
Source: Donald Kossmann: What is new in the Cloud? & Kraska 2009
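To make the "in-process server" option above concrete, here is a minimal sketch (not from the slides) of an embedded web server using the JDK's com.sun.net.httpserver package; it is intended only to illustrate development-time hosting inside the server process.

  import com.sun.net.httpserver.HttpServer;
  import java.io.OutputStream;
  import java.net.InetSocketAddress;

  public class InProcessServer {
      public static void main(String[] args) throws Exception {
          HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
          server.createContext("/hello", exchange -> {
              byte[] body = "Hello from an in-process server".getBytes("UTF-8");
              exchange.sendResponseHeaders(200, body.length);
              try (OutputStream os = exchange.getResponseBody()) {
                  os.write(body);
              }
          });
          server.start();   // serves requests until the hosting process exits
      }
  }

Requesting http://localhost:8080/hello then returns the response from within the hosting process, without any external container.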
Cloud Computing
§ Cloud Computing objectives
  § Provide resources or services over the Internet to customers with the reliability of a data center
  § Often virtualized infrastructure that can be scaled on demand regarding storage, computation, and network capacity
  § Service providers/users do not need knowledge of maintaining the technology infrastructure in the "cloud"
§ Cloud Computing vs. traditional hosting
  § Sold on demand (typically by the minute or hour)
    § "Pay as you go" - no need for long-standing contracts
  § Elastic: the user can determine the amount of service (CPU, storage, ...) they want at any given time
  § The service is fully managed

Cloud Computing: Roles
[Figure: the Service Provider (developer) supplies services, data, and VM images to the Service Hoster, which makes them available to the Service Users (end users).]
Public vs. Private Clouds
§ Public cloud: sells service to anyone on the Internet
  § Observation: on average, 70% of the IT budget is spent on maintenance versus adding new capabilities
  § Goal: replace fixed IT costs with variable costs
  § Gartner says: "Organizations are switching from company-owned hardware and software assets to per-use, service-based models"
  § Low entry barrier
    § For applications you have to do once
    § For applications you do not have the resources for
  § e.g., Amazon EC2
§ Private cloud: used inside a business
  § Observation: most systems are oversized and run at 10-20% utilization most of the time
  § Goal: more efficient utilization of resources within a business
  § e.g., IBM Blue Cloud

Cloud Computing
§ Key characteristics
  § "Unbounded scale"
    § Rapid up- and downscaling possible
    § No more oversized systems that run at 10-20% utilization most of the time
    § Still, peak-load capacity can be provided
    § Assumption: the data center hosts many different applications, and not all of them need peak resources at the same time
  § Reliability
    § Redundant computers and data centers
  § More location independence
    § The hoster may operate data centers worldwide
  § Often based on cheap hardware
    § Compared to clusters and grids, a slow interconnection network
  § Subscription-based or pay-per-use

Types of Cloud Service
Software as a Service (SaaS)
Platform as a Service (PaaS)
Infrastructure as a Service (IaaS)
Infrastructure as a Service (IaaS)
§ Computer infrastructure delivery model
§ A platform virtualization environment - "virtualization taken a step further"
§ Computing resources, such as processing capacity and storage
§ This pay-for-what-you-use model resembles the way electricity, fuel, and water are consumed -> also referred to as utility computing
§ Example
  § You want to run a batch job but you don't have the infrastructure necessary to run it in a timely manner -> use Amazon EC2
§ Providers
  § Amazon EC2, S3
    § Virtual server instances with unique IP addresses
    § Blocks of storage on demand
    § API to start, stop, access, and configure virtual servers and storage
  § Rackspace: cloudServers
  § FlexiScale

Platform as a Service (PaaS)
§ Platform delivery model
§ Platforms are built upon infrastructure
§ The hoster is responsible for maintenance of the OS and base frameworks
§ Problem: currently, there are no standards for interoperability or data portability. Some providers will not allow software created by their customers to be moved off their platform.
§ Examples
  § You want to start storage services on your network for a large number of files and you do not have the storage capacity -> use Amazon S3
  § You want to host a website and expect many hits, but only for a few days -> Rackspace cloudSites
§ Providers: GoogleApps, Force.com, Rackspace, ...
  § Rackspace: cloudSites, cloudFiles
    § cloudSites: scalable website without the need to manage a dedicated server

Software as a Service (SaaS)
§ Software delivery model
§ No hardware or software to manage
§ Service delivered through a browser (in most cases)
§ The provider hosts both application and data
§ The end user is free to use the service from anywhere
§ Customers use the service on demand
§ Instant scalability
§ Examples
  § Your email is hosted on an Exchange server in your office and it is very slow -> outsource this using Hosted Exchange
  § Your current CRM package is not managing the load, or you simply don't want to host it in-house -> use a SaaS provider such as Salesforce.com
§ SaaS is a very broad market: many domain-specific applications

Cloud Computing
[Figure: PaaS is consumed by the Service Provider (developer); SaaS is consumed by the Service Users (end users).]
§ PaaS for the developer
  § A set of software and product development tools hosted on the provider's infrastructure
  § Developers create applications on the provider's platform over the Internet
§ SaaS for the end user

Cloud Computing: History
§ Grid Computing: solving large problems with parallel computing; made mainstream by the Globus Alliance
§ Utility Computing: offering computing resources as a metered service; introduced in the late 1990s
§ Software as a Service: network-based subscriptions to applications; gained momentum in 2001
§ Cloud Computing: next-generation Internet computing and next-generation data centers
Grid Computing
§ Grid definition (Ian Foster)
  § Coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations
§ Hence, a grid is coordinated and aimed at solving problems
§ Motivation to join a grid: more resources available
§ Resources are typically provided by everyone and usable by all
§ Focus on large-scale resource sharing, innovative applications, and high performance
  § This focus distinguishes the grid from traditional distributed computing
§ Collections of individuals etc. are also called virtual organizations
§ Main challenges: authentication, authorization, resource access, and resource discovery

The word "Grid"
§ The word "grid" has several meanings in English
  § Map grid, starting grid, ...
§ How do these definitions relate to grid computing?
  § Are the computers arranged in a grid? ;-)
§ The best analogy is the power grid
  § Distribution of electricity
  § In the power grid, there are producers and consumers
  § The grid connects them and lets any consumer use any producer seamlessly
§ Sometimes grid computing is defined in a similar way to the power grid
  § Anytime, anywhere, just plug in and you are connected
  § However, this is not the original definition of grid computing

Grid: Virtual Organizations
[Figure: three real organizations (Org X, Org Y, Org Z) and two virtual organizations (VO A, VO B) that draw on their resources.]
§ Three real organizations, two virtual organizations
§ A real organization can participate in multiple VOs
§ Each organization decides what to contribute and under which rules
  § For example, Org Y lets VO B simulate only "simple" problems

Grid: Interoperability
§ The grid architecture enables sharing
  § A VO must be able to establish a relationship with any participant
  § The central issue is interoperability
§ Interoperability means common protocols
  § The grid architecture is a set of common protocols
  § Protocols define negotiation, establishment, management, and use of resources
  § Standard protocols allow anyone to contribute
§ Why is interoperability so important?
  § Without a common protocol, all relationships are bilateral
  § A common protocol lets everyone participate with little effort
  § Compare: HTTP and HTML are common protocols for the Web

Why are protocols important?
§ A protocol defines how distributed elements interact
  § How to achieve the desired behavior
  § Structure and semantics of the messages
§ Protocols do not define the internals of elements
  § The organization has full control over the implementation
  § Local control is important for traditional sharing; the grid keeps it too
§ APIs are also important
  § Protocols allow interoperability, but give no support to the programmer
  § APIs and SDKs allow development of sophisticated programs
§ Without a common protocol, an API is not enough
  § We would need the same implementation everywhere, or have every implementation know every other implementation
  § The agent approach is basically this (it needs a homogeneous platform everywhere)

Protocol vs. API
[Figure: two hosts, each with the layering Application / API / OS; the API sits between application and OS on one machine, while the protocol runs between the two machines.]
§ An API is local to a computer
  § Interface to services provided by the OS or middleware
  § If only the API is standardized -> we need the same software everywhere
  § So far, this has not worked in the Internet!
§ A protocol defines what goes between computers
  § "Interface" to services on other computers
  § Additional effort for implementing protocol handling
  § Peers can use any implementation

Grid: Interoperability
§ Interoperability in grids
  § Architecture and protocol standards establish the requirements
    § Open Grid Services Architecture (OGSA)
    § Open Grid Services Infrastructure (OGSI)
    § Job Submission Description Language (JSDL)
    § Distributed Resource Management Application API (DRMAA)
    § Grid Security Infrastructure (GSI)
  § Vendor-specific implementations of the protocols are possible
  § The Globus Toolkit is widely adopted

Grid vs. Cloud Computing
§ They share a lot of commonality: intention, architecture, and technology; the problems are mostly the same:
  § Manage large, central facilities
  § Define methods by which consumers discover, request, and use resources
  § Provide means for efficient parallel execution of software
  § Both are often based on inexpensive hardware
§ Differences:
  § Business model / accessibility
  § Compute model / supported application types
  § Programming model / job and application submission
  § Interoperability
  § Elasticity

Grid vs. Cloud Computing
§ Business model / accessibility
  § Grids are only accessible within pre-defined multi-organizations
    § Participants share resources, i.e., most participants also offer resources
    § Each individual organization maintains full control of its resources
  § User/hoster roles in cloud computing
    § A cloud computing hoster offers its service to anyone on the Internet
§ Compute model / application types
  § Grids are mostly limited to batch processing
  § Clouds support interactive applications (and also batch)
    § Good support for "web-faced" applications met the demand of many e-commerce applications -> great market success
§ Programming model / job and application submission
  § Grid: builds mostly on source code distribution
    § Send an archive with source code to a server in the grid
    § The job will be compiled and executed there
  § Cloud: builds on virtualization

Grid vs. Cloud Computing
§ Interoperability
  § Grid
    § Wide adoption of the OGF (Open Grid Forum) architecture and protocol standards
    § Wide adoption of the Globus Toolkit
  § Cloud
    § Currently no interoperability
    § The big players do not seem to be much interested in standards
      § Amazon, Microsoft, Salesforce, Google
    § Open Cloud Consortium
      § Members: Aerospace, Cisco, Infoblox, NASA, Yahoo, ...
      § ... but the big players are not here
    § -> Vendor lock-in!
§ Elasticity
  § Cloud: you pay for the resources you use (pay-per-use)
  § Grid: accounting across organizations never worked

Grid vs. Cloud Computing
§ Amdahl's law revisited, with s the speedup, P the number of parallel processes, o(P) the communication overhead, and a the sequential fraction of the program:

  s = 1 / (a + (1 - a)/P + o(P))

§ Maximum expected speedup of the overall system if only a fraction of it is improved
§ Expresses the relationship between a sequential algorithm and its parallelized version
§ The sequential fraction of a program (in terms of run time) determines the upper bound for the speedup!
  § Still true, but only if we look at solving one problem
§ Grids
  § Parallel computers with fast interconnection networks -> low o(P)
  § Often solve only one problem
§ Clouds
  § Inexpensive hardware, "slow" interconnects -> high o(P)
  § Either a is really small (cf. MapReduce)
  § or, mostly: solve n problems in parallel
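A quick worked illustration of the bound (the numbers are illustrative and not from the slides): with a sequential fraction a = 0.05 and P = 100 parallel processes,

\[
  s \;=\; \frac{1}{a + \frac{1-a}{P} + o(P)}
    \;=\; \frac{1}{0.05 + 0.0095 + 0} \;\approx\; 16.8,
  \qquad
  \frac{1}{0.05 + 0.0095 + 0.01} \;\approx\; 14.4 .
\]

Even with unlimited processes and zero overhead the speedup cannot exceed 1/a = 20, which is why clouds favor workloads where a is tiny (cf. MapReduce) or where n independent problems are solved in parallel.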
The "Cloud Stack"
§ Application layer
  § Web: AppEngine (Google), Azure Web Role (MS)
  § Services: Azure Worker Role (MS)
§ Platform layer
  § Storage: Sherpa (Yahoo), BigTable (Google), Azure Tables (MS)
  § Batch processing: Hadoop (Apache), PIG (Yahoo), MapReduce (Google), DryadLINQ (MS)
§ Infrastructure layer
  § OS / virtualization: EC2 (Amazon)
  § Block storage: S3 (Amazon), GFS (Google)
Amazon S3
§ Amazon Simple Storage Service (S3)
  § Based on a REST-style API; browser-based management tools are also available
  § Organized in buckets
    § A bucket is a "drive" in which directories and files can be stored
  § ACLs allow making parts of a bucket publicly accessible
  § Other services build on S3, such as EC2
§ Example: creating a bucket
PUT /onjava HTTP/1.1
Content-Length: 0
User-Agent: jClientUpload
Host: s3.amazonaws.com
Date: Sun, 05 Aug 2007 15:33:59 GMT
Authorization: AWS 15B4D3461F177624206A:YFhSWKDg3qDnGbV7JCnkfdz/IHY=
§ Authorization: AWS <AWSAccessKeyID>:<Signature>
  § AWSAccessKeyID: the customer ID
  § Signature: a keyed hash (HMAC-SHA1) over selected fields of the request, computed with the customer's secret key and Base64-encoded
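The following Java sketch (not from the slides) shows how such a signature value can be computed: an HMAC-SHA1 over the request's string-to-sign, Base64-encoded. The string-to-sign below is deliberately simplified - the real S3 scheme also covers Content-MD5, Content-Type, and the canonicalized amz headers and resource - so treat it as an illustration of the mechanism rather than the exact algorithm.

  import javax.crypto.Mac;
  import javax.crypto.spec.SecretKeySpec;
  import java.util.Base64;

  public class S3RequestSigner {
      // Returns the value that goes after "AWS <AWSAccessKeyID>:" in the Authorization header
      static String sign(String stringToSign, String secretAccessKey) throws Exception {
          Mac mac = Mac.getInstance("HmacSHA1");
          mac.init(new SecretKeySpec(secretAccessKey.getBytes("UTF-8"), "HmacSHA1"));
          return Base64.getEncoder().encodeToString(mac.doFinal(stringToSign.getBytes("UTF-8")));
      }

      public static void main(String[] args) throws Exception {
          // Simplified string-to-sign for the PUT request shown above (illustrative only)
          String stringToSign = "PUT\n\n\nSun, 05 Aug 2007 15:33:59 GMT\n/onjava";
          System.out.println("Authorization: AWS <AWSAccessKeyID>:" + sign(stringToSign, "<secretKey>"));
      }
  }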
[The "Cloud Stack" overview figure is repeated here.]
Google File System (GFS)
§ Shares many of the same goals as previous distributed file systems: performance, scalability, reliability, and availability
§ The GFS design has been driven by four key observations at Google:
  § Component failures are the norm rather than the exception
    § The quantity and quality of components virtually guarantees failures
    § Constant monitoring, error detection, fault tolerance, and automatic recovery are integral to the system
  § Huge files (by traditional standards)
    § Multi-GB files are common
    § I/O operations and block sizes must be revisited
  § Most files are mutated by appending new data
    § The focus of performance optimization and atomicity guarantees
  § Co-designing the applications and the APIs benefits the overall system by increasing flexibility

GFS: Architecture
§ A GFS cluster consists of
  § a single master and
  § multiple chunkservers, and
  § is accessed by multiple clients
§ Files are organized hierarchically in directories and identified by pathnames

GFS: Architecture
§ GFS chunkservers
  § Files are divided into fixed-size chunks
  § Each chunk has an immutable, globally unique 64-bit chunk handle
    § The handle is assigned by the master at chunk creation
  § The chunk size is 64 MB
  § Chunks are stored as files in the local Linux file system
  § Each chunk is replicated on 3 (default) servers
§ GFS master
  § Maintains all file-system metadata: namespace, access control information, file-to-chunk mappings, chunk (including replica) locations, etc.
  § A client contacts the master to get chunk locations, then deals directly with the chunkservers
    § The master is not a bottleneck for reads/writes
  § Single master
    § simplifies the design
    § helps make sophisticated chunk placement and replication decisions using global knowledge

GFS: Architecture
§ GFS clients
  § GFS does not implement a standard API such as POSIX
  § Linked as a library into applications
  § Communicate with the master and the chunkservers for reading and writing
    § Master interactions only for metadata
    § Chunkserver interactions for data
§ Operations
  § Standard: create, delete, open, close, read, and write files
  § Snapshot
    § Creates a copy of a file or a directory tree at low cost
  § Record append
    § Allows multiple clients to append data to the same file concurrently
    § Guarantees the atomicity of each individual client's append
    § Useful for implementing multi-way merge results
    § Useful for producer-consumer queues that many clients can simultaneously append to without additional locking

GFS: Read
1. The client sends the master a request containing the file name and the chunk index (= byte offset / 64 MB)
2. The master replies with the chunk handle and the locations of the replicas
3. The client sends a request to one replica (the closest one)
4. The client reads the data from that chunkserver
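As a small illustration of step 1 (not from the slides), the client-side arithmetic that turns a byte offset into the chunk index sent to the master can be sketched as follows; the class and field names are made up for the example and are not the GFS API.

  public class GfsReadRequest {
      static final long CHUNK_SIZE = 64L * 1024 * 1024;   // fixed 64 MB chunks

      final String fileName;       // sent to the master together with...
      final long chunkIndex;       // ...the chunk index = byteOffset / 64 MB
      final long offsetInChunk;    // used later against the chosen chunkserver replica

      GfsReadRequest(String fileName, long byteOffset) {
          this.fileName = fileName;
          this.chunkIndex = byteOffset / CHUNK_SIZE;
          this.offsetInChunk = byteOffset % CHUNK_SIZE;
      }
  }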
GFS: Write
1. The client asks the master for the lease holder of the chunk to write to
2. The master returns the identity of the primary and the locations of the other (secondary) replicas
3. The client pushes the data to all replicas (on the chunkservers)
4. Once all replicas have acknowledged receiving the data, the client sends a write request to the primary
   § The request identifies all other replicas
   § The primary assigns consecutive serial numbers -> serialization of writes
   § Data in the local write buffer is ordered according to serial-number order
5. The primary forwards the write request to all secondary replicas
   § Data is applied according to serial-number order
6. The secondaries all reply to the primary, indicating completion
7. The primary replies to the client

GFS: Atomic Record Append
§ Problem
  § In a traditional write, the client specifies the offset at which data is to be written
  § Concurrent writes to the same region are not serializable: the region may end up containing data fragments from multiple clients
§ Atomic record append
  § The client specifies only the data to write
  § GFS chooses the offset it writes to and returns it
  § No need for a distributed lock manager
    § GFS chooses the offset, not the client
  § Follows a control flow similar to a write
    § The primary tells the secondary replicas to append at the same offset
    § Data must be written at the same offset on all chunk replicas for success to be reported

GFS Fault Tolerance: Chunkservers
§ Purpose of leases
  § Maintain a consistent mutation order across replicas by identifying the primary copy (a mutation is an operation that changes the contents or metadata of a chunk)
  § Deallocation of resources in case of a failure
§ Chunkserver failure recovery
  § Write, 1st step: the client asks the master for the lease holder
    § If no primary is assigned for the specified chunk -> the master issues a new lease
  § A lease has an initial timeout of 60 seconds
    § As long as the mutation goes on, the primary periodically requests extensions from the master
  § Now, if the master loses communication with a primary:
    § The master can safely grant a new lease to another replica after the old lease has expired
  § Now, if the faulty chunkserver comes back:
    § The master maintains a version number for each chunk
    § This allows the detection of stale replicas

GFS Fault Tolerance: Master
§ Operation log
  § A record of all critical metadata changes
  § Stored on the master and replicated on other machines
  § Defines the order of concurrent operations
  § Changes are not visible to clients until the metadata changes are persistent
  § The master checkpoints its state whenever the log grows beyond a certain size
    § A checkpoint is a compact B-tree (used for namespace lookup)
§ Master recovery
  § Read the latest complete checkpoint
  § Replay the operation log
§ Master replication
  § "Shadow" masters provide read-only access even when the primary master is down
[The "Cloud Stack" overview figure is repeated here.]
Amazon EC2
§ Amazon Elastic Compute Cloud (EC2)
  § EC2 is not much more than "VM hosting"
  § An Amazon Machine Image (AMI) is an encrypted VM image
  § Images are stored in an S3 repository
  § Pre-configured templates are provided
    § OSes: Windows Server 2003, Linux, OpenSolaris
    § Databases: Oracle 11g, MSSQL, MySQL
    § Communication/processing: Hadoop, Condor, Open MPI
    § Web hosting: Apache, IIS
  § Instances can be controlled using a Web Service interface
  § The local storage of instances is non-persistent
  § Instances are accessed via RDP or SSH
  § Availability zones
    § Describe the location of a data center as a geographic region (e.g., us-east-1a)
    § The placement of instances can be controlled
    § The distribution of instances can be used for fault tolerance

Amazon EC2: Block Store
§ Amazon Elastic Block Store (EBS)
  § A raw block device that can be mounted ("attached") by an instance
    § It can only be mounted by a single instance at any time!
  § Allows persistence beyond the lifetime of an instance
  § Snapshots can be stored to / restored from S3
  § API functions
    § CreateVolume / DeleteVolume
    § DescribeVolumes: get information about a volume (size, source snapshot, creation time, status)
    § AttachVolume / DetachVolume
    § CreateSnapshot / DeleteSnapshot
    § DescribeSnapshots: get information about snapshots
  § The API is accessed using XML/SOAP or REST

Amazon EC2: API
§ Running
  § RegisterImage / DeregisterImage
  § RunInstances: launch the specified number of instances
  § TerminateInstances
  § RebootInstances
§ Networking
  § AllocateAddress: acquire an elastic IP address
  § AssociateAddress: associate an elastic IP address with an instance
§ Storage
  § CreateVolume / DeleteVolume
  § AttachVolume: mount an EBS volume
  § CreateSnapshot / DeleteSnapshot

Amazon EC2: Services
§ Amazon Simple Storage Service (S3)
  § Based on a REST-style API
  § Authorization: AWS <AWSAccessKeyID>:<Signature>
    § AWSAccessKeyID: the customer ID
    § Signature: keyed HMAC-SHA1 hash of the request, computed with the customer's secret key
§ Amazon SimpleDB
  § A cluster database for storing persistent data
  § Does not provide full relational / SQL support
§ Amazon Simple Queue Service
  § Message-oriented middleware (MOM) for inter-service communication
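As a hedged illustration (not from the slides) of what a Web Service call against EC2 looks like, the sketch below assembles a RunInstances request for the EC2 Query API. The parameter names (Action, ImageId, MinCount, MaxCount, InstanceType) follow the public Query API as I recall it, the AMI id is hypothetical, and the mandatory request signing (AWSAccessKeyId, Signature, Timestamp, ...) is omitted, so the snippet illustrates the shape of the call rather than a complete request.

  import java.net.URLEncoder;

  public class Ec2RunInstancesSketch {
      public static void main(String[] args) throws Exception {
          String query = "Action=RunInstances"
                  + "&ImageId=" + URLEncoder.encode("ami-12345678", "UTF-8")   // hypothetical AMI
                  + "&MinCount=1&MaxCount=1"
                  + "&InstanceType=m1.small";
          // The signed request would be sent as: GET https://ec2.amazonaws.com/?<query>&<signature params>
          System.out.println(query);
      }
  }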
[The "Cloud Stack" overview figure is repeated here.]
MapReduce
§ Consider the problem of counting the number of occurrences of each word in a large collection of documents
§ MapReduce programming model
  § Inspired by the map and reduce operations commonly used in functional programming languages like Lisp
  1. map: (key1, val1) -> [(key2, val2)]
  2. reduce: (key2, [val2]) -> [val3]
§ Map
  § Is a pure function (always evaluates to the same result and has no side effects)
  § Processes an input key/value pair
  § Produces a set of intermediate key/value pairs
  § e.g.: (document_name, document_content) -> [(word1, 1), ..., (wordn, 1)]
§ Reduce
  § Combines all intermediate values for a particular key
  § Produces a set of merged output values (usually just one)
  § e.g.: (word, [1, ..., 1]) -> [n]

MapReduce
§ Example: count word occurrences

map(String input_key, String input_value):
  // input_key: document name
  // input_value: document contents
  for each word w in input_value:
    EmitIntermediate(w, "1");

reduce(String output_key, Iterator intermediate_values):
  // output_key: a word
  // intermediate_values: a list of counts
  int result = 0;
  for each v in intermediate_values:
    result += ParseInt(v);
  Emit(AsString(result));
MapReduce: Execution
[Figure: execution overview.]

MapReduce: Distributed Execution
[Figure: pipeline Input Reader -> Map -> Partition -> Compare -> Reduce -> Output Writer.]
MapReduce: Data Flow
§ The static part of MapReduce is a large distributed sort; applications define:
  1. Input reader
     § Reads data from the (distributed) file system and generates key/value pairs
     § e.g., read a directory full of text files and return each line as a record
  2. Map function
  3. Partition function
     § Assigns the output of all map operations to a reducer: reducerIndex = partitionFunction(key, numberOfReducers)
     § Note: all identical keys must be processed by the same reducer
     § e.g., reducerIndex = hash(key) % numberOfReducers (a small sketch follows below)
  4. Compare function
     § Defines which keys should be considered identical
     § Groups the input for the reduce step
  5. Reduce function
  6. Output writer
     § Writes the output to the (distributed) file system
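A minimal sketch of the partition function named above (not from the slides): identical keys must always land on the same reducer, so a common choice is a hash of the key modulo the number of reducers, with the sign bit masked so that a negative hashCode() cannot produce a negative index.

  public class HashPartitioner {
      static int reducerIndex(String key, int numberOfReducers) {
          return (key.hashCode() & Integer.MAX_VALUE) % numberOfReducers;
      }

      public static void main(String[] args) {
          // The same key always maps to the same reducer index
          System.out.println(reducerIndex("cloud", 4));
          System.out.println(reducerIndex("cloud", 4));
      }
  }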
[The "Cloud Stack" overview figure is repeated here.]
Hadoop
§ For processing vast amounts of data in big (10K+ node) clusters/clouds
§ Inspired by Google's MapReduce
§ The programmer must provide suitable map and reduce functions (see the sketch below)
§ Not suitable for all problems, but many applications are possible
  § Example uses: distributed grep, distributed sort, web link-graph traversal, document clustering, machine learning, statistical machine translation, ...
§ Advantage of this approach: the latency of the communication subsystem is not so important
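As a sketch of the "suitable map and reduce functions" mentioned above, here is the word-count example expressed against Hadoop's org.apache.hadoop.mapreduce API (driver/job-setup code omitted); treat it as an illustrative sketch rather than a complete, tuned job.

  import java.io.IOException;
  import java.util.StringTokenizer;
  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Mapper;
  import org.apache.hadoop.mapreduce.Reducer;

  public class WordCount {

      public static class TokenizerMapper
              extends Mapper<LongWritable, Text, Text, IntWritable> {
          private static final IntWritable ONE = new IntWritable(1);
          private final Text word = new Text();

          @Override
          protected void map(LongWritable offset, Text line, Context context)
                  throws IOException, InterruptedException {
              StringTokenizer tokens = new StringTokenizer(line.toString());
              while (tokens.hasMoreTokens()) {
                  word.set(tokens.nextToken());
                  context.write(word, ONE);                 // EmitIntermediate(w, "1")
              }
          }
      }

      public static class IntSumReducer
              extends Reducer<Text, IntWritable, Text, IntWritable> {
          @Override
          protected void reduce(Text word, Iterable<IntWritable> counts, Context context)
                  throws IOException, InterruptedException {
              int sum = 0;
              for (IntWritable c : counts) {
                  sum += c.get();                           // result += ParseInt(v)
              }
              context.write(word, new IntWritable(sum));    // Emit(AsString(result))
          }
      }
  }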
[The "Cloud Stack" overview figure is repeated here.]
BigTable (Google)
§ BigTable is a distributed storage system for managing structured data
§ Lots of (semi-)structured data at Google
  § URLs: contents, crawl metadata, links, anchors, PageRank, ...
  § Per-user data: user preference settings, recent queries/search results, ...
  § Geographic locations: physical entities (shops, restaurants, etc.), roads, satellite image data, user annotations, ...
§ Designed to scale to a very large size
  § Petabytes of data across thousands of servers
  § Billions of URLs, many versions per page (~20K/version)
  § Hundreds of millions of users, thousands of queries/sec
  § Many TB of satellite image data
§ The scale is too large for most SQL databases
  § Even if it weren't, licensing costs would be very high
  § Low-level storage optimizations help performance significantly
    § Much harder to do when running on top of a database layer

BigTable: Goals
§ Want asynchronous processes to be continuously updating different pieces of data
  § Want access to the most current data at any time
§ Need to support
  § Very high read/write rates (millions of ops per second)
  § Efficient scans over all or interesting subsets of the data
  § Efficient joins of large one-to-one and one-to-many datasets
§ Often want to examine data changes over time
  § e.g., the contents of a web page over multiple crawls
§ Fault-tolerant, persistent
§ Self-managing
  § Servers can be added/removed dynamically
  § Servers adjust to load imbalance

BigTable: Applications
§ Used in many Google projects (data from 2006)

BigTable: WebTable
§ BigTable data model
  § "a sparse, distributed, multi-dimensional sorted map"
§ BigTable example: WebTable
  § The row name is a reversed URL
  § The contents column family contains the page contents
  § The anchor column family contains references to the page (anchor:cnnsi.com, anchor:my.look.ca)
  § Each anchor cell has one version; the contents column has three versions: t3, t5, and t6

Google App Engine Datastore
§ Builds on top of BigTable
§ GQL
  § A SQL-like language for retrieving entities or keys from the App Engine scalable datastore
  § No UPDATE syntax
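To make the WebTable example above concrete, the following sketch (not from the slides) models BigTable's (row, column, timestamp) -> value map with nested in-memory sorted maps; it only illustrates the data model and is in no way BigTable's actual API.

  import java.util.TreeMap;

  public class WebTableSketch {
      // row key -> column name -> timestamp (newest first) -> cell value
      final TreeMap<String, TreeMap<String, TreeMap<Long, String>>> table = new TreeMap<>();

      void put(String row, String column, long timestamp, String value) {
          table.computeIfAbsent(row, r -> new TreeMap<>())
               .computeIfAbsent(column, c -> new TreeMap<>((a, b) -> Long.compare(b, a)))
               .put(timestamp, value);
      }

      public static void main(String[] args) {
          WebTableSketch t = new WebTableSketch();
          // Reversed URL as row key; three versions of the page contents (t3, t5, t6)
          t.put("com.cnn.www", "contents:", 3, "<html>... (t3)");
          t.put("com.cnn.www", "contents:", 5, "<html>... (t5)");
          t.put("com.cnn.www", "contents:", 6, "<html>... (t6)");
          // Anchor column family: one version per referencing site
          t.put("com.cnn.www", "anchor:cnnsi.com", 9, "CNN");
          t.put("com.cnn.www", "anchor:my.look.ca", 8, "CNN.com");
          System.out.println(t.table);
      }
  }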
DBMS vs. Key/Value Store: Database Structure
§ Relational database
  § Tables with columns and rows; rows contain column values, and all rows in a table have the same schema
  § The data model is well-defined in advance; the schema is typed and has constraints and relationships to enforce integrity
  § The data model is based on a "natural" representation of the data, independent of the application's data structures
  § The data model is normalized to remove duplication; relationships associate data between multiple tables
  § No good support for large amounts of binary data
§ Key/value database
  § A table is a bucket where key/value pairs are put into; different rows can have different "schemas"
  § Records are identified by keys; a dynamic set of attributes can be associated with a key
  § Sometimes the attributes are all of a string type only
  § No relationships are explicitly defined; no normalization models
  § Handles binary data well
DBMS vs. Key/Value Store: Data Access
§ Relational database
  § Data is created, updated, and retrieved using SQL
  § SQL queries can act on single tables, but can also act on multiple tables through table joins
  § SQL provides functions for aggregation and complex filtering
  § "Active database" with triggers, stored procedures, and functions
§ Key/value database
  § Data is manipulated through proprietary APIs
  § Sometimes an SQL-like syntax without join or update functionality (e.g., GQL)
  § Only basic filter predicates (=, !=, <, >); often, processing is not considered part of the DB -> MapReduce, Hadoop
  § Application and data-integrity logic is all contained in application code
DBMS vs. Key/Value Store: Application Interface
§ Relational database
  § De-facto standards for SQL database drivers (e.g., ODBC, JDBC)
  § Data is stored in its "natural" representation; often a mapping between application data structures and the relational structure is used: Object-Relational Mapping (ORM)
§ Key/value database
  § Proprietary APIs, sometimes SOAP/REST
  § Direct serialization of application objects into the value part of records (see the sketch below)

DBMS vs. Key/Value Store
§ Applications for key/value stores
  § The data is heavily document-oriented -> a more natural fit with the key/value model than with the relational model
  § Large amounts of binary data
  § The application builds heavily on object orientation -> a key/value database minimizes the need for "plumbing" code
  § The data store is cheap and integrates easily with the cloud platform
  § On-demand, vast scalability is of vital importance
    § Thousands of servers
    § Terabytes of in-memory data
    § Petabytes of disk-based data
    § Millions of reads/writes per second, efficient scans
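A small sketch (not from the slides) of the "direct serialization of application objects into the value part of records" mentioned above; the KeyValueStore interface is deliberately hypothetical and merely stands in for a concrete store such as S3, SimpleDB, or Azure Tables.

  import java.io.ByteArrayOutputStream;
  import java.io.IOException;
  import java.io.ObjectOutputStream;
  import java.io.Serializable;

  public class KeyValuePersistence {

      /** Hypothetical key/value client - not a real product API. */
      interface KeyValueStore {
          void put(String key, byte[] value);
      }

      static class Customer implements Serializable {
          String id;
          String name;
          Customer(String id, String name) { this.id = id; this.name = name; }
      }

      static void save(KeyValueStore store, Customer c) throws IOException {
          ByteArrayOutputStream bytes = new ByteArrayOutputStream();
          try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
              out.writeObject(c);                       // the object itself becomes the value
          }
          store.put("customer/" + c.id, bytes.toByteArray());   // no schema, no ORM plumbing
      }
  }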
[The "Cloud Stack" overview figure is repeated here.]
Google App Engine
§ Allows running your own web applications on Google's infrastructure
§ Only Python is supported as a programming language
§ Sandbox with several limitations
  § Application code can only run as the result of an HTTP(S) request and may not spawn sub-processes
  § Cannot write to the file system
  § The application can only fetch URLs or use email services for communication
§ The Datastore
  § A distributed storage service with a query engine and transactions
  § Stores data objects of mixed types; each object consists of typed key/value pairs
  § Queried with GQL, a limited SQL variant
§ Google accounts
  § The application can access information from Google accounts
§ Currently free of charge: 500 MB of storage and 5M page views/month
  § Additional computing resources for purchase in the near future

Windows Azure
§ Architecture of web applications
  § An application has multiple instances, each running all or a part of the code
  § Each instance runs in its own VM (64-bit Windows Server 2008)
  § Developers may provide .NET applications
  § One-to-one mapping between VMs and CPU cores -> guaranteed performance

Windows Azure Instances
§ Web role instance
  § Can be accessed from the Web
  § Must be stateless!
    § The load balancer does not guarantee that requests from the same user will be dispatched to the same instance
    § Windows Azure storage must be used to maintain state
§ Worker role instance
  § For "background" processes
  § Uses queues in Azure storage for communication
§ Relation to three-tier architectures
  § Presentation tier -> web role instance
  § Application tier -> worker role instance
  § Data tier -> Windows Azure storage or MS SQL Server

Windows Azure Storage
§ Windows Azure storage
  § Blobs
    § For big, unstructured data (images, audio, video, ...)
    § One or more containers, each holding blobs
  § Tables
    § No defined schema; properties can have various types
    § Tables can be large and partitioned across many servers
  § Queues
    § Used for communication between instances
  § Accessed using REST, also from outside
  § All data is replicated three times

Windows Azure
§ Development process [figure]

Azure Platform
§ Windows Azure: a Windows environment in Microsoft data centers
§ SQL Azure: cloud-based SQL relational data services
§ AppFabric: cloud-based infrastructure services
§ Marketplace: an online service for purchasing cloud-based data and applications

Azure AppFabric
§ Service Bus
  § An application provides functionality through Web Services
  § How to make the services visible to the outside world?
    § A VM instance may have some random IP address and hostname
  § How to find services?
§ Access Control
  § Working with identity is fundamental to distributed applications
    § HTTP and web service invocations are basically stateless
    § Hence, identity and session support are essential for web/cloud frameworks
  § Azure provides an abstraction layer for authentication and authorization
§ Caching
  § A distributed in-memory cache service
  § Allows fast access to, e.g., session objects

Azure AppFabric: Service Bus
§ Invocation through the Service Bus:
1. The service registers its endpoints with the Service Bus (SB)
2. The SB exposes corresponding endpoints
3. The client uses the SB registry to find a service
4. The client invokes operations on the SB endpoint
5. The SB forwards the call to the service
§ NAT and firewall traversal
  § In step 1, the service keeps a TCP connection to the SB open
  § In step 5, the SB dispatches the call over this connection
§ Service Bus
  § Only the endpoints of the SB are visible to outside clients
  § Message buffers allow having a queue for messages
  § Multiple services can be registered at the same endpoint for load balancing

Azure AppFabric: Service Bus Registry
§ Proprietary Service Bus registry vs. DNS
  § names service endpoints <-> names hosts
  § updates reflected instantaneously <-> high update-propagation latency
  § read/write access with access control <-> limited write-access model
§ Programmatic access to naming
  § discover: Atom feed hierarchy
  § publish: Atom Publishing Protocol
§ Atom standards
  § The Atom Syndication Format is an XML language for web feeds
    § Services are described in registry feeds
  § Atom Publishing Protocol (AtomPub or APP)
    § A simple HTTP-based protocol for creating and updating web resources
  § Originally, Atom was developed as an alternative to RSS

Azure AppFabric: Access Control [overview figure]

Azure AppFabric: Access Control
§ The framework provides support for claims-based identity
  § The user sends a token full of claims that contain identity information
  § Tokens are issued by identity providers (IdPs)
    § e.g., Active Directory, Windows Live, Google, Facebook
§ Access control for web applications
  1. The user attempts to access the application via the browser
  2. The application redirects to the identity provider (IdP)
     § The user authenticates, e.g., with username/password
     § The IdP returns an IdP token containing claims about the user
  3. The browser sends the IdP token to Access Control
  4. Access Control validates the IdP token and creates an Access Control token (AC)
     § A rule engine can transform attributes (username, etc.) into a common representation
  5. The AC token is returned to the browser
  6. The browser submits the AC token to the application
  7. The application verifies the AC token, then uses the claims it contains

Azure AppFabric: Access Control
§ Accessing the user's identity in web applications
  § The example below uses Thread.CurrentPrincipal
  § It iterates through the claims and finds the email address

protected void SendLetter_Click(object sender, EventArgs e)
{
    IClaimsIdentity id =
        ((IClaimsPrincipal)Thread.CurrentPrincipal).Identities[0];

    // find the user's email address in the identity
    string email = null;
    foreach (Claim c in id.Claims)
    {
        if (c.ClaimType == ClaimTypes.Email)
        {
            email = c.Value;
            break;
        }
    }

    new SmtpClient().Send(new MailMessage(
        "[email protected]", email,
        "Message from Fabrikam",
        "Thank you for shopping with us!"));
}
Access Control for Web Services with WS-Trust (1)
1. The client requests the WSDL interface description from the service
   § The WSDL contains a policy (described in WS-Policy + WS-SecurityPolicy)
   § requiring a SAML (Security Assertion Markup Language) token

Access Control for Web Services with WS-Trust (2)
2. The client requests the WSDL from the Security Token Service (STS)
   § This WSDL contains a policy requiring, e.g., the username in a signed message

Access Control for Web Services with WS-Trust (3)
3. The client requests a token at the STS
   § The client sends a Request Security Token (RST), signed by the client
   § The STS validates the RST and sends an RST Response (RSTR) with the token, signed by the STS
Access Control for Web Services with WS-Trust (4)
4. The client calls the service
   § The request contains the token signed by the STS
   § The service validates the signature and executes the request

SQL Azure
§ The focus of Google's BigTable and Azure Storage Tables is more on analytics and OLAP workloads
  § Large linear reads and writes
§ However, business applications require traditional transaction processing
  § Read and update workloads
§ An RDBMS provides data-processing capabilities on top of a storage system
  § E.g., a single query (a few bytes) to the DB will cause it to retrieve the data (possibly many gigabytes) from disk into memory, filter the data (down to several hundred megabytes in common scenarios), calculate the sum of the sales figures, and return the sum (a few bytes).
§ SQL Azure
  § Provides a cloud-based DBMS
  § Here: the DBMS is a cloud infrastructure component, outside the compute VMs
    § (Mostly, DBMSes run inside VMs)
§ SQL Azure Data Sync
  § Allows synchronizing data between a SQL Azure DB and an on-premises SQL Server DB
  § Can also be used to synchronize data across different SQL Azure DBs in different Microsoft data centers

SQL Azure
§ SQL Azure scalability
  § Traditional SQL databases do not scale well
    § Also see the earlier discussion on "DBMS vs. Key/Value Store"
    § Current database research: fundamental re-design of DBMSes for clouds!
  § The size of a single DB is currently limited to 50 GB
§ Sharding (... and the trouble with it)
  § Partition the data over several DBs, e.g.,
    § store the product database (small) in DB1
    § store customers A..L in DB2 and M..Z in DB3
  § Must be handled by the application! = probably a lot of code has to change (a minimal routing sketch follows after the architecture figure below)
    § What if the A..L part is much bigger?
    § What if all the key customers are in M..Z?
  § The selection of a partition key for sharding depends on
    § the number of rows that will be in each shard and
    § the usage profile of the candidate shard over time
  § How and when to rebalance?

Transaction Processing in the Cloud
§ What can we expect from Amazon, Microsoft Azure, and Google when we want to do transaction processing?
  § Performance?
  § Costs?
§ The results of a TPC-W benchmark are presented in the following
Source: Donald Kossmann, Tim Kraska, and Simon Loesing: An Evaluation of Alternative Architectures for Transaction Processing in the Cloud, in: ACM SIGMOD'10, 2010.

Distributed Database Architectures
§ Classic database architecture
  § A load balancer dispatches client requests to an available application server
  § A central database server ensures ACID
    § (atomicity, consistency, isolation, durability)
  § The DB server can be highly efficient (use of SAN, SSDs, etc.), but it is still the bottleneck
  § Scalability and elasticity at the application-server and storage tiers
[Figure: Clients -> (HTTP) -> Web Server + App Server -> (SQL) -> DB Server -> (Put/Get) -> Storage.]
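The shard-routing sketch referred to above (not from the slides): the application itself decides which database a request goes to, with the product catalogue in one database and customers split by name range. The connection strings are placeholders, and rebalancing is deliberately not handled, which is exactly the trouble the slide points out.

  public class ShardRouter {
      // Small product database kept in a single shard
      static String productDb() { return "<connection string for DB1>"; }

      // Customers are split by the first letter of their name
      static String customerDb(String customerName) {
          char first = Character.toUpperCase(customerName.charAt(0));
          return (first <= 'L')
                  ? "<connection string for DB2>"   // customers A..L
                  : "<connection string for DB3>";  // customers M..Z
      }
  }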
§ Partitioning
  § The database is logically partitioned and each partition is controlled by a separate server
  § The problems of partitioning were discussed before
  § Problem with scalability/elasticity: adding/removing resources requires repartitioning of the data
  § Remedy: combination with replication

Distributed Database Architectures
§ Replication
  § Each database controls a copy of the whole database (or of a partition)
  § ROWA (read-one, write-all)
    § A read can be serviced from the local copy
    § A write requires a synchronous update of all replicas
  § Good scalability for read-mostly workloads
  § For update-intensive workloads, the master can become the bottleneck
§ Distributed control
  § A shared-disk architecture with loose coupling between the nodes
  § The DB servers concurrently and autonomously access the storage system
  § Different levels of consistency are guaranteed by distributed protocols
  § The database tier is merged with the web+app tier

The TPC-W Benchmark
§ Transaction Processing Performance Council Web e-Commerce benchmark (TPC-W)
§ End-to-end performance evaluation of enterprise web applications that involve transaction processing
§ TPC-W models an online bookstore with a mix of fourteen different kinds of requests
  § The database size is 315 MB of raw data, about 1 GB in total
§ The benchmark models Emulated Browsers (EBs)
  § Each EB simulates one user who issues a request (according to the probabilities of the workload mix), waits for the answer, and then issues the next request after a specified waiting time
§ Throughput metric: measures the number of valid TPC-W requests per second (= WIPS)
  § valid = the response must arrive within 3..20 seconds, depending on the request kind
§ Cost metric: measures Cost/WIPS

Transaction Processing: Benchmark [results figure]

Transaction Processing: Benchmark
§ Amazon S3
  § S3 only provides a low-level put/get interface
  § SQL query processing is implemented in a custom library
  § Provides eventual consistency - distributed control
  § Results: shows optimal behavior
§ Azure
  § SQL Azure details are not published by MS; however, it is based on replication
  § The TPC-W DB is small enough for the 50 GB limit
  § Results: shows optimal behavior

Transaction Processing: Benchmark
§ MySQL
  § A single MySQL server in a separate Amazon EC2 instance
    § The MySQL Cluster version had worse performance!
  § Results:
    § Throughput plateaus after 3500 EBs and stays constant
    § All requests return answers, but a growing percentage of requests are not answered within the response-time constraints
§ Amazon Relational Database Service (RDS)
  § A pre-packaged MySQL server
  § Results: no difference to the above

Transaction Processing: Benchmark
§ Amazon SimpleDB (SDB)
  § Uses a combination of partitioning and replication
  § Only supports low-level insert, update, delete
    § Similar to BigTable and Azure Storage Tables
    § SQL joins and aggregations have to be implemented at the application level
  § Results:
    § WIPS grow up to about 3000 EBs and more than 200 WIPS
    § Already overloaded at about 1000 EBs and 128 WIPS
    § In an overload situation, SimpleDB simply drops requests and returns errors
§ Google App Engine (AE/C)
  § Uses a combination of partitioning and replication
  § Has the proprietary query language GQL
    § Does not support GROUP BY, aggregate functions, joins, or LIKE predicates
  § Results:
    § Like SimpleDB, Google AE drops requests in overload situations
    § Both read and write requests of all kinds are dropped in overload situations
    § Google's focus is on supporting low-end workloads for the lowest possible price

Transaction Processing: Benchmark
§ Costs: cost per WIPS in m$ at different numbers of EBs [figure]

Summary
§ Cloud Computing objectives
§ Types of cloud service: SaaS, PaaS, IaaS
§ Cloud vs. Grid Computing, scalability
§ Architecture of a cloud platform, "The Cloud Stack"
  § Block storage: Amazon S3, Google File System
  § Virtualization: Amazon EC2
  § Batch processing: MapReduce, Hadoop
  § Structured distributed storage: BigTable
  § Google App Engine, Microsoft Azure
§ How cloud computing achieves scalability
  § Decomposition of application functionality into roles
  § Scalable persistent storage, DBMS vs. key/value stores
  § Message-oriented middleware (MOM) for communication between instances
  § Scalable computation in the presence of a slow interconnection network

Questions

Literature
§ Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung. "The Google File System". In Symposium on Operating Systems Principles (SOSP), 2003.
§ Jeffrey Dean, Sanjay Ghemawat. "MapReduce: Simplified Data Processing on Large Clusters". In Operating Systems Design and Implementation (OSDI), 2004.
§ Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber. "Bigtable: A Distributed Storage System for Structured Data". In Operating Systems Design and Implementation (OSDI), 2006.
§ Tony Bain. "Is the Relational Database Doomed?". Retrieved Feb 03, 2010, from http://www.readwriteweb.com/enterprise/2009/02/is-the-relational-database-doomed.php, 2009.
§ Donald Kossmann, Tim Kraska, and Simon Loesing. "An Evaluation of Alternative Architectures for Transaction Processing in the Cloud". In ACM SIGMOD'10, 2010.
§ David Chappell. "Introducing the Windows Azure Platform". Retrieved Dec 07, 2010, from http://www.microsoft.com/windowsazure/
§ Simon Munro. "The Trouble With Sharding". Retrieved Dec 07, 2010, from http://simonmunro.com/2009/09/10/the-trouble-with-sharding/
§ Keith Brown, Sesha Mani. "Microsoft Windows Identity Foundation (WIF) Whitepaper for Developers", 2009.