slides - VECPAR

Transcription

slides - VECPAR
A P2P Approach to Many Tasks
Computing for Scientific Workflows
Eduardo Ogasawara
Daniel Oliveira
Carlos Pivotto
Vanessa Braganholo
Marta Mattoso
Jonas Dias
Carla Rodrigues
Rafael Antas
Patrick Valduriez
COPPE/UFRJ
PPGI/UFRJ
Université Montpellier/INRIA
VECPAR’10
Our Scenario
• Scientific Experiments
– Large scale simulations
– Chain of programs and activities
– Scientific Workflows
• SWfMS (Kepler, VisTrails)
• Exploration with different methods, parameters or data
– Many Task Computing (MTC)
6/24/2010
A P2P Approach to Many Tasks Computing
for Scientific Workflows
2
What we have
• Workflow parallelization
– Homogenous environments
• Multiprocessor or cluster systems
• Centralized control
• High‐throughput, low latency, high performance
– Heterogeneous environments
• Grids, desktop grids and volunteering computing,
hybrid clouds
• Different efforts to parallelize the workflow
6/24/2010
A P2P Approach to Many Tasks Computing
for Scientific Workflows
3
The problems
• To parallelize in the context of scientific
workflows
– Independent of the target environments
– Better control of the experiment
• Provenance gathering
– On heterogeneous distributed environment
– Experiments must be reproducible
• Control X Performance
6/24/2010
A P2P Approach to Many Tasks Computing
for Scientific Workflows
4
What we propose
• Peer‐to‐Peer approach
– Decentralized control
– Scalability
– Dynamic behavior of nodes
• P2P to support MTC
– Through Scientific Workflows
• SciMule
– A tool to parallelize and distribute scientific
workflows activities through P2P computing
6/24/2010
A P2P Approach to Many Tasks Computing
for Scientific Workflows
5
Agenda
1. Backgrounds on P2P networks
– P2P approaches overview and comparison
2. SciMule Architecture
– Features and strategies
3. Experimental results regarding SciMule
– Does it work?
4. Conclusions
6/24/2010
A P2P Approach to Many Tasks Computing
for Scientific Workflows
6
P2P types
Super‐Peer
KaZaA
Napster
e2dk
6/24/2010
DHT
Hierarchical
Torrent
Kademlia
Canon
Gnutella
A P2P Approach to Many Tasks Computing
for Scientific Workflows
7
Factors that impact on P2P
architectures
• Load Balancing
– An activity can generate thousands of tasks
• Scalability
– Large Scale Networks
• Churn risk
– How impactful a churn event can be
• Maintenance cost
– Affects the scheduling process
6/24/2010
A P2P Approach to Many Tasks Computing
for Scientific Workflows
8
Comparing
Network type
Centralized
Decentralized
Hierarchical
Load Balancing
Low
High
Moderate
Scalability
Low
Moderate
High
Churn Risks
High
Low
Moderate
Maintenance Cost
Low
High
Moderate
6/24/2010
A P2P Approach to Many Tasks Computing
for Scientific Workflows
Factor
9
Design of SciMule
SWfMS
• Middleware
SciMule
Peer
– Distribute, control and monitor Wf Activities
• Promote Workflow/Activity Parallelization
• Hierarchical approach
• Three‐layer architecture
– Submission layer
– Execution layer
– Overlay layer
6/24/2010
A P2P Approach to Many Tasks Computing
for Scientific Workflows
10
SciMule Peer Roles
• Client Peer
– Submits
• Executor Peer
– Executes
• Gate Peer
– Keeps the list of nearby nodes and their subject
• Peers may play any role during SciMule
lifetime
6/24/2010
A P2P Approach to Many Tasks Computing
for Scientific Workflows
11
SciMule Purpose
• To make it easier to distribute Wf activities
– To make scientists happier!
• Using MTC paradigm
– MTC without control are hard to maintain
• Activities that demand high computation
• Adaptation of Hydra
– Real system
– Our approach for clusters
6/24/2010
A P2P Approach to Many Tasks Computing
for Scientific Workflows
12
SciMule Architectural Features
• Two types of parallelization
– Data parallelism
– Parameter sweep
• Distributed provenance gathering
– Heterogeneous network
• Simple to deploy
• Linked to SWfMS
6/24/2010
A P2P Approach to Many Tasks Computing
for Scientific Workflows
13
6/24/2010
A P2P Approach to Many Tasks Computing
for Scientific Workflows
14
SciMule Overlay
• Overlay affects the performance
• Balance peers
– Locality principles
– Subjects
• Gate Peers
–
–
–
–
6/24/2010
Keep a list of nearby nodes
Control subjects
Have low churn frequency and high reputation
Keep a backup node
A P2P Approach to Many Tasks Computing
for Scientific Workflows
15
Overlay characteristics
• New peers connect to few neighbors
– Half the average of the old peers neighborhood
– Avoid free riders
• Neighborhood grows over time
• Controlled connectivity
– Based on the number of gate peers on the
network
6/24/2010
A P2P Approach to Many Tasks Computing
for Scientific Workflows
16
Overlay behaviour
• Network with n nodes and g Gate Peers
…
Gate Peers
n/g
n/g
n/g
6/24/2010
A P2P Approach to Many Tasks Computing
for Scientific Workflows
17
Overlay
…
Gate Peers
n/g
n/g
6/24/2010
+
A P2P Approach to Many Tasks Computing
for Scientific Workflows
=2n/g
18
Overlay
…
Gate Peers
+ n/g
Maximum of neighbors: 3n/g
6/24/2010
A P2P Approach to Many Tasks Computing
for Scientific Workflows
19
SciMule Evaluation
• Evaluate the proposed architecture
– Does SciMule parallelization scale?
• Simulation environment
– Experimentation on real large scale P2P is expensive
and difficult
– Flexibility to evaluate hybrid networks
• PeerSim extension
– SciMule peers, Link, Data package, Workflow
Activities and Task components
– Scheduling, transference and execution systems
– Overlay
6/24/2010
A P2P Approach to Many Tasks Computing
for Scientific Workflows
20
The study
•
•
•
•
Five‐factor, four‐treatment study
14400 cycles of simulation (5 days simulated)
4096 nodes in the network
0.01 Poisson activity submission frequency
Factors
Independent Variables
cycles
14400
n
4096
f
k
0.01
32
64
128
256
6/24/2010
tasks
128
512
cost
size
churn
4000
8000
12000
24000
0.00
0.05
16000
32000
48000
96000
0.10
A P2P Approach to Many Tasks Computing
for Scientific Workflows
21
Presented Scenarios
• 384 instances of simulation
• Representative Activities cases
– Churn Poisson frequency: 0%, 5% and 10%
Activity Name
Tasks
Low Cost Medium Size
Low Cost Big Size
Medium Cost Small Size
High Cost Small Size
512
128
128
512
6/24/2010
TaskCost
TaskSize (MB)
(p.u.)
4,000
6
4,000
12
16,000
1.5
32,000
1.5
A P2P Approach to Many Tasks Computing
for Scientific Workflows
22
Parameter Sweep results
Activity Name
Tasks
TaskCost
TaskSize (MB)
(p.u.)
Low Cost Medium Size
512
4,000
6
Low Cost Big Size
128
4,000
12
A P2P Approach to Many Tasks Computing
6/24/2010
23
for
Scientific Workflows
Medium
Cost Small Size
128
16,000
1.5
Data Parallelism results
Activity Name
Tasks
TaskCost
TaskSize (MB)
(p.u.)
Low Cost Medium Size
512
4,000
6
Low Cost Big Size
128
4,000
12
A P2P Approach to Many Tasks Computing
6/24/2010
24
for
Scientific Workflows
Medium
Cost Small Size
128
16,000
1.5
Conclusions
• SciMule scales
– Distributing scientific workflow activities over a
P2P network
– Promising approach to deal with MTC on
heterogeneous environments
• P2P
• Hybrid Clouds
• Distribution approaches should vary
• Impact of churn events
6/24/2010
A P2P Approach to Many Tasks Computing
for Scientific Workflows
25
Future Work
• To improve the scheduling mechanism
– Considering data transferring minimization
• Minimize data transfer impact
– Improve data discovery
– Improve data distribution
• Torrent
• Compression
• Replication
6/24/2010
A P2P Approach to Many Tasks Computing
for Scientific Workflows
26
Acknowledgements
6/24/2010
A P2P Approach to Many Tasks Computing
for Scientific Workflows
27
A P2P Approach to Many Tasks Computing for
Scientific Workflows
Thanks!
Eduardo Ogasawara
Daniel Oliveira
Carlos Pivotto
Vanessa Braganholo
Marta Mattoso
Jonas Dias
Carla Rodrigues
Rafael Antas
Patrick Valduriez
COPPE/UFRJ
PPGI/UFRJ
Université Montpellier/INRIA
VECPAR’10