Clusterix: Network Management System

Michał Balcerkiewicz
Bartosz Belter
Artur Binczewski
Radosław Krzywania
Maciej Stroiński
Jan Węglarz
Clusterix Network Management: Goals and Objectives
• Building network infrastructure
• Using network as a GRID resource
• Dynamically attaching new clusters
• Active network monitoring
Clusterix Network Architecture
• Communication to all clusters is passed through router/firewall
• Routing is based on the IPv6 protocol, with IPv4 kept for backward compatibility
• Application and Clusterix middleware are adjusted to IPv6 usage
• For security reasons, only outgoing connections to the Internet are permitted
• Two 1 Gbps VLANs are used to improve management of network traffic
– Communication VLAN is dedicated to the exchange of messages between nodes
– NFS VLAN is dedicated to file transfer
[Diagram labels: Local Cluster, Switch, Access Node, Computing Nodes, Communication & NFS VLANs, Clusterix Storage Element, PIONIER Core Switch, 1 Gbps Backbone Traffic, Internet Network Access, Router, Firewall]
Network as a GRID resource
• Network can be seen as a set of parametrized resources
• Knowledge of network utilization is used by task broker to improve its
job by choosing optimal routes for task delegation
• Network management module:
– provides the following metrics:
- Round trip time
- Throughput
- Out of order packets
- Duplicated packets
- Packet jitter
- Lost packets
– provides information about device accessibility
– provides management information via SNMP
• All parameters can be accessed via industry standard Web Services
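As an illustration of how several of the metrics listed above could be derived from one stream of echoed probe packets, here is a minimal sketch. It is not the Clusterix implementation: the function name `summarize`, the tuple layout, and the jitter definition (mean absolute difference between consecutive round trip times) are assumptions made for this example. Throughput is left out because it needs payload sizes, which this sketch does not model.

```python
from statistics import mean


def summarize(arrivals, sent_count):
    """Summarize echoed probes (hypothetical helper, not part of Clusterix).

    arrivals: list of (seq, sent_time, recv_time) tuples in arrival order.
    sent_count: number of distinct probes the sender emitted.
    """
    seen = set()
    duplicates = 0
    out_of_order = 0
    highest_seq = -1
    rtts = []
    for seq, sent, received in arrivals:
        if seq in seen:
            # Same sequence number arrived again: a duplicated packet.
            duplicates += 1
            continue
        seen.add(seq)
        if seq < highest_seq:
            # A newer probe already arrived before this one: reordering.
            out_of_order += 1
        highest_seq = max(highest_seq, seq)
        rtts.append(received - sent)
    # Assumed jitter definition: mean absolute change between consecutive RTTs.
    if len(rtts) > 1:
        jitter = mean(abs(b - a) for a, b in zip(rtts, rtts[1:]))
    else:
        jitter = 0.0
    return {
        "lost": sent_count - len(seen),
        "duplicated": duplicates,
        "out_of_order": out_of_order,
        "mean_rtt": mean(rtts) if rtts else None,
        "jitter": jitter,
    }
```

A real agent would additionally timestamp with a synchronized or single clock and report the results to the manager; both are outside this sketch.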
Integration with Broker
Application A – distributed computation, high communication (small chunks of data)
Application B – visualisation, less communication, heavy use of data, massive output results
Request A {
Max_Clusters = 4;
Processors = 16;
RTT = 5ms;
Bandwidth = 5Mb/s;
Packet_loss = 0%;
}
Request B {
Max_Clusters = 2;
Processors = 16;
RTT = 20ms;
Bandwidth = 500Mb/s;
}
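To make the broker integration concrete, the sketch below shows one way a broker might match such requests against measured network metrics. Everything here is an assumption for illustration: the class and function names are hypothetical, RTT and Packet_loss are read as upper limits and Bandwidth as a lower limit, each cluster is described by a single set of link measurements, and the greedy selection strategy is not taken from the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    max_clusters: int
    processors: int
    rtt_ms: float                            # assumed: maximum acceptable RTT
    bandwidth_mbps: float                    # assumed: minimum required bandwidth
    packet_loss_pct: Optional[float] = None  # None means no loss constraint


@dataclass
class Cluster:
    name: str
    free_processors: int
    rtt_ms: float
    bandwidth_mbps: float
    packet_loss_pct: float


def link_ok(req, cluster):
    """Check the measured link metrics of one cluster against the request."""
    if cluster.rtt_ms > req.rtt_ms:
        return False
    if cluster.bandwidth_mbps < req.bandwidth_mbps:
        return False
    if req.packet_loss_pct is not None and cluster.packet_loss_pct > req.packet_loss_pct:
        return False
    return True


def select_clusters(req, clusters):
    """Greedy choice: largest eligible clusters first, at most max_clusters.

    Returns the chosen cluster names, or None if the request cannot be met.
    """
    eligible = sorted(
        (c for c in clusters if link_ok(req, c)),
        key=lambda c: c.free_processors,
        reverse=True,
    )
    chosen, remaining = [], req.processors
    for cluster in eligible[: req.max_clusters]:
        if remaining <= 0:
            break
        chosen.append(cluster.name)
        remaining -= cluster.free_processors
    return chosen if remaining <= 0 else None
```

Under these assumptions, Request A would be expressed as `Request(4, 16, 5, 5, 0)` and Request B as `Request(2, 16, 20, 500)`.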
Purposes of dynamic cluster attachment
• External clusters can be easily attached to Clusterix
infrastructure in order to:
– Increase computing power with new clusters
– Utilize external clusters during nights or non-active
periods
– Make Clusterix infrastructure scalable
Dynamic Cluster Attachment - Architecture
• Dynamic cluster attachment:
– Requirements need to be checked against new clusters
• Installed software
• SSL certificates
– Communication through router/firewall
– Network Management System will automatically discover new resources
– New cluster can serve computing power on a regular basis
[Diagram labels: Local Switch, PIONIER Backbone Switch, Internet, Regular Cluster, Router, Firewall, Dynamic Resources]
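The requirement check for a new cluster could look roughly like the sketch below. The slides name only the two requirement categories (installed software and SSL certificates); the package names, the `CandidateCluster` type, and the idea of checking only the certificate validity period are assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical package names; the slides do not list the required software.
REQUIRED_SOFTWARE = frozenset({"middleware", "measurement-agent"})


@dataclass
class CandidateCluster:
    name: str
    installed_software: set
    cert_not_before: datetime
    cert_not_after: datetime


def admission_problems(cluster, now, required=REQUIRED_SOFTWARE):
    """Return a list of reasons the cluster cannot be attached (empty if none)."""
    problems = []
    missing = sorted(required - cluster.installed_software)
    if missing:
        problems.append("missing software: " + ", ".join(missing))
    # Assumption: only the validity window is checked here; a real check
    # would also verify the certificate chain against a trusted CA.
    if not (cluster.cert_not_before <= now <= cluster.cert_not_after):
        problems.append("SSL certificate not valid at check time")
    return problems
```

A cluster with an empty problem list would then be handed to the Network Management System for discovery.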
Active network monitoring
• Measurement architecture
– Distributed 2-level measurement agent mesh (backbone/cluster)
– Centralized control manager (multiple redundant instances)
– Switches are monitored via SNMP
– Measurement reports are stored by the manager (forwarded to database)
– IPv6 protocol and addressing schema are used for measurement
[Diagram labels: SNMP Monitoring, Network Manager, Measurement Reports, Computing Cluster, PIONIER Backbone Measurements, Local Cluster Measurements]
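One possible reading of the 2-level agent mesh is sketched below: a full mesh between one agent per cluster on the backbone level, and a full mesh between the agents inside each cluster on the local level. The slides do not state how agent pairs are chosen, so the full-mesh layout, the function name, and the convention that the first listed host faces the backbone are all assumptions.

```python
from itertools import combinations


def measurement_pairs(clusters):
    """Build backbone-level and cluster-level measurement pairs.

    clusters: dict mapping cluster name to a list of agent hosts. By
    assumption, the first host in each list is the backbone-facing agent.
    """
    backbone_agents = [hosts[0] for hosts in clusters.values() if hosts]
    # Level 1: every backbone-facing agent measures against every other one.
    backbone = list(combinations(backbone_agents, 2))
    # Level 2: inside each cluster, every agent pair is measured.
    local = {name: list(combinations(hosts, 2)) for name, hosts in clusters.items()}
    return backbone, local
```

Splitting the mesh this way keeps the number of backbone sessions proportional to the square of the cluster count rather than the square of the total node count.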
System Manager
• Backup managers improve failure recovery (active manager switching)
• External applications are allowed to retrieve various network statistics
• Device and agent management modules collect network data
• GUI shows network status and configures the manager
• Statistics are stored in an external database (a short-term backup is kept in the manager)
[Diagram labels: System Resources, External Entities]
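Active manager switching could, for example, be driven by heartbeats, as in the sketch below. The slides only state that backup managers exist and that the active manager can be switched; the heartbeat mechanism, the priority ordering, the timeout value, and the function name are assumptions for illustration.

```python
def pick_active(managers, last_heartbeat, now, timeout=10.0):
    """Return the highest-priority manager with a fresh heartbeat.

    managers: manager names in priority order (primary first).
    last_heartbeat: dict mapping manager name to its last heartbeat time.
    now, timeout: current time and maximum heartbeat age, same time unit.
    Returns None when no manager is considered alive.
    """
    for name in managers:
        seen = last_heartbeat.get(name)
        if seen is not None and now - seen <= timeout:
            return name
    return None
```

For instance, if the primary last reported 20 seconds ago and the first backup 2 seconds ago, the first backup becomes the active manager.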
Software Architecture
[Diagram labels: Database, External Clients, Controller, GUI, External Interfaces, Backup Manager, Redundancy Controller, System Logic, Measurement Agents Manager, Backbone measurements, Local Cluster measurements, Device Manager, Devices]
Graphical User Interface
• GUI
– Provides a view of network status
– Displays statistics
– Simplifies network troubleshooting
– Allows measurement sessions to be configured
– Supports topology browsing
Summary
• Network is used as a regular GRID resource
• Measurements are shared with other tools
• Dynamic architecture allows easy computing power upgrades
• Failure-resistant network monitoring system
Thank you for your attention!
Visit http://www.clusterix.pcz.pl