Private Cloud at Wipro: Cloud computing based on Condor
© 2009 Wipro Ltd - Confidential

Agenda
1 Background
2 Wipro Private Cloud
3 System architecture
4 Use of Condor

Background
Need:
• Share physical infrastructure between multiple projects and CoEs (Centers of Excellence) to reduce server sprawl and the number of physical labs
• Provide an environment for evaluating new technologies, developing solutions and enabling collaboration between multiple labs
• Centralize infrastructure procurement and management
• Reduce infrastructure cost of CoEs by enabling multiple development environments
Solution:
• Set up a private cloud for virtual compute and application infrastructure
• Build a self-service portal for on-demand provisioning to reduce process overheads
• Support multiple types of virtualization software
• Reuse existing physical infrastructure, procure minimal new infrastructure

Wipro Private Cloud
[Diagram: layered view of the Wipro Private Cloud. Labels recovered from the figure:]
• Users: Wipro Users, SaaS User, Intranet Developers, Cloud Admin
• Workloads: SaaS App, SaaS Mgmt, Virtual Lab, SaaS Enablers
• Access: Managed Network, Wipro Cloud Portal / Web Services API Layer, Cloud OA&M Portal
• Services: Virtual Machines, Shared Storage, Virtual Appliances, Application Services
• Wipro Cloud Core
  – Automated Provisioning
  – Multi-tenancy & Isolation
  – Cloud Accounting & Auditing
  – Performance & Fault Monitoring
  – Automated Network & Security
• Physical Resource Pool - Servers, Storage, Network

Cloud Services catalogue
Service Element: Service Feature
• Compute Servers
  – Virtual desktop: equivalent to 1.2GHz, 512MB RAM, 10GB HDD, 25Mbps N/w
  – Low End Server: equivalent to 2x1.2GHz, 2GB RAM, 20GB HDD, 25Mbps N/w
  – High End Server: equivalent to 4x1.2GHz, 4GB RAM, 40GB HDD, 25Mbps N/w
• OS types
  – Linux (CentOS, RHEL) and Windows XP/Server on Intel x86, x86_64 architecture
• Storage
  – iSCSI (RAID 5), NFS and CIFS
  – Data persistence across power-off, suspend & resume of VMs
• Public images/appliances
  – Ready-to-use public images: RHEL 5, Windows XP, LAMP (CentOS 5.2, Apache, Axis, Tomcat, MySQL, PHP, Python)
  – Preconfigured software load balancer, firewall appliances
• Network
  – Isolation between CoEs' resources
  – IPSec, SSL based VPN
  – Public and private IP addresses with NAT support
• Private images
  – Can upload VMware Server, VMware ESX and Xen virtual machine image formats
• Reports
  – Reporting on CPU, storage and memory usage back to user

Levels of Service
• L1 - Virtual servers on demand
  – Virtual servers, desktops, storage
  – Migration assistance
  – Self-service portal
• L2 - Application infrastructure on demand
  – Appliances of standard software
  – Managed backup, proactive monitoring and help-desk
  – Itemized billing and charge-back
• L3 - Business service infrastructure on demand
  – Scalable business services
  – Multi-tenant application infrastructure (content management, identity management, database, load balancer, firewall, ...)

System Architecture

Private Cloud – in Action
[Diagram: architecture and service layers of the cloud service. Labels recovered from the figure:]
• Layers: Network Control; Service Layer (Load Balancer Service: LB active, LB passive); App Layer (Inst 1, Inst 2, ... Inst n); Virtual Machine Layer (VM 1, VM 2, ... VM n); Bare-metal Layer
• Virtual machine design: Standardize, Automate, Agile, Caching, Appliances
• Bare-metal design: Standardize, Automate, Re-provisioning
• Provisioning: Resource mgmt, Workload mgmt, Auto recovery, Task & Process Automation, Configuration & Change mgmt
• Monitoring: Performance, Availability, Alarms, Billing
• Cloud Management: SLAs, policies, rules, priorities; Packaging; Custom agents; Shared Services; Billing parameters
• Service design: Design, Test; Package, Deploy
• Management: Service Governor, Policy enforcement, Incident mgmt, Optimizer, Contention
• Access: OA&M Portal & Web Service Gateway, used by Customer OA&M, Business Users, Developers and Operations

System Components
[Diagram: system components. Labels recovered from the figure:]
• Web Service Gateway, Customer Portal, Charge-back, Service Governor, Metrics Monitor, Alerts
• Grid Scheduler, VM Caching, Workflow Manager, Cloud State, VM Repo, Identity Management
• N/W plug-in: N/W provisioning
• Storage plug-in: Storage provisioning
• Bare-metal plug-in: Bare-metal provisioning
• VM plug-in: VM provisioning
• Nagios plug-in: N/W (Nagios) monitoring
Legend: Developed in Wipro; In Development; 3rd Party components

Deployment Example
[Diagram: deployment example. Labels recovered from the figure:]
• Edge: router, firewall, VPN server, IPS, IDS, NAT
• Isolated network per project, each with its virtual machines and virtual storage
  – Project X 192.168.5.0/24
  – Project Y 192.168.6.0/24
  – Project Z 192.168.7.0/24
• Cloud Backbone 10.201.72.0/24
• Cloud Mgmt 192.168.3.0/24, with Mgmt Server HA pair
• Cloud physical systems connected by the switch fabric

Use of Condor

Why Condor?
• Trusty old features
  – Flexibility: ClassAd mechanism, configurations and policies
  – Web Services API
  – High availability
  – Resource utilization of jobs
• Newer features we like
  – VM Universe
  – Partitionable Slots
  – Lease management
  – Integration with Amazon EC2 (public cloud)
• Proven in large scale deployments
• Condor-users and condor-admin support
• Open source

How are we using Condor?
• Mostly standard configuration
• A few custom ClassAds in jobs and machines
• Schedd and Collector configured in HA mode
• Condor spool for VM persistence
• Virtual machine provision requests handled by Condor
  – VM job to physical machine match-making, file transfer
• Partitionable slots for dynamic partitioning of physical machine resources
• Customized condor_vm_* files for configuring and starting VMs
  – VLAN control, swap disk and additional storage creation, ...
• Lease management for limiting the number of running instances of a licensed image

Observations, Workarounds, Wish list
Working with Condor:
  – With advanced Condor skills, a lot can be achieved without modifying Condor code
Workarounds:
  – Passing number of virtual CPUs to VMware
  – Patch to pass proxy username and password to gSOAP for EC2 integration
  – Patch to get VM resource usage details on ESX
  – Special configuration to handle 2 hour delay in detecting a few execute node failures (Thanks Todd!)
Feature wish list:
  – Remote IWD support for VM universe, to avoid any file transfer
  – Live migration of VM jobs

Thank You
[email protected] [email protected]
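
Backup: partitionable slots, sketched
The slides say that partitionable slots are used to divide a physical machine's resources dynamically among VM jobs. The Python sketch below is only an illustration of that idea, not Wipro's code and not Condor's implementation. The class `PartitionableSlot`, the function `provision`, the host names and the first-fit placement are all hypothetical; real Condor matchmaking evaluates ClassAd requirements and rank expressions instead. The sizes come from the services catalogue, assuming that "equivalent to N x 1.2GHz" maps to N virtual CPUs.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Compute sizes from the services catalogue: (vCPUs, RAM in MB, disk in GB).
# Assumption: "equivalent to N x 1.2GHz" is modelled as N virtual CPUs.
CATALOGUE = {
    "virtual_desktop": (1, 512, 10),
    "low_end_server": (2, 2048, 20),
    "high_end_server": (4, 4096, 40),
}


@dataclass
class PartitionableSlot:
    """Hypothetical stand-in for one physical machine advertised as a single
    partitionable slot. This is not a Condor data structure."""
    name: str
    cpus: int
    memory_mb: int
    disk_gb: int
    dynamic_slots: List[Tuple[str, str]] = field(default_factory=list)

    def carve(self, vm_name: str, size: str) -> bool:
        """Split off a dynamic slot for the VM if the leftover resources allow it."""
        cpus, memory_mb, disk_gb = CATALOGUE[size]
        if cpus > self.cpus or memory_mb > self.memory_mb or disk_gb > self.disk_gb:
            return False
        # Deduct the VM's share; what remains stays available for later requests.
        self.cpus -= cpus
        self.memory_mb -= memory_mb
        self.disk_gb -= disk_gb
        self.dynamic_slots.append((vm_name, size))
        return True


def provision(vm_name: str, size: str,
              machines: List[PartitionableSlot]) -> Optional[str]:
    """First-fit placement (a simplification of ClassAd matchmaking).
    Returns the chosen machine name, or None if nothing fits."""
    for machine in machines:
        if machine.carve(vm_name, size):
            return machine.name
    return None


if __name__ == "__main__":
    # Hypothetical hosts, for illustration only.
    hosts = [
        PartitionableSlot("host-a", cpus=4, memory_mb=8192, disk_gb=100),
        PartitionableSlot("host-b", cpus=8, memory_mb=16384, disk_gb=200),
    ]
    requests = [("vm1", "high_end_server"), ("vm2", "low_end_server"),
                ("vm3", "virtual_desktop")]
    for vm, size in requests:
        print(vm, "->", provision(vm, size, hosts))
```

Because each placement only deducts what the VM needs, one host can carry a mix of desktops and servers until any one resource (CPU, memory or disk) runs out.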