High-Productivity, Standards-Based Computing for Weather Forecasting and Climate Modelling
Dave Parry, Senior Vice President of Products, SGI
September 24, 2007
Slide 2: Computers are beginning to catch up to ideas
"64,000 'computers' working together" – Lewis Fry Richardson, Weather Prediction by Numerical Process, 1922

Slide 3: Top 500 System Sizes
[Chart: Top 500 systems with 8,192 or more cores, 2003–2007. The number of such systems grows from 3 in 2003 to 4 in 2004, 10 in 2005, 19 in 2006 and 29 in 2007, and the largest now exceed the 64,000 "computers" Richardson imagined.]

Slide 4: Recent trends in system deployment
[Charts: Top 500 systems by type and Top 500 weather/environment systems by type, 2003–2007; weather/environment system size by type. Types tracked: Vector, SMP/Constellation, MPP/Specialty and Clusters, with the Earth Simulator (vector) marked at the top of the size chart.]

Slide 5: Altix systems for Weather Forecasting
• Finnish Meteorological Institute – 304p
• KNMI (Netherlands) – 224p
• Hungarian Meteorological Service – 144p
• KMI (Belgium) – 56p
• Puertos del Estado (Spain) – 20p
• Catalan Meteorological Service (CESCA/MeteoCat) – 128p
• Roshydromet – 112p
• Romanian Met – 2p
• Meteo Croatia – 16p
• Desert Research Institute (DRI) – 72p
• NOAA NSSL – 64p
• BAMS – 24p
• INMET Brazil – 64p
• China Met. Administration – 22p
• China Met. Administration, Institute of Arid Meteorology – 20p
• Shanghai Meteorology Center – 64p
• Yunnan Meteorology Bureau – 80p
• Sichuan Weather Bureau – 192p
• Taiwan Central Weather Bureau – 28p
• Meteorological Service of New Zealand – 20p

Slide 6: Altix systems for meteorology and climate research
• NOAA GFDL – 2560p Altix 3700 + 2560c Altix 4700 – MOM4, AM, CM2.1
• University of Oceanography of China, Tsing Dao – 224p
• Polar Research Inst. of China – 64p
• Nanjing UIST – 128p + 8p
• U Tasmania/Antarctic CRC – 128p
• CMMACS – 80p Altix 3700 BX2 & 350 – MOM4
• U of Waterloo – 64p Altix 3700 & 16p A350
• Universidad Complutense – 64p
• Beijing Normal University, Climate Modeling Branch, State Lab of Remote Sensing Science – 56p
• First Institute of Oceanography (China) – 56p
• Georgia Tech – 48p
• Institute of Desert Meteorology, China – 32p
• Univ. of Florida – 32p
• Harvard University – 28p
• Univ. of Wisconsin CMISS – 24p
• MIT Dept of Earth, Atmosphere and Planetary Science – 20p
• Dalhousie University – 16p
• NIO, Goa, India – 16p
• Univ. of Utah – 16p
• Woods Hole Oceanographic Institute – 16p
• Univ. of Colorado Boulder – 12p
• APAT – 8p
• Utrecht Univ – 8p
• Univ. of South Florida – 4p
• Florida Institute of Technology – 2p

Slide 7: SGI systems for storage and data management in weather forecasting and climate modeling
• INM (Spain) uses DMF and CXFS with a Cray vector system
• NRW (Queensland) uses DMF with a Cray vector system
• Environment Canada uses SGI servers for pre- and post-processing with an IBM SP system
• Meteo France uses DMF with a NEC vector system

Slide 8: Selected Large Altix Installations
• NASA Columbia – 10,240p Altix 3700 constellation (20 nodes) + 512-core Altix 4700; 2048p on a single NUMAlink fabric with 4 × 512p partitions, plus 16 × 512 (Madison 9M)
• LRZ – 9728p Altix 4700 (19 × 512-core nodes), single NUMAlink fabric
• WPAFB (US DoD) – 9216p (18 × 512-core nodes), single NUMAlink fabric, Montecito
• TU Dresden – 2048-core Altix 4700, Montecito
• NOAA GFDL – 2560p Altix 3700 and Altix 3700 BX2 systems (Madison) + 2560-core Altix 4700 (Montecito), plus SGI's largest DMF installation (~10 PB)
• APAC – 1936p Altix 3700/BX2, multiple partitions
All use CXFS for the shared filesystem.

Slide 9: Cluster Systems – Usage, Goals, Issues
[Charts, source IDC 2007: cluster utilization by industry (weather alongside digital content creation, geoscience, software engineering and the average); top 3 system goals (aggregate performance, processor performance, high availability, aggregate I/O); top 3 current issues (facilities power & cooling, system management, application complexity).]

Slide 10: Escalating Computer Center Concerns
• Cost: 1 MW costs $1M per year
• Government – EPA Energy Star bill; EC Renewable Energies Unit "Code of Conduct"?
• In 2005, the U.S. spent:
  – $20.5B on computer equipment
  – $9.3B on electricity to run computers
[Graphic: power comparison, 339 W vs. 216 W.]
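The "1 MW costs $1M per year" rule of thumb is easy to sanity-check. A minimal sketch in C, assuming a commercial electricity rate of about $0.08/kWh and a facility overhead (PUE) of 1.4 – both illustrative numbers, not figures from the slides:

    #include <stdio.h>

    int main(void) {
        const double it_load_mw   = 1.0;    /* IT load in megawatts */
        const double hours_per_yr = 8760.0; /* 24 h/day * 365 days */
        const double usd_per_kwh  = 0.08;   /* assumed utility rate */
        const double pue          = 1.4;    /* assumed facility overhead */

        /* Annual energy drawn from the grid, in kWh, including cooling */
        double kwh  = it_load_mw * 1000.0 * hours_per_yr * pue;
        double cost = kwh * usd_per_kwh;

        printf("Annual cost for %.1f MW of IT load: $%.0f\n", it_load_mw, cost);
        /* Prints about $981,120 – close to the $1M/year figure above */
        return 0;
    }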
Slide 11: Weather & Environment Application Scaling
[Charts, source SGI internal testing: simulation speed (forecast hours per hour) and simulation speedup versus core count, out to roughly 2,048 cores, for WRF V2.1.2+ at 12 km CONUS (WSM5 microphysics), a regional forecast model (WRF CONUS 5 km, 970 × 720 × 37) and the POP 0.1° global ocean model. Systems measured: Altix 4700 (Montecito 1.6 GHz/18 MB), Altix XE310 (Xeon 5355 2.67 GHz) and Altix ICE (Xeon 5355 2.67 GHz).]
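Scaling curves like these are usually read against Amdahl's law, which bounds the speedup when some fraction of the work stays serial. A minimal sketch (a generic model, not SGI's measured data; the 99% parallel fraction is an assumption for illustration):

    #include <stdio.h>

    /* Amdahl's law: speedup on n cores when fraction p of the work is parallel */
    static double amdahl(double p, int n) {
        return 1.0 / ((1.0 - p) + p / (double)n);
    }

    int main(void) {
        const double p = 0.99; /* assumed parallel fraction */
        for (int n = 256; n <= 2048; n *= 2) {
            double s = amdahl(p, n);
            printf("%4d cores: speedup %6.1f, efficiency %5.1f%%\n",
                   n, s, 100.0 * s / n);
        }
        return 0;
    }

Even a 99% parallel code tops out near 95x on 2,048 cores under this model, so near-linear curves at these scales imply a very small serial fraction and low communication overhead – the point the slide makes about weather and climate codes.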
Slide 12: What conclusions can we draw?
• Weather & climate applications scale better than most – 1,000+ cores is feasible in global weather, regional weather and ocean modeling
• Availability is much more important than in other markets – forecasts must be delivered on time
• SMP/Constellation systems are dominant – production installations in both weather forecasting and climate research
• Clusters are O.K. for research and small-scale production, but not for large, complex operational weather forecasting (OWF) – maturity is still not there for availability, system management and I/O, though progress is being made in each area (see the SGI Altix ICE discussion below)

Slide 13: Critical Issues – What is SGI doing?
• Reliability – hardware, Linux and tool improvements
• Scalability – hardware, Linux and I/O improvements
• Achievable performance – low-latency interconnects & MPI, Linux improvements, InfiniBand storage
• Facilities issues – more efficient power supplies, water cooling
• System management – tools for 1,000s of nodes, parallel booting, lights-out error reporting

Slide 14: SGI Linux – Driving Performance Computing into the Community
• SGI is behind only IBM and Red Hat in total contributions to the Linux community. Of 16,678 "Copyright" instances in linux-2.6.18.tar.gz (Sept 20, 2006):
  – "IBM", "International Business" or "ibm.": 676 + 86 + 180 = 942
  – "Red Hat", "RedHat", "redhat.": 438 + 2 + 244 = 684
  – "SGI", "Silicon Graphics", "sgi.": 5 + 454 + 151 = 610
  – "SuSE", "SUSE", "suse.": 67 + 2 + 422 = 491
  – "HP", "Hewlett Packard" & "Hewlett-Packard", "hp.": 7 + 20 + 244 + 296 = 467
• SGI continues to drive the interests of our customers with Linux developers

Slide 15: Today's Linux Environment
[Stack diagram: job scheduler, system management, storage management, file system, compilers, libraries, profilers/debuggers, Linux operating system, BIOS.]

Slide 16: SGI Altix
• Proven in production weather & climate environments
• Single system based on commodity CPUs, memory and disks:
  – Single-system management with up to 1,024 Intel Itanium 2 cores
  – Global shared memory up to 128 TB
  – Very low latency MPI (see the measurement sketch after this slide)
  – Unified I/O with 10 GB/sec I/O
• Capable of running high-resolution simulations or multiple members of forecast ensembles
[Photo: SGI Altix 4700 at LRZ; foto Kai Hamman, produced by gsiCom.]
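Latency figures like the one above are conventionally measured with a ping-pong micro-benchmark: two ranks bounce a small message back and forth and halve the average round-trip time. A minimal sketch (a generic benchmark, not an SGI tool; message size and iteration count are arbitrary choices):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        char buf[8] = {0};        /* 8-byte message: latency-bound */
        const int iters = 10000;

        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int i = 0; i < iters; i++) {
            if (rank == 0) {
                MPI_Send(buf, 8, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, 8, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, 8, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                MPI_Send(buf, 8, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        if (rank == 0) {
            double rtt = (MPI_Wtime() - t0) / iters;
            /* one-way latency = half the average round trip */
            printf("one-way latency: %.2f us\n", rtt / 2.0 * 1e6);
        }
        MPI_Finalize();
        return 0;
    }

Run with two ranks (e.g. mpirun -np 2) placed on different nodes to measure the interconnect rather than shared memory.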
Slide 17: SGI Altix ICE
• Merging cluster economics with HPC integration
• Next-generation blade solution:
  – Density optimized: up to 512 Intel Xeon 5300 cores (6+ TFLOPS) per rack
  – Power optimized: 76% rack-level power efficiency (1.4× typical clusters), water cooling
  – Reliability optimized: hot-swap N+1 power, hot-swap N+1 fans, diskless nodes, integrated dual-plane IB 4xDDR switches, parallel boot & system management, cable-free blade enclosure for up to 128 cores
  – Performance optimized: IB RAID storage, OS jitter control, dual IB 4xDDR
[Photos: a multi-rack pizza-box cluster next to a multi-rack SGI Altix ICE.]

Slide 18: SGI Altix ICE – OS jitter
• In standard cluster environments the OS is not synchronized across nodes; this "jitter" can reduce overall performance
• SGI Altix ICE implements application-transparent synchronization of the OS to improve overall performance
[Diagram: with unsynchronized OS noise, each node's system overhead lands at a different time, so every barrier waits on the slowest node and compute cycles are wasted; with synchronized OS noise the overhead aligns across nodes and barriers complete sooner.]
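OS noise is straightforward to observe: time a loop of small, fixed work units and look for iterations that take far longer than the mean – those detours are the jitter. A minimal sketch of such a fixed-work-quantum probe (a generic measurement technique, not SGI's synchronization mechanism; the work size is an arbitrary choice):

    #define _POSIX_C_SOURCE 199309L
    #include <stdio.h>
    #include <time.h>

    static double now_us(void) {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec * 1e6 + ts.tv_nsec / 1e3;
    }

    int main(void) {
        enum { ITERS = 100000, WORK = 2000 };
        volatile double x = 0.0;
        double worst = 0.0, total = 0.0;

        for (int i = 0; i < ITERS; i++) {
            double t0 = now_us();
            for (int j = 0; j < WORK; j++)   /* fixed work quantum */
                x += 1e-9;
            double dt = now_us() - t0;
            total += dt;
            if (dt > worst) worst = dt;      /* long outliers = OS noise */
        }
        printf("mean %.2f us, worst %.2f us (%.1fx)\n",
               total / ITERS, worst, worst / (total / ITERS));
        return 0;
    }

Run concurrently on every node of a cluster and compare timestamps: if the noise spikes do not coincide, a barrier-heavy code pays for each node's spikes separately, which is exactly the waste pictured in the diagram above.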
Slide 19: SGI Altix ICE – Tempo
• SGI Tempo system management environment:
  – Hierarchical system management
  – Server image management and provisioning
  – Rapid parallel diskless booting
  – System monitoring
  – Designed to scale to 1,000s of nodes

Slide 20: SGI Server Technology Roadmap
[Roadmap, reconstructed from the chart:]
• Enterprise-class SMP blade solution (Intel Itanium): today, Altix 4700/450 with SHUB2/NUMAlink 4, IA64 Montecito and RASC FPGA RC100; in 2008, Altix 4700/450 with IA64 Montvale and RASC FPGA; further out, Project UV with NUMAlink 5 and the UV HUB, supporting both Xeon and Itanium
• Integrated HPC blade MPP/cluster solution (Intel Xeon): today, Altix ICE 8200 with an IB DDR 4x fabric, ultra-dense cool packaging and SGI Tempo; in 2008, Project Carlsbad+ with an IB DDR 4x fabric, FPGA co-processing and cluster management advances; further out, Project Carlsbad2 with IB
• Standard rack-mount servers & clusters: today, Altix XE 210/240, Altix XE 1200 cluster, Altix XE 310, Altix XE 1300 cluster and MS Windows CCS; in 2008, Project Dixon and the Dixon cluster, Project Gallup 2 and the Gallup 2 cluster, MS Windows CCS; further out, a future Altix XE with Project Gallup 3 and the Gallup 3 cluster
• Across the line: Red Hat or SUSE Linux with SGI ProPack, plus SGI data and system management software

Slide 21: Industrial-Strength Linux Environment
[Stack diagram: the same layers as slide 15 – job scheduler, system management, storage management, file system, compilers, libraries, profilers/debuggers, Linux operating system, BIOS.]

Slide 22: FY09 – Shared Workload
[Stack diagram: the same Linux environment layers, now shared across workloads.]

Slide 23: Project UV Overview – Bringing Fast MPI and Scalable Shared Memory to x86-64
• Scale-up capability in an accessible, industry-standard platform
• Versatile performance for data-intensive or very large-scale workloads
• Robust, reliable operation with a complete software solution on industry-standard Linux
• Affordable acquisition and operation costs

Slide 24: UV HUB/Node Controller Features – Unmatched MPI and Big-Data Capabilities
• Enabling enterprise-class scalability and reliability on x86-64:
  – Cache coherence across nodes
  – Fault resiliency: extensive fault isolation, datapath protection, monitoring/debug functions
• Accelerating large-scale workloads:
  – Fast message passing
  – Extends CPU capability for load requests
  – System scale to 256+ sockets, 2,048+ cores
• Accelerating data-intensive applications:
  – Extended physical memory addressing
  – Extended TLB page size (see the sketch below)
  – Off-load instructions
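A larger TLB page size pays off because each TLB entry then maps more memory, cutting translation misses for large arrays. The effect can be explored on ordinary Linux with explicit huge pages; a minimal sketch (standard Linux mmap flags on reasonably recent kernels, not a UV-specific API; huge pages must first be reserved by the administrator, e.g. via /proc/sys/vm/nr_hugepages):

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>

    int main(void) {
        size_t len = 2UL * 1024 * 1024;   /* one 2 MB huge page */

        /* MAP_HUGETLB backs the mapping with huge pages, so a single
           TLB entry covers 2 MB instead of a typical 4 KB page. */
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
        if (p == MAP_FAILED) {
            perror("mmap(MAP_HUGETLB)");  /* e.g. no huge pages reserved */
            return 1;
        }
        memset(p, 0, len);                /* touch the page */
        printf("mapped %zu bytes on huge pages at %p\n", len, p);
        munmap(p, len);
        return 0;
    }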
Slide 25: UV Multi-Paradigm Architecture – Socket-Attached Co-processors
[Diagram: two-socket UV nodes on a NUMAlink interconnect fabric with globally shared memory. Legend: [S]calar – Intel Xeon, Intel Itanium; [V]ector; [P]IM – memory ops, processor-in-memory; [A]pp-specific – graphics (GPU), signals (DSP), programmable (FPGA), accelerators (ClearSpeed, Cell). Co-processors attach at the processor socket; each node's hub carries GRU, AMU, MI and NI blocks, shown alongside the earlier 180 nm SHub2 and 90 nm TIO parts.]

Slide 26: UV Multi-Paradigm Architecture – I/O-Attached Co-processors
[Diagram: the same two-socket UV nodes and globally shared memory, with the application-specific co-processors attached through I/O instead of processor sockets.]

Slide 27: Energy Efficiency – Rack Level
[Chart: net (all-in) rack energy-efficiency roadmap across generations – Origin 2000, Origin 3000, Altix 3000 and Altix 4000 rising through the 55–70% range, Carlsbad (Altix ICE) near 76–78%, and an Ultraviolet stretch goal of 80%. N.B. efficiency is even higher with no water coil.]
Slide 28: Altix → ICE → UV Cooling Solution
• Integrated water-cooled option (rear view):
  – (2) 18-receptacle power strips
  – (4) hinged water-cooled coils
  – Rack chilled-water supply: 45–60°F (7.2–15.6°C), 14.4 gpm (3.3 m³/hr) max
  – (2) 60 A 200–240 VAC 3-phase IEC 60309 plugs

Slide 29: UV Software in Development – Assuring a Complete Solution
• Linux OS community features to support UV – key items already submitted to assure adoption by UV launch; drivers and APIs; UV HUB/node controller feature enablement
• System management and integration – console; monitoring and debug; partitioning; integration with storage and data sharing across UV and other systems
• RAS – enable the resiliency features of the UV HUB plus advanced memory RAS
• Unified Parallel C – a source-to-source translator on the Intel or GCC compiler (see the sketch below)
• Ongoing system management, MPT and other ProPack advances
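Unified Parallel C extends C with a partitioned global address space, which maps naturally onto UV's globally shared memory. A minimal sketch of the kind of program such a translator handles (illustrative only – standard UPC constructs, nothing UV-specific; assumes compilation for a static thread count, since the shared array dimension does not involve THREADS):

    #include <upc.h>
    #include <stdio.h>

    #define N 1024

    shared double a[N];   /* one array, distributed across all threads */

    int main(void) {
        int i;

        /* Each thread initializes only the elements it owns (affinity &a[i]) */
        upc_forall(i = 0; i < N; i++; &a[i])
            a[i] = (double)i;

        upc_barrier;      /* wait until every thread has written its part */

        if (MYTHREAD == 0) {
            double sum = 0.0;
            for (i = 0; i < N; i++)   /* thread 0 reads remote elements too */
                sum += a[i];
            printf("sum = %.0f across %d threads\n", sum, THREADS);
        }
        return 0;
    }

The translator's job is to turn the implicit remote reads in the final loop into explicit communication, which on UV can map onto the fast message-passing and global-addressing features of the UV HUB described on slide 24.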
Slide 30: 2010 – Hybrid System
[Stack diagram: the same Linux environment layers hosting a 2010 hybrid system, labeled "Isle" – UV combining Xeon, Itanium, GPU and other co-processors under one software stack.]

Slide 31: Conclusion
• SMP/Constellation systems own the majority of production weather forecasting & environmental research
• Cluster systems bring further improvements in economics, but don't meet production requirements for large systems
• Next-generation HPC cluster systems like SGI Altix ICE will enhance current density, power, reliability and performance characteristics
• Future systems will bring enterprise RAS and scalability to HPC systems for weather forecasting and environmental research – but without the enterprise cost
• SGI intends to remain at the technology forefront and drive this evolution