New CSC computing resources

Per Öster, CSC – IT Center for Science Ltd.
[email protected]
CSC presentation @ JYFL, 25.01.2013
Outline
CSC at a glance
New Kajaani Data Centre
Finland’s new supercomputers
– Sisu (Cray XC30)
– Taito (HP cluster)
CSC resources available for researchers
CSC’s Services
FUNET Services
Computing Services
Application Services
Data Services for Science and Culture
Information Management Services
Customers: universities, polytechnics, ministries, public sector, research centers, companies
CSC at a glance
Founded in 1971
– technical support unit for the Univac 1108
Connected Finland to the Internet in 1988
Reorganized as a company, CSC – Scientific Computing Ltd., in 1993
All shares to the Ministry of Education and Culture of Finland in 1997
Operates on a non-profit principle
Facilities in Espoo and Kajaani
Staff ~250 people
Turnover in 2011: 27.3 million euros
Users
About 700 active computing projects
– 3000 researchers use CSC’s computing capacity
– 4250 registered customers
Haka identity federation covers 100% of universities and higher education institutes (287 000 users)
Funet – Finnish research and education network
– Total of 360 000 end users
Users of computing resources by discipline 3Q/2012 (total 1305 users)
[Pie chart] Disciplines: Biosciences, Physics, Nanoscience, Chemistry, Language research, Grid usage, Computational fluid dynamics, Computational drug design, Earth sciences, Engineering, Other.
Usage of processor time by discipline 3Q/2012 (total in 2012: 96 million CPUh)
[Pie chart] Disciplines: Physics, Nanoscience, Chemistry, Biosciences, Astrophysics, Grid usage, Environmental sciences, Computational fluid dynamics, Computational drug design, Other.
Usage of processor time by organization 3Q/2012 (total in 2012: 96 million CPUh)
[Pie chart] Organizations: University of Helsinki, Aalto University, University of Jyväskylä, CSC (Grand Challenge), Tampere University of Technology, Lappeenranta University of Technology, CSC (PRACE), University of Oulu, University of Turku, CSC (HPC-Europa2), Other.
Largest JY Projects 2012
[Table] Largest University of Jyväskylä (Jyväskylän yliopisto) projects in 2012, listed with CPUh, share of all CSC usage (%) and share of JY usage (%):
– Karoliina Honkala: Nanocatalysis on metal surfaces (nanoscience)
– Hannu Häkkinen: Electronic, magnetic, optical and chemical properties of nanoparticles (nanoscience)
– Olli Pentikäinen: Filamins (Fil) (biosciences)
– Olli Pentikäinen: Computational structural research (biosciences)
– Olli Pentikäinen: RIALM – rapid identification of active ligand molecules (drug design)
– Esa Räsänen: Quantum phenomena and their control in electronic nanostructures (nanoscience)
– Tuomas Lappi: QCD at high energies in relativistic heavy-ion collisions (physics)
– Jussi Toivanen (physics)
– Robert van Leeuwen: FIDIPRO project, time-dependent correlated quantum transport through nanostructures (physics)
– …
In total, JY projects used 12 616 824 CPUh in 2012, 13.11% of all CSC usage.
FUNET and Data services
FUNET
– Connections to all higher education institutions in Finland and to 37 state research institutes and other organizations
– Network Services and Light paths
– Network Security – Funet CERT
– eduroam – wireless network roaming
– Haka-identity Management
– Campus Support
– The NORDUnet network
Data services
– Digital Preservation and Data for Research
Data for Research (TTA), National Digital Library (KDK)
International collaboration via EU projects (EUDAT, APARSEN, ODE, SIM4RDM)
– Database and information services
Paituli: GIS service
Nic.funet.fi – freely distributable files with FTP since 1990
CSC Stream
Database administration services
– Memory organizations (Finnish university and polytechnic libraries, Finnish National Audiovisual Archive, Finnish National Archives, Finnish National Gallery)
CSC and High Performance Computing
CSC Computing Capacity 1989–2012
[Chart] Maximum capacity (80%) and capacity used, in standardized processors (log scale), 1989–2012. Systems marked on the timeline: Cray X-MP/416, Convex C220, Convex 3840, Cray C94, SGI R4400 (with SGI upgrades), IBM SP1, IBM SP2 (decommissioned 1/1998), Cray T3E (64, 192, 224 and 512 processors; decommissioned 12/2002), SGI Origin 2000, IBM SP Power3 (with IBM upgrade), Compaq Alpha cluster (Clux) and two Compaq Alpha servers (Lempo and Hiisi; Clux and Hiisi decommissioned 2/2005), Sun Fire 25K, IBM eServer Cluster 1600 (Federation HP switch upgrade), HP DL145 Proliant, HP CP4000 BL Proliant, HP CP4000BL Proliant 465c DC AMD (Murska, decommissioned 6/2012), Cray XT4 (DC, QC and 6C AMD), Cray XT5, HP Proliant SL230s and Cray XC30.
Top500 rating 1993–2012
[Chart] Top500 rankings (50th to 500th place) of CSC systems since the Top500 lists (http://www.top500.org/) were started in 1993: Cray X-MP, Convex C3840, Cray C94, SGI Power Challenge, IBM SP1, IBM SP2, Digital AlphaServer, Cray T3E, SGI Origin 2000, IBM SP Power3, IBM eServer p690, HP Proliant 465c DC and 6C, Cray XT4/XT5, HP Proliant SL230s and Cray XC30.
THE NEW DATACENTER
New site drivers
• Capacity limiting growth
• Costs increasing
– Tax
– Energy
– Carbon neutral
• New site focused on efficiency and capacity
– Supercomputers
– Managed hosting
– Diversity of requirements
[Chart] CSC's datacenter energy use 2005–2011 (GWh), split into IT and infrastructure loads for DC 1 and DC 2 with their maximum capacities; the 2012 estimate is 11 GWh, plus 2.2 GWh/year as leased capacity.
Power distribution (FinGrid)
Last site-level blackout in the early 1980s
CSC started ITI curve monitoring in early February 2012
The machine hall
Sisu (Cray) supercomputer housing
SGI Ice Cube R80 hosting Taito (HP)
Starts with one head unit (Vestibule) and two expansion modules; extra capacity can be added by introducing more expansion units.
Thanks to dozens of automated cooling fans, the energy needed for cooling can be adjusted very accurately as IT capacity is increased gradually.
Assembled in Italy.
Internal temperature setpoint 27 °C (ASHRAE), occasionally within the ASHRAE tolerated range (27–30 °C) during possible summer heat waves.
As long as outdoor temperatures stay below 28 °C, the unit needs nothing but free cooling. During heat waves, extra water and some chillers may be needed.
During winter, the exhaust (warm) air is re-circulated to warm up the incoming air.
SGI Ice Cube R80
Data center specification
2.4 MW combined hybrid capacity
1.4 MW modular free-air-cooled datacenter
– Upgradable in 700 kW factory-built modules
– Order to acceptance in 5 months
– 35 kW per rack (extra tall); 12 kW is common in industry
– PUE forecast < 1.08 (pPUE L2,YC)
1 MW HPC datacenter
– Optimised for the Cray supercomputer and the T-Platforms prototype
– 90% water cooling
CSC NEW SUPERCOMPUTERS
Overview of New Systems
Phase 1 (Cray deployed Dec. 2012; HP now being deployed)
– CPU: Intel Sandy Bridge, 2 x 8 cores @ 2.6 GHz
– Interconnect: Aries (≤10 GB/s) on the Cray, FDR InfiniBand (≤7 GB/s) on the HP
– Cores: 11776 (Cray), 9216 (HP)
– Tflops: 244 (2.4x Louhi) on the Cray, 180 (5x Vuori) on the HP; 424 in total (3.6x Louhi)
Phase 2 (probably 2014)
– CPU: next-generation processors
– Interconnect: Aries on the Cray, EDR InfiniBand (100 Gbps) on the HP
– Cores: ~40 000 (Cray), ~17 000 (HP)
– Tflops: 1700 (16x Louhi) on the Cray, 515 (15x Vuori) on the HP; 2215 in total (20.7x Louhi)
IT summary
Cray XC30 supercomputer (Sisu)
– Fastest computer in Finland
– Phase 1: 385 kW, 244 Tflop/s
– Very high density, large racks
T-Platforms prototype (Phase 2)
– Very high density hot-water cooled rack
– Intel processors, Intel Xeon Phi and NVIDIA Kepler accelerators
– Theoretical performance 400 Tflop/s
IT summary cont.
HP (Taito)
– 1152 Intel CPUs (sockets)
– 180 TFlop/s
– 30 kW, 47 U racks
HPC storage
– 1 + 1.4 + 1.4 = 3.8 PB of fast parallel storage
– Aggregate bandwidth 20 -> 48.8 -> 77.6 GB/s
– Supports Cray and HP systems
Features
Cray XC30
– Completely new system design
Departure from the XT* design (2004)
– First Cray with Intel CPUs
– High-density water-cooled chassis
~1200 cores/chassis
– New ”Aries” interconnect
HP Cluster
– Modular SL-series systems
– Mellanox FDR (56 Gbps) Interconnect
CSC's new systems: What's new?
Sandy Bridge CPUs
– 4 -> 8 cores/socket
– ~2.3x Louhi flops/socket
256-bit SIMD instructions (AVX)
Interconnects
– Performance improvements
Latency, bandwidth, collectives
– 0.25-3.7 µs, ≤10 GB/s
One-sided communication
– New topologies
Cray "Dragonfly": islands of 2D meshes
HP: islands of fat trees
Cray Dragonfly Topology
All-to-all network between groups
Two-dimensional all-to-all network within a group
Optical uplinks to the inter-group network
Source: Robert Alverson, Cray, Hot Interconnects 2012 keynote
Cray environment
Typical Cray environment
Compilers: Cray, Intel and GNU
Debuggers
– TotalView; licenses (tokens) shared between HP and Cray
Cray MPI
Cray-tuned versions of all the usual libraries
SLURM batch queue system
Module system similar to Louhi's
Default shell is now bash (previously tcsh)
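As a rough sketch of working in this environment (a hedged illustration, not CSC documentation): on a typical Cray installation the compiler is chosen by swapping PrgEnv modules, and the ftn/cc/CC wrappers automatically pull in Cray MPI and the Cray-tuned libraries. The file names below are illustrative.

    # Select the Intel compiler suite via the Cray programming environment modules.
    module swap PrgEnv-cray PrgEnv-intel

    # The ftn wrapper links Cray MPI and the Cray-tuned scientific libraries
    # for whichever compiler is currently loaded.
    ftn -O2 -o mysim mysim.f90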
HP Environment
Compilers: Intel, GNU
MPI libraries: Intel, mvapich2, OpenMPI
Batch queue: SLURM
New, more robust module system
– Only compatible modules are shown by module avail
– Use module spider to see all
Default shell is now bash (previously tcsh)
Disk system changes
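A minimal SLURM batch script sketch for a cluster environment like this one (the partition name, module name, binary and resource numbers are hypothetical; the real values should be taken from CSC's user guides):

    #!/bin/bash
    # Hypothetical SLURM batch script for a 32-task MPI job.
    #SBATCH --job-name=test_md
    #SBATCH --partition=parallel      # hypothetical partition name
    #SBATCH --ntasks=32               # 32 MPI tasks
    #SBATCH --time=02:00:00           # two-hour wall-clock limit
    #SBATCH --mem-per-cpu=2000        # memory per core in MB

    module load gromacs               # hypothetical module name
    srun mdrun_mpi -deffnm benchmark  # srun launches the MPI tasks under SLURM

The script would be submitted with sbatch and monitored with squeue, as on any SLURM system.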
Core development tools
Intel XE Development Tools
– Compilers
C/C++ (icc), Fortran (ifort), Cilk+
– Profilers and trace utilities
VTune, Thread Checker, MPI checker
– MKL numerical library
– Intel MPI library (only on HP)
Cray Application Development Environment
GNU Compiler Collection
TotalView debugger
Performance of numerical libraries
DGEMM 1000x1000 single-core performance
[Chart] GFlop/s for ATLAS 3.8 (RedHat 6.2 RPM), ATLAS 3.10, ACML 5.2, ifort 12.1 matmul, MKL 12.1, LibSci, ACML 4.4.0 and MKL 11 on a 2.7 GHz Sandy Bridge and a 2.3 GHz Opteron Barcelona (Louhi); reference lines at the Sandy Bridge peak (2.7 GHz x 8 flop/Hz), the turbo peak when only one core is used (3.5 GHz x 8 flop/Hz) and the Barcelona peak (2.3 GHz x 4 flop/Hz).
MKL is the best choice on Sandy Bridge, for now.
(On the Cray, LibSci will likely be a good alternative.)
Compilers, programming
Intel
– Intel Cluster Studio XE 2013
– http://software.intel.com/en-us/intel-clusterstudio-xe
GNU
– GNU compilers, e.g. GCC 4.7.2
– http://gcc.gnu.org/
Intel tools can be used together with GNU
– e.g. gcc or gfortran + MKL + Intel MPI (see the sketch below)
The mvapich2 MPI library is also supported
– It can be used with either Intel or GNU compilers
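A sketch of the mixed GNU + MKL + Intel MPI toolchain described above (module names are hypothetical, and the MKL link line is an assumption that depends on the installed MKL version; Intel's link-line advisor gives the exact libraries):

    # Hypothetical module names for the GNU compilers and the Intel MKL/MPI stack.
    module load gcc mkl intelmpi

    # Intel MPI's mpif90 wrapper typically drives gfortran; link the
    # sequential MKL libraries for the GNU Fortran interface.
    mpif90 -O2 -mavx -o my_solver my_solver.f90 \
        -L$MKLROOT/lib/intel64 -lmkl_gf_lp64 -lmkl_sequential -lmkl_core -lpthread -lm

    # Launch with 32 MPI tasks (inside a batch job).
    srun -n 32 ./my_solver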
Available applications
Ready:
– Taito: Gromacs, NAMD, Gaussian, Turbomole, Amber, CP2K, Elmer, VASP
– Sisu: Gromacs, GPAW, Elmer, VASP
CSC offers ~240 scientific applications
– Porting them all is a big task
– Most if not all (from Vuori) should be available
Some installations upon request
– Do you have priorities?
Porting strategy
At least recompile
– Legacy binaries may run, but not optimally
– Intel compilers preferred for performance
– Use Intel MKL or Cray LibSci (not ACML!)
http://software.intel.com/sites/products/mkl/
– Use compiler flags (e.g. -xhost -O2; -xhost includes -xAVX; example below)
Explore optimal thread/task placement
– Intra-node and inter-node
Refactor the code if necessary
– OpenMP/MPI workload balance
– Rewrite any SSE assembler or intrinsics
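A minimal recompile-and-placement sketch along these lines (module, file and binary names are illustrative; the flags follow the suggestions above):

    # Recompile with the Intel compiler: -xhost targets the host CPU
    # (AVX on Sandy Bridge), -mkl links MKL, -openmp enables OpenMP.
    module load intel                        # hypothetical module name
    ifort -O2 -xhost -openmp -mkl -o mycode mycode.f90

    # Explore thread placement: one task with 16 threads pinned across
    # the 2 x 8 cores of a Sandy Bridge node (inside a batch job).
    export OMP_NUM_THREADS=16
    export KMP_AFFINITY=compact              # Intel OpenMP thread pinning
    srun --ntasks=1 --cpus-per-task=16 ./mycode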
Modules
Some software installations conflict with each other
– For example, different versions of programs and libraries
Modules make it possible to install conflicting packages on a single system
– Users select the desired environment and tools with module commands
– This can also be done "on the fly"
Key differences (Taito vs. Vuori)
module avail shows only those modules that can be loaded into the current setup (no conflicts or extra dependencies)
– Use module spider to list all installed modules and to resolve conflicts/dependencies
No PrgEnv- modules
– Changing the compiler module also switches the MPI and other compiler-specific modules (see the sketch below)
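A short sketch of the module commands in practice (the package names and versions are hypothetical examples):

    module avail                  # list modules compatible with the current setup
    module spider gromacs         # search all installed modules and versions
    module load gromacs/4.6       # load a specific version "on the fly"
    module list                   # show what is currently loaded
    module swap intel gcc         # changing the compiler module also swaps the
                                  # MPI and other compiler-specific modules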
Disks at Espoo
[Diagram] Storage architecture during the Espoo-Kajaani transition. Existing gear in Espoo: SAN and Lustre servers, DDN storage, global Lustre servers, NFS/HA-NFS and iSCSI servers, an SL8500 tape library and the DATA11 (DMF) archive, serving Murska, Louhi, Vuori, Hippu1-4 and the login nodes with work, application, IDA and PROJ areas and backup; some existing gear is transported to Kajaani, and iRODS handles data synchronisation. In Kajaani, roughly 60-70% of the raw capacity is available to users (LOT3: 1+1+2 PB, of which 70% ≈ 716 TB + 716 TB + 1433 TB), alongside LOT2 cluster capacity, cloud capacity, HA-NFS servers, backup spools, 2 TB home directories and a T-Finity tape library (5 PB). Showing remote exports over the light path (Valopolku) is slow but possible, with a 10-15 ms lag.
Disks at Kajaani
[Diagram] User view of the disk setup: the sisu.csc.fi and taito.csc.fi login and compute nodes share $HOME and $WRKDIR, each node has a local $TMPDIR, and $USERAPPL points to a directory under $HOME. The new tape-based $ARCHIVE remains in Espoo and is reached through the iRODS interface (disk cache in Kajaani and Espoo) using the icp, iput, ils and irm commands, from the systems themselves or from your workstation via an iRODS client or SUI (see the sketch below).
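A brief sketch of how the i-commands listed above might be used to move data to and from the iRODS-managed archive (file and directory names are hypothetical):

    # Pack the data first, then move it through the iRODS interface.
    tar cjf results_2012.tar.bz2 results_2012/

    iput results_2012.tar.bz2     # upload into the archive disk cache
    ils                           # list the contents of your iRODS home collection
    iget results_2012.tar.bz2     # fetch a copy back later
    irm old_results.tar.bz2       # remove a dataset that is no longer needed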
Disks
2.4 PB on DDN
– New $HOME directory (on Lustre)
– $WRKDIR (not backed up), soft quota ~ 5 TB
$ARCHIVE ~1-5 TB/user, common between the Cray and HP systems
Disk space through IDA
– 1 PB for universities
– 1 PB for the Academy of Finland (SA)
– 1 PB to be shared between SA and ESFRI
– Additional 3 PB available later on
/tmp (around 2 TB) to be used for compiling codes
Moving files, best practices
tar & bzip first
rsync, not scp
– rsync -P [email protected]:/tmp/huge.tar.gz .
Blowfish may be faster than AES (CPU bottleneck)
Funet FileSender (max 50 GB)
– https://filesender.funet.fi
– Files can also be downloaded with wget
Coming: iRODS, batch-like process, staging, IDA
CSC can help to tune e.g. TCP/IP parameters
– http://www.csc.fi/english/institutions/funet/networkservices/pert
FUNET backbone 10 Gbit/s
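A sketch of the transfer workflow suggested above (host name, user name and file names are hypothetical; the blowfish cipher follows the slide's hint and may not be offered by newer ssh versions):

    # Pack and compress first, then transfer with rsync so that an
    # interrupted copy can be resumed (-P = --partial --progress).
    tar cjf rundata.tar.bz2 rundata/
    rsync -P -e "ssh -c blowfish" rundata.tar.bz2 myuser@taito.csc.fi:/wrk/myuser/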
IDA storage service
Common storage service for research
Safe storage with guaranteed integrity and duplication management for both data and metadata
IDA service promise
– The IDA service guarantees data storage until at least the end of 2017. By that time it will be decided whether IDA or some other service solution (for example a long-term storage solution) continues to serve as data storage. If some other solution is available, the data will be transferred to it automatically, with no interaction needed from users.
– The IDA service guarantees at least 3 petabytes of storage capacity.
– When transferring data to IDA, some metadata is automatically attached to the data.
– During this time period, there are no costs to the user within the agreed service policy.
– Owners of data decide on data use, openness and data policy. TTA strongly recommends clear ownership and IPR management for all datasets.
– After 2017, the metadata attached to datasets has to be more extensive than the minimum metadata.
Datasets served by TTA
Projects funded by the Academy of Finland (Academy projects, Centres of Excellence, research programmes and research infrastructures)
– 1 PB capacity
Universities and polytechnics
– 1 PB capacity
ESFRI projects (e.g. BBMRI, CLARIN)
Other important research projects via a special application process
Contact at JY: Antti Auer
[Figure] Capacity split: Academy of Finland projects 1 PB; higher education institutions 1 PB; ESFRIs, FSD, pilots and additional shares 1 PB.
IDA Interface
ARCHIVE, dos and don’ts
Don't put small files in $ARCHIVE
– Small files waste capacity
– Less than 10 MB is small
– Keep the number of files small
– Tar and bzip the files (see the sketch below)
Don't use $ARCHIVE for incremental backup (store, delete/overwrite, store, …)
– Space on tape is not freed up for months or even years!
Maximum file size 300 GB
Default quota 2 TB per user; the new system will likely allow up to 5 TB
A new $ARCHIVE is being installed; consider whether you really need all your old files. A transfer from the old archive to the new one is needed.
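A minimal sketch of bundling many small files into one archive-friendly file before moving it to $ARCHIVE (directory and file names are hypothetical):

    # One large tarball uses tape capacity far better than thousands of small files.
    tar cjf run42_outputs.tar.bz2 run42/outputs/

    # Verify the contents before deleting the originals.
    tar tjf run42_outputs.tar.bz2 | head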
Use Profiles
Taito (HP)
– Serial and parallel jobs up to about 256 cores (TBD)
Sisu (Cray XC30)
– Parallel jobs up to thousands of cores
– Scaling tests
Queue/server policies
Longrun queue has drawbacks
– Shorter jobs can be chained instead (see the sketch below)
Apps that can't restart or write a checkpoint?
– Codes you use to run very long jobs?
Large-memory jobs go to Hippu or the HP big-memory nodes
– Think about memory consumption
Minimum job size on the Cray
– Your input?
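A sketch of chaining shorter jobs instead of one very long run (step.sh is a hypothetical batch script that restarts from the latest checkpoint):

    # --dependency=afterok makes each job start only after the previous
    # one has finished successfully; sbatch prints "Submitted batch job <id>".
    jobid=$(sbatch step.sh | awk '{print $NF}')
    for i in 2 3 4; do
        jobid=$(sbatch --dependency=afterok:$jobid step.sh | awk '{print $NF}')
    done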
Documentation and support
User manual is being built; FAQ here:
– https://datakeskus.csc.fi/en/web/guest/faq-knowledge-base
– Pilot usage during acceptance tests
User documentation's link collection
– http://www.csc.fi/english/research/sciences/chemistry/intro
Porting project
– All code needs to be recompiled
– Help is available for porting your code
– List of first codes; others added later, some upon request
User accounts
– HP: recent Vuori users moved automatically
– Cray: recent Louhi users moved automatically
– Others: request from [email protected] with current contact information
Grand Challenges
Normal GC call (deadline 28.01.2013)
– new CSC resources available for a year
– no lower limit on the number of cores
Special GC call (mainly for the Cray) (deadline 28.01.2013)
– possibility for short (a day or less) runs with the whole Cray
– What do you need?
Remember also PRACE/DECI
– http://www.csc.fi/english/csc/news/customerinfo/DECI10callopen
How to prepare for new systems
Participate in system workshops
Try Intel/GNU compiler in advance, PGI upon request
Check if your scripts/aliases need fixing (bash)
A lot of resources will be available in the beginning: plan ahead what to run!
The traditional wisdom about good application performance still holds
– Experiment with all compilers and pay attention to finding good compiler optimization flags
– Employ tuned numerical libraries wherever possible
– Experiment with the environment variables that control the MPI library
– Mind the I/O: minimize output, checkpoint sparingly
Accelerators
Add-on processors
Graphics Processing Units
– de facto accelerator technology today
Fast memory
Many lightweight cores
Influencing general-purpose computing
[Diagram] Accelerator with its own memory attached to the CPU(s) over the PCIe bus; nodes are connected by InfiniBand.
Evolution of CPU and GPU performance
[Chart] Memory bandwidth (GB/s) and energy efficiency (peak GFlop/s per watt) of GPUs vs. CPUs, 2008–2012.
Future directions in parallel programming
MPI-3 standard being finalized
– Asynchronous collective communication etc.
Partitioned Global Address Space (PGAS)
– Data sharing via global arrays
– Finally starting to see decent performance
– Most mature: Unified Parallel C, Co-Array Fortran (in Fortran 2008), OpenSHMEM
Task dataflow-based parallel models
– Splits work into a graph (DAG) of tasks
– SmpSs, DAGUE, StarPU
CSC PRESENT RESOURCES AVAILABLE FOR RESEARCHERS
Currently available computing resources
Massive computational challenges: Louhi
– >10 000 cores, >11 TB memory
– Theoretical peak performance >100 Tflop/s
HP cluster Vuori
– Small and medium-sized tasks
– Theoretical peak performance >40 Tflop/s
Application server Hippu
– Interactive usage, without a job scheduler
– Postprocessing, e.g. visualization
FGI (Finnish Grid Infrastructure)
Novel resources at CSC
Production (available for all Finnish researchers)
– Vuori: 8 Tesla GPU nodes
– FGI: 88 GPUs (44 Tesla 2050 + 44 Tesla 2090)
GPU nodes located at HY, Aalto, ÅA, TTY
Testing (primarily for CSC experts)
– Tunturi: Sandy Bridge node, cluster
Porting to the AVX instruction set
– Mictest: Intel MIC prototype node
Several beta cards
Old capacity decommissions
Louhi will be decommissioned after the new Cray is up and running
– probably a fairly short overlap
Vuori decommissioning TBD
Think ahead and plan if you have data to transfer!
NX for remote access
Optimized remote desktop access
– Near-local-speed application responsiveness over high-latency, low-bandwidth links
Customized launch menus offer direct access to CSC-supported applications
Working sessions can be saved and restored at the next login
Further information: http://www.csc.fi/english/research/software/freenx
NX screenshot
Customer training
Taito (HP)
– Taito cluster workshop(s) in Q1, 2013
Sisu (Cray)
– February 26 – March 1, 2013 (mostly for pilot users, open for everyone)
– May 14 – May 17, 2013 (for all users; a PATC course, i.e. expecting participants from other countries too)
CSC Cloud Services
Three service models of cloud computing
SaaS – Software
PaaS – Operating systems
IaaS – Computers and networks
Example: Virtualization in Taito
Taito cluster: two types of nodes, HPC and cloud
[Diagram] HPC nodes alongside cloud nodes; each cloud node runs RHEL as the host OS and hosts virtual machines with guest OSes such as Ubuntu or Windows.
Traditional HPC vs. IaaS
Traditional HPC environment vs. cloud environment (virtual machine):
Operating system
– HPC: the same for all, CSC's cluster OS
– Cloud: chosen by the user
Software installation
– HPC: done by cluster administrators; customers can install software only in their own directories, no administrative rights
– Cloud: installed by the user, who has admin rights
User accounts
– HPC: managed by CSC's user administrator
– Cloud: managed by the user
Security, e.g. software patches
– HPC: CSC administrators manage the common software and the OS
– Cloud: the user has more responsibility, e.g. patching of running machines
Running jobs
– HPC: jobs must be sent via the cluster's batch scheduling system (BSS)
– Cloud: the user is free to use a BSS or not
Environment changes
– HPC: changes to software (libraries, compilers) happen
– Cloud: the user can decide on versions
Snapshot of the environment
– HPC: not possible
– Cloud: can be saved as a virtual machine image
Performance
– HPC: performs well for a variety of tasks
– Cloud: very small virtualization overhead for most tasks; heavily I/O-bound and MPI tasks are affected more
Cloud: Biomedical pilot cases
Several pilots (~15)
Users from several institutions, e.g. the University of Helsinki, the Finnish Institute for Molecular Medicine and the Technical University of Munich
Many different usage models, e.g.:
– Extending an existing cluster
– Services run on CSC IaaS by a university IT department for end users (SaaS for end users)
Conclusion
Conclusions
The most powerful computer(s) in Finland
– Per-core performance ~2x compared to Vuori/Louhi
– Better interconnects enhance scaling
– Larger memory
– Smarter collective communications
[Chart] Performance comparison: dhfr benchmark (30k atoms, PME, Gromacs), ns/day as a function of core count for Taito, Sisu (estimated), FGI, Vuori and Louhi.
Sisu&Taito vs. Louhi&Vuori vs. FGI vs. Local Cluster (Merope)
Availability
– Sisu&Taito (Phase 1): 1Q/2Q 2013; Louhi&Vuori, FGI and Merope: available
CPU
– Sisu&Taito: Intel Sandy Bridge, 2 x 8 cores, 2.6 GHz (Xeon E5-2670); Louhi&Vuori: AMD Opteron 2.3 GHz Barcelona and 2.7 GHz Shanghai / 2.6 GHz AMD Opteron and Intel Xeon; FGI and Merope: Intel Xeon X5650, 2 x 6 cores, 2.7 GHz
Interconnect
– Sisu&Taito: Aries / FDR IB; Louhi&Vuori: SeaStar2 / QDR IB; FGI and Merope: QDR IB
Cores
– Sisu&Taito: 11776 / 9216; Louhi&Vuori: 10864 / 3648; FGI: 7308; Merope: 748
RAM/core
– Sisu&Taito: 2 / 4 GB (16 nodes with 256 GB/node); Louhi&Vuori: 1 / 2 / 8 GB; FGI: 2 / 4 / 8 GB; Merope: 4 / 8 GB
Tflops
– Sisu&Taito: 244 / 180; Louhi&Vuori: 102 / 33; FGI: 95; Merope: 8
Accelerator nodes
– Sisu&Taito: in Phase 2; Louhi&Vuori: - / 8; FGI: 88; Merope: 6
Disk space
– Sisu&Taito: 2.4 PB; Louhi&Vuori: 110 / 145 TB; FGI: 1+ PB; Merope: 100 TB