EPCC News: the newsletter of EPCC, the supercomputing centre at the University of Edinburgh
Issue 79, Summer 2016
Best in breed: novel computing brings innovation to the farmyard
Fortissimo: better design through HPC
In this issue
Large Synoptic Survey Telescope: data-intensive research in action
Creating a safe haven for health data
Join us at EuroMPI!
From the Director
Regular readers will notice that this
welcome is now in the singular
rather than the plural. I’d also like to
welcome our newest avid reader,
Alison Kennedy, the new Director of
the Science and Technology
Facilities Council’s Hartree Centre.
In all seriousness, I wish my friend
and long-time colleague Alison all
the best in her new role, which she
took up at the start of April. Alison
has been a key member of EPCC
staff for over 20 years and we all
wish her well.
I hope this issue of EPCC News
conveys the huge breadth of
EPCC’s activities. Many scientific
users know us mainly through the
ARCHER service (and all of the
previous national HPC services that
we have run). ARCHER has never
been busier – indeed we are well
aware of the challenges we face in
terms of meeting the expectations
of our many users. Busy systems
mean long queuing times – a clear
argument for more Government
investment in national HPC
services. However, any new
investment will take time and this
doesn’t help the immediate needs
of our many users who face long
turn-around times.
But EPCC does much more than
supply national HPC services. I
hope that you find the articles on
our many and varied industry
projects interesting. As the
Fortissimo project is showing, the
need for HPC among Europe's small to medium-sized enterprises (SMEs) continues to grow. With 122 partners this is our most complex EC project, but it only scratches the surface of the potential demand.
We estimate more than 30,000
SMEs could easily benefit from HPC
in Europe today.
And if you want further proof of the breadth of HPC look to our example on page 11 – which also explains the attractive Scottish beast on the cover!

Mark Parsons
EPCC Director
[email protected]

Contents
4 Better design for business: Fortissimo helps European SMEs access HPC services
5 A scaleable, extensible data infrastructure: in partnership with industry
6 Working for Scottish industry: HPC on demand
8 Easier HPC adoption: supporting Scottish SMEs
10 Generating impact from research: speeding innovation from research to industry
12 Measuring power in parallel technologies: Adept project achievements
14 Data-intensive research in action: preparing for the Large Synoptic Survey Telescope
16 A safe haven for UK health data: health informatics
18 Exploiting parallelism for research: the INTERTWinE project
19 Spreading the word: ARCHER Champions
20 Software skills for researchers: Software Carpentry workshops
22 UK Many-Core Developer Conference: UKMAC 2016 review
23 HPC outreach: EPCC at the Big Bang Fair

The Diversity in HPC (DiHPC) project is working to showcase the diversity of talent in the high performance computing community. No personal characteristic should be a barrier to participating in HPC. This includes disability, sexuality, sex, ethnicity, gender identity, age, religion, culture, pregnancy and family.
The DiHPC project provides a set of resources and tools to help the HPC community engender change, implement best practice and champion diversity.
Faces of HPC is a series of stories about people who represent the diversity of the HPC community, championing role models for an inclusive culture.
Diversity in HPC is a project developed by EPCC and funded through the UK national supercomputing facility ARCHER and the EPSRC.
Read more about our work: www.hpc-diversity.ac.uk

Contact us
www.epcc.ed.ac.uk
[email protected]
+44 (0)131 650 5030
EPCC is a supercomputing centre based at The University of Edinburgh, which is a charitable body registered in Scotland with registration number SC005336.
Join us at EuroMPI!
Now in its 23rd year, EuroMPI is the leading conference for
users and developers of MPI. Join us in the beautiful city of
Edinburgh for 4 days of networking, discussion and skills
building. The theme of this year’s conference is “Modern
Challenges to MPI’s dominance in HPC”.
Through the presentation of
contributed papers, posters and
invited talks, attendees will have the
opportunity to share ideas and
experiences and to contribute to the
improvement and furthering of
message-passing and related
parallel programming paradigms.
A panel will discuss the challenges
facing MPI, what the HPC
community needs from MPI to
adapt for the future, and whether MPI will
survive as we proceed to Exascale
and beyond.
Keynotes
EuroMPI 2016 will host an exciting programme of keynotes from across the MPI community discussing the pros and cons of using MPI and the challenges we face. The speakers for this year's conference include:
How can MPI fit into today's big computing? Jonathan Dursi, Ontario Institute for Cancer Research.
MPI: The once and future king. Bill Gropp, University of Illinois at Urbana-Champaign.
The MPI Tools Information Interface. Kathryn Mohror, Lawrence Livermore National Laboratory.
HPC's a-changing, so what happens to everything we know? David Lecomber, Allinea Software.
In addition to the conference's main technical programme, one-day and half-day tutorials will be held.

Women in HPC partnership
We are delighted to announce that EuroMPI 2016 will be the first conference to work in partnership with Women in HPC (WHPC). EuroMPI 2016 is committed to helping broaden diversity in the HPC community and beyond. This commitment to diversity, and in particular to addressing the under-representation of women, comes from a recognition that diversity fosters innovation and is a benefit to society. Our partnership with WHPC will help us ensure that the conference is accessible and welcoming to all and will encourage us to challenge the status quo.

Dan Holmes
[email protected]
EuroMPI will run from 25-28
September in Edinburgh, UK.
Registration
www.eurompi2016.ed.ac.uk/
registration
Early bird registration closes 22
July 2016
For more information on the
partnership with WHPC, see
www.eurompi2016.ed.ac.uk/
diversity
Koenigsegg's One:1, developed with the assistance of Fortissimo. Image: Julia LaPalme.
Fortissimo
Better design through HPC
A consortium of Europe’s leading supercomputing centres and
HPC experts is developing the Fortissimo Marketplace, a
one-stop-shop where end users will access modelling and
simulation services, plus high-performance data analytics.
The advantages of using high
performance computing (HPC) in
modelling and simulation are well
established. However it has proved
more difficult for small companies
to gain these benefits compared to
larger ones. This is typically
because the cost of entry has been
too high: only larger companies
have been able to afford the
investments required to buy and run
HPC systems, and to provide the
necessary expertise to use them.
Fortissimo has learned from the
success of cloud computing by
offering HPC on-demand as a
pay-per-use service, so removing
any need for the end-users to buy
and run their own systems. This
dramatically reduces the risk for
first-time users of HPC, and allows
users to cut costs since they only
pay for the resources they use.
Access to HPC experts through the
Fortissimo Marketplace helps to get
users up and running more quickly.
The Marketplace development is
being driven by over 50
experiments involving various types
of stakeholders such as end-users,
simulation service providers and
software vendors. The experiments
are designed to prove the concept
of the Fortissimo Marketplace and
determine what features it should
have. They will also provide an
initial set of services that will be
available through the Marketplace.
Mark Sawyer
[email protected]
One of the services that will be
offered through Fortissimo is being
developed by Dublin City University
and EPCC. It uses discrete event
simulation to model the operation of
industrial manufacturing processes,
allowing them to be optimised to
improve business performance.
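As a purely illustrative sketch of the discrete event simulation technique described above (not the Dublin City University/EPCC service itself), the Python fragment below models jobs queuing for a single machine and reports their average waiting time. The job counts and timing parameters are invented for the example.

```python
import heapq
import random

def simulate(n_jobs=1000, mean_arrival=5.0, mean_service=4.0, seed=1):
    """Minimal discrete event simulation: jobs queue for one machine (FIFO)."""
    rng = random.Random(seed)

    # Pre-generate arrival events as (time, kind, job_id) tuples.
    events, t = [], 0.0
    for job in range(n_jobs):
        t += rng.expovariate(1.0 / mean_arrival)
        heapq.heappush(events, (t, "arrival", job))

    waiting = []          # job ids queued for the machine
    machine_free = True
    arrivals, start_times = {}, {}

    while events:
        now, kind, job = heapq.heappop(events)
        if kind == "arrival":
            arrivals[job] = now
            waiting.append(job)
        else:  # departure: the machine becomes free again
            machine_free = True
        if machine_free and waiting:
            nxt = waiting.pop(0)
            start_times[nxt] = now
            machine_free = False
            service = rng.expovariate(1.0 / mean_service)
            heapq.heappush(events, (now + service, "departure", nxt))

    waits = [start_times[j] - arrivals[j] for j in arrivals]
    return sum(waits) / len(waits)

print(f"average wait per job: {simulate():.2f} time units")
```

Changing the service rate or adding further machines quickly multiplies the scenarios to evaluate, which is where the computing power discussed below comes in.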
Evaluating the many possible scenarios requires a lot of computing power. The benefit of the Fortissimo approach is that the experts in discrete event simulation do not need to own or run their own HPC system: they simply access the systems in the Fortissimo HPC cloud when they need to. This allows them to focus on their area of expertise and to build up a scalable business.
Fortissimo has attracted much attention from industry, with a number of software and service companies interested in joining the Marketplace. HPCwire awarded it Best Use of HPC in the Cloud in both the Readers' and Editors' Choice categories at the Supercomputing 2015 conference.
A prototype of the Marketplace was
released a few months ago and
work has been continuing to
validate the approach and add
features. An updated version,
intended to fully support the
services being developed by the
experiments, will be launched in the
near future.
Fortissimo is a collaborative
project that enables European
SMEs to be more competitive
globally through the use of
simulation services running on an
HPC cloud infrastructure.
www.fortissimo-project.eu
Fortissimo Marketplace
www.fortissimo-marketplace.com
Industry partnership: building a scaleable,
extensible data infrastructure
Modern genome-sequencing
technologies are easily capable of
producing data volumes that can
swamp a genetic researcher’s
existing computing infrastructure.
EPCC is working with the breeding
company Aviagen to build a system
that allows such researchers to
scale up their data infrastructures to
handle these increases in volume
without compromising their
analytical pipelines.
The analytics code had to be re-written and parallelised to allow it to scale up as the volume of data increases. The new analytical pipelines operate on HDF5 extracts from the data store, with the data filtered at this stage to include only the data relevant to the subsequent calculation. HDF5 is a data model, library, and file format for storing and managing data. It is designed for flexible and efficient I/O and for high-volume and complex data.
To achieve the desired scalability and reliability, the system uses a distributed columnar database where the data is replicated across a number of compute and data nodes. More compute nodes and storage can easily be added as the data volumes increase, without affecting the analyses that operate on the data.
The pipelines use Aviagen's in-house queue management framework to exploit parallelism by distributing the tasks across a set of available heterogeneous compute nodes. Using this parallel framework, we are implementing a bespoke task library that provides basic functionality (such as matrix multiplication) so that a researcher need only plug together the various analytical operations they require. The framework deals with managing the distribution of the parallel tasks, dependencies between tasks, and management of the distributed data.
This system, combining the columnar database and the parallel analytics library, will allow data archiving and data processing in a scalable, extensible manner. Aviagen will be able to add more data analysis functionality as needed.
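As an illustration of the kind of HDF5 extract such a pipeline might operate on (a hypothetical sketch using the h5py library, not Aviagen's production code), the fragment below writes a chunked, compressed genotype matrix and reads back a filtered slice for a downstream matrix operation. All sizes and dataset names are invented for the example.

```python
import numpy as np
import h5py  # HDF5 bindings for Python

# Hypothetical extract: a genotype matrix of 2,000 animals x 5,000 markers.
genotypes = np.random.randint(0, 3, size=(2_000, 5_000), dtype=np.int8)

with h5py.File("extract.h5", "w") as f:
    # Chunked, compressed storage keeps large extracts manageable on disk.
    f.create_dataset("genotypes", data=genotypes,
                     chunks=(500, 5_000), compression="gzip")

with h5py.File("extract.h5", "r") as f:
    # Read only the columns relevant to the subsequent calculation.
    subset = f["genotypes"][:, :500].astype(np.float64)

# A stand-in for a task-library operation such as matrix multiplication.
relationship = subset @ subset.T
print(relationship.shape)
```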
Eilidh Troup
[email protected]
Amy Krause
[email protected]
“The collaboration with EPCC
promises to give us the ability to
handle increasingly large amounts
of data.”
Andreas Kranis, Research
Geneticist, Aviagen
Aviagen
http://en.aviagen.com
HPC: working
for Scottish
industry
Through EPCC, Scottish company Global Surface Intelligence
is using HPC to make global forestry and crop yield growth
predictions. Ronnie Galloway of GSi explains how.
Global Surface Intelligence (GSi) is
an Edinburgh-based company with
a global reach and expertise in
using Earth Observation (EO) via
satellite images to determine
commercially valuable information
on forest growth and crop yield
predictions.
GSi has developed bespoke
machine-learning software that
learns to recognise what satellites
are “seeing” when covering
important global assets such as
forestry and crops. By
understanding what is on the
ground, often across vast areas,
GSi provides agri-investors, forestry
management and investment
companies, farmers and crop
traders and government with
invaluable ongoing insight into the
value of those forests or crops.
The software developed by GSi is estimated to be 100,000 times quicker than other similar software when run on conventional non-HPC systems, so the benefits of running GSi software on HPC machines are substantial.
GSi first partnered with EPCC in
2013 through the Scottish Enterprise-funded 'Supercomputing Scotland' programme, which provided GSi with funding for EPCC
expertise and compute power to
parallelise and integrate GSi’s
software with HPC. Now, in 2016, the company has a
commercial relationship with EPCC
and enjoys the benefits of access to
1,500 cores on INDY (see opposite
page), allowing GSi to run different
jobs simultaneously. The vast data
ingested by the GSi-Platform is
stored efficiently at EPCC. The data
and compute infrastructures are
co-located at EPCC’s Advanced
Computing Facility.
Ronnie Galloway,
Consultant, Global Surface
Intelligence Ltd
[email protected]
A high-bandwidth, low-latency interconnect reduces the need to copy and manage the data through other means. This presents a huge
commercial advantage to GSi in
reducing time and effort to provide
EO analysis of land assets. At all
times, EPCC provides expertise and
advice to GSi in maximising
efficiencies of using HPC in EO and
big data management.
The GSi relationship with the
University of Edinburgh extends to
a Data Lab-funded project in 2016
employing a PhD student from the
School of Geosciences related to
the crop yield aspect of the
business. Allied to the close
working relationship between EPCC
and GSi, this typifies a relevant and
vital collaboration between industry
and academia that helps a local
SME tackle truly global challenges.
GSi
www.surfaceintelligence.com
The Data Lab
www.thedatalab.com
All images courtesy of GSi.
On-demand HPC
for industry
The INDY service used by GSi (see opposite) is part of EPCC’s
on-demand computing service Accelerator, which brings
supercomputing capability straight to our clients’ desktops.
Through a simple internet connection they gain cost-effective
access to an impressive range of high-performance computing
(HPC) resources including ARCHER, the national HPC service.
Accelerator is targeted at engineers
and scientists solving complex
simulation and modelling problems
in fields such as bioinformatics,
computational biology,
computational chemistry,
computational fluid dynamics, finite
element analysis, life sciences and
earth sciences.
The Accelerator model provides
direct access to HPC platforms
delivering the highest levels of
performance. Unlike cloud-based
services, no inefficient virtualisation
techniques are deployed. The
highest levels of data security are
provided, and the service is
administered directly by the client
using a range of administration and
reporting functions. The service is
fully supported with an integral help
desk. EPCC support staff are
available to help with usage
problems such as compiling codes
and running jobs.
INDY is a dual-configuration Linux/Windows HPC cluster aimed at
industrial users from the scientific
and engineering communities who
require on-demand access to
mid-range, industry standard, HPC.
The system comprises 24 back-end nodes and two front-end login nodes connected by a high-performance, low-latency interconnect. Each node has four 16-core AMD Opteron processors, giving 64 cores per node and 1,536 cores in total. As standard, each back-end node has 256 GB of shared RAM, with two large-memory back-end nodes configured with 512 GB of RAM to support applications that require a larger shared-memory resource. The system supports the future installation of up to two GPGPU cards (NVIDIA or AMD) per node.
INDY utilises IBM's industry-leading Platform HPC cluster management software, providing job-level dynamic provisioning of compute nodes into either Windows or Linux depending on a user's specific OS requirement.
George Graham:
[email protected]
On-demand at EPCC
To discuss our on-demand
Accelerator service, contact
George Graham:
[email protected]
+44 (0) 131 651 3460 or +44
(0) 777 370 8191.
www.epcc.ed.ac.uk/facilities/
demand-computing
SHAPE: Making
HPC adoption
easier for SMEs
It can be challenging for SMEs to adopt HPC. They may have
no in-house expertise, no access to hardware, or be unable to
commit resources to a potentially risky endeavour. This is
where SHAPE comes in, by making it easier for SMEs to make
use of high-performance computing in their business, whether
to improve product quality, reduce time to delivery or provide
innovative new services to their customers.
Successful applicants to the SHAPE
programme get effort from a PRACE
HPC expert and access to machine
time at a PRACE centre. In
collaboration with the SME, the
PRACE partner helps them try out
their ideas for utilising HPC to
enhance their business.
So far, SHAPE has assisted over 20
SMEs (see the project website for
examples), and the third call for
applications has just closed, so
more companies will benefit from
this enabling programme.
SHAPE will continue in the next
phase of PRACE, and the plan is to
have six-monthly calls (the next
opens in June 2016), giving ample
opportunity for SMEs to investigate
what HPC can do for their business.
Albatern: producing
power from waves
Albatern is an innovative Scottish
SME of 15 engineers. Its wave
power generation product consists
of buoyant three-armed Squid
modules which can link with up to
three other Squids.
The Squid modules and their
link-arms contain mechanisms to
generate power, capturing the
heave and surge motion of the
waves via hydraulics. In this way
Albatern has developed a highly
scalable, modular wave power
generator. Albatern’s project,
supported by SHAPE, marked the
start of the development of a
physics code capable of simulating
and predicting the power of a
large-scale Wavenet array (100 or
more devices).
Wave energy prototypes are large,
expensive and funded through risk
capital. As a result, prototype
simulation also forms an essential
part of the device design process.
To progress beyond the limitations
of current, commercially available
software, it was proposed to
construct a new, modular solver
capable of capturing the behaviour
of large-scale Wavenet arrays.
Through SHAPE and with the
support of PRACE experts, Albatern
has prototyped a parallel multibody dynamics solver, built on the PETSc open-source numerical library and scaled out on ARCHER, the Cray XC30 hosted by EPCC.
“Simulations demonstrating the
potential cost and performance
improvements gained through
deploying extremely large, coupled
wave energy arrays will be a
breakthrough for the industry,” says
Dr William Edwards of Albatern.
“PRACE has helped Albatern
develop in-house software that will
directly aid expanding the scope of
their simulation capability. Albatern
is now in a position to write a
multibody dynamics code that will share common parts of the simulation procedure, allowing interchange of either the simultaneous or sequential methods.”

Paul Graham, EPCC
[email protected]

SHAPE (SME HPC Adoption Programme in Europe) is a pan-European initiative supported by the PRACE (Partnership for Advanced Computing in Europe) project. The programme aims to raise awareness and provide European SMEs with the expertise necessary to take advantage of the innovation possibilities created by high-performance computing (HPC), thus increasing their competitiveness. The programme allows SMEs to benefit from the expertise and knowledge developed within the top-class PRACE Research Infrastructure.
NEXIO: Amping up
electromagnetic
modelling
NEXIO SIMULATION is a French
SME that develops electromagnetic
simulation software called
CAPITOLE-EM to study the
electromagnetic behaviour of any
product during the design process,
before the manufacturing phase.
After a first step performed locally in
France using the HPC-PME
initiative, their PRACE SHAPE
project has enabled them, via
access to HPC resources and
expertise in HPC and numerical
simulation, to jump from a personal
computer version of this software to
an HPC version.
Electromagnetic simulation is used
increasingly often because of the
proliferation of communication
devices such as mobile phones and
modems. Studying the effects of
interferences between pieces of
equipment has become essential
for large industrial companies in
aeronautics, space, automotive, etc.
to improve the performances of the
transmitting and receiving systems,
or antennas.
NEXIO SIMULATION proposes solutions for electromagnetic simulation problems with higher frequencies and larger model dimensions, which lead to linear systems with millions of unknowns: one of the biggest challenges that researchers in this field encounter. Such solutions call for special numerical techniques that can greatly reduce the numerical effort and complexity of the solution, as well as the memory required.
“These techniques are usually
based on both physical and
mathematical properties,” says
Pascal de Resseguier of NEXIO
SIMULATION. “However, there is a
certain point where these methods are not enough and we need further gains. That is where parallelisation and HPC systems come in.
“Parallel codes can dramatically reduce computational times if they scale well with the number of cores. Getting to an efficient and optimised parallel code requires expertise and resources that are hard for an SME to obtain. We expect that half of
the future sales of CAPITOLE-EM
will come from the HPC version
developed through this SHAPE
project.”
To find out more, see the SHAPE
website or contact the SHAPE
team at [email protected]
SHAPE
www.prace-ri.eu/hpc-access/
shape-programme
PRACE
www.prace-ri.eu
Albatern
http://albatern.co.uk
Impacting on
industry
EPCC is engaged in two
collaborative projects
designed to generate impact
from research in science,
technology, engineering and
mathematics.
‘Accelerating impact by developing
advanced modelling techniques in
the multiphase flow to the chemical
process industry.’
Global equipment manufacturers in
the chemical and oil and gas
industry, such as Sulzer Chemtech,
the industrial partner in this project,
often rely on commercial
Computational Fluid Dynamics
(CFD) software tools for the design
of their equipment. These
commercial codes are currently
unable to handle complex two-phase flows which exhibit
challenging interfaces between gas
and liquids such as travelling
waves. The formation of interfacial
waves, their frequency and
amplitude is particularly difficult to
model in industrial environments.
This project, led by Dr Prashant
Valluri, accelerates the impact of
world leading research at the
University of Edinburgh in the
modelling of complex flow systems
for industrial applications such as
distillation, absorption, carbon
capture and oil refining. It aims to
lead to new practices in CFD
modelling, disrupting industry’s
current reliance on empirical design
practice for chemical technology
equipment such as structured
packings. A new software tool
relying on rigorous high-performance computing simulation
of multiphase flow and transport
phenomena will be developed, with
expert feedback from users at
Sulzer, so that it can be routinely
used by industry in the future.
Carolyn Brock
[email protected]
Dr Valluri explains the importance of
TPLS, a high-resolution 3D Direct
Numerical Simulation code for
two-phase flows that we have
developed in collaboration with Dr
Valluri and Dr Lennon Ó Náraigh,
University College Dublin:
“Understanding multiphase flows
with rigorous simulations is crucial
for the accurate and economic
design of any industrial units. Until
recently, rigorous flow simulations
were mainly restricted to academic
environments, with only empirical
simulation methods being so-called
‘design-ready’ despite tremendous
errors. However, over the past
decade, falling costs and faster
multi-thread processors have led to
cluster computing becoming more
widespread in industrial R&D units
and powerful supercomputing
clusters such as ARCHER
becoming more accessible.
“Industry is now getting ready to
embrace rigorous simulations not
only for accuracy but also for a
strong economic argument given
smaller trial-and-error
commissioning downtimes and
reduced physical pilot plant trials.
The funds for these projects
were awarded from the
Engineering & Physical
Sciences Research Council’s
(EPSRC) ‘Impact Acceleration
Account’ (IAA), which is
aimed at enhancing
innovation opportunities and
to encourage partnership
working between universities
and industry.
“Our IAA project with Sulzer is an
example. EPCC is at the heart of
TPLS Solver. Through a series of
HECToR/ARCHER and EPSRC
projects, we have been fortunate to
have EPCC by our side all along.
Their best practices in optimisation,
data management, code structures
and numerical strategies have given
TPLS its ultra-powerful bite, making it the only two-phase flow
direct numerical simulation solver
bespoke for supercomputing
architectures with the choice of two
highly powerful interface capturing
algorithms.
“Now at version 2 with over 700
downloads since 2013, and many
more physical and computational
enhancements underway, we are
confident that with EPCC by our
side, industrial/commercial uptake
of TPLS will increase in the next
four years!”
‘Development of a hand-held device
for measuring semen, part 2:
transferring DDM prototype into
advanced commercial prototype.’
The second project focuses on
assessing bull semen motility. The
British market in bull semen is worth
around £50m a year, with 75% of all
dairy cattle breeding being by
artificial insemination. To date,
however, there is no easily portable
method of accurately and
objectively determining parameters
that characterise bull semen in an
on-farm setting, some of which are
part of crucial assessments in
maximising bovine conception rates
and thus herd efficiency. Dr Vincent
Martinez and Professor Wilson
Poon (Institute of Condensed
Matter & Complex Systems,
University of Edinburgh) have
pioneered the use of Differential
Dynamic Microscopy (DDM) for
characterising motile microorganisms.
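DDM works by Fourier-analysing differences between video frames separated by increasing lag times and azimuthally averaging the result into an image structure function. The numpy sketch below is a bare-bones illustration of that analysis with synthetic frames, not the prototype software developed by the project.

```python
import numpy as np

def ddm_structure_function(frames, lags):
    """Azimuthally averaged image structure function D(q, tau).

    frames: array of shape (n_frames, ny, nx); lags: iterable of frame offsets.
    Illustrative implementation of the basic DDM analysis only.
    """
    n, ny, nx = frames.shape
    # Integer |q| bin for each Fourier pixel, used for azimuthal averaging.
    qy, qx = np.meshgrid(np.fft.fftfreq(ny), np.fft.fftfreq(nx), indexing="ij")
    qbin = np.round(np.hypot(qy * ny, qx * nx)).astype(int)

    result = {}
    for lag in lags:
        diffs = frames[lag:] - frames[:-lag]          # all frame differences
        spectra = np.abs(np.fft.fft2(diffs)) ** 2     # |FFT of difference|^2
        mean_spec = spectra.mean(axis=0)              # average over time
        # Azimuthal average over rings of constant |q|.
        radial = np.bincount(qbin.ravel(), weights=mean_spec.ravel())
        counts = np.bincount(qbin.ravel())
        result[lag] = radial / np.maximum(counts, 1)
    return result

# Usage with synthetic data: 100 frames of 64x64 random "images".
frames = np.random.rand(100, 64, 64)
d = ddm_structure_function(frames, lags=[1, 2, 5, 10])
```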
In collaboration with RAFT Solutions
Ltd and using previous IAA funding,
they have validated the use of DDM
for accurately assessing bull semen.
Moreover, they also built a portable,
first-generation prototype and used
it successfully on-farm to
characterise clinical semen
samples.
Following this success, there is now
a need to develop the technology
into an advanced commercial
prototype/IP package to enable
subsequent clinical/industrial
validation work. This milestone
requires software and hardware
development as well as ‘voice of
the customer’ market validation.
The three-way collaboration
between ICMCS, EPCC and RAFT
Solutions Ltd will be vital to
delivering the next key milestone in
this project.
Dr Vincent Martinez
[email protected]
Dr Prashant Valluri
[email protected]
EPSRC
www.epsrc.ac.uk
Institute of Condensed
Matter and Complex
Systems
www.ph.ed.ac.uk/icmcs
Adept: nearing
the finish line!
The Adept project has been working hard for over two years to
further understanding of how power is used in parallel software
and hardware, and we are now on the finishing straight.
Here we take stock of our achievements and reflect on how to
focus our efforts in the final phase. We also consider life after
the project ends: how do we want to exploit the technologies
we have developed and the knowledge we have gained? How
do we ensure a lasting legacy for Adept?
Parallel computing is no longer
limited to large-scale HPC systems,
and parallel technologies are
becoming critical to everyday life.
Parallelism on every scale is in use
throughout society, from the HPC
machines in our labs to the
smartphones in our pockets. Small
and large businesses alike now
need sensible, affordable parallel
systems in order to remain
competitive, and there is a vast
array of different parallel commodity
hardware now available.
Investigating and increasing the
efficiency of such devices is
therefore no longer an abstract
concern but a real and pressing
need. Financial needs,
environmental concerns, system
requirements – all of these
considerations and more will affect
how systems are built in future.
Adept Power Measurement System
One of our key outcomes is the
sophisticated Adept Power
Measurement System (APMS).
This fine-grained measurement
infrastructure reads the current and
voltage from the powerlines that
feed the different components of a
computer system, eg CPU, memory
or disk. The APMS is capable of
measuring from multiple
components with the very high
resolution of 1 million samples per
second.
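By way of a worked illustration (synthetic numbers, not APMS output): with synchronised voltage and current samples, instantaneous power is simply their product, and energy is the integral of power over time, which at 1 million samples per second can be approximated numerically as below.

```python
import numpy as np

# Illustrative post-processing of out-of-band power measurements: given
# voltage (V) and current (A) sampled at 1 MS/s on one power line,
# compute instantaneous power and total energy. Synthetic data stands in
# for real measurement samples.
fs = 1_000_000                                    # samples per second
t = np.arange(fs) / fs                            # one second of samples
voltage = 12.0 + 0.05 * np.random.randn(fs)       # a nominally 12 V rail
current = 2.0 + 0.5 * np.sin(2 * np.pi * 50 * t)  # fluctuating load current

power = voltage * current                 # instantaneous power in watts
energy = np.trapz(power, dx=1.0 / fs)     # energy in joules over one second

print(f"mean power: {power.mean():.2f} W, energy: {energy:.2f} J")
```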
Adept Benchmark Suite
To complement the APMS, the
Adept project has also developed a
suite of benchmarks that can be
used to test and evaluate existing
systems. The benchmarks are
designed to be used for system
characterisation and they target
specific operations and common
computational patterns.
Mirren White
[email protected]
A lot of challenging work remains to be done, but the finish line is now in sight. We are certain that Adept will deliver on all its objectives and more!
The suite consists of three different types of benchmarks:
Micro benchmarks: small single-purpose functions such as basic operations on scalar data types, branch and jump statements, function calls, I/O operations, inter-process communication, or memory access operations.
Kernel benchmarks: computational patterns and kernels that largely consist of the operations from the micro benchmarks.
Application benchmarks: small applications that consist of multiple computational kernels.
A set of Adept power measurement boards, fully wired up and ready for deployment.
Our Benchmark Suite includes a
wrapper for Intel’s Running Average
Power Limit (RAPL) system, which
is an in-band method for reading
power and energy counters on
certain Intel CPUs.
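On many Linux systems the RAPL counters are exposed through the powercap sysfs interface. The sketch below is an independent illustration rather than the Adept wrapper itself: it reads the package energy counter either side of a workload (the counter path and required permissions vary between machines).

```python
import time

# Package-level RAPL energy counter on many Linux systems. This assumes the
# intel-rapl powercap driver is loaded; reading the file may need elevated
# privileges on some machines.
RAPL_FILE = "/sys/class/powercap/intel-rapl:0/energy_uj"

def read_energy_uj():
    with open(RAPL_FILE) as f:
        return int(f.read())

def measure(workload):
    """Return (energy in joules, elapsed seconds) for a callable workload."""
    e0, t0 = read_energy_uj(), time.time()
    workload()
    e1, t1 = read_energy_uj(), time.time()
    # The counter is a wrapping microjoule counter; wrap-around is ignored here.
    return (e1 - e0) / 1e6, t1 - t0

if __name__ == "__main__":
    energy, elapsed = measure(lambda: sum(i * i for i in range(10**7)))
    print(f"{energy:.2f} J in {elapsed:.2f} s, "
          f"average power {energy / elapsed:.1f} W")
```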
Together, the Power Measurement
System and Benchmark Suite form
a powerful set of diagnostic tools to
allow in-depth analysis of an
application’s power use in every
aspect of a system.
Adept Performance and Power
Prediction Tool
The Adept project is not limited to
measuring power consumption, and
another important outcome of the
project is our Performance and
Power Prediction Tool.
Using detailed statistical modelling
that examines a software binary, we
are able to predict how well a CPU
and memory hierarchy system will
perform and how power efficient it
will be, even if we do not have
access to that system or even if that
system does not yet exist.
The Adept tool will impact on
software developers and system
designers by freeing them from
making poorly informed decisions
about how to implement changes to
their systems. It allows for the
design of smarter, cheaper, and
more efficient systems, because a
system’s performance and power
behaviours can be matched to a
specific workload. Giving owners
and developers the freedom and
flexibility to know how their
equipment will perform prior to
porting their workloads means
giving them the ability to make
better choices about what they
implement, how, and when.
In the final few months of the
project, we will be focusing on
improving the Adept tool wherever
possible to make its predictions
increasingly accurate, and we will
use our measurement infrastructure
to conduct a wide range of
experiments around power
efficiency techniques in software
development. But we will also focus
significant effort on the exploitation
of the project outcomes to ensure
the lasting impact of our research.
The Adept Power Measurement system.
The Adept project focuses on
balancing power consumption
and performance in both parallel
software and hardware.
www.adept-project.eu
Large Synoptic
Survey Telescope:
data-intensive
research in action
A number of recent, significant discoveries have propelled
astronomy research into the spotlight. The discovery of dark
matter and dark energy at the beginning of the 21st century overturned our understanding of how the Universe works. And the first
observation of a gravitational wave earlier this year confirmed
Albert Einstein’s long-standing hypothesis precisely 100 years
after it was first published in his general theory of relativity.
This is an exciting time for
astronomy in the UK, a fact that is
reflected by our involvement and
leadership of some amazingly
ambitious new telescopes.
The European Space Agency’s
Euclid dark Universe programme
will launch a space telescope in
2020 to answer our most pressing
questions about the dark Universe.
The Square Kilometre Array (SKA)
radio telescope, coordinated from
Jodrell Bank, will be able to see
back to the early Universe to the
time when cosmological structures
such as galaxies and stars first
began to form when it commences
operation in 2022. And in Chile
construction is underway on the
Large Synoptic Survey Telescope
(LSST) – the most ambitious optical
telescope ever undertaken – which
should “see first light” in 2019.
While the outputs of LSST will
challenge astronomers for years to
come, the ambition of the LSST is
already creating significant
challenges for the engineers and
computational scientists involved in
its construction and future
operation.
At the heart of the telescope sits a
3.2-gigapixel camera (more than 100 times the pixel count of a current top-of-the-range digital camera), which is being designed in
part in the UK. Thanks to this
camera, the telescope will produce
more than 100 Petabytes of data
during a 10-year survey that will
image more than half of the sky with
unprecedented depth and
sensitivity.
LSST:UK consortium
UK astronomers have ambitious
plans for LSST to advance
understanding of dark energy, to
identify and study near-Earth
objects, to detect and follow
transient events, and to progress
George Beckett
[email protected]
LSST data mining sphere. The LSST team
has developed an innovative “overlapping
partitioning” method for storing enormous
amounts of information for rapid access. By
overlapping equally sized packets of information
in the partitioned sphere, searching for nearest
neighbour sources becomes quick and efficient.
The technique has been shown to work just as
efficiently with increasingly complex systems. The
improved algorithms resulting from this innovative
architecture will be available as open source
software that can be used by a broad spectrum
of fields to transform access to large databases.
Image: LSST.
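A simplified, one-dimensional illustration of the overlapping-partitioning idea (not the LSST database code): each partition keeps a margin of sources from its neighbours, so a nearest-neighbour query near a boundary can be answered from a single partition, provided the true neighbour lies within the margin. The sizes and margin below are invented for the sketch.

```python
import numpy as np

# Illustrative 1-D "overlapping partitioning": each partition stores its own
# sources plus a margin of neighbours, so a nearest-neighbour search near a
# boundary never has to touch another partition. (The real LSST scheme
# partitions the celestial sphere in two dimensions.)
rng = np.random.default_rng(0)
positions = np.sort(rng.uniform(0.0, 360.0, 100_000))  # e.g. right ascension

n_parts, margin = 12, 0.5   # number of partitions, overlap in degrees
edges = np.linspace(0.0, 360.0, n_parts + 1)

partitions = [
    positions[(positions >= lo - margin) & (positions < hi + margin)]
    for lo, hi in zip(edges[:-1], edges[1:])
]

def nearest_neighbour(x):
    """Find the nearest source to x using only one (overlapping) partition."""
    part = partitions[min(int(x / (360.0 / n_parts)), n_parts - 1)]
    return part[np.argmin(np.abs(part - x))]

print(nearest_neighbour(29.999))  # a query near a partition boundary
```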
Cut-away image of LSST camera showing inner
workings. Image courtesy of LSST.
supernova science. To support its
ambition, the community has
formed a consortium called
LSST:UK with representation from
every astronomy department in the
country and – with support from the
Science and Technology Facilities
Council – the consortium has
secured full membership of the
LSST.
LSST:UK Science Centre
Construction progress in Chile is
mirrored by scientific progress here
in the UK, as scientists make their
preparations in an £18 million
STFC-funded project called the
LSST:UK Science Centre (LUSC).
The pre-operations phase of LUSC,
which is led by the University of
Edinburgh, started in July 2015 and
will run for four years. During this
term, the infrastructure to host and
analyse LSST data (called the Data
Access Centre) will be designed and
science groups will define and
optimise the workflows that will be
run in the Data Access Centre.
Engagement with the international
community is vital during the
construction phase. LSST:UK is
already building strong relationships
with the core teams of scientists
and technologists in the United
States and France. Further, we are
looking towards collaboration
opportunities with peer activities in
Euclid, SKA, and the LHC,
exploiting the UK’s unique position
of being involved in all three of
these programmes.
The programme of work in the
lead-up to first light in 2019 is
ambitious and exciting. The volume
and rate of data generated by LSST
will break today’s databases and
analysis software, and will challenge
established astronomy practices
and expectations. This is data-intensive research in action.
The Large Synoptic Survey
Telescope (LSST) project will
conduct a 10-year survey of
the sky that will deliver a
200-petabyte set of images
and data products which will
address some of the most
pressing questions about the
structure and evolution of the
Universe and the objects in it.
LSST:UK Science Centre
www.lsst.ac.uk
Creating a
safe haven
for health data
Safe havens allow data from electronic records to be used to
support research when it is not practicable to obtain individual
patient consent while protecting patient identity and privacy.
EPCC is now the operator of the new NHS National Services
Scotland (NSS) national safe haven in collaboration with the
Farr Institute of Health Informatics Research, which provides
the infrastructure.
Enabling researcher access to
sensitive data sources is a complex
process. Data providers manage
their risk by making data supply
dependent on research projects
meeting specific information
governance, data stewardship and
system security requirements, in
some cases through audited
assessment. These system
requirements place a very
substantial burden on individual
research projects and in some
cases these requirements alone can
make projects unviable.
However, the whole supplier
risk-management process can be
streamlined, and in some cases
eliminated entirely, if research
projects use an appropriately
accredited safe haven facility to
broker access to the data.
Safe havens act as secure virtual
data rooms in which the data
suppliers deposit data for the
research projects to access it. The
practice of providing researcher
access to NHS patient and health
data has been pioneered in the UK
through governance initiatives such
as the Scottish Health Informatics
Programme (SHIP).
NSS safe haven
The new NHS National Services
Scotland (NSS) national safe haven
service implementation work started
in September 2015 with the live
service rolled out during December
and January 2016. Now fully
operational, the safe haven is both
physical and remote. It offers a
secure file transfer and submission
service for data providers and a
range of access methods and
analytics platforms and tools for
researchers.
The standard service offered to
research projects is secure remote
browser-based access to a locked-down virtual desktop MS Windows
system with MS Excel, SPSS, Stata,
SAS and R.
Donald Scobbie
[email protected]
The Farr Institute is a
UK-wide research
collaboration. Publicly
funded by a
consortium led by the
Medical Research
Council, the Institute
is committed to using
big data to advance
the health and care of
patients and the
public.
Development and operation of the
new NSS safe haven presented
new challenges for EPCC, although
the safe haven model is mature and
relatively well understood, with
expertise in it readily found in the
HPC community. This project
therefore prompted the
development of new capability
within EPCC, bringing security
management and secure data
stewardship as new core skills to
the system development team.
Implementing and operating the
extensive supporting infrastructure
(including enterprise products for
the virtual desktop infrastructure)
for the new safe haven has been
the key to delivery of the service
and evolution of the new security
environment.
Information governance
The information governance and
security regime of the safe haven
has now reached the standard
where NHS national data sets and
Department of Work and Pensions
(DWP) data can be hosted by the
service and the next goal is to host
the NSS national image archive for
research purposes. Information
governance in a safe haven
environment is very much the
primary concern and HPC a
secondary one.
EPCC is working closely with NSS
and the Farr Institute to extend and
enhance the new safe haven
service beyond its current basic
compute capability to provide
traditional HPC services within the
safe haven. A higher powered
compute cluster and petabyte-scale
storage services are being
developed alongside the safe
haven. The intention is to provide a
more capable, secure analytic
environment for health research that
continues to meet the data
stewardship and sharing security
needs of data providers such as the
NHS and DWP. These services will
be rolled out later this year.
Farr NSS Safe Haven:
• 3-node Hyper-V hypervisor platform
• 46 virtual servers: Windows and Linux
• 20 research partners and institutions
• 103 registered researcher users
• 62 active projects
www.farrinstitute.org
INTERTWinE:
boosting research by
exploiting parallelism
The first exascale computers,
capable of performing 1×10^18
calculations per second using tens
of millions of CPUs, are likely to be
produced within the next few years.
However, current versions of
scientific software cannot produce
enough concurrent tasks to keep
such a high number of CPUs busy
at once, even if there are enough
tasks waiting to be processed. Such
inefficient use would mean that an
exascale-capable machine would in
fact be unlikely to achieve exascale
performance.
The INTERTWinE project is
addressing this by helping scientists
to find and exploit the parallelism
that already exists within their
software. By working with real
software and popular programming
techniques, we ensure the focus is
aligned to scientists’ pressing needs
and applications.
We have identified a number of key
parallel programming models that
have been widely adopted in current
scientific software. However in order
to achieve large-scale parallelism,
these often need to be used
together, and INTERTWinE focuses
on this interoperability of
programming models.
EPCC is contributing to various
areas of the technical work. For
instance, we are investigating how
to combine a thread-based model
with a distributed-memory model
for off-node parallelism. In order to
make this transparent to the user,
yet highly performant, we are
focusing at the runtime level; for
example by using a directory, which
knows where data is located, to
hide explicit data movement and
using a cache to limit the amount of
communication in the first place.
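INTERTWinE's work happens at the runtime level, but the basic shape of the interoperability problem, a distributed-memory model between nodes combined with a thread-based model within each node, can be sketched with mpi4py and a thread pool. This is an illustrative example only: it assumes mpi4py and an MPI library are available, and Python's GIL means the threading here shows the structure rather than a speed-up.

```python
from concurrent.futures import ThreadPoolExecutor

from mpi4py import MPI  # distributed-memory layer: one process per rank

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# Each rank owns a slice of a global problem (here: summing squares of 0..N-1).
N = 1_000_000
local_items = list(range(rank * N // size, (rank + 1) * N // size))

def partial_sum(items):
    return sum(i * i for i in items)

# Thread-based layer within the rank: split the local slice across threads.
n_threads = 4
subchunks = [local_items[i::n_threads] for i in range(n_threads)]
with ThreadPoolExecutor(max_workers=n_threads) as pool:
    local = sum(pool.map(partial_sum, subchunks))

# Combine the per-rank results with a distributed-memory reduction.
total = comm.reduce(local, op=MPI.SUM, root=0)
if rank == 0:
    print("sum of squares:", total)
```

Run, for example, with `mpirun -n 4 python hybrid_sketch.py` (the file name is hypothetical). The interesting engineering, as the article explains, is making the two layers cooperate without the user managing data movement explicitly.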
It is important that the
improvements to the technologies
meet the requirements of the
application developers. Another of
EPCC’s technical contributions is in
the optimisation and improvement
of existing parallel applications
using these new technologies.
EPCC is currently focusing on
Ludwig, a lattice Boltzmann code
for complex fluids, and investigating
the best mix of programming
technologies in order to achieve
good performance and scalability.
The lessons learned from this work
are not only fed back into the
programming interoperability work,
but also into best practice, training
and the development of standards
to meet interoperability demands. In
addition to this we are working with
the relevant standards bodies to
better understand and support the
particular requirements of
interoperability with other
programming models.
Catherine Inglis
[email protected]
The INTERTWinE team in Barcelona.
To subscribe to news from
INTERTWinE, sign up at:
www.intertwine-project.eu/
newsletter.
INTERTWinE is led by EPCC
and funded by the EC Horizon
2020 Research & Innovation
programme for 3 years from
October 1, 2015.
www.intertwine-project.eu
ARCHER Champions:
spreading the word
ARCHER Champions began with a vision: every research
organisation that could benefit from ARCHER should have
someone local who knows about the routes to access
ARCHER and who can help potential users to get started.
We want Champions to tell us how
we can improve support for them
and their local users, and how to
start joining up all the HPC facilities
and the people with the expertise
around the UK.
The Engineering and Physical
Sciences Research Council agreed
that these ideas were worth funding
and so we were able to launch the
ARCHER Outreach project and, as
part of that, ARCHER Champions.
We consulted experts around the
country on the best and most useful
way forward and, based on their
suggestions, in March we organised
a meeting in Edinburgh to gather
HPC experts, ARCHER users and
interested researchers to share our
ideas on ARCHER Champions.
Members of the ARCHER Team
began by outlining what ARCHER
is, what it offers researchers, how it
fits into the National HPC
infrastructure, how to access
ARCHER, the training available and
the user support structures.
We invited discussion on the
obstacles to accessing HPC
facilities (both ARCHER and others),
what the ARCHER team should do
next, and the concerns, frustrations
and uncertainties of new users. We
even managed to bust a few
misconceptions about problems
with using ARCHER.
The name “ARCHER Champions”
was reviewed. This was always
intended to imply “Enthusiasts
championing the use of ARCHER”
rather than “Supreme ARCHER
users” and this discussion
reassured some of those present
who felt they were not (yet)
champion users.
The meeting had a terrific
atmosphere of positivity, bringing
together lots of enthusiasm and
experience, and has provided us
with a wealth of ideas for taking
ARCHER Champions forwards. In
the next few weeks we will ensure
all the Champions have access to
ARCHER, with a budget, so that if
they are not already users then they
can experience using ARCHER for
themselves and be able to
demonstrate to others.
We will also forge further links with
other HPC networks such as
HPC-SIG and the Research
Software Engineers with a view to
co-locating a future Champions
meeting and continuing to provide
our Champions with resources and
information.
Jo Beech-Brandt
[email protected]
We would like to thank
everyone who has helped
get ARCHER Champions
off to such a great start.
To get involved, email:
[email protected]
See our webpage for details
of all current Champions and
the resources shared at the
meeting.
We will continue to add
resources and information
about future events.
www.archer.ac.uk/community/
champions
Software Carpentry
Teaching researchers the software
development skills essential to their work.
Software Carpentry (SC) is an
international collaboration offering
highly-interactive two-day
workshops. The SC model is based
on a community of certified
instructors who teach at the
workshops, and contributors who
maintain the lesson materials.
The lesson materials are all available under a Creative Commons
BY licence and maintained by the
community itself. They are used as
the modular bricks to build the
typical SC workshop curriculum,
which must include SC’s core
topics: automating tasks using the
Unix shell; structured programming
in Python, R, or MATLAB; and
version control using Git or
Mercurial. SC is about methods and
practices, rather than specific tools.
SC has enjoyed a steady increase in
popularity, thanks to an engaged
community that has succeeded in
introducing this format of training to
academic departments all over the
world. This growth led to the
creation of the Software Carpentry
Foundation, which holds the reins
of the initiative and whose main
concern is currently ensuring a
sustainable business model.
SC has a strong international outlook and is subdivided into
regional administrations, which
interface with the groups involved in
the training (host/learners and
instructors/lesson-maintainers).
EPCC has always played a crucial
role in promoting SC across the UK.
A number of active SC instructors
are based at EPCC and we host the
UK regional administration (as part
of the Edinburgh branch of the
Software Sustainability Institute),
which coordinates most SC
workshops in the UK. ARCHER (the
UK supercomputing facility, hosted
by EPCC) has, since 2014, used the
SC format as a regular part of its
training.
Data Carpentry
A noteworthy development has
been the birth of Data Carpentry
(DC), a sibling project which shares
most of SC’s operations. DC
focuses on introductory
computational skills for data
management and analysis, and
Giacomo Peru
[email protected]
Instructors come from a range
of backgrounds and are often
researchers with sound
experience of research
software development and a
clear sense of the pitfalls.
They are certified after an
extensive online course or an
intensive two-day face-to-face
one. SC instructors are
volunteers who offer to teach
at workshops for free and
sometimes during their work
leave.
Katy Wrathall, Flickr
targets a less experienced audience
than SC, offering a curriculum which
features spreadsheets, data
cleaning (OpenRefine), R,
visualisation with R and SQL. DC is
enjoying good success as learners
gain direct, tangible benefits.
Future developments of both SC
and DC are likely to come from
efforts to establish them as more
regular training models within
academic departments (and in
Centres for Doctoral Training), as
well as from the development of
lessons that are more domain-oriented, eg based on the use of a
domain-specific sample dataset
throughout the course. For example,
the development of HPC Carpentry
is moving in this direction.
What participants say
Alexander Konovalov, University of
St Andrews (learner)
“The course I attended covered
many aspects of delivering hands-on training to novice learners
(presumably scientists with no
formal training in programming).
Acquiring such skills is very
important to improve researchers’
productivity and facilitate
collaboration, and I hope to
contribute by recommending and
delivering software carpentry
training in my domain.
“Techniques that I have practiced
here will certainly help me in
teaching computer science modules
as well.”
Aleksandra Pawlik, University of
Manchester (instructor)
“I have taught on SC and DC
courses since 2013 and have met a
very wide range of audiences. I
recommend the courses to all
researchers whose work is heavily
dependent on any type of software
and/or deals with large datasets. I
would also recommend it to other
professionals such as librarians and
administrators, and we do teach
them as well.
“In my experience learners leave the
course with the feeling of having
learned very relevant skills and
of being capable of significantly improving
both the quality and the quantity of
their workflows.”
SC was founded in 1998
by Greg Wilson, formerly
of EPCC, and arose from
the growing awareness
that something should be
done to link domain specific
knowledge-bases with
software development and
computational training.
http://software-carpentry.org
UKMAC
2016: UK
Many-Core
Developer
Conference
Edinburgh hosted the UK Many-Core Developer Conference in May
2016. This informal day of talks
spanning the whole landscape of
accelerated, heterogeneous and
many-core computing brought
together academic and industrial
researchers striving to improve the
programmability of modern
computing systems, which are
becoming increasingly powerful at
the expense of complexity.
The informal nature of the UKMAC
series provides invaluable
opportunities for participants to
meet colleagues and swap stories
of many-core successes and
challenges.
A highlight of the day was the
discussion provided by keynote
speaker Andrew Richards, CEO of
Codeplay Software, an Edinburgh-based company that develops
compilers for many-core parallel
systems and also works on
associated parallel programming
models and standards.
Andrew emphasised the increasing
importance of parallel computing,
particularly in relation to the recent
explosion in machine learning
usage within mainstream markets
such as online services and self-driving cars. He also gave his
thoughts on how to address the
challenges of performance
portability (the ability of software to
run well across different hardware
architectures) and composability
(the ability of different software
components to interoperate
effectively).
The remainder of the day covered a
range of topics, including
experiences with exploiting GPUs
for scientific applications,
frameworks to ease
programmability of FPGAs for
image processing algorithms, and
work to enable applications written
in high-level languages such as
Java to utilise modern many-core
devices.
Alan Gray
[email protected]
UKMAC, in which EPCC had
an organisational role, was
held at The Informatics Forum
in Edinburgh, which was
bathed in sunshine as spring
finally arrived. This was the
7th event in the series and the
first in Scotland, with previous
meetings in Cambridge,
Oxford, Imperial, and Bristol.
The presentation slides are
available at:
http://conferences.inf.ed.ac.
uk/UKMAC2016
The Big Bang Fair
EPCC has an experienced outreach team and under ARCHER
we have increased the scale of our activity, enthusing even
more children about computational science and
supercomputing. However the Big Bang Fair was a step up
again. It is the UK’s largest celebration of science, technology,
engineering and maths for young people, with around 70,000
people attending over 4 days.
Our stand presented three main
activities, with Wee Archie, our mini
supercomputer, particularly popular.
Wee Archie
We used Wee Archie to run an
enhanced version of our dino-racer
demo, with children able to build
their own dinosaurs on the system.
Wee Archie comprises 18
Raspberry Pi 2s, a network switch,
a power supply unit (PSU), and
Ethernet cables in a transparent
case. The LED lights on each of the
system’s nodes show how the
workload on a parallel system is
balanced, with some nodes
carrying out more work than others.
Wee Archie is an excellent tool for
explaining the basics of a parallel
computer and how the components
all fit together.
Build your own supercomputer
This game, which we presented on
iPads, allows players to design,
build and operate their own
supercomputer. As with a real
system, decisions about the type of components to buy must be balanced against running costs and the income generated from clients. The leader board proved to be a lot of fun, and the game also allowed us to demonstrate the main components of a high performance computing (HPC) system and to highlight some of the challenges of running such a system.
Lorna Smith
[email protected]
Post sort demo
A simple but fun demo, the post
sort introduces parallel algorithms
in a practical way. By sorting a
series of envelopes while working
together, the children learned about
parallelism and the different
possible bottlenecks.
How did we do?
So how did it go? The event was a
great success, the booth was
constantly busy, and people
generally went away with a better
understanding of what HPC is and
why it is important. Overall this was
a great event for us. 2017 here we
come!
EPCC Outreach
www.epcc.ed.ac.uk/outreach/
discover-and-learn
Wee Archie
Wee Archie is a portable,
functional cluster developed
by EPCC to demonstrate
applications and concepts
relating to parallel systems.
www.epcc.ed.ac.uk/outreach/
discover-and-learn/facts-and-fun/wee-archie
Master’s degrees in High
Performance Computing (HPC)
and in HPC with Data Science
From EPCC at the University of Edinburgh
EPCC is the UK’s leading supercomputing centre. We are a major
provider of HPC training in Europe, and have an international
reputation for excellence in HPC education and research.
Our two MSc programmes have a strong practical focus and provide access to
leading-edge HPC systems such as ARCHER, which is the UK’s largest, fastest
and most powerful supercomputer.
Through EPCC’s strong links with industry, all students are offered the
opportunity to undertake an industry-based dissertation project.
The University of Edinburgh is consistently ranked among the top 50 universities in the world.*
* Times Higher World University Ranking
Apply now
www.epcc.ed.ac.uk/msc