
The newsletter of EPCC, the supercomputing centre at the University of Edinburgh

news Issue 83 SUMMER 2018

Addressing population-level health challenges Health Data Research UK will build on Scotland’s internationally-recognised wealth of health-related data

Industry update: our latest collaborations

In this issue World-Class Data Infrastructure New exascale projects European Open Science Cloud Our MSc programmes

From our Director

I hope that as you leaf through this issue of EPCC News you’ll see what a broad range of activities EPCC is involved in. We’re going through one of our expansion phases with lots of new opportunities presenting themselves – many but not all related to the Edinburgh & SE Scotland City Region Deal.

A key part of this is our office move in August – the first time we’ve done this in 28 years. We’re looking forward to taking possession of our floor in the new “Bayes Centre for Data Science and Technology”, located beside the Informatics Forum in central Edinburgh.

We all have many memories of the James Clerk Maxwell Building – I first came to the building in 1989 and, apart from a few summers at CERN during my PhD, it’s been my workplace for almost 30 years. We all have many memories of key EPCC “moments” too. It will be strange to leave but our new location presents enormous opportunities for EPCC and we’re looking forward to the move… but not to clearing out our offices!

By the time you read this I very much hope the completion of the City Deal negotiation will have been announced. While waiting for this to happen we have not been idle – a strong set of initial projects is already in preparation.

At the same time, Japan and the USA have spoken more clearly about the delivery timelines for their first exascale systems – 2021–2022. I wonder when the UK will be farsighted enough to invest in its own exascale system?

Mark Parsons
EPCC Director

Privacy statement

Information about you: how we use it and with whom we share it

EPCC processes the personal data of subscribers to EPCC News to ensure each individual receives relevant information and to ensure we use resources in the most efficient and effective way. The information you provide will be used by EPCC to:
• Keep you informed about EPCC, its activities and interests
• Provide you with any information you have requested and for the promotion of benefits and services
• Ensure we only communicate with you about events, opportunities, or services of interest to you.

We are currently using information about you because you have previously given us consent to be added to the EPCC News mailing list. To opt out of the mailing list at any time, send your request to

Cover image: ©

New staff: EPCC continues to grow
European Extreme Data & Computing Initiative: Boosting HPC competitiveness
World-Class Data Infrastructure: A unique combination of computing & data resources
Health Data Research UK: Applying data science to population health challenges
Preparing for exascale: Our new EU projects
Industry projects update: A roundup of recent work
Machine learning for industry grand challenges: Applying ML to oil & gas
Modelling room acoustics: Ray-tracing for sound
HPC visitors’ programme: HPC-Europa3’s first year
Digitising UK plc: EPCC to install HPE Apollo 70
New DiRAC system arrives: Tesseract will contribute to UK e-infrastructure
Modelling particles in the global atmosphere: UK Met Office collaboration
Supporting Big Science: European Open Science Cloud launches
EPCC’s MSc programmes: Changes for the coming year
Moving on...: EPCC’s move to the Bayes
Exascale events review: Review of workshops

Contact us

+44 (0)131 650 5030

Twitter: @EPCCed

EPCC is a supercomputing centre based at The University of Edinburgh, which is a charitable body registered in Scotland with registration number SC005336.


New staff at EPCC

Some of the new faces at EPCC: (clockwise from top) Thomas, Rosa, Paul, Calum, Jane, Oliver, and Anna.

A warm welcome to our new colleagues!

Thomas Blyth, Business Development Manager

Jane Kennedy, Applications Developer

Calum Muir, Data Centre Manager

Oliver Brown, Applications Developer

Ruairidh MacLeod, Applications Developer

Anna Roubíčková, Applications Developer

Paul Clark, Director of HPC Systems

Daniel McManus, Computing Support Officer

Rosa Filgueira, Data Architect

Magnus Morton, Applications Developer

Andreas Vroutsis, Applications Developer (Data Science)

The European Extreme Data and Computing Initiative

The European Extreme Data and Computing Initiative (EXDCI) was created to bolster the global competitiveness of HPC in Europe. EPCC led the work on Talent Generation and Training for the Future.

A lack of HPC skills in the European workforce has often been highlighted as a major threat to Europe’s competitiveness. Three key obstacles are:
• Lack of awareness of HPC in general, and of the jobs it can lead to
• Lack of training opportunities, especially for those who do not yet have a specific need for, or interest in, HPC
• Poor visibility of job opportunities in the sector.

A report on Promotion of HPC as a Career Choice, an HPC Training Roadmap, and a report on an HPC Training Providers’ Forum explored these issues in depth, giving recommendations on how to counter them. The reports are available from the EXDCI website.

Catherine Inglis EPCC Project Manager

To inspire young people to consider working in HPC, a series of Career Case Studies was produced. These feature first-hand stories of people passionate about their work, and highlight the wide variety of jobs open to those with HPC skills.

Under EXDCI’s new funding, EPCC will organise a workshop to bring together HPC professionals involved in outreach and school computing teachers, to identify ways to make young people more aware of HPC as a career choice.

A Training Portal and Job Portal were created to provide central lists of opportunities in Europe. To sign up to post job vacancies or training announcements from your institute, go to:

EXDCI was funded by the EC’s Horizon 2020 Research and Innovation programme for 30 months from 1 September 2015.


The World-Class Data Infrastructure: a fundamental enabler for data-driven innovation

Data innovations could lead to improved visitor experiences during major public events. Edinburgh Festival image: Paul Dodds.

The University of Edinburgh is set to play a key role in the Edinburgh and South East Scotland City Region Deal, delivering the deal’s Data-Driven Innovation programme. Underpinning new data innovation hubs across the University will be an exciting new facility for the secure and trustworthy hosting and analysis of huge and varied datasets.

EPCC’s Director Professor Mark Parsons explains why the World-Class Data Infrastructure will help position the City Region as data capital of Europe.

The unique thing about the World-Class Data Infrastructure (WCDI) is not necessarily huge amounts of computing power. For example, we already host a number of the UK’s national supercomputers at EPCC – we’re not short of compute power. What we’re investing in with the WCDI is combining the computing and data resources to create a facility that will allow organisations to innovate. This involves the storage and services to manage and present data to users.

We’re proposing to host, manage and deliver datasets and services for many different users. It’s going to be very novel. Certainly in Europe, there aren’t any facilities with as broad a data remit as the WCDI.

It will be a fundamental enabler for many data science projects, be they for industry, purely academic, or a mixture. Scotland’s very lucky that it already has EPCC, which is known around the world – and is certainly in the top two or three of European supercomputing centres. But we believe WCDI will facilitate new products, services, and scientific studies by bringing together regional, national and international datasets.

For example, Festivals Edinburgh want visitors to come to the city and have a better experience during the many festivals. I hope we can help by using data we store in the WCDI on how people move around the city during the Festival.

The data we host will have different governance requirements, from open data up to highly secure data with limited access. Subject to permission from the data owners and following robust governance processes, one of the key things we’ll be able to do is take different datasets and link them together to deliver new insights.

Most banks over recent years have created well-run data warehouses, but they tend to only store retail and trading data. They make decisions based on that, writing models and analysing it to predict trends. What many organisations want to do is analyse their data alongside other data to gain new insights about their markets and customers, and to develop better products and services. However, it is difficult for a single organisation to do that because it means bringing different datasets together from multiple sources.

Of course, ethics and governance are vital issues for us. Scotland is a world leader in how it handles access to citizens’ data. For example, we have established a system for using suitably pseudonymised medical records in research to improve health outcomes. In setting up the necessary infrastructure for governance and ethics for healthcare data, we’ve learnt a huge amount – which puts Scotland in a strong position. For the WCDI within the context of the City Region Deal, the idea is to apply these approaches more broadly across all public sector data in Scotland.

Mark Parsons
EPCC Director

An artist’s impression of the Bayes, which is under construction as we go to press. It will be home to EPCC and other world-leading data science and artificial intelligence teams from August 2018.

The University of Edinburgh and the City Region Deal

The University of Edinburgh has partnered with Heriot-Watt University to deliver the City Region Deal’s Data-Driven Innovation (DDI) Programme, to increase the contribution of graduate talent and academic expertise within South East Scotland. Five DDI ‘hubs’ will be created across the two universities: the Edinburgh Futures Institute, Bayes Centre, Usher Institute, Robotarium, and Easter Bush campus – all of which will be supported by the WCDI facility. Over 10 years, academic staff will collaborate with 10 industries across the private, public and third sectors to ensure citizens, businesses, and the region as a whole benefit from the data economy.


EPCC’s move to the Bayes

While we’ve been happy in our current home, there are some good reasons for our move later this year.

Space. When EPCC was founded in 1990 there was just a handful of staff and plenty of room. We now number 100.

Visibility. EPCC is spread over several floors and we live a pretty anonymous existence! The move to the heart of the city centre will boost our visibility.

Location. The Bayes will put us within walking distance of lots of Edinburgh’s great cafes and restaurants. Transport links will be much better too.

It has been an exciting 28 years for EPCC and we look forward to this new chapter in our history.



Health Data Research UK

The application of cutting-edge data science to health and medical data in order to address population health challenges is an exciting and fast-moving new field of research. Health Data Research (HDR) UK, a pioneering national institute, was formed in April 2018 to support world-leading research in this area.

The institute is a joint investment led by the Medical Research Council, together with eight public and charitable organisations. HDR UK brings together the country’s leading expertise in data analytics, computing and statistics with health data, medical research and genomics, with the aim of developing cutting-edge analytical tools and methodologies to address the most pressing health research challenges.

National Data Safe Haven and EPCC’s role

Scotland has an internationally recognised wealth of linkable, health-related data (both routinely collected and consented data, such as Generation Scotland), covering 5.4 million people from cradle to grave. The data is arguably the most extensive in the UK for geographic coverage and phenotypic diversity, linkable via the Community Health Index (CHI) number, and with far greater retrospective reach than elsewhere in the UK. Scotland also has the UK’s only national prescribing/dispensing and hospital imaging datasets.

EPCC hosts much of this data in the National Data Safe Haven (NDSH), a highly secure computing environment with high-performance storage and compute infrastructure. EPCC ensures not only that the data is securely accessible for ethically approved research, but also that the infrastructure can cope with the demands of large-scale, very complex data analytics. This activity started as part of the Farr Institute, and is now continuing as part of HDR UK.

The NDSH allows approved researchers to access and analyse this health and medical data, and it provides the opportunity for unprecedented insight into the health of Scotland’s population. Being able to perform statistical analysis across multiple different large-scale datasets is opening new avenues for innovative research that will be of significant benefit to public health, not just in Scotland. EPCC is thrilled to be part of this initiative, and we are looking forward to supporting the groundbreaking research that will emerge.
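As a rough illustration of what linking datasets on a common identifier can look like, here is a minimal Python sketch. It is not a description of how the NDSH or CHI-based linkage actually works: the identifiers, the records, the secret key and the function names `pseudonymise` and `link` are all invented for the example, and the keyed hash is simply one common way of replacing an identifier before data reaches a researcher.

```python
import hashlib
import hmac

# Hypothetical secret, assumed to be held only by a trusted linkage service.
LINKAGE_KEY = b"example-secret-key"


def pseudonymise(identifier: str, key: bytes = LINKAGE_KEY) -> str:
    """Replace a personal identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def link(prescriptions: dict, admissions: dict) -> dict:
    """Join two datasets on a shared identifier.

    The output is keyed by the pseudonym, so the raw identifier
    never appears in the linked result.
    """
    linked = {}
    for ident, drugs in prescriptions.items():
        if ident in admissions:
            linked[pseudonymise(ident)] = {
                "drugs": drugs,
                "admissions": admissions[ident],
            }
    return linked


# Made-up identifiers and records, for illustration only.
prescriptions = {"ID-0001": ["drug A"], "ID-0002": ["drug B", "drug C"]}
admissions = {"ID-0002": 3, "ID-0003": 1}
linked = link(prescriptions, admissions)
```

Only the person present in both made-up datasets appears in `linked`, under a pseudonym that cannot be reversed without the key.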

Michèle Weiland EPCC Project Manager

Scotland Substantive Site

There are six HDR UK sites, comprising 22 universities and research institutes in total. One of these sites is in Scotland, led by the University of Edinburgh, in collaboration with the Universities of Glasgow, Dundee, Aberdeen, St Andrews and Strathclyde. Find out more at:

The VESTEC project will support decision-making during natural hazard emergencies such as wildfires. ©

New exascale projects at EPCC

The EU has recently funded a significant number of new projects to continue to drive forward Europe’s readiness for the transition to exascale computing. As a result EPCC will add another three projects to its portfolio of exascale activities.

VESTEC
EPCC has been awarded €350,000 as part of the successful VESTEC proposal. VESTEC, led by the German Centre for Aerospace Research, will develop new models for “urgent supercomputing” and real-time data feeds with the aim of turning supercomputers into powerful decision-support tools for natural hazard emergencies such as wildfires, mosquito-borne disease outbreaks and dangerous space weather events.

EPiGRAM-HS
EPCC has been awarded €500,000 as part of this project to address the heterogeneity challenge of programming exascale supercomputers. Led by the Royal Institute of Technology in Stockholm (KTH), EPiGRAM-HS will improve programmability, extending a set of existing programming models to exploit accelerators, reconfigurable hardware and heterogeneous memory systems.

SAGE-2
SAGE-2 (Percipient StorAGe for Exascale Data Centric Computing 2) will create a next-generation data storage system to enable extreme scale computational workflows. The project will build on the Mero object store to create a system where data-centric computing can be undertaken, moving compute to data rather than data to compute. EPCC has been awarded €310,000 to work on integrating new non-volatile memory technologies with the object store and storage hierarchy, and enabling byte-level access to data for applications.


Lorna Smith EPCC Group Manager

EPCC has been at the forefront of research into exascale systems, the next generation of HPC, for a number of years now. Our involvement includes key projects such as CRESTA [1] and NextGenIO [2], which have explored both hardware development and the preparation of software and applications.

[1] cresta-collaborative-research-exascalesystemware-tools-and-applications
[2]



Image: ALP

Industry projects update

2018 has been a busy year for EPCC so far, with several new collaborations with national and international companies. Here are some examples of our latest research-focused projects. We anticipate more success stories soon!

RIGOCAL (RIGOrous CALculation) is a privately owned engineering company based in Aberdeen. It offers a range of services, including marine mammal observation. We are working with Rigocal on the research and experimental development of neural networks and supporting frameworks for the identification of marine mammal species from video images provided in a restricted set of circumstances.

In March we completed a project with Edinburgh-based Oil & Gas SME Artificial Lift Performance (ALP) to determine the feasibility of predictive models for electrical submersible pump (ESP) operations using novel machine-learning techniques. This project applied large-scale analytics to historical behavioural data, using machine learning to predict future ESP operating performance and to define optimal intervention schedules.

Based on the success of these two projects EPCC is now defining follow-on R&D collaborations with both companies.

Working with technology start-up DeepMiner, we are providing software expertise to integrate machine-learning and back-end functions into a robust multi-user interface which can cope with ultra-high volumes of queries and process large amounts of data quickly.

Thomas Blyth EPCC Business Development Manager George Graham EPCC Commercial Manager

In May EPCC started a 12-month collaborative research and development project with Rock Solid Images (RSI), an industry leader in the interpretation and integration of seismic data in hydrocarbon-producing basins around the world (see article opposite).

This project will investigate novel approaches to petrophysics and rock physics analysis using machine learning, with the goal of improving both the quality and turnaround time of the analysis.

The exploration and production sector of the oil and gas industry is moving rapidly towards real-time access and analysis of all subsurface information and this project will develop and optimise new models to do this using machine learning.

Event: Advanced technologies for Industry 4.0
Edinburgh, Sep 19, 2018.

EPCC is perfectly positioned to provide HPC and high performance data analytics technologies to Scottish industry. This half-day workshop will explain how Scottish engineering and manufacturing companies can benefit from the expertise and support that is available.

Machine learning for oil & gas exploration

RSI’s global rock physics experience. Image: Rock Solid Images.

We are working on a machine-learning project with Rock Solid Images (RSI), a geoscience consulting firm that provides borehole characterisation with the goal of reducing exploration drilling risk for oil and gas companies. RSI is one of the main players in the interpretation of seismic data with well log data and it has built its business on using advanced rock physics methods combined with sophisticated geologic models to deliver highly reliable predictions of where oil and gas might be found.

The oil and gas industry is awash with subsurface data and the investigation of a single well produces very many data fields, with values for each of these recorded at many depths down the well. This raw data then needs to be interpreted before it is useful, but crucially interpretation is manually intensive and it takes over a week for an experienced petrophysicist to interpret each well. The resulting interpreted data is then fed into RSI’s rockAVO™ software, which enables its customers to both understand the rock physics of existing wells and also predict geology elsewhere in a region.

The central aim for RSI’s customers is to make informed decisions about where to drill and rockAVO™, along with the interpreted well data, is a key part of this. In this process the more wells you have, the more accurate a prediction can be made. But the manual interpretation time of each well fundamentally limits the number that can be used. It is currently common for RSI’s customers to use between 10 and 100 wells per region, whereas there is raw data available for thousands or even tens of thousands of wells in some locations... but with the current manual interpretation process, these would take centuries to interpret!

This project will focus on optimising the process of petrophysical interpretation by using machine learning. Pattern recognition underlies the action performed by the experienced petrophysicist, so a key question is whether one can leverage machine learning approaches to bring down the interpretation time from a week to a matter of a few hours or even minutes. To this end we will develop models that tackle the different steps in their petrophysical workflow.

It is really interesting not only to learn more about this industry but also see how it is becoming very interested in using machine learning techniques to address grand challenges. It is clear to me that, while the application of machine learning in this industry is still at an early stage, there is a real momentum behind obtaining more value from data and an understanding that, if they get it right, it could be a game changer.
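The scale of the manual interpretation bottleneck described in this article can be checked with some quick arithmetic. The sketch below assumes exactly one week per well (the article says “over a week”), a single petrophysicist working through the wells one after another, and 52 working weeks a year; the function name and these defaults are illustrative assumptions rather than RSI figures.

```python
WEEKS_PER_YEAR = 52  # assumption: no allowance for holidays


def years_to_interpret(wells: int, weeks_per_well: float = 1.0,
                       petrophysicists: int = 1) -> float:
    """Years needed to interpret `wells` wells by hand, one after another."""
    return wells * weeks_per_well / (petrophysicists * WEEKS_PER_YEAR)


# From a typical customer study up to the largest regions mentioned.
for wells in (100, 1_000, 10_000):
    print(f"{wells:>6} wells: about {years_to_interpret(wells):.0f} years")
```

Under these assumptions a region with ten thousand wells would occupy one petrophysicist for roughly two centuries, which is why cutting the time per well to hours or minutes matters so much.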


Nick Brown EPCC Data Architect

One of RSI’s key aspirations is to predict the geology of remote and difficult-to-reach locations from well explored areas of known similar geology. This is an open question and likely to raise considerable challenges. We won’t necessarily solve all of them in this project, but as our machine-learning models mature we will explore the portability of them to other areas. For instance, how accurately can the models that we have trained based on the geology of mid-Norway predict rock properties in the Barents Sea?

This 12-month project, called Streamlined WorkflOws for Optimal Petrophysics (SWOOP), is funded by the Oil and Gas Innovation Centre (OGIC).


Ray tracing simulation of sound propagation paths in a large concert hall.

High-performance ray tracing for room acoustics

Last August EPCC’s James Perry, Kostas Kavoussanakis and I started work on the Auralisation of Acoustics in Architecture (A3) project. One of its goals was to explore the use of ray-tracing techniques to model the sound qualities of a room. Such a tool could help optimise the acoustics of an existing or future concert hall, improving the audience’s listening experience. It could also help recreate the sound characteristics of ruined historical spaces.

Ray tracing in acoustics bears many similarities to ray tracing in graphics, primarily in that the brunt of the work involves finding the nearest triangle (in a surface mesh) that intersects a given ray, and this is generally accomplished using acceleration structures such as bounding-volume hierarchies (BVH) or voxel grids. In graphics the goal is to generate an image, whereas in architectural acoustics the goal is to generate an impulse response, which allows one to analyse and recreate the acoustics of a room such as a concert hall.

While ray tracing is strictly a high-frequency approximation (unlike wave-based methods for acoustics simulation, which are valid for all frequencies), it is useful as its computational costs can be significantly less than brute-force wave-based methods [1]. Then again, ray tracing for room acoustics is by no means trivial. Using only the strict principles of classic geometric acoustics [2], the computational cost of ray tracing in room acoustics scales primarily with the reverberation time of the room (the time it takes a sound to decay by a factor of 1000 in amplitude). In a typical concert hall this might mean firing billions of rays, each of which may bounce through the room hundreds of times.

Because this team has previously worked together through the five-year ERC-funded NESS project [3], the collaboration in this project has been extremely productive, with prototype codes written in Matlab by myself, initial ports to C++ by James Perry, and further back-and-forth testing and optimisation by both of us. Through many optimisations, including efficient parallelisation over multiple CPUs and advanced vectorisation, we were able to ray trace a concert hall in a matter of minutes, as opposed to the days required by Matlab prototype codes.

Brian Hamilton Acoustics and Audio Group, University of Edinburgh

Our aim is to get this new ray-tracing tool (and other wave-based tools developed as part of the parallel ERC-funded Wave-based Room Acoustic Modelling project) into the hands of practitioners, to study how auralisation can be utilised in early phases of architectural design.
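The core operation mentioned in this article, finding the nearest triangle in a surface mesh that a ray intersects, can be sketched in a few lines of Python. This is not the A3 project’s code: it uses the well-known Möller–Trumbore ray-triangle test and a brute-force loop over every triangle, where a production tracer would use an acceleration structure such as a BVH. The function names and the toy two-triangle “mesh” are assumptions made for the example.

```python
EPS = 1e-9  # tolerance for parallel rays and self-intersection


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ray_triangle(origin, direction, tri):
    """Moller-Trumbore test: distance along the ray to the hit, or None."""
    v0, v1, v2 = tri
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < EPS:          # ray is parallel to the triangle's plane
        return None
    inv = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv         # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    return t if t > EPS else None  # ignore hits behind the origin


def nearest_hit(origin, direction, triangles):
    """Brute-force search: (distance, triangle index) of the closest hit."""
    best = None
    for index, tri in enumerate(triangles):
        t = ray_triangle(origin, direction, tri)
        if t is not None and (best is None or t < best[0]):
            best = (t, index)
    return best


# Toy mesh: two parallel triangles facing the origin, the far one listed first.
far = ((-1.0, -1.0, 3.0), (1.0, -1.0, 3.0), (0.0, 1.0, 3.0))
near = ((-1.0, -1.0, 1.0), (1.0, -1.0, 1.0), (0.0, 1.0, 1.0))
hit = nearest_hit((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), [far, near])
```

The brute-force loop costs one test per triangle per ray, which is exactly the expense that BVHs and voxel grids are designed to avoid when billions of rays are fired.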

References
1. B. Hamilton, “Simulating the acoustics of 3d rooms.” blog/2014/11/24/simulating-acoustics-3d-rooms. Accessed: 2018-05-04.
2. L. Cremer and H. A. Müller, Principles and Applications of Room Acoustics, vol. 1. Chapman & Hall, 1982.
3. K. Kavoussanakis, “Ness (next generation sound synthesis project) bows out.” https://www. Accessed: 2018-05-04.

The nine-month A3 project was funded by a grant from the CAHSS Challenge Investment Fund.

The HPC-Europa consortium on the steps of the Torre Girona chapel (home to MareNostrum) during the project meeting at Barcelona Supercomputing Centre in May.

HPC-Europa3: the first year

It is hard to believe that the first year of HPC-Europa3 has gone by already! In part this is because our first visitors only started arriving at the very end of 2017 – a full 5 years after the end of the previous programme – after a considerable set-up phase.

EPCC has now welcomed 12 HPC-Europa visitors from institutes in the Czech Republic, Denmark, Hungary, Italy, the Netherlands and Spain. These visitors have been working on innovative research in areas as diverse as optimising food processing at high pressure/high temperature, using aerial images to detect floating objects in the sea (eg oil spills or shipwrecked people), and designing novel inhibitors of galectin proteins, with potential to be used as anti-cancer and anti-inflammatory therapies.

HPC-Europa offers an excellent opportunity for academics anywhere in the UK to host European collaborators in their department for up to 3 months – visits are not restricted to research departments at Edinburgh University. Some of our 12 visitors to date have been hosted by groups at nearby Edinburgh Napier, and further afield at Brunel, Cardiff, King’s College London, Oxford, STFC Daresbury, and University College London.

Further, UK-based academics can use HPC-Europa3 to visit research groups in Finland, Germany, Greece, Ireland, Italy, the Netherlands, Spain and Sweden.

The programme is open to anyone from postgraduates to full professors. Travel expenses are covered, along with funding for accommodation and subsistence. Applicants should be able to demonstrate that they require access to HPC facilities to carry out the proposed work, and show evidence of a potentially fruitful scientific collaboration with their chosen host.

Closing dates
There are 4 closing dates for applications per year, and applications may be submitted at any time. The next closing date is 13th September.


Catherine Inglis EPCC Project Manager

Further information

The HPC-Europa team welcomes questions from anyone interested in applying for or hosting a visit. Please contact us at:
A recent ARCHER webinar provided information about the programme:
You can also read some EPCC visitors’ reports on our blog:

HPC-Europa is funded for 4 years, until April 2021, under the European Union’s Horizon 2020 research and innovation programme.


EPCC to install new HPE Apollo 70 system as part of collaboration to advance digitisation of UK economy

We are very pleased to announce our involvement in the Catalyst UK programme, a collaboration between HPE, ARM, SUSE and three UK universities (Edinburgh, Bristol and Leicester). Thanks to considerable investment by the three companies, EPCC will soon install a 4,096-core HPE Apollo 70 system. This system will use the new Cavium ThunderX2 ARM 32-core processor.

For a number of years, the ARM processor, which powers more devices than any other processor technology on the planet, has been touted as a contender for the next generation of HPC systems. The ThunderX2 from Cavium represents the first serious attempt to produce an HPC-grade multi-core processor based on the ARM instruction set and design.

If ARM is to become a serious contender in the HPC world, it’s crucial that there is a fully optimised and well-tested software stack to support users and their application codes. EPCC’s focus will be on porting many of the UK’s key computational science applications – many of the applications that run on the National HPC Service, ARCHER, today – to the Apollo 70 system to explore its performance and identify how best to compile and optimise codes for this new processor. There is of course huge expertise in writing highly-optimised software for the ARM core today, but most of this experience is in mobile applications rather than numerically intensive simulation codes.

EPCC will contribute two PhD studentships to the endeavour along with a programme of software porting by EPCC staff and all of the running costs of the system. The system itself will be installed at EPCC’s state-of-the-art data centre, the Advanced Computing Facility. It is our intention to run a general user service on the system once it has been installed and configured. There will be more news on this over the summer.

Catalyst UK
The key focus of the Catalyst UK programme is to investigate and showcase the potential of Arm-based HPC installations. This is one of the current approaches to overcome the limitations of traditional computer architectures and offer a better price-performance ratio for modern workloads and applications. These include AI, which needs to process large amounts of data and requires extremely high memory bandwidth, and exascale computing, which requires HPC systems to be hundreds of times faster and more efficient than today’s fastest supercomputers.

Mark Parsons EPCC Director

Following HPE’s purchase of SGI in 2017 we’ve built a new, strong relationship with the merged companies. This is a very exciting time in HPC as we look to the exascale. New technologies are needed to reach an exaflop and ARM-based processors may be a key component. We look forward to installing the system and exploring the performance of a large-scale, InfiniBand-based ARM system for the first time.

Photographs of Tesseract by Craig Manzi, EPCC.

Unwrapping Tesseract: the new DiRAC system hosted by EPCC

DiRAC is part of the UK Government’s investment into the UK’s e-infrastructure to improve computing, software development, data storage, networking and distributed computing networks.

In January 2018, EPCC’s Advanced Computing Facility (ACF) retired the Science & Technology Facilities Council’s (STFC) DiRAC BlueGene/Q service. In March it was replaced by the STFC’s new Extreme Scaling system, an 844-node Hewlett Packard Enterprise 8600 “Hypercube” supercomputer.

Each node has two Intel Xeon Scalable “Silver” 4116 processors and 96 GB of memory. In total, the new system has 20,256 Intel Skylake computing cores, and is interconnected using 100 Gbit/s Intel Omni-Path interconnect. The Hewlett Packard 8600 is based on the former SGI ICE-XA technology, with tightly coupled network and energy-efficient water-cooled technology.

Like its predecessor, the system is uniquely optimised for Cartesian PDE problems that arise in computational particle physics simulations. Unlike other 8600 installations, the Edinburgh system uniquely makes use of an exact power of two (16) nodes on each leaf switch, assisting the science by placing the Cartesian problems precisely on the underlying hypercube network topology. The system’s name, Tesseract, highlights this unique use of topology and customisation to the target problems.

In addition to carrying out the critical infrastructure installation and system configuration work required to install and operate the system, the ACF and HPE teams have assisted STFC scientists in demonstrating that large computing jobs can run all the interconnect links concurrently at wire speed, making for a highly scalable simulation platform for the target class of problem.
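To see why power-of-two node counts suit Cartesian problems on a hypercube network, here is a small Python sketch of the textbook Gray-code embedding. It is offered as an illustration of the general idea only and is not taken from Tesseract’s job placement software: the 4×4 periodic grid, the function names and the choice of two bits per dimension are all assumptions for the example.

```python
def gray(i: int) -> int:
    """Binary-reflected Gray code: consecutive values differ in one bit."""
    return i ^ (i >> 1)


def hypercube_node(coords, bits_per_dim):
    """Map grid coordinates to a hypercube node label.

    Each coordinate is Gray-coded and the codes are concatenated, so the
    label has sum(bits_per_dim) bits, one per hypercube dimension.
    """
    node = 0
    for c, bits in zip(coords, bits_per_dim):
        node = (node << bits) | gray(c)
    return node


def hamming(a: int, b: int) -> int:
    """Number of hypercube links separating two node labels."""
    return bin(a ^ b).count("1")


# Assumed example: a 4 x 4 periodic grid placed on a 4-dimensional
# hypercube of 16 nodes.
SIZE, BITS = 4, (2, 2)
placement = {(x, y): hypercube_node((x, y), BITS)
             for x in range(SIZE) for y in range(SIZE)}

# Hop count between each grid point and its +x and +y periodic neighbours.
hops = [hamming(placement[(x, y)], placement[nbr])
        for x in range(SIZE) for y in range(SIZE)
        for nbr in (((x + 1) % SIZE, y), (x, (y + 1) % SIZE))]
```

Every entry in `hops` is 1: each nearest-neighbour exchange in the periodic grid travels over exactly one hypercube link, which only works out this neatly when every grid dimension is a power of two.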


Peter Boyle School of Physics & Astronomy, University of Edinburgh

Commercial on-demand access
Tesseract is available for commercial use. Contact George Graham, EPCC’s Commercial Manager, to find out more: +44 (0) 131 651 3460, +44 (0) 777 370 8191


Improving the scalability of an emergency response model

The UK Met Office’s NAME model can simulate the progress of billions of particles through the global atmosphere, but requires major computing resources to do so. Thanks to work by EPCC, NAME can now run on large parallel systems and promises more accurate forecasts of hazards.

Early on 26th April 1986, Reactor Number 4 at Chernobyl – in what is now Ukraine – suffered a catastrophic failure, causing the release of a large radioactive plume into the atmosphere. Weather conditions at the time caused the contents of the plume to be swept over north-western USSR, and towards Scandinavia and other parts of northern Europe. This resulted in widespread public health concerns, particularly in upland areas where high rainfall led to significant accumulation of radioactive isotopes such as iodine and caesium on farmland and hence in grazing animals.

One further result of the Chernobyl disaster was the development of what was then referred to as the “Nuclear Accident Model” at the Met Office. Originally intended to simulate the dispersion of radioactive species, interests have expanded to include routine air quality forecasting, and predicting the spread of pollutants, airborne viruses and volcanic ash. Notable occasions when the model has been brought into operational use include the outbreak of foot-and-mouth disease in the UK in 2001, the eruption of Eyjafjallajökull and the consequent disruption to commercial air traffic in April 2010, and during the Fukushima nuclear power station disaster following the tsunami in Japan in 2011. This widening of the scope of interest has meant the model has been renamed Numerical Atmospheric dispersion Modelling Environment (NAME) [1].

NAME is a “Lagrangian” model which represents atmospheric dispersion by tracking simulated ‘particles’. Movement through the atmosphere is typically driven by numerical weather prediction data (either historical data or operational forecast) while particles also undergo a random motion to represent small-scale turbulence which is not resolved. Particles are generated by a specified source or set of sources which may be natural (eg a volcano or a fire) or man-made (eg a factory or other known source of pollution). Particles may be removed from the atmosphere by a number of different processes such as fall-out

Kevin Stratford EPCC Software Architect Ben Devenish UK Met Office


owing to gravity and impact with the ground, and ‘wash-out’ by rain.

The main computational cost in NAME is therefore representing a large number of simulated particles which carry a significant payload of information (position, velocity, chemical content, and so on). To move forward in time, the properties of each particle must be updated using the relevant dynamics, chemical rate equations, radioactive decay laws and so on. As the statistical accuracy of the results of any given simulation depends upon having a large enough number of particles in any relevant area of interest, millions or even billions of particles might be required for foreseeable applications. This places a strain on both memory resources and time-to-solution when using an existing (thread-) parallel version of NAME.

To allow NAME to be used on larger parallel machines, work by Matt Rigby of the University of Bristol, together with the Met Office and EPCC [2], has developed a distributed memory implementation of NAME. A number of key challenges have been overcome to produce an operational version:
• Particles are split up between the computational resources in a way that balances the amount of work given to each, and also allows all particles in a given geographical area to interact for the purposes of chemistry.
• Particles may need to be transferred between resources as they move around the computational domain, requiring communication via the Message Passing Interface (MPI).
• Appropriate results must be aggregated and output for user analysis.

All this work allows significantly higher numbers of particles to be modelled. This should allow workers at universities and government agencies to use NAME in conjunction with higher resolution meteorological data which are now becoming available [3]. One can hope that this would help mitigate the effects of accidents such as Chernobyl and eruptions such as Eyjafjallajökull.
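The ideas above (advection by the wind, a random step for unresolved turbulence, and ownership of particles by geographical region) can be sketched in a few lines of Python. This is a toy illustration only and not NAME code. The one-dimensional domain, the `Particle` class, the function names, the numerical values and the equal-width slab decomposition are all assumptions. In particular, equal-width slabs keep neighbouring particles together but do not balance the work, which the real implementation also has to do.

```python
import random
from dataclasses import dataclass


@dataclass
class Particle:
    x: float     # position along a hypothetical 1-D domain (km)
    mass: float  # stand-in for the payload each particle carries


def advance(p, wind, sigma, dt, rng):
    """Move one particle: mean-wind advection plus a Gaussian random step
    standing in for unresolved small-scale turbulence."""
    dx = wind * dt + rng.gauss(0.0, sigma * dt ** 0.5)
    return Particle(p.x + dx, p.mass)


def owner(p, domain, n_ranks):
    """Index of the equal-width slab (one per process) that contains p."""
    width = (domain[1] - domain[0]) / n_ranks
    return min(int((p.x - domain[0]) / width), n_ranks - 1)


rng = random.Random(0)   # fixed seed so the sketch is repeatable
domain = (0.0, 100.0)    # assumed domain extent
n_ranks = 4              # assumed number of processes

particles = [Particle(rng.uniform(*domain), 1.0) for _ in range(1000)]
moved = [advance(p, wind=5.0, sigma=2.0, dt=1.0, rng=rng) for p in particles]

# Particles that leave the domain are dropped; those whose owning slab
# changes are the ones a distributed version would send to another process.
inside = [(old, new) for old, new in zip(particles, moved)
          if domain[0] <= new.x < domain[1]]
migrants = [(owner(old, domain, n_ranks), owner(new, domain, n_ranks))
            for old, new in inside
            if owner(old, domain, n_ranks) != owner(new, domain, n_ranks)]
```

In a distributed-memory code each entry in `migrants` would correspond to a particle packed into a message from its old owner to its new one. Here the pairs are simply collected so that the volume of such traffic can be inspected.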


The eruption of Eyjafjallajökull in Iceland in April 2010 led to disruption of air traffic around Europe. The Met Office’s NAME model was used to forecast the spread of the plume of ash in the atmosphere. Image:©

Notes/references
1. research/modelling-systems/ dispersion-model
2. The work was funded under the embedded CSE programme of the ARCHER UK National Supercomputing Service.
3. NAME is available under licence from the Met Office. The distributed memory version will be available in a future release.


The European Open Science what?

The Square Kilometre Array, an international initiative to build the world’s largest radio telescope, will generate 300 petabytes of scientific data per year by 2024. Storing and processing data on this scale is too big a job for any one research organisation. As Michael Wise, Head of Astronomy at ASTRON in the Netherlands, said: “We cannot do this alone, we simply have to collaborate.”

‘Big’ isn’t the only aspect of the data-driven challenge facing science today. It is also complex – for example, how do you launch, track and manage a computational model of a Tokamak nuclear fusion reactor which involves multiply connected simulations across desktop, cloud and high-performance computers? Sensitive data presents another challenge – linking personal medical data with behavioural data from supermarket loyalty cards could provide insights into the emergence of dementia and other neuro-degenerative diseases, but doing that across borders demands the utmost care.

European Open Science Cloud
Launched last year, the European Open Science Cloud (EOSC) is an EU research initiative that seeks to

lay the international computing foundations for tomorrow’s big science. It builds on the ideas of open access to scientific methods and results (software, papers and data), and focuses strongly on open data.

Rob Baxter EPCC Group Manager

EOSC envisages a rich, ever-expanding suite of computational services on top of a layer of findable, accessible, interoperable and re-usable research data (the FAIR principles). And while EOSC’s push towards openness and interoperability may be new, the data, services and resources that form the heart of this cloud are not.

To tackle some of society’s biggest challenges we need to support large, complex, mission-led, scientific endeavours, and to do that we need an interconnected, interoperable, frictionless platform of compute and storage resources at an international level.

Europe’s scientific computing infrastructure (e-infrastructure) has coalesced into a number of specialised initiatives (EUDAT focuses on storage and data management; EGI on high-throughput and cloud computing; GÉANT on networking) and a number of challenges must be met before frictionless interoperability is achieved. The biggest win for EOSC would be single sign-on, followed by uniform access to data. One of the reasons the Web works so smoothly for many is the ability to “log in with Facebook” or “sign in with Google”, a widely-accepted authentication token that provides access to a broad range of services. Single sign-on is even more useful in a

The European Open Science Cloud may be the answer.

An example of mission-led science: the European Incoherent SCATter Scientific Association (EISCAT) is an international scientific organisation that conducts ionospheric and atmospheric measurements with radars, including observing the effects of the aurora borealis. Image: Craig Heinselman, EISCAT director.

cloud environment. It would be a major achievement if researchers or their computational proxies could log in once to the “Science Cloud” then access multiple European resources. How to achieve this is well understood, but there are perhaps too many ways to do it: interoperability between numerous existing authentication and identity systems is the stumbling block. This illustrates EOSC’s main challenge: the solutions it needs to put in place are more political than technical, and need agreement between multiple stakeholders. A uniformly-accessible data layer is regarded as the foundation of EOSC, and again the principal challenge is one of agreement between stakeholders. Following the FAIR principles, data in EOSC will first need to be well described with an agreed basic metadata record to support search and cataloguing (there are plenty of lessons here from library science). Web services will play a big role in accessibility, and standard, open data formats will support interoperability: EOSC data should achieve at least a “three star” rating on the five-star open data scale. Reusability is underpinned by the “O” in EOSC; where data can be shared freely they will be available under unambiguous open licences – public domain or simple attribution under a scheme like

Creative Commons. (One of EOSC’s challenges will be enabling international public health research using restricted data.)

EPCC’s role
EPCC has long been a partner in two of the major underpinning e-infrastructures: the high-performance computing alliance PRACE, and the data infrastructure EUDAT. We also have roles in two new EOSC projects: EOSC-hub and eInfraCentral.

EOSC-hub, which started this year, is regarded as one of the cornerstone projects of EOSC, bringing together service providers from the EUDAT and infrastructure organisations with software providers from Indigo DataCloud and a significant number of scientific research infrastructure users from across Europe. EOSC-hub will blend EUDAT’s data services with EGI’s computational services to create an “EOSC 1.0”, a blueprint for e-infrastructure in Europe for the next decade. It will do this in tandem with OpenAIRE-Advance, the new phase of the open access initiative, and eInfraCentral.

eInfraCentral focuses on the “findability” aspects of EOSC, working with e-infrastructure service providers to build a common catalogue of everything, from data to HPC services.


There’s a lot of work ahead, but the hope is that by 2021 European e-infrastructure will be prepared for whatever big science can throw at it.

Links EGI eInfraCentral EISCAT EOSC EUDAT FORCE11 FAIR data principles fairgroup/fairprinciples OpenAIRE Square Kilometre Array


EPCC’s MSc programmes: all change

The 2018/19 academic year will see major changes for our MSc programmes in High Performance Computing (HPC), and HPC with Data Science. Our current cohort of students will be the last to be based at the original home of the MSc programmes here on the King’s Buildings campus. This is also the last year that the programmes will be under the governance of the School of Physics & Astronomy. From 2018/19, the programmes’ teaching will be based in the University’s Central Area campus, student study space will be co-located with EPCC in the new Bayes Centre, and the programmes’ governance will be under the School of Informatics.

Our students’ dissertation projects, undertaken in collaboration with industrial partners, have established strong links between the academic and business spheres. The move to the Bayes, with its mix of multidisciplinary academic expertise and industrial tenants, will encourage the further expansion of these relationships.

New faces For 2018/19 we will be welcoming various members of EPCC staff into roles within the MSc programmes: both long-time staff and very recent hires to give a range of perspectives. Our new personal tutors include an MSc alumna and a second-generation EPCC member of staff.

Innovation and curriculum change
The MSc programmes team undertook a curriculum review, which identified three major changes to be enacted for the 2018/19 academic year:

Design and Analysis of Parallel Algorithms
This former School of Informatics course, previously run by Dr Murray Cole, was very popular with EPCC students and will return to the University’s curriculum as an EPCC


Ben Morse EPCC MSc Programme Officer

For more information about our MSc programmes, or to apply, see our website:

EPCC’s 2018 Student Cluster Competition team: (left to right) Wilson, Linda, Iñaki, Manos and Spyro.

course with Dr Daniel Holmes as Course Organiser. Its return has enabled changes and improvements to many other EPCC courses and, while not a compulsory course, it will be a highly-recommended option for both HPC, and HPC with Data Science.

Parallel Programming Languages
This has been the most popular optional course due to its introduction to Fortran and initial exploration of more advanced topics. It will now become Advanced Parallel Techniques, and will be one of our most advanced courses, moving to the second semester with a focus on advanced and upcoming technologies for parallel computing. The ever-popular Fortran teaching by Dr Fiona Reid will move to programme level as part of student induction to both MSc programmes.

HPC Ecosystem
This compulsory course on the MSc in HPC will be discontinued. Much of its material will be retained within other courses on the programme and its popular Guest Lectures will still take place at programme level. There is no direct replacement as a compulsory course, allowing HPC students a greater choice of optional courses within EPCC.

Student Cluster Competition

Further changes to the curriculum include the renaming of some courses, including Parallel Numerical Algorithms becoming Numerical Algorithms for High Performance Computing, and Advanced Parallel Programming becoming Advanced Message-passing Programming. An extension of the Project Preparation course into Semester 1 will allow for a more longitudinal induction to the dissertation process, and a more structured approach to project selection and study skills support.

Coached by Manos Farsarakis, now the HPC Architectures Course Organiser on the MSc and the captain of Team EPCC ’14, which won the ISC14 Highest LINPACK award, the team will be aiming high.


TeamEPCC returns to take on competitors from all over the world in the Student Cluster Competition at ISC in Frankfurt in June 2018.

You can hear how they’re getting on by following @TeamEPCC on Twitter. We look forward to reporting back on the competition in the next issue of EPCC News.

Moving on...

To mark our move to the Bayes, staff and old-timers reflect on their time at the James Clerk Maxwell Building (JCMB), EPCC’s first home.

It is hard to imagine EPCC outside of JCMB. This quirky building has greatly influenced the culture of the Centre. The 22’ and 34’ corridors, the Pig Pen office and the occasional adornments such as the reduced-scale model of the HPCx installation in a large, mushroom-shaped display case (..!). I loved the rather large, heavy but also very stylish curved Data Vault in the coffee room. It looked like a fancy

bullet-proof cocktail bar and after it was decommissioned it was used to store fresh laundry for our EU visitors. A great perk of JCMB was also to have a student café with great bacon rolls and Irn Bru for those special end-of-week mornings… As Calvin Harris might say, it was acceptable in the 90s. Jean-Christophe (“JC”) Desplat, Director, Irish Centre for High-End Computing (ICHEC)

What will I miss about JCMB? I will miss the gorgeous view of Craigmillar Park Golf Club from my office window and my lunchtime walk, and I will also miss the people who will remain at JCMB when we move out. In particular I will miss the JCMB Servitors who have served EPCC so well over the years. Maureen Simpson, EPCC Director of Operations

I have been based in JCMB twice, first working for Thinking Machines and then I returned with Cray. The early years were a little different with a range of interesting machines under one roof: Meiko Computing Surfaces, DAP, TMC CM-200, Hitachi and then the T3D and T3E. So what do I remember...
• Julie’s Café for Friday fish & chips.
• The old Training Room with Sun Rays whose desktops moved with you. This room was previously a lab with a graphics monitor directly connected to the Connection Machine.
• The never-ending number of lost people trying to get out of JCMB.
• The day the underwood wiring in a nearby office burnt out and filled the room with smoke. Luckily it didn’t take long for the fire brigade to arrive.
Harvey Richardson, Senior Research Engineer, Cray

I first came to JCMB in 1994, to do the MSc in Computer Science. I remember visiting the Course Organiser at the end of my first day, asking for instructions on how to get out of the maze! In 1998 I returned for my EPCC interview for the post of Applications Consultant and duly got lost again. I look ahead to the move to the Bayes with optimism. It promises to be a hub of activity, with EPCC at the centre of knowledge transfer; that’s what I signed up for. What’s not to like? Kostas Kavoussanakis, EPCC Group Manager

For me, the move to the Bayes Centre is a reminder of how much EPCC has grown prior to my joining in January. My perfectly normal-looking office has the affectionate nickname ‘Lead Underpants Room’ because of the materials that were stored here before it was needed as office space to accommodate some of the new developers who have joined recently. Although I’m fond of my office, it will be nice to move into a room with a less dubious past. Jane Kennedy EPCC Applications Developer (Data Science)

Clockwise from top left: David Wallace holds a board from a Meiko Computing Surface; Cray J90 & T3D; T3D; Arthur Trew (right) with “bulletproof cocktail bar”, the Data Vault; Thinking Machines CM-200; Cray T3D & TMC CM-200; Meiko Computing Surface; Meiko CS-2; ICL DAP. Opposite: JCMB.

My abiding memory of JCMB was its awkwardness as a place to house and support national HPC operations. That we ever got the Cray T3D, J90, T3E and IBM tape-library into the building, much less powered up and operational, was more than a minor miracle! Other memories:
• Trying to explain to visitors that the main entrance was on Level 2.
• An envelope I had on my noticeboard for many years addressed to “Dr JCM Building”.
• The electrical provision in (what were used as 3-person) offices: two unswitched dual sockets flat on the floor under the radiator at one end.
• When the weather was right (no wind, 98% humidity) the mushroom-shaped steam cloud from the original wooden-framed “wet” open cooling towers on the main roof could be seen right across south Edinburgh.
Mike Brown, former Director of HPC Operations at JCMB, later Building Manager of the Advanced Computing Facility

Frankly, JCMB isn’t going to set architectural hearts racing. But for many of us, as students or staff, it has been a truly significant place. I did my Computer Science degree there, spending many hours toiling in the basement – concluding that my skills did not lie in cutting code. Later I returned to JCMB as the Commercial Manager at the Edinburgh Parallel Computing Centre (as we were known in 1990). The 34 corridor became our home – hosting a Summer Student programme each year and a whole array of industry projects and visitors. Before the Advanced Computing Facility was a twinkle in EPCC’s eye, we welcomed the arrival of, what were at the time, major milestones on the EPCC journey with esoterica such as the Connection Machine and the Cray T3D. Even now, entering the doors at JCMB reminds me of those days and I am sure it holds similar memories for many others who have worked at EPCC. EPCC may say farewell to JCMB as it progresses to its new home in the Bayes, but I am sure for many JCMB will remain EPCC’s spiritual home for some time to come.
Kevin Collins, former EPCC staff, now Assistant Principal Industry Engagement, University of Edinburgh


I first entered JCMB as an undergraduate back in 1991. Back then EPCC would have been just beginning and I wasn’t even aware of its existence, but I remember the cafe: giant burger + chips for 80p! Fiona Reid EPCC Applications Consultant

JCMB and Physics gave birth to, and nurtured, EPCC but the time has come to leave home. This will enable closer collaboration with the Informatics and Mathematics communities, and hence working with a wider range of companies and users. EPCC’s links to its traditional areas are now so strong that I see only advantages in moving to the University's new Bayes Centre. If my analogy with growing up makes it sound that EPCC is becoming middle-aged then think again; I have never known a time of as much promise or desire to grasp the opportunities. Together, EPCC and Bayes will form the basis of a strengthened data- and HPC-driven knowledge economy and I have great hopes for the future. Arthur Trew, former EPCC Director, now EPCC Chairman


Review: INTERTWinE exascale workshop

The Horizon 2020 project INTERTWinE is addressing scalability issues for hybridised parallel software, to help scientists get ready for the arrival of the first exascale supercomputers. The project has been running for two-and-a-half years and, with just six months to go, it was an ideal time to disseminate INTERTWinE’s key outputs to the computational research community.

INTERTWinE organised an Exascale Applications Workshop to coincide with EASC 2018 in Edinburgh (see opposite page). The workshop brought together experts from parallel programming, scientific computing, and HPC research to consider the state of the art in hybrid programming and the next steps to realising a practical solution for exascale.

The first exascale supercomputers, which are expected to be in service in the early 2020s, will be highly parallel and rely on a deep hierarchy of nodes, sockets, cores and vector units to achieve a quintillion calculations per second. To program an exascale computer, one will need tools that address parallelism at all levels of the hierarchy. No ‘silver bullet’ solution exists and it is unlikely one will appear soon enough. Thus researchers are likely to have to rely on existing parallel programming models (APIs) used in combination (so-called hybridisation). However, topical attempts to employ hybridisation are frequently stifled by limitations in the way that APIs interoperate. INTERTWinE has worked to eliminate these limitations, and this effort formed the backbone of the Exascale Applications Workshop.

Workshop themes
The first theme considered asynchronous execution, a common issue for supercomputer software, requiring additional synchronisation at the interface between APIs to work around gaps in each API’s knowledge of the progress of the computation as a whole. INTERTWinE presented work on a new construct called MPI_TASK_MULTIPLE that boosts the options for engaging the ubiquitous MPI library within a task-parallel application.

The second session considered interoperability more generally, and examined recent innovations to help programmers more tightly couple different APIs. INTERTWinE’s work on GASPI Shared Notifications, which eliminates unnecessary intra-node communications, was showcased along with early results from its application to space-weather modelling.

The third session considered the critical though often overlooked topic of third-party libraries. Library use is critical for any serious software. However, it is particularly difficult to integrate parallel libraries alongside homegrown parallel code, without either over-subscribing or under-utilising resources. Continues opposite.

George Beckett EPCC Project Manager

EPiGRAM-HS
While INTERTWinE completes in the third quarter of 2018, the legacy of the Exascale Applications Workshop will live on. The soon-to-start EPiGRAM-HS project (part of Horizon 2020) will take over the organisation of the event, with plans to run the next workshop in 2019. Details will be published on the EPiGRAM-HS website, which will launch in October 2018.

EASC: not your average exascale conference In April, EPCC hosted EASC 18, the Fourth Exascale Applications and Software Conference. EASC has gone from strength to strength, and now attracts people from all over the world – the US, Japan, EU, and of course the UK. Our keynote speakers set the tone, discussing advances in memory technologies, automatic hardware compilers, the readiness of particular large-scale scientific applications for exascale and the use of HPC in racing-car design. Many of the talks mirrored these discussions with a strong focus on application enablement.

The poster session proved interesting, with a diverse range of presentations around the exascale theme – from weather applications, through deep learning, to educational outreach to school children. Overall it was a friendly, engaging and high-quality programme, topped off by a meal with Scottish flavour – a piper and Address To A Haggis proving very popular.

Lorna Smith EPCC Group Manager

This year’s conference was held in the John McIntyre Conference Centre, which is overlooked by Arthur’s Seat. Towering over the city, this ancient volcano provided an impressive backdrop to a conference discussing the challenges of developing HPC systems on an impressive scale.

INTERTWinE workshop (continued)
INTERTWinE has proposed a new runtime and API that allows resources to be loaned or borrowed by different computational kernels to maximise the exploitation of CPU resources in all phases of a simulation.

The final session tackled the emerging domain of distributed tasking. Task-based programming is gaining popularity as an approach to parallelisation of algorithms with less regimented structures, as are commonly found in data science and data analytics. Usually tasking

is limited to the node, but INTERTWinE has invested time in developing emerging solutions that work at the multi-node level, as a step towards achieving wholesystem scale. The workshop was a huge success, attracting 31 participants for a relatively specialised and advanced programme of talks. There was strong representation from both industry and academic research, with participants from across Europe and North America.

Managing your EPCC subscriptions To subscribe to the print or electronic version of EPCC News, to join our Events mailing list, or to be removed from any EPCC mailing list, simply email your request to

+44 (0)131 650 5030

Twitter: @EPCCed

EPCC is a supercomputing centre based at The University of Edinburgh, which is a charitable body registered in Scotland with registration number SC005336.


Image: Paul Dodds

Study HPC in the heart of the city

Master’s degrees in High Performance Computing (HPC) and in HPC with Data Science EPCC is the UK’s leading supercomputing centre. We are a major provider of HPC training in Europe, and have an international reputation for excellence in HPC education and research. Our MSc programmes in High Performance Computing (HPC) and HPC with Data Science have a strong practical focus and provide access to leading edge systems such as ARCHER (the UK’s National HPC Service), and Cirrus (an EPSRC Tier-2 National HPC facility). MSc students have the opportunity to undertake their dissertations as an industrial project, building on EPCC’s strong business links. Recent project partners range from start-ups to multinationals.

“Studying the MSc in HPC at EPCC has given me the benefit of a thorough practical grounding in supercomputing and the once-in-a-lifetime opportunity to participate in the Student Cluster Competition at ISC High Performance 2018, alongside the wider opportunities afforded by the student experience at the University and in the city of Edinburgh.”

Programmes are taught in the heart of Edinburgh, with brand new student facilities for 2018/19. Optional course choices include modules from the School of Informatics and the wider College of Science and Engineering. Our graduates are in high demand in both academia and industry in the UK and abroad. The University of Edinburgh is ranked in the top 30 universities in the world by both Times Higher Education World University Rankings 2018 and QS World University Rankings 2018. “Modules covered the full range of HPC and Data Science skill sets from core ‘best-practice’ ways of working to the latest technologies. These were well-structured and delivered at a good pace by lecturers who were more than happy to engage in discussion in response to questions.” Dr Andy Law, Roslin Institute, 2017 MSc in HPC with Data Science graduate

Wilson Lisan, 2017/18 MSc in HPC student

