Collaborate Connect Contribute 2016
Table of Contents
A NOTE FROM THE CHAIR
WHEN WORDS COLLIDE
INSPIRING CULTURAL COLLABORATION
ACCESS TO FURTHER COLLABORATION ON CLIMATE
SOLVING ENDOCRINE DISORDERS WITHOUT BORDERS
BIODIVERSITY AND CLIMATE CHANGE VIRTUAL LAB STUDY
TRACKING THE GLOBAL SUPPLY CHAIN
A CHANGING TIDE
TAKING THE ‘IT’ OUT OF BIOINFORMATICS
SCALING CHARACTERISATION FOR IMPROVED OUTCOMES
THE SKY’S THE LIMIT
CONTRIBUTING THROUGH COLLABORATION
PROTECTING AUSTRALIA’S SHARK POPULATIONS
A GAME CHANGER FOR GEOPHYSICS
NECTAR CLOUD
DATA FOR INTELLIGENCE
A MASSIVE LEAP FOR DATA CAPTURE AND PROCESSING
INTRODUCTION ASSOCIATE PROFESSOR GLENN MOLONEY, DIRECTOR, NECTAR
The National eResearch Collaboration Tools and Resources project (Nectar) is a $61 million investment by the Australian Government in eResearch infrastructure under the National Collaborative Research Infrastructure Strategy (NCRIS). Since I joined Nectar in early 2011, it has been rewarding to work with Australian researchers to develop this innovative infrastructure, which now supports diverse needs across the breadth of Australian research. Australian research institutions and organisations have co-invested over $54 million in the Nectar programs, which enhance the impact of Australian research by enabling researchers to more easily collaborate and share ideas and research outcomes with colleagues and industry in Australia and around the world. Nectar Virtual Laboratories (Nectar Labs) are innovative, domain-oriented online environments that draw together research data, models, analysis tools and workflows to support collaborative research across institutional and disciplinary boundaries. Nectar Labs are built and led by the Australian research sector and used by over 7,500 Australian and international researchers. The Nectar Labs have also leveraged existing eResearch infrastructure provided by the Australian National Data Service
(ANDS), Research Data Services (RDS), the Australian Access Federation (AAF), the Australian Research and Education Network (AREN), the National Computational Infrastructure (NCI) and the Pawsey Centre. The Nectar Research Cloud (Nectar Cloud) is a single integrated cloud operated by eight national partners to allow Australian researchers to store, share, and analyse data, remotely, rapidly and autonomously. Nectar Cloud usage has grown since 2012 to over 7,500 registered users and supports over 1,200 project allocations from 54 research organisations across all fields of research.
Nectar is a world leader in deploying highly innovative cloud computing technology for the benefit of research – providing opportunities for federation with emerging international research clouds. The Nectar Cloud is supporting the NCRIS mission to deliver world-class research facilities so that Australian researchers can solve complex problems here in Australia and around the globe. As lead agent for Nectar, the University of Melbourne has co-invested over $500,000 to enable this work. For more information about Nectar, please visit nectar.org.au/about/
Nectar Cloud usage has rapidly grown since 2011 to over 7,500 registered users.
A NOTE FROM THE CHAIR PROFESSOR KATE AUTY, CHAIR, NECTAR PROJECT BOARD
The Australian Government enables Australian researchers to continue to make significant impacts at both national and international levels through a number of programs, including investment in Australia’s cutting edge collaborative research infrastructure network. As one of these initiatives, Nectar has been highly innovative in creating valuable outcomes for the Australian research community, facilitating research communities to support improved collaborative outcomes and build skills in the sector. The issues facing our national and international communities require researchers to work together collaboratively across both disciplines and distance. Research is now a global activity and an international orientation is vital to success.
Programs like Nectar are vital in driving collaboration between researchers, government and industry to deliver practical outcomes, and enabling Australian researchers to continue to produce approximately 3% of the world’s research outcomes with only 0.3% of the world’s population. One of Nectar’s most significant strengths is that it is researcher-led and supported, with structures throughout its program to ensure that the research community is the primary voice in project planning. This principle will continue to be key as Nectar expands linkages between research and industry, health, education and government, to secure higher levels of innovation and encourage entrepreneurial approaches.
WHEN WORDS COLLIDE “It’s important to understand whether people need to change to fit machines, or if machines need to change to fit people.” PROFESSOR DENIS BURNHAM
SECOND LANGUAGE LEARNING STUDIES COCHLEAR IMPLANTS & HEARING AIDS
LEARNING PROGRAMS FOR CHILDREN WITH LEARNING DISABILITIES
AUTOMATIC MELODY RECOGNITION
HCS RESEARCH IMPACTS AUTOMATIC SPEECH RECOGNITION FORENSIC DETERMINATION OF ORIGIN OF ACCENTS
Alveo is a virtual laboratory that provides human communication science researchers all over Australia with infrastructure to store data collections, analysis tools, and workflows in a common environment, allowing researchers to study speech, language, text, and music on a larger scale. Exploring the relationship between human and mechanised communication, the relatively new research field of human communication science (HCS) brings together phonetics, linguistics, psychology, language technology, computer science, and music cognition to solve key health, social and cultural communication problems. Following the Australian Research Council’s call for more collaborative, interdisciplinary research projects, the field of HCS took off in Australia in 2003, led by Professor Denis Burnham, Director of the MARCS Institute at the University of Western Sydney (UWS). “As a result of the ARC’s call for research collaboration, my colleagues and I, who were studying speech and music cognition at MARCS, joined forces with
researchers specialising in language technology and computer science at Macquarie University,” Professor Burnham said. Now President of the Australasian Speech Science & Technology Association (ASSTA), Professor Burnham takes an HCS approach to his research, studying language development in children and differences in the ways people speak to infants, foreigners, pets, computers – even lovers! “HCS is more or less the way humans communicate with each other and with machines in codified manners: speech, text, music, or emotional responses to sounds,” Professor Burnham said. “Ten to fifteen years ago I wouldn’t have thought I’d be working on things like this.” “It’s certainly been a learning curve, for all of us. “By bringing researchers from each of these fields together you get a confluence of ideas and find new ways of approaching problems – it leads to serendipity.” An increasingly important field, HCS enables researchers to program machines to better understand what people are saying, doing and trying to communicate. This includes everything from automated speech recognition and cochlear implants,
to learning aids for disabled children. “When you’re trying to fit machines and people together it’s important to understand whether people need to change to fit machines, or if machines need to change to fit people,” Professor Burnham said. In response to the ARC’s call for collaborative Research Networks, Professor Burnham and his colleagues formed HCSNet, a five-year ARC-funded program to help stimulate the field of HCS in Australia. “As we were really at the coalface of this new field of research, HCSNet held more than 60 seminars and workshops and five annual conferences over the five years it was in operation,” Professor Burnham said. As a result of the network, a number of collaborative research projects were established, including AusTalk, an auditory-visual database of Australian English collected from 1000 speakers and funded under the Linkage Infrastructure, Equipment and Facilities (ARC LIEF) program. “The problem was we had effectively formed a community of researchers working on common problems in HCS, but lacked a common infrastructure to work from.” While significant data collections and analysis tools from projects such as AusTalk were being created and collected by the HCS research community, the disparate locations of the research institutions involved made collaboration and replication difficult. “That’s when we discovered Nectar Labs,” Professor Burnham said. “A virtual space for us to build and store our collections, add our analysis and workflow tools and ensure accessibility across the country, and even around the world.
“We launched our virtual laboratory, Alveo, in June 2014, providing all researchers with the infrastructure to store data collections, analysis tools, and workflows in a common environment, allowing all HCS researchers to study speech, language, text, and music on a larger scale.” Funded by Nectar, the Alveo virtual laboratory was established by Associate Professor Steve Cassidy from Macquarie University (Product Owner) and Dr Dominique Estival from UWS (Project Manager) with the help of Intersect Australia, who undertook the software development. “With 13 different universities involved in the Alveo partnership, the set-up of the virtual lab was a challenge,” Professor Burnham said. “At different institutions there were different specialties: Melbourne had a lot of linguists and engineers, Macquarie had a number of language and technology researchers, while at UWS we were focusing more on psychology and music cognition.” The Alveo launch at the end of June included a ‘hackfest’ and user workshop, allowing researchers to explore ways they could use the virtual laboratory, and come up with ideas for new tools and data to incorporate into the program. “I was struck by the number and diversity of researchers interested in adding collections and using the virtual laboratory in their work,” he said. “Alveo has really inspired me to consider new ways of approaching my research. “If all HCS researchers put their data in the one place, with the same metadata and a suite of analysis tools, we could standardise research so much more easily and have far greater outcomes. It’s really exciting.”
Alveo is now entering its second phase with UWS partnering with the recently established ARC Centre of Excellence for the Dynamics of Language, directed by Professor Nicholas Evans at The Australian National University. Alveo will also become a portal for PARADISEC, the Pacific and Regional Archive for Digital Sources in Endangered Cultures, a key facility for preserving recordings of endangered languages and music. For more information about Alveo, go to alveo.edu.au
INSPIRING CULTURAL COLLABORATION The Humanities Networked Infrastructure (HuNI) was developed as part of the Nectar Labs program to share and combine humanities data assets, allowing researchers to discover unexpected connections. A partnership among 13 public institutions, HuNI currently combines information from 30 of Australia’s most significant cultural datasets. When Professor Deb Verhoeven, Chair of Media and Communication at Deakin University, began her humanities research career, her colleagues guarded their collections closely. “In the past, you used to prove your credentials by amassing some kind of collection that you held tight,” Professor Verhoeven said. “Then you carefully eked out that knowledge over time. “Recently there has been a shift, a very good one, towards sharing the assets. Your credentials are now driven by your ability to interpret collections and circulate these interpretations as widely as possible – not holding them close to your chest.”
Professor Verhoeven wrote her PhD on Australian cinema history. Many of the films she has studied can only be seen at film archives because the copyright holders have not released them in accessible formats. “I’m an expert on something no one else can really take an interest in because they can’t actually see the films,” she said. “That’s why I’m a great advocate for getting the assets out there. “If only those movies were widely available I could start having a conversation with someone. It might even turn into a debate – wouldn’t that be interesting? But that’s not going to happen anytime soon because no one can easily access the materials. “I don’t believe locking things up like this is the answer to expanding knowledge or improving research quality in Australia. I believe you should have access to as much as is ethically acceptable.” Professor Verhoeven has been the Project Leader for HuNI, the Humanities Networked Infrastructure, developed as part of the Nectar Labs program. HuNI is a major step forward
“I don’t believe locking things up is the answer to expanding knowledge or improving research quality in Australia.” PROFESSOR DEB VERHOEVEN
in sharing and combining humanities data assets and allowing researchers to discover unexpected connections. A partnership among 13 public institutions, HuNI combines information from 30 of Australia’s most significant cultural datasets, with the potential to incorporate more in future. Previously these datasets existed in silos. Combining them through the search and analysis tools in HuNI provides a much richer research experience. “HuNI allows researchers to create, save, and publish selections of data, analyse and assert relationships among data, share findings, and export the data for use in external environments,” Professor Verhoeven said. One researcher who has already benefited from HuNI is Dr Michelle Smith, Research Fellow in the Centre for Memory, Imagination and Invention at Deakin University. Dr Smith’s current project examines how female beauty was represented in the Victorian period, in magazines and fiction read by young women. “The history of how female beauty became connected with consumer culture could tell us a lot about the formation of today’s expectations about the attractiveness of girls and women,” Dr Smith said. “As more women undergo cosmetic surgery to meet a
particular ideal of beauty, it could be beneficial to understand how this ideal was originally shaped.” Dr Smith said that before HuNI, it was difficult to know what kinds of materials existed that would relate to her project in the Australian context. In addition, because she works in the field of literary studies, she undertook her research using familiar sources within that discipline. “HuNI helped me to see the connections in my research topic with other areas, including theatre, design and fashion. For a start it’s given me the ability to find materials across these areas, but it’s also allowing me to draw connections between them,” Dr Smith said. “Australia’s cultural history has been somewhat buried and existing projects disconnected from each other. HuNI brings together so many important resources and enables me to gain a more comprehensive picture of what kinds of materials exist from the period I’m interested in. It’s opened my research up to considering how historical figures like fashion designers or movie stars impacted upon and interacted with the print culture that is my primary focus. “I’ve been surprised by HuNI because I am a very traditional humanities researcher who likes to work in library special collections and rare book rooms. I’m not one to rush to new technologies, and I
tend to like working in familiar ways. However, even with my traditionalist approach, HuNI was instantly useful and appealing to me.” As more researchers begin using HuNI and as the data becomes even more inter-operable, Professor Verhoeven sees the platform opening up a huge range of research that would not have been possible before. “Research – in any field – is important,” she said. “It’s about
the human endeavour towards improvement and greater knowledge. That’s why we do it. It’s almost built into our DNA. “The opportunity to build HuNI has been extraordinary for the humanities, arts and social sciences in Australia. It’s unprecedented at a national level to have an encompassing platform development like this. It is world class,” she said.
“The vision Nectar had for this sector through HuNI is a profound one and it will carry these research areas incredibly well into the future. This is a complete game changer in the humanities, arts, and social sciences sector, and we just can’t overestimate the impact it’s going to have.” For more information about HuNI, go to huni.net.au
A CHANGING TIDE “It’s always been an aspiration of mine to bring observations and modelling together.” DR ROGER PROCTOR
The Marine Virtual Laboratory (MARVL) brings together researchers from the modelling and observations communities to present data that is more suitable for ingestion into marine simulations and models. MARVL allows the Australian marine science community to rapidly address their research questions, rather than spending time on configuring models and gaining access to data. The ocean is the heat store for the world’s atmosphere, the driving force behind the planet’s climate, but until recently there was no nationally coordinated effort to understand Australia’s oceans and their effect on weather patterns. The first move toward collaboration came in 2001 with Project BLUElink, a large-scale cooperative project between the CSIRO, the Bureau of Meteorology, and the Royal Australian Navy, which was introduced to better understand
Australia’s marine environment, and develop reliable ocean forecasting for maritime users. However, BLUElink’s focus on forecasting still left marine observers in the dark, leading Dr Roger Proctor, an oceanographer based at the University of Tasmania, to establish Australia’s Integrated Marine Observing System (IMOS). “There was nothing for observations,” Dr Proctor said. “Different groups ran their own programs but these were uncoordinated; it was ‘do your own thing’ basically.” To address this, IMOS was established in 2007 under the National Collaborative Research Infrastructure Strategy (NCRIS) to collect observations from a variety of measuring platforms including: Argo floats, ships, tagged animals, moorings and underwater vehicles. “There are more than 3,000 Argo floats deployed around the world’s oceans,” Dr Proctor said. “Each follows a ten-day cycle, drifting along
at 1000m depth for nine days, and sinking to a depth of 2,000m on the tenth day to measure the ocean’s temperature and salinity on its rise to the surface.” “At the surface it relays the information via satellite to a data centre. IMOS is the second largest contributor to this international dataset.” All IMOS data is freely and openly available through the IMOS Ocean Portal for the benefit of Australian marine and climate science, with measurements (collectively and over time) revealing trends about temperature, ocean chemical properties, and fluctuations within fish communities in various locations. “IMOS is an excellent resource. However, the observations alone cannot create medium to long-term forecasts or predict what will happen under particular conditions – this is where models and simulations come in,” Dr Proctor said. “It’s always been an aspiration of mine to bring
observations and modelling together, and the Nectar Labs program provided the infrastructure for the marine science community to do this.” Seeing an opportunity to integrate these two elements, Dr Proctor and the IMOS team set up a virtual laboratory for observations, in collaboration with the CSIRO, that would also deliver modelling packages to users more easily. “When we realised our two virtual laboratory proposals were 100% complementary we decided to work together to establish MARVL, so that the modelling and observations communities would both gain from the interaction,” Dr Proctor said. “Since we’ve been in operation, the modellers have been exposed to new ideas about how they can use our observations, and present data which is more suitable for ingestion into simulations and models,” he continued. “Stage two of MARVL resulted in the system being tested by major marine research groups throughout Australia, and offering Nectar Cloud simulations under NCRIS 2013 funding.” Ian Coghlan, a Senior Coastal Engineer at UNSW’s Water Research Laboratory, was one of the first researchers to benefit from using MARVL. Running wave experiments at the
Solitary Islands near Coffs Harbour in New South Wales to determine the conditions under which coastal erosion is likely to happen, Ian Coghlan was able to reduce his setup time by a factor of 40 by using the MARVL web interface. “Setting up the wave model was a painstaking process requiring 2-3 months of significant modelling expertise, and permission from a number of different organisations to access forcing datasets,” Mr Coghlan said. “Using MARVL drastically reduced the lead time I needed to work on my research questions, and if any changes to the initialisation of the wave model were needed, a refined model could be set up and executed promptly through the virtual laboratory.” MARVL allows the Australian marine science community to rapidly address their research questions, rather than spending time and effort on configuring models and gaining access to data. “It’s the first step. It’s not a black box that does absolutely everything but MARVL enables researchers to start thinking about their problem sooner, rather than having to assemble all the information they need to get to that point,” Dr Proctor said.
“We’d like to make it faster and easier to use and would like to begin integrating biological models in addition to the physical models we now have, and that’s a whole new challenge,” he continued. “Ideally we’d like to get to the point where marine park and fisheries managers can also use it to do scenario planning.” MARVL will also play a key role in the Australian National Shelf Reanalysis (ANSR) project, which will aim to produce a 20-year computer simulation of shelf sea regions around Australia, containing relevant observations collected by IMOS and partners in the Australian Ocean Data Network. “Without observing and understanding how the ocean and the atmosphere interact, you have little chance of accurately predicting the weather or climate change,” Dr Proctor said. “More efficient use of ocean models and observations to improve hindcasting and planning for marine and coastal environments is so important, and MARVL provides us all with an integrated platform we can use to do just that.”
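The Argo sampling cadence Dr Proctor describes earlier (each float completing one temperature and salinity profile per ten-day cycle) can be sketched numerically. The figures below (ten-day cycle, 1,000 m drift depth, 2,000 m profile depth, roughly 3,000 floats) come from the article; the function name and everything else is illustrative only.

```python
# A back-of-envelope sketch of the Argo duty cycle described above.
CYCLE_DAYS = 10          # one temperature/salinity profile per cycle
DRIFT_DEPTH_M = 1_000    # depth at which a float drifts for nine days
PROFILE_DEPTH_M = 2_000  # depth from which it profiles on the tenth day

def profiles_per_year(fleet_size: int, days_per_year: int = 365) -> int:
    """Profiles the fleet relays to shore data centres in a year."""
    return fleet_size * (days_per_year // CYCLE_DAYS)

# The ~3,000-float global array therefore yields on the order of:
print(profiles_per_year(3_000))  # 108000 profiles per year
```

Even this crude estimate shows why a coordinated archive like the IMOS Ocean Portal matters: the data volume is far beyond what any single group could curate.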
For more information about MARVL, go to marvl.org.au
ACCESS TO FURTHER COLLABORATION ON CLIMATE “The CWSLab aims to give scientists more time to investigate science questions and explore new methods to present a better understanding of our weather and climate systems.” TIM PUGH, BUREAU OF METEOROLOGY
The Climate and Weather Science Laboratory (CWSLab) is a virtual laboratory led by the Bureau of Meteorology in close collaboration with the CSIRO. The CWSLab provides an integrated platform of tools and resources for researchers and government agencies to simulate, analyse and predict climate and weather phenomena. Climate and weather have an enormous impact on people’s day-to-day lives, the food they eat, the energy they use, the homes they live in and the work and travel they do. Improvement to climate and weather services relies on collaborative work done by the Bureau of Meteorology, CSIRO and Australian universities. The CWSLab contributes to this effort as it facilitates community-wide access to climate and weather model simulations and data analysis capabilities. The CWSLab is a virtual laboratory for the scientific study of the Earth’s weather and climate,
utilising high performance computing and storage systems. The virtual laboratory provides an integrated platform of services, tools and data for researchers and government agencies to simulate, analyse and predict climate and weather phenomena at the National Computational Infrastructure (NCI) at the Australian National University. Its implementation on the infrastructure at NCI connects it to services such as the NCI compute cloud, the data collections storage and the Australian Community Climate and Earth-System Simulator (ACCESS) collection. In November 2012 the proposal for the CWSLab was submitted to Nectar. The project, led by the Bureau of Meteorology, was developed in close collaboration with CSIRO, NCI, and the Australian Research Council Centre of Excellence for Climate System Science (ARCCSS). In February 2013, the CWSLab was granted federal government funding through Nectar with additional
“We want to focus on training and approaching various organisations and universities to train users how to incorporate their own methods into this community virtual laboratory, so that it continues to expand.” TIM PUGH, BUREAU OF METEOROLOGY
funding available in December 2014. CWSLab project managers, Dr Aurel Moise and Tim Pugh from the Bureau of Meteorology, said the virtual laboratory was targeted at Australian scientists and students represented in the five universities of ARCCSS, as well as scientists from CSIRO and the Bureau. The virtual laboratory is a service available to Australian university researchers and students. Mr Pugh said the virtual laboratory aims to make the work of the scientists from the different institutions more effective and less time consuming. “In late 2013 users began to utilise the virtual laboratory, which has improved their access to computing and storage resources, facilitated the sharing of experiments and results, and reduced the time it takes to conduct scientific research studies,” he said. “The laboratory has also reduced the technical barriers to using state-of-the-art tools and improved collaboration and contributions to ACCESS.” The CWSLab is important in enabling use of ACCESS, and enables researchers to use data generated by similar groups from around the world more efficiently. In addition, it draws researchers
into the NCI environment and helps focus the national community on the infrastructure provided at NCI. Therefore, it acts like a catalyst to accelerate new and innovative research across data, Australian modelling, high performance computing and high performance ‘big data’. “Prior to the laboratory’s development, when users wanted to develop climate and weather simulations they would spend several months compiling the code, validating the software, writing the workflows and preparing the data to perform the simulations,” Mr Pugh said. “The duplication of effort and lost time was immense.” The laboratory has access to an enormous library of models and experiments of the latest and most recent weather and climate simulations. This allows students and scientists to quickly prepare and run the simulations, store the outputs, and analyse the data for scientific studies and publications. “This improves the traceability and reproducibility of experiments and makes it easier for new scientists to get started,” Mr Pugh said. Through the integration and enhancements of existing community software, such as ACCESS, the CWSLab produces a
cohesive facility for climate and weather process studies in areas such as weather prediction and extreme events, atmosphere-ocean-land-ice interactions, climate variability and change, greenhouse gases, water cycles, and carbon cycles. “When the scientific discovery and investigation process can be accelerated, it will reduce the cost to the organisation and deliver a more timely solution for the benefit of the Australian government, industries and public,” Mr Pugh said. “We are learning more and learning more quickly, and are able to find cost savings in the work we are conducting.” The CWSLab will not only be used for predicting Australia’s climate trends, it will also be used on an international stage. As a facility for the analysis of climate change simulations, the laboratory will assist in the assessments of future Australian climate change and contribute to the Sixth Assessment Report of the United Nations Intergovernmental Panel on Climate Change (IPCC) during 2017-18. “Every seven years the IPCC, supported by the UN, produces the latest update of what is known globally on climate change,” Dr Aurel Moise, Senior Research Scientist at
the Bureau of Meteorology said. “The latest data is distributed across the world and scientists, including those in the Bureau, CSIRO and ARCCSS, analyse the simulation and publish the results in peer reviewed literature. “The IPCC then brings in a large group of expert authors to assess and evaluate those results and write-up their assessment report.” Looking forward, Dr Moise said that the recently announced successful funding extension for the CWSLab will allow the Bureau to lead the extension of the laboratory’s work for the next 12 months. “It was clearly indicated by Nectar that they are not seeking any new proposals for new virtual laboratories. Their interest is in extending the current virtual laboratories to expand user uptake,” he said. “And that is exactly what we want to do with CWSLab. “We want to focus on training and approaching various organisations and universities to train users how to incorporate their own methods into this community virtual laboratory, so that it continues to expand.”
For more information about CWSLab, go to cwslab.nci.org.au
SOLVING ENDOCRINE DISORDERS WITHOUT BORDERS The Endocrine Genomics Virtual Laboratory (endoVL), funded through Nectar, is changing the face of endocrinology by providing an easy-to-access online portal for research data. With more than 8,000 adrenal tumour cases currently registered, endoVL allows researchers to draw on large enough cohorts to conduct studies with real statistical power. While some of our most common chronic diseases are endocrine disorders, including diabetes and thyroid conditions, there are also a number of rare conditions, such as obesity, adrenal tumours and sex development disorders, that result from problems with the endocrine system. Because of the rarity of many of these disorders, it is often challenging for researchers and clinicians to gather large amounts of patient data through clinical trials. Only one in a million people will develop adrenal tumours, for example, with most clinical trials only recruiting five to ten cases for each study — certainly not enough
people to confidently compare the effectiveness of treatments. Established in 2012 and led by Professor Richard Sinnott, Director, eResearch at The University of Melbourne, the endoVL, funded through Nectar, is changing the face of endocrinology, by providing an easy to access online portal for research data which allows endocrinologists to work together. “It changes the robustness of the science and also the power of what researchers can do,” Professor Richard Sinnott said. “We have more than 20 large-scale clinical trials running right now, both in diabetes and in adrenal tumours, and they involve research groups globally,” he said. “Diseases don’t know boundaries or country codes, so we have to build systems that allow researchers to collaborate across borders.” It’s also growing every day — the current registry has more than 13 times its predicted number of adrenal cases, with 78 centres internationally contributing adrenal tumour case data to the virtual laboratory.
“Diseases don’t know boundaries or country codes, we have to build systems that allow researchers to collaborate across borders.” PROFESSOR RICHARD SINNOTT
And with many users now building on the statistical power of endoVL to publish their papers in high-impact journals such as Nature Genetics and the New England Journal of Medicine, those who are not using it are now quickly getting on board. “It’s like a honeypot,” says Professor Sinnott. “Once you start building it everybody wants to be involved. If you’re not involved, you will be overtaken by researchers who are making the most of that statistical power.” The endoVL adrenal tumour registry is now also the basis of a recently funded $12 million Horizon 2020 platform, built by Professor Sinnott’s University of Melbourne eResearch team. But it’s not just adrenal and rare endocrinological cases that use endoVL, with registries for Type-1 Diabetes, Disorders of Sex Development, Niemann-Pick Disease, Atypical Femur Fracture, and Polycystic Ovarian Syndrome now providing a portal for clinical research communities in Australia and around the world to store and access data. Associate Professor Maria Craig from The Children’s Hospital
Westmead in Sydney is a Chief Investigator for the Australasian Diabetes Data Network and an avid user of endoVL. “We have used endoVL to develop a national database of childhood diabetes — including existing and new onset cases — which will enable us to analyse longitudinal outcomes, care, and complications from a national perspective,” Associate Professor Craig said. “It’s the first time the data on childhood diabetes from the five major children’s hospitals in Australia has been aggregated, with around 5,000 patients recruited to the registry to date. “endoVL allows us to show the difference in care among children with diabetes across the country, by age, or by gender, and whether these different treatment modalities influence outcomes.” Associate Professor Craig says that endoVL has become an essential component of her study, providing her team with research options they hadn’t even thought of prior to using the infrastructure. “The search and analysis tools available through the endoVL have enabled our investigators to query and learn from the data in real time,
and in ways they had not envisaged at the beginning of the study,” she said. “If we wanted to compare glucose control in children with insulin pumps to those on multiple injections we could simply use the endoVL search tools to answer the questions – breaking it down by gender, age group, centre, state, diabetes type, and by many other variables. “These are things we hadn’t set out to ask, but the way the software has been developed, we now have these wonderful search tools that give us these options.” With plans to expand the registry, Associate Professor Craig and Professor Sinnott were recently awarded a $2 million JDRF grant to include centres that manage adults with diabetes in the registry. “It potentially can be applied to a broader population and to other diseases, which was the whole goal of endoVL,” Associate Professor Craig said. “With the help of endoVL, research into six different endocrine diseases has already started and will deliver results with a much greater impact due to the statistical power of global patient registries.”
Professor Sinnott also sees potential to do much more with the endoVL infrastructure. “The key is to sustain these registries, and to build more of them,” he says. “Pick any disorder and you’ll find they will need a patient registry for clinical studies and trials — whether it’s spinal injury, brain injury or cancer — they all have the same story, the same needs, and this infrastructure can help them all.” For more information about endoVL, go to endovl.org.au
TRACKING THE GLOBAL SUPPLY CHAIN The Industrial Ecology Virtual Laboratory (IE Lab) is helping researchers to analyse supply chain impacts, by harmonising data and developing new analysis tools. The IE Lab allows researchers to access online resources to construct, update and expand large environmental databases at a lower cost and with fewer human resources. When you buy a T-shirt do you know where it really comes from? Industrial ecologist Professor Manfred Lenzen of The University of Sydney can tell you that it was made in China, from fabric woven in Bangladesh, from cotton grown in Uzbekistan, where the Aral Sea is shrinking because of water extraction. The T-shirt’s Uzbekistan water footprint is just one of the many environmental and economic impacts of complex, global supply chains. The IE Lab, funded through Nectar, is helping to analyse these supply chain impacts, by harmonising data and developing new analysis tools. “The job of the industrial ecologist
is similar to a detective,” Professor Lenzen said. “Suppose a consumer buys paint made in China. Do they know that the pigments are made in Tanzania from titanium from Madagascar, where the mining operations threaten lemurs? “Or do mobile phone users know that their phone’s capacitors are made with tantalum exported from Kenya but mined in the Congo, where the mining activity fuels the civil war?” Professor Lenzen said long and complex supply chains ripple through world economies across the globe, and tracking the data is just too complex for a single research team. “We had a research bottleneck,” Professor Lenzen said. “The individual research teams, who are now the core members of the IE Lab, did not have the resources to construct or update such large databases. “We had built one such database but we did not have the resources to keep pace with requests to update and expand it.” Professor Lenzen met with colleagues around Australia to begin discussing how they might create a
“The job of the industrial ecologist is similar to a detective.” PROFESSOR MANFRED LENZEN
platform together for their common use. When Nectar issued the call for virtual laboratory proposals, they quickly realised it was exactly what they needed. “It’s terrific,” Professor Lenzen said. “Everyone has just a fraction of the work. We’ve created protocols for how the contributions from various sources should be standardised, and it works beautifully. “Just the fact that we’ve now harmonised all these bits and pieces across the landscape of environmental information is an achievement in itself.” Partners in the project have included The University of Sydney, CSIRO, UNSW, The University of Queensland, Griffith University, Federation University Australia, The University of Melbourne and the University of South Australia. The Australian Bureau of Statistics is a key data provider and other data comes from the National Pollutant Inventory, the Bureau of Meteorology and from many other sources. A number of researchers are already using the IE Lab, for example in studies on future biofuel industries for Australia, industrial symbiosis and material efficiency, and on waste metal flows. The IE Lab is also collaborating with the Jolliet Lab at the School of Public Health at the University of Michigan, on modelling the health
effects of Australian consumption. Professor Lenzen said the IE Lab is designed to encourage use not only by the research sector but by other sectors as well. Having their work seen and cited more frequently provides an incentive for academic users to open their data, while corporate users such as consulting firms can integrate their own data sources with the IE Lab while maintaining confidentiality. Government bodies including the Productivity Commission have also expressed interest in the IE Lab, and international interest is growing rapidly throughout the industrial ecology community. Due to the global nature of supply chains, the more data that is added, the more valuable the IE Lab becomes for users worldwide, and with additional funding through the Australian Research Council’s Discovery Projects scheme, Professor Lenzen’s team is now working on a global IE Lab. “At the moment, the IE Lab contains Australian data,” he said. “So for example, you could assess the Australian economic and environmental impacts of, say, a second Sydney airport. “But I want to create a global IE Lab, where you could know, for example, if you bought paper here, whether any rainforest was cut down in Indonesia.”
Professor Lenzen is currently liaising with builders of global international trade frameworks to assess whether their data streams can also run through collaborative environments such as the IE Lab. “I know that internationally, industrial ecology research suffers from the same predicament that we were facing,” Professor Lenzen said. “It’s prohibitive for an individual institution to build these global databases. Collaboration will get us much, much further with much less cost and fewer human resources.”
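The supply-chain tracing Professor Lenzen describes is conventionally done with environmentally extended input-output analysis. As an illustrative sketch only (the sector data and water intensities below are invented, not IE Lab figures), the Leontief inverse propagates a single purchase through every upstream tier of the economy:

```python
import numpy as np

# Toy three-sector economy (agriculture, manufacturing, services).
# A[i, j] = dollars of input from sector i needed per dollar of sector j's output.
A = np.array([
    [0.10, 0.20, 0.02],
    [0.15, 0.25, 0.10],
    [0.05, 0.10, 0.08],
])

# Invented direct water use per dollar of output in each sector (litres per dollar).
q = np.array([50.0, 8.0, 1.0])

# Final demand: a consumer spends $30 on manufactured goods.
y = np.array([0.0, 30.0, 0.0])

# The Leontief inverse (I - A)^-1 accumulates every upstream tier of
# the supply chain, not just the direct suppliers.
L = np.linalg.inv(np.eye(3) - A)

# Total output required across the whole economy to satisfy the demand y.
x = L @ y

# Embodied (direct plus indirect) water footprint of the purchase.
footprint = q @ x
print(f"Embodied water: {footprint:.1f} litres")
```

The indirect component is why the footprint exceeds the manufacturing sector's own water use: the $30 purchase also pulls in water-intensive agricultural inputs upstream.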
“We are overwhelmed with the interest in the IE Lab. We wouldn’t have expected it. Now we see that the Nectar Labs program has really brought our field forward.” For more information about IE Lab, go to isa.org.usyd.edu.au/ielab/ielab.shtml
OPENING UP A NEW FIELD OF ENQUIRY The Biodiversity and Climate Change Virtual Laboratory (BCCVL) integrates modelling tools and datasets with high-performance computers and major data storage facilities, to enable efficient investigation of biological systems. Directed by leading biodiversity and climate change researchers from 16 universities, the BCCVL was established to simplify biodiversity-climate change modelling. Climate change in its simplest form is an alteration in the statistical distribution of weather patterns. Because of these statistical alterations, researchers are now entering uncharted territory, where future climates can no longer be extrapolated from past climate patterns, and new models must be used for predicting future changes. These global climate models (GCMs) are similar to those used for daily weather prediction; however, being based on the laws of physics, the use of these models can be prohibitive for many researchers – particularly those with limited programming skills.
To overcome this problem, the BCCVL was established in 2013 as a “one stop modelling shop” to simplify the process of biodiversity-climate change modelling. Under the direction of leading biodiversity and climate change researchers from 16 universities, the goal of the BCCVL is to integrate modelling tools and datasets with high-performance computers and major data storage facilities, to enable more efficient investigation of biological systems. “We were very excited when we heard about the Nectar Labs,” Professor Brendan Mackey, Director of the Griffith Climate Change Response Program, said. “Having worked in this area of species distribution modelling for many years, as ecologists and ecological modellers, we were all very experienced in putting together the data, the compute, and the work flows associated with doing our research. “But with such heavy start-up costs and all of the other development costs associated with doing that, we saw virtual laboratories as a way of enabling us to not only save an
“Essentially the BCCVL has enabled us to ask questions that we couldn’t ask before.” PROFESSOR BRENDAN MACKEY
enormous amount of time, but to also allow collaboration with those ecologists who weren’t so skilled at programming.” Previously, lines of inquiry into biodiversity and climate change impacts had been stymied by researchers’ inability to access a standardised set of analysis tools, and by computational limitations. Supported by Nectar, lead partners Griffith University and James Cook University applied for funding to establish the BCCVL, which has allowed them to set up a series of online modelling tools that can be shared among over twenty academic partners across Australia, saving researchers time and precious resources. “Firstly, when you’re starting up a new project that involves species distribution modelling, you spend 40% of your time just getting all of the data and work flows in place,” Professor Mackey said. “Secondly, there are a lot of ecologists who would like to incorporate this kind of modelling in their work, but can’t because the level of programming expertise needed is too high.
“So we wanted to lower the bar in terms of accessibility, but at the same time raise the bar in terms of the quality of the modelling – ultimately providing a more rigorous approach.” In addition to improving accessibility and rigour, Professor Mackey said that the BCCVL will also allow researchers to focus on more than one statistical modelling technique at a time. “Prior to the BCCVL you’d have to pick one future climate change scenario out of 17 or 18 different statistical modelling processes, 40 global climate models and at least five future scenarios, then narrow all of that down to a specific time period, which could be any month, from now to 2100,” he said. “The BCCVL enables us to now run inter-model comparisons, comparing different modelling techniques, such as different climate change models, different scenarios, or a combination of both, which makes for much more rigorous modelling in terms of quantifying the uncertainties. “You can also do ensemble models, where you can take ten of the most reliable climate change models in Australia and run a
project based on the statistical integration of all of their outputs. “It’s early days and there’s a lot of experimenting still to be done, but essentially the BCCVL has enabled us to ask questions that we couldn’t ask before – questions we may have wanted to ask but couldn’t logistically hope to answer, so it’s opened up a whole new field of enquiry.”
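The ensemble approach Professor Mackey describes can be illustrated with a small sketch. This is not BCCVL code: the habitat-suitability maps below are randomly generated stand-ins for real model projections, and the 80% agreement threshold is an arbitrary choice for the example.

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in habitat-suitability maps (values in [0, 1]) for one species,
# one map per global climate model (GCM).
n_models, height, width = 10, 50, 50
projections = rng.uniform(0.0, 1.0, size=(n_models, height, width))

# Ensemble mean: the consensus suitability across the ten models.
ensemble_mean = projections.mean(axis=0)

# Between-model spread: one simple way to quantify the uncertainty
# that inter-model comparison exposes.
ensemble_spread = projections.std(axis=0)

# Cells where at least 80% of the models agree the habitat is suitable.
consensus_suitable = (projections > 0.5).mean(axis=0) >= 0.8
```

The point of the ensemble is that the spread map carries information a single model run cannot: where the models disagree, any one projection should be treated with caution.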
From 2015 the Griffith Climate Change Response Program will use these new rigorous modelling techniques to integrate the university’s research expertise in the fields of regional and urban planning, coastal management, engineering, architecture, governance, environmental sciences, ecology, information technology and environmental modelling. It will also apply these to its focal topic of “Adapting to a rapidly changing climate in coastal regions.” For more information about the BCCVL, go to bccvl.org.au
TAKING THE ‘IT’ OUT OF BIOINFORMATICS “This is the best exemplar of this kind of platform in the world. Few countries are able to do what we’re doing.” ASSOCIATE PROFESSOR ANDREW LONIE
The Genomics Virtual Laboratory (GVL) provides Australian biologists with an online suite of genomics tools and resources to analyse large genomic datasets. With its growing suite of analysis tools, and easy-to-access web portal, the GVL allows biologists to analyse, share and access their data remotely, allowing them to conduct large-scale projects from their lab. It took 13 years and US$3 billion to map the first human genome sequence, as part of the Human Genome Project completed in 2003. Since then, revolutionary genome sequencing technology has transformed all fields of biological research, from medicine to agriculture, allowing the average researcher to conduct large-scale projects from their lab. New sequencing facilities at the Garvan Institute of Medical Research in Sydney, for example, provide Australian researchers with the technology to sequence 150 whole human genomes every three days, at a cost of less than $2,000 per genome.
This accessibility has fuelled massive growth in genomics research; however, the proliferation of genome sequencing – at ever-higher resolutions – has created a bottleneck, with a large number of biologists looking to analyse genomes without the tools or infrastructure to do so. The GVL, funded through Nectar, was established in 2013 by researchers at The University of Queensland, the Victorian Life Sciences Computation Initiative (VLSCI), the Garvan Institute, and CSIRO, to provide biologists with an online suite of genomics tools and resources. Ron Horst, Project Manager for the GVL at The University of Queensland, says that prior to the GVL, biologists working in genomics would have to collaborate closely with information technology specialists and bioinformaticians to analyse and make sense of their data. “Our goal for the GVL was to take the ‘information technology’ out of bioinformatics,” Mr Horst said. “Traditionally, before a researcher could even start their analysis, they
would have to get funding, buy hardware, install it, load software, test, fail, test, set up Galaxy, and have a systems administrator to run it. “Then there were issues with managing the analysis, with data being shipped back and forth to researchers from sequencing centres on hard drives. It certainly wasn’t ideal.” Overcoming these barriers, the GVL provides an easy-to-access web portal with a growing suite of genomics analysis tools, allowing biologists to start analysing their data immediately using the Nectar Cloud. The GVL also provides a workbench for reproducible results, and tutorials to help biologists learn the key analysis techniques they need to support their research. Dr Chanyarat Paungfoo-Lonhienne at The University of Queensland’s Institute for Molecular Bioscience is one of the new generation of biologists working with genomics thanks to the GVL. She is collaborating with The University of Queensland’s Sugarcane Ecogenomics Research Group, which aims to understand more about microbes in the soil and the roots of sugarcane plants, and how these affect the plants’ growth and health. “Ultimately, the research could lead to more sustainable practices for growing this important Australian crop, using less fertiliser,” Dr Paungfoo-Lonhienne said. “By using the GVL I have been able to undertake genomic analysis that I would not have been able to do on my own without it. “I am a biologist,” she says. “I don’t know about Linux or coding languages like Python, and running genomics software requires you to know these things. “Instead of spending time trying to learn how to install software and
use a command line interface, I can focus on the goals I want to achieve,” she said. “It makes everything easy for biologists like me.” Dr Paungfoo-Lonhienne is now preparing a manuscript for publication, using results she achieved through using the GVL. “We have some good results, which I’m writing up at the moment,” Dr Paungfoo-Lonhienne said. “Hopefully we can submit a manuscript before the end of the year. I’m very pleased.” The tutorials available through the GVL have also proved very popular with researchers in Australia and internationally, with hundreds of users from around the globe logging on to analyse their data, from Tufts University in the United States to research groups in Kazakhstan. The largest cancer research group in Australia, the Peter MacCallum Cancer Centre is now also using the GVL as a user-friendly platform for their genomics analysis. According to Associate Professor Andrew Lonie, Head of the Life Sciences Computation Centre at VLSCI, the GVL is enabling researchers at Peter MacCallum to explore their data and perform routine, well-established, bioinformatics with ease. “Because they are spending less time on these tasks, the bioinformatics team at Peter Mac can focus on the more challenging and complex problems required for novel discoveries,” Associate Professor Lonie said. “Having this facility in the Cloud also allows Peter Mac’s researchers to collaborate easily with external parties and to share and access data remotely. “This is the best exemplar of this kind of platform in the world. Few countries are able to do what we’re doing,” he continued.
“The GVL provides genomics capability to the masses on a common infrastructure, allowing more efficient research, collaboration and faster outcomes.” For more information about the GVL, go to genome.edu.au
SCALING CHARACTERISATION FOR IMPROVED OUTCOMES The Characterisation Virtual Laboratory (CVL) provides a world-leading data management and workflow environment for scientists who use advanced imaging techniques. Key instruments have been integrated with the CVL portal, so that data is captured directly from the imaging instrument, where it can be processed using software tools stored in the CVL and the Nectar Cloud. Imaging techniques, broadly known as characterisation, are used in a number of research fields, from structural biology, to neuroimaging, and the analysis of energy materials. Professor James Whisstock and his team of researchers at the Whisstock Laboratory, Monash University, use characterisation to study immune defence and blood coagulation, to better understand how they relate to cancer, inflammatory diseases, and clotting disorders. “As we age, almost all of us will be impacted by an aberrant effect of our immune system on our own bodies,” says Professor Whisstock. “If we’re going to change the course of a particular molecular
event, whether it be an aberrant immune response, or an out-of-control signalling response, which occurs in cancer, we need to understand the mechanism. “We use X-ray crystallography and electron microscopy to create images and videos of these molecular processes occurring, so we can see the mechanism in glorious atomic detail.” Recent advances in microscopy have enabled molecular processes to be seen in much greater detail, but handling such large amounts of data creates a new and unique challenge for researchers. “A high-end electron microscope running in molecular movie mode might produce two terabytes of data per day, which may also need to be processed while the instrument is running,” Professor Whisstock says. “The workflow associated with imaging has gone from being something you could do manually on a laptop, to a procedure that requires large-scale data processing. “Getting the data away from the microscope, backing it up, and making it available to the right people at the right time is
“The workflow associated with imaging has gone from being something you could do manually on a laptop, to a procedure that requires large-scale data processing.” PROFESSOR JAMES WHISSTOCK
fairly simple, but then there are the more sophisticated aspects to the processing, applying the right software to the data, in the right way at the right time, to maximise the productivity of the scientist and the instrument,” Professor Whisstock says. “With an instrument that costs several thousand dollars a day to run, you don’t want to be wasting time.” These common challenges, faced by all researchers using imaging techniques, led a community of scientists from major imaging facilities to establish the CVL. The CVL, funded through Nectar and supported by project partners from around Australia, provides a world-leading data management and workflow environment for scientists who use advanced imaging techniques. Key instruments have been integrated with the CVL portal, so that data is captured directly from the imaging instrument and transported into a managed environment, where it can be processed using software tools stored in the CVL and the Nectar Cloud. Dr Wojtek Goscinski, Coordinator of the Multi-modal Australian Sciences Imaging and Visualisation Environment (MASSIVE) at the Monash e-Research Centre, developed and operates the CVL with his team of researchers.
“It’s like a scientific desktop,” Dr Goscinski said. “By capturing the data from the very point of generation, the workflow is set up from the start. By the time the researcher gets back to their office their data is already there, being managed in an environment with tools that can assist the researcher with processing.” Professor Whisstock says the CVL is an essential component of the new Clive and Vera Ramaciotti Centre for Structural Cryo-Electron Microscopy, at Monash University. “Researchers who have seen and worked with the system have all been impressed with the CVL, and with the strategy behind it,” Dr Goscinski said. “International leaders in this field also recognise the importance of this program and want to use it, which is the ultimate peer review really.” Due to its success, Dr Goscinski is now working to integrate even more imaging instruments with the CVL. “Every lab in Australia that conducts microscopy imaging has a big, high-end PC sitting in the corner with a bunch of terabyte drives hanging off of it, and that’s their solution,” he said. “We can provide these labs with a shared resource, and a managed environment, to replace that PC.” “Our next Nectar project will be to develop tools to put instrument integration into the hands of the people who run the MRI machines or the microscopes.
“We want to provide these technicians with an easy way to connect their microscope to the CVL and central facilities. That way we will be able to scale it to hundreds of instruments rather than tens of instruments.”
CVL partners include Monash University, the Australian Microscopy & Microanalysis Research Facility (AMMRF), Australian Nuclear Science and Technology Organisation (ANSTO), Australian Synchrotron, National Imaging Facility (NIF), Australian National University, the University of Sydney, and the University of Queensland. For more information about the CVL, go to massive.org.au/cvl
THE SKY’S THE LIMIT The All-Sky Virtual Observatory (ASVO) houses a growing ensemble of theory data sets and galaxy formation models, with tools to map simulated data onto an observer’s viewpoint, and apply custom telescope simulators. ASVO draws on the power of Swinburne University’s gSTAR GPU supercomputer to allow astronomers to simulate the Universe through a wide range of telescopes. Investigating how galaxies form and the larger properties of the Universe from the comfort of your desktop computer is no longer the stuff of science fiction – it’s happening right here, right now in a capital city near you. New telescopes and facilities are going online every day, producing data in volumes never previously experienced in Australian astronomy. To gain maximum scientific benefit from them, a federation of datasets from all types of astronomical facilities in Australia was needed. After consultation with the Australian astronomy community, two Australian astronomical facilities were chosen to form the first pillar of the ASVO: The SkyMapper node
based at the Australian National University’s (ANU) Siding Spring Observatory; and the Theoretical Astrophysical Observatory (TAO). Under the ASVO each facility has assisted in creating hardware, tools and services for the virtual laboratory, bringing together data from radio telescopes, optical telescopes and supercomputers, covering all parts of the southern sky. The ASVO also provides a direct and vital link between the theoretical and observational aspects of data collection and analysis, enabling federated data access through the provision of International Virtual Observatory Alliance (IVOA) compliant services and data access mechanisms. Dr Yeshe Fenner, Executive Officer, Astronomy Australia Ltd, says the ASVO project is the first step towards building a federated network of datasets, from all types of astronomical facilities in Australia. “Optical data from the Southern Sky Survey, obtained using the SkyMapper telescope at ANU’s Siding Spring Observatory comprises the most detailed and sensitive digitised map of the southern sky at optical wavelengths,” Dr Fenner said.
“Creating a mock catalogue of galaxies might have previously taken months, or even years to develop, but with TAO it takes just a few minutes.” ASSOCIATE PROFESSOR DARREN CROTON
“This has provided researchers with an opportunity to test the SkyMapper Test Data Release, and preview the characteristics and data access protocols for the SkyMapper Node. “The TAO, developed at Swinburne and launched in March 2014, houses a growing ensemble of theory data sets and galaxy formation models, with tools to map simulated data onto an observer’s viewpoint, and apply custom telescope simulators.” TAO project scientist and Associate Professor at Swinburne University, Darren Croton, said that TAO draws on the power of Swinburne’s gSTAR GPU supercomputer to allow astronomers to simulate the Universe through a wide range of telescopes. “TAO makes it easy and efficient for any astronomer to create these virtual universes,” Associate Professor Croton said. “It’s the culmination of years of effort, and it’s now at the fingertips of scientists around the world.” “It also allows researchers to take the data from massive cosmological simulations and map it onto an observer’s viewpoint, to test
theories of how galaxies and stars form and evolve,” he continued. “Creating a mock catalogue of galaxies might have previously taken months, or even years to develop, but with TAO it takes just a few minutes.” Swinburne University worked with eResearch agency, Intersect, to design the web interface for TAO with simplicity and user-friendliness in mind. “It was important for us to create a service that could be used by any astronomer, regardless of their area of expertise,” Associate Professor Croton said. “By providing an accessible interface we’re accelerating the pace of science and boosting the chance of breakthroughs.” Keeping up with technology and a rapidly changing research environment is also important. “As new survey telescopes and instruments become available, they can be modelled within TAO to maintain an up-to-date set of observatories,” Associate Professor Croton said. “TAO could be especially useful for comparing theoretical predictions against observations coming from next-generation survey
telescopes, like the Australian Square Kilometre Array Pathfinder (ASKAP) in Western Australia, and the SkyMapper Telescope run by the ANU.” “These will cover large chunks of the sky and peer back into the early stages of the Universe, and will be tasked with answering some of the most fundamental questions known to humankind.” Dr Alan Duffy, an astrophysicist using supercomputer simulations to understand how galaxies form, is based at the Centre for Astrophysics and Computing at Swinburne University. He has successfully used TAO in his studies. “Using TAO I was able to gain access to some of the most advanced simulations of our Universe ever created, on the biggest supercomputers in the world, tailored precisely to the expected performance of ASKAP, all on my laptop,” Dr Duffy said. “In other words, I could ask in a user-friendly webpage for the exact predicted survey that ASKAP will soon undertake, without ever needing access to an expensive supercomputer or bespoke simulation code.”
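The core step Associate Professor Croton describes, mapping simulation data onto an observer's viewpoint, can be sketched in miniature. This is illustrative only, not TAO's code: the galaxy positions are randomly generated, and the survey footprint is an arbitrary example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented galaxy positions from a simulation box (comoving Mpc),
# with the observer placed at the origin.
xyz = rng.uniform(-250.0, 250.0, size=(1000, 3))
x, y, z = xyz.T

# Comoving distance from the observer to each galaxy.
r = np.sqrt(x**2 + y**2 + z**2)

# Project the Cartesian positions onto the observer's sky: right
# ascension and declination in degrees, the basic mock-catalogue step.
ra = np.degrees(np.arctan2(y, x)) % 360.0
dec = np.degrees(np.arcsin(z / r))

# A toy "telescope simulator": keep only galaxies inside a declination
# band, standing in for a real survey footprint and selection function.
in_footprint = (dec > -30.0) & (dec < 0.0)
mock = np.column_stack([ra, dec, r])[in_footprint]
```

A real telescope simulator would also apply magnitude limits, instrument noise and survey geometry, but the projection above is the step that turns a simulation box into something an observer could compare with a catalogue.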
The ASVO is a partnership between Astronomy Australia Ltd, Swinburne University of Technology, the Australian National University, the National Computational Infrastructure, and Intersect Australia Ltd. ASVO is also supported by: the Australian Government, through the Education Investment Fund; the National Collaborative Research Infrastructure Strategy (NCRIS); and Nectar. For more information about the All-Sky Virtual Observatory, go to asvo.org.au
A GAME CHANGER FOR GEOPHYSICS The Virtual Geophysics Laboratory (VGL) is an online portal that provides researchers with access to integrated software, allowing them to process data within minutes, regardless of their location. The VGL saves time, money and resources by allowing geophysicists to create seamless data sets from multiple sources regardless of where the data is stored. Geophysics is the study of the Earth: the analysis of physical processes and properties such as snow, ice, oceans, lunar cycles and the atmosphere, to better understand the role the Earth and its elements play in this complex system. To mitigate natural hazards and protect the environment, geophysicists must locate mineral resources and analyse data to better understand that system. However, quantitative analysis methods can often be time consuming and complex, requiring an advanced understanding of computational technologies and algorithms. In the past, geophysicists have had to search multiple sources, download files individually, and reformat
disparate data into seamless data sets, all from their desktop computers, in order to make sense of their complex data. Established in 2012, the Nectar-funded VGL is an online portal which has created a new way of working for geophysicists. A collaboration between CSIRO, Geoscience Australia, and the National Computational Infrastructure (NCI), it was developed in close collaboration with end-users and geophysicists from Geoscience Australia, the Australian National University, Monash University and the University of Queensland. Dr Carina Kemp from Geoscience Australia said that the VGL allows researchers to log in and gain access to integrated software, allowing them to process data within minutes, regardless of their location. “The VGL saves time, money and resources by allowing geophysicists to create seamless data sets from multiple sources regardless of where the data is stored, so they can run their data from anywhere in real time,” Dr Kemp said. “Researchers can also use the VGL to send their data to the Nectar Cloud where they can compute,
“The speed at which we can now carry out our geophysical inversions was not possible before. It removes all of the pre-processing that used to take us days to complete.” DR CARINA KEMP
collaborate and process their work within minutes, allowing researchers to mash multiple files together and reformat files quickly and easily. “The VGL is also a great platform for collaboration, allowing researchers to gain information from one another instantly.” For geophysicists such as Dr Kemp the VGL is a “game changer” that is significantly improving work efficiencies. “One of the greatest benefits is the collaboration and the possibility of sharing now,” she said. “Geophysics algorithms can be quite complex and compute infrastructures seem to grow in complexity with each new generation. “The VGL makes it possible to share and develop a community code and work together to improve it by allowing researchers to run their data more efficiently.” The provenance workflow is another significant outcome of the VGL, enabling researchers to record their procedures and results. “Previously it was up to individual scientists to make sure they were documenting their procedures properly to enable repeatability of the results, but in VGL it is all recorded for us,” Dr Kemp said. “Now, with the VGL taking care of the cropping and pre-processing, such as re-projecting the data on the fly, we can complete our work in a matter of hours instead of months.”
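The repeatability Dr Kemp describes comes from the portal automatically writing a structured record of each run. The sketch below is purely illustrative: the field names, dataset identifiers, tool name and helper function are assumptions for this example, not the VGL's actual provenance schema.

```python
import hashlib
import json
from datetime import datetime, timezone

def make_provenance_record(dataset_ids, tool, parameters, outputs):
    """Build an illustrative provenance record for one processing run.

    All field names here are invented for this sketch; a real system
    such as the VGL defines its own schema.
    """
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "inputs": sorted(dataset_ids),
        "tool": tool,
        "parameters": parameters,
        "outputs": outputs,
    }
    # Fingerprint the record so a result can later be traced back to
    # the exact inputs and settings that produced it (repeatability).
    payload = json.dumps(record, sort_keys=True).encode()
    record["fingerprint"] = hashlib.sha256(payload).hexdigest()
    return record

record = make_provenance_record(
    dataset_ids=["ga/gravity/onshore-2010", "ga/magnetics/national"],
    tool="inversion-code",
    parameters={"cell_size_m": 500, "max_iterations": 100},
    outputs=["results/inversion-042.nc"],
)
print(json.dumps(record, indent=2))
```

Because every run carries such a record, another researcher can re-run the same tool with the same parameters on the same inputs, which is the essence of the "it is all recorded for us" benefit.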
The VGL also lowers the barriers for researchers to access computational facilities, and makes geophysical processing and modelling tools available to the broader community at a far lower cost. Another outcome of the VGL was the development of the Virtual Hazard Impact and Risk Laboratory (VHIRL), which improves aspects of natural hazard and risk modelling. VHIRL was built by repurposing and extending VGL components to provide a Scientific Software Solutions Centre for the discovery, access, and deployment of scientific codes that have application within emergency management and natural hazard assessment. This demonstrates the benefits of leveraging re-usable virtual laboratory infrastructure for other scientific domains. Alison Kirkby, who was integral in implementing the geothermal modelling aspect of the VGL, said the VGL is making high-end computing, previously the domain of computer specialists, cheaper and more accessible for geophysicists. “The VGL brings consistency and repeatability to the geophysical processing community,” Alison said. “The exploration community in Australia will now have access to geophysical datasets Australia-wide, and will be able to process them and access previously calculated results.
“Increasingly we’re seeing that geophysical processing software is moving from being monolithic, highly expensive software that is licensed per-organisation, per-seat, to small modularised tools that are accessible as apps in the Cloud. “Such applications could be globally accessible and the VGL infrastructure is ideally suited to exploiting these tools.” Dr Lesley Wyborn, senior advisor at Geoscience Australia, said the VGL is creating exciting opportunities for the future. “What we are seeing is a vast reduction in the time it takes
to locate and prepare data for processing,” Dr Wyborn said. “Due to the availability of greater processing power, the data can be processed to higher resolutions and at far larger scales. “The user no longer has to invest in high-end infrastructure at their local site – they just pay for what they use and for the time they use it and can access the latest and most modern infrastructure that’s available via the Cloud. “The provenance workflow tool also lowers the barriers to entry, so you no longer need to be a highly skilled programmer who is an expert in geophysics to be able to use it.”
“For Geoscience Australia this has been a key benefit, as it’s also enabled us to know exactly what processing was applied to which data extract, who did it, where they did it, and how they did it.” For more information about the VGL, go to vgl.auscope.org/VGL-Portal/gmap.html
CONTRIBUTING THROUGH COLLABORATION The ARC Centre of Excellence for Plant Energy Biology (PEB) uses the Nectar Cloud to quickly generate computational instances, allowing researchers to analyse and visualise their data efficiently. The Nectar Cloud has helped position PEB as a worldwide leader in plant energy biology, which will ultimately lead to advances in agriculture, forestry, fuels, and medicine. Discovering better ways to feed and fuel the planet by understanding more about how plants use energy is essential to the prosperity of our planet and its species. Professor Ian Small, an Australian Laureate Fellow and Chief Investigator at PEB, based at the University of Western Australia, is working with his colleagues to better understand this vital biological phenomenon, which is often taken for granted. “Most of our food is directly or indirectly derived from energy metabolism in plants,” Professor Small said.
“The oxygen we breathe comes from it, the building blocks that plants use to make wood and fibre come from it, as do biofuels. “Even now, the energy that flows through plants still dwarfs our own energy usage – so it’s important that we understand this and learn to harness it to make the best use of our energy resources.” To gain a better understanding of this process, Professor Small and his team of researchers are studying how plants capture energy from sunlight, how they store that energy, and how they use energy to grow and develop. Much of Professor Small’s work is based on genomic approaches and on understanding how particular genes in particular plants work, both of which are data- and computation-intensive. After hearing about the Nectar Cloud, Hayden Walker, PEB’s IT and Technical Systems Support Specialist, began using the Cloud’s infrastructure for some of PEB’s research projects, and he hasn’t looked back. “The genomic analysis and datasets we were running and storing were constantly challenging
“It’s very easy to expand your networks. We now have a whole load of extra collaborators coming in on the same project using the Nectar Cloud.” PROFESSOR IAN SMALL
the capacity of PEB’s own systems,” Mr Walker said. “Traditionally we were creating these things in-house, and it’s very time-consuming and expensive to replicate on the scale we needed, so the flexibility that’s offered by the Nectar Cloud was really appealing. “With Nectar we’re able to generate computational instances very quickly, which means our researchers can start doing their analyses and visualising their data right away.” The Nectar Cloud also allows PEB’s researchers to scale up or down, as they need to. “If we start small on a certain project it’s not that hard to transfer to a larger project with more computational resources,” he said. PEB’s use of the Nectar Cloud has also had unexpected benefits, making it easier for the centre to undertake larger projects with international collaborators. “We’ve been using the Nectar Cloud to undertake a collaborative project with a group at a Max Planck Institute in Germany,” Professor Small said. “This is a group we’re very often competing with; we do similar types of research, and on some projects we
collaborate and on other projects we keep our mouths shut because we’re competing for the same goals. “So while giving access to our own servers is not necessarily something we would have done in the past, having a special sandbox where we can work together, and choose what information we share and when, is excellent – it takes a lot of the stress out of collaboration.” Opening up access to collaborators from a related project with the Beijing Genome Institute, China, and another much bigger project run from the University of Alberta, Canada, has expanded the project with Max Planck Institute even more. “It’s very easy to expand your networks. We now have a whole load of extra collaborators coming in on the same project using the Nectar Cloud,” Professor Small said. “It also makes research much faster, the performance is excellent and it facilitates everything. “It means more collaborations than we would have done in the past – we are now working on projects that would have just been too hard to go ahead with, or would have been too uncomfortable to pursue once we got started,” he continued.
This ease of use, through the Nectar Cloud platform, has helped position PEB as a worldwide leader in plant energy biology, which will ultimately lead to advances in agriculture, forestry, fuels, and medicine. For more information about the ARC Centre of Excellence for Plant Energy Biology, go to plantenergy.uwa.edu.au/news/news.shtml
PROTECTING AUSTRALIA’S SHARK POPULATIONS “I use the cluster in the cloud to access a high-performance computing cluster that uses virtual machines – it makes my life a lot easier and so much faster to get the results I need.” DR CLAUDIA JUNGE
ARC Research Associate at the University of Adelaide’s School of Biological Sciences, Dr Claudia Junge, uses eResearch SA (a Nectar node) and the Nectar Cloud to power her research into Australia’s dusky shark and bronze whaler shark populations. The long-term result of Dr Junge’s research will be better fisheries management. Marine biologist Dr Claudia Junge has had a fascination with the ocean
since she first went scuba diving at the age of 14. After studying biology in Germany and undertaking a PhD at the University of Oslo, Norway, she decided to explore the deep seas of the South. Dr Junge is working with a multidisciplinary team of researchers as well as multiple government and industry partners to find out how many genetic stocks of dusky and bronze whaler sharks there are in Australia, in order to sustainably manage the species. After extracting DNA from shark tissue samples, Dr Junge then uses next-generation sequencing (NGS), applying a method that involves cutting the genome into smaller fragments at specific recognition sites. “Because of this, I end up with thousands of single nucleotide polymorphisms (SNPs) across the whole genome,” Dr Junge said. “To run bioinformatics analyses on so many SNPs and samples you need a number of resources and it’s just not possible to have all of these on your desktop computer – this is where eResearch SA really comes in handy.”
“Once I have extracted the genetic information, such as levels of gene flow, we then work with modelling researchers, and the chemistry information our collaborators have found, to parameterise species-specific spatially explicit population models. “The software and Nectar Cloud resources I have been able to use through eResearch SA have been incredibly useful, especially for the population analyses, because you need specific programs, which they’re always willing to install. “I have datasets that include over 10,000 SNPs for up to 300 different individuals and just one of these analyses can take 150 hours, not to mention that I then have to do this in replicates of 20 for 10 different settings – if I even attempted to do this on a desktop it would take forever. “It’s great that researchers like me can use eResearch SA’s resources – I use the cluster in the cloud to access a high-performance computing cluster that uses virtual machines – it makes my life a lot easier and
so much faster to get the results I need.” “As both species are fished in Australia – bronze whalers predominantly in South Australia and dusky sharks predominantly in Western Australia – and produce very few offspring compared to most commercially fished species, our studies are also important in ensuring that Australasian stocks are not being overfished.” Contrary to previous studies, results from Dr Junge’s project have already shown that dusky shark populations around Australia are made up of the same genetic stock. “We found that dusky sharks being fished in Indonesia belonged to the same genetic stock as our Australian samples,” Dr Junge said. “This is important for fisheries management to keep in mind, particularly when making sustainability agreements interstate and internationally. “Interestingly, bronze whaler sharks in Southern Australia are also very mobile and seem to be connected genetically.”
“From what we can tell, samples from Western Australia and also from around the Great Australian Bight, are somewhat different genetically from the southern and eastern side of Australia, as well as New Zealand.” This research was supported under the Australian Research Council’s Linkage Projects funding scheme (project LP120100652).
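The workload Dr Junge describes, one analysis of around 150 hours repeated in replicates of 20 across 10 different settings, maps naturally onto a cluster as a grid of independent jobs. The sketch below only illustrates how such a grid might be enumerated for submission; the parameter names, seed scheme and job fields are invented for this example and are not her actual pipeline.

```python
def build_job_grid(settings, n_replicates):
    """Enumerate one job per (setting, replicate) pair.

    Each job dict would be handed to a cluster scheduler; running
    200 analyses of ~150 hours each serially on a desktop is
    infeasible, which is why cloud HPC resources matter here.
    """
    return [
        {"setting": s, "replicate": r, "seed": 1000 * i + r}
        for i, s in enumerate(settings)
        for r in range(1, n_replicates + 1)
    ]

# 10 hypothetical parameter settings x 20 replicates = 200 jobs.
settings = [{"k_populations": k} for k in range(1, 11)]
jobs = build_job_grid(settings, n_replicates=20)
print(len(jobs))
```

Because each job is independent, all 200 can run concurrently on virtual machines, which is what turns a months-long desktop task into something tractable on a cluster in the cloud.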
A MASSIVE LEAP FOR DATA CAPTURE AND PROCESSING The Multi-modal Australian Sciences Imaging and Visualisation Environment (MASSIVE) uses the Nectar Cloud to capture, analyse and manage data produced by the cryo-EM – a microscope that uses electrons to image frozen samples, providing Australian scientists with an unprecedented view of molecules and protein structures. Our health ultimately depends on the interactions of large biological molecules, such as proteins, lipids and carbohydrates, but in order to understand how these molecules interact scientists must first establish their 3D shape and structure using sophisticated microscopes. In February 2015, Monash University unveiled the most powerful biological microscope in Australia, the $7 million FEI Titan Krios cryo-Electron Microscope (cryo-EM). Capable of providing detailed images of molecular interactions, the cryo-EM will transform the way
researchers view molecules, and in turn, the human immune system. Its success, however, depends greatly on how researchers process, store and analyse the huge amounts of data it creates. The accurate, reliable handling of all the data produced by the cryo-EM hinges on the work of the Nectar-funded Characterisation Virtual Laboratory (CVL), which is led by MASSIVE and coordinated by Dr Wojtek James Goscinski. “The cryo-EM is a microscope that uses electrons to image frozen samples. It is providing Australian scientists with an unprecedented view of molecules and protein structures, which will help them to develop drug targets and better treatments for diseases,” Dr Goscinski said. “We’ve been anticipating the arrival of the microscope for years now, so the intent of CVL has been to establish a workflow for processing the data, which includes handling the volume, analysis, processing and visualisation of it.
“We’re providing ways of moving the tools to the data as opposed to moving your data to your laptop.” DR WOJTEK JAMES GOSCINSKI
“Using the CVL, funded by Nectar, we have been able to establish infrastructure for capturing, analysing and managing data for long-term analysis using both the Nectar Cloud and the MASSIVE HPC systems.” The structure of a protein determines what it does and why it matters, so establishing its 3D structure is vital. The cryo-EM captures a huge collection of noisy but insightful 2D images, which must then be heavily processed using a computed tomography technique to reconstruct the 3D structure – the ultimate aim being to identify drug targets. With the microscope expected to produce terabytes of data per day, MASSIVE has created workflows that allow researchers to capture their samples quickly and then start analysing the data remotely from their desktops. “Once data is captured it is automatically moved to the CVL or MASSIVE, both of which provide an online environment with the tools and services researchers need to process the data – this allows researchers to return to their desktops and start analysing the data
straight away,” Dr Goscinski said. “It’s about increasing the speed of how science is done so we get more out of our microscope spend and increase the productivity of that piece of technology. “We use the cloud in a slightly different way in that we set up desktop environments for different communities, such as structural biology, with up to 20 structural biology tools which have already been pre-configured, so it just makes things easier for the research community. “We’re providing ways of moving the tools to the data as opposed to moving your data to your laptop.” The cryo-EM is the centrepiece of the Clive and Vera Ramaciotti Centre for Structural Cryo Electron Microscopy, directed by husband-and-wife team Associate Professor Hans Elmlund and Associate Professor Dominika Elmlund, who are key users of MASSIVE. “Getting this data processing pipeline up has required new algorithms which take full advantage of MASSIVE,” Associate Professor Hans Elmlund said. “We had to develop a lot of new code.” Associate Professor Dominika Elmlund said that the relationship
between the cryo-EM and MASSIVE extends to providing temporary increases in computing power to handle data-processing for specific jobs, and determining and arranging the storage needed to archive the enormous amount of data the cryo-EM produces. “We are talking about petabytes of data – thousands of terabytes,” Associate Professor Dominika Elmlund said. “In order to support our community of cryo-EM users, most of whom are life scientists rather than computer experts, MASSIVE has provided the most sophisticated remote desktop environment we’ve ever seen.” Making the data processing step easier and more efficient is the key goal of the partnership between MASSIVE and the cryo-EM, which should result in users receiving their results – such as three-dimensional images of biomolecules – much more quickly. “This will make Australian researchers much more competitive in the publication stakes, and will decrease processing time from months to days,” Dr Goscinski said. “MASSIVE is also working with the Ramaciotti Centre to make
processing and interpreting cryo-EM data as easy as possible so that in the future, when a researcher leaves the cryo-EM facility after capturing an image, the data should already be streaming to MASSIVE for processing.”
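The automated capture-to-processing step described above can be pictured as a sweep that moves newly captured image files into a processing area. This is only a local-filesystem stand-in: in reality MASSIVE streams data to the CVL and its HPC systems rather than moving files between folders, and the `.mrc` extension is simply one common cryo-EM image format chosen for illustration.

```python
import shutil
import tempfile
from pathlib import Path

def sweep_captured_files(capture_dir, processing_dir, pattern="*.mrc"):
    """Move newly captured image files into the processing area.

    A simplified stand-in for an automated transfer pipeline: files
    matching `pattern` are relocated so downstream analysis tools can
    pick them up without the researcher copying anything by hand.
    """
    processing_dir = Path(processing_dir)
    processing_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for f in sorted(Path(capture_dir).glob(pattern)):
        target = processing_dir / f.name
        shutil.move(str(f), target)
        moved.append(target)
    return moved

# Demonstrate with a throwaway directory standing in for the microscope's
# capture area.
tmp = Path(tempfile.mkdtemp())
cap, proc = tmp / "capture", tmp / "processing"
cap.mkdir()
for name in ("img_001.mrc", "img_002.mrc"):
    (cap / name).write_bytes(b"")
moved = sweep_captured_files(cap, proc)
print([p.name for p in moved])
```

The point of the design, as Dr Goscinski puts it, is that the data is already where the tools are by the time the researcher is back at their desktop.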
For more information about MASSIVE, go to massive.org.au
DATA FOR INTELLIGENCE Data to Decisions CRC (D2D CRC), Australia’s leading provider of Big Data capability for defence and national security, uses the Nectar Cloud to host its infrastructure. This allows the D2D CRC to build indexing and search functionality into its platforms, as well as research-specific functionality, so users can quickly and efficiently filter data sets from their desktop. As part of the Australian Government’s Cooperative Research Centre (CRC) programme, which brings researchers and industry together to improve outcomes, D2D CRC is Australia’s leading provider of Big Data capability for defence and national security. Established in July 2014, with a grant of $25 million, the D2D CRC is tasked with using Big Data technology to extract useful intelligence and insight from otherwise unmanageable data, providing a safer and more secure Australia. David Blockow, Software Architect for D2D CRC, has been working on two major projects since the CRC was established, for the predictive and retrospective analysis of data.
“D2D is slightly different in that we’re not just an administrative body – we facilitate research like other CRCs but we also have a core engineering team, responsible for building software platforms for our end-users to support their research and industry projects,” Mr Blockow said. After undertaking a Future Study to better understand the needs of their end-users, the D2D CRC found that open source intelligence was a major requirement, particularly among security and intelligence organisations. “Open source intelligence is about extracting information from blogs, news sites and social media platforms, any information that can be freely accessed online,” he said. “There’s way too much information out there for an individual or group of people to comprehend, so we have created automated tools to allow our users to extract the data they need.” Working on two major platforms, a predictive tool called Beat the News, and a retrospective tool called Apostle, Mr Blockow said D2D’s partnership with eResearch SA (a Nectar node) has been invaluable in allowing them to build, store and share their data and software.
“All of the infrastructure is hosted in the Nectar Cloud…allowing us to build indexing and search functionality into our platforms as well as research-specific functionality.” DAVID BLOCKOW, SOFTWARE ARCHITECT FOR D2D CRC
“As a CRC partner, eResearch SA provide us with compute (CPU cores) and storage infrastructure to host our platforms – we’re running a Hadoop cluster and Apache Spark as our processing framework through eResearch SA,” Mr Blockow said. “The nice thing about Hadoop and Apache Spark is that they process all of the data locally. The tools allow researchers to perform processing in the Cloud without having to download or move any data. “All of the infrastructure is hosted in the Nectar Cloud through eResearch SA, allowing us to build indexing and search functionality into our platforms as well as research-specific functionality, so users can quickly filter for a particular data set from their desktop.” Used by researchers and government organisations, Beat the News is an analysis tool that aims
to predict events that will happen in the future by looking at social media platforms, such as Twitter, to gauge sentiment. “When leading up to an election for instance, the tool can be used to assess which political parties are being mentioned on Twitter and the feeling towards those parties – allowing our users to make predictions about who might win,” Mr Blockow said. “We’ve already used the platform to do this successfully, and have results that are more accurate than traditional polling.” The platform was built for experimentation, for making predictions, running analytics and ingesting data, but it also provides researchers with an opportunity for collaboration. “It’s a sandpit area for researchers to work – they can connect remotely, run their algorithms, test things out,
bounce ideas off of others, then when they’re happy they can promote their algorithm to the production system so others can use it to make predictions,” Mr Blockow said. “Apostle is more retrospective, and allows users to explore an event that has taken place in the past, by looking at all of the open data that has been ingested into the system over time, and analysing it after the fact.”
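At its simplest, the mention-gauging step described for Beat the News can be pictured as a tally over ingested posts. In production this runs as distributed Hadoop/Spark jobs alongside the data in the Nectar Cloud; the plain-Python sketch below, with invented party names and posts, only illustrates the counting logic, not D2D CRC's actual analytics.

```python
from collections import Counter

def count_mentions(posts, parties):
    """Tally how often each party is mentioned across a stream of posts.

    `posts` is an iterable of text strings; matching is naive
    case-insensitive substring search, far simpler than the entity
    recognition a real pipeline would use.
    """
    counts = Counter({p: 0 for p in parties})
    for text in posts:
        lowered = text.lower()
        for party in parties:
            if party.lower() in lowered:
                counts[party] += 1
    return counts

# Invented example posts standing in for ingested social media data.
posts = [
    "Party A promises new research funding",
    "Polling shows Party B ahead in two states",
    "Party A and Party B clash in tonight's debate",
]
mentions = count_mentions(posts, ["Party A", "Party B"])
print(mentions)
```

Keeping the counting next to the ingested data, rather than downloading it, is what lets this scale to volumes "too much information for an individual or group of people to comprehend".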
For more information about D2D CRC, go to d2dcrc.com.au
“It’s a fast growing field, and with eResearch SA we have access to the latest technology allowing us to continue to provide the best platforms to our end-users.”
PARTNERS DEPARTMENT OF EDUCATION
UNIVERSITY OF MELBOURNE
MARINE VIRTUAL LABORATORY (MARVL)
CLIMATE AND WEATHER SCIENCE
ENDOCRINE GENOMICS (ENDOVL)
THE INDUSTRIAL ECOLOGY LAB
BIODIVERSITY AND CLIMATE CHANGE
THE GENOMICS VIRTUAL LABORATORY
CHARACTERISATION VIRTUAL LABORATORY
THE ALL-SKY VIRTUAL OBSERVATORY
VIRTUAL GEOPHYSICS LABORATORY
VIRTUAL HAZARD IMPACT & RISK (VHIRL)
RESEARCH CLOUD AT MONASH
UNIVERSITY OF TASMANIA
Acknowledgements NECTAR WOULD LIKE TO ACKNOWLEDGE THE FOLLOWING PEOPLE FOR THEIR CONTRIBUTION TO THIS PUBLICATION: EDITORIAL PATRICIA MCMILLAN PRODUCTION MANAGEMENT SARAH MULVEY SARAH NISBET KAREN MECOLES MICHELLE BARKER Nectar is supported by the Australian Government through the NATIONAL COLLABORATIVE RESEARCH INFRASTRUCTURE STRATEGY to establish eResearch infrastructure in partnership with Australian research institutions, organisations and research communities. The University of Melbourne has been appointed as Lead Agent.