CNAG Annual Report 2012


research programmes

algorithm development

Team Leader: Paolo Ribeca
Postdoctoral Fellow: Leonor Frias
PhD Student: Santiago Marco
Visiting PhD Student: Lukasz Roguski

The goal of the Algorithm Development team is to supply the CNAG with efficient methods for analysing sequencing data, with particular emphasis on both the optimisation of computationally intensive operations and the study of new approaches towards producing higher-quality results. More precisely, at the moment the team’s activities are progressing in four different directions:

- Algorithms for aligning short reads. This is the most highly developed research line; the group produced the first prototypes of software tools suitable for analysing Illumina data in late 2008, and has been constantly improving and refining them since that time. The latest version of the GEM aligner is faster than any other published software on the same computing hardware, with best-in-class sensitivity and accuracy.

- Algorithms for de novo assembly of mammalian-sized genomes from short reads. The task is very complicated from both the theoretical and practical standpoints due to the short read lengths of 2nd generation sequencing technologies (100-500 nt).

- Algorithms for flexible compressed storage of genomic data. The CNAG currently has 2 PB of storage capacity. Handling data on such a large scale can make even ordinary operations like copying and sorting files a challenge.

- Algorithms for hardware-accelerated processing of high-throughput data. In spite of our best efforts to improve the basic algorithms used to process genomic data, such processing still requires impressive amounts of computational power. Therefore, all new high-performance computational technologies (GPUs, FPGAs, multi-core coprocessors) are being looked into to meet data analysis needs.

Major achievements:

– Publication and integration into the CNAG pipelines of the GEM aligner, which exceeds the accuracy of commonly used aligners and is 5-8 times faster (Marco-Sola et al. 2012).

– Development of a GEM toolbox, focusing in particular on RNA mapping.

– De novo assembly tools for various de novo sequencing projects, such as the Iberian lynx.

The GEM mapper: fast, accurate and versatile alignment by filtration. Marco-Sola S, Sammeth M, Guigó R and Ribeca P. Nat Methods. 2012 Dec;9(12):1185-8.

