EMBnet.news 14.3

Page 110


genomic coordinates on the same strand. In the next step, EasyCluster refines the EST grouping by running GMAP again on every pseudo-cluster and by including in a cluster only expressed sequences that share at least one splice site. Finally, for each generated cluster, EasyCluster produces a graphical representation in pure HTML for simple visual inspection of the results. EasyCluster is written in the Python programming language and works on all Unix-based platforms where GMAP can be installed.

Results: The EasyCluster program provides EST/FL-cDNA clusters ready to be used in gene prediction pipelines and to detect alternative splicing events. To investigate the reliability of EasyCluster, we tried to group 256 spliced ESTs related to eleven human homeobox genes (HOXA family) located on chromosome 7. The same pool of ESTs was used as input to wcd, a new and computationally efficient program that builds EST clusters based on sequence similarity (Hazelhurst et al. Bioinformatics. 2008; 13, 1542-1546). EasyCluster was able to reconstruct eleven clusters, one for each homeobox gene. In contrast, wcd predicted only nine groups, two of which were related to more than one gene: ESTs supporting the HOXA3 and HOXA4 genes were clustered together, as were those supporting HOXA9 and HOXA10. These simple results therefore demonstrate the reliability of EasyCluster and, more generally, of genome-based EST clustering programs compared with widely used systems based on sequence similarity. Given the simplicity, flexibility and portability of our system, we plan to integrate EasyCluster into a more complex pipeline to facilitate the genome-wide detection of alternative splicing in newly sequenced genomes.
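The two-step grouping described above (pseudo-clusters from overlapping genomic coordinates on the same strand, then refinement by shared splice sites) can be illustrated with a minimal Python sketch. This is not EasyCluster's code and it does not call GMAP: the `Alignment` record, the function names, and the toy coordinates are all hypothetical, and each splice site is assumed to be a single genomic position. ESTs with no splice site are simply left out of the refined clusters, which is one possible reading of the abstract.

```python
from collections import defaultdict
from typing import FrozenSet, List, NamedTuple


class Alignment(NamedTuple):
    """Hypothetical record for one EST-to-genome alignment."""
    est_id: str
    chrom: str
    strand: str
    start: int
    end: int
    splice_sites: FrozenSet[int]  # assumed: genomic positions of splice sites


def pseudo_clusters(alignments: List[Alignment]) -> List[List[Alignment]]:
    """Group alignments whose spans overlap on the same chromosome and strand."""
    by_locus = defaultdict(list)
    for aln in alignments:
        by_locus[(aln.chrom, aln.strand)].append(aln)
    groups = []
    for key in sorted(by_locus):
        current, current_end = [], None
        for aln in sorted(by_locus[key], key=lambda a: (a.start, a.end)):
            # A gap between the running span and the next start closes the group.
            if current and aln.start > current_end:
                groups.append(current)
                current, current_end = [], None
            current.append(aln)
            current_end = aln.end if current_end is None else max(current_end, aln.end)
        if current:
            groups.append(current)
    return groups


def refine_by_splice_site(group: List[Alignment]) -> List[List[str]]:
    """Split a pseudo-cluster into sets of ESTs linked by a shared splice site."""
    spliced = [aln for aln in group if aln.splice_sites]  # unspliced ESTs dropped
    parent = {aln.est_id: aln.est_id for aln in spliced}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # splice site position -> first EST observed with it
    for aln in spliced:
        for site in aln.splice_sites:
            if site in first_seen:
                parent[find(aln.est_id)] = find(first_seen[site])
            else:
                first_seen[site] = aln.est_id
    clusters = defaultdict(set)
    for aln in spliced:
        clusters[find(aln.est_id)].add(aln.est_id)
    return sorted(sorted(members) for members in clusters.values())


def cluster_ests(alignments: List[Alignment]) -> List[List[str]]:
    """Run both steps and return refined clusters as lists of EST identifiers."""
    result = []
    for group in pseudo_clusters(alignments):
        result.extend(refine_by_splice_site(group))
    return result


# Toy input: two adjacent loci on the plus strand whose ESTs overlap in
# coordinates but share no splice site, plus one EST on the minus strand.
toy = [
    Alignment("a", "chr7", "+", 100, 400, frozenset({200, 300})),
    Alignment("b", "chr7", "+", 150, 500, frozenset({300, 450})),
    Alignment("c", "chr7", "+", 480, 900, frozenset({600, 700})),
    Alignment("d", "chr7", "+", 550, 950, frozenset({700, 800})),
    Alignment("e", "chr7", "-", 100, 400, frozenset({200, 300})),
    Alignment("f", "chr7", "+", 120, 380, frozenset()),
]
print(cluster_ests(toy))
```

In this toy input, coordinate overlap alone would merge ESTs a to d into one group, while the splice-site criterion separates them into two clusters. That is the kind of separation the abstract reports for neighbouring genes such as HOXA3 and HOXA4.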

