Venoms: Computational Analyses to Determine the Codon Usage Bias and Amino Acid Preferences
Ketaki Prakash Ghatole1, Hashvitha R1, Ahalya N1, Susanta Pahari2, Prashanthi Karyala2
1 Ramaiah Institute of Technology
2 Indian Academy Degree College
Methodology
Abstract
Venoms are pharmacologically active, complex mixtures of proteins secreted by a wide variety of venomous animals. Coding sequences obtained through transcriptomics provide information that helps explain functional venom variation. Studies have found that evolutionary forces and molecular processes such as mutation rates shape codon usage in higher eukaryotes. Codon usage bias therefore plays a major role in controlling gene expression, such that preferred codons are directly linked to transcripts showing codon bias. This study aims to determine the amino acid distribution and the codon usage bias in venomous organisms across six selected taxa, and between venomous and non-venomous proteins within each taxon. The Euclidean distances between venom proteins of different taxa gave different results when based on amino acid preferences than when based on Relative Synonymous Codon Usage (RSCU) values, which suggests that venom evolution is extremely complex and goes beyond phylogenetic constraints. The higher preference for A/U endings among the over-represented codons offers insight into how evolution has acted to restrict the use of G/C-ending codons in venoms. The Effective Number of Codons (ENC) vs. GC3 plot and the neutrality plot suggested that mutational pressure played only a minor role in driving codon bias compared with other factors such as translational selection. The very different amino acid preferences and RSCU values of venomous and non-venomous proteins within each group highlight their weak resemblance and unrelated evolution. These conclusions contribute to our knowledge of the selected taxonomic groups and improve our understanding of how the gene architecture of venomous proteins evolved.
Contact Information
GC content in the CDSs of venom proteins is lower than in the CDSs of non-venom proteins in almost all taxa, with Insecta as the exception. The GC content at all three codon positions was below 50%. The average GC content across all taxa was approximately 40%, indicating that venom-coding genes have higher AU than GC content in their CDSs.
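The position-specific GC content discussed above can be computed per CDS along the following lines. This is a minimal sketch, not the pipeline used in this study: the function name `gc_by_position` is hypothetical, and it assumes the sequence is in frame and starts at the first codon position.

```python
def gc_by_position(cds):
    """Return (GC1, GC2, GC3, overall GC) as fractions for an in-frame CDS.

    Assumption: the sequence starts at codon position 1. A trailing
    partial codon, if present, is ignored.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    n_codons = usable // 3
    if n_codons == 0:
        raise ValueError("sequence is shorter than one codon")

    counts = [0, 0, 0]  # G/C counts at codon positions 1, 2 and 3
    for i in range(usable):
        if cds[i] in "GC":
            counts[i % 3] += 1

    gc1, gc2, gc3 = (c / n_codons for c in counts)
    return gc1, gc2, gc3, sum(counts) / usable


if __name__ == "__main__":
    # Toy sequence for illustration only (not data from this study).
    print(gc_by_position("ATGGCCTAA"))
```

GC12, the quantity used in neutrality plots, is then simply the mean of GC1 and GC2.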
ENC VS GC3
Only 25 of the 4,251 venom CDSs showed a strong codon usage bias. The ENC vs. GC3 plot suggested that mutational pressure played only a minor role in driving codon bias, and that other factors such as translational selection may have been involved in determining the selective constraints on codon bias.
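An ENC vs. GC3 plot is usually read against the curve of ENC values expected if GC3 composition alone determined codon usage. The sketch below uses Wright's (1990) approximation for that curve; the helper names are hypothetical, and nothing here is taken from the analysis code of this study.

```python
def expected_enc(gc3):
    """Expected ENC when codon usage depends only on GC3 (Wright, 1990).

    gc3 is the G+C fraction at third codon positions, between 0 and 1.
    """
    if not 0.0 <= gc3 <= 1.0:
        raise ValueError("gc3 must be a fraction between 0 and 1")
    return 2 + gc3 + 29 / (gc3 ** 2 + (1 - gc3) ** 2)


def enc_shortfall(observed_enc, gc3):
    """Relative distance of a gene below the expected curve.

    Values near 0 mean the gene lies close to the curve (composition
    alone is enough to explain its bias); larger positive values mean it
    lies well below the curve, which points to additional factors.
    """
    expected = expected_enc(gc3)
    return (expected - observed_enc) / expected


if __name__ == "__main__":
    # Illustrative values only (not data from this study).
    print(expected_enc(0.40))
    print(enc_shortfall(45.0, 0.40))
```

Points that fall on or near the curve are consistent with mutational pressure alone, while points well below it suggest other influences such as selection.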
NEUTRALITY RULE PLOT
Results and Discussion
AMINO ACID PREFERENCES
Introduction
There are more than 100,000 venomous species on earth, belonging to different biological taxa. Their venoms are pharmacologically active mixtures that include proteins, peptides, enzymes and non-protein compounds. Venoms have specific effects on biological functions such as blood coagulation, blood pressure regulation, and transmission of nervous or muscular impulses. They act on their targets within the organism with high specificity, which has made venoms important to the pharmaceutical industry. With a growing market for peptide-based drugs, researchers are now exploring these venoms.
Most research on venoms has focused on clinical implications and detailed molecular studies of toxins. However, venom must be considered in an evolutionary context to gain a full understanding of the biology of animal venoms. Codon usage bias is an important determinant of gene expression and plays an important role in evolutionary and phylogenetic studies. The measures used to characterise it here include amino acid preferences, RSCU values, GC content and ENC values. The RSCU value of a codon is the ratio of its observed frequency to the frequency expected in the absence of usage bias. The ENC gives the number of equally used codons that would generate the same codon usage bias as that observed.
The results obtained in this study will facilitate a better understanding of gene architecture and mechanisms in the selected venomous organisms. They will also help in building a genetic profile that can be used for future research on these species.
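The RSCU definition above (observed codon frequency divided by the frequency expected if all synonymous codons were used equally) can be sketched as follows. This is an illustration under stated assumptions, not the software used in this study: it assumes the standard genetic code, and the names `CODON_TABLE` and `rscu` are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered TTT, TTC, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for an in-frame coding sequence.

    RSCU = observed count * (number of synonymous codons) / (total count
    for that amino acid). Stop codons, Met and Trp are skipped because
    they offer no synonymous choice; amino acids absent from the
    sequence are skipped as well.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    counts = Counter(c for c in codons if c in CODON_TABLE)

    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)

    values = {}
    for aa, family in families.items():
        if aa == "*" or len(family) == 1:
            continue
        total = sum(counts[c] for c in family)
        if total == 0:
            continue
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values


if __name__ == "__main__":
    # Toy sequence for illustration only: four alanine codons.
    print(rscu("GCTGCTGCCGCA"))
```

An RSCU of 1 means a codon is used exactly as often as expected without bias; values above 1 indicate over-representation and values below 1 under-representation.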
GC CONTENT
The venomous proteins show a higher preference for cysteine in all the taxa. The Euclidean distances are shown in the table; a lower Euclidean distance represents greater similarity, and vice versa.
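A Euclidean distance between two taxa can be obtained by treating each taxon's amino acid usage as a 20-dimensional vector. The sketch below assumes relative frequencies pooled over all venom proteins of a taxon; whether this study used fractions, percentages or another normalisation is not stated, and the function names are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_frequencies(proteins):
    """Relative frequency of each of the 20 amino acids, pooled over
    all sequences of one taxon. Non-standard residues are ignored."""
    counts = Counter()
    for seq in proteins:
        counts.update(res for res in seq.upper() if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def euclidean_distance(freqs_a, freqs_b):
    """Euclidean distance between two frequency vectors."""
    return math.dist(freqs_a, freqs_b)


if __name__ == "__main__":
    # Toy peptides for illustration only (not data from this study).
    taxon_a = aa_frequencies(["MKCCLV", "CCGK"])
    taxon_b = aa_frequencies(["MKALLV", "AAGK"])
    print(euclidean_distance(taxon_a, taxon_b))
```

The same distance function applies unchanged to RSCU vectors, provided both taxa list their codons in the same order.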
PARITY RULE PLOT
RELATIVE SYNONYMOUS CODON USAGE (RSCU)
A total of 28, 29, 29, 32, 30 and 27 codons are over-represented in Chilopoda, Scorpiones, Serpentes, Cnidaria, Araneae and Insecta, respectively. The Euclidean distances are shown in the table.
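Counts like those above can be derived from a table of RSCU values by selecting the codons whose RSCU exceeds 1, the threshold this poster uses for preferred codons, and then checking which of them end in A or U. The following is a minimal sketch with a hypothetical function name and made-up input values.

```python
def over_represented(rscu_values, threshold=1.0):
    """Return (sorted over-represented codons, % of them ending in A/U).

    rscu_values maps codons (DNA or RNA alphabet) to RSCU values.
    """
    preferred = sorted(c for c, v in rscu_values.items() if v > threshold)
    au_ending = [c for c in preferred if c[-1].upper() in "ATU"]
    share = 100 * len(au_ending) / len(preferred) if preferred else 0.0
    return preferred, share


if __name__ == "__main__":
    # Made-up RSCU values for illustration only.
    toy = {"GCT": 2.0, "GCC": 1.0, "GCA": 0.5, "GCG": 0.5,
           "AAA": 1.5, "AAG": 0.5, "CTG": 1.2}
    codons, share = over_represented(toy)
    print(codons, round(share, 2))
```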
The neutrality plot for venomous CDSs suggested that directional mutation pressure accounts for only 26.19% of the effects observed. Spearman's rank correlation coefficient was 0.9429, indicating a very strong positive correlation between the GC12 and GC3 of the six taxa.
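In a neutrality plot, GC12 is regressed on GC3, and the slope of the fitted line is conventionally read as the share of the effect attributable to directional mutation pressure. The sketch below assumes that convention (so a slope of 0.2619 would correspond to the 26.19% quoted above) and uses only the standard library; the function names and input values are hypothetical.

```python
import math


def _centered_sums(x, y):
    """Return (Sxy, Sxx, Syy) about the means of x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy, sxx, syy


def neutrality_slope(gc3, gc12):
    """Least-squares slope of GC12 regressed on GC3."""
    sxy, sxx, _ = _centered_sums(gc3, gc12)
    return sxy / sxx


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    sxy, sxx, syy = _centered_sums(ranks(x), ranks(y))
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Made-up values for six hypothetical taxa (not data from this study).
    gc3 = [0.30, 0.34, 0.38, 0.41, 0.45, 0.52]
    gc12 = [0.40, 0.41, 0.42, 0.43, 0.45, 0.44]
    print(neutrality_slope(gc3, gc12), spearman(gc3, gc12))
```

For six points without ties, rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), so a value of 0.9429 arises when the two rankings differ by a single swap of adjacent ranks (sum of squared rank differences equal to 2).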
The PR2 plot is a useful tool for measuring the direction of bias within the AU and GC pairs. Our results suggest that nucleotide composition, mutation bias and translational selection were the main driving factors of codon usage bias in the CDSs of venom proteins.
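Each gene in a PR2 plot is a point with coordinates G3/(G3+C3) and A3/(A3+U3), where (0.5, 0.5) means no bias within either pair. The sketch below restricts the count to fourfold degenerate codon families, a common convention that is assumed here because the poster does not say which codons were used; the names are hypothetical.

```python
# First two bases of the eight fourfold degenerate families in the
# standard genetic code (Ala, Gly, Pro, Thr, Val, and the fourfold
# boxes of Arg, Leu and Ser).
FOURFOLD_PREFIXES = {"GC", "GG", "CC", "AC", "GT", "CG", "CT", "TC"}


def pr2_point(cds, fourfold_only=True):
    """Return (G3/(G3+C3), A3/(A3+T3)) for an in-frame CDS.

    U is treated as T. Set fourfold_only=False to use every codon.
    """
    cds = cds.upper().replace("U", "T")
    counts = dict.fromkeys("ACGT", 0)
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        if fourfold_only and codon[:2] not in FOURFOLD_PREFIXES:
            continue
        if codon[2] in counts:
            counts[codon[2]] += 1

    gc = counts["G"] + counts["C"]
    at = counts["A"] + counts["T"]
    if gc == 0 or at == 0:
        raise ValueError("not enough third-position data for a PR2 point")
    return counts["G"] / gc, counts["A"] / at


if __name__ == "__main__":
    # Toy sequence for illustration only: five alanine codons.
    print(pr2_point("GCAGCTGCGGCCGCA"))
```

A point above 0.5 on the A3/(A3+U3) axis indicates that A is preferred over U at the third position, and likewise for G over C on the other axis.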
CLUSTERED IMAGE MAP (RSCU)
The preferred codons (RSCU > 1) in the venomous proteins of all groups are strongly biased towards A/U bases in the third position. All the preferred codons in Insecta are A/U-ending, followed by 89.65% in Scorpiones, 82.14% in Chilopoda, 73.33% in Araneae, 62.06% in Serpentes and 59.37% in Cnidaria.
Conclusions
Venoms have evolved independently at different times in different creatures. The amino acid preferences suggest maximum similarity between Chilopoda and Araneae, followed by Chilopoda and Serpentes, and maximum dissimilarity between Scorpiones and Insecta. The RSCU values, on the other hand, suggested maximum similarity between Cnidaria and Araneae, followed by Scorpiones and Insecta, with maximum dissimilarity between Insecta and Serpentes. That the distances between taxa differ depending on whether they are based on amino acids or on RSCU values suggests that the evolutionary process is extremely complex and goes beyond phylogenetic constraints. The RSCU values also suggested that preferred codons tend to be A/U-ending. The ENC plot and the neutrality plot suggested that mutational pressure played only a minor role. GC content in the CDSs of venom proteins is lower than in the CDSs of non-venom proteins in almost all taxa, with Insecta as the exception.
References
1. Utkin, Y. N. (2015). Animal venom studies: Current benefits and future developments. World Journal of Biological Chemistry, 6(2), 28.
2. Jiménez-Porras, J. M. (1968). Pharmacology of peptides and proteins in snake venoms. Annual Review of Pharmacology, 8, 299–318.
3. Zhou, Z., Dang, Y., Zhou, M., Li, L., Yu, C. H., Fu, J., Chen, S., & Liu, Y. (2016). Codon usage is an important determinant of gene expression levels largely through its effects on transcription. Proceedings of the National Academy of Sciences, 113(41), E6117–E6125.