Functional genomics


16  Mass Spectrometry for Protein Quantification in Biomarker Discovery

3.3. Mass Spectrometric Analysis


1. All digested samples should be injected in randomized order to remove systematic bias from the data acquisition. Typically, up to 2 µg of tryptic peptides are required for injection onto a C18 nanoflow column (i.d. = 75 µm, length = 5 cm). Peptides are eluted with a linear gradient from 5 to 40% acetonitrile developed over 120 min at a flow rate of 200 nL/min, and the effluent is electrosprayed into an LTQ or LTQ-Orbitrap mass spectrometer (Thermo-Fisher Scientific).
2. The electrospray ionization (ESI) source is operated with a 2 kV potential and a capillary temperature of 200°C. The instrument is tuned using an angiotensin I peptide. The maximum ion time is set to 200 ms for the parent ion scan and to 500 ms for the zoom scan and MS/MS scan. This method requires that all MS data be collected in the data-dependent “Triple-Play” mode (MS scan, zoom scan, and MS/MS scan). Parent ion scans and MS/MS scans are collected in “Centroid” mode, and zoom scans are collected in “Profile” mode. Dynamic exclusion is set to a repeat count of one, an exclusion duration of 60 s, and rejection widths of −0.75 and +2.0 m/z.
3. Database searches against the International Protein Index (IPI) and/or the NCBI Nonredundant databases are carried out using the SEQUEST®, X!Tandem, or Mascot algorithms, or a combination of two or three of these search engines. Protein identification confidence can be assessed using the algorithm described by Higgs et al. (37) or other publicly available algorithms (e.g., ProteinProphet™, open-source software available at http://proteinprophet.sourceforge.net/).
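The randomized injection order called for in step 1 could be generated along the following lines. This is only a minimal sketch, not the authors' procedure: the function name `randomize_injection_order`, the sample labels, and the seed value are hypothetical and chosen purely for illustration. A fixed seed is assumed here so that the run order can be regenerated and documented later.

```python
import random


def randomize_injection_order(sample_ids, seed=None):
    """Return a shuffled copy of sample_ids to use as the injection queue.

    Hypothetical helper for illustration; not part of any instrument software.
    """
    rng = random.Random(seed)   # seeded generator so the order is reproducible
    order = list(sample_ids)    # copy, so the caller's list is left untouched
    rng.shuffle(order)          # in-place Fisher-Yates shuffle of the copy
    return order


# Hypothetical sample labels: two study groups of four digests each.
samples = [f"control_{i}" for i in range(1, 5)] + [f"case_{i}" for i in range(1, 5)]

queue = randomize_injection_order(samples, seed=2024)
for position, sample in enumerate(queue, start=1):
    print(position, sample)
```

Shuffling all samples together, rather than running one study group after the other, keeps drift in column or instrument performance from being confounded with group membership.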

3.4. Protein Identification

1. Proteins identified by search engines such as SEQUEST® and X!Tandem are generally categorized into priority groups based on the confidence of the protein identification, as shown in Table 1. Each algorithm compares the observed peptide MS/MS spectrum with theoretically derived spectra from the database to assign quality scores (XCorr in SEQUEST® and E-Score in X!Tandem). These quality scores and other important predictors are combined in an algorithm that assigns an overall score, the %ID confidence, to each peptide. The assignment is based on a random forest, a recursive-partitioning supervised learning algorithm (38). The %ID confidence score is calibrated so that approximately X% of the peptides with %ID confidence > X% are correctly identified.
2. Confidence in a protein identification increases with the number of distinct amino acid sequences identified. Therefore, proteins are also categorized depending on whether they have only one or multiple unique sequences at the required confidence. A protein will be identified with a higher confidence if it has at least two distinct amino acid sequences with a required

