Created: 19th June 1998, last updated: 19th June 1998, © 1998 ABRF

Research Group/Committee Reports

Protein Identification Research Group


An SDS-PAGE band frequently contains two or more co-migrating proteins, which makes identifying each of them a challenge, especially when one protein is present in excess over the others. This year's Protein Identification study was designed to address the issues, evaluate the techniques, and discover the pitfalls involved in identifying co-migrating proteins in an SDS-PAGE band. The sample consisted of a Coomassie (R-250)-stained gel piece containing two closely spaced proteins of approximately 40 kDa. The major component was chicken ovalbumin, with 50 pmol loaded onto the gel. The minor component was rabbit aldolase, with 10 pmol loaded onto the gel. A Coomassie (R-250)-stained gel piece containing no added protein was provided as a control. Participating laboratories were asked to perform an in-gel digest on the samples and to identify the major and minor proteins by any means available to them. A representative protocol for trypsin or Lys-C digestion was provided with the samples.

Out of 109 laboratories requesting samples this year, 45 returned data sets (41.3% participation). The level of participation continues to increase each year; the participation last year was 36%. Edman sequence analysis continues to be the most commonly used approach for protein identification, with 48.9% of the participating laboratories using this method exclusively. The use of mass spectrometry (MS) among participating laboratories continues to increase, with 31.1% of the respondents using it exclusively. Approximately 20% of the participating laboratories used a combination of Edman sequencing and MS. Overall, 80% of the respondents identified the major protein correctly, though only 29% were able to identify the minor component.

Among laboratories using Edman sequence analysis exclusively, 82% identified the major protein but only 14% made a call on the minor component. Every identification proposed for either protein was correct. The average number of peptides sequenced per laboratory that derived from the major protein was 2.4, with a range of 0-12 peptides. In comparison, the average number of peptides sequenced per laboratory that derived from the minor component was 0.4, with a range of 0-3. Median sequencing yields for all peptides reported were 5.3 pmol for the major component and 2.2 pmol for the minor component, corresponding to overall median recoveries of 10.6% and 22.0%, respectively, relative to the amounts loaded onto the gel.
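The recovery figures follow directly from the median yields and the amounts loaded onto the gel. A minimal sketch of that arithmetic is given here; the function name `percent_recovery` is illustrative and not part of the study.

```python
def percent_recovery(yield_pmol: float, loaded_pmol: float) -> float:
    """Sequencing yield expressed as a percentage of the amount loaded on the gel."""
    return 100.0 * yield_pmol / loaded_pmol


# Median yields and loaded amounts as reported in the study.
major = percent_recovery(5.3, 50.0)   # chicken ovalbumin, 50 pmol loaded
minor = percent_recovery(2.2, 10.0)   # rabbit aldolase, 10 pmol loaded

print(f"major: {major:.1f}%  minor: {minor:.1f}%")
# prints "major: 10.6%  minor: 22.0%"
```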

Laboratories using MS exclusively employed a variety of techniques to identify the two proteins. These can be broadly divided into two groups: those that used MS-derived sequence data to identify the proteins and those that used peptide mass maps. Almost 80% of the laboratories producing MS data sets proposed an identification for both proteins; however, only 64% of these identified both proteins correctly. Wherever MS-derived sequence data were used to make an identification, the correct protein was called. Although some laboratories used the peptide mass map approach to identify both components correctly, all of the incorrect protein identifications were based on peptide mass maps.

In conclusion, Edman sequence analysis is still the most common approach used to identify proteins, but the use of MS is increasing. Mass spectrometry was more likely than Edman (64% vs 14%) to identify the minor component. It appeared that, in many cases, the time and effort required to identify the minor component by Edman was a limiting factor. No incorrect protein identifications were made when the database search was conducted using sequence data, regardless of how it was derived. In this study, protein identification based solely on peptide mass maps had a high failure rate. These failures appear to have resulted from data sets in which the mass assignments were in question or in which too few peptide masses were used for the search. Improved guidelines for conducting and interpreting peptide mass map data searches could have eliminated all of the false positives.
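To illustrate why a search based on too few peptide masses is fragile, the following is a minimal sketch of peptide mass map matching with an explicit acceptance threshold. It is not the procedure used by any participating laboratory: the toy sequence, the 0.5 Da tolerance, and the minimum of four matching masses are hypothetical values chosen for illustration, and the digest rule (cleave after K or R, but not before P) is the usual simplified model of trypsin specificity with no missed cleavages.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_peptides(seq: str) -> list[str]:
    """Simplified in-silico trypsin digest: cleave after K/R unless followed by P."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        nxt = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])
    return peptides


def peptide_mass(peptide: str) -> float:
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER


def count_matches(observed: list[float], seq: str, tol_da: float = 0.5) -> int:
    """Number of observed masses within tol_da of any theoretical peptide mass."""
    theoretical = [peptide_mass(p) for p in tryptic_peptides(seq)]
    return sum(any(abs(m - t) <= tol_da for t in theoretical) for m in observed)


def accept_hit(observed: list[float], seq: str,
               min_matches: int = 4, tol_da: float = 0.5) -> bool:
    """Accept a candidate only if enough masses match (threshold is hypothetical)."""
    return count_matches(observed, seq, tol_da) >= min_matches


# Hypothetical toy sequence, not a real database entry.
TOY = "GASPVKTLNDRQEMHKFYWR"
theory = [peptide_mass(p) for p in tryptic_peptides(TOY)]

two_masses = theory[:2] + [999.99]   # only two masses agree with the candidate
all_masses = theory                  # every theoretical peptide was observed

print(accept_hit(two_masses, TOY))   # too few matching masses to accept
print(accept_hit(all_masses, TOY))   # enough matching masses to accept
```

Under these assumptions, a candidate supported by only two matching masses is rejected rather than reported, which is the kind of guideline the conclusion above suggests.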

