created: 10/09/97, last updated: 10/09/97, © 1997 ABRF
The following article continues our coverage of presentations made at the ABRF '97 meeting held in Baltimore, Maryland, February 9-12, 1997.
Mary B. Moyer, Glaxo Wellcome Inc.
The role of Edman sequencing has evolved from a tool to determine the complete sequences of proteins to obtaining partial amino acid sequences for designing oligonucleotide probes and primers for cDNA cloning. More recently, Edman sequence analysis is increasingly used to identify proteins and to characterize recombinant proteins. There have been steady advances in the efficiency and sensitivity of automated Edman sequencing over the last thirty years to enable primary structural information to be obtained from low pmol amounts of material. In the last several years, however, there have been tremendous advances in mass spectrometric technologies such that a few laboratories are now able to use this approach to obtain peptide sequences from fmol amounts of protein digests. This capability, combined with the enormous growth of sequence databases, has led to considerable speculation regarding the role of Edman sequence analysis in the ever-changing research environment.
Speculation on the future of Edman sequence analysis is not a new phenomenon; similar discussion was prompted by the advent of DNA sequencing (e.g., Malcom, A.D., "The Decline And Fall Of Protein Chemistry", Nature 275: 90-91 (1978)). Although mass spectrometric protein sequencing has superior sensitivity and higher sample throughput than Edman sequencing and is unaffected by a "blocked" amino-terminus, it is generally limited to peptides that are less than about 25 residues in length. In addition, mass spectrometric sequencing data are considerably more difficult than Edman sequencing data to interpret in terms of a de novo sequence; mass spectrometric sequencing usually cannot distinguish between Ile/Leu and Lys/Gln, and it lacks the quantitation inherent in Edman sequencing. Finally, it is not possible to use mass spectrometric sequencing to identify radioactively labeled sites of modification that are present at low stoichiometry.
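The Ile/Leu and Lys/Gln ambiguity follows from the residue masses themselves. The following is a minimal Python sketch, not part of the workshop material: it assumes standard monoisotopic residue masses from common reference tables, and the helper name `ambiguous_pairs` is hypothetical. It lists the residue pairs whose masses fall within a chosen tolerance.

```python
from itertools import combinations

# Monoisotopic residue masses in Da (assumed standard reference values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def ambiguous_pairs(tolerance_da):
    """Return residue pairs whose masses differ by no more than tolerance_da."""
    return [
        (a, b)
        for a, b in combinations(sorted(RESIDUE_MASS), 2)
        if abs(RESIDUE_MASS[a] - RESIDUE_MASS[b]) <= tolerance_da
    ]


if __name__ == "__main__":
    # A loose tolerance flags both pairs; a tight one leaves only the isobaric pair.
    print(ambiguous_pairs(0.05))
    print(ambiguous_pairs(0.01))
```

Ile and Leu have identical masses, so no mass tolerance separates them, whereas Lys and Gln differ by roughly 0.036 Da and can be resolved only with sufficient mass accuracy. This is consistent with the statement that mass spectrometric sequencing "usually" cannot distinguish these pairs.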
The impetus for this workshop was to begin to explore the impact of recent advances in mass spectrometric sequencing and of the various genome projects on Edman sequence analysis. As evident from the following summaries, there is a growing consensus that Edman sequencing will continue to play an integral role in protein characterization and indeed, a primary role for Edman sequencing seems to be emerging in terms of the analysis of recombinant proteins.
Will Burkhart, "The Changing Role of Edman Sequencing"
Edman sequence analysis and mass spectrometry remain indispensable tools in deducing the primary structure of proteins; however, examples exist where mass spectrometry has not succeeded in obtaining partial amino acid sequencing data. One such case was the integral membrane protein, tumor necrosis factor-alpha (TNF-alpha) converting enzyme (TACE). TACE is a metalloprotease that is responsible for processing TNF-alpha from a 26 kDa precursor to the secreted 17 kDa mature form. TNF-alpha is a cytokine that contributes to a variety of inflammatory disease states; thus, purification and cloning of TACE was a critical goal. Purification using a biotinylated inhibitor indicated that TACE activity correlated with the appearance of an 85 kDa protein on SDS-PAGE. In order to obtain Edman sequence data, the 85 kDa band was excised and the protein electroeluted directly onto the HP C-18 sequencing cartridge; in situ reduction and alkylation were performed, followed by Lys-C endoprotease digestion. Following digestion, the cartridge was coupled to a capillary LC column for collection of peptides, but no peptides were obtained. The sequencing cartridge was then inserted directly into the HP sequencer and subjected to Edman degradation. A single 44 amino acid stretch of sequence was obtained that subsequently enabled cloning of the TACE cDNA.
The presence of a cysteine-rich region, a transmembrane domain, and a proline-rich cytoplasmic tail helped explain the protein's resistance to proteolytic digestion. In this case, mass spectrometry may not have been useful because a set of peptides was not obtained that would have permitted either a mass search or mass spectrometric sequencing. Because TACE proved to be a novel protein that was not present in any database, peptide mass searching would not have identified the protein even if it had been possible. Because the peptide was irreversibly bound to the C-18 column, any desalting step based on a reverse-phase column probably would have resulted in loss of the peptide. Thus, Edman sequence analysis was crucial for achieving the goal of cloning TACE, an important new target for drug discovery.
As exciting as obtaining sequence on a novel protein like TACE can be, that is not currently the primary focus of Edman sequencing in our laboratory. The majority of our efforts are spent analyzing recombinant proteins that will be used for structural studies, bioassays, and screens of potential drug candidates. For crude extracts and partially purified samples, proteins are sequenced following SDS-PAGE and electroblotting to monitor levels of expression and to optimize fermentation conditions. For purified proteins, Edman sequencing is employed to determine the status of the amino-terminus, obtain an indication of purity, detect degradation products, confirm proper processing of precursors, and check processing of native substrates.
Several examples of the critical role Edman sequence analysis has played in analyzing recombinant proteins were described. Edman sequencing demonstrated the occurrence of a mis-initiation at an internal Met and internal cleavage in the calcium-binding pocket of a 19 kDa protein used in crystallographic studies. Studies with various calcium concentrations, combined with Edman sequence analysis of the preparations, resulted in finding conditions that optimized yield of the native protein. The mis-initiation was repaired by changing the methionine to a threonine. In another case, Edman sequencing of a band at the correct molecular weight on a blot provided our only means of monitoring purification of an 8.5 kDa protein, since no antibody or assay was available at the time. Finally, a 25 kDa protein that appeared highly purified on SDS-PAGE yielded two major sequences by Edman sequencing; a minor 6 kDa band was found to be a significant contaminant in terms of molar ratio. By providing rapid feedback during the processes of production and purification of recombinant proteins, Edman sequencing has become an invaluable tool. Advances in automation and speed in sequencing capabilities are essential in this role.
Mike Rohde, "Relative Roles of Edman Sequencing and Mass Spectrometry in a Biopharmaceutical Research Lab"
Edman sequencing and mass spectrometry are highly complementary and essential tools for characterizing proteins. In the biotechnology pharmaceutical laboratory, sequencing is an essential requirement for determining amino acid modifications such as glycosylation and phosphorylation, and for assessing heterogeneity. The strengths of Edman sequencing are that it can provide the amino-terminal sequence of a protein if the protein is not blocked and that ambiguities do not arise from residues with the same or similar masses (e.g., Edman sequencing easily differentiates Ile from Leu). In comparison, direct mass spectrometry of intact proteins provides an exact physical measurement, can confirm expected composition, and provides clues about heterogeneity. The intact mass is used to suggest carboxyl-terminal processing, the presence of substitutions, indication of carbohydrate structure, and confirmation of expected sequence composition.
Several examples of the synergy of mass spectrometry and Edman sequence analysis in solving problems were presented. For example, a recombinant protein with a mis-paired disulfide gave two chromatography peaks after full reduction of disulfide bonds and slow re-oxidation refolding. Peptide isolation and sequencing showed no difference in primary sequence. Analysis by mass spectrometry of each peak showed the expected mass in peak 1 (7648) while peak 2 yielded the expected mass plus glutathione (7953).
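The reasoning behind the glutathione assignment is a simple mass difference: 7953 - 7648 = 305, which matches the mass added when glutathione forms a mixed disulfide with a cysteine. The sketch below illustrates that lookup in Python. The table of average mass shifts, the tolerance, and the function name `explain_shift` are illustrative assumptions, not details from the presentation.

```python
# Approximate average mass shifts in Da for a few modifications
# (assumed reference values, not taken from the presentation).
KNOWN_SHIFTS = {
    "glutathione adduct (mixed disulfide)": 305.31,
    "oxidation": 16.00,
    "deamidation": 0.98,
}


def explain_shift(expected_mass, observed_mass, tolerance_da=0.5):
    """Return the names of known modifications whose mass shift matches
    the observed-minus-expected difference within tolerance_da."""
    delta = observed_mass - expected_mass
    return [
        name
        for name, shift in KNOWN_SHIFTS.items()
        if abs(delta - shift) <= tolerance_da
    ]


if __name__ == "__main__":
    # Masses reported for peak 1 (expected) and peak 2 in the example above.
    print(explain_shift(7648, 7953))
```

With the reported masses, the only entry in this small table that matches is the glutathione adduct. A real assignment would use a fuller modification table and a tolerance suited to the instrument.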
Stability studies of epidermal growth factor (EGF) were performed where extended storage at 4°C resulted in two peaks when originally a single peak was observed by reverse-phase HPLC. Analysis by reverse phase HPLC and mass spectrometry indicated that the new form of the protein was more hydrophobic and that the mass had increased by one dalton. This information was critical because it indicated a possible deamidation; EGF has one glutamine and two asparagine residues, one of which is at the amino-terminus. To pinpoint the modification site, EGF was digested with thermolysin and the resulting peptides separated by HPLC. MS of the digest located the deamidation; however, when the putative deamidated peptide was subjected to sequence analysis, it was found to be blocked; a beta-aspartic acid rearrangement had occurred.
In another example, HPLC separation of a proteolytic digest of a recombinant protein yielded six peptides with different HPLC elution positions, all beginning with the same sequence when subjected to Edman degradation. Analysis by mass spectrometry was crucial in determining that the observed mass differences corresponded to Met oxidation + sulfation, Met oxidation, O-glycosylation + sulfation, O-glycosylation, sulfation, and no modification. Edman sequence analysis was not useful in the case of sulfation because the sulfate is removed in the first round of sequencing. A combination of Edman sequence analysis and MS located a threonine as the glycosylated residue.
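Assigning each of the six peptides amounts to asking which combination of modifications sums to the observed mass difference from the unmodified peptide. The Python sketch below shows that search under stated assumptions: the monoisotopic shifts are standard reference values, the glycan is assumed to be a single HexNAc purely for illustration (the presentation did not specify the carbohydrate), and `assign_modifications` is a hypothetical helper.

```python
from itertools import combinations

# Monoisotopic mass shifts in Da. The glycan entry is an assumption
# (single HexNAc); the source does not state the carbohydrate structure.
MOD_SHIFTS = {
    "Met oxidation": 15.995,
    "sulfation": 79.957,
    "O-glycosylation (HexNAc, assumed)": 203.079,
}


def assign_modifications(delta_da, tolerance_da=0.05):
    """Return every combination of modifications (each used at most once)
    whose summed mass shift matches delta_da within tolerance_da."""
    matches = []
    for size in range(len(MOD_SHIFTS) + 1):
        for combo in combinations(MOD_SHIFTS, size):
            total = sum(MOD_SHIFTS[name] for name in combo)
            if abs(total - delta_da) <= tolerance_da:
                matches.append(combo)
    return matches


if __name__ == "__main__":
    # Hypothetical mass differences relative to the unmodified peptide.
    for delta in (0.0, 15.995, 95.952, 283.036):
        print(delta, assign_modifications(delta))
```

An empty tuple in the result corresponds to the unmodified peptide. The search only narrows the candidates; as the example above notes, locating the modified residue still required Edman sequencing together with MS.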
In summary, MS and Edman sequence analysis are complementary techniques, and both should be used in structural analysis. While Edman sequencing provides greater certainty for amino acid sequence, mass spectrometry often furnishes more useful information on modified amino acids. Analysis time can be shortened if the right tools are chosen.
Jerome Bailey, "Amino- and Carboxyl-Terminal Sequencing Chemistries on a Single Sequencing Platform"
This presentation described the combined application of amino-terminal and carboxyl-terminal chemistries on a single instrument to obtain structural information from both termini of the same sample loaded once. The similarity of the HP carboxyl-terminal thiohydantoin chemistry and the amino-terminal Edman methodology makes possible this complementary approach of first conducting automated amino-terminal sequencing followed by automated carboxyl-terminal analysis on a single sample. With both chemistries performed on the same instrument, cross contamination becomes a consideration. No cross reactions are observed in the carboxyl-terminal data following amino-terminal chemistry. However, carboxyl-terminal chemistry contributes one peak to the amino-terminal chromatogram that elutes between PTH-threonine and PTH-glutamine. This artifact peak originates from the coupling reagent and rises and falls depending on the sample and the cycle.
The HP-1100 HPLC system serves as the PTH- and TH- amino acid analyzer for the detection and identification of both amino- and carboxyl-terminal sequencing derivatives. Two LC columns are connected on-line with solvent switching valves and four solvents are used for chromatography of the derivatives. PTH-amino acid standards are chromatographed at the 10 pmol level while TH-amino acid standards are run at the 100 pmol level. In the carboxyl-terminal thiohydantoin separation, aspartic acid and glutamic acid are observed as methyl esters. In samples where amino-terminal sequencing precedes carboxyl-terminal chemistry, TH-lysine elutes later in the TH chromatogram due to modification of the epsilon amino group of lysine during Edman degradation.
Several methods for sample loading are available depending on the sequencing chemistry to be used for analysis. Proteins to be subjected to amino-terminal analysis only are loaded onto the reverse-phase cartridge of the biphasic sequencing cartridge or are excised from PVDF or Teflon electroblots and inserted in the membrane compatible cartridge for sequencing. If only carboxyl-terminal sequencing chemistry is to be performed, or amino-terminal sequencing will be followed by carboxyl-terminal sequencing, samples are applied to inert Zitex membranes or sequenced from Teflon electroblots. Samples blotted to PVDF are not compatible with the carboxyl-terminal chemistry cleavage reagent trimethylsilanolate.
Results were shown for 10 pmol of beta-lactoglobulin A applied to a biphasic column for amino-terminal analysis; an initial yield of 60% was obtained with a repetitive yield of 93.2% for 20 cycles. Cycles from both 10 pmol and 2 pmol loads were shown for human serum albumin applied to a biphasic column for N-terminal analysis. The first three cycles from 100 pmoles of HSA loaded onto Zitex were shown; samples applied to Zitex gave no DPTU peak in amino-terminal cycles. The same sample was then subjected to carboxyl-terminal analysis and two cycles were obtained. As another example of amino- and carboxyl-terminal analysis on the same sample applied to Zitex, 400 pmoles of superoxide dismutase was subjected to amino-terminal sequencing, which was not successful due to a blocked amino terminus; the same sample subjected to carboxyl-terminal sequencing gave the carboxyl-terminal lysine as the late eluting form with alanine as the next residue. Monoclonal IgG was also shown as a sample on Zitex subjected to amino-terminal and then carboxyl-terminal sequence analysis: both termini of the heavy and light chain were determined. Proline was observed in the second cycle of carboxyl-terminal analysis for the heavy chain. Proline derivatizes to a thiohydantoin faster than the other amino acids. In contrast to the other amino acid peptidylthiohydantoins, proline peptidylthiohydantoin contains a ring quaternary amino group which results in a premature cleavage of this thiohydantoin. Improved washing and extraction methods have improved proline recovery yields to 30-50% of predicted values.
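The initial and repetitive yields quoted for beta-lactoglobulin A can be turned into an expected signal per cycle. The Python sketch below assumes the simplest model, in which the signal at cycle n equals the loaded amount times the initial yield times the repetitive yield raised to the power n - 1. The function name `expected_signal_pmol` is hypothetical, and real repetitive yields are usually estimated from measured cycles rather than assumed constant.

```python
def expected_signal_pmol(loaded_pmol, initial_yield, repetitive_yield, cycle):
    """Expected PTH-amino acid signal (pmol) at a 1-based cycle number,
    assuming a constant repetitive yield from cycle to cycle."""
    if cycle < 1:
        raise ValueError("cycle numbers start at 1")
    return loaded_pmol * initial_yield * repetitive_yield ** (cycle - 1)


if __name__ == "__main__":
    # Values quoted above: 10 pmol load, 60% initial yield, 93.2% repetitive yield.
    for cycle in (1, 10, 20):
        signal = expected_signal_pmol(10, 0.60, 0.932, cycle)
        print(f"cycle {cycle}: {signal:.2f} pmol")
```

Under this model the 10 pmol load gives 6 pmol at cycle 1 and roughly 1.6 pmol at cycle 20, which shows why both yields matter when sequencing low pmol samples.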
In addition to direct loading of purified proteins, samples electroblotted to Teflon from SDS-PAGE were also analyzed by both amino- and carboxyl-terminal degradation. Cycles from both ends of 50 pmol of a 35 kDa recombinant electroblotted protein were shown. This approach is particularly useful for checking the status of both termini at various stages of purification.
Acknowledgments
Will Burkhart acknowledges the contributions of David Becherer, Marcos Milla, and Marcia Moss.
Mike Rohde acknowledges the contributions of Chris Clogston, Pat Derby, Rebecca Elmore, Ann Hsu, Vish Katta, Scott Lauren, Hsieng Lu, Lee Anne Merewether, Bob Rush, Chris Spahr, and Ken Stoney.
Jerome Bailey acknowledges the contributions of Chad Miller and the entire Hewlett-Packard development team, and thanks Stephen Tindall and Vijay Khadse of ARGO Bioanalytica for their enhancements to the carboxyl-terminal sequencing chemistry.
Mary B. Moyer may be contacted at Glaxo Wellcome Inc., 5 Moore Drive, Research Triangle Park, NC 27709, Tel: (919) 483-3089, E-mail: mbm16218@glaxo.com. Will Burkhart may be contacted at Glaxo Wellcome Inc., 5 Moore Drive, Research Triangle Park, NC 27709; Mike Rohde at Amgen Inc., Amgen Ctr-14-2-E, Thousand Oaks, CA 91320; and Jerome Bailey at Hewlett-Packard Co., 3500 Deer Creek Road, Palo Alto, CA 94304.