Re: High Sensitivity MS/MS Sequencing

StvTindall@AOL.com
Tue, 7 Jan 1997 11:55:47 -0500

To: Recipients of ABRF List <abrf@aecom.yu.edu>

On 97-01-06, Matthias Mann wrote:

"... mass spectrometry and associated sample preparation and software
techniques are still evolving rapidly. My personal opinion is that mass
spectrometric 'de novo' sequencing will become quite routine and widespread
in two to three years."

Matthias,

My impression from reading several contributions to this and other
discussions is that the existing MS/MS technology can produce satisfactory
data for "de novo" sequencing, but that the "manual" interpretation of data
is the major impediment to "de novo" sequencing. Do you think that
improvements in automated data interpretation hold the key to the widespread
implementation of MS/MS "de novo" sequencing?

Thanks in advance for your comments.

Steve
====================
Stephen Tindall
Argo BioAnalytica, Inc.
Phone: 1-201-605-2100
Fax: 1-201-605-2104
StvTindall@aol.com
====================
Subj: Re: High Sensitivity MS/MS Sequencing
Date: 97-01-06 12:18:09 EST
From: Matthias.Mann@EMBL-Heidelberg.DE (Matthias Mann)
Sender: abrf-request@aecom.yu.edu (Association of Biomolecular Resource
Facilities)
To: abrf@aecom.yu.edu (Recipients of ABRF List)
CC: Shevchenko@EMBL-Heidelberg.DE, podtelez@EMBL-Heidelberg.DE (Alexander
Podtelezhnikov), Ashman@EMBL-Heidelberg.DE

Dear ABRF newslist subscribers,
though I don't normally follow this list, Ken Williams's question seems to
have been partly addressed to me. Also, the field of protein sequencing has
its fair share of hype, and it is not always easy to tell fact from fiction.
So here is my opinion.

By way of background, my group at EMBL is not a core facility but includes
that function (synthetic peptides & Edman sequencing, currently two out of
eleven people). We have been engaged in the development of mass spectrometric
methods for microcharacterization of proteins for a long time, in fact since
ES MS and MALDI were discovered (first at Yale, then in Denmark and now at
EMBL). So there are a lot of 'man years' in the development of our technology
(most of which is now commercially available and all of which has been
described in publications, see our home page listed at the bottom).

Furthermore, I have been extremely lucky in having a number of brilliant
coworkers in fields ranging from physics and software to protein chemistry
and biology. So we are definitely not in the same position as a normal core
facility and indeed it would be very surprising if all the things that are
'relatively easy' for us should be easy for somebody just starting to learn
the techniques. The problem is that one can't write this in a paper. You
can't write 'this is easy for us now but wasn't easy a year ago and is not
easy for other labs'.

Nevertheless, bearing in mind Ken's point about putting core facilities
under pressure, we are now trying to formulate things more carefully.

Competing with traditional Edman sequencing, there are now actually three
levels of mass spectrometric analysis:

I. Protein Identification
II. EST searches
III. 'De novo' mass spectrometric sequencing

I. Protein identification is now performed routinely and with high
throughput in our lab, either by high mass accuracy reflector MALDI with
nitrocellulose-containing 'thin films' and delayed extraction, or by
nanoelectrospray peptide sequencing of unseparated peptide mixtures on a
Sciex triple quadrupole. The sensitivity is subpicomole (protein loaded on
the gel, not peptides in the mass spectrometer). The throughput is very high
at the one picomole level and drops towards the limit of detection,
currently about 0.1 to 0.2 pmole in a spot.
A recent study from our lab (Shevchenko et al., Proc. Natl. Acad. Sci. USA 93
(25) 14440 - 14445 (1996)) identified 150 yeast proteins at low amounts. To
get a high throughput, high mass accuracy MALDI peptide mapping provided the
first screen and nanoelectrospray tandem mass spectrometry the second.
Interestingly for core facilities, no chromatography or blotting was used in
the study and all analyses were done from single gels.
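
As a rough illustration of what a peptide mapping screen involves, the
following Python sketch digests candidate database sequences in silico and
counts how many measured peptide masses fall within a mass tolerance. It is
not the software used in the study. The use of trypsin, the 50 ppm tolerance,
the function names and the toy sequences are all assumptions made for the
example.

```python
# Monoisotopic residue masses in Da (standard values, rounded to 5 decimals).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def tryptic_peptides(seq):
    """Cleave after K or R unless the next residue is P (assumed trypsin rule)."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        nxt = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])
    return peptides


def mh_plus(peptide):
    """Monoisotopic mass of the singly protonated peptide, [M+H]+."""
    return sum(MONO[aa] for aa in peptide) + WATER + PROTON


def count_matches(observed, seq, ppm=50.0):
    """Number of observed masses within `ppm` of any theoretical peptide."""
    theo = [mh_plus(p) for p in tryptic_peptides(seq)]
    return sum(
        any(abs(o - t) / t * 1e6 <= ppm for t in theo) for o in observed
    )


def rank(observed, database, ppm=50.0):
    """Rank database entries (name -> sequence) by number of matched masses."""
    scores = {n: count_matches(observed, s, ppm) for n, s in database.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Toy database with made-up sequences, for illustration only.
    db = {"toyA": "GASPKVTCLRNDQEK", "toyB": "MHFYWKGGGGR"}
    measured = [mh_plus(p) for p in tryptic_peptides(db["toyA"])]
    print(rank(measured, db))
```

A tighter tolerance reduces the number of chance matches per database entry,
which is one reason high mass accuracy matters for such a first screen.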

If high mass accuracy reflector MALDI had not been available, the
identifications could all have been done by ES MS/MS, albeit at a lower
throughput. A paper now in press (G. Neubauer et al., PNAS Feb(?) 1997)
describes the identification of a complete yeast multiprotein complex
containing 20 proteins by nanoelectrospray tandem mass spectrometry alone.
The issue here is one of sensitivity and certainty of identification
currently obtainable by a core facility with a limited number of people.

If you put some effort into it and get the necessary technology, you should
be able to identify proteins unambiguously at the 1 to 10 pmole level, and
several core facilities are doing this now. Depending on the sample
preparation used, one to several proteins per day should also be realistic.

II. Unknown human and mouse proteins are in almost all cases identified via
EST sequence databases in our lab, rather than by 'de novo' sequencing. dbEST
contains 540,000 human and 120,000 mouse ESTs at this time, corresponding to
more than half the human genes but to almost all the gel purified proteins
that you are likely to see. Our approach to EST searching is described in the
December issue of TIBS (Mann, M. Trends Biochem. Sci. (TIBS) 21 494 - 495
(1996)).

More recently we have also had much success in matching proteins to EST
databases across species boundaries. So there is no need to wait until 2005;
the database techniques will make almost all mammalian proteins available
much sooner. The techniques to match to EST databases are more tricky than
'straight' matching to full length sequence databases and are currently still
an active research topic. This should change in the course of this year,
though. Simultaneously, the TIGR EST database will become public in April and
ESTs are being mapped at a very fast rate to their genomic locations, further
enhancing the usefulness of the EST searching approach.
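
One reason EST matching is trickier than matching to full length protein
sequences is that an EST is a short nucleotide read whose strand and reading
frame are not known in advance. The Python sketch below shows the basic idea
under simple assumptions: translate each EST in all six frames and look for a
peptide sequence obtained by MS/MS, treating I and L as equivalent because
they have the same mass. The function names and the toy EST are hypothetical,
and real searches must also tolerate sequencing errors and partial peptide
sequences, which this sketch does not attempt.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def revcomp(dna):
    """Reverse complement of a DNA string (non-ACGT characters are kept)."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons; unknown codons (e.g. with N) become X."""
    return "".join(
        CODON.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Yield (strand, offset, protein) for all six reading frames."""
    dna = dna.upper()
    for strand, s in (("+", dna), ("-", revcomp(dna))):
        for offset in range(3):
            yield strand, offset, translate(s[offset:])


def find_peptide(peptide, ests):
    """Return (est_name, strand, offset) for every frame containing peptide.

    I and L are treated as the same residue, since MS/MS on its own
    usually cannot tell them apart.
    """
    query = peptide.replace("I", "L")
    hits = []
    for name, dna in ests.items():
        for strand, offset, protein in six_frames(dna):
            if query in protein.replace("I", "L"):
                hits.append((name, strand, offset))
    return hits


if __name__ == "__main__":
    # Made-up EST encoding M-K-T-L-E in the third forward frame.
    toy = "GGATGAAAACCCTGGAAT"
    print(find_peptide("MKTIE", {"est1": toy, "est2": revcomp(toy)}))
```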

III. We routinely do 'de novo' sequencing for cloning. In a collaboration we
will ask for at least one picomole, which of course means that we often have
much less than that. In the case of the key apoptosis signalling protein
FLICE (Muzio, M. et al. Cell 85 817 - 827 (1996)) the silver stained spot
contained between 0.2 and 0.5 pmole and 43 residues were called 'de novo'
(and proven correct after full length cloning). Another example is the
sequencing of telomerase (not the human one), where 150 amino acids were
obtained at an amount that was too low for Edman sequencing and which led to
the successful cloning of the enzyme.
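
For readers unfamiliar with how residues are 'called' from a tandem mass
spectrum, the core step can be sketched as follows: within one series of
fragment ions, the mass difference between neighbouring peaks corresponds to
one amino acid residue. The Python sketch below assumes a clean, complete,
singly charged ion series and a 0.2 Da tolerance, and the function names are
hypothetical. Real spectra have gaps, noise and mixed ion series, which is
why interpretation takes so much expertise.

```python
# Monoisotopic residue masses in Da (standard values, rounded to 5 decimals).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def residues_for(delta, tol=0.2):
    """All residues whose mass is within `tol` Da of the mass difference."""
    return sorted(aa for aa, m in MONO.items() if abs(m - delta) <= tol)


def read_ladder(mz, tol=0.2):
    """Call one residue (or an ambiguity group) per gap between peaks.

    `mz` holds m/z values of one singly charged fragment ion series.
    Ambiguous calls are joined with '/', unexplained gaps become '?'.
    """
    peaks = sorted(mz)
    calls = []
    for low, high in zip(peaks, peaks[1:]):
        candidates = residues_for(high - low, tol)
        calls.append("/".join(candidates) if candidates else "?")
    return calls


if __name__ == "__main__":
    # Synthetic y-ion ladder for the made-up peptide G-A-S-P-K:
    # y1 is K plus water plus a proton, then P, S, A and G are added in turn.
    y1 = MONO["K"] + 18.01056 + 1.00728
    ladder = [y1]
    for aa in "PSAG":
        ladder.append(ladder[-1] + MONO[aa])
    print(read_ladder(ladder))
```

Note that at this tolerance I/L and K/Q are reported as ambiguity groups,
since the first pair has identical mass and the second differs by only about
0.036 Da.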

In our lab, we have now sequenced more than 30 proteins not contained in
sequence databases for either cloning or homology searching. We only do
collaborations, but any price would have to be very high, as much expertise
and time are involved. However, mass spectrometry and associated sample
preparation and software techniques are still evolving rapidly. My personal
opinion is that mass spectrometric 'de novo' sequencing will become quite
routine and widespread in two to three years.

Hope this helps

Matthias Mann
Group Leader Proteins & Peptides
EMBL
Heidelberg
+49 6221 387 560 (phone)
+49 6221 387 306 (fax)
http://www.mann.embl-heidelberg.de/