Re: High Sensitivity MS/MS Sequencing (fwd)

PPMAL (ppmal@cco.caltech.edu)
Tue, 7 Jan 1997 11:17:34 -0800 (PST)

Date: Tue, 7 Jan 1997 11:17:34 -0800 (PST)
From: "PPMAL (Hathaway/Krapf)" <ppmal@cco.caltech.edu>
To: "ABRF Hypermail (Dirk Krapf)" <abrfhyp@cco.caltech.edu>
Subject: Re: High Sensitivity MS/MS Sequencing (fwd)

Date: Mon, 6 Jan 1997 14:02:56 +0200
From: Matthias Mann <Matthias.Mann@embl-heidelberg.de>
To: Recipients of ABRF List <abrf@aecom.yu.edu>
Cc: Shevchenko@embl-heidelberg.de,
Alexander Podtelezhnikov <podtelez@embl-heidelberg.de>,
Ashman@embl-heidelberg.de
Subject: Re: High Sensitivity MS/MS Sequencing

Dear ABRF newslist subscribers,
though I don't normally follow this list, Ken Williams' question seems
to have been partly addressed to me. Also, the field of protein
sequencing has its fair share of hype and it is not always
easy to tell fact from fiction. So here is my opinion.

By way of background, my group at EMBL is not a core facility but
contains that function (synthetic peptides & Edman sequencing, currently
two out of eleven people). We have been engaged in the development of
mass spectrometric methods for microcharacterization of proteins for
a long time, in fact since ES MS and MALDI were discovered (first at
Yale, then in Denmark and now at EMBL). So there
are a lot of 'man years' in the development of our technology (most
of which is now commercially available and all of which has been
described in publications, see our home page listed at the bottom).
Furthermore, I have been extremely lucky in having a number of
brilliant coworkers in fields ranging from physics and software to
protein chemistry and biology. So we are definitely not in the same position
as a normal core facility and indeed it would be very surprising if
all the things that are 'relatively easy' for us should be easy for
somebody just starting to learn the techniques. The problem is that
one can't write this in a paper. You can't write 'this is easy for
us now but wasn't easy a year ago and is not easy for other labs'.
Nevertheless, bearing Ken's point in mind about bringing core
facilities under pressure, we are now trying to formulate things
more carefully.

Competing with traditional Edman sequencing, there are now actually three
levels of mass spectrometric analysis:

I. Protein Identification
II. EST searches
III. 'De novo' mass spectrometric sequencing

I. Protein identification is now performed routinely and with high
throughput in our lab either by high mass accuracy reflector MALDI
with nitrocellulose containing 'thin films' and delayed extraction
or by nanoelectrospray peptide sequencing of unseparated peptide mixtures on
a Sciex triple quadrupole. The sensitivity is subpicomole (protein loaded
on the gel, not peptides in the mass spectrometer). The throughput
is very high at the one picomole level and drops towards the level
of detection, currently about 0.1 to 0.2 pmole in a spot.
A recent study from our lab (Shevchenko et al.,
Proc. Natl. Acad. Sci. USA 93 (25) 14440 - 14445 (1996)) identified 150
yeast proteins at low amounts. To get a high throughput, high mass
accuracy MALDI peptide mapping provided the first screen and
nanoelectrospray tandem mass spectrometry the second. Interestingly for
core facilities, no chromatography or blotting were used in the study and
all analyses were done from single gels.
If high mass accuracy reflector MALDI had not been available, the
identifications could all have been done by ES MS/MS, albeit at a lower
throughput. A paper now in press (G. Neubauer et al., PNAS Feb(?) 1997)
describes the identification of a complete yeast multiprotein complex
containing 20 proteins by nanoelectrospray tandem mass spectrometry alone.
The issue here is one of sensitivity and certainty of identification
currently obtainable by a core facility with a limited number of people.
If you put some effort into it and get the necessary technology
you should be able to identify proteins unambiguously at the 1 to 10 pmole
level and several core facilities are doing this now. Depending on
the sample preparation used, one to several proteins per day should
also be realistic.
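
As a rough illustration of the peptide mapping screen described above,
the following minimal Python sketch digests candidate sequences in silico
using trypsin rules and counts how many measured peptide masses fall within
a mass tolerance. It is not the software used in the study cited above. The
function names, the 50 ppm default tolerance and the simple hit-count score
are assumptions chosen for illustration, and missed cleavages and
modifications are ignored.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def tryptic_peptides(seq):
    """Cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        nxt = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])
    return peptides


def mh_plus(peptide):
    """Monoisotopic mass of the singly protonated peptide, [M+H]+."""
    return sum(RESIDUE[aa] for aa in peptide) + WATER + PROTON


def count_matches(observed, seq, ppm=50.0):
    """Number of observed masses within `ppm` of any theoretical peptide.

    The 50 ppm default is an assumed tolerance, not a value from the text.
    """
    theoretical = [mh_plus(p) for p in tryptic_peptides(seq)]
    return sum(
        any(abs(m - t) / t * 1e6 <= ppm for t in theoretical)
        for m in observed
    )


def rank(observed, database, ppm=50.0):
    """Rank database entries (name -> sequence) by number of mass matches."""
    scores = {name: count_matches(observed, seq, ppm)
              for name, seq in database.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Calling rank(observed_masses, {"name": "SEQUENCE", ...}) returns the
candidates ordered by number of matched masses. Entries that remain
ambiguous after this first screen are the ones that would go on to tandem
mass spectrometry.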

II. Unknown human and mouse proteins are in almost all cases identified via
EST sequence databases in our lab, rather than by 'de novo' sequencing.
dbEST contains 540,000 human and 120,000 mouse ESTs at this time,
corresponding to more than half of the human genes but to
almost all the gel purified proteins that you are likely to see. Our
approach to EST searching is described in the December issue of TIBS.
(Mann, M. Trends Biochem. Sci. (TIBS) 21 494 - 495 (1996)).
More recently we have also had much success in matching proteins to
EST databases across species boundaries. So there is no need to wait
until 2005; the database techniques will make almost all mammalian
proteins available much sooner. The techniques to match to EST
databases are more tricky than 'straight' matching to full length
sequence databases and are currently still an active research topic. This
should change in the course of this year, though. Simultaneously,
the TIGR EST database will become public in April and ESTs are
being mapped at a very fast rate to their genomic locations, further
enhancing the usefulness of the EST searching approach.
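
One reason EST matching is trickier than matching to full length protein
sequences is that an EST is a nucleotide fragment of unknown strand and
reading frame. The Python sketch below shows only that basic step: it
translates an EST in all six frames and looks for a sequenced peptide,
treating I and L as equivalent because they have the same mass. It is an
illustration under stated assumptions, not the approach of the TIBS
article. The function names are hypothetical, and sequencing errors and
frame shifts within an EST are not handled.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def revcomp(dna):
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons; unknown codons (e.g. with N) become X."""
    return "".join(
        CODONS.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(est):
    """Return {frame label: protein}, labels '+1'..'+3' and '-1'..'-3'."""
    est = est.upper()
    frames = {}
    for strand, seq in (("+", est), ("-", revcomp(est))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def find_peptide(est, peptide):
    """Frames whose translation contains the peptide (I treated as L)."""
    def norm(s):
        return s.upper().replace("I", "L")
    return [frame for frame, protein in six_frames(est).items()
            if norm(peptide) in norm(protein)]
```

A search over a database would call find_peptide for each EST and keep
the entries that return a non-empty list of frames.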

III. We routinely do 'de novo' sequencing for cloning. In a collaboration
we will ask for at least one picomole which of course means that we
often have much less than that. In the case of the key apoptosis signalling
protein FLICE (Muzio, M. et al. Cell 85, 817-827 (1996)) the silver
stained spot contained between 0.2 and 0.5 pmole and 43 residues were
called 'de novo' (and proven correct after full length cloning.) Another
example is the sequencing of telomerase (not the human one) where 150
amino acids were obtained at an amount that was too low for Edman
sequencing and which led to the successful cloning of the enzyme.
In our lab, we have now sequenced more than 30 proteins not contained
in sequence databases for either cloning
or homology searching. We only do collaborations, but a price would have
to be very high, as much expertise and time is involved. However, mass
spectrometry and associated sample preparation and software techniques
are still evolving rapidly. My personal opinion is that mass spectrometric
'de novo' sequencing will become quite routine and widespread in two
to three years.

Hope this helps

Matthias Mann
Group Leader Proteins & Peptides
EMBL
Heidelberg
+49 6221 387 560 (phone)
+49 6221 387 306 (fax)
http://www.mann.embl-heidelberg.de/