Re: owl database & SEQUEST

Jimmy Eng (engj@u.washington.edu)
Tue, 23 Feb 1999 16:59:02 -0800 (PST)

> Date: Tue, 23 Feb 1999 15:18:05 -0500
> From: Elliott Nickbarg <enickbarg@genetics.com>
> To: Recipients of ABRF List <abrf@aecom.yu.edu>
> Subject: owl
>
> Manfred,
>
> We have had problems with the new version as well using command line sequest.
> Searches run OK, but the sequest_summary gives anomalous results.
> A possible explanation may be that the header format has been altered.
> (see below):
>
> New version
> From Owl v30.2:
>
> >owl|P01842|LAC_HUMAN IG LAMBDA CHAIN C REGIONS. - HOMO SAPIENS (HUMAN).
> QPKAAPSVTLFPPSSEELQANKATLVCLISDFYPGAVTVAWKADSSPVKAGVETTTPSKQ
> SNNKYAASSYLSLTPEQWKSHRSYSCQVTHEGSTVEKTVAPTECS
> >average mass (m+h)+ = 11237.5
>
> Old version
> From Owl v30.1:
>
> >LAC_HUMAN pir|P01842| IG LAMBDA CHAIN C REGIONS. - HOMO SAPIENS (HUMAN).
> QPKAAPSVTLFPPSSEELQANKATLVCLISDFYPGAVTVAWKADSSPVKAGVETTTPSKQ
> SNNKYAASSYLSLTPEQWKSHRSYSCQVTHEGSTVEKTVAPTECS
> >average mass (m+h)+ = 11237.5
>
> -Elliott Nickbarg

Elliott,

The problem you described regarding (I assume) the consensus analysis of
the Summary program has been documented on the Sequest web page at

http://thompson.mbt.washington.edu/sequest

since mid-1997, when the OWL database modified its header line with
version 29.3. See the link referring to "Issues associated with the OWL
protein database". Essentially, the field that Sequest uses to store the
protein reference was not large enough to hold the new OWL header
reference (e.g. the string "pir|P01842|LAC_HUMAN"). I have fixed this
problem in Sequest and Summary, but questions about when new binaries
will be available should be directed to Finnigan Corp.

The short-term solution, as outlined on the above web page link, is to
convert the OWL databases back to their old format. I have made
available a program which will do this conversion for you. Binaries of
this program are available for Digital UNIX/Tru64, Ultrix, and Windows NT
platforms. Searching a converted OWL database should clear up your
anomalous Summary results.
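
For anyone who wants to see what such a conversion amounts to, here is
a minimal Python sketch. It is not the conversion program mentioned
above. It only rewrites headers shaped like the v30.2 sample quoted
earlier into the shape of the v30.1 sample. The function names, the
regular expression, and in particular the fixed "pir" tag are
assumptions taken from that single pair of sample headers; real OWL
entries may carry other tags.

```python
import re

# New-style header as seen in the quoted v30.2 sample:
#   >owl|P01842|LAC_HUMAN IG LAMBDA CHAIN C REGIONS. ...
NEW_HEADER = re.compile(r"^>owl\|([^|\s]+)\|(\S+)[ \t]*(.*)$")


def to_old_header(line):
    """Rewrite one new-style header line; return any other line unchanged."""
    m = NEW_HEADER.match(line)
    if m is None:
        return line
    accession, name, description = m.groups()
    # ASSUMPTION: the "pir" tag is copied from the single v30.1 sample.
    return ">{} pir|{}| {}".format(name, accession, description).rstrip()


def convert(src_path, dst_path):
    """Copy a database file, rewriting only the header lines."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(to_old_header(line.rstrip("\r\n")) + "\n")
```

Calling convert("owl.fasta", "owl.old") (both file names are
hypothetical) would write a copy whose sequence lines are untouched
and whose headers start with the short name, as in the old format.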

-------------

On a different note associated with problems Manfred was encountering
running Sequest via Bioworks ...

UNIX and NT represent line endings differently, and apparently this
affects the Bioworks implementation of a Sequest search. The Bioworks
manual documents that databases downloaded from a UNIX server should be
"converted" using the select.exe program. Here are quick instructions
on how to do this:

Open an MS-DOS window, change to the directory that contains your
downloaded database, and run 'select.exe'. It should ask you for the
"database to read", for which you would enter the downloaded database
name. It will then ask you for a new output database name; make one up
(such as db.new). Then it will prompt you for 3 header strings to
filter on (string1, string2, and string3). Just leave these fields
empty by hitting enter, then run your searches against the newly
created database. FYI, this select.exe program is also what you use to
create subset or species-specific databases. For example, enter
'human' and/or 'homo sapien' as filter strings and the newly created
database will contain only sequences with 'human'/'homo sapien' in the
header field.
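
To make the two jobs described above concrete (rewriting line endings
and subsetting by header string), here is a rough Python sketch. It is
not select.exe and makes no claim about how that program works
internally. The function name, the case-insensitive match, and the
CR+LF output default are all assumptions made for illustration.

```python
def select_entries(src_path, dst_path, strings=(), eol="\r\n"):
    """Copy entries whose header contains any of `strings`.

    With no strings, every entry is copied, so the call only rewrites
    the line endings. Returns the number of entries written.
    """
    # ASSUMPTION: matching is case-insensitive.
    wanted = [s.lower() for s in strings if s]
    keep = False
    kept = 0
    # Reading in text mode accepts LF, CR+LF or CR line endings;
    # newline="" on the output writes `eol` exactly as given.
    with open(src_path) as src, open(dst_path, "w", newline="") as dst:
        for line in src:
            line = line.rstrip("\n")
            if line.startswith(">"):
                header = line.lower()
                keep = not wanted or any(s in header for s in wanted)
                kept += keep
            if keep:
                dst.write(line + eol)
    return kept
```

With an empty string list this mirrors the "just hit enter" case:
select_entries("owl.fasta", "db.new") copies everything with NT-style
line endings. Passing ["human", "homo sapien"] would keep only the
entries whose header mentions either string. The input file name here
is hypothetical.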

- Jimmy
: Jimmy Eng, Software Engineer Dept. of Molecular Biotechnology :
: (206)616-5058 office+voice mail Univ. of Washington, Box 357730 :
: (206)685-7301 fax Seattle, WA 98195-7730 :
: engj@u.washington.edu http://weber.u.washington.edu/~engj :