Because only two facilities used instrumentation made by manufacturers other than Applied Biosystems, preliminary analysis of the results was restricted to the ABI data. The committee received 83 usable data sets from 44 laboratories. Before analysis, the committee divided these data sets into four groups according to the sequencing method used (dye-labeled primers or dye-labeled terminators) and the status of the data (unedited or edited).
  1 GAGCGCGCGT AATACGACTC ACTATAGGGC GAATTGGAGC TCCACCGCGG
 51 TGGCGGCCGC TCTAGAACTA GTGGATCCCA GAGTTCTAGG CATGTGTTAG
101 GCACTCAAAA AACATCTGCT AAATGAATTA ATAAATACAT GCCTTTCAAA
151 ATAGAAGATT TACTAAGTTC TGGGGAGAGA ACACTTTATT TCATATATTG
201 GTACAGAACT ATCAATATTT TAGAGCTATA AATTATTGGC AAAAAATGGT
251 GAAAAGTAGG GAATTTAGAA CAAGACCTTC TGAGTTCCAA CCCAGCACCA
301 TCCCTTATTA GGTATACAAT CTTGAGCAAA TGACTAAGCC TCTTTGTGCC
351 TCTGTTTTCC AGTTGACATA ATAGAAATGA TAATAATACC CACCTGGCCG
401 GGCGCGGTGG CTCACGCCTG TAATCCTAGC ACTTTGGGAG GCCGAGGCGG
451 GTAGATCACC TGAGGTCAGG AGTTCAAGAC CAGCCTGACC AACATGGAGA
501 AACCCCGTCT CTACTAAAAA TTCAAAATTA GCTGGGCGTG GTGGCGGGTG
551 CCTGTAATCC CAGCTTCTCG GGAGACTGAG GCAGGAGAAT CGCTTGAACC
601 CGGGAGGCAG AGGTTGCAGT GAGCCGAAAT CGTGCCATTG CACTCCAGTC
651 TGGGCAACAA GAGCGAAACT CCGTCTCAAA AAAAATAAAA ATTAATAAAA
701 ATAATACCAA CCTTACAGGA TAATTGTGAG AATTAACTGA ATCAATTCAT
751 CGAAAGCCCC TAGAGCAGTA CTTACCACTT AGTACCTACT AAATAAATCT
801 TAGCAGCTGT TATTAGCTCT GA

Figure 1. The sequence of the "unknown" DNA template.
Each data set was aligned with the known template sequence, and the total number of insertions, deletions, substitutions, and no-calls at each base position was tabulated for each of the four groups. An overview of the data is presented in Figures 2-5, which plot the correct assignments at each position in the sequence. The number of data sets for each group varied and is shown in the figure legends. Interpretation of the dye-primer results requires caution because of the relatively small number of submissions. The sample set for dye-terminator results was much larger because of the wider popularity of this method.
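The tabulation described above can be illustrated with a minimal sketch. This is not the software used for the study. It assumes each submission has already been aligned to the template and is supplied as a pair of equal-length gapped strings, with `-` marking a gap and `N` marking a no-call. The function names `classify_alignment` and `tabulate` are hypothetical, and attributing an insertion to the preceding template position is a convention chosen here for illustration.

```python
from collections import Counter


def classify_alignment(ref_aln, read_aln):
    """Classify each column of a gapped pairwise alignment.

    Returns a list of (template_position, category) tuples, where the
    position is 1-based and the category is one of: correct,
    substitution, insertion, deletion, no_call.
    """
    if len(ref_aln) != len(read_aln):
        raise ValueError("aligned strings must have equal length")
    events = []
    pos = 0  # 1-based position of the most recent template base
    for r, q in zip(ref_aln.upper(), read_aln.upper()):
        if r == "-" and q == "-":
            continue  # empty column, nothing to classify
        if r == "-":
            # Extra base in the read: assumed here to belong to the
            # preceding template position.
            events.append((pos, "insertion"))
            continue
        pos += 1
        if q == "-":
            events.append((pos, "deletion"))
        elif q == "N":
            events.append((pos, "no_call"))
        elif q == r:
            events.append((pos, "correct"))
        else:
            events.append((pos, "substitution"))
    return events


def tabulate(alignments):
    """Sum the per-position categories over all data sets in one group."""
    table = {}
    for ref_aln, read_aln in alignments:
        for pos, category in classify_alignment(ref_aln, read_aln):
            table.setdefault(pos, Counter())[category] += 1
    return table


# Toy alignment (hypothetical read, not study data).
group = [("GAGC-GCGC", "GATCAG-NC")]
for pos, counts in sorted(tabulate(group).items()):
    print(pos, dict(counts))
```

Running `tabulate` once per group (dye-primer or dye-terminator, unedited or edited) would give four tables of the kind summarized in Figures 2-5.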
The first 50 bases were difficult for both dye-primer and dye-terminator methods (about 90% correct base calls). Between 51 and 100 bases, dye-terminator sequencing produced about 99% correct base calling, while dye-primer sequencing still produced only about 90% correct base calls. The poorer performance of the dye-primer method in this region can be attributed to fluorescent overload from the primer front. After this initial section, correct base calling improved to about 99% for both methods.
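Regional percentages such as those quoted above can be derived from per-position counts of correct calls. The sketch below shows one way to do this, assuming a dictionary that maps each template position to the number of data sets with a correct call there. The function name `percent_correct` and the toy counts are hypothetical and do not reproduce the study data.

```python
def percent_correct(correct_by_pos, n_data_sets, start, end):
    """Percent of correct base calls over positions start..end.

    Positions are 1-based and inclusive. Positions missing from
    correct_by_pos are treated as having no correct calls.
    """
    if n_data_sets <= 0 or end < start:
        raise ValueError("need n_data_sets > 0 and start <= end")
    possible = n_data_sets * (end - start + 1)
    correct = sum(correct_by_pos.get(p, 0) for p in range(start, end + 1))
    return 100.0 * correct / possible


# Toy counts for 10 hypothetical data sets: 9 correct calls at each of
# positions 1-50 and 10 correct calls at each of positions 51-100.
toy = {p: 9 for p in range(1, 51)}
toy.update({p: 10 for p in range(51, 101)})

print(percent_correct(toy, 10, 1, 50))
print(percent_correct(toy, 10, 51, 100))
```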
Manual editing of the dye-primer data significantly increased the length of accurate sequence. The average length for the unedited dye-primer data was about 380 bases, and this improved to about 480 bases after editing. Somewhat surprisingly, manual editing did not have a similar effect with dye-terminator sequencing. In this case, both unedited and edited data sets produced good results up to about 325 bases. Beyond 350 bases, base deletions became particularly troublesome in the dye-terminator data.
A paper describing the complete results of this study is being prepared. In addition, the Nucleic Acids Research Committee intends to analyze the submitted data individually to produce a relative ranking. However, quantitative evaluation of this seemingly simple problem is difficult, and the final report has been delayed by the need to develop new algorithms and software. In the meantime, we have mailed individual alignments of each sequence to all participants who could be identified. Any participants who have not yet received an alignment should contact Al Smith.
The Nucleic Acids Research Committee is also planning an oligonucleotide synthesis field test for 1995. In addition to analysis by capillary electrophoresis, the test will evaluate the effect of the oligonucleotide primer on the quality of DNA sequencing reactions. This testing will supplement the data we have already collected on the performance capabilities of automated instrumentation in the real world.
Acknowledgment
The Nucleic Acids Research Committee would like to thank Dr. Eric Westin of Virginia Commonwealth University for making this sequence available prior to publication, the Bioinformatics Group, Center For Biotechnology, at the St. Jude Children's Research Hospital in Memphis for custom software and sequence alignments, and all the laboratories who participated in this study.