Peptide Sequence Tag

Introduction: The "Sequence Tag" protein identification technique was developed by Matthias Mann and Matthias Wilm in the mid-1990s while they were in the Protein and Peptide Group at the EMBL in Heidelberg, Germany (1-3). In the spirit of open research and the early internet, Dr. Mann made the search program, "Peptide Search", freely available, along with regularly updated databases distributed through EBI. As of 2010, this program is no longer available as a standalone Macintosh application from Matthias Wilm at the EMBL. The search technique was, and is, far ahead of its time, and it leverages the idea of search constraint to its maximum potential. Sequence Tag employs MS/MS data produced by tandem MS methods. In this search strategy the peptide fragment spectrum is inspected for obvious sequence tags. A sequence tag is a short string of amino acids deduced from the mass differences between peaks in the fragment spectrum, see Figure 1.

ESI Ion Trap Generated MS/MS Mass Spectrum

Figure 1. This spectrum was collected on an LCQ DECA XP Plus ion trap mass spectrometer. The deduced short amino acid sequence, the "Sequence Tag", is shown in blue. Note the parent m/z value, 690.39, in the upper left-hand header of the spectrum.
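Reading a tag from peak spacings can be sketched in a few lines of Python. This is only an illustration, not the algorithm used by Peptide Search. The function names (residue_for, longest_tag), the 0.1 u tolerance, and the example peak list are all assumptions. The end peaks 772.4 and 1083.6 are the tag boundaries quoted later in this tutorial, the intermediate peaks 885.5 and 984.6 were computed from that tag, and 650.3 and 830.2 are made-up noise peaks. None of the values were read from Figure 1. Leucine and isoleucine have the same mass, so "L" stands for either.

```python
# Monoisotopic residue masses in u. I is omitted because it is isobaric with L.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def residue_for(delta, tol):
    """Return the residue whose mass is closest to delta, or None if outside tol."""
    best = min(RESIDUE, key=lambda aa: abs(RESIDUE[aa] - delta))
    return best if abs(RESIDUE[best] - delta) <= tol else None


def longest_tag(peaks, tol=0.1):
    """Return (low_mz, tag, high_mz) for the longest ladder of residue-spaced peaks."""
    if not peaks:
        return None
    peaks = sorted(peaks)
    # chain[i] = (tag, index of last peak) for the longest ladder starting at peak i
    chain = [("", i) for i in range(len(peaks))]
    for i in range(len(peaks) - 1, -1, -1):
        for j in range(i + 1, len(peaks)):
            aa = residue_for(peaks[j] - peaks[i], tol)
            if aa and 1 + len(chain[j][0]) > len(chain[i][0]):
                chain[i] = (aa + chain[j][0], chain[j][1])
    start = max(range(len(peaks)), key=lambda i: len(chain[i][0]))
    tag, end = chain[start]
    return peaks[start], tag, peaks[end]


# Illustrative peak list (see the assumptions stated above)
example_peaks = [650.3, 772.4, 830.2, 885.5, 984.6, 1083.6]
print(longest_tag(example_peaks))
```

At ion trap mass accuracy some residues cannot be told apart by spacing alone (for example K and Q differ by only about 0.04 u), so a sketch like this simply reports the closest residue.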

As a component of the search, the neutral peptide mass must be determined from the regular MS spectrum. For a description of how to determine the neutral mass of a peptide, please see our tutorial on ESI mass spectrum interpretation. Take a look at the full mass spectrum used to select the mass for MS/MS fragmentation, see Figure 2. Use the spectrum in Figure 2 to determine the neutral mass of the peptide. Remember that these two peaks are related mathematically: they come from the same peptide at different charge states. The x-axis is mass/charge.


Full ESI Ion Trap Mass Spectrum

Figure 2. ESI full mass spectrum displaying related ions 690.5 at the +2 charge state and 461.1 at the +3 charge state.
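The arithmetic behind this exercise can be sketched as follows, assuming the two peaks are the same peptide carrying z and z + 1 protons and taking the proton mass as 1.00728 u. The function names are illustrative only. Solving M = z(m_high - p) = (z + 1)(m_low - p) for z gives z = (m_low - p) / (m_high - m_low).

```python
PROTON = 1.00728  # mass of a proton in u


def charge_of_higher_peak(mz_high, mz_low):
    """Charge z of the peak at mz_high, assuming mz_low is the same peptide at z + 1."""
    return round((mz_low - PROTON) / (mz_high - mz_low))


def neutral_mass(mz, z):
    """Neutral mass of a peptide observed at m/z = mz with z added protons."""
    return z * (mz - PROTON)


# Peak labels from Figure 2
z = charge_of_higher_peak(690.5, 461.1)
print("charge of the 690.5 peak:", z)
print("neutral mass from the 690.5 peak:", neutral_mass(690.5, z))
print("neutral mass from the 461.1 peak:", neutral_mass(461.1, z + 1))

# Parent m/z from the header of Figure 1
print("neutral mass from 690.39:", neutral_mass(690.39, 2))
```

The estimates from the two peaks need not agree exactly, since they depend on how precisely each peak was labeled. The value computed from 690.39 rounds to the 1378.8 used for the search in this tutorial.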

The Ultimate Constraint: Using this technique, one looks for obvious ladders of peaks in an MS/MS spectrum, see Figure 1. The human eye can quickly pick out peaks that are spaced about one amino acid residue apart, roughly 100 u. The search is so specific that the tag can be as short as one amino acid. It is this specific because the tag is located at a precise mass distance from the amino and carboxyl termini of the peptide, which divides the unknown peptide into three regions: the mass before the tag, the tag itself, and the mass after the tag. The sequence tag for the spectrum in Figure 1 can be defined by the neutral mass of the parent peptide, 1378.8, and the Sequence Tag, (772.4)LVV(1083.6). Once the protein is identified using this technique, the remaining peptides can be matched against the proposed sequence by their parent mass alone. For a schematic representation of the spectrum and the tag, see Figure 3.
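The three-region constraint can be illustrated with a minimal Python sketch. It is not the Peptide Search implementation. It assumes singly charged fragment ions, monoisotopic masses, and a 0.5 u tolerance, and the names flank_masses and match_tag are made up for this example. Because the analyst may not know whether the ladder is a "b" or a "y" series, the sketch tries both readings. In the "y" reading the tag runs in the opposite direction along the sequence.

```python
# Monoisotopic residue masses in u. I is folded into L because the two are isobaric.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def mass(seq):
    """Sum of residue masses for a stretch of sequence (I is treated as L)."""
    return sum(RESIDUE[aa] for aa in seq.replace("I", "L"))


def flank_masses(parent, low_mz, high_mz, series):
    """Residue-mass sums of the (N-terminal, C-terminal) regions around the tag."""
    if series == "b":
        return low_mz - PROTON, parent - WATER - (high_mz - PROTON)
    # y ions contain the C-terminus plus one water
    return parent - (high_mz - PROTON), low_mz - WATER - PROTON


def match_tag(peptide, parent, low_mz, tag, high_mz, tol=0.5):
    """Return "b" or "y" if the peptide fits the tag under that reading, else None."""
    peptide = peptide.replace("I", "L")
    tag = tag.replace("I", "L")
    for series in ("b", "y"):
        # y ions grow from the C-terminus, so the tag appears reversed in the sequence
        seq = tag if series == "b" else tag[::-1]
        n_target, c_target = flank_masses(parent, low_mz, high_mz, series)
        start = peptide.find(seq)
        while start != -1:
            n_mass = mass(peptide[:start])
            c_mass = mass(peptide[start + len(seq):])
            if abs(n_mass - n_target) <= tol and abs(c_mass - c_target) <= tol:
                return series
            start = peptide.find(seq, start + 1)
    return None


# Flank masses implied by this tutorial's numbers under each reading
for series in ("b", "y"):
    print(series, flank_masses(1378.8, 772.4, 1083.6, series))
```

With the values from this tutorial, the "b" reading puts about 771.4 u of residues before LVV and about 278.2 u after it. A database peptide would be passed to match_tag together with the same four numbers. Only peptides that contain the tag with both flanking masses correct are accepted, which is why even a very short tag is so selective.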

The Ultimate Flexibility: This approach also allows for homology matching when the program is asked to match only two of the three regions. In this way one of the terminal regions is allowed to float. The sequence tag is maintained, and the program only needs to match the mass before or the mass after the tag. While you lose some of the power of the original constraint, this can be a way to search when the traditional sequence tag fails to find a match because the peptide is modified.
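A small Python sketch of the two-of-three idea follows, assuming the tag sequence itself has already been found in a candidate peptide and the flanking residue masses are known. The function name classify_match, the 0.5 u tolerance, and the candidate masses are hypothetical and serve only to illustrate the logic.

```python
def classify_match(observed, candidate, tol=0.5):
    """Compare (N-region, C-region) masses from the spectrum with a candidate peptide.

    Returns a label and the unexplained mass shift of the floating region, if any.
    """
    n_ok = abs(observed[0] - candidate[0]) <= tol
    c_ok = abs(observed[1] - candidate[1]) <= tol
    if n_ok and c_ok:
        return "full", 0.0
    if n_ok:
        # N-terminal region and tag match; the C-terminal region floats
        return "c_region_floats", observed[1] - candidate[1]
    if c_ok:
        # C-terminal region and tag match; the N-terminal region floats
        return "n_region_floats", observed[0] - candidate[0]
    return "no_match", None


# Observed flanks: this tutorial's tag read as b ions (about 771.39 u and 278.20 u).
# Candidate flanks: hypothetical values for a database peptide.
print(classify_match((771.39, 278.20), (771.39, 262.20)))
```

Reporting the mass shift of the floating region lets the analyst judge whether it could correspond to a modification or a sequence difference.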



Figure 3. This is a simplified diagram of the spectrum shown in Figure 1, illustrating how the "Tag" is located at a precise distance from the peptide's termini. This search constraint is very powerful.

Conclusion: Most mass spectrometrists have fond memories of this technique, since it was one of the first freely available programs and helped many scientists whose bioinformatics resources were limited. The original program does suffer from some surmountable problems: the manual nature of calling the sequence tag, the manual nature of determining the neutral peptide mass, and the problem of not knowing whether one is calling a "b" or "y" ion series. Hence, as we have entered the era of high-throughput, proteomics-style peptide sequencing and identification, this technique has declined in popularity, held back mainly by its manual nature. One can envision a rebirth and dominance of this technique once an automated de novo sequencing program is able to call the sequence tag automatically with high probability.

This program is no longer available as a web-based program at the EMBL. If you can find a sequence tag program on the net, please forward the URL to us.


  1. Mann M, Wilm M. Error-tolerant identification of peptides in sequence databases by peptide sequence tags. Anal Chem. 1994 Dec 15;66(24):4390-9.
  2. Mortz E, O'Connor PB, Roepstorff P, Kelleher NL, Wood TD, McLafferty FW, Mann M. Sequence tag identification of intact proteins by matching tandem mass spectral data against sequence data bases. Proc Natl Acad Sci U S A. 1996 Aug 6;93(16):8264-7. PMID: 8710858
  3. Shevchenko A, Jensen ON, Podtelejnikov AV, Sagliocco F, Wilm M, Vorm O, Mortensen P, Shevchenko A, Boucherie H, Mann M. Linking genome and proteome by mass spectrometry: large-scale identification of yeast proteins from two dimensional gels. Proc Natl Acad Sci U S A. 1996 Dec 10;93(25):14440-5. PMID: 8962070


Copyright © 2004-2016 IonSource. All rights reserved.
Last updated:  Tuesday, January 19, 2016 02:48:33 PM