Uppsala Software Factory

Uppsala Software Factory - MSEQPRO Manual


1 MSEQPRO - GENERAL INFORMATION

Program : MSEQPRO
Version : 980206
Author : Gerard J. Kleywegt, Dept. of Cell and Molecular Biology, Uppsala University, Biomedical Centre, Box 590, SE-751 24 Uppsala, SWEDEN
E-mail : gerard@xray.bmc.uu.se
Purpose : generate PROSITE profiles from aligned protein sequences
Package : SBIN


2 REFERENCES

Reference(s) for this program:

* 1 * G.J. Kleywegt & T.A. Jones (1998). Databases in protein crystallography. Acta Cryst D54, 1119-1131. [http://alpha2.bmc.uu.se/gerard/papers/databases.html] [http://www.iucr.org/iucr-top/journals/acta/tocs/actad/1998/actad5406_1.html]

* 2 * G.J. Kleywegt & T.A. Jones (1999 ?). Chapter 25.2.6. O and associated programs. Int. Tables for Crystallography, Volume F. To be published.


3 VERSION HISTORY

97???? - 0.1 - first version
971023 - 0.3 - improvements
971111 - 1.0 - cleaned up code and manual
980206 - 1.1 - minor changes


4 INTRODUCTION

This program generates PROSITE profiles from a set of aligned protein sequences. It is extremely similar to STRUPRO (quod vide). It only uses uninterrupted stretches of aligned residues to calculate the profile (the rationale being that stretches without insertions/deletions are more likely to correspond to structurally conserved features). It is mainly intended as an add-on to STRUPRO: an initial profile is generated from aligned 3D structures (with STRUPRO), this profile is scanned against the sequence database (with the "pfsearch" program from the pftools package), and the alignment of the matching sequences is used to produce a new profile (with MSEQPRO). Program PRF2MSEQ can convert the output of a profile/database search ("pfsearch -y") into a partial multiple sequence alignment file suitable as input to MSEQPRO. Alternatively, you can make a multiple sequence alignment file yourself and use that as input to MSEQPRO directly.

In order to scan sequence profiles against SWISS-PROT, you will need:

(1) the "pftools" suite of programs, written by Philipp Bucher ( mailto:pbucher@isrec-sun1.unil.ch ) and available by ftp from http://ulrec3.unil.ch:80/ftp-server/pftools/ (the suite should compile on most Unix machines).

(2) the SWISS-PROT database of protein sequences ( http://www.expasy.ch/sprot/sprot-top.html ), which can be downloaded by ftp from ftp://ftp.expasy.ch/databases/swiss-prot/ (at the time of writing, the file "compressed/sprot35.dat.Z").


5 INPUT TO THE PROGRAM

The input to this program is largely a subset of the input for STRUPRO (see the STRUPRO manual).


5.1 Start-up

When you start the program, it prints some information:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 *** MSEQPRO *** MSEQPRO *** MSEQPRO *** MSEQPRO *** MSEQPRO *** MSEQPRO ***

 Version - 971111/1.0
 (C) 1992-97 Gerard J. Kleywegt, Dept. Mol. Biology, BMC, Uppsala (S)
 User I/O - routines courtesy of Rolf Boelens, Univ. of Utrecht (NL)
 Others - T.A. Jones, G. Bricogne, Rams, W.A. Hendrickson
 Others - W. Kabsch, CCP4, PROTEIN, E. Dodson, etc. etc.

 Started - Tue Nov 11 16:04:56 1997
 User - gerard
 Mode - interactive
 Host - sarek
 ProcID - 11151
 Tty - /dev/ttyq14

*** MSEQPRO *** MSEQPRO *** MSEQPRO *** MSEQPRO *** MSEQPRO *** MSEQPRO ***

Reference(s) for this program:

* 1 * G.J. Kleywegt, Uppsala University, Uppsala, Sweden, Unpublished program.

For manuals and complete references, check: http://alpha2.bmc.uu.se/usf/

*** MSEQPRO *** MSEQPRO *** MSEQPRO *** MSEQPRO *** MSEQPRO *** MSEQPRO ***

 Max nr of molecules : ( 100)
 Max nr of residues in sequence : ( 1000)
 Nr of amino-acid types : ( 20)
 Random sequence length : ( 2000000)
 One-letter codes : ( A R N D C E Q G H I L K M F P S T W Y V)
 Three-letter codes : ( ALA ARG ASN ASP CYS GLU GLN GLY HIS ILE LEU LYS MET PHE PRO SER THR TRP TYR VAL)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----


5.2 Random-number seed

The first item of input is an integer seed for the random-number generator. The generator is used to produce the random amino-acid sequence, and to produce random sequences when calculating the weight of each structure/sequence. If you repeat a run of the program on the same machine with the same seed, you should get identical results.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Random-number seed ? (  123456)
 Random-number seed : (  123456)
 => Random number generator initialised with seed :     123456
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


5.3 Random sequence

The program will now generate a random amino-acid sequence of (at present) 2,000,000 residues. This sequence has an amino-acid distribution similar to that found in proteins in the PDB (GJK, unpublished results). It will be used later to calculate scores for the profile parts, which gives you some idea of the "signal-to-noise".

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Generating random sequence ...
 Target composition    : (   0.081    0.044    0.046    0.058    0.019
  0.058    0.037    0.080    0.022    0.053    0.081    0.059    0.020
  0.040    0.047    0.068    0.063    0.016    0.038    0.071)
 Working ...
 Actual composition    : (   0.081    0.044    0.046    0.058    0.019
  0.057    0.037    0.080    0.022    0.053    0.081    0.060    0.020
  0.040    0.046    0.068    0.063    0.015    0.038    0.070)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
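
The idea of drawing a random sequence from a target composition can be sketched in Python as follows. This is only an illustration, not the code used by MSEQPRO: the function names are made up, Python's random-number generator will not reproduce MSEQPRO's sequence for a given seed, and it is assumed that the target composition is listed in the same order as the one-letter codes printed at start-up.

```python
import random
from collections import Counter

# Order of the one-letter codes as printed at start-up.
ONE_LETTER = "ARNDCEQGHILKMFPSTWYV"

# Target composition from the example above (assumed to be in the same
# order as ONE_LETTER; the rounded values sum to 1.001, not exactly 1).
TARGET = [0.081, 0.044, 0.046, 0.058, 0.019, 0.058, 0.037, 0.080, 0.022, 0.053,
          0.081, 0.059, 0.020, 0.040, 0.047, 0.068, 0.063, 0.016, 0.038, 0.071]


def random_sequence(length, seed):
    """Draw `length` residues independently with probabilities ~ TARGET."""
    rng = random.Random(seed)  # same seed -> same sequence
    return "".join(rng.choices(ONE_LETTER, weights=TARGET, k=length))


def composition(sequence):
    """Observed fraction of each residue type, in ONE_LETTER order."""
    counts = Counter(sequence)
    return [counts[aa] / len(sequence) for aa in ONE_LETTER]


if __name__ == "__main__":
    seq = random_sequence(100000, 123456)
    print(" ".join(f"{f:.3f}" for f in composition(seq)))
```

For a long sequence the actual composition ends up close to, but not identical with, the target composition, as in the example output above.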
   


5.4 Substitution matrix

Next, you need to provide the name of a file which contains the matrix to be used in the construction of the profiles. A number of matrices are available; others can be made by the user.

Note: if you have defined the environment variable GKLIB so that it points to the directory where you keep your collection of these matrix files (in Uppsala: /nfs/public/lib), the program will use this to generate the default file name.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Library file with matrix ? (sbin_blosum45.lib) ../sbin_blosum45.lib
 Library file with matrix : (../sbin_blosum45.lib)
 Comment : (! BLOSUM 45 matrix made from BLOCKS v. 5.0 and scaled in half-
  bits.)
 Comment : (! ARNDCQEGHILKMFPSTWYVBZX)
 Comment : (! integer matrix)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

Such a matrix file may look as follows (if it contained real rather than integer numbers, you would replace MATI by MATR and change the format to something appropriate):

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
!
! PAM 250 matrix recommended by Gonnet, Cohen & Benner
! Science June 5, 1992.
! Values rounded to nearest integer
!
TYPE 22 (30(2x,a1))
  C  S  T  P  A  G  N  D  E  Q  H  R  K  M  I  L  V  F  Y  W  X  *
!
MATI (30i3)
 12  0  0 -3  0 -2 -2 -3 -3 -2 -1 -2 -3 -1 -1 -2  0 -1  0 -1 -3 -8
  0  2  2  0  1  0  1  0  0  0  0  0  0 -1 -2 -2 -1 -3 -2 -3  0 -8
  0  2  2  0  1 -1  0  0  0  0  0  0  0 -1 -1 -1  0 -2 -2 -4  0 -8
 -3  0  0  8  0 -2 -1 -1  0  0 -1 -1 -1 -2 -3 -2 -2 -4 -3 -5 -1 -8
  0  1  1  0  2  0  0  0  0  0 -1 -1  0 -1 -1 -1  0 -2 -2 -4  0 -8
 -2  0 -1 -2  0  7  0  0 -1 -1 -1 -1 -1 -4 -4 -4 -3 -5 -4 -4 -1 -8
 -2  1  0 -1  0  0  4  2  1  1  1  0  1 -2 -3 -3 -2 -3 -1 -4  0 -8
 -3  0  0 -1  0  0  2  5  3  1  0  0  0 -3 -4 -4 -3 -4 -3 -5 -1 -8
 -3  0  0  0  0 -1  1  3  4  2  0  0  1 -2 -3 -3 -2 -4 -3 -4 -1 -8
 -2  0  0  0  0 -1  1  1  2  3  1  2  2 -1 -2 -2 -2 -3 -2 -3 -1 -8
 -1  0  0 -1 -1 -1  1  0  0  1  6  1  1 -1 -2 -2 -2  0  2 -1 -1 -8
 -2  0  0 -1 -1 -1  0  0  0  2  1  5  3 -2 -2 -2 -2 -3 -2 -2 -1 -8
 -3  0  0 -1  0 -1  1  0  1  2  1  3  3 -1 -2 -2 -2 -3 -2 -4 -1 -8
 -1 -1 -1 -2 -1 -4 -2 -3 -2 -1 -1 -2 -1  4  2  3  2  2  0 -1 -1 -8
 -1 -2 -1 -3 -1 -4 -3 -4 -3 -2 -2 -2 -2  2  4  3  3  1 -1 -2 -1 -8
 -2 -2 -1 -2 -1 -4 -3 -4 -3 -2 -2 -2 -2  3  3  4  2  2  0 -1 -1 -8
  0 -1  0 -2  0 -3 -2 -3 -2 -2 -2 -2 -2  2  3  2  3  0 -1 -3 -1 -8
 -1 -3 -2 -4 -2 -5 -3 -4 -4 -3  0 -3 -3  2  1  2  0  7  5  4 -2 -8
  0 -2 -2 -3 -2 -4 -1 -3 -3 -2  2 -2 -2  0 -1  0 -1  5  8  4 -2 -8
 -1 -3 -4 -5 -4 -4 -4 -5 -4 -3 -1 -2 -4 -1 -2 -1 -3  4  4 14 -4 -8
 -3  0  0 -1  0 -1  0 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -2 -2 -4 -1 -8
 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8
!
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

Note that the matrix may contain entries for residue types not used by MSEQPRO (e.g., "X", "B", "Z", "*"); the program will ignore these.
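
If you want to inspect such a matrix file outside MSEQPRO, a small reader along the following lines may help. It is a sketch under several assumptions: the function name is invented, the values are assumed to be separated by white space (true for the example above, but a Fortran format such as 30i3 does not guarantee it), and only the TYPE, MATI and MATR records shown in this manual are handled.

```python
def read_matrix(text):
    """Parse a matrix library file into (labels, {row: {column: value}})."""
    # Drop comment lines (starting with "!") and empty lines.
    lines = [ln for ln in text.splitlines()
             if ln.strip() and not ln.startswith("!")]
    it = iter(lines)
    labels = None
    matrix = {}
    for line in it:
        fields = line.split()
        if fields[0] == "TYPE":
            n_types = int(fields[1])
            labels = next(it).split()  # residue labels on the next line
            if len(labels) != n_types:
                raise ValueError("number of labels does not match TYPE")
        elif fields[0] in ("MATI", "MATR"):
            if labels is None:
                raise ValueError("matrix found before TYPE record")
            convert = int if fields[0] == "MATI" else float
            for row_label in labels:  # one row per residue type
                values = [convert(v) for v in next(it).split()]
                if len(values) != len(labels):
                    raise ValueError("wrong number of values in row " + row_label)
                matrix[row_label] = dict(zip(labels, values))
    return labels, matrix


if __name__ == "__main__":
    demo = """!
! tiny made-up matrix, for illustration only
TYPE 3 (30(2x,a1))
  C  S  *
MATI (30i3)
 12  0 -8
  0  2 -8
 -8 -8 -8
"""
    labels, matrix = read_matrix(demo)
    print(labels, matrix["C"]["S"])
```

Entries for types such as "X" or "*" are simply kept in the dictionary here; a caller that only needs the 20 standard amino acids can skip them.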


5.5 Minimum fragment length

Only uninterrupted stretches of a certain minimum length will be used in the profile (they must be at least 3 residues long).

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Min fragment length ? (       5)
 Min fragment length : (       5)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
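
The search for uninterrupted stretches can be pictured with the following Python sketch. It is not MSEQPRO's own code: the function name is invented, all sequences are assumed to have been padded to the same length with "-" as the gap character, and a stretch is accepted when it is at least as long as the minimum fragment length.

```python
def conserved_stretches(alignment, min_length=5, gap="-"):
    """Return (start, end) column ranges without gaps in any sequence.

    Columns are 0-based and `end` is exclusive.
    """
    n_cols = len(alignment[0])
    if any(len(seq) != n_cols for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    stretches = []
    start = None  # first column of the stretch currently being extended
    for col in range(n_cols + 1):  # one extra step closes a final stretch
        gap_free = col < n_cols and all(seq[col] != gap for seq in alignment)
        if gap_free:
            if start is None:
                start = col
        else:
            if start is not None and col - start >= min_length:
                stretches.append((start, col))
            start = None
    return stretches


if __name__ == "__main__":
    demo = ["ACDEFGH-KLMNP",
            "ACDEFGHIKL-NP",
            "A-DEFGHIKLMNP"]
    print(conserved_stretches(demo, min_length=5))
```

With a smaller minimum length more (and shorter) stretches are accepted, which is why the program insists on at least 3 residues.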
   


5.6 Sequence weighting

Appropriate weighting of the various structures/sequences is important to minimise bias in the profile (e.g., five different structures of the same human protein and only one of an insect form of the protein will bias the profile towards human sequences). The following weights can be used:

- uniform weights, i.e. all weights equal; this is not advisable

- sequence distance weights, as defined by Sibbald and Argos; this is probably the most sensible choice (in this implementation, the number of "Monte Carlo" cycles executed lies between 100,000 and 1,000,000, or fewer if the weights converge to within 1%)

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Sequences may be weighted:
 U = uniform weights
 S = sequence distance weights
 Weighting scheme ? (S)
 Weighting scheme : (S)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
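
For readers unfamiliar with sequence distance weights, the following Python sketch shows one common reading of the Monte Carlo procedure of Sibbald and Argos: random sequences are built by picking, at every position, one of the residues observed at that position, and each random sequence adds to the count of the input sequence(s) closest to it. This is an assumption about the general method, not a description of MSEQPRO's implementation; in particular, the function name is invented, a fixed number of cycles replaces the convergence test, and distance is taken to be the number of mismatches.

```python
import random


def distance_weights(fragments, n_cycles=10000, seed=123456):
    """Monte Carlo estimate of sequence distance weights (sum to 1)."""
    rng = random.Random(seed)
    length = len(fragments[0])
    # Residue types observed in each alignment column.
    columns = [sorted({seq[i] for seq in fragments}) for i in range(length)]
    counts = [0.0] * len(fragments)

    for _ in range(n_cycles):
        trial = [rng.choice(column) for column in columns]
        # Number of mismatches between the trial and each input sequence.
        dists = [sum(a != b for a, b in zip(seq, trial)) for seq in fragments]
        best = min(dists)
        nearest = [i for i, d in enumerate(dists) if d == best]
        for i in nearest:  # ties share the count equally
            counts[i] += 1.0 / len(nearest)

    return [c / n_cycles for c in counts]


if __name__ == "__main__":
    # Two identical sequences and one outlier: the duplicates share weight.
    print(distance_weights(["AAAA", "AAAA", "CCCC"]))
```

In this toy case the two identical sequences each receive a lower weight than the single outlier, which is the behaviour that counteracts the bias described above.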
   


5.7 Sequence and profile files

Provide the name of the file containing the (partial) multiple sequence alignment, as well as of the (output) profile file (these customarily have an extension ".prf").

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Name of sequence file ? (aligned.seq) strupro.mseq
 Name of sequence file : (strupro.mseq)

 Name of profile file ? (aligned.prf) mseqpro.prf
 Name of profile file : (mseqpro.prf)

 Remark : (! 1131 pos. 28 - 250 P35694|BRU1_SOYBN BRASSINOSTEROID- REGULATED PROTEIN BRU1.)
 Remark : (! 1231 pos. 64 - 268 P33693|EXOK_RHIME SUCCINOGLYCAN BIOSYNTHESIS PROTEIN EXOK.)

[...]

 Remark : (! 1338 pos. 23 - 287 P53301|YG46_YEAST HYPOTHETICAL 52.8 KD PROTEIN IN BUB1-HIP1 INTERGENIC REGION.)
 Nr of sequences : ( 40)
 Nr of residues : ( 209)
 SEQ > (---CA--GSFYQD-----FD-SLSLDKVSG-FKSKKEYLFG-----RID----------MQLK- GTVTAYYLSSQ-THDEIDFEFLG----NLSGD----------PY-TNIFT-DPTRNFHTYSIIWKPQ- HIIFLVD-VFKNA-EPLGV----PFPKNQ--PMRIYSSLWNAD----DWAT-GAEYEAN------ELDAYS)
 SEQ > (---CT---WSKKQ---VKTV-ILELTFEEK-FACGEIQTRK---RFGYG--TYEARIKAADGS- GLNSAFFTYIG-PHDEIDFEVLG-AKVQINQY-SAKGGNEFLAD--VPGG--ANQGFNDYAFVWEKN- RIRYYVN-----G-HEVTD--PAKIPVNA---QKIFFSLWGTD-TLTDWMG-GDECQFA-AQS)

[...]

SEQ > (--EST-DSTTAAS-NPLKTT-ALATSFSED-FSSSSKWFTD-AGEIKYG-GKLEVILKAANGT- GIVSSFYLQSD-DLDEIDIEWVG-DNTQFQSN-----------F----FS-TPTDKFHNYTLDWAMD- KTTWYLD-SVRVL-NTSSE------GYPQ-SPMYLMMGIWAGG-GTIEWAG-GSWESIE-ADGGSIYGRYD)

 Nr of positions without INDELs : ( 60)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----


6 OUTPUT

MSEQPRO will now start looking for aligned stretches of residues which do not have any insertions/deletions in any of the sequences. If such a fragment (exceeding the minimum required length) is found, it will be used to generate a part of the new profile.

For every successful stretch of residues that the program encounters, the output includes:

- information about the length of the stretch of residues

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 New conserved stretch !
 Length : (          7)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

- the weights, which are calculated and printed. In the case of sequence distance weights, this may take a little while, since many random sequences need to be generated and statistics accumulated.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Calculating sequence distances ...
 WARNING - Weights did not converge : (    1000000)
 Largest shift (%) : (   3.731)
 Weights      : (   0.057    0.044    0.021    0.039    0.021    0.033
  0.024    0.021    0.035    0.030    0.037    0.029    0.033    0.023
  0.017    0.021    0.023    0.012    0.012    0.022    0.027    0.004
  0.033    0.009    0.004    0.008    0.041    0.024    0.004    0.027
  0.015    0.024    0.009    0.016    0.053    0.009    0.009    0.009
  0.059    0.063)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

- for every residue, the amino acid in every sequence, and the profile matrix entries are listed

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 AA-TYPE :  ALA ARG ASN ASP CYS GLU GLN GLY HIS ILE LEU LYS MET PHE PRO SER THR TRP TYR VAL
 |KGGAGAGGGAAPPSASSAASKGGGGGGGGGGSGGTGGGAS|
 PROFILE :   12 -14  -2 -10 -21 -10 -10  23 -17 -26 -24  -9 -16 -25  -8   9  -3 -25 -23 -17
 |KEEREEEEEERRRRRRRRRRRRIRRRQRRRRRRRERRRDS|
 PROFILE :  -14  35   1  -1 -29  22   9 -19   1 -24 -20  18  -8 -27 -15  -3  -8 -23 -11 -22
 |EINNNYYNYLLLVTALVLLTAAAAAAATAVAAAAYAAALK|
 PROFILE :    4 -11  -9 -17 -19  -8 -12 -18 -10   1   3 -10   0  -5 -20  -6  -2 -16   2   0
 |YQRYRRRRRYVYYYYYYYYYLLRLLLSLLLLFLLLFFFNW|
 PROFILE :  -15   3 -13 -18 -27  -9 -12 -24  -1  -5   2  -7  -2   9 -25 -16  -9   5  24  -8
 |LTSKSSSSTTTLLLLLLLLLYYYYYYYYYYYYYYFYYYWF|
 PROFILE :  -11 -13 -16 -21 -22 -15 -17 -24  -8  -1   6 -16   0  16 -24 -10   1   3  23  -4
 |FRVAVTTVKLRLLMMLMMMMASISSASASQSSSSGSSSET|
 PROFILE :    0  -5  -9 -16 -19  -5  -9 -13 -13  -5  -3  -8   2  -8 -17   0   3 -23  -9  -2
 |GKQGQNNQSEGDEDAEAAAEAAATAARKATSATLKAAAND|
 PROFILE :    3  -1   7   5 -23   4   4   0  -7 -24 -22   3 -14 -27 -13   4  -4 -27 -18 -20
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

- next, the program slides the profile along the entire random amino acid sequence and calculates statistics:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Random sequence tests :      1999994
 Average, St.dev.      :        -51.6        31.5
 Minimum, Maximum      :       -163.0       104.0
 Z-min, Z-max          :        -3.53        4.94

 Mol # 1 Raw score = 23 Z-score = 2.37
 Mol # 2 Raw score = 33 Z-score = 2.69
 Mol # 3 Raw score = 31 Z-score = 2.62

[...]

 Mol # 38 Raw score = 97 Z-score = 4.72
 Mol # 39 Raw score = 6 Z-score = 1.83
 Mol # 40 Raw score = 25 Z-score = 2.43
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
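
The numbers in this example can be reproduced by hand. The raw score of a sequence for a stretch appears to be the sum of the profile entries for its residues, and the Z-score is consistent with (raw score - average) / st.dev. of the random-sequence scores. The Python sketch below illustrates this using the seven PROFILE rows listed above; the function names are invented, and the use of the population standard deviation is an assumption.

```python
import statistics

ALPHABET = "ARNDCEQGHILKMFPSTWYV"  # column order of the AA-TYPE line

# The seven PROFILE rows printed in the example above.
PROFILE = [
    [12, -14, -2, -10, -21, -10, -10, 23, -17, -26, -24, -9, -16, -25, -8, 9, -3, -25, -23, -17],
    [-14, 35, 1, -1, -29, 22, 9, -19, 1, -24, -20, 18, -8, -27, -15, -3, -8, -23, -11, -22],
    [4, -11, -9, -17, -19, -8, -12, -18, -10, 1, 3, -10, 0, -5, -20, -6, -2, -16, 2, 0],
    [-15, 3, -13, -18, -27, -9, -12, -24, -1, -5, 2, -7, -2, 9, -25, -16, -9, 5, 24, -8],
    [-11, -13, -16, -21, -22, -15, -17, -24, -8, -1, 6, -16, 0, 16, -24, -10, 1, 3, 23, -4],
    [0, -5, -9, -16, -19, -5, -9, -13, -13, -5, -3, -8, 2, -8, -17, 0, 3, -23, -9, -2],
    [3, -1, 7, 5, -23, 4, 4, 0, -7, -24, -22, 3, -14, -27, -13, 4, -4, -27, -18, -20],
]


def window_score(profile, window):
    """Sum of the profile entries for the residues in `window`."""
    return sum(row[ALPHABET.index(aa)] for row, aa in zip(profile, window))


def background_stats(profile, sequence):
    """Slide the profile along `sequence`; return mean, st.dev., min, max."""
    n = len(profile)
    scores = [window_score(profile, sequence[i:i + n])
              for i in range(len(sequence) - n + 1)]
    return (statistics.mean(scores), statistics.pstdev(scores),
            min(scores), max(scores))


def z_score(raw, average, st_dev):
    return (raw - average) / st_dev


if __name__ == "__main__":
    # First letter of each residue line above, i.e. the residues of Mol # 1.
    raw = window_score(PROFILE, "KKEYLFG")
    print(raw, round(z_score(raw, -51.6, 31.5), 2))  # prints: 23 2.37
```

Note that a stretch of 7 residues fits into a sequence of 2,000,000 residues in 1,999,994 ways, which is the number of random sequence tests reported.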


7 RESULTS

When the program has finished, it will print a summary:

- the pairwise sequence identity matrix (in %), *ONLY* counting the residues that ended up being in the profile:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Nr of residues in profile : (         43)

 Sequence identity for these residues only:
 % Seq id mol # 1 -> 100.0 30.2 32.6 37.2 32.6 32.6 32.6 32.6 32.6 32.6 27.9 23.3 23.3 20.9 23.3 18.6 23.3 23.3 23.3 18.6 16.3 4.7 11.6 16.3 7.0 16.3 11.6 7.0 11.6 14.0 9.3 11.6 11.6 9.3 55.8 11.6 11.6 11.6 11.6 30.2
 % Seq id mol # 2 -> 30.2 100.0 58.1 48.8 58.1 51.2 53.5 58.1 55.8 37.2 25.6 9.3 16.3 16.3 16.3 16.3 16.3 16.3 16.3 16.3 11.6 16.3 14.0 14.0 16.3 18.6 16.3 14.0 11.6 16.3 11.6 9.3 16.3 14.0 34.9 11.6 11.6 14.0 11.6 27.9

[...]

% Seq id mol # 40 -> 30.2 27.9 41.9 39.5 41.9 44.2 44.2 41.9 41.9 39.5 23.3 20.9 16.3 23.3 18.6 14.0 16.3 14.0 14.0 16.3 14.0 9.3 11.6 9.3 9.3 16.3 11.6 9.3 11.6 14.0 9.3 14.0 9.3 7.0 34.9 9.3 9.3 9.3 18.6 100.0

 Average sequence identity (%) : ( 27.961)
 St. dev. : ( 24.095)
 Minimum : ( 2.326)
 Maximum : ( 100.000)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
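
Since only the 43 profile residues are counted, every entry in this matrix is a multiple of 100/43 (for instance, 13 identical residues give 30.2 % and a single one gives 2.326 %). A minimal Python sketch of the calculation, with invented function names and assuming that the profile residues of each sequence have already been concatenated into one string:

```python
def percent_identity(a, b):
    """Percentage of positions at which two equally long strings agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def identity_matrix(sequences):
    """Full (symmetric) matrix of pairwise sequence identities in %."""
    return [[percent_identity(a, b) for b in sequences] for a in sequences]


if __name__ == "__main__":
    # Made-up 43-residue strings that agree at 13 positions.
    seq_1 = "A" * 43
    seq_2 = "A" * 13 + "C" * 30
    for row in identity_matrix([seq_1, seq_2]):
        print(" ".join(f"{value:5.1f}" for value in row))
```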

- some results pertaining to the random sequence

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Sum of maximum random scores : (        746)
 Sum AVE+3SIGMA random scores : (        243)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

- the accumulated raw scores of the input structures/sequences.

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Score for molecule   1 =        513
 Score for molecule   2 =        487
 Score for molecule   3 =        617
[...]
 Score for molecule  39 =        435
 Score for molecule  40 =        526
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

- a suggestion is made for the minimum raw score to be used in searches against the (SWISS-PROT) sequence database (note that it is better to scan the whole sequence database to get realistic statistics)

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
 Minimum raw score : (        400)
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   


8 PROFILE FILE

For the example above, the following profile file is generated:

      
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
ID   MSEQPRO; MATRIX.
AC   PS99999;
DT   JAN-1900 (CREATED);
DE   Created by MSEQPRO V. 971111/1.0 at Tue Nov 11 16:10:39 1997 for user gerard
CC
CC   Substitution matrix file : ../sbin_blosum45.lib
CC   Nr of sequences used : 40
CC   Min fragment length : 5
CC   Weighting scheme : S
CC
MA   /GENERAL_SPEC: ALPHABET='ARNDCEQGHILKMFPSTWYV'; LENGTH= 43;
MA   TOPOLOGY=LINEAR;
MA   /DISJOINT: DEFINITION=PROTECT; N1=1; N2= 43;
MA   /CUT_OFF: LEVEL=0; SCORE= 400;
MA   /DEFAULT: MI=-100; I=-10; IM=0 ; MD=-100; D=-3; DM=0;
MA   /M: SY='G'; M=12,-14,-2,-10,-21,-10,-10,23,-17,-26,-24,-9,-16,-25,-8,9,-3,-25,-23,-17;
MA   /M: SY='R'; M=-14,35,1,-1,-29,22,9,-19,1,-24,-20,18,-8,-27,-15,-3,-8,-23,-11,-22;
MA   /M: SY='A'; M=4,-11,-9,-17,-19,-8,-12,-18,-10,1,3,-10,0,-5,-20,-6,-2,-16,2,0;
MA   /M: SY='Y'; M=-15,3,-13,-18,-27,-9,-12,-24,-1,-5,2,-7,-2,9,-25,-16,-9,5,24,-8;
MA   /M: SY='Y'; M=-11,-13,-16,-21,-22,-15,-17,-24,-8,-1,6,-16,0,16,-24,-10,1,3,23,-4;
MA   /M: SY='T'; M=0,-5,-9,-16,-19,-5,-9,-13,-13,-5,-3,-8,2,-8,-17,0,3,-23,-9,-2;
MA   /M: SY='N'; M=3,-1,7,5,-23,4,4,0,-7,-24,-22,3,-14,-27,-13,4,-4,-27,-18,-20;
MA     /I: MI=0; I=-1; MD=0; /M: SY='X'; M=0; D=-1;
MA   /M: SY='G'; M=13,-19,-4,-13,-23,-17,-17,44,-20,-28,-22,-17,-16,-25,-17,3,-10,-21,-26,-18;
MA   /M: SY='D'; M=-11,-14,-2,12,-23,-7,-3,-21,-12,-7,-2,-12,-5,-17,-16,-4,3,-30,-10,-5;
MA   /M: SY='G'; M=-4,-15,5,-9,-24,-17,-18,20,-15,-16,-19,-15,-12,-17,-23,-3,-11,-15,-18,-10;
MA   /M: SY='G'; M=-4,-13,-9,-16,-22,-14,-14,0,-17,-10,-2,-18,-5,-7,-13,-1,-1,-24,-12,-8;
MA   /M: SY='A'; M=26,-15,-3,-10,-11,-8,-7,-5,-17,-12,-15,-10,-13,-18,-11,18,12,-28,-18,-2;
MA   /M: SY='F'; M=-17,-20,-22,-36,-22,-31,-26,-30,-18,7,16,-27,7,51,-23,-21,-10,0,20,2;
MA   /M: SY='F'; M=-17,-16,-20,-30,-25,-24,-25,-28,-2,-1,2,-20,-1,45,-29,-19,-11,21,42,-4;
MA   /M: SY='L'; M=-9,-18,-21,-28,-19,-18,-20,-28,-18,15,28,-24,17,13,-24,-18,1,-18,2,9;
MA   /M: SY='A'; M=7,-14,-12,-13,-18,-11,-11,-12,-12,-4,-5,-11,-5,-7,-19,0,-2,-18,1,3;
MA   /M: SY='P'; M=-3,-14,-7,-4,-28,-4,3,-11,-15,-18,-24,-9,-17,-25,34,5,1,-31,-23,-20;
MA   /M: SY='M'; M=-4,-13,-8,-8,-22,-7,-9,-8,-12,-4,-7,-11,5,-16,-9,-5,-4,-27,-14,-1;
MA     /I: MI=0; I=-1; MD=0; /M: SY='X'; M=0; D=-1;
MA   /M: SY='C'; M=-7,-11,3,1,6,-7,-4,-13,-4,-25,-25,-9,-18,-24,-5,5,1,-37,-19,-19;
MA   /M: SY='Y'; M=-12,-15,-5,-15,2,-12,-17,-18,-3,-14,-12,-16,-10,-5,-27,-12,-11,0,6,-15;
MA   /M: SY='D'; M=-10,-7,11,27,-26,2,19,-8,2,-28,-23,-2,-20,-29,-8,3,-5,-34,-17,-23;
MA   /M: SY='E'; M=-5,-2,-3,-8,-22,25,3,-21,-5,-8,-13,-2,-2,-25,-12,6,7,-25,-10,-10;
MA   /M: SY='I'; M=-7,-21,-24,-33,-20,-18,-27,-31,-22,33,20,-22,27,1,-24,-18,-7,-23,-3,30;
MA   /M: SY='D'; M=5,-14,7,29,-21,-5,5,-4,-10,-27,-21,-5,-21,-30,-11,5,0,-32,-19,-17;
MA   /M: SY='V'; M=-6,-24,-25,-35,-19,-27,-30,-34,-29,35,14,-25,13,8,-26,-15,-5,-22,-2,36;
MA   /M: SY='E'; M=-11,0,-9,-11,-32,33,4,-13,-1,-17,-15,-1,0,-26,-16,-9,-14,7,-4,-26;
MA   /M: SY='F'; M=-14,-10,-15,-25,-24,-2,-12,-26,-11,-3,-2,-14,1,23,-22,-13,-10,1,11,-6;
MA   /M: SY='D'; M=-4,-14,-2,13,-23,-7,-2,-11,-3,-14,-3,-11,-4,-19,-17,-7,-9,-29,-12,-10;
MA   /M: SY='T'; M=-1,-11,12,-4,-17,-10,-10,9,-13,-18,-20,-11,-15,-18,-16,13,15,-31,-18,-12;
MA     /I: MI=0; I=-1; MD=0; /M: SY='X'; M=0; D=-1;
MA   /M: SY='G'; M=7,-11,5,-1,-21,-2,6,10,-4,-25,-22,-7,-16,-24,-12,9,-3,-28,-19,-20;
MA   /M: SY='F'; M=-12,-22,-21,-35,-23,-28,-27,-26,-22,17,15,-27,8,31,-27,-19,-9,-8,12,13;
MA   /M: SY='N'; M=-4,-7,22,21,-19,-1,7,-8,-4,-23,-24,-3,-20,-24,-12,13,9,-36,-18,-20;
MA   /M: SY='P'; M=-8,-18,-17,-17,-28,-15,-11,-15,-16,-6,-14,-14,-9,-8,17,-7,-7,-15,-6,-7;
MA   /M: SY='H'; M=-17,-4,20,9,-26,-1,-3,-15,37,-21,-19,-7,-10,-8,-21,-5,-10,-20,15,-25;
MA     /I: MI=0; I=-1; MD=0; /M: SY='X'; M=0; D=-1;
MA   /M: SY='T'; M=-6,0,13,1,-18,3,-2,-16,-4,-15,-17,-2,-11,-18,-14,10,21,-30,-12,-12;
MA   /M: SY='V'; M=-8,-19,-25,-29,-19,-22,-27,-32,-15,24,9,-19,9,9,-28,-15,-5,-10,18,29;
MA   /M: SY='V'; M=7,-13,-13,-20,-16,-12,-14,-15,-17,2,3,-13,0,-10,-19,-2,4,-25,-11,8;
MA   /M: SY='I'; M=-9,-16,-13,-25,-22,-18,-20,-29,-22,16,8,-20,5,6,-20,-8,2,-20,-1,11;
MA   /M: SY='Q'; M=-7,-5,-1,9,-24,5,11,-16,-9,-18,-16,-2,-12,-24,-5,2,3,-29,-15,-15;
MA   /M: SY='W'; M=-20,-18,-29,-35,-36,-24,-28,-26,-15,-9,-6,-21,-9,37,-30,-29,-19,77,42,-16;
MA   /M: SY='N'; M=-9,5,15,12,-25,8,9,-14,1,-20,-19,3,-13,-24,-14,1,-3,-30,-14,-20;
MA   /M: SY='P'; M=8,-10,-4,-9,-22,-5,-4,-10,-15,-16,-20,-3,-11,-22,15,5,4,-27,-19,-14;
MA   /M: SY='N'; M=-3,-9,15,13,-23,0,3,4,3,-27,-26,-7,-18,-26,-11,10,-1,-33,-17,-23;
//
 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
   

Note that insertions and deletions in the uninterrupted stretches are severely penalised (since there were none in the set of aligned sequences !), whereas they may occur anywhere in between such stretches (since they occur in at least one of the input sequences !).
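
To check a generated profile file, or to compare it with the PROFILE lines printed during the run, the match rows can be pulled out with a few lines of Python. The sketch below is based solely on the layout of the example file above (it is not a general PROSITE profile parser), and the names are invented. The single-valued "/M: SY='X'; M=0;" records on the "/I:" lines between stretches are skipped on purpose.

```python
import re

# Matches e.g.  /M: SY='G'; M=12,-14,-2,...;
M_RECORD = re.compile(r"/M:\s*SY='(.)';\s*M=([-\d,]+);")


def read_match_rows(text, n_types=20):
    """Return a list of (symbol, scores) for the full-length /M: records."""
    rows = []
    for line in text.splitlines():
        if not line.startswith("MA"):
            continue  # only MA lines carry matrix data
        found = M_RECORD.search(line)
        if found is None:
            continue
        scores = [int(value) for value in found.group(2).split(",")]
        if len(scores) == n_types:  # skip the 'X' spacer positions
            rows.append((found.group(1), scores))
    return rows


if __name__ == "__main__":
    demo = "\n".join([
        "MA   /DEFAULT: MI=-100; I=-10; IM=0 ; MD=-100; D=-3; DM=0;",
        "MA   /M: SY='N'; M=3,-1,7,5,-23,4,4,0,-7,-24,-22,3,-14,-27,-13,4,-4,-27,-18,-20;",
        "MA     /I: MI=0; I=-1; MD=0; /M: SY='X'; M=0; D=-1;",
        "MA   /M: SY='G'; M=13,-19,-4,-13,-23,-17,-17,44,-20,-28,-22,-17,-16,-25,-17,3,-10,-21,-26,-18;",
    ])
    for symbol, scores in read_match_rows(demo):
        print(symbol, max(scores))
```

Applied to the complete example file, this should yield as many rows as the LENGTH given in the /GENERAL_SPEC record minus the number of spacer positions.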


9 KNOWN BUGS

None, at present ("peppar, peppar").


10 UNKNOWN BUGS

Does not compute.


Uppsala Software Factory Created at Fri Dec 18 19:42:30 1998 by MAN2HTML version 971024/1.6