Program : MSEQ2ALSC
Version : 980212
Author : Gerard J. Kleywegt, Dept. of Cell and Molecular Biology,
Uppsala University, Biomedical Centre, Box 590,
SE-751 24 Uppsala, SWEDEN
E-mail : gerard@xray.bmc.uu.se
Purpose : convert multiple sequence alignment to ALSCRIPT format
Package : SBIN
Reference(s) for this program:
* 1 * G.J. Kleywegt & T.A. Jones (1998). Databases in protein crystallography. Acta Cryst D54, 1119-1131. [http://alpha2.bmc.uu.se/gerard/papers/databases.html] [http://www.iucr.org/iucr-top/journals/acta/tocs/actad/1998/actad5406_1.html]
* 2 * G.J. Kleywegt & T.A. Jones (1999 ?). Chapter 25.2.6. O and associated programs. Int. Tables for Crystallography, Volume F. To be published.
980211 - 0.1 - first version
980212 - 1.0 - cleaned up code and manual
MSEQ2ALSC is a simple non-interactive program that reads a multiple-sequence alignment file as produced by STRUPRO or PRF2MSEQ, and outputs a file that can be used as input to Geoff Barton's ALSCRIPT program (see his Web-site).
Usage: MSEQ2ALSC < mseq_file > alsc_file
The multiple-sequence alignment file produced by STRUPRO, together with MSEQ2ALSC and ALSCRIPT, enables you to produce cute structure-based sequence alignments. MSEQ2ALSC will clearly flag which parts of the sequences are structurally aligned and which are not. The latter will be enclosed by lines of "<" and ">" characters in the output file.
HINT: if you intend to use the sequence alignment with ALSCRIPT, use reasonable distance cut-offs in STRUPRO (e.g., 3.5 A equivalent and 5.0 A extension, or even 2.5 and 3.5 A) to prevent erroneous alignments.
HINT: if you align multi-domain proteins, treat the individual domains separately in STRUPRO and MSEQ2ALSC. Afterwards, merge the resulting ALSCRIPT input files.
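The merge step is not spelled out in this manual. The following is a minimal Python sketch of one way it might be done, under two assumptions that are not confirmed by the manual: that each MSEQ2ALSC output file consists of ">name" lines, a "*" line, one line per alignment position, and a closing "*" line, and that "merging" means appending the positions of the second domain to those of the first. The function name merge_alsc is hypothetical.

```python
def merge_alsc(first, second):
    """Append the alignment positions of `second` to those of `first`.

    Assumed layout of each text (not confirmed by the manual):
    '>name' lines, a '*' line, one line per alignment position, a '*' line.
    Both texts are assumed to list the same sequences in the same order.
    """
    def split(text):
        lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
        stars = [i for i, ln in enumerate(lines) if ln == "*"]
        if len(stars) != 2:
            raise ValueError("expected exactly two '*' lines")
        # Everything before the first '*' is the name list,
        # everything between the two '*' lines is the alignment body.
        return lines[:stars[0]], lines[stars[0] + 1:stars[1]]

    names1, body1 = split(first)
    names2, body2 = split(second)
    if len(names1) != len(names2):
        raise ValueError("the two files describe different numbers of sequences")
    return "\n".join(names1 + ["*"] + body1 + body2 + ["*"]) + "\n"


if __name__ == "__main__":
    domain1 = ">S1\n>S2\n*\nAB\nCD\n*\n"
    domain2 = ">S1\n>S2\n*\nEF\n*\n"
    print(merge_alsc(domain1, domain2), end="")
```

The sketch keeps the sequence names of the first file and only checks that both files contain the same number of sequences; whether the domains really belong to the same molecules in the same order is left to the user.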
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
../MSEQ2ALSC < aligned.mseq > q.alsc
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Part of the input file:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
!
! Sequence alignment file
! Created by STRUPRO V. 980211/1.2 at Thu Feb 12 13:28:44 1998 for user gerard
!! NOT ALIGNED MOL 1 FROM PRO- 1 TO PRO- 1
!> P
! NOT ALIGNED MOL 2 FROM CYS- 1 TO ASP- 2
!> CD
! NOT ALIGNED MOL 3 FROM CYS- 1 TO ASP- 2
!> CD
! NOT ALIGNED MOL 4 FROM PRO- 1 TO PRO- 1
!> P
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Part of the output file:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
>MSEQ2ALSC_1
>MSEQ2ALSC_2
>MSEQ2ALSC_3
>MSEQ2ALSC_4
[...]
>MSEQ2ALSC_24
*
<<<<<<<<<<<<<<<<<<<<<<<<
PCCPPPPVVVVV----CCCCCTTS
-DD---VKDDDD----DDDDDKKN
-------E----------------
>>>>>>>>>>>>>>>>>>>>>>>>
NAANNNDFAAAAAAAAAAAAADDK
FFFFFFFAFFFFFFFFFFFFFQQF
SVVASANGLLLLDDDDVVVVVNNL
GGGGGGGIGGGGGGGGGGGGGGGG
NTTTNTYKTTTTTTTTTTTTTTTT
WWWWWWWYWWWWWWWWWWWWWWWW
KKKKKKKKKKKKKKKKKKKKKEEK
[...]
VEEVVVKKEEEEKKKKEEEEEKKE
RRRRRRKAKKKKKKKKRRRRRKKK
EAAEEEVQEEEEEEEEAAAAAKKV
<<<<<<<<<<<<<<<<<<<<<<<<
------H-----------------
>>>>>>>>>>>>>>>>>>>>>>>>
*
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
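In the output excerpt, each line between the two "*" lines appears to hold one alignment position, with one character per sequence. To read the alignment sequence by sequence, the positions have to be transposed. The following Python sketch does this under the assumption that the file layout is exactly as in the excerpt (">name" lines, "*", position lines, "*"); the function name read_alsc_block is hypothetical and not part of MSEQ2ALSC or ALSCRIPT.

```python
def read_alsc_block(text):
    """Return (names, rows, unaligned) for a MSEQ2ALSC-style output text.

    rows[i] is the aligned string for sequence i; unaligned[j] is True when
    position j lies between a '<<<' line and a '>>>' line.
    The layout is an assumption inferred from the example excerpt.
    """
    names, columns, unaligned = [], [], []
    in_body = False      # True once the first '*' line has been seen
    in_marked = False    # True between a '<<<' line and a '>>>' line
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "*":
            if in_body:
                break    # closing '*': end of the alignment
            in_body = True
        elif not in_body:
            if line.startswith(">"):
                names.append(line[1:])
        elif set(line) == {"<"}:
            in_marked = True
        elif set(line) == {">"}:
            in_marked = False
        else:
            if len(line) != len(names):
                raise ValueError("position line does not match sequence count")
            columns.append(line)
            unaligned.append(in_marked)
    # Transpose: character i of every position line belongs to sequence i.
    rows = ["".join(col[i] for col in columns) for i in range(len(names))]
    return names, rows, unaligned


if __name__ == "__main__":
    sample = ">A\n>B\n>C\n*\n<<<\nPC-\n-D-\n>>>\nNAA\nFFF\n*\n"
    names, rows, unaligned = read_alsc_block(sample)
    for name, row in zip(names, rows):
        print(name, row)
```

Returning the unaligned flags separately, rather than altering the residues, keeps the rows usable as plain sequences while still recording which positions were flagged.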
If you use a multiple, partial sequence alignment file produced by PRF2MSEQ, you will get an alignment of the residues that matched the profile. Parts in between profile fragments will be indicated by " <...> " in the output file.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
../MSEQ2ALSC < prf.mseq > q.alsc
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Part of the input file:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
! 491 pos. 127 - 205 P37366|CCL1_YEAST CYCLIN CCL1.
IAQHLNL PTEVVATAISFFRRFFLENS DPKSIVHTTIFLACKSENYFI DSFAQKAKSTRDS VLKFEFKLLESL
! 512 pos. 209 - 296 P46277|CG1B_MEDVA G2/MITOTIC-SPECIFIC CYCLIN 1 (B-LIKE CYCLIN).
VDWLIEVHDKFDL MHETLFLTVNLIDRFLEKQS KLQLVGLVAMLLACKYEEVSV PVVGDLILIS YTRKEVLEMEKVMVNAL
! 402 pos. 53 - 146 P55168|CG1C_CHICK G1/S-SPECIFIC CYCLIN C.
EHLKL RQQVIATATVYFKRFYARYS DPVLMAPTCVFLASKVEEFGV ISAATSVLKTRFS YRMNHILECEFYLLELM
! 356 pos. 175 - 238 P55168|CG1C_CHICK G1/S-SPECIFIC CYCLIN C.
LAWRIVNDTYRTDL PPFMIALACLHVACVVQQKDA R VDMEKILEIIRVILKLY
! 444 pos. 53 - 146 P25008|CG1C_DROME G1/S-SPECIFIC CYCLIN C.
EQLKL RQQVIATATVYFKRFYARNS DPLLLAPTCILLASKVEEFGV ISNSRLISICQSA YRTNHILECEFYLLENL
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Part of the output file:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
>MSEQ2ALSC_1
>MSEQ2ALSC_2
>MSEQ2ALSC_3
[...]
>MSEQ2ALSC_90
*
-----------------------------I-------V-V-V------------------------------------------------
-----------------------------N-------N-N-N------------------------------------------------
-----------------------------Q-------Q-Q-Q----------------------------------------------R-
-----------------------------F-------F-F-F----------------------------------------------M-
[...]
DIF-ST-T-GGGGGGNGVVLGIGLGGNNDGIDGDNDDGDGDGDDIIIITDI-IIINYNIIIDYDVVVVVVNNNNDPNNNNEENNNNEH-G
SSSRARRSRAAAAAATSAVSASASAGCTTYTTYTTSTYTYTYTTATTTSTT-TAAASASSSTTTTTTTTTAAAAASSSSSKDSSSSEA-A
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
..........................................................................................
..........................................................................................
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
-YYVYYVYVCCCCCCYYFFYCYFYCYYYY-YY-YYYY-Y-Y-YYYYYYCYY-YYYYYYFFFYYYYYYYYYYYYYYFIIIIYFIIVVFV-A
-TRDRRDRDTTTSSTSTSTTSSTTTDSTTTRNTTSTTTSTSTETSTSSTTS-TTTTSTTTTTTSTTTTTTNTTTKERRRRTEKKKKES-S
-RMMTMMMMEEEGGEDPDVHVNVHEERRKLVKLAQKKLKLKLQSHKKKHAK-KRRKEKVVVKLKKKKKKKSSSARAPPPPDAPPPPAP-K
-KNENNENEDDRDDDKEKDEEEEEDENDKEPKEKQAKEKEKEKQKAKEARE-AKKAKKSSSKEKHHHHHAASSSEKDEEENKQQQQKR-N
-EHKHHKHKDEDEEEQEEEHDQDQEEDDQTDQHQEQQSQSQSQQQQQKEQK-QDDDEEDDDQTQQQQQQQEQQQETEEEESTEEEETQ-D
VVIIIIIIIIIIIIIIVIIIIVVIIILIVLIVIVVVVLVLVLVVIIIIVII-IIIIIILLLVLLIIIIIIVIIILILLLLIILLLLIL-I
LLLLLLLLLLLLLLTLVLILKLCLKLLILKLLKLVLLKLKLKLLLRLRIRR-RLLRYLMMMLKLRRRRRRRRRRIKLLLLRKLLLLQR-K
KEEEEEEEESSNTTRVQDRTERVAEQDRRPLRPRKRRPRPRPRRVQAQDQQ-SEETPERRRKPRQQQQQDEEEEARQQQQPREEEERD-N
FMCICCICIMMHMMMMAAAMGMAMGAAAMCMMCMMMMCMCMCMMMMMMMMM-MMMMVMMMMMCMMMMMMMMMMMMMMMMMDM----MW-A
EEEIEEIEIEEEEEEEEEEEEEEEEEEEELEELEEEELELELEEEEEEEEE-EEEEMEEEEEIEEEEEEEEEEEEEEEEEEE----EE-E
FKFRFFRFRILKLLLKKRRKRKRKKRRQHLRHMHAQHMHMHVHHKIQQRLQ-CKKQQQKKKHLHMMMMMMITTMTLLLLLLL----LV-M
KVYVYYVYVIIIMMIKYHYTFKYTFYTYLDFLDLDVLDLDLDLLTATKQQK-NLLTKHIIILDVKKKKKKTLLISLLLLLLL----LL-F
LMLILLILIIILIMIIMIMIIIMIIIIMVLLILILIVLVLVLIIIMIMIIM-IMMILIVVVVLLIIIIIIIIIIIVLLLLIV----IV-M
LVLLLLLLLMMLMMMLLLLLLLLLLLLILHLLHLLLLHLHLHLLLLLLLFL-LLLLALLLLLYLLLLLLLLLLLLLVVVVMV----LL-L
ENEKEEKEKKKQKKKGTA-NEGNNKRNDKQ-KRKKKKQKQKQKKSKNKSKK-RNNKQKEEEKQKRRRRRRKKKRRSNNNNET----S--T
SALLNLLLLEAAAA-AIT-KKAVKTV-TVT-VTTTIVTVTVTVVT-SATAT-RTTTLK---VTVVAVVVVEEELTVKKKK-T----T--S
LLMYLMYMYLLLLL-LLL-LLLLLLL-LLY-LYLLLLYLYLYLLL-L-LI--LLLLVL---LYLLLLLLLLLLLLLLLLL-L----L--L
*
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
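For PRF2MSEQ-derived output it can be handy to recover the individual profile fragments. The Python sketch below splits the position lines into fragments at each block that starts with a line of "<" and ends with a line of ">", dropping the dotted filler lines in between. It assumes the layout seen in the excerpt above; the function name profile_fragments is hypothetical.

```python
def profile_fragments(text):
    """Split the alignment body into fragments separated by '<...>' blocks.

    Returns a list of fragments; each fragment is a list of position lines.
    The layout ('*' line, position lines, '*' line) is an assumption
    inferred from the example excerpt.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    start = lines.index("*") + 1          # first line after the opening '*'
    end = lines.index("*", start)         # the closing '*'
    fragments, current, in_gap = [], [], False
    for line in lines[start:end]:
        if set(line) == {"<"}:
            in_gap = True                 # a gap block begins
            if current:
                fragments.append(current)
                current = []
        elif set(line) == {">"}:
            in_gap = False                # the gap block ends
        elif not in_gap:
            current.append(line)          # ordinary position line
    if current:
        fragments.append(current)
    return fragments


if __name__ == "__main__":
    sample = ">A\n>B\n*\nAB\nC-\n<<<<\n..\n..\n>>>>\nDE\n*\n"
    for number, fragment in enumerate(profile_fragments(sample), start=1):
        print(number, len(fragment), "positions")
```

Note that for STRUPRO-derived output the lines between the markers hold unaligned residues rather than dots, so this sketch would discard them; it is meant for the PRF2MSEQ case only.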
Known bugs: none, at present ("peppar, peppar", Swedish for "touch wood").