Program : NCS6D
Version : 961122
Author : Gerard J. Kleywegt & T. Alwyn Jones,
Dept. of Cell and Molecular Biology,
Uppsala University, Biomedical Centre, Box 590,
SE-751 24 Uppsala, SWEDEN
E-mail : gerard@xray.bmc.uu.se
Purpose : find NCS-operators through a 1D->6D search
Package : RAVE
Reference(s) for this program:
* 1 * T.A. Jones (1992). A, yaap, asap, @#*? A set of averaging programs. In "Molecular Replacement", edited by E.J. Dodson, S. Gover and W. Wolf. SERC Daresbury Laboratory, Warrington, pp. 91-105.
* 2 * G.J. Kleywegt & T.A. Jones (1994). Halloween ... Masks and Bones. In "From First Map to Final Model", edited by S. Bailey, R. Hubbard and D. Waller. SERC Daresbury Laboratory, Warrington, pp. 59-66.
* 3 * G.J. Kleywegt & R.J. Read (1997). Not your average density. Structure 5, 1557-1569. [http://www4.ncbi.nlm.nih.gov/htbin-post/Entrez/query?uid=9438862&form=6&db=m&Dopt=r]
* 4 * G.J. Kleywegt & T.A. Jones (2037 ?). Convenient single and multiple crystal real-space averaging. To be published ???
* 5 * G.J. Kleywegt & T.A. Jones (1999 ?). Chapter 25.2.6. O and associated programs. Int. Tables for Crystallography, Volume F. To be published.
930410 - 0.3 - first version of the manual
930410 - 0.4 - altered sampling; record top 50 scores
930502 - 0.5 - compute average/st.dev.; updated manual
930606 - 0.6 - consistent rotation angle definition
930615 - 1.0 - new production version; choice of interpolation
methods
931217 - 1.1 - more error checks
940316 - 1.2 - allow Polar angles; store 100 best solutions
940525 - 1.3 - removed bug from rotation matrix analysis routine
950216 - 1.4 - write TOP 100 solutions to the output file; don't
print analysis of NCS operator for each and every
new rotation step, to reduce output
951022 - 1.5 - made sensitive to OSYM
960412 - 1.6 - echo all input to help debug scripts
961111 - 1.7 - properly initialise array of best RT operators
961122 - 2.0 - dynamic memory allocation
From version 2.0 onward, NCS6D allocates memory for its map dynamically. This means that you can increase the size that the program can handle on the fly:
1 - through the environment variable MAPSIZE (must be in capital letters !), for example put the following in your .cshrc file or your script:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- setenv MAPSIZE 8000000 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
2 - by using command-line argument MAPSIZE (need not be in capitals), for example in your script:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- run ncs6d -b mapsize 10000000 < ncs6d.inp >& ncs6d.out ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
Note that command-line argument takes precedence over the environment variable. So you can set the environment variable in your .cshrc file to a "typical" value, and if you have to deal with a map which is bigger than that, you can use the command-line argument.
If sufficient memory cannot be allocated, the program will print a message and quit. In that case, increase the amount of virtual memory (this will not help, of course, if you try to allocate more memory than can be addressed by your machine (for 32-bit machines, something 2**32-1 bytes, I think), or reduce the size requirements.
NCS6D needs space for 1 map.
NCS6D is a brute-force program to help you find NCS RT-operators in particular in MIR maps (when you do Molecular Replacement, the operators follow from superposition of the solutions you obtained).
NCS6D uses a set of BONES or PDB atoms as input and tries to find a set of rotations and translations which maximise the correlation coefficient between the density at the (BONES) atoms and those at the same atoms after application of the operator.
NOTE: this program is sensitive to the environment variable OSYM.
It should point to your local copy of $ODAT/symm, the directory
which contains the spacegroup symmetry operators in O format.
When asked for a file with spacegroup operators in O format,
you may either provide a filename, or the name of a sapcegroup
(including blanks if you like, case doesn't matter). The program
will try to open the following files, assuming that STRING is the
what you input:
(1) open a file called STRING
(2) if this fails, check if OSYM is defined and open $OSYM/STRING
(3) if this fails, open $OSYM/string.sym
(4) if this fails, open $OSYM/string.o
Hint: if you make soft links in the OSYM directory, you can also type
spacegroup numbers (e.g.: \ln -s p212121.sym 19.sym).
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----*** NCS6D *** NCS6D *** NCS6D *** NCS6D *** NCS6D *** NCS6D *** NCS6D ***
Version - 931217/1.1 (C) 1993 - Gerard J. Kleywegt & T. Alwyn Jones, BMC, Uppsala (S) User I/O - routines courtesy of Rolf Boelens, Univ. of Utrecht (NL) Others - T.A. Jones, G. Bricogne, Rams, W.A. Hendrickson Others - W. Kabsch, CCP4, PROTEIN, etc. etc.
Started - Fri Dec 17 15:16:53 1993 User - gerard Mode - interactive Host - rigel ProcID - 1724 Tty - /dev/ttyq16
*** NCS6D *** NCS6D *** NCS6D *** NCS6D *** NCS6D *** NCS6D *** NCS6D ***
Max size of map : ( 4194304) Max nr of (BONES) atoms : ( 50000) Max nr of symmetry ops : ( 64) ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- Map file ? ( ) p2test_0.E ... Map read OK Cell axes (A) : ( 91.800 99.500 56.500) Cell angles (d) : ( 90.000 90.000 90.000) Grid axes (pts) : ( 100 110 64) Origin (pts) : ( 0 0 0) Extent (pts) : ( 100 110 64) ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
The name of the (CCP4) map that you want to use. This map should contain AT LEAST one asymmetric unit.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- Read BONES or PDB file (B/P) ? (B) p Name of PDB file ? () ca.pdb Number of atoms : ( 314) ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- Read BONES or PDB file (B/P) ? (B) b Name of BONES file ? ( ) p2.bones Data block name : (P2SKL_ATOM_XYZ) Data block type : (R) Nr of values : (2703) Format : ((3f10.3)) Number of atoms : ( 675) ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
A BONES or PDB file.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- File with symmetry operators ? ( ) p212121.o Opening O datablock : (p212121.o) Datablock : (A_SYMMOP) Data type : (R) Number : (48) Format : ((3F10.4)) ... ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
The symmetry operators of your spacegroup, from an "O" datablock file.
NOTE: if your map contains both the volume around your reference
molecule and that around your suspected NCS-related mate,
then you only need to supply the identity operator (P1).
A file containing this operator may look as follows:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- unit_operator R 12 (3f15.7) 1.0000000 0.0000000 0.0000000 0.0000000 1.0000000 0.0000000 0.0000000 0.0000000 1.0000000 0.0000000 0.0000000 0.0000000 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- Approximate centre of NCS mate ? ( 0.000 0.000 0.000) 20 10 0 Centre of gravity of atoms : ( 48.501 63.622 33.709) Approximate centre of NCS mate : ( 20.000 10.000 0.000) Initial operator : 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 -28.500534 -53.622337 -33.709057 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
The approximate centre-of-gravity of the NCS-related molecule in Cartesian Angstroms.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- Rotations are Euler angles in degrees Rotation PHI start, end, step ? ( 0.000 350.000 10.000) 0 340 20 Rotation PSI start, end, step ? ( 0.000 350.000 10.000) 10 90 20 Rotation KAPPA start, end, step ? ( 0.000 350.000 10.000) 0 340 20 ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
The (Eulerian) rotation space that is to be searched.
NOTE: prior to version 0.6 the actual rotation matrix was generated,
but since Alwyn's routines for applying operators to vectors
expect the transpose rotation matrix, this has been changed.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- Translations are in A relative to NCS-centre Translation X start, end, step ? ( -10.000 10.000 2.000) Translation Y start, end, step ? ( -10.000 10.000 2.000) Translation Z start, end, step ? ( -10.000 10.000 2.000) Estimated number of trials : ( 2156220) ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
The translational space that is to be searched, centred on the centre-of-gravity of the NCS-related molecule.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- You have a choice of interpolation methods: N = density at nearest grid point A = average density at 8 nearest grid points L = linear interpolation using 8 points F = full, 64-point spline interpolation Interpolation type (N/A/L/F) ? (L) l ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
The density-interpolation method to be used.
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- File for best RT-operator ? (rt_best.o) ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
The file to which the best RT-operator will be written (as an "O" datablock; can be fed into IMP in order to finetune the operator).
For each atom, the density at the nearest grid point is computed and stored.
Then a set of seven do-loops is executed. The outermost three are over PHI, PSI and KAPPA, respectively. For each set of three angles a new rotation matrix is calculated and printed. The position of the centre-of-mass upon application of this rotation is calculated and the initial RT-translation vector is computed so that this point ends up at the approximate position of the NCS-mate that you input.
Then three loops through each of the translations are executed. In the innermost loop, which is over the atoms, the net translation is added to the rotated coordinates and the density at the nearest grid point for each atom is retrieved. Using these values and the stored densities for the reference atoms, the correlation coefficient is calculated.
Whenever the correlation coefficient exceeds the previous maximum, the new maximum (and the operator etc.) is printed.
When the complete search is finished, the best operator is written to file.
NOTE: from version 0.4 onwards, each atom is assigned the AVERAGE density of the eight closest grid points. This effectively increases the sampling by a factor of eight. BUT: a consequence of this is that if a point ends up exactly at a grid point, the density will still be calculated as an average of 8 grid points. In other words, if you want to test the program by looking for the identity operator, do NOT expect to find correlation coefficients of 1.000 (rather, they will be somewhere between 0.9 and 1.0).
NOTE: from version 1.0 onwards, there is a choice of 4 interpolation
methods (ranging from coarse and fast to accurate and slow):
N = density at nearest grid point
A = average density at 8 nearest grid points
L = linear interpolation using 8 points
F = full, 64-point spline interpolation
An example of the output:
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- ... Please be patient for a while ...FORGN : ( 0.000 0.000 0.000) FEXT : ( 0.990 0.991 0.984) GEXT : ( 0.980 0.982 0.969) Rotation origin : ( 28.501 53.622 33.709) Centre of atoms : ( 48.501 63.622 33.709)
CPU total/user/sys : 5.7 4.3 1.4 Rotations : ( 0.000 10.000 0.000) Rotation matrix : 0.984808 0.000000 0.173648 0.000000 1.000000 0.000000 -0.173648 0.000000 0.984808 C-o-G after rotation : ( 41.910 63.622 41.619) Initial translation : ( -21.910 -53.622 -41.619) Min score for the top : ( 0.000) ... NEW MAXIMUM ... Trial number : ( 1) Trans offset : ( -10.000 -10.000 -10.000) Trans vector : ( -31.910 -63.622 -51.619) Nr of atoms checked : ( 675) Correlation coefficient >>> ( -0.053) NCS COG : ( 10.000 0.000 -10.000) ... NEW MAXIMUM ... Trial number : ( 2) Trans offset : ( -10.000 -10.000 -8.000) Trans vector : ( -31.910 -63.622 -49.619) Nr of atoms checked : ( 675) Correlation coefficient >>> ( -0.026) NCS COG : ( 10.000 0.000 -8.000) ... NEW in the top : ( 50) Trans vector : ( -31.910 -55.622 -31.619) Correlation coefficient >>> ( 0.138) ... NEW MAXIMUM ... Trial number : ( 55) Trans offset : ( -10.000 -2.000 10.000) Trans vector : ( -31.910 -55.622 -31.619) Nr of atoms checked : ( 675) Correlation coefficient >>> ( 0.138) NCS COG : ( 10.000 8.000 10.000) NEW in the top : ( 50) Trans vector : ( -31.910 -53.622 -51.619) Correlation coefficient >>> ( 0.029) ... ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
* Note that whenever a new rotation matrix has been calculated,
the elapsed CPU-time for the previous cycle is printed.
In the above output, one cycle takes approximately one
minute (this was on an SGI; on a DEC ALPHA it takes only
7 seconds). By multiplying this number by the total number
of rotation steps you get a fairly accurate estimate of
the total amount of CPU-time your job is going to take.
In this case: 18 * 5 * 18 * 1 minute = 27 hours (on an
ALPHA: 18 * 5 * 18 * 7 seconds = just over three hours).
* Note that the total CPU-time is determined by:
NATOMS*NTRAX*NTRAY*NTRAZ*NPHI*NPSI*NKAPPA
Whenever you cut the three rotation or translation stepsizes in half,
the execution time will increase by a factor of 8 (or 64 if you half
both) !!!
* From version 0.4 onwards, the top 50 scoring operators are kept and listed at the end. This makes the program a wee bit slower but that's well worth it if you happen to have two or more high-scoring solutions (the one with the highest score isn't necessarily the best solution, so you may want to try several solutions).
* From version 0.5 onwards, the average correlation coefficient and the standard deviation therein are computed and printed.
* From version 1.4 onwards, all top 100 solutions are written to the output file; it may look as follows (NOTE: only solutions whose correlation coefficient is GREATER THAN ZERO will be written):
----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- ! Created by NCS6D V. 950216/1.4 at Thu Feb 16 20:55:44 1995 for user gerard ! ! NCS6D best operator with CC = 0.1245881 .LSQ_RT_NCS6D R 12 (3F15.8) -0.50000000 0.86602539 0.00000000 0.86602539 0.50000000 0.00000000 0.00000000 0.00000000 -1.00000000 -9.10041237 -65.30522919 30.92998123 ! ! Top 100 solutions (UNSORTED) ! ! ! Solution # 1 with CC = 4.9569793E-02 .LSQ_RT_OTHER R 12 (3F15.8) -0.51507688 0.57790893 0.63302225 0.57790893 -0.31127504 0.75440651 0.63302225 0.75440651 -0.17364812 -14.58862305 -25.44091797 -72.44584656 ! ! Solution # 2 with CC = 4.1004587E-02 .LSQ_RT_OTHER R 12 (3F15.8) -0.93969256 0.34202012 0.00000000 0.34202012 0.93969256 0.00000000 0.00000000 0.00000000 -0.99999994 42.58096695 -67.67810822 32.92997742 ... ! ! Solution # 100 with CC = 9.7307660E-02 .LSQ_RT_OTHER R 12 (3F15.8) -0.82453328 0.48209068 0.29619813 0.48209068 0.32453328 0.81379765 0.29619813 0.81379765 -0.49999994 22.06101608 -63.70230103 -48.80622864 ! ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE ----- EXAMPLE -----
None, at present.