Dependencies
============
RSRef itself provides the following capabilities:
    1) Comparison of map to model: statistics, residual calculation...
    2) Refinement of atomic positions, B-factors and occupancies, optionally
       constrained as rigid groups.
    3) Optimization of "imaging" parameters relevant to electron microscopy
       and, to a limited extent, crystallography: magnification, resolution,
       envelope function & overall B-factor.
B-factor restraints are offered, but stereochemical restraints are available
only through embedding RSRef in another program (CNS).

Functionality depends on the availability of other programs.
    CNS:   Provides stereochemical restraints, together with gradient descent
           or molecular dynamics optimizers in cartesian or torsion angle
           coordinate systems (http://cns-online.org/). For installation of
           RSRef, you will need the platform-specific CNS files, created
           during a CNS build. In principle, RSRef should be compatible with
           multiple versions of CNS, but it has been tested with CNS
           version 1.3.
    BSOFT: Adds the capability to read maps of many formats, not just
           Xplor/CNS (http://lsbr.niams.nih.gov/bsoft/).
The desired packages should be installed before RSRef.

RSRef has the following Python dependencies:
    Python: You will need a development package, not a minimal
           interpreter-only distribution.
    NumPy (http://numpy.scipy.org): needed for many parts of RSRef.
    SciPy (http://www.scipy.org): needed for optimization & statistics.
           (The dependence of statistics on SciPy could be removed if
           called for.) Version >= 0.12 is needed for full callback support
           from the optimizer. (The callback emulator provided is a poor
           substitute that does a poor job of calculating the impact of
           parameters.)
    cmd2:  A widely used extension to the standard cmd module.

Installing NumPy & SciPy from source is straightforward in principle, but
laborious due to a lengthy set of dependencies. However, many modern Linux
distributions ship with pre-installed binaries.
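The Python dependencies listed above can be checked before running setup.py.
The following is a minimal sketch and is not part of RSRef: the names
REQUIRED, version_tuple and check are illustrative, only the SciPy >= 0.12
floor is taken from the text above, and it assumes that each package exposes
a __version__ attribute.

```python
import importlib

# Only the SciPy floor (0.12) comes from the dependency list above;
# None means that any installed version is accepted by this sketch.
REQUIRED = {"numpy": None, "scipy": (0, 12), "cmd2": None}


def version_tuple(text):
    """Convert '0.12.0' or '1.8.0rc1' into a tuple of leading integers."""
    parts = []
    for field in text.split("."):
        digits = ""
        for ch in field:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check(name, minimum=None):
    """Return (ok, message) describing whether module `name` is usable."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return False, "%s: not installed" % name
    found = getattr(module, "__version__", None)
    if minimum is None or found is None:
        return True, "%s: found (version %s)" % (name, found or "unknown")
    ok = version_tuple(found) >= minimum
    wanted = ".".join(str(n) for n in minimum)
    state = "ok" if ok else "too old"
    return ok, "%s: version %s, need >= %s (%s)" % (name, found, wanted, state)


if __name__ == "__main__":
    for name in sorted(REQUIRED):
        print(check(name, REQUIRED[name])[1])
```

Run it with the same python that will later run setup.py, since a different
interpreter may see a different set of installed packages.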
As an alternative to distribution packages or a source build, consider the
(64-bit?) Enthought (http://www.enthought.com) distribution, which integrates
these and many other packages as a pre-built binary. It requires minor changes
to RSRef's installation, but setup.py will take you through this. (Because
Enthought EPD 2.7.3-2 lags behind the SciPy 0.12 update (see above), you may
have to install SciPy 0.12 on top of EPD after removing the packaged version.
SciPy is then installed from source after installing the development versions
of LAPACK, BLAS and ATLAS.)

RSRef Installation
==================
(Unix is described first. Adaptations for Windows are described in a
subsequent section.)

In the following, "X.Y.Z" is used to denote the version, e.g. 0.3.6.

The distribution on the web site is encrypted. Once we have received a license
agreement, a password will be sent, whereupon the file can be decrypted using
openssl with a command like:
    openssl enc -d -blowfish -in encrypted-rsref-X.Y.Z.tar.gz -out rsref-X.Y.Z.tar.gz
    enter bf-cbc decryption password:

The distribution file, rsref-X.Y.Z.tar.gz, has been prepared using distutils:
setup.py sdist. Copy it into the parent directory of where the RSRef root
(rsref-X.Y.Z/) will be created by unpacking the distribution file:
    tar -xvzf rsref-X.Y.Z.tar.gz

Most of setup's build and install options should be available, but the
default installation under pythonM.N/site-packages is not recommended.
Instead, it is recommended that RSRef be installed "in place" under the
rsref-X.Y.Z directory by specifying the --home=. option. Consider whether
rsref-X.Y.Z is in the best location. If it is moved later, CNS will require
the environment variable LD_LIBRARY_PATH to be set.

It is very important that the installation setup be run using the same
version of python that will be used to run (CNS) RSRef. If 'which python' does
not point to the appropriate version, then (preferably) change your $PATH
environment variable or set an alias.
For example, if you wished to use an Enthought (EPD) python distribution with
the bash shell, you might want something like the following in your .bashrc:
    PATH=/progs/epd-7.2.2-3/bin:$PATH
If this sets up a conflict for other program packages, you may instead
substitute the explicit path in the commands below, but then you will also
have to do this every time that you run RSRef.

Within the rsref-X.Y.Z directory, setup.py is run from the shell script
described below. The individual command has the following usage, but is
usually run within a script due to the complex input / environment required:
    python setup.py [local-options] [distutils-ACTIONS] [distutils-options]

It is recommended that the following example script (eg_install.sh) be edited
for local compatibility and then run as an installation shell script.

    #!/bin/bash
    # usage: ./eg_install.sh [-B] [-C]
    # options:
    #   -B  to save CNS build files, see/edit CNS_BUILD below.
    #       (Needed only for debugging / development.)
    #   -C  to force re-make of CNS if prior CNS build exists.
    #       (Needed w/ -B to repeat prior (failed) installation.)
    #
    # Change path below to RSRef home directory:
    INST_HOME=/home/chapman/Work/Refinement/TestInst/rsref-0.3.4
    export PYTHONPATH=$INST_HOME/src:$PYTHONPATH
    echo "Default answers to interactive setup script should suffice except when"
    echo "asked for the destination of the rebuilt CNS, enter the absolute path of:"
    echo "${INST_HOME}/bin"
    echo "-------------------------------------------------------------------------"
    for ARG in "$@"; do
        echo "$ARG"
        if [ "$ARG" = "-B" ]; then
            # Change following path to suitable scratch area.
            # ($HOME is used because ~ is not expanded after "--cns_build=".)
            CNS_BUILD=--cns_build=$HOME/Temp/Cns_for_RSRef
            echo "Building CNS in $CNS_BUILD. (Use -C to force rebuild.)"
        elif [ "$ARG" = "-C" ]; then
            FORCE_REBUILD=-C
        fi
    done
    python -u setup.py --bsoft=/usr/local/bsoft \
        --lib_dir=${INST_HOME}/lib \
        --cns=/progs/cns_solve_1.3 $FORCE_REBUILD $CNS_BUILD -v \
        install --home=${INST_HOME}
    exit 0

Most of the local options indicate the locations of libraries that RSRef will
need. Help on all options is available through:
    python setup.py --help
It is also possible to put the modified CNS back into the CNS bin directory,
as non-RSRef functionality is not changed.

User privileges will suffice for a local installation with --home=., but
administrator privileges would likely be required for installation in (for
example) /usr/local or pythonM.N/site-packages (not recommended), or to update
the CNS installation.

The cns_build option allows the CNS build files to be retained for quick
rebuilds, but usually one would not specify the option, letting setup use
temporary space that will be cleaned up automatically.

Validation:
-----------
(After setting the environment appropriately; see "Environment" below.)
A quick minimal test refinement can be run with run_rsref.sh. The script
comments on the results expected. More extensive examples are noted in
"Getting started" below.

Additionally, a panel of module unit tests is available in
${RSREF_HOME}/src/test. (For these unit tests, ${RSREF_HOME}/src should be in
$PYTHONPATH, using a command like (for bash):
    PYTHONPATH=${PYTHONPATH}:${RSREF_HOME}/src
.) In this directory, the unit-test modules can be run individually, or python
can be asked to find/run all test modules with the command:
    python -m unittest discover -v

Windows:
--------
Stand-alone functionality is supported. BSoft is not available for Windows, so
only CNS/Xplor maps are supported. CNS is available, and if there is strong
motivation, I can consider procuring the compilers to support the RSRef
extension.

The distribution file, rsref-X.Y.Z.zip, has been prepared by distutils setup
sdist.
Extract the files into the same folder containing the .zip file (likely one
folder above the default). In a Command Prompt window, descend into
rsref-X.Y.Z and install:
    setup.py --lib_dir=%RSREF_HOME%\lib install --home=.
where you substitute the full path to your current folder for %RSREF_HOME%.

Validation: A command-line equivalent to run_rsref.sh would be:
    set DATA=%RSREF_HOME%\src\test
    rsref.py --map=%DATA%\alanine-2.0.xplor --high_resolution=2.0 ^
        --pdb_in=%DATA%\alanine.pdb ^
        --by_resolution --map_use=0.9 --atom_extent=2.5 evaluate stop
(Which could be put in a batch .bat file...)

Environment:
============
Unix:
-----
Something like the following will be needed (in bash):
    export PATH=${RSREF_HOME}/bin:${PATH}
    export PYTHONPATH=${RSREF_HOME}/bin:${RSREF_HOME}/lib/python:${PYTHONPATH}
where you will substitute your value of ${RSREF_HOME}.

The use of CNS input files may depend on CNS environment parameters that can
be set individually as needed, or in csh with:
    source cns_solve_env

Windows:
--------
Windows environment variables are set in a similar way (Command Prompt
window):
    set PATH=C:%RSREF_HOME%\bin;%PATH%
    set PYTHONPATH=%RSREF_HOME%\lib\python;%PYTHONPATH%
or they can be set permanently in the autoexec.bat file using msconfig.

Documentation:
==============
On-line documentation has been prepared from the source using epydoc. The
entry point is ${RSREF_HOME}/doc/index.html for general and API documentation,
but options and command summaries are documented separately:
    rsref.py -h       - command-line options;
    rsref (cmd) help  - commands, i.e. run rsref.py, then "help" at the prompt.

Getting started:
================
Examples:
---------
A suite of examples needs to be assembled. ${RSREF_HOME}/examples contains
sub-directories with data, input, output structures and log files. See
README.txt in each sub-directory for details:
    virus-antibody: exemplifies rigid-body domain-level refinement of a
                    complex.
    virus-5A:       exemplifies flexible fitting to a 4.5 A EM reconstruction.
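
Before working through the examples, it can help to confirm that the settings
described under "Environment" are in place. The following is a minimal sketch
and is not part of RSRef: check_environment is a hypothetical helper, it
checks the Unix settings shown above (bin on PATH; bin and lib/python on
PYTHONPATH), and it assumes that RSREF_HOME is exported as an environment
variable rather than substituted by hand.

```python
import os


def check_environment(rsref_home, environ=None):
    """Return (variable, path) pairs that are expected but not present."""
    environ = os.environ if environ is None else environ
    expected = {
        "PATH": [os.path.join(rsref_home, "bin")],
        "PYTHONPATH": [os.path.join(rsref_home, "bin"),
                       os.path.join(rsref_home, "lib", "python")],
    }
    missing = []
    for variable, wanted in expected.items():
        # Split on the platform separator (":" on Unix, ";" on Windows).
        present = [os.path.normpath(p)
                   for p in environ.get(variable, "").split(os.pathsep) if p]
        for path in wanted:
            if os.path.normpath(path) not in present:
                missing.append((variable, path))
    return missing


if __name__ == "__main__":
    home = os.environ.get("RSREF_HOME")
    if home is None:
        print("RSREF_HOME is not set; pass the RSRef root explicitly.")
    else:
        problems = check_environment(home)
        for variable, path in problems:
            print("%s lacks %s" % (variable, path))
        if not problems:
            print("PATH and PYTHONPATH look consistent with RSREF_HOME.")
```

On Windows, the settings above put only lib\python on PYTHONPATH, so a report
that bin is missing from PYTHONPATH can be ignored there.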
Version History:
================
0.3.0 Completely new Python implementation (11/18/11):
-----------------------------------------------------
    Brings together the major features of several past branches that will no
        longer be supported. (Version numbers restarted.)
    Drops (for now) GUIs / macro extensions for O molecular graphics.
    Imaging parameters can now be refined, not just set.
    Low-pass filtering / resolution-limit refinement replace prior attempts
        to model EM CTF parameters explicitly.
    Corrects bug that omitted stereochemical terms from CNS partial
        derivatives.
    Additional checks (e.g. atoms remain w/in map) improve robustness.
    Speed greatly improved by retaining map & other attributes in-memory
        throughout refinement session, even when interfacing w/ external
        program.
    Supports latest versions of CNS (1.3 tested).
    Multiple map formats supported (CNS, XPlor, CCP4, MRC, SPI, DSN6...).
    Multiple form-factor sets for X-ray, neutron, EM.
    Calculates contributions from non-crystallographic (molecular) symmetry
        equivalents.
0.3.1 Crystallographic (lattice) symmetry supported (December 2011)
0.3.2 Restrained refinement of individual B-factors (12/15/11)
0.3.3 Atomic density caching debugged (3/5/12):
    Previously, could have used prior calculation with same B, but different
        atom type (overall, minor impact).
    Caching disabled when B-factors, Envelope & Resolution refined; improves
        robustness of image and atom refinement.
0.3.4 Distribution de-bugging (8/23/12):
    Minor changes as a result of testing installation scripts.
    Documentation edit.
    Under-hood foundations for rigid-group refinement, still being developed.
0.3.5 Rigid-group refinement (Sept 2012):
    Input for commands refine and image_refine changed; no backward
        compatibility.
    Internally scaled units for different parameter-types improve
        convergence.
    Development of examples folders.
0.3.6 (Oct 2012):
    Compound expressions within Selections.
    Under-hood foundations for torsion angle refinement, still under
        development.
    Further examples.
0.3.9 (Apr 2013)
    Mostly under-the-hood consolidated support for Superpose, marginally
        affecting RSRef.
    Default is now not to normalize input map. Use -N or --normalize to
        replicate prior results.
    Analyze is moved from an option of refine to a stand-alone command.
    Default action of refine (no parameters specified) is now to continue a
        prior refinement, not to choose default parameters. (Note comma
        signifies absence of parameters.)
0.4.0 (July 2013) Input / scripts will need updating...
    Upgraded from optparse to argparse:
        Options deprecated: --by_resolution
            by_resolution used to change the meaning of (atom_extent,
            map_use & map_require). Now use (relative_extent, relative_use
            & require_relative).
        Internals refactored as option attributes (command line settable).
        Other new arguments:
    Upgraded from cmd to cmd2nest, an extension to the cmd2 command
    interpreter.
        Command options become more standard (unix-like). (Commands "refine"
            & "superpose" still need to migrate.)
        Adds help options, I/O redirection, history, environment...
        Replaces exec, EXEC, echo, value with py python interpreter.
        Exceptions trapped for 2nd chance before termination. ("set debug
            True" for full traceback.)
        "help" gives additional information.
    Upgraded to bsoft version 1.8; older versions not supported.
    Temporarily unavailable pending upgrade to user-interfaces:
        superpose.py, torsion.py, (test_superpose.py; test_torsion.py).
    New: symexp.py - a user interface to module symmetry, which no longer has
        stand-alone functionality. This debuts a just-in-time task manager
        that will be deployed in other top-level modules.
    Replaced: run_symmetry.sh by run_symexp.sh.
0.4.1 (July 2013)
    RSRef switched to just-in-time task manager. Satisfies pre-requisites
        only as needed. Should reduce run-time errors.
    Specifically, in rsref.py, replaces RSRef() with Tasks(), then adds a
        binding RSRef = Tasks for backward compatibility.
0.4.2 (October 2013)
    User-control of refinement parameterization redone, simpler (not backward
        compatible). Changes to input files will be required.
    AtomicParameterization properties used to limit calculation of partials,
        improving efficiency.
    Torsion angle refinement (pre-alpha, awaits real-example testing).
    Above requires changes to pre-alpha superpose.py & test_torsion.py, which
        are not running in this version. (Does not affect refinement.)

Ongoing issues:
===============
SciPy bug in some Windows versions:
-----------------------------------
Enthought (Windows 64 bit) bug in SciPy 0.8.0 through 0.10.0 (at least, still
a problem 3/4/12) causes optimize.fmin_l_bfgs_b (refinement) to hang. This
likely started with EPD version 7.0. Googling indicates that it can be
circumvented by installing SciPy independently (with Python, NumPy) or using
an older version of EPD. This bug is known to others, and is not a problem for
Linux EPD distributions. Status of Windows 32 bit is unknown.

OpenMP Multi-Processing on unix platforms:
------------------------------------------
Python's global interpreter lock precludes parallel calculations through
multi-threading, except with C extensions. A user's NumPy or SciPy may have
been compiled with OpenMP, in which case the array processing will grab half
of the total available threads, but with the overhead of short-lived threads,
there is almost no improvement in time. Not all NumPy / SciPy installs are
compiled with OpenMP, so first check whether RSRef is using more than 100% of
a CPU (top or pidstat). If so, RSRef will be friendlier to other processes if
the number of threads / process is limited by setting the environment
variable OMP_NUM_THREADS=1. When embedded, the CNS part can make effective use
of multi-threading, so you might want to allow a larger number of threads,
balancing the gain in CNS with the waste in RSRef.

Credits:
========
Michael S. Chapman with help from Andrew Trzynka & Brynmor K. Chapman
Last updated: 07/05/13; chapmami@ohsu.edu