RSRef: Real space refinement of molecular structure into density maps

Section author: Michael S. Chapman <chapmami@ohsu.edu>

Authors:

Michael S. Chapman <chapmami@ohsu.edu>, Andrew Trzynka <trzynkaa@ohsu.edu>, Brynmor K. Chapman & Leo Selker,

Oregon Health & Science University

Version:

0.5, March 23, 2016

Usage:

python -u -m rsref [-option(s)] <refining.pdb>

Synopsis

Compares / refines an atomic model to an electron density or Coulombic potential map.

Supports a wide range of resolution regimes from low resolution electron microscopic reconstructions to high resolution x-ray crystallography.

The package serves dual purposes:

Standalone program:

  • Model / density comparison & statistics.
  • Refinements without full stereochemical restraints: rigid-group, torsion angle, etc.
  • Restrained refinement of atomic B-factors.
  • Optimization of image parameters by agreement with an atomic model.
  • Supports symmetry: crystal lattice and molecular (non-crystallographic).

Library or Application Programming Interface (API):

  • To support extension of other refinement programs (e.g. CNS), thereby combining real-space fitting with stereochemical restraints and additional optimizers such as simulated annealing.
  • To develop new strategies of refinement by combining map-fitting with functionalities being developed in other libraries (e.g. CCTBX).

Sources of Documentation

All of the following should be referenced:

Program options

Brief explanations of the arguments for stand-alone runs are given with rsref.py -h; further information is given below.

When called as a library or embedded, the programmer has the option of supporting the relevant subset of command-line options (as in our modified cns), or handling program control independently. Absent superseding documentation from the calling program, rsref’s command-line documentation offers a first approximation.

Commands

Within the program (RSRef> prompt), available commands are listed with help -a; help about command is provided by help command, and for all commands by help -rl. The help text is reproduced later in this document.

When embedded, the calling program is responsible for program control, so commands, syntax, options and access to help will be different. For our CNS extension, we refer back to this document from the documentation for cns-rsref.

Todo

add link from cns-rsref to its documentation.

Examples
The examples directory in the installation contains scripts, data files, results and log files. Explanations are in the README.txt file in each examples subdirectory.
Details, API

Details are encoded within docstrings that are accessible to programmers using Integrated Development Environments (IDEs). They are also compiled with sphinx into html files, linked from the module index on the home page. This is the searchable, cross-linked (API) reference documentation that explains the meaning of parameters, the performance of different functions, etc.

The documentation is accessed from (index.html) on-line or in the distribution directory doc/html. (Additional formats can be generated with sphinx.)

Concepts

Image Refinement

In real-space refinement, we optimize an atomic model by maximizing the agreement with a 3D image. Mostly, this involves adjustment of the parameters of the atomic model. However, the agreement also depends on how the molecule is rendered in the imaging experiment, and how this is accounted for in calculating the image expected of the object.

In Crystallography, the image depends on the resolution limits, and on the overall B-factor which accounts, in part, for individual atomic motions, but also for the average effects of lattice disorder, radiation damage etc. which could be termed loosely experimental or instrumental attenuations. In real-space, one could also add in the effects of phase errors which tend to become progressively worse, attenuating the high resolution signal.

In Electron Microscopy, the image depends even more on additional instrumental parameters: magnification, contrast transfer function (CTF) and a group of factors such as beam coherence, specimen stability etc. that are collectively described by an envelope function. There may be further attenuation of the signal as a function of resolution due to additional physical limitations of the instrument or the image averaging, alignment etc. that are part of the computer processing of the 3D reconstruction. The average effects of many of these are often approximated by a low-pass filter. Experimenters often apply approximate corrections for some or all of the effects - current practices vary widely.

RSRef provides the wherewithal to apply further corrections to the 3D image calculated from the atomic model. The rationale is that the best atomic parameters will be obtained in a refinement where the discrepancies between calculated and experimental images have been minimized. Thus, RSRef also provides the means to refine the parameters of the empirical correction functions to maximize the agreement between calculated and experimental maps as the atomic model improves. Nevertheless, there are important limitations on these image corrections as discussed below. Generally, then, one will want to apply best-estimate corrections during the 3D reconstruction, and then use RSRef’s corrections to reduce the residual discrepancy when compared to an atomic model.

In RSRef, parameters for image corrections are determined from comparison of the 3D map with the atomic model. This is fundamentally different from the algorithms used in EM reconstruction, and there are both advantages and disadvantages. The primary disadvantage is that corrections such as inverse-CTF should be applied to individual 2-D images, but RSRef is working only with the 3D reconstruction in which the 2D images have been integrated. The primary advantage is that the atomic model is a partially-independent external standard that can reveal systematic errors that are otherwise difficult to characterize. It is possible that RSRef could highlight changes to parameters for which it might be worth repeating the reconstruction.

The current version of RSRef no longer attempts to correct explicitly for the “average” 3D effect of systematic errors in CTF parameters applied to 2D images, or Wiener filters. Instead, it focuses on the parameters of simpler corrections that most affect comparison of 3D map to atomic model:

  • Magnification.
  • Envelope correction.
  • Additional attenuation / low-pass filters (approximating the average effects of Wiener filters, beam incoherence, etc.)

Purists might argue that atomic models should be refined against an uncorrected map, so that least-squares can provide the solution of least error. Statistics are usually very flattering when refinement is performed against a map where the high resolution signal is attenuated! However, experience is that such maps are often devoid of the detail needed to perform a high quality refinement. Furthermore, at high resolution, much of the attenuation comes from a highly predictable detector-based Gaussian attenuation. This has led to the popularity of empirical corrections, such as that encoded by EMBfactor, to sharpen the reconstructed map. Statistics will always be worse, because sharpening will always increase high resolution noise as well as signal (but the structure might nevertheless be more accurate).

That said, the program is neutral to prior processing. No assumptions are made about what corrections have already been applied. The philosophy is to apply incremental corrections using similar functional forms to corrections in widespread use in Electron Microscopy. By applying incremental corrections for agreement between atomic model and images (maps) we hope to make up for previous under- or over-corrections during the data processing.

The corrections can be pre-set and/or least-squares refined in RSRef. For EM attenuation, RSRef supports Gaussian Envelopes; Butterworth or Gaussian low-pass filters, in addition to relative magnification. Users can either use 3rd party software to apply corrections with the refined parameters to their EM reconstructions, or they can continue to apply inverse corrections to the calculated model density through RSRef.

All corrections applied are isotropic, without directional dependence. Magnification corrections are applied to the map, but other corrections change the density calculated from the atomic model so that it more closely represents the image that would be generated from the EM/x-ray instrumentation.

Magnification

The parameter is defined, for this program, as the change in the reported microscope magnification. Thus, if images had been collected at a nominal 50,000 x, a magnification of 1.01 would be appropriate if the best agreement with the model were obtained at an actual magnification of 50,500 x. (This follows the convention for relative magnification in BSoft, but is the inverse of the Scale parameter in EMAN.)

Once refined, a magnification correction can be applied at program startup with the -X argument. Alternatively, a corrected map can be output by RSRef map --observed or independently by dividing the grid separation (APIX) and origin by the magnification. If this is done, no further correction is needed in RSRef, and the model will superimpose on the map in molecular graphics programs.
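
As a sketch of the arithmetic just described (this is not RSRef code; the function and argument names are hypothetical), a refined magnification could be applied to a map header by dividing the grid separation and origin by it:

```python
def apply_magnification(apix, origin, magnification):
    """Return (apix, origin) rescaled for a relative magnification.

    apix: grid separation in Angstrom per voxel (hypothetical name).
    origin: map origin as an (x, y, z) tuple in the same length units.
    magnification: relative magnification, e.g. 1.01 if a nominal
        50,000 x was really 50,500 x.
    """
    if magnification <= 0:
        raise ValueError("magnification must be positive")
    new_apix = apix / magnification
    new_origin = tuple(o / magnification for o in origin)
    return new_apix, new_origin


# Made-up header values for illustration.
new_apix, new_origin = apply_magnification(1.20, (-60.0, -60.0, -60.0), 1.01)
print(round(new_apix, 4), [round(o, 3) for o in new_origin])
```

How the header is read and written depends on the map format and software in use, which is outside the scope of this sketch.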

(For crystallographic data, magnification is equivalent to an isotropic reduction in unit cell lengths, and is likely not very useful!)

Note that an increase in magnification decreases the number of grid points within a radius of each atom. Thus, there is a systematic decrease in least-squares residual which may spoil refinement. This can be mitigated in two ways. One is to use the --normalize option of RSRef to correct for the changing voxel size. Alternatively, for searches (but not refinement), one can optimize the correlation coefficient instead of the scale-dependent residual.
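
The following toy example illustrates the point with plain Python lists. It is only a sketch: the statistics that RSRef computes, and what --normalize does internally, are not reproduced here, and the per-point residual is just one simple normalization chosen for illustration.

```python
import math


def residual(obs, calc):
    """Sum of squared differences; grows with the number of grid points."""
    return sum((o - c) ** 2 for o, c in zip(obs, calc))


def mean_residual(obs, calc):
    """Residual per grid point: one simple normalization (an assumption)."""
    return residual(obs, calc) / len(obs)


def correlation(obs, calc):
    """Pearson correlation coefficient between two density samples."""
    n = len(obs)
    mo, mc = sum(obs) / n, sum(calc) / n
    cov = sum((o - mo) * (c - mc) for o, c in zip(obs, calc))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_c = sum((c - mc) ** 2 for c in calc)
    return cov / math.sqrt(var_o * var_c)


obs = [0.1, 0.4, 0.9, 0.5, 0.2, 0.0]   # made-up observed densities
calc = [0.0, 0.5, 1.0, 0.4, 0.1, 0.1]  # made-up calculated densities

# Sampling the same region with half as many points lowers the raw residual.
print(residual(obs, calc), residual(obs[::2], calc[::2]))
# Rescaling the observed density leaves the correlation unchanged.
print(correlation(obs, calc), correlation([2 * o for o in obs], calc))
```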

Overall B or Envelope Function

The EM envelope correction and the crystallographic overall B-factor are exponential Gaussian attenuators added to the atomic B-factors and applied in reciprocal space when atomic densities are calculated from scattering factors. They are all co-variant and cannot all be refined at the same time. It is always somewhat arbitrary which components are included in the atomic B-factors, and which are factored out into an overall B-factor.

For crystallographic data, the overall B-factor provides compatibility with those reciprocal-space refinement programs that use Wilson scaling / overall B-factors. There is no need to modify individual atomic B-factors. The overall B-factor may also account for resolution-dependent average effects of phase error in a map. A user could also choose to use a low-pass filter for this purpose (see below).

In electron microscopy (EM), the envelope parameter accounts for instrumental point-spread effects such as beam coherence (described above) and detector based attenuations that may be related to sampling. It is identical in form to the overall B-factor, except that, following conventions prevalent in each field, they differ by a factor of 4. Overall B uses the crystallographic convention: f = f_o exp(-B s^2 / 4), whereas envelope uses the usual EM convention of EMAN (but not all EM software): f = f_o exp(-B s^2) [Saad-2001].

[Saad-2001] Saad, A., Ludtke, S.J., Jakana, J., Rixon, F.J., Tsuruta, H., and Chiu, W. (2001). Fourier amplitude decay of electron cryomicroscopic images of single particles and effects on structure determination. J Struct Biol 133, 32-42.

The overall B and EM envelope parameters are combined into a single exponential attenuator, so their separate input / refinement is provided only as a convenience.
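
A minimal sketch of how the two conventions combine, assuming that s is the reciprocal of the resolution (1/d, in inverse Angstrom). The function is hypothetical and is not taken from the RSRef source:

```python
import math


def attenuation(s, b_overall=0.0, envelope=0.0):
    """Combined Gaussian attenuation at spatial frequency s (1/Angstrom).

    b_overall uses the crystallographic convention exp(-B s^2 / 4);
    envelope uses the EMAN-style convention exp(-B s^2).
    Both are folded into one equivalent crystallographic B.
    """
    b_equivalent = b_overall + 4.0 * envelope
    return math.exp(-b_equivalent * s * s / 4.0)


s = 1.0 / 4.0  # spatial frequency at 4 Angstrom resolution (assumes s = 1/d)
# An overall B of 80 and an envelope of 20 give the same attenuation.
print(attenuation(s, b_overall=80.0), attenuation(s, envelope=20.0))
```

Because only the sum enters the exponent, an envelope value of E has the same effect as an overall B of 4E, which is why the two cannot be refined independently.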

Only isotropic corrections are supported. (Anisotropic corrections are not possible within the density calculation algorithms used.)

Contrast Transfer Function (CTF)

Additional EM corrections beyond the envelope function are no longer supported, due to 2 challenges:

  • They can deviate significantly from spherical symmetry (spherical aberration etc.), and it is computationally tractable only to apply symmetric corrections with the density calculation algorithms in use.
  • CTF corrections applied to individual images during reconstruction are not really applicable to the whole map that we work with in refinement.

Earlier attempts to account for (merely) the spherically symmetric effects of just systematic errors in CTF correction have been deprecated. They were neither particularly successful nor easily rationalized. The spherically symmetric effects of non-optimal CTF corrections are now accounted for with low-pass filtering of the model density.

Low-pass filter / Resolution

The recommended way to account for further attenuation is a Butterworth low pass filter that is, by default, 5th order, as in the EM software, Spider. The following section describes how this can be used to refine the effective (soft) resolution limit, which may be easier to estimate by comparison to an atomic model than by other means.
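
For orientation, the textbook Butterworth amplitude response is sketched below. This is an assumption about the functional form: Spider parameterizes its Butterworth filter with pass-band and stop-band frequencies, and the exact parameterization used by RSRef is not given in this document.

```python
import math


def butterworth(s, s_cutoff, order=5):
    """Textbook Butterworth low-pass amplitude response.

    s: spatial frequency (1/Angstrom); s_cutoff: frequency at which the
    amplitude has fallen to 1/sqrt(2); order: filter order (5 here,
    matching the default order quoted above).
    """
    return 1.0 / math.sqrt(1.0 + (s / s_cutoff) ** (2 * order))


s_cutoff = 1.0 / 8.0  # hypothetical soft resolution limit of 8 Angstrom
for d in (20.0, 10.0, 8.0, 6.0):
    print(d, round(butterworth(1.0 / d, s_cutoff), 4))
```

A higher order gives a sharper transition around the cutoff; a lower order attenuates more gradually.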

Some EM packages, notably EMAN, support a Gaussian attenuation. A Gaussian attenuation could also be applied in RSRef by increasing the overall B-factor or envelope constant (see above), but there is currently no support for calculating the additional B-factor corresponding to a desired soft resolution limit.

Image Parameter Refinement with image_refine

While imaging parameters could, in principle, be refined jointly with atomic parameters, a separate routine, image_refine is provided, for greater efficiency and to ensure that imaging parameters are refined using the full model and not the subset of atoms that might have been selected for a batch of local model refinement.

The convergence radius of image refinement is finite, particularly when multiple parameters are being refined. There are several reasons:

  • An increase in magnification decreases the number of grid points within the molecular envelope, systematically biasing the least-squares residual downwards. This can be mitigated in part with the --normalize command-line option. However, refinement may still not be possible, especially if the anticipated change is substantial. A search for the maximal correlation coefficient may be more appropriate.

    • A search can be done by alternating the following commands with different values for the magnification: evaluate; py my.map_calc.magnify(magnification=0.98, atoms=my.atoms); evaluate.
  • The effects of changing B-factor / envelope may be similar to changing the filter resolution, particularly with a low resolution map. Thus the parameters may be nearly co-linear and the refinement ill-conditioned.

  • The filter resolution may have little impact if its value is less than or nearly equal to the hard resolution limit.

  • The gradient vector may be so dominated by some parameters that others have negligible effect and will appear frozen.

  • Shift magnitudes that must be guessed on the first cycle may move some parameters beyond reasonable ranges from which refinement can recover.

  • Analytical derivatives are determined by summing the effects of imaging parameters on the density of each atom only within the map_use radius (see below). They therefore underestimate some of the long range effects.

These problems tend to get worse at low resolution, but their impact also depends on the size of the refinement and therefore the number of grid points that are contributing.

The same types of problems could afflict both imaging and model refinements, but it is imaging refinement that is usually more challenging, due to the mix of parameter types.
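
The magnification search mentioned in the list above can be scripted. The sketch below only generates the command lines quoted there for a range of trial values. Whether successive magnify calls are absolute or cumulative is not stated in this document, so check the behaviour before relying on such a scan.

```python
def magnification_scan(values):
    """Return RSRef command lines that evaluate the fit at each magnification."""
    lines = ["evaluate"]  # baseline statistics before any change
    for m in values:
        lines.append(
            f"py my.map_calc.magnify(magnification={m:.3f}, atoms=my.atoms)"
        )
        lines.append("evaluate")
    return lines


values = [0.97 + 0.01 * i for i in range(7)]  # trial values 0.97 to 1.03
script = "\n".join(magnification_scan(values)) + "\n"
print(script)
```

The resulting text could be saved to a file and run with the @FILE (load) command described later in this document.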

There are several ways that the above effects can be mitigated, but they depend somewhat on the size, stage and resolution of the refinement. Thus program defaults may need some adjustment:

Units

The mathematics of least-squares optimization implicitly assumes that all refining parameters have the same units. Clearly atomic positions, B-factors and group rotations do not! Physical units are converted to internal units for refinement. Ideally, the scaling should give components of the gradient vector that:

  • Are in rough proportion to the effect of typical errors in this parameter type on the (fitting) objective function.
  • Are within an order of magnitude (or two) for different parameter types, so that refinement does not ignore some parameter types.
  • Do not take parameters out of reasonable ranges on the 1st cycle.

RSRef prints average values for gradient components. Unit scaling is changed by adjusting the units_per_* command-line arguments.
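
The effect of unit scaling on the gradient follows from the chain rule. The sketch below assumes that a units_per value means "physical units per internal unit"; that reading, the function names and all numbers are assumptions made for illustration.

```python
def to_internal(values, units_per):
    """Convert physical parameter values to internal (scaled) units."""
    return [v / u for v, u in zip(values, units_per)]


def gradient_to_internal(gradient, units_per):
    """Chain rule: dF/dp_internal = dF/dp_physical * units_per."""
    return [g * u for g, u in zip(gradient, units_per)]


# Hypothetical parameters: a position (Angstrom), a B-factor (Angstrom^2)
# and a rotation (degrees), with made-up gradients and scale factors.
gradient = [50.0, 0.2, 4.0]
units_per = [0.1, 20.0, 1.0]
print(gradient_to_internal(gradient, units_per))
```

With these made-up scale factors, gradient components that differed by more than two orders of magnitude in physical units become comparable in internal units, which is the stated aim of the scaling.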

Limits
The option for constraining parameters is internally set with limits that are, by default, +/- 5% in magnification & +/- 40% in resolution. Convergence may be improved with narrower limits, set with the --limit* command-line arguments.
Finite differences
For image refinement (not atomic refinement) a finite_difference option is available to switch to ~5 x slower numerical derivatives. This avoids some inaccuracies of approximation (above) and can help particularly at low resolution.
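
Both ideas can be sketched generically. The code below is not RSRef's implementation: it shows central finite differences on a toy objective, and a clamp that keeps a parameter within a fractional limit of its starting value (0.05 and 0.40 mirror the default limits quoted above).

```python
def central_difference_gradient(func, params, steps):
    """Numerical gradient of func at params by central differences."""
    gradient = []
    for i, h in enumerate(steps):
        up = list(params)
        down = list(params)
        up[i] += h
        down[i] -= h
        # Two extra function evaluations per refined parameter.
        gradient.append((func(up) - func(down)) / (2.0 * h))
    return gradient


def clamp(value, start, fraction):
    """Keep value within +/- fraction of its starting value."""
    low, high = start * (1.0 - fraction), start * (1.0 + fraction)
    return max(low, min(high, value))


# Toy objective with a minimum at magnification 1.01, resolution 6.0.
def toy_residual(p):
    return (p[0] - 1.01) ** 2 * 1000.0 + (p[1] - 6.0) ** 2


print(central_difference_gradient(toy_residual, [1.0, 5.0], [1e-4, 1e-3]))
print(clamp(1.08, 1.0, 0.05), clamp(2.5, 5.0, 0.40))
```

Numerical derivatives cost extra evaluations of the objective for every parameter, which is consistent with the slower run time noted above.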

With the image_refine routine, it is anticipated that image parameters will be refined at the start, and occasionally as the model improves.

Model parameterization

RSRef provides several modes with which model parameters can be individually or collectively optimized in stand-alone mode. These complement, rather than replicate, those available through embedded use in other programs.

Individual atom

Without restraints, this is of only limited use, perhaps for solvents and counterions, and requires fairly high resolution.

Group

Groups can be arbitrarily defined, for example as domains or amino acids. Domain refinement might be appropriate for electron microscopy at resolutions worse than about 5 Å.

Groups are implemented to apply a consistent change to the component atomic parameters. Unlike many other programs, they are not required to share the same initial properties (although they can optionally be set to be so). Thus, one can refine an additional B-factor for a domain while keeping the underlying variation in individual atomic B-factors that might have come from a high resolution structure.
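
As a toy illustration of a group B-factor (hypothetical function and made-up values, not RSRef code), a single shared shift is added to every atom, so the atom-to-atom variation is preserved:

```python
def add_group_b(atom_b_factors, group_shift):
    """Apply one shared B-factor shift to every atom in a group."""
    return [b + group_shift for b in atom_b_factors]


domain_b = [12.5, 18.0, 25.3, 40.1]    # made-up per-atom B-factors
shifted = add_group_b(domain_b, 30.0)  # one refinable parameter for the group
print(shifted)
```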

Torsion

Currently implemented is optimization of protein φ and ψ angles only (with riding rigid side chains). This offers a reduced parameterization that can be appropriate for modeling conformational changes to well-characterized underlying structures.

To refine all φ, ψ (without stereochemical restraints) requires a resolution at which adjacent backbone chains can be resolved in the density. Limited experience indicates that this may be possible with a good 5 Å map, but not at 7 Å resolution. At resolutions lower than ~5 Å, it will be necessary to limit the variable dihedrals to those within domain linkers, hinges, etc. (Currently, these would have to be specified manually.) It is also expected that the addition of van der Waals restraints will help avoid strands overlapping at low resolution.

At high resolutions (about 2 Å), the lack of side chain refinement is likely to be limiting.

Limited experience shows excellent performance in morphing a high resolution structure from one conformational state into a map of a different conformational state at resolutions of 5 to 2.5 Å, with a convergence radius that can exceed 3 Å.

The torsion angle algorithm performs rotations that are rigid on either side of each dihedral. This is different from most implementations that window on short fragments of the structure (in turn), allowing structural deformations where it meets the rest of the (fixed) structure, deformations that are subsequently refined out through stereochemical restraints. The RSRef algorithm is best suited for large hinge rotations and shears of loops and domains, and thus complements the algorithms in Xplor and CNS that are better at local improvements more typically needed in a manually-built model. Thus, expect RSRef to perform better in a molecular replacement or conformational change situation, but other programs to better improve models that have been built from scratch.
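
The geometry of such a rotation can be sketched with Rodrigues' formula. This is a simplified illustration on an unbranched chain of points, with hypothetical function names; it is not the RSRef implementation and ignores side chains and symmetry.

```python
import math


def rotate_about_axis(point, a, b, angle_deg):
    """Rotate point about the axis through a -> b by angle_deg (Rodrigues)."""
    k = [b[i] - a[i] for i in range(3)]
    norm = math.sqrt(sum(c * c for c in k))
    k = [c / norm for c in k]                    # unit vector along the bond
    v = [point[i] - a[i] for i in range(3)]      # point relative to the axis
    cos_t = math.cos(math.radians(angle_deg))
    sin_t = math.sin(math.radians(angle_deg))
    cross = [k[1] * v[2] - k[2] * v[1],
             k[2] * v[0] - k[0] * v[2],
             k[0] * v[1] - k[1] * v[0]]
    dot = sum(k[i] * v[i] for i in range(3))
    return tuple(a[i] + v[i] * cos_t + cross[i] * sin_t
                 + k[i] * dot * (1.0 - cos_t) for i in range(3))


def apply_torsion(coords, bond_index, angle_deg):
    """Rotate every atom after the bond (bond_index, bond_index + 1) rigidly.

    coords is an ordered list of (x, y, z) along a chain; atoms up to and
    including bond_index + 1 stay fixed, all later atoms move together.
    """
    a, b = coords[bond_index], coords[bond_index + 1]
    fixed = coords[:bond_index + 2]
    moved = [rotate_about_axis(p, a, b, angle_deg)
             for p in coords[bond_index + 2:]]
    return fixed + moved


# Made-up coordinates for a four-atom chain.
chain = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.5), (1.4, 0.0, 2.0), (2.0, 1.2, 2.5)]
print(apply_torsion(chain, 0, 90.0))
```

Everything downstream of the rotated bond moves as one rigid body, so internal distances on each side are unchanged; only the relative orientation of the two sides changes.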

Command-line options

The most up-to-date documentation is generated from python -u -m rsref -h:

Deviations for RSRef embedded in other programs

Currently, only a CNS-embedded version has been implemented, but it is expected that other wraps would be very similar.

CNS-embedded version

RSRef control is exercised through command-line parameters that are passed through CNS. CNS does not use command-line parameters, so conflicts on input are not expected.

Many options (and option categories) are not relevant to the embedded functionality and are ignored without warning. Some of these may be similar to CNS parameters, which take precedence. There are a few cases where similar input to both RSRef and CNS is expected; these are likely points of confusion:

Input model parameters
The coordinates are read from the file specified in the CNS input file, not through RSRef specification. b_overall is the only model-based argument that is applied (so that an EM envelope correction can be applied in addition to previously determined crystallographic B-factors). An equivalent of max_shift is available through CNS.
Ignored arguments
help, version, cmd_file, also all arguments in the groups Model parameterization and Image refinement. Image refinement can only be run in stand-alone mode, but the resulting parameters can be input to a cns run.
Density calculation & comparison
These arguments should be supplied. Note that for density calculation, form_factors supplied here are used and not those that may be specified in the CNS input file.
Experiment & Map parameters
These are all needed. It is the map supplied here that will be used as the experimental (observed) input map. (CNS map parameters pertain to maps that are to be output.) Resolution limits entered into CNS are ignored by RSRef, which uses the command-line values for density calculation.
Symmetry

RSRef and the embedding CNS need consistent information provided in different formats. Apologies for the confusion likely in specifying the symmetry in different formats, but the needs are different. CNS needs two types of expansion: (a) to an exact asymmetric unit for Fourier transformations, noting that all atoms are required; (b) to all neighboring atoms for stereochemical restraints. CNS is designed to operate always in the presence of lattice symmetry.

By contrast RSRef can exploit efficiencies if refining only a local subset of atoms, but needs to consider potentially overlapping density from neighbors that might come from the same molecule or multiple neighbors, if there is symmetry or lattice periodicity. Note that coordinate output will be handled by CNS, so none of the options specifying which symmetry-related coordinates to write have any effect.

Input files

local_symmetry

A file that will be evaluated (in python) as a tuple of operators. Each operator is a tuple of name (str), rotation (tuple) and translation (tuple of 3 Angstrom floats). The rotation is the matrix specified as three row-tuples, each a tuple of 3 floats. The unit operator is implied and therefore optional. The example below specifies two additional symmetry equivalents:

(   ("p",    (
        (0.500000,       0.809000,      -0.309000),
        (-0.809000,       0.309000,      -0.500000),
        (-0.309000,       0.500000,       0.809000)
            ),
        (0.000000,       0.000000,       0.000000)),
    ("e",    (
        (0.309000,      -0.500000,       0.809000),
        (0.500000,       0.809000,       0.309000),
        (-0.809000,       0.309000,       0.500000)
            ),
       (0.000000,       0.000000,       0.000000)))
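
To check such a file before a run, its contents can be parsed and applied to a test coordinate. The sketch below uses ast.literal_eval as a safe stand-in for evaluation and assumes that an operator acts as rotation times coordinate plus translation; both choices, and the function names, are assumptions rather than a description of RSRef's reader.

```python
import ast


def read_operators(text):
    """Parse symmetry operators written as nested Python tuples."""
    operators = ast.literal_eval(text)
    for name, rotation, translation in operators:
        if len(rotation) != 3 or any(len(row) != 3 for row in rotation):
            raise ValueError(f"operator {name!r}: rotation must be 3x3")
        if len(translation) != 3:
            raise ValueError(f"operator {name!r}: translation needs 3 values")
    return operators


def apply_operator(operator, xyz):
    """Return rotation * xyz + translation for one operator."""
    _, rotation, translation = operator
    return tuple(sum(r * x for r, x in zip(row, xyz)) + t
                 for row, t in zip(rotation, translation))


# Same two operators as in the example above, written compactly.
text = """(
    ("p", ((0.5, 0.809, -0.309), (-0.809, 0.309, -0.5), (-0.309, 0.5, 0.809)),
          (0.0, 0.0, 0.0)),
    ("e", ((0.309, -0.5, 0.809), (0.5, 0.809, 0.309), (-0.809, 0.309, 0.5)),
          (0.0, 0.0, 0.0)),
)"""
operators = read_operators(text)
print(apply_operator(operators[0], (1.0, 0.0, 0.0)))
```

Note that in Python a file containing a single operator needs a trailing comma after that operator for the outer parentheses to be read as a tuple.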

Command interpreter

The command interpreter is available for programs in stand-alone mode, but generally not when called from / embedded in another package.

Syntax:

Commands are entered at the prompt in a unix-shell style:
  • command [option(s)] [positional argument(s)]
  • where options can be provided as -x [value] (x is a single letter), or --option[=value] in “two-dash” long form.
  • Various standard short-cuts are pre-defined, and any command can be shortened to a unique abbreviation. Note that the abbreviation given in the help is not always long enough to be unique (a bug inherited from cmd2).
  • Comments can be included with “#”, with text to the end-of-line then ignored.

Shell:

The cmd2 shell-like interface is inherited, offering history, command editing and redirects. Redirects can be awkward, because of the conflict with logical operators (<, |, >) used in selections (which therefore need to be quoted).

Hierarchical structure:

Sub-commands are only available after entering the command. Higher-level commands are generally not available in sub-commands. The exceptions are general utility commands such as shell, shortcuts & set. Help, by default, is specific to the command level, but this behaviour can be changed with the --all (-a) & --recursive (-r) options. Note that load (“@”) & related commands do not transcend different command levels.

Just-in-time calculation & pre-requisites:

A number of efficiencies are possible by pre-calculating and repeatedly using objects. Rather than pre-calculating at startup all objects that might be needed, the program attempts to calculate the minimal needed, just-in-time. For the most part, the pre-requisites are figured out and tasks are executed when needed using pre-assigned (or default) parameters. One exception is that any command with a “parameterize” pre-requisite will issue an error message if not already performed (mind-reading is not an option!).

The order that commands are entered is sometimes important, particularly when the embedded python interpreter is invoked with “py” (see below). Given the flexibility of the “py” command, there is no way to figure out the pre-requisites. Users should be especially attentive to AttributeErrors that might indicate an unmet pre-requisite dependence.
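
The just-in-time idea, and the kind of prerequisite that cannot be inferred, can be illustrated with a toy class. This is a generic Python sketch with hypothetical names and a placeholder calculation; it does not reflect how RSRef organizes its objects.

```python
from functools import cached_property


class Session:
    """Toy stand-in for a program state with lazily built objects."""

    def __init__(self, atoms):
        self.atoms = atoms
        self.parameterized = False
        self.builds = 0  # counts how often the expensive step really runs

    @cached_property
    def model_density(self):
        """Built on first access, then reused."""
        self.builds += 1
        return [2.0 * a for a in self.atoms]  # placeholder calculation

    def refine(self):
        """Needs an explicit prior step that cannot be guessed."""
        if not self.parameterized:
            raise RuntimeError("parameterize must be run before refine")
        return sum(self.model_density)


session = Session([1.0, 2.0, 3.0])
session.parameterized = True
session.refine()
session.refine()
print(session.builds)  # the density was built once and reused
```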

Error recovery:

Inherited from cmd2, exceptions are captured at the Command level, printing at least an error message, but without aborting the whole program. On interactive use, this conveniently often offers a second chance. If run as a script, users should search the output for “Error”, lest one has scrolled by. The default is a terse error message, but this can be changed to a full traceback using “set debug True” (still does not abort).
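
For scripted runs, the search for “Error” can be automated. A minimal sketch, using made-up log text (the log file name in the comment is hypothetical):

```python
def find_errors(lines, keyword="Error"):
    """Return (line_number, text) for each line containing keyword."""
    return [(number, line.rstrip("\n"))
            for number, line in enumerate(lines, start=1)
            if keyword in line]


# Made-up log text for illustration; a real run would read the log file,
# e.g. with open("rsref.log") as handle: find_errors(handle)
log = ["Rsref> evaluate\n",
       "AttributeError: made-up message for illustration\n",
       "Rsref> exit\n"]
for number, text in find_errors(log):
    print(number, text)
```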

Selected commands and implementation-specific limitations.

@FILE or load FILE:

Used to run commands from an external file. The limitation is that commands cannot descend/ascend through nested sub-commands. Thus, for example, commands within parameterize would have to be given separately. The same limitation applies to variants _relative_load (@@).

Command list

The most up-to-date documentation is generated from python -u -m rsref /dev/null /dev/null help exit:

Rsref v0.5.6, 10/22/15 (Command: /home/chapman/Devel/RSRef/FTatom/src/rsref.py)
Rsref> help

Documented commands (type help <topic>):

_load            evaluate      li            parrot   r           select
_relative_load   help          list          pause    randomize   set
analyze          hi            load          pdbout   refine      shell
cmdenvironment   history       map           perturb  restraints  shortcuts
ed               image_refine  neighbors     profile  run         show
edit             l             parameterize  py       save        test

Miscellaneous help topics:

SELECTION_EXPR

Undocumented commands:

EOF eof exit q quit

Rsref> exit

Command Help

(See also Program control / utilities, below.) The most up-to-date documentation is generated from python -u -m rsref then help -rl:

analyze

Analyze prior refinement: shifts, gradients & impact of parameters.

analyze> convergence

Analyze the gradients and shifts of prior refinement.

Usage: convergence [options] arg

Options:
-h, --help show this help message and exit
-s, --summary Summary only without itemizing groups or dihedrals.

analyze> dihedrals

Changes in torsion angles and their impact.

dihedrals> hinges | hi

Find hinges in dihedral changes.

Usage: hinges [options] arg

Options:
-h, --help show this help message and exit
-g INT, --gap=INT
 # residues w/o dihedral changes that can be bridged in a hinge
-t FLOAT, --threshold=FLOAT
 above which dihedral rotations considered hinges. Degrees if > 1.0, else fraction of total change.
-p, --pseudo combine phi_i with psi_i-1, else individual dihedrals.
-s START, --start=START
 start of an explicitly defined hinge: chain-residue, requires --end, repeatable.
-e END, --end=END
 end of an explicitly defined hinge: chain-residue, requires --start, repeatable.
dihedrals> impact

Estimate impact of refined torsion angle changes on superimposition.

Usage: impact [options] arg

Options:
-h, --help show this help message and exit
-i, --iterative
 Iteratively determines/applies highest impact dihedral changes. This rarely-needed option is compute- intensive, because it iteratively finds the single, highest impact dihedral, applies the rotation, and repeats. Otherwise, by default, impact is assessed more rapidly from the dot product of the gradient and shift vectors, integrated over refinement iterations.
-I IDENTIFY, --identify=IDENTIFY
 List this number of the top-impact dihedrals.
-p, --pseudo Use approximation to pseudo-torsion angles (phi_i + psi_{i-1}) instead of individual phi, psi. (Limits output options, but speeds option iterative.)
-R, --recover Use previous calculation of impact, other options invalid.
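
The default (non-iterative) estimate described for this command can be pictured as a first-order sum. The sketch below uses made-up numbers and a hypothetical function; the sign convention and any weighting used by RSRef are not specified in this help text.

```python
def first_order_impact(gradients, shifts):
    """Per-parameter sum over iterations of gradient * shift.

    gradients, shifts: one list per refinement iteration, each with one
    value per dihedral. Returns one impact estimate per dihedral.
    """
    n_params = len(gradients[0])
    impact = [0.0] * n_params
    for g_iter, s_iter in zip(gradients, shifts):
        for i in range(n_params):
            impact[i] += g_iter[i] * s_iter[i]
    return impact


# Two iterations, three dihedrals (made-up values).
gradients = [[-2.0, 0.1, -0.5], [-1.0, 0.0, -0.4]]
shifts = [[3.0, -0.2, 1.0], [1.5, 0.0, 0.5]]
print(first_order_impact(gradients, shifts))
```

In this toy case the first dihedral dominates, because it combines the largest gradient with the largest shift.
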
impact> color

Output coordinates, B-factors set to percent impact on fit of dihedrals.

Usage: color [options] output PDB file

Options:
-h, --help show this help message and exit
-p, --pseudo Use pseudo-torsion angle (phi_i + psi_i-1), else individual phi, psi; ignored if impact --pseudo.
impact> pickle

Save impact-needed data, so impact.py can replot without recalculation (when tweaking plot). WARNING: objects picked by superpose must be compatible with impact - has not been checked recently.

Usage: pickle [options] FILE.jar - pickled impact object for later analysis.

Options:
-h, --help show this help message and exit
impact> plot

Graph impact of dihedrals on fit.

Usage: plot [options] arg

Options:
-h, --help show this help message and exit
-p <impact|change>, --prior=<impact|change>
 Replace impact or dihedral change with values from previous refinement (to superimpose restrained impact upon unrestrained dihedral changes).
-c, --changeOnly
 plot only dihedral changes, not estimate of impact.
-t, --tty Display graph on terminal (not recommended for background jobs).
-f FILE, --file=FILE
 save plot in graphics file of type given by extension (.emf, eps, jpeg, jpg, pdf, png, ps, raw, rgba, svg, svgz, tif, tiff)
-A FILE, --annotation=FILE
 File containing additional plot commands, overriding command-line input.
-O, --overall Dihedral changes (not impact) relative to reference (input) structure not refinement batch.
impact> print

Tabulate changes in dihedrals & impact on fit.

Usage: print [options] arg

Options:
-h, --help show this help message and exit
-O, --overall Dihedral changes (not impact) relative to reference (input) structure not refinement batch.
impact> done

Safe sub-menu exit, returning to higher level command.

dihedrals> paint

Output coordinates, B-factors set to dihedral change for visualization.

Usage: paint [options] output PDB file (w/ corrupted B-factors!)

Options:
-h, --help show this help message and exit
-p, --pseudo Use pseudo-torsion angle (phi_i + psi_i-1),else individual phi, psi.
-n, --normalized
 Scaled between 0 & 100.
dihedrals> done

Safe sub-menu exit, returning to higher level command.

analyze> done

Safe sub-menu exit, returning to higher level command.

evaluate

Calculate statistics comparing current model to map.

Usage: evaluate [options] [different-pdb-file]

Options:
-h, --help show this help message and exit
-v, --verbose Including scaling information.

image_refine

Refine image (map) parameters to maximize agreement with atomic model.

Usage: image_refine [options] [different-pdb-file]

Options:
-h, --help show this help message and exit
-M, --magnification
 Optimize magnification.
-B, --overall_B
 Optimize overall B (-B & -E mutually exclusive).
-E, --envelope Optimize EM envelope function (-B & -E mutually exclusive).
-R, --resolution
 Optimize resolution (low-pass filter on atomic model to best match map).
-C INT, --max_cycles=INT
 Maximum number of cycles.
-i FLOAT, --min_improvement=FLOAT
 End when per-cycle improvement falls below this value.
-g FLOAT, --min_grad=FLOAT
 End when gradient norm falls below this value.
-f, --finite_difference
 Numerical derivatives instead of analytical.
-v INT, --verbosity=INT
 Per-cycle logging: -1 (terse) to 5 (verbose).
--confirm Request Y/N confirmation before acceptance.

map

Write density map(s) on the grid of the input map.

Model and difference density is calculated for atoms and their immediate (symmetry-related) neighbors that have been selected previously. This is consistent with RSRef’s modality of evaluating the fit over molecules rather than a 3D unit cell volume. If the structure has molecular or crystallographic symmetry, and the statistics of the map (scale, correlation, sigma) are to be analyzed externally throughout a parallelepiped, symexp.py should be used first to fill the volume (plus a margin) with the symmetry-related atoms. (Although popular, such volume-based assessments, without a molecular mask, are discouraged, because the solvent volume massages the statistics.)

Usage: map [options] file name

Options:
-h, --help show this help message and exit
-c, --calculated
 Simulated from the atomic model using current imaging parameters.
-d, --difference
 Observed minus calculated, scaled & using imaging parameters.
-o, --observed Input experimental map, having applied any change in magnification.
-s, --scaled Input experimental map, having applied any change in magnification, and scaled to atomic model (requires command-line option --scale_to_model).
-t TITLE, --title=TITLE
 Title used in map header (quote to protect white-space & newlines).

neighbors

Identify neighbors within distance of (selected) atoms.

Usage: neighbors [options] arg

Options:
-h, --help show this help message and exit
-d FLOAT, --distance=FLOAT
 Searches for neighbors within distance of any atom (default: 3.5 A).

parameterize

For designated parameter type(s) (default positions), select atoms to be refined & how.

Usage: parameterize [options] arg

Options:
-h, --help show this help message and exit
-b, --bfactor Designate which B-factors to be refined.
-o, --occupancy
 Designate which occupancies to be refined.
-p, --position Designate which atomic positions (xyz) to be refined.

parameterize> SELECTION_EXPR

SELECTION_EXPR: logical expression of <Selection(s)> (string)
using array logical operators: &, |, ==, !=, ~, >, >=, (, ), ... evaluated in the task namespace. <Selection> is an existing instance of class Selection or a new one instantiated with S(<criterion>), where string <criterion> is a logical array expression using keywords such as residue number, chain, atom number, synonyms or unique abbreviations, see documentation for Selection. SELECTION_EXPR should be quoted to avoid shell commands / redirects etc.

Pre-defined selections/groups are created by top-level commands:

    > select -C collection -n name
    > select -C collection -a attribute

and are referred to in SELECTION_EXPR in one of two ways:

    > collection['name'] or collection['attribute'] (single selection).
    > collection (uses the logical OR of all selections in the Group).

Examples:

S('chain == A') & S('residue num <= 105')
S('(chai == C) & (residue_ty != HOH)')
"rigid['N_domain'] | S('resnam == ATP')"

Default: S('all'), i.e. all atoms.

parameterize> clear

Switch off all refinement of requested parameter type. (Individual parameterizations can be switched off with “group None”, “individual None”, “torsion None”, “overall False”.)

parameterize> done

Safe sub-menu exit, returning to higher level command.

parameterize> group

Select atoms to be refined as one or more groups.

Usage: group [options] GROUP | COLLECTION | SELECTION | SELECTION_EXPR | GROUP_EXPR:

SELECTION: a previously saved single Selection, given as: COLLECTION[‘ITEM’].
(COLLECTION or [‘ITEM’] can be omitted if defaults were used in the corresponding select command.)

SELECTION_EXPR: boolean Selection expression, see help select.

GROUP_EXPR: list, dictionary or tuple of multiple quoted SELECTION_EXPR.

GROUP | COLLECTION: name of previously saved (cmd select) Selections.

Options:
-h, --help show this help message and exit

parameterize> individual

Select atoms to be refined individually. (More apropos with experimental data than superposition.)

Usage: individual [options] SELECTION_EXPR | SELECTION:

SELECTION_EXPR: a boolean expression, see help select; SELECTION: name of a previously saved Selection, given as: COLLECTION[ITEM]. (COLLECTION or [ITEM] can be omitted if defaults were used in the corresponding select command.)
Options:
-h, --help show this help message and exit

parameterize> overall

Refine requested parameter type as a single group.

If just refining overall position, overlay is better (wider convergence/faster). Use overall when also simultaneously refining sub-groups, torsion angles etc..

Usage: overall [options] [True | False]

Options:
-h, --help show this help message and exit

parameterize> print

Print parameterization. (Options invoked until reset.)

Usage: print [options] arg

Options:
-h, --help show this help message and exit
-w INT, --width=INT
 Line width (def. 132).
-a SIZE, --abbreviate=SIZE
 Selections longer than SIZE abbreviated w/ ellipsis (def. 33).
-c, --count Report number of moving atoms in selections.
-b, --boolean Report selections as T/F boolean arrays (def).

parameterize> torsion

Select variable dihedrals and atoms whose positions depend on them.

Usage: torsion [options] [SELECTION_EXPR] - used by -f/-V and overrules -d default.

Options:
-h, --help show this help message and exit
-d SELECTION_EXPR, --dihedrals=SELECTION_EXPR
 optimize all variable dihedrals (phi, psi) within this selection. See help select. Default: macromolecule
-a SELECTION_EXPR, --atoms=SELECTION_EXPR
 atoms to be moved by dihedral rotations if linked to variable bonds (which must be fully enclosed by this Selection). See help select. Default: all atoms - macromolecule
-f <phi|psi|pseudo>, --fix=<phi|psi|pseudo>
 Fix all dihedrals of named type within atoms in SELECTION_EXPR (must be only option/argument).
-t <NUMBER(int)|FRACTION(float)>, --top=<NUMBER(int)|FRACTION(float)>
 Fix all but these dihedral angles (must be only option/argument).
-V <phi|psi|pseudo>, --vary=<phi|psi|pseudo>
 Vary all dihedrals of named type within atoms in SELECTION_EXPR (must be only option/argument).
-v, --verbose List selected dihedrals (--top).

pdbout

Write coordinates (symmetry expansion as per command-line arguments).

Usage: pdbout [options] [Header inserted into top of PDB file]

Options:
-h, --help show this help message and exit
-o FILE, --file=FILE
 Output file name (required argument).
-a, --anisotropic
 Output anisotropic U, riding w/ coordinates from input PDB.
-c, --c_alpha Output only C_alphas.

perturb

Perturb model & calculate statistics.

Usage: perturb [options] displacement (A; def. 1.0)

Options:
-h, --help show this help message and exit
-i, --individual
 Perturb atoms individually, else as rigid body.
-n INT, --steps=INT
 Divide displacement into n logarithmic steps (default: 1).
-r INT, --repeats=INT
 Number of random displacements used to calculate statistics (default: 1).
-d xyz-components, --direction=xyz-components
 Unit vector for direction of perturbation. Chosen randomly if None (default).
-s INT, --seed=INT
 Seed for random number generator.
-v, --verbose Additional statistics.

profile

Density of an isolated atom vs. distance from center, useful for setting cut-off radii.

Usage: profile [options] arg

Options:
-h, --help show this help message and exit
-B FLOAT, --B_factor=FLOAT
 Atomic B-factor (default: 10.0).
-a TYPE, --atom=TYPE
 Atom type (default: C).

randomize | r

Randomize coordinates (with normal error distributions).

Usage: randomize [options] arg

Options:
-h, --help show this help message and exit
-x FLOAT, --xyz=FLOAT
 Magnitude of desired RMS xyz positional displacement (default: 0.0).
-B FLOAT, --B_factor=FLOAT
 Desired standard deviation in B-factor (default: 0.0).
-O FLOAT, --occupancy=FLOAT
 Desired standard deviation in occupancy (default: 0.0).
-s INT, --seed=INT
 Seed for random number generator.

refine | r

Refine atomic model.

Usage: refine [options] arg

Options:
-h, --help show this help message and exit
-C INT, --max_cycles=INT
 Maximum number of cycles.
-i FLOAT, --min_improvement=FLOAT
 End when per-cycle improvement falls below this value.
-g FLOAT, --min_grad=FLOAT
 End when gradient norm falls below this value.
-n, --new_batch
 New batch (losing prior history), else continue prior (default) if exists/possible.
-r, --restart From original coordinates (implies -n).
-v INT, --verbosity=INT
 Per-cycle logging: -1 (terse) to 3 (verbose) [def. 0].

restraints | r

Information on supplementary restraints.

select

Name selection(s) of atoms.

Usage: select [options] [SELECTION_EXPR]

SELECTION_EXPR: logical expression of <Selection(s)> (string) using array logical operators: &, |, ==, !=, ~, >, >=, (, ), ... evaluated in the task namespace. <Selection> is an existing instance of class Selection or a new one instantiated with S(<criterion>), where string <criterion> is a logical array expression using keywords such as residue number, chain, atom number, synonyms or unique abbreviations, see documentation for Selection. SELECTION_EXPR should be quoted to avoid shell commands / redirects etc, and if provided as a named argument, must be devoid of white-space.

Examples:

select -C rigid -n N_domain S('chain == A') & S('residue num <= 105')
select -C protein -n C S('(chai == C) & (residue_ty != HOH)')
select -C all -n catalytic "rigid['N_domain'] | S('resnam == ATP')"
select -C mycollection -a resnum -F protein -S C
select --collection=chains --attr=chain
Options:
-h, --help show this help message and exit
-C NAME, --collection=NAME
 Collection (dictionary) into which selection is placed (default: None -> “collection” or name of attribute if -a specified).
-n NAME, --name=NAME
 Unique name to be given to selection (default: None -> “default”). (-a (--attr) & -n (--name) are mutually exclusive.)
-a ATOM_ATTR, --attr=ATOM_ATTR
 Selections are made (and named) for each unique value of ATOM_ATTR in the coordinates (see -d, -f). ATOM_ATTR must be specified as a single-word abbreviation/synonym recognized in Selection expressions, eg. --attr=chain might give selections A, B..., while --attr=resnum might give 23, 24,... (-a (--attr) & -n (--name) are mutually exclusive.)
-S NAME, --selection=NAME
 Selection or dictionary (group) from which --attr subset is to be drawn (default: None -> all atoms).

Additional considerations

Analyze

Run after refine, it provides statistics on shifts, useful when documenting a refinement. It also provides gradients, useful for retrospective adjustment of parameter scales (--unit* arguments), making sure that all parameters have opportunity for refinement. The issue of parameter scaling has been described in the section on Image Parameter Refinement. Although less common, it can also be an issue in atomic refinement, particularly with rigid group or torsional parameterizations for which the gradients can depend on fragment size, and the default internal calibrations may therefore not be good enough. Analyze provides diagnostics for the user to assess convergence.

Warning: if refine is run repeatedly, cycles will be concatenated as if a single run, provided that model parameters have not been changed. If there have been changes, refinement cycles will restart from zero and analyze will report only on the last batch, not the entire session.

Evaluate

Compares the current model (or the optional new_file) to the map, calculating the local correlation coefficient, real-space R-factor, RMS and residual. (By local, we mean map pixels closer to any atom than the distance set by the --map_use argument.)
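The statistics named above can be illustrated with a minimal NumPy sketch. It assumes the observed and calculated densities have already been sampled at the same local grid points. The function name `fit_statistics` is hypothetical, and the real-space R-factor convention used (sum of absolute differences over sum of absolute sums) is one common definition, assumed here rather than taken from the RSRef source.

```python
import numpy as np


def fit_statistics(rho_obs, rho_calc):
    """Compare observed and calculated density sampled at the same grid points.

    Hypothetical helper for illustration; not part of the RSRef API.
    """
    obs = np.asarray(rho_obs, dtype=float).ravel()
    calc = np.asarray(rho_calc, dtype=float).ravel()
    diff = obs - calc
    return {
        # Pearson correlation coefficient between the two density arrays.
        "correlation": float(np.corrcoef(obs, calc)[0, 1]),
        # Assumed convention: sum|obs - calc| / sum|obs + calc|.
        "r_factor": float(np.abs(diff).sum() / np.abs(obs + calc).sum()),
        # Root-mean-square density difference.
        "rms": float(np.sqrt(np.mean(diff ** 2))),
        # Least-squares residual, sum of squared differences.
        "residual": float(np.sum(diff ** 2)),
    }


# Toy data: calculated density is offset from observed by a constant 0.5.
obs = np.array([1.0, 2.0, 3.0, 4.0])
calc = np.array([1.5, 2.5, 3.5, 4.5])
stats = fit_statistics(obs, calc)
print(stats)
```

In this toy case the correlation is perfect although the residual is not zero, which is one reason to inspect more than one statistic.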

The -v or --verbose option prints additional scaling statistics. These can be used by EMAN or Spider to put maps on an absolute scale, but this approach is now superseded by the map command of RSRef, which can do it internally.

Image_refine

Refines experimental parameters to improve the agreement between map and model. These parameters change how density is calculated from the atomic model. Maps that are output (map) will all be adjusted by changes in magnification. Calculated and difference maps, but not output observed maps will reflect changes in B, envelope and resolution.

Parameters:
Magnification
The relative magnification, a scale factor by which the voxel (or unit cell) dimensions of the map should be uniformly divided for optimal agreement with the model.
Overall_B
An isotropic temperature factor that is added to all B-factors, commonly used in crystallography. Atomic densities are calculated from corrected scattering factors: $f = f_o \exp\{-B_{overall} (\lambda/2d)^2\}$. Overall_B is covariant with envelope, so the two cannot be refined together. Both are added to the atomic B-factors before density calculation.
Envelope
A Gaussian attenuation that is commonly used in electron microscopy to account for beam incoherence, detector point-spread etc. Atomic densities are calculated from the corrected scattering factors: $f = f_o \exp\{-EM_{envelope} (\lambda/d)^2\}$, noting the 4-fold difference compared to overall_B. Note that EM 3D reconstructions have often had an inverse envelope correction applied, so refinement could yield a positive or negative incremental correction. Not only is envelope covariant with overall_B, but also, partially, with resolution, below, so it may be possible to refine only one of these parameters at any time.
Resolution
A “soft” resolution limit for the low-pass 5th order Butterworth filter that maximizes the consistency between model and map. The signal is attenuated smoothly either side of this limit, so it is distinct from the “hard” resolution of the high_resolution command-line argument, beyond which the signal is zero. Both can be applied, but with the “hard” limit higher (smaller number) than the “soft” limit. See comments on covariance with envelope above.
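The attenuation terms described above can be sketched in a few lines of Python. The two exponential functions follow the formulas exactly as written in this section. The function names, the default wavelength of 1.0 and the use of the standard Butterworth amplitude response are assumptions for illustration, not RSRef's implementation.

```python
import math


def overall_b_scale(b_overall, d, wavelength=1.0):
    """f / f_o = exp{-B_overall (lambda / 2d)^2}, as written above."""
    return math.exp(-b_overall * (wavelength / (2.0 * d)) ** 2)


def envelope_scale(em_envelope, d, wavelength=1.0):
    """f / f_o = exp{-EM_envelope (lambda / d)^2}, as written above."""
    return math.exp(-em_envelope * (wavelength / d) ** 2)


def butterworth_lowpass(d, d_soft, order=5):
    """Standard Butterworth amplitude response (assumed form).

    Spatial frequency is 1/d and the cut-off frequency is 1/d_soft,
    so the frequency ratio is d_soft / d.
    """
    ratio = d_soft / d
    return 1.0 / math.sqrt(1.0 + ratio ** (2 * order))


# The 4-fold relationship: an envelope of E attenuates like an overall B of 4E.
print(envelope_scale(5.0, 4.0), overall_b_scale(20.0, 4.0))

# Soft limit at 8 A: smooth attenuation on either side of the limit.
for d in (16.0, 8.0, 4.0):
    print(d, butterworth_lowpass(d, d_soft=8.0))
```

Because the two exponentials differ only by a constant factor in the exponent, refining both at once would leave their sum undetermined, which is the covariance noted above.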
Options

Most of the options relate to convergence criteria. After any one is satisfied, the refinement is terminated.

Parrot

Equivalent to toggling with set echo True/False.

Pdbout

Refined coordinates are only output if this command is run!

The header is modified from that of the input. The output file may be non-standard, according to the Symmetry --output option selected. The default is non-standard addition of fragments from symmetry-equivalent molecules that are close (but this can be disabled).

Perturb

Calculates statistics as atoms are moved step-wise from current locations. This is helpful in determining the program parameters that might provide the best convergence radius with the available data.

In the default rigid-body mode, the displacement entered is the actual translation. With the --individual option, the displacement is interpreted as the target RMS after random displacements.

Statistics are somewhat dependent on the randomized directions of displacements, and are best averaged over 10 to 50 --repeats. (The --direction option only makes sense with --repeats=1.)

In calculating statistics, current settings of cut-off radii, resolutions, etc., will apply.
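The two displacement modes described above can be sketched with NumPy as follows. Both function names are hypothetical, and the RMS is assumed to be taken over per-atom displacement magnitudes; RSRef's own definition may differ.

```python
import numpy as np


def rigid_displacement(n_atoms, magnitude, direction=None, seed=None):
    """Same translation for every atom; random direction if none is given."""
    rng = np.random.default_rng(seed)
    if direction is None:
        direction = rng.normal(size=3)
    unit = np.asarray(direction, dtype=float) / np.linalg.norm(direction)
    return np.tile(unit * magnitude, (n_atoms, 1))


def individual_displacements(n_atoms, target_rms, seed=None):
    """Independent normally distributed shifts, rescaled to the target RMS."""
    rng = np.random.default_rng(seed)
    shifts = rng.normal(size=(n_atoms, 3))
    # Assumed definition: RMS over the per-atom displacement magnitudes.
    rms = np.sqrt(np.mean(np.sum(shifts ** 2, axis=1)))
    return shifts * (target_rms / rms)


rigid = rigid_displacement(4, 2.0, direction=[0.0, 0.0, 3.0])
random_shifts = individual_displacements(100, 1.5, seed=0)
print(rigid[0], np.sqrt(np.mean(np.sum(random_shifts ** 2, axis=1))))
```

Fixing the seed makes a run reproducible, which is the role of the --seed option in the command.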

Output (stdout)

For each step (decreasing from the maximum perturbation to zero), a row consisting of:

  • rms model perturbation (Å).
  • Pearson correlation coefficient between model and map.
  • standard deviation of the correlation coefficient (among repeats).
  • Real-space R-factor.
  • standard deviation of the R-factor (among repeats).
  • Σ(ρ_obs - ρ_calc)^2
  • standard deviation of the above.
  • Correlation between the partials of the difference residual and the atomic error vectors. This is calculated from the flattened arrays of length repeat*atoms*3, i.e. the parameter vectors used in refinement, and it captures both direction and magnitude.
  • Gradient norm calculated from the magnitudes of the partials.
  • Dot product of normalized gradient and error vectors. Equal to the correlation coefficient in the special case of the mean values of the gradient and error vectors equal to zero.

The correlation between gradient and error vectors is perhaps the best estimate of likely convergence radius, and can guide the choice of density-calculation parameters to use. Correlation and residuals can indicate the sensitivity.
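The relationship between the correlation and the normalized dot product listed above can be checked with a small NumPy sketch. The vectors are toy data and the function names are hypothetical.

```python
import numpy as np


def pearson(a, b):
    """Pearson correlation of two flattened vectors."""
    a = np.ravel(a) - np.mean(a)
    b = np.ravel(b) - np.mean(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def normalized_dot(a, b):
    """Dot product of the two vectors after scaling each to unit length."""
    a = np.ravel(a).astype(float)
    b = np.ravel(b).astype(float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Toy gradient and error vectors, both with zero mean.
gradient = np.array([1.0, -1.0, 2.0, -2.0])
error = np.array([2.0, -1.0, 1.0, -2.0])
print(pearson(gradient, error), normalized_dot(gradient, error))

# A constant offset leaves the correlation unchanged but alters the dot product.
shifted = error + 1.0
print(pearson(gradient, shifted), normalized_dot(gradient, shifted))
```

The two measures agree only while both vectors have zero mean, which is the special case noted in the list above.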

Refine

Parameterize is a pre-requisite, i.e. definition of what is to be refined (and how) must precede this command.

Options

Most of the options relate to convergence criteria. After any one is satisfied, the refinement is terminated.
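The "any one criterion ends the run" logic can be sketched as below. This is an illustration of the idea only: the function `refine_until_converged`, its `step` callback and the order in which the criteria are tested are assumptions, not RSRef's optimizer.

```python
def refine_until_converged(step, max_cycles=20, min_improvement=0.0, min_grad=0.0):
    """Run cycles until any one convergence criterion is satisfied.

    step() is a hypothetical callback that performs one cycle and
    returns (objective, gradient_norm).
    """
    previous = None
    for cycle in range(1, max_cycles + 1):
        objective, grad_norm = step()
        if grad_norm < min_grad:
            return cycle, "min_grad"
        if previous is not None and previous - objective < min_improvement:
            return cycle, "min_improvement"
        previous = objective
    return max_cycles, "max_cycles"


# Toy history of (objective, gradient norm) values for successive cycles.
history = iter([(10.0, 5.0), (6.0, 3.0), (5.5, 1.0), (5.4, 0.5)])
result = refine_until_converged(lambda: next(history),
                                max_cycles=10, min_improvement=0.2, min_grad=0.01)
print(result)

# If min_grad exceeds the first gradient norm, the run ends at the first cycle.
history = iter([(10.0, 5.0), (6.0, 3.0)])
early = refine_until_converged(lambda: next(history), min_grad=6.0)
print(early)
```

The second call shows why an overly large min_grad can make a refinement appear to do nothing.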

Restraints

Does nothing but report on supplementary restraints. (Whether restraints are used is controlled by command-line arguments.) Starting with v0.4.2 (11/04/13) restraint on B-factor variation between bonded atoms could be applied. Soon after v0.5.0 (2/18/15), van der Waals restraints will be added, and this might be all that is needed for rigid group or torsion angle refinement. (If full stereochemical restraints are needed for refinement of individual atomic positions, use the CNS-embedded implementation.)

Program control / utilities

(Available from top-level and all sub-menus.) The most up-to-date documentation is generated from python -u -m superpose then help -rl:

_load

Runs script of command(s) from a file or URL.

_relative_load

Runs commands in script at file or URL; if this is called from within an already-running script, the filename will be interpreted relative to the already-running script’s directory.

cmdenvironment

Summary report of interactive parameters.

edit | ed

ed: edit most recent command in text editor

ed [N]: edit numbered command from history
ed [filename]: edit specified file name

commands are run after editor is closed. “set edit (program-name)” or set EDITOR environment variable to control which editing program is used.

help

Document command or list available commands.

Usage: help [options] [command]

Options:
-h, --help show this help message and exit
-a, --all Include commands inherited from higher levels. (Combining -a -r will be excessively repetitious.)
-l, --long Fully document all commands.
-r, --recursive
 Descend through nested command sets.

history | hi

history [arg]: lists past commands issued

no arg: list all
arg is integer: list one history item, by index
arg is string: string search
arg is /enclosed in forward-slashes/: regular expression search

Usage: history [options] (limit on which commands to include)

Options:
-h, --help show this help message and exit
-s, --script Script format; no separation lines

list | l | li

list [arg]: lists last command issued

no arg -> list most recent command
arg is integer -> list one history item, by index
a..b, a:b, a:, ..b -> list spans from a (or start) to b (or end)
arg is string -> list all commands matching string search
arg is /enclosed in forward-slashes/ -> regular expression search

load | l

Runs script of command(s) from a file or URL.

parrot

parrot [T[rue]|F[alse]|Y[es]|N[o]]: toggle or set command echoing (for log file).

pause

Displays the specified text then waits for the user to press RETURN.

py

py <command>: Executes a Python command.
py: Enters interactive Python mode. End with Ctrl-D (Unix) / Ctrl-Z (Windows), quit(), or exit(). Non-python commands can be issued with cmd("your command"). Run python code from external files with run("filename.py").

run | r

run [arg]: re-runs an earlier command

no arg -> run most recent command
arg is integer -> run one history item, by index
arg is string -> run most recent command by string search
arg is /enclosed in forward-slashes/ -> run most recent by regex

save

save [N] [filename.ext]

Saves command from history to file.

N => Number of command (from history), or *;
most recent command if omitted

set

Sets named Cmd parameter or lists all; unambiguous abbreviations OK.

shell

execute a command as if at the OS prompt.

shortcuts

Lists single-key shortcuts available.

show

Shows value of a parameter.

Usage: show [options] arg

Options:
-h, --help show this help message and exit
-l, --long describe function of parameter

Python Interpreter for advanced functionality

An attempt has been made to balance flexibility with simplicity and ease of use, in deciding which functionalities and parameters are available through command-line or command interpreter control. Many others are accessible through an embedded python interpreter that is invoked with the py command. It is executed in a namespace that is local to the command interpreter (and not very useful). The command-line options are imported as attributes of the object “option”, and most other needed objects can be accessed as attributes of self.task, for which the alias “my” is provided. The following examples illustrate:

  • Change chain name of target to match refining model: Rsref> py import numpy; my.atoms.chain = numpy.where(my.atoms.chain=='C', 'A', my.atoms.chain)

  • Remove one of the target alternative locator IDs to match refining model: Rsref> py import numpy; my.atoms.altloc=numpy.where(my.atoms.altloc=='A',' ',my.atoms.altloc)

  • Change the torsion-restraint weight during a refinement: Rsref> py option.torsion_weight = 2.0

  • Print values of command-line options: Rsref> py print option

  • Reset B-factors to a uniform 15.0: Rsref> py my.atoms.b = 15.0

  • Replace all B values with their mean: Rsref> py b=my.atoms.b; b.fill(b.mean())

  • Change the limits on the magnification to +/-10%: Rsref> py option.magnification_limits = (0.9, 1.1)

  • Change the internal unit conversion for rigid rotations from 10 to 100: Rsref> py option.units_per_rad = 100.0 (see also --units_per_rad command-line option).

  • Scale observed map to model (instead of model to observed): Rsref> py map_calc.scale_model_to_observed=False (see also --scale_to_model command-line option).

    (The default class attribute ModelMap.scale_model_to_observed=True is appropriate almost all of the time, and avoids trivial refinements where the residual is lowered merely by changing the scaling. However, with the default, the residual can be lowered by the model exiting the boundaries of the map (warning messages are printed). The non-default (False) can be used to calculate additional diagnostic statistics for unstable refinements, but the partial derivatives (and refinement) will be incorrect.)
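Several of the examples above rely on NumPy array idioms (numpy.where, fill, mean). The stand-alone sketch below shows what those idioms do on small arrays. The arrays here are plain NumPy stand-ins for my.atoms.chain, my.atoms.altloc and my.atoms.b, whose exact types inside RSRef are assumed rather than known.

```python
import numpy as np

# Stand-in arrays; in RSRef these would be attributes of my.atoms (assumption).
chain = np.array(["A", "C", "C", "B"])
altloc = np.array(["A", " ", "B", "A"])
b = np.array([12.0, 30.0, 18.0, 20.0])

# Rename chain C to A, leaving other chains untouched.
renamed = np.where(chain == "C", "A", chain)

# Blank out alternative locator A, leaving other locators untouched.
cleared = np.where(altloc == "A", " ", altloc)

# Replace every B-factor with the mean, modifying the array in place.
b.fill(b.mean())

print(renamed, cleared, b)
```

numpy.where returns a new array, so the result must be assigned back (as in the examples above), whereas fill changes the existing array in place.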

Performance

Radii

The number of ρcalc estimations is dependent on the cube of the atom_extent radius. With the efficiencies of array-based calculations, empirically, performance is approximately NlogN, (N=atom_extent).

The accuracy of density statistics (correlation, least-squares residual etc.) improves with atom_extent. While larger is always better, there are diminishing returns as the contribution from distant atoms declines. Although likely B-factor-dependent, empirically, little is gained by increasing atom_extent beyond 2.5 x resolution.

Especially early in a refinement, it may be appropriate to compromise accuracy of statistics (& refinement objective functions) for speed. --relative_use=0.5 with --relative_extent=1.25 gives pretty good results, i.e. using density at grid points within a sphere of every atom that has a radius of 1.25 x resolution.
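The cubic dependence on radius can be seen by counting the grid points that fall within a sphere around one atom. The sketch below assumes a cubic grid with the atom on a grid point; the function name `grid_points_within` is hypothetical.

```python
import numpy as np


def grid_points_within(radius, spacing):
    """Count points of a cubic grid lying within radius of a central atom."""
    n = int(np.ceil(radius / spacing))
    axis = np.arange(-n, n + 1) * spacing
    x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
    return int(np.count_nonzero(x ** 2 + y ** 2 + z ** 2 <= radius ** 2))


small = grid_points_within(5.0, 1.0)
large = grid_points_within(10.0, 1.0)
# Doubling the radius multiplies the number of density estimates roughly 8-fold.
print(small, large, large / small)
```

Halving a radius therefore removes most of the per-atom work, which is why reduced radii are attractive early in a refinement.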

B-factors

Atomic density is calculated from a 1-D Fourier transform of an isolated atom. It depends on atom type and B-factor. Values are cached when possible to avoid recalculation if the B-factors are essentially the same. For low resolution refinements (eg. EM) or when there is a large overall B or envelope factor, detailed variation in B-factors may not be important. There may be little point in refining B-factors. Speed can also be increased by:

  1. Setting all B-factors to a uniform value
  2. Rounding their values

With either, atomic densities will be more frequently drawn from the cache rather than calculated from scratch.
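The benefit of rounding can be illustrated with a cache keyed on atom type and B-factor. This is a sketch using functools.lru_cache with a placeholder calculation; it is not RSRef's cache, and `atomic_profile` is a hypothetical stand-in for the real density calculation.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=None)
def atomic_profile(atom_type, b_factor):
    """Placeholder for an expensive radial density calculation (hypothetical)."""
    width = 1.0 + b_factor / 10.0
    return tuple(math.exp(-(r * r) / width) for r in range(5))


def count_calculations(b_values, atom_type="C"):
    """Number of profiles computed from scratch (cache misses)."""
    atomic_profile.cache_clear()
    for b in b_values:
        atomic_profile(atom_type, b)
    return atomic_profile.cache_info().misses


b_values = [14.8, 15.1, 15.3, 14.6, 15.2]
unrounded = count_calculations(b_values)
rounded = count_calculations([round(b) for b in b_values])
print(unrounded, rounded)
```

With distinct B-factors every atom triggers a new calculation, whereas after rounding the five atoms share a single cached profile.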

Atom selections and groups

A parser is provided for flexible selection of atom groups within the command interpreter. It is applicable to stand-alone running (rsref.py, superpose.py etc.), but not usually accessible when modules are wrapped/embedded in other packages (eg. CNS) which then manage atom selections.

Selections

The selection syntax is terse, flexible, and (for better or worse) relies on Python evaluation of expressions. Thus, hopefully, it will be intuitive for many users.

It differs from some other programs in that selections are objects (instances of class Selection). Selection objects are an extension of boolean arrays, specific for a coordinate set (class Atoms). Selection expressions can be used directly in commands like parameterize, or user-named Selections can be pre-defined for repeated use or to simplify/clarify the definition of complicated Selections. Selections can be combined or assigned to new Selection instances through use of Python bit-wise logical operators (&,|,==,!=,~,>,>=,(,),...). This should make it convenient to write refinement scripts in which different subsets of the atomic parameters are refined at different stages, different sets of atoms are subject to positional, B-factor or occupancy refinement, and in which some atoms might be refined individually, others rigidly grouped for simultaneous refinement, etc.. This frees the user of constraints embodied in other programs.

In RSRef, “S” is a synonym of Selection, the class. A selection expression is defined as:

<selection>|S(<criterion>) [<operator> <selection>|S(<criterion>)]

where:

<selection>
is a pre-existing instance of class S.
<criterion>

is a non-quoted string to select atoms for a new instance of S, where <criterion> can be in simple form or compound:

Simple

contains a single operator, and is given without parentheses, such as:

  • chain == A
  • chai==B
  • Residue number >= 30
  • resnu <= 50
Compound

an expression combining operators, using parentheses to set precedence. Spaces in names must be replaced with underscores, and expressions should be in lower case because they are case sensitive. Examples:

  • (chain == A) | (chai==B)
  • ((residue_numb >= 30) & (atnam == CA)) | (chain != C)

Warning

the compound parser will silently do unexpected things with syntax or spelling errors. Check that results are consistent with expectations!

Note

the inner operands are enclosed within parentheses, because the combination operators (&, |) have higher precedence than the comparison operators (>, >=, <, <=).

<operator>
is a bit-wise Python logical operator (&,|,==,!=,~,>,>=,(,),...) Further details of the syntax for making Selections are provided in atoms.Selection.

Selections are named and defined with the select command.
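The precedence rule noted above is a property of Python itself and can be reproduced with plain NumPy boolean arrays. The arrays below are stand-ins, not instances of RSRef's Selection class.

```python
import numpy as np

resnum = np.array([10, 40, 60, 40])

# Parenthesized comparisons combine as intended.
good = (resnum >= 30) & (resnum <= 50)
print(good)

# Without parentheses, & binds first: resnum >= (30 & resnum) <= 50.
# The resulting chained comparison needs a single truth value for an
# array, so NumPy raises ValueError.
raised = False
try:
    bad = resnum >= 30 & resnum <= 50
except ValueError as err:
    raised = True
    print("unparenthesized expression failed:", err)
```

Other unparenthesized expressions may fail with different exceptions or evaluate without error to an unintended result, so parentheses around every comparison are the safe habit.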

Groups

Groups are dictionary-like collections of named Selections, each constrained as a group in group refinement. (In individual atom refinement, a Group is treated as the logical OR between all the named Selections.) The Group class contains methods for checking that selections do not overlap (work-in-progress). In most commands, the name of a Group can be substituted for that of a Selection.

Groups are defined with the select command, for example:

select --collection=domains --name=N S('chain == A') & S('residue num <= 105')
select -C domains -n C S('chain == A') & S('residue num > 105')

These 2 statements illustrate several variants of the syntax, and together define a Group called domains with two Selections, named N & C, that contain the N- and C-terminal parts of subunit A.
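The behaviour of a Group can be mimicked with a dictionary of boolean arrays. This is only an analogy in plain NumPy, assuming toy coordinate attributes; the helper names `union` and `overlaps` are hypothetical and are not methods of RSRef's Group class.

```python
import numpy as np

# Toy per-atom attributes standing in for a coordinate set.
chain = np.array(["A", "A", "A", "B"])
resnum = np.array([50, 105, 106, 50])

# A dictionary of named boolean masks, analogous to the Group "domains".
domains = {
    "N": (chain == "A") & (resnum <= 105),
    "C": (chain == "A") & (resnum > 105),
}


def union(group):
    """Logical OR of all selections, as used for individual-atom refinement."""
    return np.logical_or.reduce(list(group.values()))


def overlaps(group):
    """True if any atom belongs to more than one selection."""
    counts = np.sum(list(group.values()), axis=0)
    return bool(np.any(counts > 1))


print(union(domains), overlaps(domains))
```

Checking for overlap matters in group refinement, because an atom assigned to two rigid groups would be asked to move in two ways at once.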

Troubleshooting

Density calculation

Form factors

Form-factor not found

Form factor tables support atoms commonly used in each technique. If one is not found, consider:

  • whether the atom in the coordinate file is appropriate, eg. should an EM model contain hydrogens?
  • switching to a different form-factor table (-F command-line option).
  • adding an entry into your favorite table (formfactor.py).

Coordinates

PDB output

TER records inappropriately inserted between residues
This can occur if atoms / hetatoms in the input PDB file have been edited out without renumbering the atom records.

Selections

Run-time exceptions

Invalid operand for int and ndarray
(or something similar; with complex Selections.) Check that attribute comparisons such as (atnam==C) are enclosed within parentheses, as the comparison operator has lower precedence than a combination operator such as & or |.

Parameterization

Torsion

Warning that C-psi is not a recognized dihedral, then exception
Likely because the C-terminal OXT is missing from the input coordinates. If it is inconvenient to repair the coordinate file, then the final psi may be excluded with a selection like: torsion --dihedrals="(chain==A)&~((resnum==202)&(atnam==C))" (if 202 was the recalcitrant terminal residue).

Refinement

Nothing changes from the start
min_grad might be larger than the 1st gradient, so deemed converged before start. Check iterate.dat or the log file for a convergence message. Remove min_grad from input (default is 0.0) or set to small value.
Refinement stuck - 10s of function estimates w/in an iteration

The search along the gradient direction is not giving a minimum:

  • max_cycles may be unreasonably large. Mostly, one needs multiple cycles in a non-linear refinement in which the effects of parameter changes are inter-dependent. There are other cases where there is little inter-dependence and the refinement will approximate linear. Examples would include refinement of groups that are well separated. In these cases, refinement should be complete in very few cycles, with random directions thereafter.
  • min_improvement or min_grad might be set to levels too low, requiring a precision that is not possible. Note that there are several approximations. For example, movement in atoms may change which grid points are being used, and therefore cause a non-smooth change in the objective function.

Credits

This is a new implementation of theory laid out in Chapman 1995, programmed by Michael Chapman.

Form factor tables have been modified from the CCP4 and TNT distributions. Andrew Trzynka assisted in programming methods to read the form factors.

Libraries used in rigid-group and torsion angle optimizations were programmed by Brynmor K. Chapman. Van der Waals restraints have been programmed by Leo Selker.

This new implementation relies heavily on experience gained with earlier rsref.c and C++ programs, to which several former members of the Chapman lab contributed: Eric Blanc, Zhi (James) Chen, Andrew Korostelev, Felcy Fabiola and Olga Kirillova.

Citations

Publications and database entries should acknowledge use of RSRef by citing [Chapman-1995] and [Chapman-2013].

[Chapman-1995] Chapman, M. S. (1995) Restrained Real-Space Macromolecular Atomic Refinement using a New Resolution-Dependent Electron Density Function, Acta Crystallographica A51, 69-80.
[Chapman-2013] Chapman, M. S., Trzynka, A., and Chapman, B. K. (2013) Atomic modeling of cryo-electron microscopy reconstructions - Joint refinement of model and imaging parameters, J Struct Biol 182, 10-21.

Changed in version 0.5.0: 2/18/15, converted to reStructuredText from Epydoc.

Changed in version 0.1: 12/10/10 Start.