RSRef Real space refinement: atomic structure vs. map

Section author: Michael S. Chapman <chapmanms@missouri.edu>

Authors:

Michael S. Chapman <chapmanms@missouri.edu>, Andrew Trzynka <trzynkaa@ohsu.edu>, Brynmor K. Chapman & Leo Selker,

Oregon Health & Science University and University of Missouri

Version:

1, Nov 26, 2024

Usage:
  • RSRef [-options] <input.pdb>

  • rsref.py [-options] <input.pdb>

Synopsis

Compares / refines an atomic model to an electron density or Coulombic potential map.

Supports a wide range of resolution regimes from low resolution electron microscopic reconstructions to high resolution x-ray crystallography.

The package serves dual purposes:

Standalone program:

  • Model / density comparison & statistics.

  • Refinements without full stereochemical restraints: rigid-group, torsion angle, etc.

  • Restrained refinement of atomic B-factors.

  • Optimization of image parameters by agreement with an atomic model.

  • Supports symmetry: crystal lattice and molecular (non-crystallographic).

Library or Application Programming Interface (API):

  • To support extension of other refinement programs (e.g. CNS), thereby combining real-space fitting with stereochemical restraints and additional optimizers such as simulated annealing.

  • To develop new strategies of refinement by combining map-fitting with functionalities being developed in other libraries.

Sources of Documentation

All of the following should be referenced:

Program options

Brief explanations of the arguments for stand-alone runs are given with rsref.py -h; further information is given below.

When called as a library or embedded, the programmer has the option of supporting the relevant subset of command-line options (as in our modified cns), or handling program control independently. Absent superseding documentation from the calling program, rsref’s command-line documentation offers a first approximation.

Commands

Within the program (pasto.rsref> prompt), commands are listed with pasto.rsref ‣ help. Add the -v option to list synopses; use help command for details on a given command, help -l for details on all commands, and help -rl to descend recursively through subcommands. The help text is reproduced later in this document.

When embedded, the calling program is responsible for program control, so commands, syntax, options and access to help will be different. For our CNS extension, we refer back to this document from the documentation for cns-rsref.

Examples

The examples directory in the installation contains scripts, data files, results and log files. Explanations are in the README.txt file in each examples subdirectory.

Details, API

Details are encoded within docstrings that are accessible to programmers using Integrated Development Environments (IDEs). They are also compiled with Sphinx into html files, linked from the module index on the home page. This is the searchable, cross-linked (API) reference documentation that explains the meaning of parameters, the performance of different functions, etc.

The documentation is accessed from (index.html) on-line or in the distribution directory doc/html. (Additional formats can be generated with sphinx.)

Concepts

Image Refinement

In real-space refinement, we optimize an atomic model by maximizing the agreement with a 3D image. Mostly, this involves adjustment of the parameters of the atomic model. However, the agreement also depends on how the molecule is rendered in the imaging experiment, and how this is accounted for in calculating the image expected of the object.

In Crystallography, the image depends on the resolution limits, and on the overall B-factor which accounts, in part, for individual atomic motions, but also for the average effects of lattice disorder, radiation damage etc. which could be termed loosely experimental or instrumental attenuations. In real-space, one could also add in the effects of phase errors which tend to become progressively worse, attenuating the high resolution signal.

In Electron Microscopy, the image depends even more on additional instrumental parameters: magnification, contrast transfer function (CTF) and a group of factors such as beam coherence, specimen stability etc. that are collectively described by an envelope function. There may be further attenuation of the signal as a function of resolution due to additional physical limitations of the instrument or the image averaging, alignment etc. that are part of the computer processing of the 3D reconstruction. The average effects of many of these are often approximated by a low-pass filter. Experimenters often apply approximate corrections for some or all of the effects - current practices vary widely.

RSRef provides the wherewithal to apply further corrections to the 3D image calculated from the atomic model. The rationale is that the best atomic parameters will be obtained in a refinement where the discrepancies between calculated and experimental images have been minimized. Thus, RSRef also provides the means to refine the parameters of the empirical correction functions to maximize the agreement between calculated and experimental maps as the atomic model improves. Nevertheless, there are important limitations on these image corrections as discussed below. Generally, then, one will want to apply best-estimate corrections during the 3D reconstruction, and then use RSRef’s corrections to reduce the residual discrepancy when compared to an atomic model.

In RSRef, parameters for image corrections are determined from comparison of the 3D map with the atomic model. This is fundamentally different from the algorithms used in EM reconstruction, and there are both advantages and disadvantages. The primary disadvantage is that corrections such as inverse-CTF should be applied to individual 2-D images, but RSRef is working only with the 3D reconstruction in which the 2D images have been integrated. The primary advantage is that the atomic model is a partially-independent external standard that can reveal systematic errors that are otherwise difficult to characterize. It is possible that RSRef could highlight changes to parameters for which it might be worth repeating the reconstruction.

The current version of RSRef no longer attempts to correct explicitly for the “average” 3D effect of systematic errors in CTF parameters applied to 2D images, or Wiener filters. Instead, it focuses on the parameters of simpler corrections that most affect comparison of 3D map to atomic model:

  • Magnification.

  • Envelope correction.

  • Additional attenuation / low-pass filters (approximating the average effects of Wiener filters, beam incoherence, etc.)

Purists might argue that atomic models should be refined against an uncorrected map, so that least-squares can provide the solution of least error. Statistics are usually very flattering when refinement is performed against a map where the high resolution signal is attenuated! However, experience is that such maps are often devoid of the detail needed to perform a high quality refinement. Furthermore, at high resolution, much of the attenuation comes from a highly predictable detector-based Gaussian attenuation. This has led to the popularity of inverse B-factor, white-noise and other empirical corrections to sharpen the reconstructed map. Statistics will always be worse, because sharpening will always increase high resolution noise as well as signal (but the structure might nevertheless be more accurate).

That said, the program is neutral to prior processing. No assumptions are made about what corrections have already been applied. The philosophy is to apply incremental corrections using functional forms similar to corrections in widespread use in Electron Microscopy. By applying incremental corrections for agreement between atomic model and images (maps), we hope to make up for previous under- or over-correction during the data processing.

The corrections can be pre-set and/or least-squares refined in RSRef. For EM attenuation, RSRef supports Gaussian Envelopes; Butterworth or Gaussian low-pass filters, in addition to relative magnification. Users can either use 3rd party software to apply corrections with the refined parameters to their EM reconstructions, or they can continue to apply inverse corrections to the calculated model density through RSRef.

All corrections applied are isotropic, without directional dependence. Magnification corrections are applied to the map, but other corrections change the density calculated from the atomic model so that it more closely represents the image that would be generated from the EM/x-ray instrumentation.

Magnification

The parameter is defined, for this program, as the change in the reported microscope magnification. Thus, if images had been collected at a nominal 50,000 x, a magnification of 1.01 would be appropriate if best agreement with model were obtained if the actual magnification were 50,500 x. (This follows the convention for relative magnification in BSoft, but is the inverse of the Scale parameter in EMAN.)

Once refined, a magnification correction can be applied at program startup with the rsref -X argument. Alternatively, a corrected map can be output by pasto.rsref ‣ map --observed, or independently by dividing the grid separation (APIX) and origin by the magnification. If this is done, no further correction is needed in RSRef, and the model will superimpose on the map in molecular graphics programs.
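The arithmetic of an independent correction can be sketched in a few lines of Python. The helper name apply_magnification and the map values are hypothetical and not part of RSRef. The sketch assumes only what is stated above: that the grid separation and origin are divided by the relative magnification.

```python
def apply_magnification(apix, origin, magnification):
    """Return corrected grid spacing and origin (hypothetical helper, not RSRef API).

    Assumes that both quantities are simply divided by the relative magnification.
    """
    if magnification <= 0:
        raise ValueError("magnification must be positive")
    new_apix = apix / magnification
    new_origin = tuple(x / magnification for x in origin)
    return new_apix, new_origin


# A nominal 50,000 x with a relative magnification of 1.01
# corresponds to an actual magnification of 50,500 x.
nominal = 50000
relative = 1.01
actual = nominal * relative

# Hypothetical map header values: 1.20 A per pixel, origin in Angstrom.
new_apix, new_origin = apply_magnification(1.20, (-60.0, -60.0, -60.0), relative)
print(f"actual magnification: {actual:.0f} x")
print(f"corrected pixel size: {new_apix:.4f} A")
```

An increase in magnification therefore shrinks the pixel size, which is consistent with the description of the -X argument as the factor by which pixel size is decreased.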

(For crystallographic data, magnification is equivalent to an isotropic reduction in unit cell lengths, and is likely not very useful!)

Note that an increase in magnification decreases the number of grid points within a radius of each atom. Thus, there is a systematic decrease in least-squares residual which may spoil refinement. This can be mitigated in two ways. One is to use the --normalize option of RSRef to correct for the changing voxel size. Alternatively, for searches (but not refinement), one can optimize the correlation coefficient instead of the scale-dependent residual.
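The difference between the two statistics can be illustrated with a toy calculation. The functions below are generic textbook definitions, not RSRef routines, and the density values are invented. Rescaling the calculated density changes the least-squares residual but leaves the correlation coefficient unchanged.

```python
import math


def residual(obs, calc):
    """Sum of squared differences between observed and calculated densities."""
    return sum((o - c) ** 2 for o, c in zip(obs, calc))


def correlation(obs, calc):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_c = sum(calc) / n
    cov = sum((o - mean_o) * (c - mean_c) for o, c in zip(obs, calc))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_c = sum((c - mean_c) ** 2 for c in calc)
    return cov / math.sqrt(var_o * var_c)


# Invented density samples at five grid points.
obs = [0.1, 0.4, 0.9, 0.3, 0.0]
calc = [0.2, 0.5, 1.0, 0.2, 0.1]
scaled = [2.0 * c for c in calc]  # same model, different overall scale

print(residual(obs, calc), residual(obs, scaled))
print(correlation(obs, calc), correlation(obs, scaled))
```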

Overall B or Envelope Function

The EM envelope correction and the crystallographic overall B-factor are exponential Gaussian attenuators added to the atomic B-factors and applied in reciprocal space when atomic densities are calculated from scattering factors. They are all co-variant and cannot all be refined at the same time. It is always somewhat arbitrary which components are included in the atomic B-factors, and which are factored out into an overall B-factor.

For crystallographic data, the overall B-factor provides compatibility with those reciprocal-space refinement programs that use Wilson scaling / overall B-factors. There is no need to modify individual atomic B-factors. The overall B-factor may also account for resolution-dependent average effects of phase error in a map. A user could also choose to use a low-pass filter for this purpose (see below).

In electron microscopy (EM), the envelope parameter accounts for instrumental point-spread effects such as beam coherence (described above) and detector-based attenuations that may be related to sampling. It is identical in form to the overall B-factor, except that, following conventions prevalent in each field, they differ by a factor of 4. Overall B uses the crystallographic convention: f = f_o exp(-B s^2 / 4). rsref --envelope uses the usual EM convention of EMAN (but not all EM software): f = f_o exp(-B s^2) [Saad-2001].

[Saad-2001]

Saad, A., Ludtke, S.J., Jakana, J., Rixon, F.J., Tsuruta, H., and Chiu, W. (2001). Fourier amplitude decay of electron cryomicroscopic images of single particles and effects on structure determination. J Struct Biol 133, 32-42.

The overall B and EM envelope parameters are combined into a single exponential attenuator, so their separate input / refinement is provided only as a convenience.
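The factor of 4 between the two conventions can be checked numerically. In the sketch below the function names are hypothetical, and s is assumed to be the reciprocal of the resolution d (s = 1/d), which the text above does not state explicitly. The combination rule follows the help text, where b_overall is described as equivalent to EM_ENVELOPE * 4.

```python
import math


def attenuation_overall_b(b_overall, d):
    """Crystallographic convention: exp(-B s^2 / 4), assuming s = 1/d."""
    s = 1.0 / d
    return math.exp(-b_overall * s * s / 4.0)


def attenuation_em_envelope(em_envelope, d):
    """EM (EMAN-style) convention: exp(-B s^2), assuming s = 1/d."""
    s = 1.0 / d
    return math.exp(-em_envelope * s * s)


def combined_b(b_overall, em_envelope):
    """Single equivalent B-factor (A^2) for both attenuators (assumed rule)."""
    return b_overall + 4.0 * em_envelope


# An overall B of 100 A^2 and an EM envelope of 25 A^2 attenuate identically.
d = 4.0  # resolution in Angstrom
print(attenuation_overall_b(100.0, d), attenuation_em_envelope(25.0, d))
print(combined_b(20.0, 5.0))
```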

Only isotropic corrections are supported. (Anisotropic corrections are not possible within the density calculation algorithms used.)

Contrast Transfer Function (CTF)

Additional EM corrections beyond the envelope function are no longer supported, due to 2 challenges:

  • They can deviate significantly from spherical symmetry (spherical aberration etc.), and it is computationally tractable only to apply symmetric corrections with the density calculation algorithms in use.

  • CTF corrections applied to individual images during reconstruction are not really applicable to the whole map that we work with in refinement.

Earlier attempts to account for (merely) the spherically symmetric effects of just systematic errors in CTF correction have been deprecated. They were neither particularly successful nor easily rationalized. The spherically symmetric effects of non-optimal CTF corrections are now accounted for with low-pass filtering of the model density.

Low-pass filter / Resolution

The recommended way to account for further attenuation is a Butterworth low pass filter that is, by default, 5th order, as in the EM software, Spider. The following section describes how this can be used to refine the effective (soft) resolution limit, which may be easier to estimate by comparison to an atomic model than by other means.
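As a rough illustration of how steeply a 5th order filter falls off, the sketch below evaluates the generic textbook Butterworth amplitude response. This form is an assumption made for illustration: RSRef and Spider may parameterize the filter differently (for example with separate pass-band and stop-band frequencies), so the numbers should not be read as RSRef output.

```python
import math


def butterworth_lowpass(s, s_cutoff, order=5):
    """Textbook Butterworth amplitude response (assumed form, not taken from RSRef).

    s and s_cutoff are spatial frequencies (1/Angstrom); the response is
    1/sqrt(2) at s = s_cutoff and falls off more steeply for higher order.
    """
    if s_cutoff <= 0:
        raise ValueError("s_cutoff must be positive")
    return 1.0 / math.sqrt(1.0 + (s / s_cutoff) ** (2 * order))


soft_resolution = 8.0  # Angstrom, hypothetical soft resolution limit
s_cutoff = 1.0 / soft_resolution
for d in (16.0, 8.0, 4.0):
    value = butterworth_lowpass(1.0 / d, s_cutoff)
    print(f"d = {d:5.1f} A  attenuation = {value:.4f}")
```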

Some EM packages, notably EMAN, support a Gaussian attenuation. A Gaussian attenuation could also be applied in RSRef by increasing the overall B-factor or envelope constant (see above), but there is currently no support for calculating the additional B-factor corresponding to a desired soft resolution limit.

Image Parameter Refinement with image_refine

While imaging parameters could, in principle, be refined jointly with atomic parameters, a separate routine, image_refine, is provided for greater efficiency and to ensure that imaging parameters are refined using the full model and not the subset of atoms that might have been selected for a batch of local model refinement.

The convergence radius of image refinement is finite, particularly when multiple parameters are being refined. There are several reasons:

  • An increase in magnification decreases the number of grid points within the molecular envelope, systematically biasing the least-squares residual downwards. This can be mitigated in part with the rsref --normalize command-line option. However, refinement may still not be possible, especially if the anticipated change is substantial. A search for the maximal correlation coefficient may be more appropriate.

    • A search can be done by alternating the following commands with different values for the magnification:

    pasto.rsref ‣ evaluate

    pasto.rsref ‣ py my.map_calc.magnify(magnification=0.98, atoms=my.atoms)

    pasto.rsref ‣ evaluate

  • The effects of changing B-factor / envelope may be similar to changing the filter resolution, particularly with a low resolution map. Thus the parameters may be nearly co-linear and the refinement ill-conditioned.

  • The filter resolution may have little impact if its value is less than or near equal to the hard resolution limit.

  • The gradient vector may be so dominated by some parameters that others have negligible effect and will appear frozen.

  • Shift magnitudes that must be guessed on the first cycle may move some parameters beyond reasonable ranges from which refinement can recover.

  • Analytical derivatives are determined by summing the effects of imaging parameters on the density of each atom only within the map_use radius (see below). They therefore underestimate some of the long range effects.
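The magnification search described in the first bullet above amounts to a one-dimensional grid search. A minimal sketch follows; it does not call RSRef. The score argument is a hypothetical callable standing in for "set the magnification, run evaluate, read off the correlation coefficient", and toy_score is an invented function used only so that the example runs.

```python
def search_magnification(score, low=0.95, high=1.05, steps=11):
    """Return (best_magnification, best_score) over an evenly spaced grid.

    score: hypothetical callable mapping a magnification to a correlation
    coefficient (higher is better).
    """
    if steps < 2:
        raise ValueError("steps must be at least 2")
    best_mag, best_score = None, float("-inf")
    for i in range(steps):
        mag = low + (high - low) * i / (steps - 1)
        value = score(mag)
        if value > best_score:
            best_mag, best_score = mag, value
    return best_mag, best_score


def toy_score(mag):
    """Invented stand-in for a correlation coefficient that peaks at 0.98."""
    return 0.80 - 40.0 * (mag - 0.98) ** 2


best_mag, best_score = search_magnification(toy_score)
print(f"best magnification {best_mag:.3f}, score {best_score:.3f}")
```

The default search range matches the default --magnification_limits. A finer grid around the best value can be searched in a second pass.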

These problems tend to get worse at low resolution, but their impact also depends on the size of the refinement and therefore the number of grid points that are contributing.

The same types of problems could afflict both imaging and model refinements, but it is imaging refinement that is usually more challenging, due to the mix of parameter types.

There are several ways that the above effects can be mitigated, but they depend somewhat on the size, stage and resolution of the refinement. Thus program defaults may need some adjustment:

Units

The mathematics of least-squares optimization implicitly assumes that all refining parameters have the same units. Clearly atomic positions, B-factors and group rotations do not! Physical units are converted to internal units for refinement. Ideally, the scaling should give components of the gradient vector that:

  • Are in rough proportion to the effect of typical errors in this parameter type on the (fitting) objective function.

  • Are within an order of magnitude (or two) for different parameter types, so that refinement does not ignore some parameter types.

  • Do not take parameters out of reasonable ranges on the 1st cycle.

RSRef prints average values for gradient components. Unit scaling is changed by adjusting the --units_per_* command-line arguments.
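The effect of unit scaling on gradient components can be sketched as follows. The conversion rule used here (internal value = physical value x units_per) is an assumption made for illustration; it is consistent with the help text's advice to decrease the value for sensitivity, but RSRef's internal convention should be checked in the API documentation. The physical gradient values are invented.

```python
# Default scale factors quoted in the command-line help.
UNITS_PER = {"A": 1.0, "rad": 10.0, "A2": 0.01}


def to_internal(value, units_per):
    """Physical -> internal units (assumed rule: multiply by units_per)."""
    return value * units_per


def to_physical(internal, units_per):
    """Internal -> physical units (inverse of to_internal)."""
    return internal / units_per


def gradient_internal(gradient_physical, units_per):
    """Chain rule: dF/dq = dF/dp * dp/dq, with p = q / units_per."""
    return gradient_physical / units_per


# Invented gradient components with respect to physical parameters.
physical_gradients = {"A": 2.0, "rad": 150.0, "A2": 0.004}
internal_gradients = {
    kind: gradient_internal(g, UNITS_PER[kind])
    for kind, g in physical_gradients.items()
}
print(internal_gradients)
```

In this invented case the physical gradients span more than four orders of magnitude, while the internal ones fall within two, which is the stated goal of the scaling.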

Limits

The option for constraining parameters is internally set with limits that are, by default, +/- 5% in magnification & +/- 40% in resolution. Convergence may be improved with narrower limits, set with the --limit* command-line arguments.
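A minimal sketch of how such limits act on proposed parameter values is given below. The limit values are the defaults listed for --magnification_limits and --filter_limits in the help text; the simple clamping shown is an assumption, and RSRef may enforce its constraints differently.

```python
MAGNIFICATION_LIMITS = (0.95, 1.05)  # default of --magnification_limits
FILTER_LIMITS = (0.6, 1.5)           # default of --filter_limits, relative to --resolution


def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def limit_image_parameters(magnification, filter_resolution, resolution):
    """Clamp proposed image parameters (hypothetical helper, not RSRef API)."""
    mag = clamp(magnification, *MAGNIFICATION_LIMITS)
    res = clamp(
        filter_resolution,
        FILTER_LIMITS[0] * resolution,
        FILTER_LIMITS[1] * resolution,
    )
    return mag, res


# A first-cycle shift proposes values outside the allowed ranges
# for a map with a soft resolution estimate of 8.0 A.
print(limit_image_parameters(1.08, 3.0, 8.0))
```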

Finite differences

For image refinement (not atomic refinement) a finite_difference option is available to switch to ~5 x slower numerical derivatives. This avoids some inaccuracies of approximation (above) and can help particularly at low resolution.
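Numerical derivatives of this kind can be sketched with a generic central-difference scheme. The code below is not RSRef's implementation; the step size and the toy objective are assumptions chosen so that the example runs on its own. Each gradient component costs two extra evaluations of the objective, which is why numerical derivatives are slower than analytical ones.

```python
def central_difference_gradient(func, params, step=1e-4):
    """Approximate the gradient of func at params by central differences."""
    gradient = []
    for i in range(len(params)):
        plus = list(params)
        minus = list(params)
        plus[i] += step
        minus[i] -= step
        gradient.append((func(plus) - func(minus)) / (2.0 * step))
    return gradient


def toy_objective(p):
    """Invented residual with a minimum at magnification 1.0, resolution 8.0."""
    magnification, resolution = p
    return (magnification - 1.0) ** 2 + 3.0 * (resolution - 8.0) ** 2


gradient = central_difference_gradient(toy_objective, [1.02, 7.5])
print(gradient)
```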

With the image_refine routine, it is anticipated that image parameters will be refined at the start, and occasionally as the model improves.

Model parameterization

RSRef provides several modes with which model parameters can be individually or collectively optimized in stand-alone mode. These complement, rather than replicate those available through embedded use in other programs.

Individual atom

Without restraints, this is of limited use, perhaps only for solvent and counterions, and it requires fairly high resolution.

Group

Groups can be arbitrarily defined, for example as domains or amino acids. Domain refinement might be appropriate for electron microscopy at resolutions worse than about 5 Å.

Groups are implemented to apply a consistent change to the component atomic parameters. Unlike many other programs, they are not required to share the same initial properties (although they can optionally be set to be so). Thus, one can refine an additional B-factor for a domain while keeping the underlying variation in individual atomic B-factors that might have come from a high resolution structure.
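The idea of a group B-factor that preserves the underlying atomic variation can be shown with a few lines of Python. The function name and B-factor values are hypothetical; the sketch only illustrates applying one shared shift to every member of a group.

```python
def apply_group_b_shift(atomic_b, group_shift):
    """Add one refined group shift (A^2) to each atomic B-factor in the group."""
    return [b + group_shift for b in atomic_b]


# Invented B-factors from a high resolution structure of one domain.
domain_b = [12.0, 18.5, 25.0]
shifted = apply_group_b_shift(domain_b, 30.0)

# Differences between atoms are unchanged by the group shift.
print(shifted)
print([b - shifted[0] for b in shifted])
```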

Torsion

Currently implemented is optimization of protein φ and ψ angles only (with riding rigid side chains). This offers a reduced parameterization that can be appropriate for modeling conformational changes to well-characterized underlying structures.

To refine all φ, ψ (without stereochemical restraints) requires a resolution at which adjacent backbone chains can be resolved in the density. Limited experience indicates that this may be possible with a good 5 Å map, but not at 7 Å resolution. At resolutions lower than ~5 Å, it will be necessary to limit the variable dihedrals to those within domain linkers, hinges, etc. (Currently, these would have to be specified manually.) It is also expected that the addition of van der Waals restraints will help avoid strands overlapping at low resolution.

At high resolutions (about 2 Å), the lack of side chain refinement is likely to be limiting.

Limited experience shows excellent performance in morphing a high resolution structure from one conformational state into a map of a different conformational state at resolutions of 5 to 2.5 Å, with a convergence radius that can exceed 3 Å.

The torsion angle algorithm performs rotations that are rigid on either side of each dihedral. This is different from most implementations that window on short fragments of the structure (in turn), allowing structural deformations where the fragment meets the rest of the (fixed) structure, deformations that are subsequently refined out through stereochemical restraints. The RSRef algorithm is best suited for large hinge rotations and shears of loops and domains, and thus complements the algorithms in Xplor and CNS that are better at local improvements more typically needed in a manually-built model. Thus, expect RSRef to perform better in a molecular replacement or conformational change situation, but other programs to better improve models that have been built from scratch.
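A rotation that is rigid on one side of a dihedral can be sketched with the textbook Rodrigues formula. This is a generic illustration, not RSRef's code: the function names, the index-based selection of downstream atoms and the toy coordinates are all hypothetical.

```python
import math


def rotate_about_axis(point, axis_start, axis_end, angle):
    """Rotate point by angle (radians) about the line axis_start -> axis_end."""
    axis = [e - s for s, e in zip(axis_start, axis_end)]
    norm = math.sqrt(sum(a * a for a in axis))
    k = [a / norm for a in axis]                      # unit vector along the bond
    v = [p - s for p, s in zip(point, axis_start)]    # point relative to the axis
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    k_cross_v = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    k_dot_v = sum(a * b for a, b in zip(k, v))
    rotated = [
        v[i] * cos_a + k_cross_v[i] * sin_a + k[i] * k_dot_v * (1.0 - cos_a)
        for i in range(3)
    ]
    return tuple(r + s for r, s in zip(rotated, axis_start))


def apply_torsion(coords, bond, downstream, angle):
    """Rotate all downstream atoms rigidly about the bond; others stay fixed."""
    start, end = coords[bond[0]], coords[bond[1]]
    return [
        rotate_about_axis(xyz, start, end, angle) if i in downstream else xyz
        for i, xyz in enumerate(coords)
    ]


# Four toy atoms; atoms 2 and 3 lie downstream of the bond between atoms 0 and 1.
coords = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (1.0, 0.0, 2.0)]
moved = apply_torsion(coords, bond=(0, 1), downstream={2, 3}, angle=math.pi / 2)
print(moved)
```

Because every downstream atom receives the same rotation, distances within the moving part are preserved, which is what distinguishes this scheme from fragment-windowed approaches.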

Command-line options

The most up-to-date documentation is generated from rsref.py -h:

(Command: /trihome/chapmanms/Devel/RSRef/FTatom/pasto/rsref.py -h)

(Sources: /trihome/chapmanms/Devel/RSRef/FTatom/pasto v1.0.6)

usage: rsref.py [-h] [--max_shift DIST] [--b_overall B_OVERALL] [--local_symmetry FILE] [--space_group SYMBOL] [--lattice_translations UNIT_CELLS] [--completion residue_or_chain_or_none] [--output unique_or_local_or_full_or_none] [--unit_cell a b c alpha beta gamma] [--atom_extent SIZE | --relative_extent SIZE] [--map_use DIST | --relative_use DIST] [--map_require DIST | --require_relative DIST] [--form_factors XCCP4_or_ERSRef_or_ELECTRON_or_NCCP4_or_XTNT] [--magnification_limits MIN MAX] [--filter_limits MIN MAX] [--units_per_magnification FLOAT] [--units_per_envelope FLOAT] [--units_per_resolution FLOAT] [--map MAP] [--normalize | --no_normalization | --scale_to_model] [--orientation ZYX_or_YXZ_or_XZY_or_XYZ_or_YZX_or_ZXY] [--high_resolution LIMIT] [--resolution RESOLUTION] [--low_resolution LIMIT] [--em_envelope EM_ENVELOPE] [--magnification FACTOR] [--weight FLOAT] [--torsion_limit FLOAT] [--torsion_weight FLOAT] [--estimate_impact] [--units_per_A FLOAT] [--units_per_rad FLOAT] [--units_per_torsion FLOAT] [--units_per_A2 FLOAT] [--units_per_occ FLOAT] [--stereochemistry FLOAT] [--bond_length FLOAT] [--b_restraint FLOAT] [--van_der_waals FLOAT] [--in_nomenclature STR] [--out_nomenclature STR] [--version] [--infile FILE] [--outfile FILE] [--impact_annotation FILE] [INPUT.PDB] …

Real space refinement; model-map fit. (c) OHSU 2010-18; University of Missouri 2018-24, Michael S. Chapman

positional arguments:
… Program commands may follow required INPUT.PDB. Commands are space-separated, quoted if containing white space.

options:
-h, --help

show this help message and exit

--atom_extent SIZE, -a SIZE

Distance beyond which atom’s density considered zero (A). Suggest max{resolution, vdW radius}. (default: 3.4)

--relative_extent SIZE, -A SIZE

Distance beyond which atom’s density considered zero, relative to --high_resolution (suggest 1.0). (Note: not relative to the refinable --resolution.)

--map_use DIST, -u DIST

Use available map grid points within this distance of atoms (Angstrom; Suggest max{d_min/2, vdW radius}). (default: 2.0)

--relative_use DIST, -M DIST

Use grid points within this distance of atoms, relative to --high_resolution (Suggest 0.5). (Note: not relative to the refinable --resolution.)

--map_require DIST, -r DIST

Require map grid points within this radius of atoms to be available (Angstrom; negative to disable, suggest <= map_use).

--require_relative DIST, -R DIST

Require map grid points within this radius of atoms to be available, relative to map_use / relative_use (negative disables). (default: 1.0)

--normalize, -N

Normalize map to mean of 0, stdev of 1 x voxel volume. (Recommended, refinement weighting less dependent on map scale, resolution.) (default: True)

--no_normalization, -n

Do not normalize the map. (Reported scale constants can be used to put map on absolute scale. default: False)

--scale_to_model

Scale map to selected atoms of model instead of scaling model to map. (Some types of refinement not supported, but used to put map on absolute scale.) (default: False)

--version, -v

show program’s version number and exit

--infile FILE

Redirected standard input, like “<”. (default: <_io.TextIOWrapper name=’<stdin>’ mode=’r’ encoding=’utf-8’>)

--outfile FILE

Redirected standard output, like “>”. (default: <_io.TextIOWrapper name=’<stdout>’ mode=’w’ encoding=’utf-8’>)

Input model parameters:
--max_shift DIST, -s DIST

Atomic shift that will trigger recalculations (neighbors etc., A, default (None) --> high_resolution/2). (default: None)

--b_overall B_OVERALL, -B B_OVERALL

Isotropic B-factor to be added to all atoms (A^2, equivalent to EM_ENVELOPE * 4). (default: 0.0)

INPUT.PDB Input PDB file. (default: None)

Symmetry (local & crystal lattice):
--local_symmetry FILE, -l FILE

File of local (molecular) symmetry operators (Cartesian Angstrom). (default: None)

--space_group SYMBOL, -G SYMBOL

Hermann-Mauguin symbol (Int. Tables; None if isolated particle (EM)). (default: None)

--lattice_translations UNIT_CELLS

Number of translations to search in each direction for neighbors. Usually 1 suffices, 2 sometimes adds more, but takes longer (check). Zero disables. (default: 1)

--completion residue_or_chain_or_none

Expand symmetry neighbors to include all atoms within each “residue” or “chain”. (default: None)

--output unique_or_local_or_full_or_none

Select which previously expanded symmetry equivalent atoms to output to PDB. “unique” (“None”) for no equivalents; “local” for NCS symmetry; or “full” for local + neighbors that are related by crystal symmetry operators & lattice translations (that will be output together in a non-standard PDB file). (default: local)

--unit_cell a_and_b_and_c_and_alpha_and_beta_and_gamma, -U a_and_b_and_c_and_alpha_and_beta_and_gamma

Unit cell parameters (over-rides those in header of some maps). (default: None)

Density calculation & comparison:
--form_factors XCCP4_or_ERSRef_or_ELECTRON_or_NCCP4_or_XTNT, -F XCCP4_or_ERSRef_or_ELECTRON_or_NCCP4_or_XTNT

Form factor table for calculation of atomic density. XCCP4 (X-ray), ERSRef (electronic), ELECTRON (electronic), NCCP4 (neutron) or XTNT (X-ray). (default: XCCP4)

Image refinement:
--magnification_limits MIN_and_MAX

Relative magnification, refinement limits. (default: (0.95, 1.05))

--filter_limits MIN_and_MAX

Low-pass filter resolution attenuation, refinement limits, fractional, relative to --resolution. (default: (0.6, 1.5))

--units_per_magnification FLOAT

Parameter scaling in refinement: magnification. Decrease for sensitivity. Inverse of (physical units x value) should approximately reflect relative importance to residual. (default: 10.0)

--units_per_envelope FLOAT

Parameter scaling in refinement: EM envelope. Decrease for sensitivity. Inverse of (physical units x value) should approximately reflect relative importance to residual. (default: 1.0)

--units_per_resolution FLOAT

Parameter scaling in refinement: Resolution for low-pass filter. Decrease for sensitivity. Inverse of (physical units x value) should approximately reflect relative importance to residual. (default: 10.0)

Experiment / Map parameters:
--map MAP, -m MAP

Map file (format indicated by extension: .xplor, .cns, .mrc, .ccp4 etc.). (default: None)

--orientation ZYX_or_YXZ_or_XZY_or_XYZ_or_YZX_or_ZXY, -O ZYX_or_YXZ_or_XZY_or_XYZ_or_YZX_or_ZXY

Map orientation file - axis changing fastest, medium, slowest (ignored for xplor/cns). (default: ZYX)

--high_resolution LIMIT, -H LIMIT

Hard high resolution limit of map (d_min, Angstrom). (default: 3.0)

--resolution RESOLUTION, -S RESOLUTION

Resolution estimate of data (soft; low pass filter applied to model) (Angstrom). (default: None)

--low_resolution LIMIT, -L LIMIT

Low resolution limit of map (d_max, Angstrom). (default: 999.9)

--em_envelope EM_ENVELOPE, -E EM_ENVELOPE

Exponent for EM envelope function (A^2). (Low-pass attenuator; equivalent to B_OVERALL/4). (default: 0.0)

--magnification FACTOR, -X FACTOR

Correction for (EM) magnification (factor by which pixel size is decreased). (default: 1.0)

Optimization / target function:
--weight FLOAT, -W FLOAT

Weight: experimental component in optimization target function. (default: 1.0)

--torsion_limit FLOAT, -P FLOAT

Sum of torsion angle changes (deg.) beyond which a restraint is imposed. (default: 0.0)

--torsion_weight FLOAT, -p FLOAT

Weight: restraint on total torsion angle change. (default: None)

--estimate_impact, -T

Impact of dihedrals, used for filtering, tables, estimated from gradients or sequential changes (see impact command options), else from dihedral changes in a prior refinement. It is the user’s responsibility to ensure that the prior refinement is performed with an L1-Norm restraint. (default: False)

Model parameterization for refinement:
--units_per_A FLOAT

Parameter scaling in refinement: Angstrom (positions). Decrease for sensitivity. Inverse of (physical units x value) should approximately reflect relative importance to residual. (default: 1.0)

--units_per_rad FLOAT

Parameter scaling in refinement: Rigid-group rotations. Decrease for sensitivity. Inverse of (physical units x value) should approximately reflect relative importance to residual. (default: 10.0)

--units_per_torsion FLOAT

Parameter scaling in refinement: Dihedral bond rotations (rad). Decrease for sensitivity. Inverse of (physical units x value) should approximately reflect relative importance to residual. (default: 10.0)

--units_per_A2 FLOAT

Parameter scaling in refinement: Thermal/displacement (B-)factors. Decrease for sensitivity. Inverse of (physical units x value) should approximately reflect relative importance to residual. (default: 0.01)

--units_per_occ FLOAT

Parameter scaling in refinement: Occupancies (fractional). Decrease for sensitivity. Inverse of (physical units x value) should approximately reflect relative importance to residual. (default: 10.0)

Optimization restraints:
--stereochemistry FLOAT, -g FLOAT

Weight, overall, on stereochemical restraint. (default: 0.0)

--bond_length FLOAT, -D FLOAT

Weight on bond length restraints. (default: 1.0)

--b_restraint FLOAT, -b FLOAT

Weight: restraint on B-factor differences between adjacent atoms. (default: 0.0)

--van_der_waals FLOAT, -d FLOAT

Weight: van der Waals repulsive restraint. (default: 0.0)

Nomenclature translation:
--in_nomenclature STR, -q STR

From input: bmrb, cif, cns, diana, midas, msi, pdb92, sc, sybyl, ucsf, xplor, lax; lax for no translation, None for auto-determine (default: None)

--out_nomenclature STR, -Q STR

To output: bmrb, cif, cns, diana, midas, msi, pdb92, sc, sybyl, ucsf, xplor; default: same as --in_nomenclature (-q) if defined, else internal convention if None (default: None)

Output options:
--impact_annotation FILE, -I FILE

Additional plot commands for annotating impact graph. (default: None)

+<file> inserts options from <file>, one per line.
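As an illustration of how the options fit together, the sketch below assembles a stand-alone command line from options listed above. It only builds and prints the command; it does not run RSRef. The file names and numerical values are hypothetical.

```python
import shlex

# Hypothetical file names and values; option names are from the help text above.
options = {
    "--map": "reconstruction.mrc",
    "--high_resolution": "4.0",   # hard limit of the map (d_min, Angstrom)
    "--resolution": "4.5",        # soft limit, low-pass filter applied to model
    "--em_envelope": "25.0",      # A^2, EM convention (equivalent to B_OVERALL/4)
    "--magnification": "1.01",
}

argv = ["rsref.py"]
for flag, value in options.items():
    argv += [flag, value]
argv.append("model.pdb")  # the input PDB file follows the options

command = shlex.join(argv)
print(command)
```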

Deviations for RSRef embedded in other programs

Currently, only a CNS-embedded version has been implemented, but it is expected that other wraps would be very similar.

CNS-embedded version

RSRef control is exercised through command-line parameters that are passed through CNS. CNS does not use command-line parameters, so conflicts on input are not expected.

Many options (and option categories) are not relevant to the embedded functionality and are ignored without warning. Some of these may be similar to CNS parameters, which take precedence. There are a few cases where similar input to both RSRef and CNS is expected, and these are likely points of confusion:

Input model parameters

The coordinates are read from the file specified in the CNS input file, not through RSRef specification. --b_overall is the only model-based argument that is applied (so that an EM envelope correction can be applied in addition to previously determined crystallographic B-factors). An equivalent of --max_shift is available through CNS.

Ignored arguments

help, version, also all arguments in the groups Model parameterization and Image refinement. Image refinement can only be run in stand-alone mode, but the resulting parameters can be input to a cns run.

Density calculation & comparison

These arguments should be supplied. Note that for density calculation, form_factors supplied here are used and not those that may be specified in the CNS input file.

Experiment & Map parameters

These are all needed. It is the map supplied here that will be used as the experimental (observed) input map. (CNS map parameters pertain to maps that are to be output.) Resolution limits entered into CNS are ignored by RSRef, which uses the command-line values for density calculation.

Symmetry

RSRef and the embedding CNS need consistent symmetry information, but provided in different formats. Specifying the symmetry twice is admittedly a likely source of confusion, but the needs of the two programs differ. CNS needs two types of expansion: (a) to an exact asymmetric unit for Fourier transformations, noting that all atoms are required; (b) to all neighboring atoms for stereochemical restraints. CNS is designed to operate always in the presence of lattice symmetry.

By contrast RSRef can exploit efficiencies if refining only a local subset of atoms, but needs to consider potentially overlapping density from neighbors that might come from the same molecule or multiple neighbors, if there is symmetry or lattice periodicity. Note that coordinate output will be handled by CNS, so none of the options specifying which symmetry-related coordinates to write have any effect.

Input files

local_symmetry

A file that will be evaluated (in Python) as a tuple of operators. Each operator is a tuple of name (str), rotation (tuple) and translation (tuple of 3 floats, in Angstrom). The rotation is a matrix specified as three row-tuples, each a tuple of 3 floats. The unit operator is implied and therefore optional. The example below specifies two additional symmetry equivalents:

(   ("p",    (
        (0.500000,       0.809000,      -0.309000),
        (-0.809000,       0.309000,      -0.500000),
        (-0.309000,       0.500000,       0.809000)
            ),
        (0.000000,       0.000000,       0.000000)),
    ("e",    (
        (0.309000,      -0.500000,       0.809000),
        (0.500000,       0.809000,       0.309000),
        (-0.809000,       0.309000,       0.500000)
            ),
       (0.000000,       0.000000,       0.000000)))
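A file in this format could be checked before a run along the following lines. This is a minimal sketch that assumes the file content is a plain Python literal, so `ast.literal_eval` is used; how RSRef itself evaluates the file is not stated above. The helper name `load_local_symmetry` and the orthonormality tolerance are illustrative assumptions, not part of RSRef.

```python
import ast
import math

def load_local_symmetry(text, tol=1e-3):
    """Parse (name, rotation, translation) operators and validate their shapes.

    Hypothetical helper for illustration; not an RSRef function.
    """
    operators = ast.literal_eval(text)
    for name, rotation, translation in operators:
        if not isinstance(name, str):
            raise TypeError(f"operator name must be str, got {name!r}")
        if len(rotation) != 3 or any(len(row) != 3 for row in rotation):
            raise ValueError(f"{name}: rotation must be three rows of three floats")
        if len(translation) != 3:
            raise ValueError(f"{name}: translation must have three components")
        # Rows of a rotation matrix should be orthonormal, within rounding.
        for i in range(3):
            for j in range(3):
                dot = sum(a * b for a, b in zip(rotation[i], rotation[j]))
                expected = 1.0 if i == j else 0.0
                if not math.isclose(dot, expected, abs_tol=tol):
                    raise ValueError(f"{name}: rows {i} and {j} are not orthonormal")
    return operators

# The two operators from the example above.
EXAMPLE = """
(("p", ((0.500000, 0.809000, -0.309000),
        (-0.809000, 0.309000, -0.500000),
        (-0.309000, 0.500000, 0.809000)),
       (0.000000, 0.000000, 0.000000)),
 ("e", ((0.309000, -0.500000, 0.809000),
        (0.500000, 0.809000, 0.309000),
        (-0.809000, 0.309000, 0.500000)),
       (0.000000, 0.000000, 0.000000)))
"""

operators = load_local_symmetry(EXAMPLE)
names = [name for name, _, _ in operators]
print(names)
```

The tolerance is loose because the example matrices are given to three decimal places only.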

Command interpreter

The command interpreter is used for program control in stand-alone mode, but generally not when called from / embedded in another package. After the program has been invoked with options/arguments that are parsed, program flow is controlled with a series of user-entered interactive commands.

These run-time commands are interpreted with an extension (cmd2nest) of the cmd2 package that adds support for nested sub-commands. The following summarizes the essentials of program control and notes differences from cmd2. The cmd2 documentation provides additional detail.

Syntax:

Commands are entered at the prompt in a unix-shell style:
  • command [option(s)] [positional argument(s)]

  • where options can be provided as -x [value] (x is a single letter), or --option[[=]value] in “two-dash” long form.

  • Various standard short-cuts are pre-defined, and tab completion is available for commands and arguments.

    Error

    The abbreviation given in the help is not always long enough to be unique (a bug inherited from cmd2).

Error

Tab completion fails after exiting a sub-command.
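The equivalence of the short and long option forms can be illustrated with the standard library alone. The sketch below is a stand-in: it builds a small `argparse` parser modeled on three options from the documented usage of the hinges sub-command and tokenizes the line with `shlex`. It is not RSRef's own parser, and the `parse` helper is a hypothetical name.

```python
import argparse
import shlex

# Stand-in parser modeled on part of the documented "hinges" usage.
parser = argparse.ArgumentParser(prog="hinges")
parser.add_argument("-g", "--gap", type=int)
parser.add_argument("-t", "--threshold", type=float)
parser.add_argument("-p", "--pseudo", action="store_true")

def parse(line):
    """Split a command line into the command name and its parsed options."""
    command, *tokens = shlex.split(line)
    return command, parser.parse_args(tokens)

short_form = parse("hinges -g 3 -t 0.2 -p")
long_form = parse("hinges --gap=3 --threshold 0.2 --pseudo")
print(short_form[1] == long_form[1])
```

Both spellings yield the same namespace of option values, which is the behavior described for -x [value] and --option[[=]value].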

Shell:

The cmd2 shell-like interface is inherited, offering history and command editing. Redirects (<, |, >) should also work.

Hierarchical structure:

Sub-commands are only available after entering the command. Higher-level commands are generally not available in sub-commands. The exceptions are general utility commands such as shell, shortcuts & set. Help, by default, is specific to the command level. The --descend (-d) option documents one sub-command level deeper & --recursive --long (-rl) descends through all levels exhaustively. Note that load (“@”) & related commands, and history, do not transcend different command levels.

Just-in-time calculation & pre-requisites:

A number of efficiencies are possible by pre-calculating and repeatedly using objects. Rather than pre-calculating at startup all objects that might be needed, the program attempts to calculate the minimal needed, just-in-time. For the most part, the pre-requisites are figured out and tasks are executed when needed using pre-assigned (or default) parameters. One exception is that any command with a “parameterize” pre-requisite will issue an error message if not already performed (mind-reading is not an option!).

The order that commands are entered is sometimes important, particularly when the embedded python interpreter is invoked with “py” (see below). Given the flexibility of the “py” command, there is no way to figure out the pre-requisites. Users should be especially attentive to AttributeErrors that might indicate an unmet pre-requisite dependence.
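The just-in-time idea can be sketched generically with `functools.cached_property`: an object is built the first time it is requested, its own pre-requisites are resolved on demand, and the result is reused afterwards. The class and attribute names below are hypothetical and do not reflect RSRef's internal objects; the explicit check in `refine` mimics the documented exception for a missing parameterize step.

```python
from functools import cached_property

class Task:
    """Toy task-space: expensive objects are built on first use, then reused."""

    def __init__(self):
        self.calls = []            # records what was actually computed
        self.parameterized = False

    @cached_property
    def neighbors(self):
        self.calls.append("neighbors")
        return ("neighbor list",)

    @cached_property
    def calculated_density(self):
        neighbors = self.neighbors  # pre-requisite resolved on demand
        self.calls.append("density")
        return ("density using", neighbors)

    def refine(self):
        # This pre-requisite is not guessed: the user must have set it up.
        if not self.parameterized:
            raise RuntimeError("refine requires a prior 'parameterize'")
        return self.calculated_density

task = Task()
task.calculated_density
task.calculated_density   # second access reuses the cached object
print(task.calls)
```

An attribute accessed before its pre-requisite exists would, in a less guarded design, surface as an AttributeError, which is the symptom the text above advises watching for.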

Error recovery:

Inherited from cmd2, exceptions are captured at the Command level, printing at least an error message, but without aborting the whole program. On interactive use, this conveniently often offers a second chance. If run as a script, users should search the output for “Error”, lest one has scrolled by. The default is a terse error message, but this can be changed to a full traceback using “set debug True” (still does not abort).
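For scripted runs, the search for “Error” can be automated. The sketch below is a minimal illustration; the function name `find_errors` is hypothetical, and the sample text is a placeholder rather than real RSRef output, whose exact message format is not documented here.

```python
def find_errors(log_text, marker="Error"):
    """Return (line_number, line) pairs for every line containing the marker."""
    return [
        (number, line)
        for number, line in enumerate(log_text.splitlines(), start=1)
        if marker in line
    ]

# Placeholder text, not real program output.
sample_log = (
    "step one finished\n"
    "AttributeError: placeholder message\n"
    "step three finished\n"
)

hits = find_errors(sample_log)
for number, line in hits:
    print(f"{number}: {line}")
```

A substring match is used so that Python exception names such as AttributeError are also caught.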

Selected commands - implementation-specific extensions & limitations.

@FILE or run_script FILE

Used to run commands from an external file. The limitation is that commands cannot descend/ascend through nested sub-commands. Thus, for example, commands within parameterize would have to be given separately. The same limitation applies to the variant _relative_run_script (@@).

Command-line commands

Tokens following any of the program’s required positional parameters (e.g. INPUT.PDB) will be run as top-level commands. There is no support for transcending sub-commands or for command options:

Example: rsref.py /dev/null help shortcuts exit

Options can be incorporated by using the cmd2 run_script shortcut on the command line (Example1), where cmd.txt has one command per line that can include options and whitespace. Even better, use the --infile option (imported from argparser)

Example1: rsref.py /dev/null @cmd.txt exit
Example2: rsref.py /dev/null --infile=cmd.txt
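When runs are driven from a Python wrapper, the command file of Example2 could be generated programmatically. The sketch below only writes the file and assembles an argument list; it does not launch the program. The helper name `write_command_file` is hypothetical, and the commands written are taken from the top-level command list in this document.

```python
import tempfile
from pathlib import Path

def write_command_file(path, commands):
    """Write one command per line and return an argument list using --infile."""
    Path(path).write_text("\n".join(commands) + "\n", encoding="utf-8")
    return ["rsref.py", "/dev/null", f"--infile={path}"]

with tempfile.TemporaryDirectory() as folder:
    script = Path(folder) / "cmd.txt"
    argv = write_command_file(script, ["help -v", "shortcuts"])
    written = script.read_text(encoding="utf-8")

print(written, end="")
```

The returned list is in the form accepted by `subprocess.run`, should the caller wish to start the run from Python.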

done, exit and quit

These are near synonyms to mitigate a problem with cmd2’s error-handling in scripted runs with sub-commands. On an exception within a sub-command, the program terminates (just) the sub-command, and continues reading commands that had been intended for the terminated sub-command, but applies them mistakenly to continued execution at the higher level from which the sub-command was invoked. Should a quit (or exit), intended for a sub-command, be encountered at top-level, the program can terminate before any results are saved. To avoid this, the base cmd2 commands have been overridden:

  • exit is only available from the top level.

  • done is only available from sub-commands.

  • quit is unsafely available from both.

Thus, if done is used exclusively to finish a sub-command and is invoked accidentally at the top level, it will lead to an unrecognized command error, and the remaining top-level commands (eg. saving results) will be executed before the program is terminated with exit. The unsafe quit can be used interactively and repeatedly to bail out of a failed run if the sub-command level is unclear.
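The difference between done and quit in a failed script can be traced with a toy model of the control flow. The sketch below is a simplified simulation with two levels only; it is not the cmd2 or RSRef implementation, and the `simulate` function and its outcome strings are hypothetical.

```python
# Commands documented above as having sub-commands.
SUBCOMMAND_MENUS = {"parameterize", "analyze", "restraints"}

def simulate(script, failing=()):
    """Toy model of a scripted run; returns (level, command, outcome) records."""
    log, in_sub = [], False
    for command in script:
        level = "sub" if in_sub else "top"
        if in_sub and command in failing:
            outcome, in_sub = "exception: sub-command terminated", False
        elif in_sub and command in ("done", "quit"):
            outcome, in_sub = "return to top level", False
        elif not in_sub and command in ("exit", "quit"):
            log.append((level, command, "program terminated"))
            break
        elif command in ("done", "exit"):
            # done at the top level, or exit inside a sub-command
            outcome = "unrecognized command"
        elif not in_sub and command in SUBCOMMAND_MENUS:
            outcome, in_sub = "enter sub-command", True
        else:
            outcome = "executed"
        log.append((level, command, outcome))
    return log

safe = ["parameterize", "torsion", "done", "pdbout", "exit"]
unsafe = ["parameterize", "torsion", "quit", "pdbout", "exit"]

for name, script in (("done", safe), ("quit", unsafe)):
    print(name)
    for record in simulate(script, failing={"torsion"}):
        print("   ", record)
```

With torsion failing, the stray done is merely rejected at the top level and pdbout still runs, whereas the stray quit ends the program before pdbout is reached.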

python interpreter

py (without statement) opens a python shell within which multiple statements can be executed, terminating the shell with Ctrl-D, exit() or quit(). These provide powerful ways of customizing the programs and extending functionality beyond the commands that are provided.

In our extension of cmd2, namespace my provides access to objects within the task-space of the program. Thus, for example, atomic B-factors could be printed or manipulated using my.atoms.b, and program options with my.option.resolution (for example).

The python shell is executed in its own namespace, so modules (such as sys or numpy) have to be imported explicitly.

Additional examples are given in sections “Python Interpreter for advanced functionality” in the documentation for specific programs.
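A session of the kind described might look like the sketch below. Outside RSRef the `my` namespace does not exist, so a stand-in is built here with `types.SimpleNamespace`; inside the py shell that first block would be omitted. The attribute names `my.atoms.b` and `my.option.resolution` are taken from the text above, while their types (a sequence of floats and a single float) and the values are placeholder assumptions.

```python
import statistics                      # modules must be imported explicitly
from types import SimpleNamespace

# Stand-in for the namespace provided by the program; values are placeholders.
my = SimpleNamespace(
    atoms=SimpleNamespace(b=[12.5, 20.0, 27.5]),
    option=SimpleNamespace(resolution=3.0),
)

# Statements of the kind that could be typed within the py shell:
mean_b = statistics.mean(my.atoms.b)
print(f"mean B = {mean_b:.1f} at resolution {my.option.resolution}")
```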

Limitations, bugs & work-arounds

The single-line variant, py statement, executing a single python statement, was deprecated in package cmd2 v2.4. There may be legacy scripts that will need updating.

  • Annotations

    py print("\nAnnotation for stdout") was a common use. Consider the alternative: shell echo -e "\nAnnotation for stdout" which will avoid additional output from interpreter start and termination. The shell alternative is only possible if access to python attributes is not required.

  • Redirects (> , < and |) on the py command line

    are captured by the cmd2 parser, not any shell that might redirect io for the python interpreter. This means that shell heredoc file and variables are not supported within py.

  • Support for compound single-line statements (py stmt1; stmt2)

    with a ‘;’ separator disappeared from cmd2 prior to v2.4. A common usage was to import a module, then use a module attribute. Subsequently, code following ‘;’ was ignored silently, affecting some legacy scripts.

All of these issues are by-passed by invoking the full python shell instead of the single-line py command.

Other commands available in all applications

Use application help <command> for further details:

alias

Manage aliases

edit

Run a text editor and optionally open a file with it

exit

Safe program exit from top level, not subcommands.

help

List available commands or help for one or all commands

history

View, run, edit, save, or clear previously entered commands

macro

Manage macros

parrot

Echo commands (or not)

py

Invoke Python shell, ending with exit(); single line “py command” no longer supported; see comments above and also shell

quit

Exit this application

run_pyscript

Run a Python script file inside the console

run_script

Run commands in script file that is encoded as either ASCII or UTF-8 text

set

Set a settable parameter or show current settings of parameters

shell

Execute a command as if at the OS prompt

shortcuts

List available shortcuts

test

test [<name>]: run test (developers only)

Command list

The most up-to-date documentation is generated from rsref.py /dev/null help exit:

Documented commands (use 'help -v' for verbose/'help <topic>' for details):
===========================================================================
alias     help          neighbors     profile    restraints    set
analyze   history       parameterize  py         rotamer       shell
edit      image_refine  parrot        quit       run_pyscript  shortcuts
evaluate  macro         pdbout        randomize  run_script    test
exit      map           perturb       refine     select

Miscellaneous help topics:
==========================
SELECTION_EXPR

pasto.rsref>  exit

Command Help

The most up-to-date documentation is generated from rsref.py then help -rl

alias

Usage: alias [-h] SUBCOMMAND ...

Manage aliases

An alias is a command that enables replacement of a word by another string.

optional arguments:
  -h, --help  show this help message and exit

subcommands:
  SUBCOMMAND
    create    create or overwrite an alias
    delete    delete aliases
    list      list aliases

See also:
  macro

analyze

usage: analyze [-h] [-s]

Analyze prior refinement: shifts, gradients & impact of parameters.

options:
  -h, --help     show this help message and exit
  -s, --summary  Summary only without itemizing groups or dihedrals.

analyze sub-commands

convergence
usage: convergence [-h] [-s]

Analyze the gradients and shifts of prior refinement.

options:
  -h, --help     show this help message and exit
  -s, --summary  Summary only without itemizing groups or dihedrals.
dihedrals
usage: dihedrals [-h]

Changes in torsion angles and their impact

options:
  -h, --help  show this help message and exit
dihedrals sub-commands
hinges
usage: hinges [-h] [-i] [-g INT] [-t FLOAT] [-p] [-s START] [-e END]

Find hinges in dihedral changes.

options:
  -h, --help            show this help message and exit
  -i, --information     Report statistics, then exit immediately.
  -g INT, --gap INT     # residues w/o dihedral changes that can be bridged in a hinge
  -t FLOAT, --threshold FLOAT
                        above which dihedral rotations considered hinges. Degrees if > 1.0, else fraction of total change.
  -p, --pseudo          combine phi_i with psi_i-1, else individual dihedrals.
  -s START, --start START
                        start of an explicitly defined hinge: chain-residue, requires --end, repeatable.
  -e END, --end END     end of an explicitly defined hinge: chain-residue, requires --start, repeatable.
impact
usage: impact [-h] [-i] [-I IDENTIFY] [-p] [-R]

Estimate impact of refined torsion angle changes on superimposition.

options:
  -h, --help            show this help message and exit
  -i, --iterative       Iteratively determines/applies highest impact dihedral changes. This rarely-needed option is compute-
                        intensive, because it iteratively finds the single, highest impact dihedral, applies the rotation, and
                        repeats. Otherwise, by default, impact is assessed more rapidly from the dot product of the gradient and
                        shift vectors, integrated over refinement iterations.
  -I IDENTIFY, --identify IDENTIFY
                        List this number of the top-impact dihedrals.
  -p, --pseudo          Use approximation to pseudo-torsion angles (phi_i + psi_{i-1}) instead of individual phi, psi. (Limits
                        output options, but speeds option iterative.)
  -R, --recover         Use previous calculation of impact, other options invalid.
impact sub-commands
color
usage: color [-h] [-p] file

Output coordinates, B-factors set to percent impact on fit of dihedrals.

positional arguments:
  file          PDB output

options:
  -h, --help    show this help message and exit
  -p, --pseudo  Use pseudo-torsion angle (phi_i + psi_i-1), else individual phi, psi; ignored if impact --pseudo.
pickle
usage: pickle [-h] file

Save impact-needed data, so impact.py can replot without recalculation (when tweaking plot). WARNING: objects pickled by superpose
must be compatible with impact - has not been checked recently.

positional arguments:
  file        output jar format, pickled impact object for later analysis.

options:
  -h, --help  show this help message and exit
plot
usage: plot [-h] [-p impact|change] [-c] [-t] [-f FILE] [-A FILE] [-O]

Graph impact of dihedrals on fit.

options:
  -h, --help            show this help message and exit
  -p impact|change, --prior impact|change
                        Replace impact or dihedral change with values from previous refinement (to superimpose restrained impact
                        upon unrestrained dihedral changes).
  -c, --changeOnly      plot only dihedral changes, not estimate of impact.
  -t, --tty             Display graph on terminal (not recommended for background jobs).
  -f FILE, --file FILE  save plot in graphics file of type given by extension (.emf, eps, jpeg, jpg, pdf, png, ps, raw, rgba, svg,
                        svgz, tif, tiff)
  -A FILE, --annotation FILE
                        File containing additional plot commands, overriding command-line input.
  -O, --overall         Dihedral changes (not impact) relative to reference (input) structure not refinement batch.
print
usage: print [-h] [-O]

Tabulate changes in dihedrals & impact on fit.

options:
  -h, --help     show this help message and exit
  -O, --overall  Dihedral changes (not impact) relative to reference (input) structure not refinement batch.

(Also: alias, done, edit, help, history, macro, parrot, py, quit, run_pyscript, run_script, set, shell, shortcuts, test; documented in parent dihedrals command.) (end impact sub-commands)


paint
usage: paint [-h] [-p] [-n] file

Output coordinates, B-factors set to dihedral change for visualization.

positional arguments:
  file              PDB output

options:
  -h, --help        show this help message and exit
  -p, --pseudo      Use pseudo-torsion angle (phi_i + psi_i-1),else individual phi, psi.
  -n, --normalized  Scaled between 0 & 100.

Output PDB file has B-factors replaced by magnitude of phi/psi changes!

(Also: alias, done, edit, help, history, macro, parrot, py, quit, run_pyscript, run_script, set, shell, shortcuts, test; documented in parent analyze command.) (end dihedrals sub-commands)


done
usage: done [-h]

Safe sub-menu exit, returning to higher level command.

options:
  -h, --help  show this help message and exit

(Also: alias, edit, help, history, macro, parrot, py, quit, run_pyscript, run_script, set, shell, shortcuts, test; documented in parent pasto.rsref command.) (end analyze sub-commands)


edit

Usage: edit [-h] [file_path]

Run a text editor and optionally open a file with it

The editor used is determined by a settable parameter. To set it:

  set editor (program-name)

positional arguments:
  file_path   optional path to a file to open in editor

optional arguments:
  -h, --help  show this help message and exit

evaluate

usage: evaluate [-h] [-v] [FILE]

Calculate statistics comparing current model to map.

positional arguments:
  FILE           [different-pdb-file]

options:
  -h, --help     show this help message and exit
  -v, --verbose  Including scaling information.

exit

usage: exit [-h]

Safe program exit from top level, not subcommands.

options:
  -h, --help  show this help message and exit

help

usage: help [-h] [-v] [-d] [-l] [-r] [command] ...

List available commands or help for one or all commands

positional arguments:
  command          [command-name, optional]
  subcommands      subcommand(s) to retrieve help for

options:
  -h, --help       show this help message and exit
  -v, --verbose    single-line description of each command
  -d, --descend    Document not the command(s) but sub-commands thereof
  -l, --long       Full (multi-line) help for each command.
  -r, --recursive  Recursively descend through nested command sets.

history

Usage: history [-h] [-r | -e | -o FILE | -t TRANSCRIPT_FILE | -c] [-s] [-x] [-v] [-a] [arg]

View, run, edit, save, or clear previously entered commands

positional arguments:
  arg                   empty               all history items
                        a                   one history item by number
                        a..b, a:b, a:, ..b  items by indices (inclusive)
                        string              items containing string
                        /regex/             items matching regular expression

optional arguments:
  -h, --help            show this help message and exit
  -r, --run             run selected history items
  -e, --edit            edit and then run selected history items
  -o, --output_file FILE
                        output commands to a script file, implies -s
  -t, --transcript TRANSCRIPT_FILE
                        output commands and results to a transcript file,
                        implies -s
  -c, --clear           clear all history

formatting:
  -s, --script          output commands in script format, i.e. without command
                        numbers
  -x, --expanded        output fully parsed commands with any aliases and
                        macros expanded, instead of typed commands
  -v, --verbose         display history and include expanded commands if they
                        differ from the typed command
  -a, --all             display all commands, including ones persisted from
                        previous sessions

image_refine

usage: image_refine [-h] [-M] [-B | -E] [-R] [-C INT] [-i FLOAT] [-g FLOAT] [-f] [-v INT] [-?]

Refine image (map) parameters to maximize agreement with atomic model.

options:
  -h, --help            show this help message and exit
  -M, --magnification   Optimize magnification.
  -B, --overall_B       Optimize overall B (-B & -E mutually exclusive).
  -E, --envelope        Optimize EM envelope function (-B & -E mutually exclusive).
  -R, --resolution      Optimize resolution (low-pass filter on atomic model to best match map).
  -C INT, --max_cycles INT
                        Maximum number of cycles.
  -i FLOAT, --min_improvement FLOAT
                        End when per-cycle improvement falls below this value.
  -g FLOAT, --min_grad FLOAT
                        End when gradient norm falls below this value.
  -f, --finite_difference
                        Numerical derivatives instead of analytical.
  -v INT, --verbosity INT
                        Per-cycle logging: -1 (terse) to 5 (verbose).
  -?, --confirm         Request Y/N confirmation before acceptance.

macro

Usage: macro [-h] SUBCOMMAND ...

Manage macros

A macro is similar to an alias, but it can contain argument placeholders.

optional arguments:
  -h, --help  show this help message and exit

subcommands:
  SUBCOMMAND
    create    create or overwrite a macro
    delete    delete macros
    list      list macros

See also:
  alias

map

usage: map [-h] [-c] [-d] [-o] [-s] [-t TITLE] FILE

Write density map(s) on the grid of the input map.

positional arguments:
  FILE                  map

options:
  -h, --help            show this help message and exit
  -c, --calculated      Simulated from the atomic model using current imaging parameters.
  -d, --difference      Observed minus calculated, scaled & using imaging parameters.
  -o, --observed        Input experimental map, having applied any change in magnification.
  -s, --scaled          Input experimental map, having applied any change in magnification, and scaled to atomic model (required
                        command-line option --scale_to_model.)
  -t TITLE, --title TITLE
                        Title used in map header (quote to protect white-space & newlines).

Model and difference density is calculated for atoms and their immediate (symmetry-related) neighbors that have been selected
previously. This is consistent with RSRef's modality of evaluating the fit over molecules rather than a 3D unit cell volume. If
the structure has molecular or crystallographic symmetry, and the statistics of the map (scale, correlation, sigma) are to be
analyzed externally throughout a parallelepiped, symexp.py should be used first to fill the volume (plus a margin) with the
symmetry-related atoms. (Although popular, such volume-based assessments, without a molecular mask, are discouraged,
because the solvent volume massages the statistics.)

neighbors

usage: neighbors [-h] [-d FLOAT]

Identify neighbors within distance of (selected) atoms.

options:
  -h, --help            show this help message and exit
  -d FLOAT, --distance FLOAT
                        Searches for neighbors within distance of any atom (default: 3.5 A).

parameterize

usage: parameterize [-h] [-b] [-o] [-p]

Designate how chosen atomic parameter(s) are to be refined.

options:
  -h, --help       show this help message and exit
  -b, --bfactor    Designate which B-factors to be refined.
  -o, --occupancy  Designate which occupancies to be refined.
  -p, --position   Designate which atomic positions (xyz) to be refined.

Repeatable, to include additional parameters in coming refinements.

parameterize sub-commands

SELECTION: NAME | EXPRESSION
NAME: pre-defined selection created by the top level command “select”.

Examples: rigid[‘N_domain’] or my_subdomain (if previously defined).

EXPRESSION: array expression using keywords such as residue number, chain, atom number, synonyms or unique abbreviations. Criteria can be combined with python operators: &, |, ^, ==, !=, ~, >, >=, <, <=, (, ), ... or Fortran synonyms which will be translated into python: .and., .OR., .xor., .NEQV., .eq., /=, .NOT., .gt., .GE., etc. Use quotes to escape white space, and to avoid command line parsing redirection (>) or piping (|), use .GT., .ge., .OR. or ‘^’. Quoting EXPRESSION avoids shell interpretation of special characters.

Examples: chain==B ‘(chai == C) & (residue_ty != HOH)’ ‘(residue num <= 105) ^ ((resnum .ge. 300) & (resnam != ATP))’
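The translation of Fortran-style synonyms into python operators can be pictured as a case-insensitive substitution. The sketch below is an assumption about how such a translation might work, not RSRef's actual code: the mapping covers the operators named above, adds .lt., .le. and .ne. by analogy, and treats .NEQV. as exclusive or. The names `FORTRAN_TO_PYTHON` and `translate` are hypothetical.

```python
import re

# Assumed mapping; the program's real translation table may differ.
FORTRAN_TO_PYTHON = {
    ".and.": "&", ".or.": "|", ".xor.": "^", ".neqv.": "^",
    ".not.": "~", ".eq.": "==", ".ne.": "!=", "/=": "!=",
    ".gt.": ">", ".ge.": ">=", ".lt.": "<", ".le.": "<=",
}

# One alternation of all escaped synonyms, matched regardless of case.
_PATTERN = re.compile(
    "|".join(re.escape(key) for key in FORTRAN_TO_PYTHON),
    re.IGNORECASE,
)

def translate(expression):
    """Replace Fortran-style operators with their python equivalents."""
    return _PATTERN.sub(
        lambda match: FORTRAN_TO_PYTHON[match.group(0).lower()],
        expression,
    )

translated = translate("(resnum .ge. 300) .AND. (resnam /= ATP)")
print(translated)
```

Because the Fortran spellings contain no >, < or | characters, an expression written this way passes through the command-line parser without being mistaken for a redirect or pipe.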

clear
usage: clear [-h]

Switch off all refinement of requested parameter type.

options:
  -h, --help  show this help message and exit

Individual parameterizations can be switched off with "group None", "individual None", "torsion None", "overall False"
done
usage: done [-h]

Safe sub-menu exit, returning to higher level command.

options:
  -h, --help  show this help message and exit
group
usage: group [-h] [group]

Select atoms to be refined as one or more groups.

positional arguments:
  group       GROUP | COLLECTION | SELECTION | SELECTION_EXPR | GROUP_EXPR

options:
  -h, --help  show this help message and exit

SELECTION: a previously saved single Selection, given as: COLLECTION['ITEM']. (COLLECTION or ['ITEM'] can be omitted if defaults
were used in the corresponding select command.) SELECTION_EXPR: boolean Selection expression, see help select. GROUP_EXPR: list,
dictionary or tuple of multiple quoted SELECTION_EXPR. GROUP | COLLECTION: name of previously saved (cmd select) Selections.
individual
usage: individual [-h] [selection]

Select atoms to be refined individually. (More apropos with exptl. data than superposition.)

positional arguments:
  selection   SELECTION_EXPR | SELECTION:

options:
  -h, --help  show this help message and exit

SELECTION_EXPR: a boolean expression, see help select; SELECTION: name of a previously saved Selection, given as:
COLLECTION[ITEM]. (COLLECTION or [ITEM] can be omitted if defaults were used in the corresponding select command.)
overall
usage: overall [-h] [overall]

Refine requested parameter type as a single group.

positional arguments:
  overall     [True (default)| False]

options:
  -h, --help  show this help message and exit

If just refining overall position, command overlay is better (wider convergence/faster). Use command overall when also
simultaneously refining sub-groups, torsion angles etc..
print
usage: print [-h] [-w INT] [-i SIZE] [-a SIZE] [-c | -b]

Print parameterization. (Options invoked until reset.)

options:
  -h, --help            show this help message and exit
  -w INT, --width INT   Line width (def. 132).
  -i SIZE, --items SIZE
                        Groups are truncated at SIZE Selections, followed by summary statistics (def. 10).
  -a SIZE, --abbreviate SIZE
                        Selections longer than SIZE abbreviated w/ ellipsis (def. 33).
  -c, --count           Report number of moving atoms in selections.
  -b, --boolean         Report selections as T/F boolean arrays (def).
torsion
usage: torsion [-h] [-d SELECTION] [-a SELECTION] [-f phi|psi|pseudo] [-t NUMBER(int)|FRACTION(float)] [-V phi|psi|pseudo] [-v]
               [-s SELECTION] [-S SIZE OFFSET SIZE OFFSET] [-w SELECTION]
               [selection]

Select variable dihedrals and atoms whose positions depend on them.

positional arguments:
  selection             used by -f/-V & resets -d default; See help SELECTION.

options:
  -h, --help            show this help message and exit
  -d SELECTION, --dihedrals SELECTION
                        optimize all variable dihedrals (phi, psi) within SELECTION. Default: macromolecule
  -a SELECTION, --atoms SELECTION
                        atoms to be moved by dihedral rotations if linked to variable bonds (which must be fully enclosed in
                        SELECTION). Default: macromolecule
  -f phi|psi|pseudo, --fix phi|psi|pseudo
                        Fix all dihedrals of named type within atoms in SELECTION (must be only option/argument).
  -t NUMBER(int)|FRACTION(float), --top NUMBER(int)|FRACTION(float)
                        Fix all but these dihedral angles (must be only option/argument; requires prior refinement; to be followed
                        by refine --restart; mostly superseded by --torsion_weight command line option).
  -V phi|psi|pseudo, --vary phi|psi|pseudo
                        Vary all dihedrals of named type within atoms in SELECTION (must be only option/argument).
  -v, --verbose         List selected dihedrals (--top).
  -s SELECTION, --segment SELECTION
                        optimize all variable dihedrals (phi, psi) in a segment. (Repeat -s for each segment; incompatible w/
                        -adftvSV)
  -S SIZE OFFSET SIZE OFFSET, --auto_segment SIZE OFFSET SIZE OFFSET
                        Split SELECTION into segments of SIZE residues, starting at 0th residue + OFFSET in "within"
  -w SELECTION, --within SELECTION
                        Auto-segment only within these residues, default: macromolecule

(Also: alias, edit, help, history, macro, parrot, py, quit, run_pyscript, run_script, set, shell, shortcuts, test; documented in parent pasto.rsref command.) (end parameterize sub-commands)


parrot

usage: parrot [-h] [parrot [True|False|Yes|No]]

Echo commands (or not)

positional arguments:
  parrot [True|False|Yes|No]
                        abbrev. OK; default is to toggle

options:
  -h, --help            show this help message and exit

(similar to set echo True|False.)

pdbout

usage: pdbout [-h] [-o FILE] [-a] [-c] [altfile] [header]

Write coordinates (symmetry expansion as per command-line arguments).

positional arguments:
  altfile               [file name]
  header                [optional quoted text]

options:
  -h, --help            show this help message and exit
  -o FILE, --file FILE  Output file name (required argument).
  -a, --anisotropic     Output anisotropic U, riding w/ coordinates from input PDB.
  -c, --c_alpha         Output only C_alphas.

perturb

usage: perturb [-h] [-i] [-n INT] [-r INT] [-d x y z] [-s INT] [-v] FLOAT

Perturb model & calculate statistics.

positional arguments:
  FLOAT                 displacement (A; def. 1.0)

options:
  -h, --help            show this help message and exit
  -i, --individual      Perturb atoms individually, else as rigid body.
  -n INT, --steps INT   Divide displacement into n logarithmic steps (default: 1).
  -r INT, --repeats INT
                        Number of random displacements used to calculate statistics (default: 1).
  -d x y z, --direction x y z
                        Unit vector for direction of perturbation. Chosen randomly if None (default).
  -s INT, --seed INT    Seed for random number generator.
  -v, --verbose         Additional statistics.

profile

usage: profile [-h] [-B FLOAT] [-a TYPE]

Density of single atom vs. distance; a help in setting cut-off radii

options:
  -h, --help            show this help message and exit
  -B FLOAT, --B_factor FLOAT
                        Atomic B-factor (default: 10.0).
  -a TYPE, --atom TYPE  Atom type (def.: C).

py

Usage: py [-h]

Run an interactive Python shell

optional arguments:
  -h, --help  show this help message and exit

quit

Usage: quit [-h]

Exit this application

optional arguments:
  -h, --help  show this help message and exit

randomize

usage: randomize [-h] [-x FLOAT] [-B FLOAT] [-O FLOAT] [-s INT]

Randomize coordinates (with normal error distributions).

options:
  -h, --help            show this help message and exit
  -x FLOAT, --xyz FLOAT
                        Magnitude of desired RMS xyz positional displacement (default: 0.0).
  -B FLOAT, --B_factor FLOAT
                        Desired standard deviation in B-factor (default: 0.0).
  -O FLOAT, --occupancy FLOAT
                        Desired standard deviation in occupancy (default: 0.0).
  -s INT, --seed INT    Seed for random number generator.
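The relation between a requested RMS positional displacement and normally distributed per-axis errors can be sketched as follows. This is an illustration under stated assumptions, not the program's implementation: the RMS is taken to refer to the 3D displacement, so each axis receives a standard deviation of rms/sqrt(3), and the function name `randomize_xyz` is hypothetical.

```python
import math
import random

def randomize_xyz(coords, rms, seed=None):
    """Add normally distributed shifts whose expected 3D RMS magnitude is rms."""
    rng = random.Random(seed)         # a fixed seed makes the run reproducible
    sigma = rms / math.sqrt(3.0)      # E[dx^2 + dy^2 + dz^2] = 3 * sigma^2 = rms^2
    return [tuple(c + rng.gauss(0.0, sigma) for c in xyz) for xyz in coords]

original = [(0.0, 0.0, 0.0)] * 20000
shifted = randomize_xyz(original, rms=0.5, seed=1)

# Measured RMS displacement over all atoms.
measured = math.sqrt(
    sum(sum((a - b) ** 2 for a, b in zip(new, old))
        for new, old in zip(shifted, original)) / len(original)
)
print(f"requested 0.5, measured {measured:.3f}")
```

With a finite number of atoms the measured RMS only approximates the requested value; whether the program rescales the shifts to match exactly is not stated in the help text.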

refine

usage: refine [-h] [-m {l_bfgs_b,bfgs,Powell,CG,Newton_CG,TNC}] [-i FLOAT] [-a FLOAT] [-g FLOAT] [-C INT] [-n] [-r] [-v INT]

Refine atomic model, per previously defined parameterization.

options:
  -h, --help            show this help message and exit
  -m {l_bfgs_b,bfgs,Powell,CG,Newton_CG,TNC}, --method {l_bfgs_b,bfgs,Powell,CG,Newton_CG,TNC}
                        Optimization method, see scipy.optimize.minimize documentation. (Find l_bfgs_b more stable than bfgs, more
                        efficient than others.)
  -C INT, --max_cycles INT
                        Max iterations (Powell, bfgs, Newton-CG, l_bfgs_b, TNC).
  -n, --new_batch       New batch (losing prior history), else continue prior (default) if exists/possible.
  -r, --restart         From original coordinates (implies -n).
  -v INT, --verbosity INT
                        Per-cycle logging: -1 (terse) to 3 (verbose) [def. 0].

Convergence criteria for ending:
  applicable to specified methods, else ignored

  -i FLOAT, --min_improvement FLOAT
                        Per-cycle minimal residual fractional improvement (l-bfgs-b, TNC).
  -a FLOAT, --accuracy FLOAT
                        Maximal relative estimated error for any parameter (Powell, Newton_CG).
  -g FLOAT, --min_grad FLOAT
                        Minimal gradient norm, internal (pre-conditioned) units (CG, bfgs, l_bfgs_b, TNC).

restraints

usage: restraints [-h] [-i]

Input (and statistics) for supplementary restraints.

options:
  -h, --help         show this help message and exit
  -i, --information  Report statistics, then exit immediately.

restraints sub-commands

distance
usage: distance [-h] [-c] [-i] [-I] [-M MAX] [-m MIN] [-w WEIGHT]
                SELECTION_1 [AND] SELECTION_2 [SELECTION_1 [AND] SELECTION_2 ...]

SELECTION_1 [AND] SELECTION_2

positional arguments:
  SELECTION_1 [AND] SELECTION_2
                        Atom selections restrained to each other

options:
  -h, --help            show this help message and exit
  -c, --clear           Reinitialize before adding distance restraint(s).
  -i, --intramolecular  Consider only distances within one chain.
  -I, --intermolecular  Consider only distances between two different chains.
  -M MAX, --max MAX     Maximum distance between selected atoms
  -m MIN, --min MIN     Minimum distance (if any) between selected atoms
  -w WEIGHT, --weight WEIGHT
                        Weight on the restraint

All atoms in Selection_1 will be restrained, each to any in Selection_2; The two Selections either need to be separated w/ AND, or
white space (& special characters) need to be protected in quotes or parentheses; See main-menu help SELECTION. eg. distance
--min=6.2 -M 15.0 -w 2.5 ((atype==O) & (resnum==17)) "(resnum == 19) & (atype == O)"
done
usage: done [-h]

Safe sub-menu exit, returning to higher level command.

options:
  -h, --help  show this help message and exit

Report statistics on ancillary restraints.

(Also: alias, edit, help, history, macro, parrot, py, quit, run_pyscript, run_script, set, shell, shortcuts, test; documented in parent pasto.rsref command.) (end restraints sub-commands)


rotamer

usage: rotamer [-h] [-v INT] [-S] [-p PROBABILITY] [--Nof2d] [--NoFlipSym] [--FlipPseudoSym] [-V VDW] SELECTION

Compare to, and optionally reset amino acids to standard rotamers.

positional arguments:
  SELECTION             Atom selection for residues to be examined / changed (def.: macromolecule). See help SELECTION.

options:
  -h, --help            show this help message and exit
  -v INT, --verbosity INT
                        Print statistics. Option: 0) silent, 1) terse (default), 2) verbose, 3) detailed.
  -S, --standardize     Adjust residue Chi angles to the closest rotamer library entries ('--Nof2d' argument) or standard rotamers
                        best fitting the density (default). Changes atomic structure.
  -p PROBABILITY, --probability PROBABILITY
                        Only consider library rotamers with probability (%) equal to or above this value. Default: 0.0 (%).
  --Nof2d               Do not consider fit to density. Select standard rotamer per agreement of Chi angles.
  --NoFlipSym           No chi+180 rotations in matching rotamers of symmetric sidechains (eg. chi_2 in Asp carboxylate).
  --FlipPseudoSym       Test whether chi+180 rotations of pseudo-symmetric sidechains give better match to standard rotamers (eg.
                        chi_2 His).
  -V VDW, --vdw VDW     Weight on any van der Waals term to be combined with density-fitting. (NOT YET IMPLEMENTED)

run_pyscript

Usage: run_pyscript [-h] script_path ...

Run a Python script file inside the console

positional arguments:
  script_path       path to the script file
  script_arguments  arguments to pass to script

optional arguments:
  -h, --help        show this help message and exit

run_script

Usage: run_script [-h] [-t TRANSCRIPT_FILE] script_path

Run commands in script file that is encoded as either ASCII or UTF-8 text

Script should contain one command per line, just like the command would be
typed in the console.

If the -t/--transcript flag is used, this command instead records
the output of the script commands to a transcript for testing purposes.

positional arguments:
  script_path           path to the script file

optional arguments:
  -h, --help            show this help message and exit
  -t, --transcript TRANSCRIPT_FILE
                        record the output of the script as a transcript file

select

usage: select [-h] [-C NAME] [-n NAME] [-a ATOM_ATTR] [-S NAME] [selection_expr]

Name selection(s) of atoms.

positional arguments:
  selection_expr        [SELECTION_EXPR] SELECTION_EXPR: logical expression of <Selection(s)> (string) using array logical
                        operators: &, |, ==, !=, ~, >, >=, (, ), ... or Fortran synonyms, which circumvent the cmd2 parsing of '>'
                        & '|' as redirects before translation into python: .and., .OR., .xor., .NEQV., .eq., /=, .NOT., .gt.,
                        .GE., etc. evaluated in the task namespace. <Selection> is an existing instance of class Selection or a
                        new one instantiated with S(<criterion>), where string <criterion> is a logical array expression using
                        keywords such as residue number, chain, atom number, synonyms or unique abbreviations, see documentation
                        for Selection. SELECTION_EXPR should be quoted to avoid shell commands / redirects etc, and if provided as
                        a named argument, must be devoid of white-space. Examples:
                          select -C rigid -n N_domain S('chain == A') & S('residue num <= 105')
                          select -C protein -n C S('(chai == C) & (residue_ty != HOH)')
                          select -C all -n catalytic "rigid['N_domain'] | S('resnam == ATP')"
                          select -C mycollection -a resnum -F protein -S C
                          select --collection=chains --attr=chain

options:
  -h, --help            show this help message and exit
  -C NAME, --collection NAME
                        Collection (dictionary) into which selection is placed (default: None --> "collection" or name of
                        attribute if -a specified).
  -n NAME, --name NAME  Unique name to be given to selection (default: None --> "default"). (-a (--attr) & -n (--name) are
                        mutually exclusive.)
  -a ATOM_ATTR, --attr ATOM_ATTR
                        Selections are made (and named) for each unique value of ATOM_ATTR in the coordinates (see -d, -f).
                        ATOM_ATTR must be specified as a single-word abbreviation/synonym recognized in Selection expressions, eg.
                        --attr=chain might give selections A, B..., while --attr=resnum might give 23, 24,... (-a (--attr) & -n
                        (--name) are mutually exclusive.)
  -S NAME, --selection NAME
                        Selection or dictionary (group) from which --attr subset is to be drawn (default: None --> all atoms).

set

usage: set [-h] [-v] [param] [value]

Set a settable parameter or show current settings of parameters

positional arguments:
  param          parameter to set or view
  value          new value for settable

options:
  -h, --help     show this help message and exit
  -v, --verbose  include description of parameters when viewing

shell

Usage: shell [-h] command ...

Execute a command as if at the OS prompt

positional arguments:
  command       the command to run
  command_args  arguments to pass to command

optional arguments:
  -h, --help    show this help message and exit

shortcuts

Usage: shortcuts [-h]

List available shortcuts

optional arguments:
  -h, --help  show this help message and exit


test [<name>]: run test (developers only)
pasto.rsref>  exit

Additional considerations

Analyze

Run after refine, analyze provides statistics on shifts, useful when documenting a refinement. It also provides gradients, useful for retrospective adjustment of parameter scales (--unit* arguments), making sure that all parameters have opportunity for refinement. The issue of parameter scaling has been described in the section on Image Parameter Refinement. Although less common, it can also be an issue in atomic refinement, particularly with rigid group or torsional parameterizations for which the gradients can depend on fragment size, and the default internal calibrations may therefore not be good enough. Analyze provides diagnostics for the user to assess convergence.

Warning: if refine is run repeatedly, cycles will be concatenated as if a single run, provided that model parameters have not been changed. If there have been changes, refinement cycles will restart from zero and analyze will report only on the last batch, not the entire session.

Evaluate

Compares the current model (or the optional new_file) to the map, calculating the local correlation coefficient, real-space R-factor, RMS and residual. (By local, we mean map pixels closer to any atom than the distance set by the --map_use argument.)
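
The statistics named above can be sketched with NumPy as follows. This is an illustration only: the function name local_statistics is hypothetical, the two maps are assumed to be on a common scale already, and the real-space R-factor is assumed to take the form Σ|ρobs - ρcalc| / Σ|ρobs + ρcalc|, which may differ from RSRef's own definition.

```python
import numpy as np

def local_statistics(rho_obs, rho_calc, near_atom):
    """Agreement statistics over the voxels flagged True in near_atom."""
    obs = rho_obs[near_atom]
    calc = rho_calc[near_atom]
    cc = np.corrcoef(obs, calc)[0, 1]                      # correlation coefficient
    r_factor = np.sum(np.abs(obs - calc)) / np.sum(np.abs(obs + calc))
    residual = np.sum((obs - calc) ** 2)                   # least-squares residual
    rms = np.sqrt(residual / obs.size)                     # RMS difference
    return cc, r_factor, residual, rms
```

Here near_atom plays the role of the "local" mask, i.e. a boolean array that is True for map pixels within the chosen distance of any atom.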

The -v or --verbose option prints additional scaling statistics. These can be used by EMAN or Spider to put maps on an absolute scale, but this has been superseded by the map command of RSRef, which can do it internally.

Image_refine

Refines experimental parameters to improve the agreement between map and model. These parameters change how density is calculated from the atomic model. Maps that are output (map) will all be adjusted by changes in magnification. Calculated and difference maps, but not output observed maps, will reflect changes in B, envelope and resolution.

Parameters:
Magnification

The relative magnification, a scale factor by which the voxel (or unit cell) dimensions of the map should be uniformly divided for optimal agreement with the model.

Overall_B

An isotropic temperature factor that is added to all B-factors, commonly used in crystallography. Atomic densities are calculated from corrected scattering factors: f = f_o exp{-B_overall (λ/2d)^2}. Overall_B is covariant with envelope, so the two cannot be refined together. Both are added to the atomic B-factors before density calculation.

Envelope

A Gaussian attenuation that is commonly used in electron microscopy to account for beam incoherence, detector point-spread etc.. Atomic densities are calculated from the corrected scattering factors: f = f_o exp{-EM_envelope (λ/d)^2}, noting the 4-fold difference compared to overall_B. Note that EM 3D reconstructions have often had an inverse envelope correction applied, so refinement could yield a positive or negative incremental correction. Not only is envelope covariant with overall_B, but also, partially, with resolution, below, so it may be possible to refine only one of these parameters at any time.
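
The 4-fold relationship between the two corrections can be checked numerically. The sketch below simply transcribes the two expressions as written in the Overall_B and Envelope paragraphs. The function names are hypothetical, and the argument wavelength_over_d stands for the ratio λ/d.

```python
import numpy as np

def attenuation_overall_b(b_overall, wavelength_over_d):
    """exp{-B_overall (lambda/2d)^2}, as written for Overall_B."""
    return np.exp(-b_overall * (wavelength_over_d / 2.0) ** 2)

def attenuation_envelope(em_envelope, wavelength_over_d):
    """exp{-EM_envelope (lambda/d)^2}, as written for Envelope."""
    return np.exp(-em_envelope * wavelength_over_d ** 2)

ratio = np.linspace(0.0, 0.5, 6)
# An envelope of 10 attenuates exactly as an overall B of 40 does.
same = np.allclose(attenuation_overall_b(40.0, ratio),
                   attenuation_envelope(10.0, ratio))
```

Because the two factors have the same functional form and differ only by this constant, no data can distinguish one from the other, which is why they cannot be refined together.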

Resolution

A “soft” resolution limit for the low-pass 5th order Butterworth filter that maximizes the consistency between model and map. The signal is attenuated smoothly either side of this limit, so it is distinct from the “hard” resolution of the --high_resolution command-line argument, beyond which the signal is zero. Both can be applied, but with the “hard” limit higher (smaller number) than the “soft” limit. See comments on covariance with envelope above.
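
The smooth roll-off of a Butterworth filter can be illustrated with the textbook amplitude response. This is a sketch under assumptions: the function name butterworth_gain is hypothetical, the cut-off is taken as the point where the amplitude falls to 1/sqrt(2), and RSRef's own normalization may differ.

```python
import numpy as np

def butterworth_gain(d, soft_limit, order=5):
    """Amplitude gain of a low-pass Butterworth filter at d-spacing d,
    with the cut-off at d-spacing soft_limit (same length units)."""
    s = 1.0 / np.asarray(d, dtype=float)   # spatial frequency
    s_cut = 1.0 / soft_limit               # cut-off frequency
    return 1.0 / np.sqrt(1.0 + (s / s_cut) ** (2 * order))

# Gains at twice, equal to, and half the d-spacing of a 4.0 soft limit.
gains = butterworth_gain([8.0, 4.0, 2.0], soft_limit=4.0)
```

The gain is close to one well inside the limit, passes through the cut-off smoothly, and is small but non-zero beyond it, in contrast to a "hard" limit beyond which the signal is exactly zero.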

Options

Most of the options relate to convergence criteria. After any one is satisfied, the refinement is terminated.

Parrot

Equivalent to toggling with set echo True/False.

Pdbout

Refined coordinates are only output if this command is run!

The header is modified from that of the input. The output file may be non-standard, according to the Symmetry --output option selected. The default is non-standard addition of fragments from symmetry-equivalent molecules that are close (but this can be disabled).

Perturb

Calculates statistics as atoms are moved step-wise from current locations. This is helpful in determining the program parameters that might provide the best convergence radius with the available data.

In the default rigid-body mode, the displacement entered is the actual translation. With the --individual option, the displacement is interpreted as the target RMS after random displacements.

Statistics are somewhat dependent on the randomized directions of displacements, and are best averaged over 10 to 50 --repeats. (The --direction option only makes sense with --repeats <1>.)

In calculating statistics, current settings of cut-off radii, resolutions, etc., will apply.

Output (stdout)

For each step (decreasing from the maximum perturbation to zero), a row consisting of:

  • rms model perturbation (Å).

  • Pearson correlation coefficient between model and map.

  • standard deviation of the correlation coefficient (among repeats).

  • Real-space R-factor.

  • standard deviation of the R-factor (among repeats).

  • Σ(ρ_obs - ρ_calc)^2

  • standard deviation of the above.

  • Correlation between the partials of the difference residual and the atomic error vectors. This is calculated from the flattened arrays of length repeat*atoms*3, i.e. the parameter vectors used in refinement, and it captures both direction and magnitude.

  • Gradient norm calculated from the magnitudes of the partials.

  • Dot product of normalized gradient and error vectors. Equal to the correlation coefficient in the special case of the mean values of the gradient and error vectors equal to zero.

The correlation between gradient and error vectors is perhaps the best estimate of likely convergence radius, and can guide the choice of density-calculation parameters to use. Correlation and residuals can indicate the sensitivity.
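
The relationship between the correlation and the normalized dot product listed above can be demonstrated with synthetic vectors. This is an illustrative sketch: the function name is hypothetical and the arrays are random stand-ins for the flattened repeat*atoms*3 gradient and error vectors.

```python
import numpy as np

def gradient_error_agreement(gradient, error):
    """Return (Pearson correlation, normalized dot product) of two arrays,
    each flattened to a single parameter vector."""
    g = np.ravel(gradient).astype(float)
    e = np.ravel(error).astype(float)
    correlation = np.corrcoef(g, e)[0, 1]
    cosine = np.dot(g, e) / (np.linalg.norm(g) * np.linalg.norm(e))
    return correlation, cosine

rng = np.random.default_rng(0)
error = rng.normal(size=(5, 100, 3))                    # repeats x atoms x xyz
gradient = 2.0 * error + rng.normal(size=error.shape)   # noisy but correlated
gradient -= gradient.mean()                             # with zero-mean vectors
error -= error.mean()                                   # the two measures agree
correlation, cosine = gradient_error_agreement(gradient, error)
```

Adding a constant offset to either vector leaves the correlation unchanged but alters the normalized dot product, which is why the two are reported separately.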

Refine

Parameterize is a pre-requisite, i.e. definition of what is to be refined (and how) must precede this command.

Options

Most of the options relate to convergence criteria. After any one is satisfied, the refinement is terminated.
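
The "first criterion satisfied ends the refinement" behaviour can be sketched with a toy optimizer. The steepest-descent loop below is a deliberately simple stand-in, not one of the scipy.optimize.minimize methods that RSRef calls. The function name toy_refine and the exact form of each test are assumptions made for illustration.

```python
import numpy as np

def toy_refine(residual, gradient, x0, step=0.1,
               max_cycles=100, min_improvement=0.0, min_grad=0.0):
    """Steepest descent that stops as soon as any one criterion is met.
    Returns (parameters, cycles run, reason for stopping)."""
    x = np.asarray(x0, dtype=float)
    previous = residual(x)
    for cycle in range(1, max_cycles + 1):
        g = gradient(x)
        if np.linalg.norm(g) <= min_grad:            # gradient already small
            return x, cycle - 1, "min_grad"
        x = x - step * g
        current = residual(x)
        if previous - current <= min_improvement * abs(previous):
            return x, cycle, "min_improvement"       # too little gained
        previous = current
    return x, max_cycles, "max_cycles"

residual = lambda x: float(np.sum(x ** 2))
gradient = lambda x: 2.0 * x
start = [1.0, -2.0]
```

Calling toy_refine(residual, gradient, start, min_grad=100.0) returns after zero cycles, because the starting gradient norm is already below the threshold. This mirrors the "Nothing changes from the start" case described under Troubleshooting.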

Restraints

Does nothing but report on supplementary restraints. (Whether restraints are used is controlled by command-line arguments.) Starting with v0.4.2 (11/04/13) restraint on B-factor variation between bonded atoms could be applied. Soon after v0.5.0 (2/18/15), van der Waals restraints will be added, and this might be all that is needed for rigid group or torsion angle refinement. (If full stereochemical restraints are needed for refinement of individual atomic positions, use the CNS-embedded implementation.)

Python Interpreter for advanced functionality

An attempt has been made to balance flexibility with simplicity and ease of use, in deciding which functionalities and parameters are available through command-line or command interpreter control. Many others are accessible through an embedded python interpreter that is invoked with the py command. It is executed in a namespace that is local to the command interpreter (and not very useful). The command-line options are imported as attributes of the object “option”, and most other needed objects can be accessed as attributes of self.task, for which the alias “my” is provided. The following examples illustrate:

  • Change chain name of target to match refining model: pasto.rsref ‣ py

    import numpy
    my.atoms.chain = numpy.where(my.atoms.chain=='C', 'A', my.atoms.chain)
    exit()
    
  • Remove one of the target alternative locator IDs to match refining model: pasto.rsref ‣ py

    import numpy
    my.atoms.altloc=numpy.where(my.atoms.altloc=='A',' ',my.atoms.altloc)
    exit()
    
  • Change the torsion-restraint weight during a refinement: pasto.rsref ‣ py option.torsion_weight = 2.0

  • Print values of command-line options: pasto.rsref ‣ py print(option)

  • Reset B-factors to a uniform 15.0: pasto.rsref ‣ py my.atoms.b = 15.0

  • Replace all B values with their mean: pasto.rsref ‣ py my.atoms.b.fill(my.atoms.b.mean())

  • Change the limits on the magnification to +/-10%: pasto.rsref ‣ py option.magnification_limits = (0.9, 1.1)

  • Change the internal unit conversion for rigid rotations from 10 to 100: pasto.rsref ‣ py option.units_per_rad = 100.0 (see also rsref --units_per_rad command-line option).

  • Scale observed map to model (instead of model to observed): pasto.rsref ‣ py map_calc.scale_model_to_observed=False (see also rsref --scale_to_model command-line option).

    (The default class attribute ModelMap.scale_model_to_observed=True is appropriate almost all of the time, and avoids trivial refinements where the residual is lowered merely by changing the scaling. However, with the default, the residual can be lowered by the model exiting the boundaries of the map (warning messages are printed). The non-default (False) can be used to calculate additional diagnostic statistics for unstable refinements, but the partial derivatives (and refinement) will be incorrect.)

Performance

Radii

The number of ρcalc estimations is dependent on the cube of the --atom_extent radius. With the efficiencies of array-based calculations, empirically, performance is approximately NlogN, (N=atom_extent).

The accuracy of density statistics (correlation, least-squares residual etc.) improves with --atom_extent. While larger is always better, there are diminishing returns as the contribution from distant atoms declines. Although likely B-factor-dependent, empirically, little is gained with --atom_extent beyond 2.5 x high_resolution.

Especially early in a refinement, it may be appropriate to compromise accuracy of statistics (& refinement objective functions) for speed. --relative_use <0.5> with --relative_extent <1.25> gives pretty good results, i.e. using density at grid points within a sphere of every atom that has a radius of 1.25 x high_resolution.

B-factors

Atomic density is calculated from a 1-D Fourier transform of an isolated atom. It depends on atom type and B-factor. Values are cached when possible to avoid recalculation if the B-factors are essentially the same. For low resolution refinements (eg. EM) or when there is a large overall B or envelope factor, detailed variation in B-factors may not be important. There may be little point in refining B-factors. Speed can also be increased by:

  1. Setting all B-factors to a uniform value

  2. Rounding their values

With either, atomic densities will be more frequently drawn from the cache rather than calculated from scratch.
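
The effect of either step on the number of distinct B-factors can be sketched outside RSRef with a synthetic array. The 5 Å² rounding step is an arbitrary choice for illustration, and how closely values must agree for RSRef's cache to be reused is not specified here.

```python
import numpy as np

rng = np.random.default_rng(0)
b = rng.uniform(10.0, 60.0, size=5000)    # stand-in for my.atoms.b

rounded = np.round(b / 5.0) * 5.0          # round to the nearest 5
uniform = np.full_like(b, b.mean())        # one value for every atom

# Fewer distinct B-factors means fewer distinct atomic density profiles.
n_original = np.unique(b).size
n_rounded = np.unique(rounded).size
n_uniform = np.unique(uniform).size
```

Within the embedded interpreter, the rounded array would presumably be assigned back in the same way as the uniform value in the py examples (my.atoms.b = ...). That assignment is an assumption based on those examples.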

Atom selections and groups

A parser is provided for flexible selection of atom groups within the command interpreter. It is applicable to stand-alone running (rsref.py, superpose.py etc.), but not usually accessible when modules are wrapped/embedded in other packages (eg. CNS) which then manage atom selections.

Selections

The selection syntax is terse, flexible, and (for better or worse) relies on Python evaluation of expressions. Thus, hopefully, it will be intuitive for many users.

It differs from some other programs in that selections are objects (instances of class Selection). Selection objects are an extension of boolean arrays, specific for a coordinate set (class Atoms). Selection expressions can be used directly in commands like parameterize, or user-named Selections can be pre-defined for repeated use or to simplify/clarify the definition of complicated Selections. Selections can be combined or assigned to new Selection instances through use of Python bit-wise logical operators (&,|,==,!=,~,>,>=,(,),...). This should make it convenient to write refinement scripts in which different subsets of the atomic parameters are refined at different stages, different sets of atoms are subject to positional, B-factor or occupancy refinement, and in which some atoms might be refined individually, others rigidly grouped for simultaneous refinement, etc.. This frees the user of constraints embodied in other programs.

In RSRef, “S” is a synonym of Selection, the class. A selection expression is defined as:

<selection>|S(<criterion>) [<operator> <selection>|S(<criterion>)]

where:

<selection>

is a pre-existing instance of class S.

<criterion>

is a non-quoted string to select atoms for a new instance of S, where <criterion> can be in simple form or compound:

Simple

contains a single operator, and is given without parentheses, such as:

  • chain == A

  • chai==B

  • Residue number >= 30 (or Residue number .GE. 30, avoiding cmd2 redirect parsing)

  • resnu <= 50

Compound

an expression combining operators, using parentheses to set precedence. Spaces in names must be replaced with underscores, and expressions should use lower case, as the parser is case sensitive. Examples:

  • (chain == A) | (chai==B)

  • ((residue_numb >= 30) & (atnam == CA)) | (chain != C)

Warning

the compound parser will silently do unexpected things with syntax or spelling errors. Check that results are consistent with expectations!

Note

the inner operands are enclosed within parentheses, because the combination operators (&, |) have higher precedence than the comparison operators (>, >=, <, <=).

<operator>

is a bit-wise Python logical operator (&,|,==,!=,~,>,>=,(,),...) Further details of the syntax for making Selections are provided in atoms.Selection.

Selections are named and defined with the select command.
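
Because Selections behave like boolean arrays, the precedence rule can be demonstrated with plain NumPy arrays. The arrays below are stand-ins for atom attributes. This does not use the Selection class itself.

```python
import numpy as np

# Stand-ins for per-atom attribute arrays of a coordinate set.
chain = np.array(["A", "A", "B", "C"])
resnum = np.array([10, 40, 40, 70])

# Each comparison gives a boolean array; parentheses keep it intact
# before the bit-wise operators combine them.
in_range = (resnum >= 30) & (resnum <= 50)
chosen = in_range & ~(chain == "C")

# Without parentheses, & binds first: resnum >= (30 & resnum) <= 50,
# a chained comparison that cannot be reduced to a single truth value.
try:
    resnum >= 30 & resnum <= 50
    failed = False
except (TypeError, ValueError):
    failed = True
```

The unparenthesized form raises an exception here, but other mistakes of this kind can evaluate without error and select the wrong atoms, so it is worth checking the result of any new compound expression.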

Groups

Groups are dictionary-like collections of named Selections, each constrained as a group in group refinement. (In individual atom refinement, a Group is treated as the logical OR between all the named Selections.) The Group class contains methods for checking that selections do not overlap (work-in-progress). In most commands, the name of a Group can be substituted for that of a Selection.

Groups are defined with the select command, for example:

select --collection=domains --name=N S('chain == A') & S('residue num <= 105')
select -C domains -n C S('chain == A') & S('residue num .gt. 105')

These 2 statements illustrate several variants of the syntax. Together they define a Group called domains, with two Selections, named N & C, that contain the N- and C-terminal parts of subunit A.
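
The idea of a Group as a dictionary of named Selections can be modelled with boolean masks. This is a conceptual sketch only: the helper functions union and overlapping are hypothetical and do not reproduce the interface of RSRef's Group class.

```python
import numpy as np

chain = np.array(["A", "A", "A", "B"])
resnum = np.array([50, 105, 106, 50])

# A Group modelled as a dict of named boolean masks (cf. domains, N and C).
domains = {
    "N": (chain == "A") & (resnum <= 105),
    "C": (chain == "A") & (resnum > 105),
}

def union(group):
    """Logical OR of all member selections (the individual-atom view)."""
    return np.logical_or.reduce(list(group.values()))

def overlapping(group):
    """True if any atom belongs to more than one member selection."""
    return bool(np.any(np.sum(list(group.values()), axis=0) > 1))
```

Here union(domains) selects all of subunit A, and overlapping(domains) is False because the residue ranges of N and C are disjoint.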

Troubleshooting

Density calculation

Form factors

Form-factor not found

Form-factor tables support atoms commonly used in each technique. If one is not found, consider:

  • whether the atom in the coordinate file is appropriate, eg. should an EM model contain hydrogens?

  • switching to a different form-factor table (-F command-line option).

  • adding an entry into your favorite table (formfactor.py).

Coordinates

PDB output

TER records inappropriately inserted between residues

This can occur if atoms / hetatoms in the input PDB file have been edited out without renumbering the atom records. It also occurs with multiconformer structures from CNS or X-plor that designate the conformer in the segid columns rather than the standard alt conf column. In supporting several PDB variants, PaStO merges information from the segid and alt conf attributes, with the result that a change in alternate conformer designator causes new chain detection. CNS and X-plor PDB files must first be converted to the standard designation.

The following script can be used to re-standardize a CNS multi-conformer PDB file:

#!/bin/bash
# Usage: segid2alt.sh <segid> <alt_designator> <pdb_input> [> <pdb_output>]
# eg. ./segid2alt.sh AC2 B my.pdb > new.pdb
# Adds a standard alternative conformation designator to atoms with a particular segid.
# Needed because CNS drops the alt designator.
# On ATOM/HETATM lines containing <segid>, column 17 (alt conf) is overwritten with <alt_designator>.
sed -e "/${1}/s/\(^ATOM  ..........\)./\1${2}/" -e "/${1}/s/\(^HETATM..........\)./\1${2}/" "${3}"
exit

Selections

Run-time exceptions

Invalid operand for int and ndarray

(or something similar; with complex Selections.) Check that attribute comparisons such as (atnam==C) are enclosed within parentheses, as the comparison operator has lower precedence than a combination operator such as & or |.

SyntaxError: unexpected EOF while parsing …

Likely that a > in a logical expression has been mistakenly parsed as a cmd2 output redirect. Substitute the Fortran .GT. operator (eg. resnum .gt. 20) which will get past the cmd2 parser before being translated into the python resnum > 20.

Pipe process exited with code 0 before command could run

Likely that a | in a logical expression has been mistakenly parsed as a cmd2 pipe output redirect. Substitute the Fortran .OR. operator (eg. (chain==A).or.(chain==B)) which will get past the cmd2 parser before being translated into the python (chain==A)|(chain==B). It is also often possible to substitute the python xor operator, (chain==A)^(chain==B), noting that it is bitwise operators that are used for the ndarray atom attributes.
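
The kind of translation described in the two entries above can be sketched as a simple token substitution. This is a hypothetical illustration, not RSRef's parser: the table covers only some of the synonyms, and the .lt. and .le. entries are assumed by analogy with .gt. and .ge.

```python
import re

# Hypothetical translation table; RSRef's own list may differ.
FORTRAN_TO_PYTHON = {
    ".and.": "&", ".or.": "|", ".xor.": "^", ".not.": "~",
    ".eq.": "==", "/=": "!=", ".gt.": ">", ".ge.": ">=",
    ".lt.": "<", ".le.": "<=",
}

# One case-insensitive pattern matching any of the Fortran tokens.
_PATTERN = re.compile(
    "|".join(re.escape(token) for token in FORTRAN_TO_PYTHON), re.IGNORECASE
)

def translate(expression):
    """Replace Fortran-style operators with their Python equivalents."""
    return _PATTERN.sub(
        lambda match: FORTRAN_TO_PYTHON[match.group(0).lower()], expression
    )
```

Since the Fortran forms contain neither > nor |, they pass through cmd2 untouched, and only the translated string is evaluated as Python.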

Parameterization

Torsion

Warning that C-psi is not a recognized dihedral, then exception

Likely because the C-terminal OXT is missing from the input coordinates. If it is inconvenient to repair the coordinate file, then the final psi may be excluded with a selection like: torsion --dihedrals="(chain==A)&~((resnum==202)&(atnam==C))" (if 202 was the recalcitrant terminal residue).

Refinement

Nothing changes from the start

--min_grad might be larger than the 1st gradient, so deemed converged before start. Check iterate.dat or the log file for a convergence message. Remove min_grad from input (default is 0.0) or set to small value.

Refinement stuck - 10s of function estimates w/in an iteration

The search along the gradient direction is not giving a minimum:

  • --max_cycles may be unreasonably large. Mostly, one needs multiple cycles in a non-linear refinement in which the effects of parameter changes are inter-dependent. There are other cases where there is little inter-dependence and the refinement will approximate linear. Examples would include refinement of groups that are well separated. In these cases, refinement should be complete in very few cycles, with random directions thereafter.

  • --min_improvement or --min_grad might be set to levels too low, requiring a precision that is not possible. Note that there are several approximations. For example, movement in atoms may change which grid points are being used, and therefore cause a non-smooth change in the objective function.

Credits

This started as a new implementation of the theory laid out in Chapman 1995, and was then substantially extended by Michael Chapman.

Form factor tables have been modified from the CCP4 and TNT distributions. Andrew Trzynka assisted in programming methods to read the form factors.

Libraries used in rigid-group and torsion angle optimizations were programmed by Brynmor K. Chapman. Van der Waals restraints have been programmed by Leo Selker.

This new implementation relies heavily on experience gained with earlier rsref.c and C++ programs, to which several former members of the Chapman lab contributed: Eric Blanc, Zhi (James) Chen, Andrew Korostelev, Felcy Fabiola and Olga Kirillova.

Citations

Publications and database entries should acknowledge use of RSRef by citing [Chapman-1995] and [Chapman-2013].

[Chapman-1995]

Chapman, M. S. (1995) Restrained Real-Space Macromolecular Atomic Refinement using a New Resolution-Dependent Electron Density Function, Acta Crystallographica A51, 69-80.

[Chapman-2013]

Chapman, M. S., Trzynka, A., and Chapman, B. K. (2013) Atomic modeling of cryo-electron microscopy reconstructions - Joint refinement of model and imaging parameters, J Struct Biol 182, 10-21.

Colophon

This introductory documentation is generated from the source file: rsref_doc.rst. Its dependents are extracted from specific python modules using documentation.py, see instructions within.

Changed in version 12/10/10: Started

Changed in version 0.5.0: 2/18/15, converted to reStructuredText from Epydoc.

Changed in version 0.5.5: 07/03/17 Rotamer command being added in

Changed in version 1.0.0: 09/26/20 Python 2.7 –> 3.6 including cmd2: optparse –> argparse

Changed in version 1.0.5: 07/10/24 Python 3.12, cmd2 2.4, pasto package.