rsref module

Real space refinement; model-map fit.

usage:rsref.py [options] [map-file] [input-pdb-file]

Module author: Michael Chapman <chapmami@ohsu.edu>

Authors:

Michael S. Chapman <chapmami@ohsu.edu>

Oregon Health & Science University

Version:

0.5, March 23, 2016

Changed in version 12/10/10: Started

Todo

Extend Map.calculate (and evaluate) to accept aggregate of selections should the user only want to include refining atoms & neighbors in statistrics. Expect up to 10-fold increase in speed for local refinement or high symmetry cases.

Todo

Support optional input of the grad convergence criterion in fractional units relative to the starting objective function value.

Todo

copy impact from superpose - might require heavy duty consolidation.

class rsref.Arguments(imports=[], main=True, *args, **kwargs)

Bases: argparser.ArgumentParser

Subclass, adding program-specific command-line options.

Parameters:
  • imports ((list of) ArgumentParser method(s)) – method(s) adding arguments to ArgumentParser instance. (Alternative to ArgumentParser parents w/ better maintained groups.)
  • main (bool) – Parser for main program (i.e. not listed within an ArgumentParser parents argument). Adds default program information.
static export()

Adding options used in both stand-alone and imported modes.

rsref.RSRef

alias of Tasks

class rsref.Tasks

Bases: atoms.Tasks, optimize.Tasks

Program-specific tasks, supporting just-in-time execution.

Methods with names ending _prereq define tasks whose prerequisites will be met, if possible, prior to execution.

analyze(itemize=True)

Analyze the gradients and shifts of prior refinement.

analyze_prereq(**kwargs)

Pass-through, trying to meet the prerequisites.

atoms_being_evaluated
atoms_refined
calculate_map(quiet=False, verbose=False)

Calculate atomic density for points in map.

calculate_map_prereq(*args, **kwargs)

Pass-through, trying to meet the prerequisites.

default_group
diagnostics()

Cumulative statistics/breakdown on calculation of derivatives.

evaluate(quiet=False, verbose=False, new_file=None)

Statistics comparing model & map (correlation, R, residual).

evaluate_prereq(*args, **kwargs)

Pass-through, trying to meet the prerequisites.

image_refine(optimize_magnification=False, optimize_overall_b=False, optimize_envelope=False, optimize_filter=False, max_cycles=10, min_improvement=0.0, min_grad=0.0, finite_difference=False, defaults=None, verbosity=0, confirm=False)

Optimization of imaging (map) parameters for agreement w/ model.

Warning

Convergence is often poor when refining both magnification and and (overall B or envelope) - try refining B/envelope first.

See also

More discussion of convergence in density.ModelMap.image_refine().

Parameters:
  • optimize_magnification (bool) – optimize magnification.
  • optimize_overall_b (bool) – optimize overall_b.
  • optimize_envelope (bool) – optimize envelope.
  • optimize_filter (bool) – optimize the resolution of a low-pass filter to be applied to the model.
  • max_cycles (int) – maximum number of max_cycles (hard limit for bfgs; approx. for constrained l-bfgs, as converted to a corresponding limit on function / gradient calculations).
  • min_improvement (float) – refinement (constrained l_bfgs_b only) is stopped when relative change in function < min_improvement, i.e. 1.e-2 would stop when change < 1% of current residual value.
  • min_grad (float) – refinement stopped when the largest derivative magnitude falls below this value (internal parameter units).
  • defaults (object or NoneType) – optional object with attributes units_per_magnification, units_per_envelope & units_per_resolution, with which local defaults are over-written if all present.
  • confirm (bool) – request user confirmation before accepting results.
image_refine_prereq(*args, **kwargs)

Pass-through, trying to meet the prerequisites.

implied_parameterization(refining_xyz=True, refining_b=False, refining_occ=False)

Creates AtomicParameterization from Atoms.evaluate.

This is applicable (in embedded implementations) when there is only a single flag designating whether an atom is to be refined, and there is not the fine-grain control native to RSRef. Even though parameterization (rigid group, torsion, etc.) might be controlled by the embedding application, RSRef needs an AtomicParameterization instance to control which partial derivatives are to be calculated.

Note

requires self.atoms.evaluate to have been populated (see import_coord).

Parameters:
  • refining_xyz (bool) – turn on the calculation of partial derivatives needed for any refinement of atomic positions.
  • refining_b (bool) – turn on the calculation of partial derivatives needed for any refinement of atomic B-factors.
  • refining_occ (bool) – turn on the calculation of partial derivatives needed for any refinement of occupancies.

Note

don’t like this implementation, because all partials for all atoms will be calculated.

import_coord(attribute, mylist)

Imports attribute into Atoms object from mylist array.

Parameters:
  • attribute (str) – name of atoms attribute or x, y, or z for one of the xyz coordinates.
  • mylist (list) – value for each atom
neighbors_deprecated(target_atoms=None, distance=None, verbose=True)

Identify self.atom neighbors of target (or self.atoms) + symm equivs.

Pass through routine for Atoms.neighbors or Symmetry.neighbors.

Parameters:distance (float) – Angstrom within which considered close. Default calculated from Q.option.density_radius + Q.option.use_pts.

Todo

Consider adding to distance to allow for coordinate movement. May not be important as density_radius typically generous.

neighbors_prereq(*args, **kwargs)

Pass-through, trying to meet the prerequisites.

parameterization_init()

Initialize parameterization - directives on what/how to refine.

parameterization_init_prereq(*args, **kwargs)

Pass-through, trying to meet the prerequisites.

parameterize()

Pass-back after checking prerequisites for parameterize sub-comds.

parameterize_prereq(*args, **kwargs)

Pass-through, trying to meet the prerequisites.

perturb(individual=False, displacement=1.0, direction=None, steps=1, repeats=1, verbose=True, exception=False, seed=None)

Shifts coordinates in steps, calculating comparison stats & gradients.

Purpose: to evaluate the convergence radius with the current options.

Perturbation may be rigid group or individual-atom, usually in random directions, and repeated to generate mean values & standard deviations. For a single rigid perturbation, a direction may be set. For rigid-group perturbation, displacement is the actual value, whereas for individual-atom, it is the mean of a normal distribution.

Parameters:
  • individual (bool) – shift atoms independently, else as a rigid body.
  • displacement (float) – overall Angstrom shift (rigid body) or rms shift (individual atom).
  • direction (nd.array-like, shape(3,1) or Nonetype) – unit vector for rigid displacement or None if to be chosen randomly.
  • steps (int) – number of steps into which the displacement is to be divided.
  • repeats (int) – Number of random-direction shifts over-which statistics are to be averaged.
  • verbose (bool) –
  • exception (bool) – terminate on error, else return None
  • seed (int) – for random number generator
Returns:

(magnitude(s), correlation coefficient(s), rms, R-factor(s)).

Return type:

tuple of 4 {float or nd.array.size(steps)}

Attention

over-writes atomic map, but as the last loop is with zero-error, this should be benign.

perturb_prereq(*args, **kwargs)

Pass-through, trying to meet the prerequisites.

profile(b=10.0, atom_type='C', max_dist=4.0, points=17)

Print calculated density profile of an isolated atom.

Parameters:
  • b (float) – thermal factor
  • atom_type (str) –
  • max_dist (float) – tabulate up to this distance from atom center
  • points (int) – number of columns in table
randomize(**arguments)
randomize_prereq(*args, **kwargs)

Pass-through, trying to meet the prerequisites.

reconcile_types()

Reconcile atom types of coordinate file with those of form factor data base.

reconcile_types_prereq()

Pass-through, trying to meet the prerequisites.

refine(continuation=False, restart=False, method='l_bfgs_b', max_cycles=0, min_improvement=0.0, min_grad=0.0, reference=None, verbosity=0)

Refine atomic model (currently free-atom w/o stereochemical restraints).

Warning

Joint refinement of occupancy and B-factors for individual atoms requires very high resolution data.

Note

Convergence can be improved by setting stringent convergence criteria, eg. min_improvement = 1e.-9.

Todo

arguments for link, unlink (brought from input).

See also

Until joint refinement implemented here, see alternative image optimization in image_refine() etc.

Parameters:
  • continuation (bool) – If not restart, treat as a continuation of a prior batch if no change in parameterization and atoms since last refinement. False - force a new batch.
  • restart (bool) – from start of prior refinement.
  • method (str) – currently supported are ‘l_bfgs_b’ (limited memory; recommended) and ‘bfgs’ (faster for small structures).

Note

l_bfgs is usually recommended as the memory usage of bfgs increases with the square of the number of parameters (atoms). l_bfgs also supports limit-constrained refinement.

Parameters:
  • min_improvement (float) – refinement (l_bfgs_b) is stopped when relative change in function < min_improvement, i.e. precision of 1.e-2 stops when change < 1% of current residual value.
  • max_cycles (int) – maximum number of max_cycles (hard limit for bfgs; approx. for l-bfgs, as converted to a corresponding limit on function / gradient calculations).
  • min_grad (float) – refinement stopped when the largest derivative magnitude falls below this value (internal parameter units as reported by analyze).
  • reference (Atoms or NoneType) – if torsion angle parameter changes are to be restrained (option.torsion_weight), an optional reference state of the moving structure (same atoms, order etc.) can be provided (eg. input structure); else any restraint will apply from the start of each refinement batch.
  • verbosity (int) – -1 for silent, 0 for terse, 1 for verbose, > 1 for (debugging) information on each function or gradient evaluation.

Publications describing work using l_bfgs_b, or commercial products using it, should cite one of the Nocedal references [Byrd-1995] [Zhu-1997].

[Byrd-1995]R. H. Byrd, P. Lu and J. Nocedal. A Limited Memory Algorithm for Bound Constrained Optimization, (1995), SIAM Journal on Scientific and Statistical Computing , 16, 5, pp. 1190-1208.
[Zhu-1997]C. Zhu, R. H. Byrd and J. Nocedal. L-BFGS-B: Algorithm 778: L-BFGS-B, FORTRAN routines for large scale bound constrained optimization (1997), ACM Transactions on Mathematical Software, Vol 23, Num. 4, pp. 550 - 560.

Todo

Run extract at end with refined[0] to save a copy of final parameters for later analysis if required.

refine_magnification()

Refine (only) magnification.

Deprecated since version (mostly?): see self.image_refine, so have not defined prerequisites.

refine_prereq(*args, **kwargs)

Pass-through, trying to meet the prerequisites.

residual

Weighted local (rhoo - rho:sub:’c`)2, calculated w/ current options.

Return type:float
residual_db

Weighted partials of local residual re B-factors: d((rhoo - rho:sub:’c`)2)/dB.

Return type:1D ndarray length=#atoms
residual_do

Weighted partials of residual re occupancies: d((rhoo - rho:sub:’c`)2)/dO.

Return type:1D ndarray length=#atoms
residual_dx

Weighted partials of residual re x atomic coordinates: d((rhoo - rho:sub:’c`)2)/dx.

Return type:1D ndarray length=#atoms
residual_dxyz

Weighted partials of residual re atomic positions: d((rhoo - rho:sub:’c`)2)/dxyz.

Return type:ndarray shape(3,#atoms)
residual_dy

Weighted partials of residual re y atomic coordinates: d((rhoo - rho:sub:’c`)2)/dy.

Return type:1D ndarray length=#atoms
residual_dz

Weighted partials of residual re z atomic coordinates: d((rhoo - rho:sub:’c`)2)/dz.

Return type:1D ndarray length=#atoms
restrain_to_reference = True
restraints(verbose=True)

Information on ancilliary restraints (B-factor).

Parameters:verbose (bool) – print rms deviations.
Returns:rms deviations (B-factors).
Return type:tuple of float(s)
restraints_prereq(*args, **kwargs)

Pass-through, trying to meet the prerequisites.

run_test(name, arguments=None)

Program development testing.

symmetry_current

True if current coordinates were same as last-expanded.

symmetry_expand(target_atoms=None, center=None, distance=None)

Identify self.atom neighbors of target (or self.atoms) + symm equivs.

Pass through for Symmetry.neighbors.

Parameters:
  • center ([float, float, float]) – atoms w/in distance of this point instead of other atoms.
  • distance (float) – Angstrom within which considered close.
symmetry_expand_prereq(*args, **kwargs)

Pass-through, trying to meet the prerequisites.

symmetry_init(distance=None, force=False)

Initialize symmetry calculation & identify neighbors.

By default, initialization is conditional upon presence of space_group or local_symmetry, so lattice symmetry will only be applied if the space group is declared as at least P1. Unit_cell is not sufficient to invoke lattice symmetry, because it is required to interpret input maps and could be either a real lattice repeat or an arbitrary volume enclosing a single particle (in EM).

Parameters:
  • distance (float) – neighbor cut-off; default = density radius + use_pts.
  • force (bool) – initialize even if no space_group or local_symmetry.
symmetry_init_prereq(*args, **kwargs)

Pass-through, trying to meet the prerequisites.

verbose = False
write_map(filename, observed=False, calculated=False, difference=False, scaled=False, title='')

Write map file

Parameters:
  • filename (str) – output map
  • observed (bool) – output experimental map
  • scaled (bool) – output experimental map, scaled to atomic model
  • calculated (bool) – output map calculated from atomic model
  • difference (bool) – output difference map, observed minus model
  • title (str) – put in the map header
write_map_prereq(*args, **kwargs)

Pass-through, trying to meet the prerequisites.

class rsref.TopCommands(tasks, initial_commands=None, py_locals={}, parser=None)

Bases: atoms.Commands, optimize.Commands, cmd2nest.TopCmd, cmd2nest.ProgCmd

Top-level commands for Program Prog.

Parameters:
  • tasks (Contol) – class instance that defines methods to be called.
  • initial_commands (str or list of str) – file(s) of commands to run before interactive loop at the top level, and before any command-line commands.
  • py_locals (dict) – objects added to the local namespace in the python interpreter invoked by ‘py’ commands.
  • parser (ArgumentParser (argparse') or :py:class:`OptionParser (optparse)) – to which -t / –test option will be added for cmd2 regression testing. Should only be used once / program.
do_evaluate(instance, arg)

Calculate statistics comparing current model to map. Usage: evaluate [options] [different-pdb-file]

Options:
-h, --help show this help message and exit
-v, --verbose Including scaling information.
do_image_refine(instance, arg)

Refine image (map) parameters to maximize agreement with atomic model. Usage: image_refine [options] [different-pdb-file]

Options:
-h, --help show this help message and exit
-M, --magnification
 Optimize magnification.
-B, --overall_B
 Optimize overall B (-B & -E mutually exclusive).
-E, --envelope Optimize EM envelope function (-B & -E mutually exclusive).
-R, --resolution
 Optimize resolution (low-pass filter on atomic model to best match map).
-C INT, --max_cycles=INT
 Maximum number of cycles.
-i FLOAT, --min_improvement=FLOAT
 End when per-cycle improvement falls below this value.
-g FLOAT, --min_grad=FLOAT
 End when gradient norm falls below this value.
-f, --finite_difference
 Numerical derivatives instead of analytical.
-v INT, --verbosity=INT
 Per-cycle logging: -1 (terse) to 5 (verbose).

-?, –confirm Request Y/N conformation before acceptance.

do_map(instance, arg)

Write density map(s) on the grid of the input map.

Model and difference density is calculated for atoms and their immediate (symmetry-related) neighbors that have been selected previously. This is consistent with RSRef’s modality of evaluating the fit over molecules rather than a 3D unit cell volume. If the structure has molecular or crystallographic symmetry, and the statistics of the map (scale, correlation, sigma) are to be analyzed externally throughout a parallelepiped, symexp.py should be used first to fill the volume (plus a margin) with the symmetry-related atoms. (Although popular, such volume-based assessments, without a molecular molecular mask, are discouraged, because the solvent volume massages the statistics.)

Usage: map [options] file name

Options:
-h, --help show this help message and exit
-c, --calculated
 Simulated from the atomic model using current imaging parameters.
-d, --difference
 Observed minus calculated, scaled & using imaging parameters.
-o, --observed Input experimental map, having applied any change in magnification.
-s, --scaled Input experimental map, having applied any change in magnification, and scaled to atomic model (required command-line option –scale_to_model.)
-t TITLE, --title=TITLE
 Title used in map header (quote to protect white-space & newlines).
do_neighbors(instance, arg)

Identify neighbors within distance of (selected) atoms. Usage: neighbors [options] arg

Options:
-h, --help show this help message and exit
-d FLOAT, --distance=FLOAT
 Searches for neighbors within distance of any atom (default: 3.5 A).
do_parameterize(instance, arg)

For designated parameter type(s) (default positions), select atoms to be refined & how. Usage: parameterize [options] arg

Options:
-h, --help show this help message and exit
-b, --bfactor Designate which B-factors to be refined.
-o, --occupancy
 Designate which occupancies to be refined.
-p, --position Designate which atomic positions (xyz) to be refined.
do_perturb(instance, arg)

Perturb model & calculate statistics. Usage: perturb [options] displacement (A; def. 1.0)

Options:
-h, --help show this help message and exit
-i, --individual
 Perturb atoms individually, else as rigid body.
-n INT, --steps=INT
 Divide displacement into n logarithmic steps (default: 1).
-r INT, --repeats=INT
 Number of random displacements used to calculate statistics (default: 1).
-d (‘x’, ‘y’, ‘z’), –direction=(‘x’, ‘y’, ‘z’)
Unit vector for direction of perturbation. Chosen randomly if None (default).
-s INT, --seed=INT
 Seed for random number generator.
-v, --verbose Additional statistics.
do_profile(instance, arg)

Density of an isolated atom vs. distance from center, useful for setting cut-off radii. Usage: profile [options] arg

Options:
-h, --help show this help message and exit
-B FLOAT, --B_factor=FLOAT
 Atomic B-factor (default: 10.0).
-a TYPE, --atom=TYPE
 Atom type (default: C).
do_randomize(instance, arg)

Randomize coordinates (with normal error distributions). Usage: randomize [options] arg

Options:
-h, --help show this help message and exit
-x FLOAT, --xyz=FLOAT
 Magnitude of desired RMS xyz positional displacement (default: 0.0).
-B FLOAT, --B_factor=FLOAT
 Desired standard deviation in B-factor (default: 0.0).
-O FLOAT, --occupancy=FLOAT
 Desired standard deviation in occupancy (default: 0.0).
-s INT, --seed=INT
 Seed for random number generator.
do_refine(instance, arg)

Refine atomic model. Usage: refine [options] arg

Options:
-h, --help show this help message and exit
-C INT, --max_cycles=INT
 Maximum number of cycles.
-i FLOAT, --min_improvement=FLOAT
 End when per-cycle improvement falls below this value.
-g FLOAT, --min_grad=FLOAT
 End when gradient norm falls below this value.
-n, --new_batch
 New batch (losing prior history), else continue prior (default) if exists/possible.
-r, --restart From original coordinates (implies -n).
-v INT, --verbosity=INT
 Per-cycle logging: -1 (terse) to 3 (verbose) [def. 0].
do_restraints(arg)

Information on supplementary restraints.

do_test(rest)

Run a program developer’s test.

help_test()
rsref.startup()

Program initialization, reading options etc..

Returns:parser
Return type:Arguments.ArgumentParser