rsref module¶
Real space refinement; model-map fit.
usage: | rsref.py [options] [map-file] [input-pdb-file] |
---|
Module author: Michael Chapman <chapmami@ohsu.edu>
Authors: | Michael S. Chapman <chapmami@ohsu.edu> Oregon Health & Science University |
---|---|
Version: | 0.5, March 23, 2016 |
Changed in version 12/10/10: Started
Todo
Extend Map.calculate (and evaluate) to accept aggregate of selections should the user only want to include refining atoms & neighbors in statistrics. Expect up to 10-fold increase in speed for local refinement or high symmetry cases.
Todo
Support optional input of the grad convergence criterion in fractional units relative to the starting objective function value.
Todo
copy impact from superpose - might require heavy duty consolidation.
-
class
rsref.
Arguments
(imports=[], main=True, *args, **kwargs)¶ Bases:
argparser.ArgumentParser
Subclass, adding program-specific command-line options.
Parameters: - imports ((list of) ArgumentParser method(s)) – method(s) adding arguments to ArgumentParser instance. (Alternative to ArgumentParser parents w/ better maintained groups.)
- main (bool) – Parser for main program (i.e. not listed within an ArgumentParser parents argument). Adds default program information.
-
static
export
()¶ Adding options used in both stand-alone and imported modes.
-
class
rsref.
Tasks
¶ Bases:
atoms.Tasks
,optimize.Tasks
Program-specific tasks, supporting just-in-time execution.
Methods with names ending _prereq define tasks whose prerequisites will be met, if possible, prior to execution.
-
analyze
(itemize=True)¶ Analyze the gradients and shifts of prior refinement.
-
analyze_prereq
(**kwargs)¶ Pass-through, trying to meet the prerequisites.
-
atoms_being_evaluated
¶
-
atoms_refined
¶
-
calculate_map
(quiet=False, verbose=False)¶ Calculate atomic density for points in map.
-
calculate_map_prereq
(*args, **kwargs)¶ Pass-through, trying to meet the prerequisites.
-
default_group
¶
-
diagnostics
()¶ Cumulative statistics/breakdown on calculation of derivatives.
-
evaluate
(quiet=False, verbose=False, new_file=None)¶ Statistics comparing model & map (correlation, R, residual).
-
evaluate_prereq
(*args, **kwargs)¶ Pass-through, trying to meet the prerequisites.
-
image_refine
(optimize_magnification=False, optimize_overall_b=False, optimize_envelope=False, optimize_filter=False, max_cycles=10, min_improvement=0.0, min_grad=0.0, finite_difference=False, defaults=None, verbosity=0, confirm=False)¶ Optimization of imaging (map) parameters for agreement w/ model.
Warning
Convergence is often poor when refining both magnification and and (overall B or envelope) - try refining B/envelope first.
See also
More discussion of convergence in
density.ModelMap.image_refine()
.Parameters: - optimize_magnification (bool) – optimize magnification.
- optimize_overall_b (bool) – optimize overall_b.
- optimize_envelope (bool) – optimize envelope.
- optimize_filter (bool) – optimize the resolution of a low-pass filter to be applied to the model.
- max_cycles (int) – maximum number of max_cycles (hard limit for bfgs; approx. for constrained l-bfgs, as converted to a corresponding limit on function / gradient calculations).
- min_improvement (float) – refinement (constrained l_bfgs_b only) is stopped when relative change in function < min_improvement, i.e. 1.e-2 would stop when change < 1% of current residual value.
- min_grad (float) – refinement stopped when the largest derivative magnitude falls below this value (internal parameter units).
- defaults (object or NoneType) – optional object with attributes units_per_magnification, units_per_envelope & units_per_resolution, with which local defaults are over-written if all present.
- confirm (bool) – request user confirmation before accepting results.
-
image_refine_prereq
(*args, **kwargs)¶ Pass-through, trying to meet the prerequisites.
-
implied_parameterization
(refining_xyz=True, refining_b=False, refining_occ=False)¶ Creates AtomicParameterization from Atoms.evaluate.
This is applicable (in embedded implementations) when there is only a single flag designating whether an atom is to be refined, and there is not the fine-grain control native to RSRef. Even though parameterization (rigid group, torsion, etc.) might be controlled by the embedding application, RSRef needs an AtomicParameterization instance to control which partial derivatives are to be calculated.
Note
requires self.atoms.evaluate to have been populated (see import_coord).
Parameters: - refining_xyz (bool) – turn on the calculation of partial derivatives needed for any refinement of atomic positions.
- refining_b (bool) – turn on the calculation of partial derivatives needed for any refinement of atomic B-factors.
- refining_occ (bool) – turn on the calculation of partial derivatives needed for any refinement of occupancies.
Note
don’t like this implementation, because all partials for all atoms will be calculated.
-
import_coord
(attribute, mylist)¶ Imports attribute into Atoms object from mylist array.
Parameters: - attribute (str) – name of atoms attribute or x, y, or z for one of the xyz coordinates.
- mylist (list) – value for each atom
-
neighbors_deprecated
(target_atoms=None, distance=None, verbose=True)¶ Identify self.atom neighbors of target (or self.atoms) + symm equivs.
Pass through routine for Atoms.neighbors or Symmetry.neighbors.
Parameters: distance (float) – Angstrom within which considered close. Default calculated from Q.option.density_radius
+Q.option.use_pts
.Todo
Consider adding to distance to allow for coordinate movement. May not be important as density_radius typically generous.
-
neighbors_prereq
(*args, **kwargs)¶ Pass-through, trying to meet the prerequisites.
-
parameterization_init
()¶ Initialize parameterization - directives on what/how to refine.
-
parameterization_init_prereq
(*args, **kwargs)¶ Pass-through, trying to meet the prerequisites.
-
parameterize
()¶ Pass-back after checking prerequisites for parameterize sub-comds.
-
parameterize_prereq
(*args, **kwargs)¶ Pass-through, trying to meet the prerequisites.
-
perturb
(individual=False, displacement=1.0, direction=None, steps=1, repeats=1, verbose=True, exception=False, seed=None)¶ Shifts coordinates in steps, calculating comparison stats & gradients.
Purpose: to evaluate the convergence radius with the current options.
Perturbation may be rigid group or individual-atom, usually in random directions, and repeated to generate mean values & standard deviations. For a single rigid perturbation, a direction may be set. For rigid-group perturbation, displacement is the actual value, whereas for individual-atom, it is the mean of a normal distribution.
Parameters: - individual (bool) – shift atoms independently, else as a rigid body.
- displacement (float) – overall Angstrom shift (rigid body) or rms shift (individual atom).
- direction (nd.array-like, shape(3,1) or Nonetype) – unit vector for rigid displacement or None if to be chosen randomly.
- steps (int) – number of steps into which the displacement is to be divided.
- repeats (int) – Number of random-direction shifts over-which statistics are to be averaged.
- verbose (bool) –
- exception (bool) – terminate on error, else return None
- seed (int) – for random number generator
Returns: (magnitude(s), correlation coefficient(s), rms, R-factor(s)).
Return type: tuple of 4 {float or nd.array.size(steps)}
Attention
over-writes atomic map, but as the last loop is with zero-error, this should be benign.
-
perturb_prereq
(*args, **kwargs)¶ Pass-through, trying to meet the prerequisites.
-
profile
(b=10.0, atom_type='C', max_dist=4.0, points=17)¶ Print calculated density profile of an isolated atom.
Parameters: - b (float) – thermal factor
- atom_type (str) –
- max_dist (float) – tabulate up to this distance from atom center
- points (int) – number of columns in table
-
randomize
(**arguments)¶
-
randomize_prereq
(*args, **kwargs)¶ Pass-through, trying to meet the prerequisites.
-
reconcile_types
()¶ Reconcile atom types of coordinate file with those of form factor data base.
-
reconcile_types_prereq
()¶ Pass-through, trying to meet the prerequisites.
-
refine
(continuation=False, restart=False, method='l_bfgs_b', max_cycles=0, min_improvement=0.0, min_grad=0.0, reference=None, verbosity=0)¶ Refine atomic model (currently free-atom w/o stereochemical restraints).
Warning
Joint refinement of occupancy and B-factors for individual atoms requires very high resolution data.
Note
Convergence can be improved by setting stringent convergence criteria, eg. min_improvement = 1e.-9.
Todo
arguments for link, unlink (brought from input).
See also
Until joint refinement implemented here, see alternative image optimization in
image_refine()
etc.Parameters: - continuation (bool) – If not restart, treat as a continuation of a prior batch if no change in parameterization and atoms since last refinement. False - force a new batch.
- restart (bool) – from start of prior refinement.
- method (str) – currently supported are ‘l_bfgs_b’ (limited memory; recommended) and ‘bfgs’ (faster for small structures).
Note
l_bfgs is usually recommended as the memory usage of bfgs increases with the square of the number of parameters (atoms). l_bfgs also supports limit-constrained refinement.
Parameters: - min_improvement (float) – refinement (l_bfgs_b) is stopped when relative change in function < min_improvement, i.e. precision of 1.e-2 stops when change < 1% of current residual value.
- max_cycles (int) – maximum number of max_cycles (hard limit for bfgs; approx. for l-bfgs, as converted to a corresponding limit on function / gradient calculations).
- min_grad (float) – refinement stopped when the largest derivative magnitude falls below this value (internal parameter units as reported by analyze).
- reference (Atoms or NoneType) – if torsion angle parameter changes are to be restrained (option.torsion_weight), an optional reference state of the moving structure (same atoms, order etc.) can be provided (eg. input structure); else any restraint will apply from the start of each refinement batch.
- verbosity (int) – -1 for silent, 0 for terse, 1 for verbose, > 1 for (debugging) information on each function or gradient evaluation.
Publications describing work using l_bfgs_b, or commercial products using it, should cite one of the Nocedal references [Byrd-1995] [Zhu-1997].
[Byrd-1995] R. H. Byrd, P. Lu and J. Nocedal. A Limited Memory Algorithm for Bound Constrained Optimization, (1995), SIAM Journal on Scientific and Statistical Computing , 16, 5, pp. 1190-1208. [Zhu-1997] C. Zhu, R. H. Byrd and J. Nocedal. L-BFGS-B: Algorithm 778: L-BFGS-B, FORTRAN routines for large scale bound constrained optimization (1997), ACM Transactions on Mathematical Software, Vol 23, Num. 4, pp. 550 - 560. Todo
Run extract at end with refined[0] to save a copy of final parameters for later analysis if required.
-
refine_magnification
()¶ Refine (only) magnification.
Deprecated since version (mostly?): see self.image_refine, so have not defined prerequisites.
-
refine_prereq
(*args, **kwargs)¶ Pass-through, trying to meet the prerequisites.
-
residual
¶ Weighted local (rhoo - rho:sub:’c`)2, calculated w/ current options.
Return type: float
-
residual_db
¶ Weighted partials of local residual re B-factors: d((rhoo - rho:sub:’c`)2)/dB.
Return type: 1D ndarray length=#atoms
-
residual_do
¶ Weighted partials of residual re occupancies: d((rhoo - rho:sub:’c`)2)/dO.
Return type: 1D ndarray length=#atoms
-
residual_dx
¶ Weighted partials of residual re x atomic coordinates: d((rhoo - rho:sub:’c`)2)/dx.
Return type: 1D ndarray length=#atoms
-
residual_dxyz
¶ Weighted partials of residual re atomic positions: d((rhoo - rho:sub:’c`)2)/dxyz.
Return type: ndarray shape(3,#atoms)
-
residual_dy
¶ Weighted partials of residual re y atomic coordinates: d((rhoo - rho:sub:’c`)2)/dy.
Return type: 1D ndarray length=#atoms
-
residual_dz
¶ Weighted partials of residual re z atomic coordinates: d((rhoo - rho:sub:’c`)2)/dz.
Return type: 1D ndarray length=#atoms
-
restrain_to_reference
= True¶
-
restraints
(verbose=True)¶ Information on ancilliary restraints (B-factor).
Parameters: verbose (bool) – print rms deviations. Returns: rms deviations (B-factors). Return type: tuple of float(s)
-
restraints_prereq
(*args, **kwargs)¶ Pass-through, trying to meet the prerequisites.
-
run_test
(name, arguments=None)¶ Program development testing.
-
symmetry_current
¶ True if current coordinates were same as last-expanded.
-
symmetry_expand
(target_atoms=None, center=None, distance=None)¶ Identify self.atom neighbors of target (or self.atoms) + symm equivs.
Pass through for Symmetry.neighbors.
Parameters: - center ([float, float, float]) – atoms w/in distance of this point instead of other atoms.
- distance (float) – Angstrom within which considered close.
-
symmetry_expand_prereq
(*args, **kwargs)¶ Pass-through, trying to meet the prerequisites.
-
symmetry_init
(distance=None, force=False)¶ Initialize symmetry calculation & identify neighbors.
By default, initialization is conditional upon presence of space_group or local_symmetry, so lattice symmetry will only be applied if the space group is declared as at least P1. Unit_cell is not sufficient to invoke lattice symmetry, because it is required to interpret input maps and could be either a real lattice repeat or an arbitrary volume enclosing a single particle (in EM).
Parameters: - distance (float) – neighbor cut-off; default = density radius + use_pts.
- force (bool) – initialize even if no space_group or local_symmetry.
-
symmetry_init_prereq
(*args, **kwargs)¶ Pass-through, trying to meet the prerequisites.
-
verbose
= False¶
-
write_map
(filename, observed=False, calculated=False, difference=False, scaled=False, title='')¶ Write map file
Parameters: - filename (str) – output map
- observed (bool) – output experimental map
- scaled (bool) – output experimental map, scaled to atomic model
- calculated (bool) – output map calculated from atomic model
- difference (bool) – output difference map, observed minus model
- title (str) – put in the map header
-
write_map_prereq
(*args, **kwargs)¶ Pass-through, trying to meet the prerequisites.
-
-
class
rsref.
TopCommands
(tasks, initial_commands=None, py_locals={}, parser=None)¶ Bases:
atoms.Commands
,optimize.Commands
,cmd2nest.TopCmd
,cmd2nest.ProgCmd
Top-level commands for Program Prog.
Parameters: - tasks (
Contol
) – class instance that defines methods to be called. - initial_commands (str or list of str) – file(s) of commands to run before interactive loop at the top level, and before any command-line commands.
- py_locals (dict) – objects added to the local namespace in the python interpreter invoked by ‘py’ commands.
- parser (
ArgumentParser
(argparse') or :py:class:`OptionParser
(optparse
)) – to which -t / –test option will be added for cmd2 regression testing. Should only be used once / program.
-
do_evaluate
(instance, arg)¶ Calculate statistics comparing current model to map. Usage: evaluate [options] [different-pdb-file]
- Options:
-h, --help show this help message and exit -v, --verbose Including scaling information.
-
do_image_refine
(instance, arg)¶ Refine image (map) parameters to maximize agreement with atomic model. Usage: image_refine [options] [different-pdb-file]
- Options:
-h, --help show this help message and exit -M, --magnification Optimize magnification. -B, --overall_B Optimize overall B (-B & -E mutually exclusive). -E, --envelope Optimize EM envelope function (-B & -E mutually exclusive). -R, --resolution Optimize resolution (low-pass filter on atomic model to best match map). -C INT, --max_cycles=INT Maximum number of cycles. -i FLOAT, --min_improvement=FLOAT End when per-cycle improvement falls below this value. -g FLOAT, --min_grad=FLOAT End when gradient norm falls below this value. -f, --finite_difference Numerical derivatives instead of analytical. -v INT, --verbosity=INT Per-cycle logging: -1 (terse) to 5 (verbose). -?, –confirm Request Y/N conformation before acceptance.
-
do_map
(instance, arg)¶ Write density map(s) on the grid of the input map.
Model and difference density is calculated for atoms and their immediate (symmetry-related) neighbors that have been selected previously. This is consistent with RSRef’s modality of evaluating the fit over molecules rather than a 3D unit cell volume. If the structure has molecular or crystallographic symmetry, and the statistics of the map (scale, correlation, sigma) are to be analyzed externally throughout a parallelepiped, symexp.py should be used first to fill the volume (plus a margin) with the symmetry-related atoms. (Although popular, such volume-based assessments, without a molecular molecular mask, are discouraged, because the solvent volume massages the statistics.)
Usage: map [options] file name
- Options:
-h, --help show this help message and exit -c, --calculated Simulated from the atomic model using current imaging parameters. -d, --difference Observed minus calculated, scaled & using imaging parameters. -o, --observed Input experimental map, having applied any change in magnification. -s, --scaled Input experimental map, having applied any change in magnification, and scaled to atomic model (required command-line option –scale_to_model.) -t TITLE, --title=TITLE Title used in map header (quote to protect white-space & newlines).
-
do_neighbors
(instance, arg)¶ Identify neighbors within distance of (selected) atoms. Usage: neighbors [options] arg
- Options:
-h, --help show this help message and exit -d FLOAT, --distance=FLOAT Searches for neighbors within distance of any atom (default: 3.5 A).
-
do_parameterize
(instance, arg)¶ For designated parameter type(s) (default positions), select atoms to be refined & how. Usage: parameterize [options] arg
- Options:
-h, --help show this help message and exit -b, --bfactor Designate which B-factors to be refined. -o, --occupancy Designate which occupancies to be refined. -p, --position Designate which atomic positions (xyz) to be refined.
-
do_perturb
(instance, arg)¶ Perturb model & calculate statistics. Usage: perturb [options] displacement (A; def. 1.0)
- Options:
-h, --help show this help message and exit -i, --individual Perturb atoms individually, else as rigid body. -n INT, --steps=INT Divide displacement into n logarithmic steps (default: 1). -r INT, --repeats=INT Number of random displacements used to calculate statistics (default: 1). - -d (‘x’, ‘y’, ‘z’), –direction=(‘x’, ‘y’, ‘z’)
- Unit vector for direction of perturbation. Chosen randomly if None (default).
-s INT, --seed=INT Seed for random number generator. -v, --verbose Additional statistics.
-
do_profile
(instance, arg)¶ Density of an isolated atom vs. distance from center, useful for setting cut-off radii. Usage: profile [options] arg
- Options:
-h, --help show this help message and exit -B FLOAT, --B_factor=FLOAT Atomic B-factor (default: 10.0). -a TYPE, --atom=TYPE Atom type (default: C).
-
do_randomize
(instance, arg)¶ Randomize coordinates (with normal error distributions). Usage: randomize [options] arg
- Options:
-h, --help show this help message and exit -x FLOAT, --xyz=FLOAT Magnitude of desired RMS xyz positional displacement (default: 0.0). -B FLOAT, --B_factor=FLOAT Desired standard deviation in B-factor (default: 0.0). -O FLOAT, --occupancy=FLOAT Desired standard deviation in occupancy (default: 0.0). -s INT, --seed=INT Seed for random number generator.
-
do_refine
(instance, arg)¶ Refine atomic model. Usage: refine [options] arg
- Options:
-h, --help show this help message and exit -C INT, --max_cycles=INT Maximum number of cycles. -i FLOAT, --min_improvement=FLOAT End when per-cycle improvement falls below this value. -g FLOAT, --min_grad=FLOAT End when gradient norm falls below this value. -n, --new_batch New batch (losing prior history), else continue prior (default) if exists/possible. -r, --restart From original coordinates (implies -n). -v INT, --verbosity=INT Per-cycle logging: -1 (terse) to 3 (verbose) [def. 0].
-
do_restraints
(arg)¶ Information on supplementary restraints.
-
do_test
(rest)¶ Run a program developer’s test.
-
help_test
()¶
- tasks (
-
rsref.
startup
()¶ Program initialization, reading options etc..
Returns: parser Return type: Arguments.ArgumentParser