energy module

Energy for stereochemical restraint.

Module author: Michael S. Chapman <chapmanms@missouri.edu>

Authors:

Michael S. Chapman <chapmanms@missouri.edu>

University of Missouri

Version:

1, Nov 15, 2025

Added in version 1.1.0: 09/19/25 Start

Changed in version 1.1.1: 11/05/25 Assume that openmm-required missing atoms have already been added and capping performed in symmetry expansion, simplifying EnergyRestraint.

Todo

Tidy up cap H. Cap.aa2ace, but also bad code in prune. (What did I mean by bad code in prune?)

Todo

check that the openmm and Atoms records and partials remain aligned

Todo

Do I need to maintain a dictionary for returning to original symmetry chains - or can I use expanded_from to get them.

class energy.Arguments(imports=[], main=None, *args, **kwargs)

Bases: Arguments

Manager for command-line arguments from main and imported modules.

This is a template to be copied into modules, for the handling of command- line options.

methods export() and domestic() should be overridden by the module subclass with declarations of the command-line arguments needed by the module.

Parameters:
  • imports ((list of) ArgumentParser(s)) – (list of) module(s) from which to obtain “parent” ArgumentParser objects for inclusion.

  • main (bool|NoneType) – Add information and arguments appropriate for a main program (or not); if None, will determine by whether this class is defined in __main__.

domestic()

Defines options used only when main program and not when imported.

export()

Defines options used in both stand-alone and imported modes.

class energy.Energy(coordinates)

Bases: object

Potential energy and forces (derivatives) for stereochemical restraint

Todo

compare internally generated H over runs and to external reduce.

Todo

support for alternate conformers that is not intrinsic to openmm may have to be added by (inefficient) replicate systems whose energies and forces are summed, or by patching later (for local regions).

Todo

constraining positions of HETATMs. With energy minimization, HOH & ions are destabilizing. If this remains true with map-fitting, a solution may be to provide bounds for (just) these atoms, constraining w/in a maximal shift. Google “scipy minimize l-bfgs-b constrain some variables”.

Note

Coordinates in Atoms and Openmm are transposed. Atoms.xyz has shape (3,n), i.e. atom positions are column vectors that are efficient for matrix transformations. In openmm each atom is tuple, so a list of atoms asNumpy becomes n row vectors. Internally, Energy._self_row stores atoms as rows (openmm-like). Input and output is with atoms as column vectors as in Energy.xyz. Transposes are used to inter-convert.

Instantiate an openmm system.

Parameters:

coordinates (string) – atomic coordinates - filename

Todo

atoms object (optional Selection)

For coordinates containing fragmented chains: Fragments must have terminated (OXT atom for peptides), but a TER record is not required, and the protonation of the N-terminus is not critical. Internally, the fragments will be given different chain IDs in the openmm.system. (The original chain and residue names are available from atom.residue.chain.id & atom.residue.id should they be needed, perhaps not.)

Todo

Internally unique chain IDs for fragments. This will be important for symmetry expansion by complete residues (only complete chains would work now). Energy instances return only energy, forces and positions, so names can be updated internally without any outside impact. Will want to check whether can use multi-character chain IDs, particularly if not writing out. For atoms input, should make deep copy (of chain) then update the chain names in each fragment, eg. A1, A2, etc.. chain.id is not writable, everything must be complete before creating topology. Local (internal) change of chain names is indeed preferable, because it will not confound Selection(s).

property energy

Potential energy of the current state

Returns:

energy (kJ / mol)

Return type:

openmm.unit.Quantity

f(coordinates, column=True)

Potential energy with flattened new coordinates

Param:

coordinates in Angstrom

Type:

ndarray.shape(3n)

Parameters:

column (bool) – xyz are a (flattened) array shape(3,n) with atom positions are column vectors (as in Atoms). If False, atom xyzs are row vectors within flattened coordinates.shape(n,3).

Returns:

energy (kJ / mol)

Return type:

float

property fixed
forcefields = ['amber19-all.xml', 'amber19/tip3pfb.xml']
Variables:

forcefields – list for openmm to be used throughout for consistency.

property forces

Forces on atoms (negative derivatives) of the current state

Warning

if H or other atoms inserted, will not align w/ input atoms.

Todo

something like align forces, tracking input atoms, forces.

Returns:

forces (derivatives)

Return type:

ndarray.shape(3,n)

fprime(coordinates, column=True)

Partial derivatives of energy (negative forces) re: atomic coordinates

Param:

coordinates in Angstrom (flattened from shape (-1,3))

Type:

ndarray.shape(3n)

Parameters:

column (bool) – xyz are a (flattened) array shape(3,n) with atom positions are column vectors (as in Atoms). If False, atom xyzs are row vectors within flattened coordinates.shape(n,3).

Returns:

forces (derivatives)

Return type:

ndarray.shape(3n)

model_stats()

Stereochemical statistics pass-through (using PDB validation tool)

Returns:

rmsdbond, rmsdangle, ramachandran[favored, allowed, generously allowed]

Return type:

float, float, list(float*3)

property numbers

Numbers of atoms

Returns:

total, moving, fixed, formatted

Return type:

int, int, int, string

randomize(shift=0.1, verbose=True)

Randomize atomic positions

Parameters:

shift (float) – target mean of random shifts

Returns:

average norm shift, perturbed xyz

Return type:

float, ndarray.shape(3,-1) (Atoms column vectors)

Note that assuming column vectors (like Atoms), not row-like openmm. Should work as self.xyz will set internal coordinates.

refine(max_cycles=100, min_improvement=0.0001, accuracy=0.01, min_grad=0.0, method='l_bfgs_b', verbosity=5)

Refinement of previously instantiated parameters.

See also

Slightly skinny derivative of optimize.Optimizer.refine

Parameters:
  • max_cycles (int) – maximum number of max_cycles (hard limit for bfgs; approx. for l-bfgs, as have to guess a corresponding limit on function calculations).

  • min_improvement (float) – refinement (l_bfgs_b) is stopped when relative change in function < min_improvement, i.e. precision of 1.e-2 stops when change < 1% of current residual value.

  • accuracy (float) – criteria for convergence if estimated fractional error for all parameters is less than accuracy.

  • min_grad (float) – refinement stopped when the largest derivative magnitude falls below this value (internal parameter units as reported by analyze).

  • method (str) – for optimization. Tested are ‘l_bfgs_b’, ‘bfgs’, but ‘Powell’, ‘Newton-CG’ & ‘TNC’ are also available (see scipy.optimize.minimize documentation) bfgs approximates full Hessian (2nd derivatives; possible with small number of parameters); l_bfgs_b for limited memory approximation (most atomic models). `` l_bfgs_b`` also supports constrained refinement and appears more stable.

  • verbosity (int) – -1 for silent, 0 for terse, 1 for verbose, > 1 for (debugging) information on each function or gradient evaluation. If None, use value given on instantiation.

Warning

It is critical that constraints=HBonds not be used in self.system=forcefield.createSystem(). OpenMM extends l-bfgs to convert constraints into supplementary restraints, but this is not supported here. Without this, l-bfgs becomes unstable, likely because the system is enforcing the constraint on update before each calculation of energy or force, leading to inconsistency of the Jacobian and Hessian (derivative matrices) with the current parameters. The current use does not anticipate molecular dynamics, the main reason for using constraints=HBonds, so there are no plans to reverse engineer similar conversion of constraints to restraints.

Todo

add default ‘auto’ method that would set according to number of parameters.

Publications describing work using l_bfgs_b, or commercial products using it, should cite one of the Nocedal references [Byrd-1995] [Zhu-1997].

Todo

Once SciPy ver >= 0.12 becomes prevalent, remove callback_supported from Objective, and callBackEmulator from optimize.

Todo

On migration to scipy.optimize.minimize, the old code using from fmin_l_bfgs_b and fmin_bfgs has been left under elif blocks and needs to be fully deprecated when confident of changes.

terms(include, *terms)

Terms to include in force & energy calculations.

Parameters:
  • include (bool) – following terms are used (True) or excluded (False)

  • *terms (string) – each argument is a (collection of) component(s) to use (or not), e.g. ‘NonBonded’, ‘CenterOfMass’ (see self.newgroups.keys()). If not provided, terms defaults to all possible components.

Returns:

self

Return type:

Energy

Then specify in getState(…, groups=self.forcegroups) for all calls with getEnergy=True & getForces=True. This is needed when energy and forces are calculated directly. It is not documented in openmm, because it is done implicitly in simulation methods like minimizeEnergy() by LocalEnergyMimimizer().

Tested using overlapping.pdb that has 2 overlapping chains.

..todo:: forcegroups could provide a path for adjusting the weight on

different stereochemical terms, calculating seperately and weighting before summing. Caution: this would undermine the force field self-consistency.

testpartial(atom=34, delta=0.0009, tolerance=0.001)

Spot-check analytical force calculation vs. finite difference partial

Parameters:
  • atom (int) – atom index

  • delta (float) – increment for finite difference calculation

  • tolerance (float) – exception raised if the discrepancy exceeds this

default delta came from search for best finite difference for 1 atom, so this might not be general.

use_minimize = True
Variables:

use_minimize (bool) – migrate to the wrapper available with scipy 1.6 with more consistent input than fmin_bgfs(), fmin_l_bfgs_b() etc..

write_pdb(file='../../Sandbox/Openmm/output.pdb')

Write PDB file of coordinates as curated for openmm

Warning

Expect H to be inserted, B, residue number & Occupancy reset

property xyz

Atomic positions (for openmm system) as Angstrom numpy array

Warning

Expect H to be inserted, B, residue number & Occupancy reset

Returns:

coordinates in Angstrom for atoms in openmm system

Type:

ndarray.shape(3,n)

class energy.EnergyRestraint(atoms, weight=1, symmetry=None, exclude=['centerOfMass'])

Bases: object

Restraint from force field potential energy

Changed in version 1.1.1: 11/05/25 OpenMM requires models with OXT, H. Originally, kept EnergyRestraint models quarantined to allow to differ from rest of program. Now expect input of curated models and symmetry to cap fragments, so that internal models are congruent w/ outside.

Todo

wrap in an “EnergiesRestraint” that would sum independent components either from alternate conformers (not directly supported in OpenMM) or splitting and re-weighting energy terms (bonded & non-bonded).

Parameters:
  • atoms (Atoms) – coordinates (used to extract atom types etc., but positions will be updated for each restraint calculation). This should be the unique protomer from which symmetry equivalents will be generated.

  • weight (float) – multiplier for this energy restraint. It can be used to weight different energy terms, but the sum should be 1.0, with the overall weighting determined by options.weight (applied to the map term).

  • symmetry (Symmetry/NoneType) – defines molecular & lattice symmetry, if any.

  • exclude (list(string(s))) – force groups to exclude

Todo

exclude will need re-implementing if want to exclude covalent.

OpennMM requires “complete” coordinates, including H, X-terminal OXT. Furthermore, peptide fragments of one residue are disallowed, and electrostatic forces can be a dominating artifact if the unnatural N- and C-terminal ends of fragments are not capped. Thus, OpenMM requires modifications of the input coordinates, including insertion of modeled additions. A user might choose to start with coordinates that include (some of) these modifications, but the premise of EnergyRestraint is that they are generated as needed and maintained internally. Regeneration for every function calculation would be time-consuming and perhaps not viable as add_missing() depends on random numbers. Thus, EnergyRestraint will maintain concurrency of coordinate sets in different formats (below), such that they can be updated with just the positions coming from the optimizer.

Variables:
  • plusNei (Atoms) – atoms with symmetry neighbors added.

  • _map2orig (numpy.ma.MarkedArray, dtype=int, shape=len(plusNei)) – unmasked elements provide the index of the corresponding atom in atoms. Masked elements would be openMM additions and there should be none in the 1.1.1 implementation.

f(atoms)

Objective function for potential energy restraint

Parameters:

atoms (Atoms) – current (trial) coordinates

Returns:

energy * self.weight

Return type:

float

f_grad(atoms)

Partial derivatives of energy re: each coordinate.

Parameters:

atoms (Atoms) – coordinates for current (trial) structure.

Returns:

∂U/∂xi, where U is the (total) potential energy, with respect to the position, x, of the ith atom.

Return type:

(ndarray[N, 3],)

This is a pass-through to Energy.fprime, returning the partials which are the negatives of the forces.

get_history()

get the history for f and f_grad

get_neighbors(atoms, lazy=True)

Get the neighbors, using the symmetry module.

Parameters:
  • atoms (Atoms) – Unique set of coordinates.

  • lazy (bool) – use previously calculated list of neighboring atoms

Returns:

symmetry-expanded coordinates

Return type:

Atoms, ndarray, ndarray

With tolerance large (Symmetry.neighbors, below), the IDs of neighbors will only be updated with lazy=False. EnergyRestraint should be instantiated with lazy=False (to ensure independence from prior Symmetry instances). Thereafter atom positions should be updated using lazy=True (and a large tolerance for Symmetry.neighbors() so that the identities of neighbor atoms in the OpenMM system are not changing.

Todo

options to exclude same residue, conformer

update(hist, value)

update the history

Parameters:
  • hist (str) – ‘f’ or ‘f_grad’ designating the history to extend

  • value (str) – text

energy.main()

Regression testing

vs. openmm.app.simulation.Simulation.minimizeEnergy():

  • Above method is a pass through openmm.openmm.LocalEnergyMinimizer.minimize() to _openmm.LocalEnergyMinimizer_minimize in a .so library. Would need source to view (on github.com).

  • https://docs.openmm.org/6.1.0/userguide/theory.html states that LocalEnergyMinimizer implements L-BFGS with distance constraints implemented by iterative escalation of a restraining force (not applicable).

energy.model_stats(coords)

Stereochemical statistics (using PDB validation tool)

Parameters:

coords (Atoms|Energy|string) – coordinates from polmorphic objects or file-name

Returns:

rmsdbond, rmsdangle, ramachandran[favored, allowed, generously allowed]

Return type:

float, float, list(float*3)

Sledgehammer to crack a nut. Runs RCSB validation software in a temporary directory (takes several seconds) to extract a couple of covalent bond RMSDs and Ramachandran statistics.

energy.termini(atoms, rename=True, verbose=True)

Identify terminal residues and rename per Amber

Parameters:
  • atoms (Atoms)

  • rename (bool) – Residue type XXX changed to NXXX or CXXX

  • verbose (bool) – output or silent

Returns:

modified atoms

Return type:

Atoms

Warning

Notwithstanding google, Amber’s terminal-specific residue types (eg. NASN & CASN) are not resolving the ValueError exception in addHydrogens(), so advising against use of termini(). The 4-character terminal residue type are Amber-specific and not compatible with the standard PDB format. The required support in Atoms.from_file() and Atoms.write() has been rolled back with comments dated 10/17/25.

The function may provide a template for the capping of terminal symmetry equivalent oligopeptides.