atoms module¶

Atomic coordinates and properties.

Module author: Michael S. Chapman <chapmanms@missouri.edu>

Authors:

Michael S. Chapman <chapmanms@missouri.edu>, Brynmor K. Chapman

Oregon Health & Science University & University of Missouri

Version:

1, Nov 26, 2024

Changed in version 02/14/2010.

Changed in version 11/4/12: - split from atomic_density.py

Changed in version 0.4.2: (10/24/13)

Changed in version 0.5.0: (3/12/15) ReStructured Text documentation

Changed in version 0.5.2: (8/26/15) is_cterm() support for bonded_main in torsion

Changed in version 0.5.4: 9/29/15 Optional echoing of input ANISOU to output & CA only.

Changed in version 0.5.5: 10/13/15 Anisotropic Us ride with atom rotations.

Changed in version 0.5.6: 01/11/16 Comparison of anisotropic Bs.

Changed in version 0.5.9: 11/21/17 Added atoms.angle() from Davis Catolico.

Changed in version 0.6.0: 06/20/20 Support of different atom-name nomenclatures; write outputs segid if input & not using symmetry extensions of PDB format.

Changed in version 0.6.1: 08/24/20 Support of CNS/X-plor segid designation of alternative conformers;

Changed in version 0.6.2: 08/24/20 Connectivity determined from covalent radii rather than absolute distances, adding support for hydrogens;

Changed in version 1.0.0: 09/26/20 Python 2.7 -> 3.6. Optparse has been deprecated, requiring changes to cmd2 use.

class atoms.Anisotropics(elements)¶

Bases: object

Anisotropic atomic displacement factors for an array of atoms.

Variables:: U (numpy.ma.ndarray.shape(n,3,3)) – Symmetric U tensors, Å² units, for n atoms.

For atoms without anisotropic U, self.U.mask is True. The class provides methods rotate() to rotate the tensors of a selection of atoms (as one would also rotate atom positions), and iso_like() to rescale U elements to be compatible with (changed) isotropic B-values.

Unit testing is partial, but supplemented by one-off regression testing. In the latter, Ethan Merritt’s coruij.f (Acta D55: 1997-2004 (1999)) was used to compare the anisotropic Us for arginine kinase determined at ambient temperature in two unit cells. On a first run, the two structures were provided unchanged, along with a superimposition matrix. On a second run, the structures were superposed by PaStO and compared using coruij.f with a unit matrix. ADP comparison statistics, including SUV, were nearly identical (there might have been small differences in the superimposition.) This validates __init__(), isotropic(), rotate() and triangular(). Unit tests for ccuij(), cciso(), isotropize(), suij() are available based also on regression from coruij.f.

Initialize anisotropic U tensors from unique elements in PDB

Parameters:: elements (ndarray) – U-tensor [1,1], [2,2], [3,3], [1,2], [1,3], [2,3] units: Å², noting that above indices start at 1, but array elements from 0.
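The element ordering above can be illustrated with a minimal sketch. It assumes an (n, 6) input layout and uses a hypothetical helper name, u_from_elements; the class constructor may store or mask the data differently.

```python
import numpy as np

def u_from_elements(elements):
    """Expand (n, 6) unique elements into (n, 3, 3) symmetric tensors.

    Assumed column order: U11, U22, U33, U12, U13, U23 (1-based), which
    maps to array indices (0,0), (1,1), (2,2), (0,1), (0,2), (1,2).
    """
    elements = np.asarray(elements, dtype=float)
    rows = np.array([0, 1, 2, 0, 0, 1])
    cols = np.array([0, 1, 2, 1, 2, 2])
    u = np.zeros((elements.shape[0], 3, 3))
    u[:, rows, cols] = elements   # diagonal and upper triangle
    u[:, cols, rows] = elements   # mirror into the lower triangle
    return u

example = u_from_elements([[0.10, 0.20, 0.30, 0.01, 0.02, 0.03]])
```

Here example[0, 0, 1] and example[0, 1, 0] both hold the U12 value, so the tensor is symmetric by construction.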

alignment(other, paired=True, scale=False, individual=False, intermediate=(0.0, 1.0), three_axes=False, self_atoms=None, other_atoms=None, diagnostics=[0, 50.0])¶

Attempts to measure alignment of U principal axes.

Parameters:

other (Anisotropics) – unlike self, other must contain only the Us for paired atoms, reordered according to the atom order in self. This will usually be the Anisotropics instance of a Pairing.reordered set of atoms.
scale (Group or Selection or bool) – normalize for average b_iso atom size within group(s) of atoms in self (if specified), treating all non-selected atoms as an additional group.
individual (bool) – scaling by individual atom, else by average isotropic B
intermediate ((float, float)) – beyond these prolateness limits, the ellipsoid will be considered pure oblate and prolate respectively, transitioning between the two with a linear ramp between the limits.
three_axes (bool) – calculate alignment in an additional way using all three axes, permuting to find the best alignment.
self_atoms (Atoms) – from which self’s Us taken (for debugging only).
other_atoms (Atoms) – from which other’s Us taken (for debugging only).
diagnostics ([int, float]) – print diagnostics for 1st n cases where alignment stats differ by more than threshold.

Returns:

misalignment [x 3 if three_axes]

Return type:

MaskedArray [x 3 if three_axes]

Misalignment is calculated by VecList.align_unique() which, in over-simplified explanation, could be considered to be the rotation from best alignment of the unique axes from paired ellipsoid atoms.

Misalignment can optionally also be calculated (if three_axes) as the average (cosine) deviation between vectors in self and other, after permuting self for the maximum sum of dot products between paired vectors. This has been deprecated from Pairing, because inspection of discrepancies between the two misalignment calculations showed that this calculation was not sensitive to ~90 deg rotations of ellipsoids.

analyze()¶

Anisotropy and prolateness (cigar-like shape).

Returns:: anisotropy, prolateness
Return type:: (numpy.ma.ndarray, numpy.ma.ndarray)

Anisotropy is defined conventionally: The eigenvalues of the U tensor are the squared displacements along the principal axes, and it is their ratios that are used. Anisotropy = minimal axial displacement² / maximal, so its range is from 0.0 (highly anisotropic) to 1.0 (spherical);

Prolateness is arbitrarily defined here by the ratios of displacement: (axis_max - axis_mid)/(axis_max - axis_min), so its range is from 0.0 (oblate, pancake) to 1.0 (prolate, cigar). Note that prolateness is calculated with the displacements (not squared) on which the density ellipsoids depend, whereas anisotropy is calculated conventionally using displacement².
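The two definitions can be sketched with NumPy as follows. This is an illustration of the formulas as stated, not the module's code; the function name and the choice to mask exactly spherical atoms (where prolateness is undefined) are assumptions.

```python
import numpy as np

def anisotropy_prolateness(u):
    """Anisotropy and prolateness for a stack of (n, 3, 3) U tensors."""
    squared = np.linalg.eigvalsh(u)              # ascending, shape (n, 3)
    anisotropy = squared[:, 0] / squared[:, 2]   # min / max displacement^2
    # Prolateness uses displacements (square roots), not their squares.
    axis_min, axis_mid, axis_max = np.sqrt(squared).T
    spread = np.ma.masked_equal(axis_max - axis_min, 0.0)
    prolateness = (axis_max - axis_mid) / spread
    return anisotropy, prolateness

cigar = np.diag([1.0, 1.0, 4.0])      # one long axis
pancake = np.diag([1.0, 4.0, 4.0])    # one short axis
anisotropy, prolateness = anisotropy_prolateness(np.array([cigar, pancake]))
```

Both test tensors have the same anisotropy, but prolateness separates the cigar from the pancake.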

axes()¶

Magnitudes and principal axes of displacement.

Returns:: (square displacements along principal axes, unit vectors along each row (fastest dimension), so the m^th vector of tensor n is in axes[n, m, :])
Return type:: (numpy.ma.ndarray.shape=(n,3), numpy.ma.ndarray.shape(n,3,3))
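A minimal sketch of the row-vector convention described above, assuming a plain eigendecomposition (the function name is hypothetical). numpy.linalg.eigh returns eigenvectors as columns, so the last two axes are swapped to place each vector along a row.

```python
import numpy as np

def principal_axes(u):
    """Squared displacements and unit vectors for (n, 3, 3) U tensors."""
    values, vectors = np.linalg.eigh(u)           # eigenvectors in columns
    return values, np.swapaxes(vectors, -1, -2)   # now axes[n, m, :]

u = np.array([np.diag([3.0, 1.0, 2.0])])
values, axes = principal_axes(u)
```

With this layout, u[n] @ axes[n, m] equals values[n, m] * axes[n, m] for each axis m.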

cciso()¶

Anisotropic atom density correlation w/ its isotropic equivalent.

Used in Merritt, E.A. Acta D55: 1997-2004 (1999).

ccuij(other, scale=False)¶

Density correlations between this and another set of anisotropic atoms

Equation 6 of Merritt, E.A. Acta D55: 1997-2004 (1999)

Parameters:

other (Anisotropics)
scale (bool) – normalize for atom size, as in numerator of eqn 8 of Merritt

Return ccuij:

Correlation coefficient for densities of each atom where anisotropic.

Rtype MaskedArray:

See also

method suij for normalized shape correlation
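For orientation, a correlation of this kind can be derived as the normalized overlap integral of two zero-mean Gaussian densities with covariances U and V. The sketch below implements that derivation; treating it as equivalent to the unscaled ccuij() is an assumption to check against the cited paper, and the function name is hypothetical.

```python
import numpy as np

def density_correlation(u, v):
    """Normalized overlap of Gaussian densities with covariances u and v.

    cc = sqrt(8) * (det U * det V)**(1/4) / sqrt(det(U + V))
    """
    det_u = np.linalg.det(u)
    det_v = np.linalg.det(v)
    det_sum = np.linalg.det(u + v)
    return np.sqrt(8.0) * (det_u * det_v) ** 0.25 / np.sqrt(det_sum)

small = 0.1 * np.eye(3)[None]
large = 0.4 * np.eye(3)[None]
same = density_correlation(small, small)
different = density_correlation(small, large)
```

Identical tensors give a correlation of 1; a pure difference in size lowers it, which is why a separate size-normalized statistic is useful.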

check(atoms=None, verbose=True)¶

Raises diagnostic exceptions if fails integrity checks.

Atoms without anisotropic U are masked.

Parameters:: atoms (Atoms) – from which Us were derived, used for optional diagnostics
Return self:: (just so that check can be chained w/ other methods.)
Rtype Anisotropics:

divergence(other, paired=True, scale=False, individual=False)¶

Kullback-Leibler divergence between this and another set of anisotropic atoms

Equation 23 of Murshudov et al. Acta Cryst. D67: 255-67 (2011); Equation 8 of Merritt, E.A. Acta Cryst. A67: 512-16 (2011)

Parameters:

other (Anisotropics) – unlike self, other must contain only the Us for paired atoms, reordered according to the atom order in self. This will usually be the Anisotropics instance of a Pairing.reordered set of atoms.
scale (Group or Selection or bool) – normalize for average b_iso atom size within group(s) of atoms in self (if specified), treating all non-selected atoms as an additional group.
individual (bool) – scaling by individual atom, else by average isotropic B

Returns:

KL divergence in density for each atom pair where anisotropic.

Rtype MaskedArray:

See also

method suij for normalized shape correlation

The scaling is an extension beyond Murshudov and Merritt, who were considering situations where B-factors would be similar, not (say) structures determined at different temperatures. Its behavior in such cases has not been characterized.

Empirically, scaling does not affect the symmetry of the symmetrized K.-L. divergence, i.e. divergence(a, b) = divergence(b, a)
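The symmetry can be seen in a minimal sketch of a symmetrized K.-L. divergence between zero-mean 3D Gaussians with covariances U_a and U_b. The sum of the two one-way divergences is used here, so the log-determinant terms cancel; the normalization and any scaling applied by divergence() are not reproduced, and the function name is an assumption.

```python
import numpy as np

def symmetrized_kl(u_a, u_b):
    """KL(a||b) + KL(b||a) for stacks of (n, 3, 3) covariance tensors."""
    inv_a = np.linalg.inv(u_a)
    inv_b = np.linalg.inv(u_b)
    trace_ab = np.trace(inv_b @ u_a, axis1=-2, axis2=-1)
    trace_ba = np.trace(inv_a @ u_b, axis1=-2, axis2=-1)
    # Each one-way KL is 0.5 * (trace - 3 + log-det ratio); the log terms
    # have opposite signs and cancel in the sum.
    return 0.5 * (trace_ab + trace_ba) - 3.0

u1 = np.diag([0.1, 0.2, 0.3])[None]
u2 = np.diag([0.3, 0.2, 0.1])[None]
forward = symmetrized_kl(u1, u2)
backward = symmetrized_kl(u2, u1)
```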

iso_like(isotropic, selection=None, copy=False)¶

Scale the anisotropic U(s) to match the isotropic B(s).

Parameters:

isotropic (numpy.ma.array) – B-factors for each atom in self.U
selection (ndarray or Selection or NoneType) – True for scaled atoms or None for all.
copy (bool) – returns modified copy of self.U; else in-place, modifying self

Returns:

modified U if copy, else self, U changed (only) at n selected atoms

Return type:

ndarray or Anisotropics

See also

atoms.iso_like() for single atom.
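A sketch of the rescaling, using the standard relation B_iso = 8π²·trace(U)/3. The function names are hypothetical and masking and selection handling are omitted.

```python
import numpy as np

EIGHT_PI_SQ = 8.0 * np.pi ** 2

def b_iso(u):
    """Equivalent isotropic B for a stack of (n, 3, 3) U tensors."""
    return EIGHT_PI_SQ * np.trace(u, axis1=-2, axis2=-1) / 3.0

def scale_to_b(u, b_target):
    """Rescale each tensor so its equivalent B matches b_target."""
    factor = np.asarray(b_target, dtype=float) / b_iso(u)
    return u * factor[:, None, None]

u = np.diag([0.1, 0.2, 0.3])[None]
scaled = scale_to_b(u, [40.0])
```

Because every element of a tensor is multiplied by the same factor, the ellipsoid changes size but keeps its shape and orientation.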

isotropic()¶

Convert anisotropic U(s) to isotropic B(s).

Parameters:

start (int) – index of 1st atom.
end (int) – index of atom after last.

Returns:

isotropic B-factor(s) as a masked array where can be calculated

Return type:

numpy.ma.ndarray

See also

atoms.isotropic() for single atom.

isotropize(selection=None, copy=False)¶

Spherically symmetric equivalent anisotropic U tensors.

Parameters:

isotropic (numpy.ma.array) – B-factors for each atom in self.U
selection (ndarray or Selection or NoneType) – True for scaled atoms or None for all.
copy (bool) – returns modified copy of self.U; else modified self

Returns:

modified U if copy, else self, U changed (only) at n selected atoms

Return type:

ndarray or Anisotropics

join(added)¶: Join two instances.

Warning

Untested as of 10/17/15

Warning

use with care as result will cease to be aligned with Atoms object.

rotate(operator, selection=None, check=False)¶

Transform Us of selected atoms by rotational component of operator.

Parameters:

operator (numpy.matrix)
selection (ndarray or Selection or NoneType) – True for transformed atoms or None for all.
check (bool) – cross-check rotational invariants (for debugging).

Returns:

self, U changed (only) at n selected atoms

Return type:

Anisotropics

Internal checking was coded, because a fraction of atoms displayed by Coot seemed to have ellipsoids with axes rotated by up to 20 degrees or altered ellipticities. Subsequently, viewing in PyMol is much more satisfying, so we think that it is a bug in Coot. Internal testing includes: (1) that magnitudes of the principal axes are ~invariant with rotation. (2) if reference frame axes are introduced, then the direction cosines of all principal axes remain ~invariant with rotation of both the anisotropic Us and the reference axial basis. These were all checked with hexagonal insulin (non-orthogonal). Precision of ~1.e-05 and ~0.3 deg. seems commensurate with the 5-digit roundoff of ANISOU records in a PDB file.
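The first invariant can be illustrated with the usual tensor transformation U' = R U Rᵀ. This is a sketch under the assumption that the operator's rotational part is a proper 3x3 rotation matrix; the function name is hypothetical.

```python
import numpy as np

def rotate_u(u, rotation):
    """Rotate a stack of (n, 3, 3) U tensors by a 3x3 rotation matrix."""
    rotation = np.asarray(rotation, dtype=float)
    return rotation @ u @ rotation.T

# 90 degree rotation about z: x -> y, y -> -x
rot_z = np.array([[0.0, -1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0]])
u = np.diag([1.0, 2.0, 3.0])[None]
rotated = rotate_u(u, rot_z)

# Principal-axis magnitudes (eigenvalues) should not change.
before = np.linalg.eigvalsh(u)
after = np.linalg.eigvalsh(rotated)
```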

scale(other, paired=True, uniform=None, individual=False)¶

Scale U in a copy of self so that B_iso is the same as in other.

Scaling U tensor is equivalent to a resizing of the real-space atom ellipsoid.

Parameters:

other (Anisotropics) – reference whose Us against which self.U is scaled.
paired (Selection|bool) – atoms of self that are paired with atoms in other.
uniform (Group or Selection or NoneType) – group(s) of atoms in self, within which scaling by average <B_iso> is applied. All atoms not selected are combined into another group for scaling. Thus, the default (None) is all atoms scaled uniformly as one group.
individual (Group or Selection or bool) – group(s) of atoms in self for which scaling is by each B_iso. True for all atoms, False for none.

Returns:

new

Return type:

Anisotropics

scale_individual(other, paired=True, selection=None)¶

Scale so that each selected atom in copy of self.U has same B_iso as in other.

Scaling U tensor is equivalent to a resizing of the real-space atom ellipsoid.

Parameters:

other (Anisotropics)
paired (Selection or bool) – atoms of self that are paired with atoms in other.
selection (Selection or NoneType) – True for selected atoms; None for all atoms

Return new:

Rtype Anisotropics:

scale_uniform(other, paired=True, selection=None)¶

Scale all atoms in selection so that <B_iso> of paired atoms is same as in other.

Scaling U tensor is equivalent to a resizing of the real-space atom ellipsoid.

Parameters:

other (Anisotropics)
paired (Selection|bool) – atoms of self that are paired with atoms in other.
selection (Selection|Group|NoneType) – True for selected atoms; None for all atoms

Return new:

Rtype Anisotropics:

selection(selection)¶

Subset of Anisotropics instance, where selection is True.

Parameters:: selection (Selection) – True where to be copied
Returns:: New set of anisotropic Us
Return type:: Anisotropics

Warning

Untested as of 10/17/15.

Warning

incompatible with update_pdb()

Warning

try to avoid use as result will cease to be aligned with Atoms object. Better to apply selection (or selection.u to self.u) to all-atom arrays.

shuffle()¶

Copy with the anisotropic tensors shuffled randomly between atoms.

Can be used to calculate “randomized” statistics.

subset(start, end=None)¶

Subset of Anisotropics instance, starting at start, and ending at end-1.

Parameters:

start (int) – 1st atom (starting at 0)
end (int) – last atom (default is start+1 for single atom)

Warning

Untested as of 10/17/15.

Warning

incompatible with update_pdb()

Warning

use with care as result will cease to be aligned with Atoms object.

suij(other)¶

Shape correlations between this and another set of anisotropic atoms

After normalization for size. Equation 8 of Merritt, E.A. Acta D55: 1997-2004 (1999); except that the numerator is squared as in the 18-Apr-1999 correction of Merritt’s coruij.f that should use square. (Ethan confirmed ~1/10/16 that square should be used and that the square was used to calculate the statistics in the paper.)

Parameters:: other (Anisotropics)
Return suij:: Correlation coefficient for densities of each atom where anisotropic.
Rtype MaskedArray:

See also

method ccuij for un-normalized correlation

triangular()¶

Top right triangle of U for PDB output.

Returns:: U-tensors [1,1], [2,2], [3,3], [1,2], [1,3], [2,3] for n atoms; units: Å², noting that above indices start at 1, but array elements from 0.
Return type:: numpy.ma.ndarray.shape(6, n)

See also

atoms.triangular() for single atom.

class atoms.Arguments(imports=[], main=None, *args, **kwargs)¶

Bases: Arguments

Manager for command-line arguments from main and imported modules.

This is a template to be copied into modules, for the handling of command-line options.

Methods export() and domestic() should be overridden by the module subclass with declarations of the command-line arguments needed by the module.

Parameters:

imports ((list of) ArgumentParser(s)) – (list of) module(s) from which to obtain “parent” ArgumentParser objects for inclusion.
main (bool|NoneType) – Add information and arguments appropriate for a main program (or not); if None, will determine by whether this class is defined in __main__.

domestic()¶: Defines options used only when main program and not when imported.

export()¶: Defines options used in both stand-alone and imported modes.

class atoms.Atoms(filename=None)¶

Bases: object

Arrays for the atomic coordinates.

Coordinates (xyz), type, occupancy (o) and B-factor (b) are attributes required by core RSRef methods. Optional attributes should be assigned if available from the PDB file, but will tolerate omissions in non-standard PDB files. Arrays will be of length, N = number of (ATOM + HETATM) records, in the last dimension. Unavailable attributes are set to None. Atom_type is used to select the form factor and is derived from element and charge (if available) or atnam in the many PDB files lacking element.

Changed in version 05/20/14: - evaluate and close changed from Selection to ndarray to avoid recursive inclusion of Atoms within Selection.

Todo

Torsion angle parameterization is no longer dependent on MMTK, so simplifications to Atoms are possible: (1) link_torsion (re-binding xyz to external array) not needed, although remote possibility that useful with another external optimizer. (2) polymer_chain is not needed. (3) setattr() is not as crucial, but may still be useful with other (CNS) optimizers. Might want profile to see if worth the overhead. (4) property for xyz can be removed with link_torsion, but could be retained for remotely possible future use if profiling indicates little overhead.

Parameters:

filename (str) – PDB file with which to initiate, else instantiates empty Atoms object.

Variables:

~.xyz (ndarray(type=float,shape=(3,N))) – atom positions in orthogonal Angstrom coordinates, required.
o (ndarray(type=float)) – atomic occupancies, required.
b (ndarray) – atomic temperature factors, required.
atype (ndarray(type=string)) – atom type, required.
~.element (ndarray(type=string)) – element symbol, recommended.
charge (ndarray(type=string)) – eg “++”, recommended.
atnam (ndarray(type=string)) – atom name, required if element absent.
nomenclature (str|NoneType) – convention (if known) for atom names, eg. cif, pdb92, cns
altloc (ndarray(type=string)) – alternative location indicator, optional.
resnam (ndarray(type=string)) – residue name, optional.
chain (ndarray(type=string)) – chain identifier, optional.
resnum (ndarray(type=int)) – residue number, 1-999, optional.
~.insert (ndarray(type=string)) – residue insertion indicator, optional.
segid (ndarray(type=string)) – segment identifier, optional.
_alt_as_segid (ndarray(type=string)) – non-standard CNS alt conformation identifier as segid, optional.
~.bonded (list(list())) – list (each atom) of hash lists of connected intra-chain atom numbers assigned by calling connectivity().
~.evaluate (ndarray(type=bool)) – atom included in comparison of model & experimental density; a superset of refining atoms.
close (ndarray(type=bool)) – neighboring atom whose density might contribute to atoms being evaluated.
linenum (ndarray(type=int)) – atom’s line number from input file.
limits (ndarray(type=float,shape=(2,3))) – lower and upper limits (x,y,z) for coordinate bounding box.
margin (float) – margin surrounding atoms used in bounds (limits) calculation.

class ChainName¶

Bases: object

Unique chain_ids for symmetry equivalents, if possible.

warning = '* Using %s to get unique chain identifiers.*'¶

altloc_in_segid()¶

Standard PDB altloc designation from CNS segid alternative conformer

CNS does not use the standard alternative location indicator, but a segid of the form AC1, AC2, … If the CNS designator is found, the standard altloc designator is set (if not already), the segid AC designator is saved for output in self._alt_in_segid, then self.segid is cleared (for these records) so that segid can be treated in the standard way.

angle(i, j, k)¶

Angle between atoms i, j, and k. Corrected to fit within 180 degrees

Parameters:: i,j,k (int) – atom indices
Returns:: angle (degrees)
Return type:: float
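A minimal sketch of such an angle calculation, assuming j is the vertex atom and coordinates are stored as a (3, N) array as for xyz. Clipping the cosine guards against round-off pushing it outside [-1, 1]; the function name is hypothetical.

```python
import numpy as np

def angle_degrees(xyz, i, j, k):
    """Angle i-j-k in degrees (0 to 180) from a (3, N) coordinate array."""
    v1 = xyz[:, i] - xyz[:, j]
    v2 = xyz[:, k] - xyz[:, j]
    cosine = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return float(np.degrees(np.arccos(np.clip(cosine, -1.0, 1.0))))

# Columns are atoms: (1,0,0), (0,0,0), (0,1,0)
xyz = np.array([[1.0, 0.0, 0.0],
                [0.0, 0.0, 1.0],
                [0.0, 0.0, 0.0]])
right_angle = angle_degrees(xyz, 0, 1, 2)
```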

angle_cos(i, j, k)¶

Angle between atoms i, j, and k. Corrected to fit within 180 degrees

Parameters:: i,j,k (int) – atom indices
Returns:: angle (degrees)
Return type:: float

angle_cterm_dc(i, j, k)¶

The dot product of atoms i, j, and k divided by the magnitude of the distance(i,j) and distance(k,j). This method is only used to obtain x, where arccos(x) would give the angle between i, j, and k.

Parameters:: i,j,k (Atoms) – coordinates
Authors:: Davis Catolico
Deprecated:: appears confused as dot will expect vectors, distance will expect atom numbers. The name seems to refer to the cosine (term) of the dot product involved in angle calculation.

angle_dc(i, j, k)¶

Angle between atoms i, j, and k. Corrected to fit within 180 degrees

Parameters:: i,j,k (Atoms) – coordinates
Authors:: Davis Catolico
Deprecated:: appears confused as dot will expect vectors, distance will expect atom numbers

are_close(attr, value, atol=0.01, rtol=0.0, nan_exception=True)¶

Indices for atoms whose attribute close to value.

Parameters:

attr (str) – one of the one-dimensional float arrays, eg. b, occ
value (float) – target
atol (float) – finds absolute(array[:]-value) <= (atol + rtol*value)
rtol (float) – see above
nan_exception (bool) – raise exception if isnan(value), else return empty list

Returns:

list of indices where close

Return type:

bool ndarray

Note

think about input precision of PDB file when setting atol.

See also

is_close
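The tolerance test quoted above can be sketched on a plain array. The stand-alone function and the sample B-factors are illustrative assumptions; the methods themselves look up the array by attribute name.

```python
import numpy as np

def close_mask(values, target, atol=0.01, rtol=0.0):
    """True where abs(values - target) <= atol + rtol * target."""
    values = np.asarray(values, dtype=float)
    return np.abs(values - target) <= (atol + rtol * target)

b = np.array([20.00, 20.004, 35.5, 19.98])
mask = close_mask(b, 20.0)       # boolean per atom
indices = np.flatnonzero(mask)   # indices of the matching atoms
```

With the default atol of 0.01, a value of 19.98 is not considered close to 20.0, which matters for B-factors written to two decimal places.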

are_neighbors(distance=3.0, verbose=True)¶

Find atoms that might be close to those being evaluated.

Parameters:: distance (float) – within which considered close.
Returns:: True if atoms i & j within distance
Return type:: ndarray(shape=(n,n), dtype=bool)

The boolean array is very sparse, so its full storage might be a concern. For 20,000 atoms, 50 MB is needed (as boolean), and the calculation takes 2s. A dictionary of 1D booleans for only refining atoms would usually take about half the space. A dictionary of the sparse indices for neighbors would cut the storage to 0.2MB, but at the cost of higher computation each time it is used. Neither of these alternatives is coded up, but below the return are blocks that are simpler implementations of the boolean array that might be more easily adapted to dictionaries / indices.
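The trade-off between the dense boolean array and a dictionary of sparse indices can be sketched as follows. This brute-force version is for illustration only and is not the module's implementation; the function names are hypothetical, and each atom is counted as its own neighbor.

```python
import numpy as np

def neighbor_matrix(xyz, distance=3.0):
    """Dense (n, n) boolean array from a (3, n) coordinate array."""
    delta = xyz[:, :, None] - xyz[:, None, :]          # (3, n, n)
    return (delta ** 2).sum(axis=0) <= distance ** 2

def neighbor_indices(xyz, distance=3.0):
    """Sparse alternative: atom index -> indices of its neighbors."""
    close = neighbor_matrix(xyz, distance)
    return {i: np.flatnonzero(row) for i, row in enumerate(close)}

# Three atoms along x at 0, 2 and 10 Angstrom
xyz = np.array([[0.0, 2.0, 10.0],
                [0.0, 0.0, 0.0],
                [0.0, 0.0, 0.0]])
dense = neighbor_matrix(xyz)
sparse = neighbor_indices(xyz)
```

The sparse form stores only the few neighbors per atom, but each lookup then needs an index search rather than a direct boolean read.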

property bonded¶

Hash list of intra-chain atom connectivities.

Returns:: Hash list of intra-chain atom connectivities.
Return type:: list (each atom) of lists of connected atom numbers

bounds(margin=0.0)¶

Limits on orthogonal box containing evaluate coordinates w/ margin.

Parameters:: margin (float) – distance (Angstrom) added around the coordinates.
Returns:: With m = margin, [[x_min-m, x_max+m], [y_min-m, y_max+m], [z_min-m, z_max+m]]; empty if not any(self.evaluate).
Return type:: ndarray(type=float, shape=(2,3))
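A sketch of the bounding-box calculation, assuming the (2, 3) layout given as the return type, with lower limits in the first row and upper limits in the second. The stand-alone function and its arguments are hypothetical.

```python
import numpy as np

def bounding_box(xyz, evaluate, margin=0.0):
    """Lower and upper (x, y, z) limits of the evaluated atoms."""
    selected = xyz[:, evaluate]            # (3, number evaluated)
    if selected.shape[1] == 0:
        return np.empty((0, 3))            # nothing to bound
    return np.array([selected.min(axis=1) - margin,
                     selected.max(axis=1) + margin])

xyz = np.array([[0.0, 5.0, 99.0],
                [1.0, 2.0, 99.0],
                [-3.0, 4.0, 99.0]])
evaluate = np.array([True, True, False])   # third atom is ignored
limits = bounding_box(xyz, evaluate, margin=2.0)
```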

checks_off()¶

Disable bounds checking on Atoms arrays.

Returns:: self
Return type:: class Atoms

Warning

could lead to fragile interfaces with external packages if arrays are shared.

checks_on()¶

Enable bounds checking on Atoms arrays.

Returns:: self
Return type:: class Atoms

property conformer¶

Conformers sorted 1, 2, 3… by occupancy then alphabetically.

Occupancy is rounded to 3 decimal places (PDB is 2 places).

connectivity(rtol=0.1, atol=0.1, docTol=2.0, within=76, disulfide=False, metal=True, verbose=False)¶

Hash list of intra-chain atom connectivities.

Parameters:

shortest (float) – minimum distance (A) considered covalent connection.
longest (float) – maximum distance (A) considered covalent connection.
rtol (float) – deviation from ideal bond length, still to be considered as connected. Calculated as abs(length-ideal) <= (atol + rtol*abs(ideal)). Ideal is calculated from covalent radii, with an rms error of ~0.05A, a lower limit on tolerance. Tolerance should be small enough to exclude: (a) next nearest neighbors which can be ~1.4 bonds lengths distant, and (b) hydrogen-bond contacts when hydrogens are present, starting at about 1.4A, for a 2.4A strong donor–acceptor distance. Probably want 0.15 to 0.3A.
atol (float) – (continued) limit on designation as bond.
docTol (float) – document near-misses that are within docTol of the above tolerances
within (int) – +/- search window (atoms); 999999 for slower full search; the default (76) is 3 x size of largest residue (Trp or Phe w/ Hs = 25) plus 1. Three times size allows these residues to be the furthest separated of 2-alternates at each site.
disulfide (bool) – include disulfides (want false if using connectivity for torsion angle refinement)
metal (bool) – metal atoms can be covalently bonded. Else all assumed to be ions, regardless of atom or residue name.

Returns:

Hash list of intra-chain atom connectivities.

Return type:

list (each atom) of lists of connected atom numbers

Todo

(possibly) add back in a peptide connectivity from connectivityOld() based on residue/atom numbers/names if want to restore connectivity that might be broken in fragment refinement.

Todo

improvements in efficiency are possible with numpy.isin() when upgrade numpy.__version__ >= 1.13. See comments.

The new implementation has been regression tested against the old, and, when errors of the old version involving hydrogen atoms are ignored, exactly the same results are obtained for insulin, arginine kinase and the high resolution AAV-DJ structures.

Crambin is irredeemable, with C–H bonds varying from 0.6 to 1.6 A. It seems that this 0.48A structure was over-fit. Superficial inspection of the output suggests that the new implementation is working as designed, but it is not possible to find distance criteria for either old or new implementations that capture all of the bonds that should be captured without forming bonds between close contacts.
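The bond criterion abs(length-ideal) <= (atol + rtol*abs(ideal)) can be sketched for a single atom pair. The radii below are illustrative single-bond values assumed for this example, not the table used by the module, and the function name is hypothetical.

```python
# Illustrative covalent radii in Angstrom (assumed values for this sketch).
RADII = {"C": 0.76, "N": 0.71, "O": 0.66, "H": 0.31}

def is_bonded(element_a, element_b, length, rtol=0.1, atol=0.1):
    """True if length is within tolerance of the summed covalent radii."""
    ideal = RADII[element_a] + RADII[element_b]
    return abs(length - ideal) <= atol + rtol * abs(ideal)

peptide_bond = is_bonded("C", "N", 1.33)     # covalent C-N
next_nearest = is_bonded("C", "C", 2.50)     # 1-3 contact, not a bond
hydrogen_bond = is_bonded("H", "O", 1.90)    # H...O contact, not a bond
hydroxyl = is_bonded("O", "H", 0.96)         # covalent O-H
```

With the default tolerances, the next-nearest neighbor and the hydrogen-bond contact are both rejected while the two covalent bonds are accepted.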

connectivityOld(shortest=0.5, longest=2.0, within=76, verbose=False)¶

Hash list of intra-chain atom connectivities, excluding disulfides.

Parameters:

shortest (float) – minimum distance (A) considered covalent connection.
longest (float) – maximum distance (A) considered covalent connection.
within (int) – +/- search window (atoms); 999999 for slower full search; the default (76) is 3 x size of largest residue (Trp or Phe w/ Hs = 25) plus 1. Three times size allows these residues to be the furthest separated of 2-alternates at each site.

Returns:

Hash list of intra-chain atom connectivities.

Return type:

list (each atom) of lists of connected atom numbers

Deprecated since version 0.5.6: replaced with algorithm based on covalent radii.

Todo

(possibly) add to connectivity() a peptide connectivity from connectivityOld() based on residue/atom numbers/names if want to restore connectivity that might be broken in fragment refinement.

property covalentRadius¶

Array of approximate bond half-lengths.

Returns:: covalent radii, which, if summed for 2 atoms, approximates the covalent bond length.
Return type:: ndarray

A covalent bond length depends on the bond-type (single, double) which depends not only on the atom, but its partner. A per atom approximation is useful in a preliminary determination of bond connectivity and for crude restraints. It will be accurate (only) when the bond-type is unambiguously defined by element (or atom name), because this light implementation does not use topology-defining files. Several atoms have different bond-types, for example a carbonyl carbon has single bonds along the backbone, and a double-bond to the oxygen. In this implementation, accuracy of backbone is prioritized, so, in this example, the carbonyl carbon will be assigned single-bonded radii, and the double bond will be 8 pm too long.

diff(i, j)¶

Difference vector from atom i to j.

Parameters:

i (int) – atom index
j (int) – atom index

Returns:

vector

Return type:

ndarray(shape=(3,))

See also

difference_vector_sym for support of symmetry & matrix output

difference(reference, selection=None, reference_selection=None, verbose=False)¶

Difference between coordinate sets: rmsd(xyz), <abs(O1-O2)>, <abs(B1-B2)>

Requires atoms to be in the same order

Parameters:

reference (class Atoms) – atoms
selection (Selection or Group) – atoms in self to use.
reference_selection (Selection or Group) – atoms in reference to use (must correspond to selection).

Returns:

RMS deviation in coordinates, mean absolute difference in coordinates, occupancy & B-factor

Return type:

4 floats

Todo

See superpose.Pairing.difference for additional functionality that could be added: handling and listing of outliers.
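The statistics named in the summary line, rmsd(xyz), <abs(O1-O2)> and <abs(B1-B2)>, can be sketched for two coordinate sets with atoms in the same order. Selection handling is omitted and the function names are hypothetical.

```python
import numpy as np

def rmsd(xyz_a, xyz_b):
    """Root-mean-square deviation between two (3, N) coordinate arrays."""
    squared = ((xyz_a - xyz_b) ** 2).sum(axis=0)   # per-atom distance^2
    return float(np.sqrt(squared.mean()))

def mean_abs_diff(a, b):
    """Mean absolute difference, e.g. for occupancies or B-factors."""
    return float(np.abs(np.asarray(a) - np.asarray(b)).mean())

xyz_a = np.zeros((3, 2))
xyz_b = xyz_a + np.array([[3.0], [4.0], [0.0]])    # shift every atom by 5 A
coordinate_rmsd = rmsd(xyz_a, xyz_b)
b_difference = mean_abs_diff([20.0, 30.0], [22.0, 26.0])
```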

difference_vector_sym(i, j, symmetry, local_i=0, lattice_i=0, a_trans_i=0, b_trans_i=0, c_trans_i=0, local_j=0, lattice_j=0, a_trans_j=0, b_trans_j=0, c_trans_j=0)¶

Vector from (symmetry-equivalent) atom i to j

Parameters:

i (int) – index for 1st atom
j (int) – index for 2nd atom
symmetry (Symmetry) – definition of the symmetry with local, space group and crystal lattice operators referenced with the i, j subscripts in the following arguments.

Returns:

vector

Return type:

numpy.matrix(shape=(3,1)), i.e. column vector, so careful w/ numpy.dot()

differences(reference, selection=None, reference_selection=None, check=True)¶

Positional differences between coordinates by residue: each atom, rms.

Requires same atoms, same order.

Parameters:

reference (class Atoms) – atoms
selection (Selection or Group) – atoms in self to use.
reference_selection (Selection or Group) – atoms in reference to use (must correspond to selection).
check (bool) – atoms must fully match between self and reference selections, else just checks atom name.

Raises:

AssertionError – if {check and chains, segid, resnum} or atnam do not match between self and target selections.

distance(i, j)¶: Distance between atoms i and j.

distance_sym(i, j, symmetry, local_i=0, lattice_i=0, a_trans_i=0, b_trans_i=0, c_trans_i=0, local_j=0, lattice_j=0, a_trans_j=0, b_trans_j=0, c_trans_j=0)¶

Distance between symmetry-equivalents of atoms i & j

Parameters:

i (int) – index for 1st atom
j (int) – index for 2nd atom
symmetry (Symmetry) – definition of the symmetry with local, space group and crystal lattice operators referenced with the i, j subscripts in the following arguments.

from_file(filename, fileobj=None, method='default')¶

Reads scatterers from PDB file.

Parameters:

filename (str|NoneType) – PDB file or None.
fileobj (fileobject) – required if filename is None.
method (str) – to call more sophisticated library for PDB input.

Enables Atoms.checkbounds.

from_list(attribute, mylist)¶

Safe copy by value, the elements of mylist into atoms.attr.

Parameters:

attribute (str) – Atoms attribute name to be set, or “x”, “y”, “z” to set one of the coordinates of the 2D matrix.
mylist (sequence) – array-like values

Note

If called from C, requires list and all elements to be PyObjects.

Note

As copy-by-value, this is perhaps the safest in-memory way of getting data into RSRef from external programs.

Safety checks: noting that valid Atoms attributes are instantiated as None, will raise exceptions if the attribute does not exist or has been assigned previously to an array of different dimensions.

from_reader(filename)¶

Coordinate information from MMTK PDB reader.

Parameters:: filename (string)

Links self.xyz to torsion angle array.

Todo

Set link to torsion elsewhere if MMTK used more generally.

Todo

(with Brynmor) - set self.polymer[serial_number]=True for ATOM, else False.

init_non_pdb()¶: Completes an instance when attributes cannot be read from PDB file.

Note

Requires: coordinates self.xyz to be set with correct number of atoms.

See also

from_list - might have been used to set coordinates from an external program, and the PDB file may not be available to initialize all attributes required by RSRef.

Warning

Attributes are filled with guesses, with possible odd consequences… particularly on writing coordinates, but it is anticipated that this method will only be called by external programs that will handle such functions independently.

Warning

No support (yet) for anisotropic Us. It is unclear where they would come from or how to initialize them.

is_alt(i, j)¶: True if atom i is an alternate location of atom j

is_close(attr, value, atol=0.01, rtol=0.0, nan_exception=True)¶

Boolean for atoms whose attribute close to value.

Parameters:

attr (str) – one of the one-dimensional float arrays, eg. b, occ
value (float) – target
atol (float) – finds absolute(array[:]-value) <= (atol + rtol*value)
rtol (float) – see above
nan_exception (bool) – raise exception if isnan(value), else return [False]*shape_array

Returns:

True where close

Return type:

bool ndarray

Note

think about input precision of PDB file when setting atol.

See also

are_close

is_corresponding(other, attr, value, other_value, atol=0.01, rtol=0.0, alts_ok=False, exception=True)¶

Check that self and other attribute have value & other value for a common atom.

Parameters:

other (Atoms)
attr (str) – one of the one-dimensional float arrays, eg. b, occ
value (float) – expected in self.attr
other_value (float) – expected in other.attr
atol (float) – finds absolute(array[:]-value) <= (atol + rtol*value)
rtol (float) – see above
alts_ok (bool) – allows matches between different alternative conformers.
exception (bool) – on failure, throw exception rather than warning.

Returns:

corresponds

Return type:

bool

Raises:

KeyError – on failure if exception=True

is_cterm(i, j)¶

True if atom i is C-terminal of atom j, same chain & backbone

6/18/20 - this might be fragile. Adding alternative C-term atom names, but might also need backbone hydrogens.

join(added)¶: Join two Atoms instances.

Warning

limits are recalculated with self.margin.

Warning

incompatible with update_pdb()

Todo

Incompatible with xyz as property

label(i)¶

Hashable id for atom i:

[segid/]chain/residue number/atom name/alternative locator.

Parameters:: i (int)
Returns:: label
Return type:: str

See also

labels

labels(indices)¶

Hashable ids for atoms with specified indices:

[segid/]chain/residue number/atom name/alternative locator.

Parameters:: indices (array-like)
Returns:: labels
Return type:: list(str)

See also

label
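As a rough illustration of the label layout [segid/]chain/residue number/atom name/alternative locator, the sketch below joins the fields with slashes. The function make_label and its arguments are hypothetical, and how the module renders a blank alternative locator is an assumption.

```python
def make_label(chain, resnum, atnam, alt="", segid=None):
    """Hypothetical label builder; field order follows the documented layout."""
    parts = [chain, str(resnum), atnam, alt]
    if segid:
        # The segid is optional and, when present, leads the label.
        parts.insert(0, segid)
    return "/".join(parts)


plain = make_label("A", 42, "CA")
with_segid = make_label("A", 42, "CA", alt="B", segid="SEG1")
```

Strings built this way are hashable, so they can serve as dictionary keys when matching atoms between two coordinate sets.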

labels_close(attr, value, atol=0.01, rtol=0.0, nan_exception=False)¶

Labels for atoms whose attribute is close to value.

Parameters:

attr (str) – one of the one-dimensional float arrays, eg. b, occ
value (float) – target
atol (float) – finds absolute(array[:]-value) <= (atol + rtol*value)
rtol (float) – see above
nan_exception (bool) – raise exception if isnan(value), else return empty list

Returns:

list of indices where close

Return type:

bool ndarray

Note

think about input precision of PDB file when setting atol.

See also

are_close, labels

link_torsion()¶

Point self.xyz to the array used in torsion angle optimization.

Returns:: self
Return type:: class Atoms

max_diff(reference, selection=None, reference_selection=None, verbose=False)¶

Max absolute difference between coordinate sets: xyz, occupancy, B

Requires atoms to be in the same order

Parameters:

reference (class Atoms) – atoms
selection (Selection or Group) – atoms in self to use.
reference_selection (Selection or Group) – atoms in reference to use (must correspond to selection).

Returns:

RMS deviation in coordinates, mean absolute difference in coordinates, occupancy & B-factor

Return type:

4 floats

neighbors(distance=3.0, verbose=True)¶

Find atoms that might be close to those being evaluated.

Parameters:: distance (float) – within which considered close.

Todo

if fast, make into a property, so don’t have to call. Could monitor whether coordinates have been changed significantly.

See also

symmetry.Symmetry.neighbors for neighbors in presence of symmetry.

next_residue()¶

Iterator over residues.

Returns:: (residue number, chain, segid)
Return type:: (int, str|NoneType, str|NoneType)

next_residue_atoms()¶

Iterator over residues.

Returns:: atoms in the next residue
Return type:: Selection

pdb_coord_records = ('ATOM', 'HETATM')¶

randomize(rms_xyz=0.0, std_o=0.0, std_b=0.0, seed=None, fix_an_o=False)¶

Add normally distributed error to atomic parameters.

Parameters:

rms_xyz (float) – target total rms error to be split into x,y,z components
std_o (float) – target standard deviation of occupancy error to be applied
std_b (float) – target standard deviation of b-factor error to be applied.
seed (int or NoneType) – for random number generator - for reproducible results.
fix_an_o (bool) – leave o[0] unchanged so that occupancies not co-linear with scaling in refinement.

Returns:

actual errors applied to xyz (rms), occupancy (std) & B (std).

Return type:

3 floats
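One reading of "total rms error to be split into x,y,z components" is that each Cartesian component receives a normal error with standard deviation rms_xyz/√3, so that the expected squared displacement per atom is rms_xyz². The sketch below assumes that interpretation; perturb_xyz is a hypothetical name, and it uses the standard-library random module rather than whichever generator the module uses.

```python
import math
import random


def perturb_xyz(xyz, rms_xyz, seed=None):
    """Add Gaussian noise to coordinates; return (new coordinates, actual rms)."""
    rng = random.Random(seed)          # seeded for reproducible results
    sigma = rms_xyz / math.sqrt(3.0)   # per-component standard deviation
    shifted = [[c + rng.gauss(0.0, sigma) for c in atom] for atom in xyz]
    squared = [
        sum((n - o) ** 2 for n, o in zip(new, old))
        for new, old in zip(shifted, xyz)
    ]
    actual_rms = math.sqrt(sum(squared) / len(squared))
    return shifted, actual_rms


coords = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.5, 0.0]]
new_coords, actual = perturb_xyz(coords, rms_xyz=0.3, seed=1)
```

The rms actually applied differs from the target for small samples, which is presumably why randomize() returns the actual errors rather than echoing the targets.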

residue(i, contiguous=False)¶

Atoms in the same residue as atom i.

Parameters:

i (int) – atom index
contiguous (bool) – assert that the atoms must be contiguous in array

Returns:

atoms in same residue

Return type:

Selection

residueID(i)¶

Hashable id for residue of atom i:

Returns:: ([segid|chain], residue number, insert), where a non-blank chain takes precedence over segid.
Return type:: tuple(str, int, str)
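The precedence rule can be sketched as follows. The function residue_id and its arguments are hypothetical; only the rule that a non-blank chain wins over segid comes from the description above.

```python
def residue_id(chain, segid, resnum, insert=""):
    """Hypothetical residue key: a non-blank chain takes precedence over segid."""
    prefix = chain if chain.strip() else segid
    return (prefix, resnum, insert)


with_chain = residue_id("A", "SEG1", 42)
without_chain = residue_id(" ", "SEG1", 42)
```

Being a tuple of strings and an int, the result is hashable and can be used as a dictionary key or set member.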

same(other)¶: Same names. Coordinates, B, occupancy can differ.

Warning

does not test derived attributes such as evaluate.

same_conformer(verbose=False)¶

Identify pairs of atoms in the same conformer.

Defined in practice as sharing the same conformer name, or if either of the pair of atoms is single conformer.

Returns:: A[i,j] is True when i & j are in the same conformer
Return type:: ndarray(shape=(natom, natom), dtype=bool)
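A minimal pure-Python sketch of this definition is given below, using nested lists in place of the numpy array. The name same_conformer_matrix is hypothetical, and a blank alternate-location id is assumed to mark a single-conformer atom.

```python
def same_conformer_matrix(altlocs):
    """altlocs: one alternate-location id per atom; blank means single conformer."""
    single = [a.strip() == "" for a in altlocs]
    n = len(altlocs)
    return [
        [single[i] or single[j] or altlocs[i] == altlocs[j] for j in range(n)]
        for i in range(n)
    ]


alts = ["", "A", "B", "A"]
matrix = same_conformer_matrix(alts)
```

Here atoms 1 and 3 share conformer A, atom 0 is compatible with everything, and atoms 1 and 2 are mutually exclusive.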

select(selection)¶

Subset of Atoms instance, defined by a boolean array.

Parameters:: selection (ndarray(type=bool)) – True for atoms to be copied.

Warning

limits are recalculated with self.margin.

Warning

incompatible with update_pdb()

Todo

incompatible with xyz as a property

sort(selection=None, template=None, use_segid=False, return_matched_template=False, verbose=False, quiet=False)¶

Reorder atoms into the same order as corresponding atoms in template.

Parameters:

selection (Selection) – Return only selected atoms of self.
template (class Atoms) – self is sorted into the same order as template, which should contain the same atoms as self or a superset of them.
use_segid (bool) – May want to skip as chain always required, but segid optional.
return_matched_template (bool) – in addition to the newly ordered Atoms, return the Selection of template atoms that were matched.
verbose (bool) – list un-matched atoms
quiet (bool) – silent (else print summary)

subset(start, end=None)¶

Subset of Atoms instance, starting at start, and ending at end-1.

Parameters:

start (int) – 1st atom (starting at 0)
end (int) – one past the last atom (default is start+1 for a single atom)

Warning

limits are recalculated with self.margin.

Warning

incompatible with update_pdb()

unlink_torsion()¶

Disconnect self.xyz from torsion angle optimization.

Returns:: self
Return type:: class Atoms

update_pdb(infile, outfile='rsref.pdb', header=None, interlace=False)¶

Copy infile to outfile, updating coordinates, occupancies and B-factors

Parameters:

infile (str) – PDB file name.
outfile (string or file object) – PDB file name or open file.
header (str) – inserted at top of output PDB
interlace (bool) – alternate input and output atom pairs (for debugging)

Returns:

number of atoms written

Return type:

int

Note

infile & outfile must contain the same atoms in the same order.

Deprecated: mostly replaced by self.write

write(atoms=None, outfile='rsref.pdb', header=None, anisou=False, ca=False, rsref=None, options=None, stats=True, symmetry=None, symout=None, neighbors_only=False)¶

Write PDB file.

If symmetry is provided, a non-standard PDB extension is invoked to add neighboring (equivalent) atoms after the standard PDB output. The neighbors may deviate from the PDB standard in several ways, depending on options selected during the prior symmetry expansion:

The atoms may not comprise exactly one asymmetric unit.

Residues or chains may not be complete if only neighboring atoms were selected.

Columns 68-72 of ATOM/HETATM records (usually not used) document the crystallographic symmetry operator that generated the atom (if not the unit matrix):

Columns 68-69 contain the zero-filled space group operator number used to generate the atom.

Columns 70-72 contain the unit cell translations down a, b, c expressed as modulo-10 single digits, eg. 109 means +1 unit cell along a, 0 along b and -1 unit cell along c.

Columns 73-76, containing the segid, in the Xplor/CNS variant PDB, contain the non-crystallographic symmetry (NCS) operator name (if not named ‘unit’).

A large number of chain identifiers may be required, and are provided on first-come/first-served basis:

The original chain IDs are used if not in conflict.

A new ID is generated from the 1st letter of the segid or NCS operator name.

If these have already been used, generate a new ID starting from the end of the alphabet, then going through numbers, then lower case letters.
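The layout of columns 68-72 described above can be illustrated with a short sketch. Both function names are hypothetical. The encoding follows the documented example (109 for translations +1, 0, -1), while the decoding cutoff between positive and negative digits is an assumption, since a single modulo-10 digit is inherently ambiguous.

```python
def encode_symop(operator_number, translation):
    """Two-digit zero-filled operator number plus three modulo-10 translations."""
    digits = "".join(str(t % 10) for t in translation)  # -1 % 10 == 9 in Python
    return f"{operator_number:02d}{digits}"


def decode_translation(digits):
    """Assumption: digits 0-5 are non-negative shifts, 6-9 are negative shifts."""
    return tuple(d if d <= 5 else d - 10 for d in map(int, digits))


field = encode_symop(3, (1, 0, -1))
shifts = decode_translation(field[2:])
```

With operator 3 and the translations from the documented example, the five-character field reads 03109.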

Parameters:

atoms (class Atoms) – Coordinates to write out or None for self.
outfile (string or file object) – PDB file name or open file.
header (str) – inserted at top of output PDB
anisou (bool) – write anisotropic displacement parameters (currently copied verbatim from input, and therefore not supporting changes in cell, structure or symmetry equivalent. Use with care!)
ca (bool) – write out only CA atoms.
rsref (class RSRef or NoneType) – Real-space refinement instance from which statistics and unit cell are taken.
options (class Options) – Calling program options used in documenting PDB file, defaults to Q.option.
stats (bool) – Add refinement statistics to the remarks.
symmetry (class Symmetry or NoneType) – Expand symmetry (local, space group, lattice) according to a previously instantiated symmetry into a non-standard PDB file.
symout (string or NoneType) – which of the above equivalents to output, ‘local’ or ‘full’.
neighbors_only (bool) – write only if close=True (excluding evaluate=True), to output only the neighbors of a selection.

Returns:

number of atoms written

Return type:

int

Note

If chain or residues completion has been selected during symmetry expansion, evaluate and close may no longer be mutually exclusive.

Note

Requires the input PDB to have unique serial numbers (atom numbers) for each atom, so take care with manually edited PDB files.

Changed in version 0.6.0: 06/20/20 Support different atom nomenclatures; outputs segid if input & not later replace_segid with identifiers for symmetry-extensions to PDB format.

property xyz¶

class atoms.Collection(attribute, selections, verbose=False)¶

Bases: Group

Group with a Selection for each value of an Atoms attribute.

Examples would include Selections for each chain, or each residue.

Parameters:

seq (dict or iterable) – (if present) either a pre-existing Group dictionary, or an iterable of paired (name, Selection).
named – a sequence of name=Selection arguments.

class atoms.Commands(tasks=None, initial_commands=None, py_locals={}, parser=None)¶

Bases: TopCmd

Atoms-related cmd2 commands for importing/subclassing to other modules.

Parameters:

tasks (Control|NoneType) – class instance that defines methods to be called.
initial_commands (str or list of str) – file(s) of commands to run before interactive loop at the top level, and before any command-line commands.
py_locals (dict) – objects added to the local namespace in the python interpreter invoked by ‘py’ commands.
parser (ArgumentParser (argparse) or OptionParser (optparse)) – to which -t / –test option will be added for cmd2 regression testing. Should only be used once / program.

ap1 = ArgumentParser(prog='sphinx-build', usage=None, description=None, formatter_class=<class 'argparse.HelpFormatter'>, conflict_handler='error', add_help=True)¶

ap2 = ArgumentParser(prog='sphinx-build', usage=None, description=None, formatter_class=<class 'argparse.HelpFormatter'>, conflict_handler='error', add_help=True)¶

do_pdbout(args)¶: Write coordinates (symmetry expansion as per command-line arguments).

do_select(args)¶: Name selection(s) of atoms.

help_SELECTION_EXPR()¶

class atoms.Documentation¶

Bases: object

User-level documentation for imports.

class Selection_Group¶

Bases: object

property introduction¶: Generic documentation for import into other modules.

class atoms.Group(*seq, **named)¶

Bases: dict

Manage a dictionary of atom group selections.

Performs consistency checks:

Items are class Selection.

Selections are from the same Atoms object.

Selections can overlap only if they are subsets of one another:

Eliminating most conflicts, but…

Allowing a group w/in a group to be refined, but…

There may be commutativity issues, but these are hopefully small for incremental changes during a refinement and will not stop convergence.

Parameters:

seq (dict or iterable) – (if present) either a pre-existing Group dictionary, or an iterable of paired (name, Selection).
named – a sequence of name=Selection arguments.

check()¶

Raises:: AssertionError – for items that are not Selections, or if selections have different Atoms objects, or if Selections overlap w/o being subsets.
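The overlap rule (Selections can overlap only if one is a subset of the other) can be sketched with sets of atom indices. This is an illustration only: overlaps_are_nested is a hypothetical name, and the module works with boolean Selection arrays rather than sets.

```python
def overlaps_are_nested(selections):
    """selections: dict mapping a name to a set of atom indices."""
    names = sorted(selections)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            a, b = selections[first], selections[second]
            # Any shared atoms are acceptable only when one set contains the other.
            if a & b and not (a <= b or b <= a):
                return False
    return True


nested = {"chainA": {0, 1, 2, 3}, "helix1": {1, 2}, "chainB": {4, 5}}
conflicting = {"loop1": {0, 1, 2}, "loop2": {2, 3}}
```

A check in the style of check() would then raise AssertionError on the conflicting dictionary, for example via assert overlaps_are_nested(conflicting).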

checks = True¶

Variables:: checks – tests for internal consistency on new or modified Groups

count()¶

Numbers of atoms in each Selection.

Returns:: atom counts
Return type:: dict

duplicates()¶

Keys of Selections that are identical

Note that len(duplicates()) indicates whether or not there are duplicates.

Returns:: identical selections
Return type:: list(list(str))
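A hedged sketch of how identical selections might be grouped by key is shown below. duplicate_keys is a hypothetical helper operating on plain lists of booleans, not the module's implementation.

```python
def duplicate_keys(selections):
    """selections: dict mapping a key to a sequence of booleans (one per atom)."""
    by_content = {}
    for key in sorted(selections):
        # Tuples are hashable, so identical selections collect under one entry.
        by_content.setdefault(tuple(selections[key]), []).append(key)
    return [keys for keys in by_content.values() if len(keys) > 1]


groups = {
    "a": [True, False, False],
    "b": [True, False, False],
    "c": [False, True, True],
}
repeats = duplicate_keys(groups)
```

An empty result has length zero, which is what makes the len(duplicates()) test work.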

empty()¶

Returns:: True if selection has zero atoms.
Return type:: bool

expand()¶

Iterates recursively through (nested) groups yielding each Selection.

Returns:: Atomic selection.
Return type:: class Selection
Raises:: TypeError – if encounters any value that is not Group or Selection

Todo

What if None? - is there a need to return None immediately?

expand_items()¶

Iterates recursively through (nested) groups yielding each (key, Selection).

Returns:: key, Atomic selection.
Return type:: Selection
Raises:: TypeError – if encounters any value that is not Group or Selection

Todo

What if None? - is there a need to return None immediately?

from_criteria(criteria=None, atoms=None)¶

Parameters:

criteria (string, list, tuple or dict(string(s))) – boolean expression string(s) for selection of atoms as in Selection.
atoms (class Atoms) – Coordinates that are to be selected from.

from_selections(selections=None)¶

Parameters:: selections (sequence of Selections (list, tuple, dict)) – Atom selection(s)

inAny()¶

Selection array, True for atoms in any group.

Returns:: atoms in any of the groups.
Return type:: class Selection

labels()¶

Identifiers for atoms in each Selection

Returns:: (chain/residue/atom)s
Return type:: dict(list(str))

static names(seq)¶

List of names for selection(s) w/in Group.

Returns:: names if Group or None if Selection.
Return type:: list of strings or NoneType

nonRedundant()¶

Remove items whose Selections are repeats of others.

In-place removal of replicate 2 onward, as sorted by key alphabetically.

Returns:: self
Return type:: Group

static selections(seq)¶: List of selection(s) w/in Group or Selection.

stats()¶

Summary statistics.

Returns:: number of selections, min. atoms in selection, max., average, documentation
Return type:: int, int, int, float, str

class atoms.OldArguments(imports=[], main=False, *args, **kwargs)¶

Bases: ArgumentParser

Argument parser for optimization target function.

Parameters:

imports ((list of) ArgumentParser method(s)) – method(s) adding arguments to ArgumentParser instance. (Alternative to ArgumentParser parents w/ better maintained groups.)
main (bool) – Parser for main program (i.e. not listed within an ArgumentParser parents argument). Adds default program information.

static export(self)¶: Adding options used in both stand-alone and imported modes.

class atoms.Selection(atoms=None, criterion=None)¶

Bases: ndarray

Boolean array with methods for atom selection.

Supports selection based on any of the attributes of PDB ATOM & HETATM records. Criteria are provided as strings to Selection methods (S) which are evaluated as Python expressions, and can be combined using Python bit-wise logical operators, for example: S(chain == A) ^ (S(chai==B) & (S(residue number .gt. 30) & S(resnu <= 50))), or … S((chain == A) .OR. (chain == C))

These selections can then be used as slices to perform calculations on subsets of Atoms attributes, e.g. atoms.b[selection] = 10.0, or within iterators over all atoms in an Atoms object.

One of two command parsers is used, chosen (by default) according to whether the command contains a bit-wise operator (&, |, ~, ^, <<, >>) and is therefore a compound boolean expression, or is simple.

In criteria, Python syntax is relaxed in the following ways:

A selection of common synonyms can be substituted for the actual Atoms attributes, eg. atom_name for atnam, residue_type for resnam. (In simple expressions, ‘_’ above can be replaced by ‘ ‘ or ‘-‘.)

Strings should not be quoted. They will be recognized and quoted internally. Additionally, simple expressions should not be contained within parentheses (it is fine with compound expressions).

Names can be reduced to their shortest unique start, eg. ‘resna’ for ‘resnam’ allows ‘resnam’ to be distinguished from ‘resnum’.

Fortran-like operator synonyms (eg. .OR., .ge., .GT.) are accepted and will be translated into python equivalents (eg. ‘|’, ‘>=’, ‘>’) and can be used to avoid parsing of redirect symbols ‘>’ and ‘|’.

Simple expressions only - case insensitive.
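The "shortest unique start" rule can be sketched as a prefix lookup. The function resolve_attribute and the short attribute list are illustrative assumptions. The ValueError on a missing or ambiguous name mirrors what whichAttr() is documented to raise.

```python
def resolve_attribute(abbreviation, names=("resnam", "resnum", "atnam", "chain")):
    """Return the single attribute name that starts with the abbreviation."""
    key = abbreviation.lower()
    if key in names:
        return key  # an exact match always wins
    matches = [name for name in names if name.startswith(key)]
    if len(matches) != 1:
        raise ValueError(f"{abbreviation!r} matches {matches or 'no attribute'}")
    return matches[0]


full_name = resolve_attribute("resna")
```

Here "resna" resolves to "resnam", whereas "res" is rejected because it could mean either "resnam" or "resnum".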

The abbreviated form of Selection, S(<criterion>) can be accessed with the following Python commands: import functools; S = functools.partial(Selection, atoms), where atoms is an instantiation of class Atoms.

Warning

A select command used in a shell script may need its expression enclosed in double quotes if it contains white-space. Use the Fortran-synonym operators .OR., .ge., .GT. or the python xor operator, ‘^’, to avoid interpretation of ‘>’ and ‘|’ as command-line redirects.

Initialize a new selection, defaulting to False for all atoms.

Parameters:

atoms (class Atoms) – coordinates.
criterion (string) – initial selection logical expression, eg. resnum >= 3 or chain == A.

asSelection(input_array)¶: Convert input to a selection.

static attribute(synonym)¶: Unique Atoms attribute name from any of its synonyms.

boundary(verbose=False)¶

Identify atoms that are bonded to atoms outside the selection.

Returns:: True for boundary atoms, outside connections, inside connections
Return type:: Selection, [[int]|NoneType], [[int]|NoneType]

boundary_links()¶

Pairs of bonded atoms w/ 1st in Selection, 2nd not.

Returns:: Pairs of bonded atoms
Return type:: list(tuple)

center()¶: Average coordinate (center of mass of equal mass atoms).

copy()¶

Shallow copy with new boolean array, but references to same atoms, _u

Returns:: new copy
Return type:: Selection

empty()¶

Returns:: True if selection has zero atoms.
Return type:: bool

expand()¶

Dummy to make the recursive Group.expand polymorphic for Selection.

Returns:: self
Return type:: class Selection
Raises:: TypeError – if encounters any value that is not Group or Selection

expand_items()¶

Dummy to make the recursive Group.expand_items polymorphic for Selection.

Returns:: self
Return type:: class Selection
Raises:: TypeError – if encounters any value that is not Group or Selection

following(current)¶

Index of next atom in Selection.

Parameters:: current (int) – index
Returns:: next selected index or None
Return type:: int or NoneType

fop2py = (('.eq.', '=='), ('.EQ.', '=='), ('./=.', '!='), ('.ne.', '!='), ('.NE.', '!='), ('.gt.', '>'), ('.GT.', '>'), ('.ge.', '>='), ('.GE.', '>='), ('.lt.', '<'), ('.LT.', '<'), ('.le.', '<='), ('.LE.', '<='), ('.not.', '~'), ('.NOT.', '~'), ('.or.', '|'), ('.OR.', '|'), ('.and.', '&'), ('.AND.', '&'), ('.eqv.', '^'), ('.EQV.', '&'), ('.neqv.', '^'), ('.NEQV.', '^'), ('.xor.', '^'), ('.XOR.', '^'))¶

static fortran(fexpr, warn=True)¶

Translate from expression with fortran operators to python.

Parameters:

fexpr (str) – possibly containing fortran operators like /=, .ge.
warn (bool) – issue a warning, which might be useful in relating python-converted error tracebacks to fortran-containing user input

Return type:

str

Returns:

substituted with python operator strings
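A translation of this kind can be sketched as a table-driven substitution. The table below is an illustrative subset using the conventional Python equivalents, not a copy of fop2py, and the case-insensitive regular expression is a shortcut chosen for the sketch (the module lists upper- and lower-case forms explicitly). fortran_to_python is a hypothetical name.

```python
import re

# Illustrative subset of Fortran operators and conventional Python equivalents.
FORTRAN_TO_PYTHON = (
    (".ge.", ">="), (".gt.", ">"), (".le.", "<="), (".lt.", "<"),
    (".eq.", "=="), (".ne.", "!="),
    (".and.", "&"), (".or.", "|"), (".not.", "~"),
)


def fortran_to_python(expression):
    """Replace each Fortran-style operator with its Python counterpart."""
    for fortran_op, python_op in FORTRAN_TO_PYTHON:
        expression = re.sub(
            re.escape(fortran_op), python_op, expression, flags=re.IGNORECASE
        )
    return expression


translated = fortran_to_python("(resnum .GE. 30) .and. (resnum .lt. 50)")
```

Writing the criterion with Fortran operators keeps ‘>’ and ‘|’ out of the command line, so a shell does not treat them as redirects.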

get()¶

inAny()¶

This method is solely for polymorphic compatibility with Group, enabling either to be treated as a Selection via inAny().

Returns:: self
Return type:: Selection

is_amino_acid()¶: True if selection contains 1 and only 1 N, CA, C, O in same residue.

is_of_atoms(atoms, verbose=True)¶: True if Selection’s atoms attribute is atoms.

key = 'macromolecule'¶

labels()¶

Identifiers for atoms in a Selection

Returns:: (chain/residue/atom)s
Return type:: list(str)

name = 'macromolecule'¶

next_residue_atoms()¶

Iterator over residues.

Returns:: atoms of the next residue in self (Selection)
Return type:: Selection

operators = ['<>', '!=', '==', '>=', '<=', '>', '<', '~', '^', '|', '&', '>>', '<<', '%', '//', '/', '**', '*', '-', '+']¶

prefix_set = {'m', 'ma', 'mac', 'macr', 'macro', 'macrom', 'macromo', 'macromol', 'macromole', 'macromolec', 'macromolecu', 'macromolecul', 'macromolecule'}¶

previous(current)¶

Index of prior atom in Selection.

Parameters:: current (int) – index
Returns:: prior selected index or None
Return type:: int or NoneType

select(criterion, compound=True)¶

Set self to True for all atoms meeting the criterion.

Parameters:

criterion (str) – logical expression operating on an Atoms attribute, eg. 'resnum >= 3' or 'chain == A', or '(resnum >= 3) & (chain == A)'
compound (bool or NoneType) – Force criterion to be considered compound if True, simple if False, or determine if None.

Returns:

self

Return type:

ndarray subclass Selection

Warning

A select command used in a shell script may need its expression enclosed in double quotes to avoid shell interpretation of logical operators.

Changed in version 0.5.5: 10/18/15 default compound changed from None so that boolean selections can be provided uniformly eg. macromolecule instead of macromolecule==True for simple.

split(size=1, offset=None)¶

Split by segments of residues.

Parameters:

size (int) – number of residues to put in each segment
offset (int|NoneType) – number of residues in 1st segment, if different. Zero is equivalent to None.
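The segmenting rule can be sketched on a plain list of residue identifiers. split_residues is a hypothetical helper; the module's split() presumably works on the selected atoms rather than on such a list.

```python
def split_residues(residues, size=1, offset=None):
    """Cut a residue list into segments of `size`, with an optional short first one."""
    first = offset if offset else size  # zero and None both mean "no offset"
    segments = [residues[:first]] if residues else []
    for start in range(first, len(residues), size):
        segments.append(residues[start:start + size])
    return segments


residue_numbers = list(range(1, 11))
shifted = split_residues(residue_numbers, size=4, offset=2)
regular = split_residues(residue_numbers, size=4)
```

Running the split twice, once with and once without an offset, gives two sets of segments with staggered boundaries.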

synonyms = []¶

test()¶

property u¶: Boolean repeated 9-fold for all elements of selected U tensors.

unique_set = {'m', 'ma', 'mac', 'macr', 'macro', 'macrom', 'macromo', 'macromol', 'macromole', 'macromolec', 'macromolecu', 'macromolecul', 'macromolecule'}¶

v2 = {'m', 'ma', 'mac', 'macr', 'macro', 'macrom', 'macromo', 'macromol', 'macromole', 'macromolec', 'macromolecu', 'macromolecul', 'macromolecule'}¶

value = {'m', 'ma', 'mac', 'macr', 'macro', 'macrom', 'macromo', 'macromol', 'macromole', 'macromolec', 'macromolecu', 'macromolecul', 'macromolecule'}¶

whichAttr(strattr)¶

Returns the Atoms attribute indicated by a (minimally unique) string.

Parameters:: strattr (str) – Atoms coordinate attribute, case ignored.
Returns:: attribute, attribute name
Return type:: class Atoms object, string
Raises:: ValueError – if no attribute, or ambiguous

x = 12¶

class atoms.Tasks¶

Bases: object

Holder of atoms-centric task methods to be subclassed by other programs.

Superclass for atoms-related common tasks.

Warning

If this class defines pre-requisites on which other tasks depend, then this superclass must be instantiated before the other tasks are defined. If this class depends on pre-requisites defined elsewhere, then it must be instantiated afterwards. It may not be possible to satisfy both requirements!

getselect(name, flatten=False, **kwargs)¶

Retrieve a previously saved Selection or Group.

Polymorphic, searching the default collection / item names that might have been used in the preceding select command.

Parameters:

name (str) – name of Selection, Group or Group item, eg. “mySelection”, “myGroup”, “myGroup[‘item’]” or “[‘item’]”.
flatten (bool) – Reduce Group to single Selection with inAny().
default – returned if no match instead of AttributeError

read_coords(new_file=None)¶

Parameters:: new_file (str) – PDB file to be read.

Warning

Does not automatically recalculate statistics etc..

read_coords_prereq(*args, **kwargs)¶: Pass-through, trying to meet the prerequisites.

select(collection=None, name=None, selection="S('all')")¶

Select atoms for evaluation / refinement.

Parameters:

selection (string or NoneType) – a bit-wise logical expression of class Selection variables and Selection.select or S operators. None: instantiate with an empty Selection for later definition. Default: all atoms.
name (dictionary key or NoneType) – name of the selection. None: defines the selection to be used when groups are not selected.
collection (string or dict or NoneType) – (name of) dictionary into which selection is to be placed. None: defines default refinement / evaluation selection.

select_attr(attribute, collection=None, selection=None)¶

Make a collection with subset groups sharing same Atoms attribute.

Parameters:

attribute (str) – name of the Atoms attribute to be grouped. Selection.synonyms / shortened forms are acceptable. Examples: ‘chain’, ‘residue number’, ‘resnam’.
selection (string or NoneType) – group/select only atoms that are also within selection, or in any of a Group of selections. This should be a string that evaluates to a previously defined Selection or Group. Default: all atoms.
collection (str) – name of dictionary into which selection is to be placed. None: defaults to the name of the attribute.

Returns:

dictionary containing selections for each unique value of attribute in selection(s).

Return type:

class Group

Warning

the Group returned could be clobbered if the default collection name is used a second time (if selecting the same attribute from different selections).

select_attr_prereq(*args, **kwargs)¶: Pass-through, trying to meet the prerequisites.

select_prereq(*args, **kwargs)¶: Pass-through, trying to meet the prerequisites.

update_coords(new_file=None, header=None)¶

Parameters:: new_file (str) – PDB file to be written.

Deprecated: in favor of write_coords, but this remains a no-frills alternative.

write_coords(new_file=None, header=None, anisotropic=False, ca=False, neighbors_only=False)¶

Parameters:

new_file (str) – PDB file to be written.
anisotropic (bool) – Write riding anisotropic Us.
ca (bool) – Write only C_alphas.
header (str) – to be inserted at top of PDB file.
neighbors_only (bool) – output atoms that are close, but not selected for evaluation (non-standard), noting that “close” can contain “evaluation” atoms after residue or chain completion.

write_coords_prereq(*args, **kwargs)¶: Pass-through, trying to meet the prerequisites.

class atoms.Vec(vector=None, unit=None, length=None)¶

Bases: object

Immutable general vector, efficiencies from storing unit vector & length

Internally, the vectors are numpy arrays.

comp(other)¶

Scalar projection of self (component) onto other.

Parameters:: other (Vec)
Returns:: component
Return type:: float

dot(other)¶

Dot product of self and other.

Parameters:: other (Vec)
Returns:: dot product
Return type:: float

length()¶

property norm¶

proj(other)¶

Vector projection of self onto other.

Parameters:: other (Vec)
Returns:: projection
Return type:: Vec

udot(other)¶

Dot product of unit vectors along directions of self, other.

Parameters:: other (Vec)
Returns:: unit dot product
Return type:: float
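The four products above follow the standard vector definitions, sketched here with plain lists standing in for Vec instances. The free functions are illustrative only; the class stores the unit vector and length so that these quantities need not be recomputed.

```python
import math


def dot(a, b):
    """Dot product of two vectors."""
    return sum(x * y for x, y in zip(a, b))


def length(a):
    """Euclidean length."""
    return math.sqrt(dot(a, a))


def comp(a, b):
    """Scalar projection (component) of a onto b."""
    return dot(a, b) / length(b)


def proj(a, b):
    """Vector projection of a onto b."""
    scale = dot(a, b) / dot(b, b)
    return [scale * x for x in b]


def udot(a, b):
    """Dot product of the unit vectors along a and b (the cosine of their angle)."""
    return dot(a, b) / (length(a) * length(b))


a, b = [3.0, 4.0, 0.0], [2.0, 0.0, 0.0]
results = (dot(a, b), comp(a, b), proj(a, b), udot(a, b))
```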

property unit¶

property vec¶

class atoms.VecList(iterable=None)¶

Bases: list

List of Vec(tors).

align(other, lengths=True, normalize=False, verbose=False)¶

Find the permuted order of self vectors that aligns best with other.

Parameters:

other (VecList) – vectors to which self is to be aligned, same dimensions as self
lengths (bool) – use magnitudes, not just directions (unit vectors)
normalize (bool) – scale each VecList by dividing by the sum of magnitudes - see note below.

Returns:

(max-score, permutation, ave-score)

Return type:

(float, int-tuple, float)

Score is defined as the average for all paired vectors of the absolute values of their dot products. (Absolute so that +/- direction is arbitrary.) Permutation is the order of elements in self for the best (maximal) score. Ave-score is this score averaged over all possible permutations. If not lengths and not normalize, score will be the average cosine misalignment between vectors.

When the VecList are eigenvectors of a symmetric tensor, the sum of lengths is the trace. For an anisotropic U tensor, U_eq = trace/3 & B = 8 pi² U_eq, so normalization makes equivalent the isotropic or mean square displacements.
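The scoring described above can be sketched as an exhaustive search over permutations, with plain lists standing in for Vec instances. best_alignment is a hypothetical name, the sketch always uses magnitudes (as with lengths=True), and it omits the normalize option.

```python
import itertools


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def best_alignment(self_vecs, other_vecs):
    """Return (max score, best permutation of self, score averaged over permutations)."""
    n = len(self_vecs)
    scores = {}
    for perm in itertools.permutations(range(n)):
        # Mean absolute dot product between other[k] and the permuted self vector.
        total = sum(abs(_dot(self_vecs[p], other_vecs[k])) for k, p in enumerate(perm))
        scores[perm] = total / n
    best = max(scores, key=scores.get)
    return scores[best], best, sum(scores.values()) / len(scores)


self_axes = [[0.0, 0.0, 3.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
other_axes = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
max_score, permutation, ave_score = best_alignment(self_axes, other_axes)
```

The search visits n! permutations, which is trivial for the three principal axes of a U tensor but grows quickly for longer lists.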

align_to_vec(vector, intermediate=(0.0, 1.0), cosine=True, degrees=True, verbose=True)¶

Misalignment of the “unique” axis in self with a single vector.

For VecList lengths of 3 (eg. principal axes of anisotropic U), find the most unique axes in self (depends on prolateness) and calculate the deviation from the direction of vector.

For a prolate ellipsoid, the longest axis is unique, and for oblate it is the shortest. For prolate, the misalignment is between the principal axis and the vector; for oblate, it is the deviation from orthogonality with the shortest vector.

Note the potential advantage of aligning only the unique axis, as we avoid the indeterminate alignment when the two other axes are the same (perfect prolate or oblate).

Does not care about the +/- direction of vectors, so the maximum misalignment is 90 deg..
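Ignoring the +/- direction amounts to taking the absolute value of the cosine before converting to an angle, which caps the result at 90 degrees. The helper below is a hypothetical sketch of that single step, not of the prolateness weighting.

```python
import math


def misalignment(a, b, degrees=True):
    """Angle between the lines along a and b, ignoring vector sign."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    cosine = min(1.0, abs(dot) / norm)  # clamp against rounding just above 1
    angle = math.acos(cosine)
    return math.degrees(angle) if degrees else angle


antiparallel = misalignment([1.0, 0.0, 0.0], [-1.0, 0.0, 0.0])
orthogonal = misalignment([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
oblique = misalignment([1.0, 0.0, 0.0], [-1.0, 1.0, 0.0])
```

Antiparallel vectors therefore count as perfectly aligned, and no pair can be misaligned by more than 90 degrees.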

Parameters:

vector (Vec) – same dimensions as self
intermediate ((float, float)) – beyond these prolateness limits, the ellipsoid will be considered pure oblate and prolate respectively, transitioning between the two with a linear ramp between the limits.
cosine (bool) – output cosines of angles, else angles
degrees (bool) – angles in degrees, else radians
invert (bool) – allow sign of vectors to be switched, so that the maximum deviation is 90 degrees.

Returns:

(misalignment, anisotropy-weighted misalignment, weight)

Return type:

(float,) x 3

See also

align_unique to calculate misalignment of unique vectors each from a list.

See also

align_vecs to calculate misalignment of all vectors.

See also

align to permute the order of vectors to find the best match.

Weighting is by the geometric mean of (1-anisotropy) so that a spherical atom would have zero weight.

align_unique(other, intermediate=(0.0, 1.0), cosine=True, degrees=True, verbose=True)¶

Misalignment of the “unique” axis in self and other.

For VecList lengths of 3 (eg. principal axes of anisotropic U), find the most unique axes in self & other (depends on prolateness) and calculate the deviation from expected perfect alignment.

For a prolate ellipsoid, the longest axis is unique, and for oblate it is the shortest. If aligning two of the same, perfect alignment would put these parallel. If one each of prolate and oblate, perfect alignment would put these orthogonal.

Note the potential advantage of aligning only the unique axis, as we avoid the indeterminate alignment when the two other axes are the same (perfect prolate or oblate).

Does not care about the +/- direction of vectors, so the maximum misalignment is 90 deg..

Parameters:

other (VecList) – same dimensions as self
intermediate ((float, float)) – beyond these prolateness limits, the ellipsoid will be considered pure oblate and prolate respectively, transitioning between the two with a linear ramp between the limits.
cosine (bool) – output cosines of angles, else angles
degrees (bool) – angles in degrees, else radians
invert (bool) – allow sign of vectors to be switched, so that the maximum deviation is 90 degrees.

Returns:

(misalignment, anisotropy-weighted misalignment, weight)

Return type:

(float,) x 3

See also

align_vecs to calculate misalignment of all vectors.

See also

align to permute the order of vectors to find the best match.

See also

align_to_vec, for alignment of the unique vector from a list to a single pre-defined vector.

Weighting is by the geometric mean of (1-anisotropy) so that a spherical atom would have zero weight.

align_vecs(other, permutation=None, cosine=True, degrees=True, invert=True)¶

Deviation between the directions of permuted self vectors and other.

Parameters:

other (VecList) – same dimensions as self
permutation (tuple of ints or NoneType) – order of vectors in self to be compared. If None, use input order.
cosine (bool) – output cosines of angles, else angles
degrees (bool) – angles in degrees, else radians
invert (bool) – allow sign of vectors to be switched, so that the maximum deviation is 90 degrees.

Returns:

angles by which permuted self vectors are misaligned from other

Return type:

ndarray

append(vec)¶: In-place append of a Vec instance.

extend(veclist)¶: In-place extension by an iterable of Vec instances.

extend_svd(values, uvecs)¶

In-place extension from the output of numpy.linalg.svd.

Parameters:

values (ndarray(shape=N, dtype=float)) – N “s” singular (eigen) values, 2nd object returned by svd decomposition of an […, M, N] matrix (or matrices).
uvecs (ndarray(shape=(N,N), dtype=float)) – N “v” unitary […, N, N] matrix (or matrices), 3rd object returned by svd, where each row […, N, 0:N] (fastest dimension) would be an eigenvector.

If the VecList is to be used for the eigenvectors / principal components of a single tensor, then the input to extend_svd should be the results of linalg.svd applied to the single [M, N] matrix. If linalg.svd is broadcast with [L, M, N] input, then extend_svd should be called with the slices for the single matrix l: values[l, …], uvecs[l, …].
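
A short sketch of the numpy.linalg.svd output that extend_svd expects may help. The tensor values are illustrative only, and the final call into extend_svd is omitted because constructing a VecList is not shown in this section.

```python
import numpy as np

# A symmetric, positive-definite 3x3 tensor (illustrative values only).
tensor = np.array([[0.05, 0.01, 0.00],
                   [0.01, 0.03, 0.00],
                   [0.00, 0.00, 0.02]])

# svd returns (u, s, vh); the 2nd and 3rd objects are the extend_svd inputs.
_, values, uvecs = np.linalg.svd(tensor)   # shapes (3,) and (3, 3)

# For this symmetric tensor, each ROW of uvecs is a unit eigenvector
# paired with the singular value of the same index.
for value, vec in zip(values, uvecs):
    assert np.allclose(tensor @ vec, value * vec)

# Broadcast case: svd of an [L, 3, 3] stack; slice out one matrix at a time.
stack = np.stack([tensor, 2.0 * tensor])
_, stack_values, stack_uvecs = np.linalg.svd(stack)
single_values, single_uvecs = stack_values[1], stack_uvecs[1]
```

The slices single_values and single_uvecs have the same shapes as the single-matrix results, which is what the note above asks for.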

insert(i, vec)¶: In-place insertion of a Vec instance before item i.

mean()¶: Average vector length.

property norms¶

permute()¶

Iterate through all possible orders of Vec in VecList.

Returns:: yields (permuted copy of self, self indices in the order of permuted)
Return type:: (VecList, int-tuple)
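
The behaviour described for permute() resembles itertools.permutations applied to the indices. The generator below is a hypothetical stand-in that works on a plain list rather than a VecList.

```python
from itertools import permutations

def permute_sketch(items):
    """Yield (reordered copy, index order) for every ordering of items."""
    for order in permutations(range(len(items))):
        yield [items[i] for i in order], order

axes = ["a", "b", "c"]                 # stand-ins for three Vec instances
orders = list(permute_sketch(axes))    # 3! = 6 orderings
```

Three vectors give six orderings, so an exhaustive search for the best match between two sets of principal axes stays cheap.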

reordered(order)¶

Reordered copy.

Parameters:: order (int-iterable) – indices for elements of self in the order desired
Returns:: new
Return type:: VecList

sort(key=None, reverse=False)¶

Sort the vector list

Parameters:: key (method|NoneType) – if None sort by vector-length, else see documentation for sort

Changed in version 10/30/20: parameter cmp=None deprecated for consistency with Python 3
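
As a rough illustration of the default ordering, the sketch below sorts plain tuples by Euclidean length when no key is given. The function name is hypothetical and the vectors are stand-ins for Vec instances.

```python
import math

def sort_by_length(vectors, key=None, reverse=False):
    """Sort a list of vectors in place; by default, order by Euclidean length."""
    if key is None:
        key = lambda v: math.sqrt(sum(c * c for c in v))
    vectors.sort(key=key, reverse=reverse)

axes = [(0.0, 3.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 2.0)]
sort_by_length(axes)                   # shortest vector first
```

Code that still relies on a Python 2 style comparison function can wrap it with functools.cmp_to_key to obtain a key.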

sum()¶: Sum of vector lengths.

sumproduct(other, normalize=False, absolute=False)¶

Sum of scalar products of the corresponding vectors in self and other.

Parameters:

other (VecList) – (same dimensionality as self)
normalize (bool) – normalize, using unit vectors for self and other, so that the sum of cosines between paired vectors is returned.
absolute (bool) – sum absolute values of products, else signed values.
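
The effect of the two flags can be seen in a small sketch operating on lists of tuples. This is an illustration under the parameter descriptions above, not the VecList method itself.

```python
import math

def sumproduct_sketch(vecs_a, vecs_b, normalize=False, absolute=False):
    """Sum of scalar products of paired vectors from two equal-length lists."""
    total = 0.0
    for a, b in zip(vecs_a, vecs_b):
        dot = sum(x * y for x, y in zip(a, b))
        if normalize:
            # Divide by both lengths so that each term becomes a cosine.
            dot /= math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        total += abs(dot) if absolute else dot
    return total

a = [(2.0, 0.0, 0.0), (0.0, 3.0, 0.0)]
b = [(1.0, 0.0, 0.0), (0.0, -1.0, 0.0)]
signed = sumproduct_sketch(a, b)
unsigned = sumproduct_sketch(a, b, absolute=True)
cosines = sumproduct_sketch(a, b, normalize=True, absolute=True)
```

Combining normalize with absolute gives a sign-independent measure of how well two sets of axes agree, with a maximum equal to the number of vectors.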

sumsq()¶: Sum of vector lengths squared.

class atoms.dummy_chain(xyz)¶: Bases: object

atoms.is_number(s)¶

atoms.is_quoted_string(s)¶

atoms.is_triple_quoted_string(s)¶

atoms.iso_like(anisou, isotropic_b=None)¶

Scale anisotropic U tensor to match the isotropic B.

Parameters:

anisou (ndarray/NoneType) – U tensor if defined, else None.
isotropic_b (float) – B to which U is to be optionally scaled.

Returns:

scaled U if defined, else None

Return type:

numpy.matrix[3,3]/NoneType

See also

Anisotropics.iso_like() for coordinate set.

atoms.isotropic(anisou)¶

Convert anisotropic U to isotropic B.

Parameters:: anisou (ndarray/NoneType) – U tensor.
Returns:: isotropic B-factor.
Return type:: float

See also

Anisotropics.isotropic() for coordinate set.
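
The two conversions documented above, isotropic() and iso_like(), can be sketched with the standard crystallographic relation B = 8π² U_eq, where U_eq is one third of the trace of U. The function names are hypothetical, and the module's own handling (for example of None inputs) may differ.

```python
import math

EIGHT_PI_SQ = 8.0 * math.pi ** 2

def isotropic_b(u):
    """Equivalent isotropic B (Å^2) from a 3x3 U tensor (Å^2)."""
    return EIGHT_PI_SQ * (u[0][0] + u[1][1] + u[2][2]) / 3.0

def scale_to_b(u, target_b):
    """Rescale every element of U so that its equivalent B equals target_b."""
    factor = target_b / isotropic_b(u)
    return [[factor * x for x in row] for row in u]

u = [[0.02, 0.00, 0.00],
     [0.00, 0.03, 0.00],
     [0.00, 0.00, 0.04]]
b_equivalent = isotropic_b(u)
u_scaled = scale_to_b(u, 20.0)
```

A uniform scale factor changes the size of the displacement ellipsoid while leaving its shape and orientation unchanged.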

atoms.masked_determinants(a)¶

Determinants of a set of M, NxN matrices in a masked array.

Parameters:: a (numpy.ma.MaskedArray) – shape = (M, N, N)
Returns:: determinants
Return type:: numpy.ma.MaskedArray – shape = (M,)

atoms.masked_inverses(a)¶

Inverses of a set of M, NxN matrices in a masked array.

Parameters:: a (numpy.ma.MaskedArray) – shape = (M, N, N)
Returns:: inverses
Return type:: numpy.ma.MaskedArray – shape = (M, N, N)
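
One way to invert only the unmasked matrices of a stack is sketched below. It is an assumption about a workable approach, not the module's code: masked matrices are replaced by identity matrices before inversion and masked again afterwards.

```python
import numpy as np

def masked_inverses_sketch(a):
    """Invert each unmasked NxN matrix in a masked (M, N, N) array."""
    skip = np.ma.getmaskarray(a).any(axis=(1, 2))   # matrices with masked elements
    # Substitute identity matrices so that inv() cannot fail on masked data.
    data = np.where(skip[:, None, None], np.eye(a.shape[-1]), np.ma.getdata(a))
    inverses = np.linalg.inv(data)                  # broadcasts over the stack
    mask = np.broadcast_to(skip[:, None, None], a.shape).copy()
    return np.ma.masked_array(inverses, mask=mask)

a = np.ma.masked_array(
    np.array([[[2.0, 0.0], [0.0, 4.0]],
              [[1.0, 1.0], [1.0, 1.0]]]),           # second matrix is singular
    mask=[[[False, False], [False, False]],
          [[True, True], [True, True]]])
inv = masked_inverses_sketch(a)
```

The singular second matrix never reaches numpy.linalg.inv because it is masked, so no LinAlgError is raised.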

atoms.masked_norm(a, axis=0)¶

Normalize a masked array.

Parameters:

a (numpy.ma.MaskedArray) – dtype=float
axis (int) – along which the sums will be 1.0

Returns:

normalized, same shape, type as a

Return type:

numpy.ma.MaskedArray
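
A minimal sketch of this normalization, assuming it simply divides by the sum of unmasked values along the chosen axis (the function name is hypothetical):

```python
import numpy as np

def masked_norm_sketch(a, axis=0):
    """Scale a masked array so that unmasked values sum to 1.0 along axis."""
    return a / a.sum(axis=axis, keepdims=True)

a = np.ma.masked_array([[1.0, 2.0], [3.0, 6.0]],
                       mask=[[False, False], [False, True]])
normalized = masked_norm_sketch(a, axis=0)
column_sums = normalized.sum(axis=0)
```

Masked elements are excluded from the sums and remain masked in the result.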

atoms.matmul(a, b)¶: Poor man’s numpy.matmul() (which was added in NumPy 1.10), specific to two 3-D arrays.
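
For two 3-D arrays, the stacked matrix product can be written with numpy.einsum, which predates numpy.matmul. This is one possible equivalent, not necessarily how atoms.matmul is implemented.

```python
import numpy as np

def stacked_matmul(a, b):
    """Multiply paired matrices from two stacks: (n, i, j) x (n, j, k) -> (n, i, k)."""
    return np.einsum("nij,njk->nik", a, b)

rng = np.random.default_rng(0)
a = rng.random((4, 3, 3))
b = rng.random((4, 3, 3))
product = stacked_matmul(a, b)
```

On NumPy 1.10 or later the same result is given by numpy.matmul(a, b).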

atoms.number_selections(seq)¶

atoms.quoteString(s)¶

atoms.startup()¶

Program initialization, reading options etc.

Returns:: parser
Return type:: Arguments.ArgumentParser

atoms.triangular(anisou)¶

Top right triangle of U for PDB output.

Returns:: U-tensor [1,1], [2,2], [3,3], [1,2], [1,3], [2,3] units: Å², noting that above indices start at 1, but array elements from 0.
Return type:: ndarray.shape(6)

See also

Anisotropics.triangular() for coordinate set.
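
The element order can be made concrete with a small sketch using zero-based indices on a nested list. The function name is hypothetical and the values are illustrative.

```python
def triangular_sketch(u):
    """Return [U11, U22, U33, U12, U13, U23] from a symmetric 3x3 tensor."""
    return [u[0][0], u[1][1], u[2][2], u[0][1], u[0][2], u[1][2]]

u = [[0.05, 0.01, 0.02],
     [0.01, 0.03, 0.00],
     [0.02, 0.00, 0.04]]
six = triangular_sketch(u)
```

The three diagonal terms come first, followed by the off-diagonal terms of the upper triangle.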

atoms.where(condition, atoms1, atoms2)¶

Where condition is true, yield x,y,z,B,O from atoms1, else atoms2.

Parameters:: condition (ndarray or (class Selection)) – Boolean array
Returns:: New set of coordinates merged from atoms1, atoms2
Return type:: class Atoms

Note: requires condition, atoms1 and atoms2 to be of the same length, and all other attributes in atoms1 & atoms2 to be identical.
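
The selection logic is analogous to numpy.where applied per atom. The sketch below uses bare arrays in place of Atoms instances, so the array names are assumptions for illustration only.

```python
import numpy as np

condition = np.array([True, False, True])           # one flag per atom
xyz1 = np.array([[0.0, 0.0, 0.0],
                 [1.0, 1.0, 1.0],
                 [2.0, 2.0, 2.0]])
xyz2 = xyz1 + 10.0                                   # same atoms, shifted copy
b1 = np.array([10.0, 20.0, 30.0])
b2 = np.array([15.0, 25.0, 35.0])

# Broadcast the per-atom flag across the x, y, z columns.
merged_xyz = np.where(condition[:, None], xyz1, xyz2)
merged_b = np.where(condition, b1, b2)
```

Only per-atom values are merged here, which is why the other attributes of the two coordinate sets need to be identical.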
