Superpose: Rigid-group / Flexible Structure Overlay

Section author: Michael S. Chapman <chapmami@ohsu.edu>

Authors:

Michael S. Chapman <chapmami@ohsu.edu> & Brynmor K. Chapman,

Oregon Health & Science University

Version:

0.5, March 23, 2016

Usage:

python -u -m superpose [-option(s)] <moving.pdb> <target.pdb>

Synopsis

Constrained least-squares super-imposition of molecules parameterized as rigid-groups and/or proteins with flexible phi, psi backbone torsion angles:

Rigid-group
Multiple rigid groups can be refined simultaneously (subsets and/or non-overlapping selections), allowing inter-domain motions to be characterized independently of the overall superposition.
Torsion angle
Restriction to the phi, psi backbone dihedrals within user-defined regions, and/or a parsimony restraint are used for conservative modeling without over-fitting. Algorithms are suited for domain-sized changes, and differ from more prevalent approaches, designed for local optimization, in which atoms from different segments refine at different times. In our global approach, all connected atoms rigidly rotate about each torsion angle. To maintain convergence, dihedral rotations (and their partial derivatives) are linked to rigid-body rotations that minimize changes to the moments of inertia. Refinement, restrained by parsimony, can be used to find the torsion angles most critical to a conformational transition. Side chains are currently riding, i.e., they move passively with the backbone, without rotation of chi dihedrals.

Sources of Documentation

All of the following should be referenced:

Program options
Brief explanations of the arguments are given with: superpose.py -h; further information is given below.
Commands
Within the program (Superpose> prompt), available commands are listed with help -a; help about command is provided by help command, and for all commands by help -rl. The help text is reproduced later in this document.
Examples
The examples directory in the installation will soon contain scripts, data files, results and log files. Details to follow shortly.
Details, API

Details are encoded within docstrings that are accessible to programmers using Integrated Development Environments (IDEs). They are also compiled with sphinx into html files, linked from the module index on the home page. This is the searchable, cross-linked (API) reference documentation that explains the meaning of parameters, the performance of different functions, etc.

The documentation is accessed from (index.html) on-line or in the distribution directory doc/html. (Additional formats can be generated with sphinx.)

Command-line options

The most up-to-date documentation is generated from python -u -m superpose -h:

Command interpreter

The command interpreter is available for programs in stand-alone mode, but generally not when called from / embedded in another package.

Syntax:

Commands are entered at the prompt in a unix-shell style:
  • command [option(s)] [positional argument(s)]
  • where options can be provided as -x [value] (x is a single letter), or --option[=value] in “two-dash” long form.
  • Various standard short-cuts are pre-defined, and any command can be shortened to a unique abbreviation. Note that the abbreviation given in the help is not always long enough to be unique (a bug inherited from cmd2).
  • Comments can be included with “#”, with text to the end-of-line then ignored.

Shell:

The cmd2 shell-like interface is inherited, offering history, command editing and redirects. Redirects can be awkward, because of the conflict with logical operators (<, |, >) used in selections (which therefore need to be quoted).
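The effect of quoting can be illustrated with Python's standard shlex module. This is only a stand-in for shell-style tokenizing: it is an assumption that the interpreter's own parser splits a line in a comparable way, and the trailing comment text is invented for the example.

```python
import shlex

# A command line as it might be typed at the Superpose> prompt.
line = """select -C all -n catalytic "rigid['N_domain'] | S('resnam == ATP')"  # active site"""

# comments=True drops everything from '#' to the end of the line.
tokens = shlex.split(line, comments=True)

# The double-quoted selection survives as one token, with '|' left intact
# rather than being read as a pipe.
for token in tokens:
    print(repr(token))
```

Without the outer double quotes, the same text would be split at the spaces around "|", which is the conflict with redirects described above.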

Hierarchical structure:

Sub-commands are only available after entering the command. Higher-level commands are generally not available in sub-commands. The exceptions are general utility commands such as shell, shortcuts & set. Help, by default, is specific to the command level, but this behaviour can be changed with the --all (-a) & --recursive (-r) options. Note that load (“@”) & related commands do not transcend different command levels.

Just-in-time calculation & pre-requisites:

A number of efficiencies are possible by pre-calculating and repeatedly using objects. Rather than pre-calculating at startup all objects that might be needed, the program attempts to calculate the minimal needed, just-in-time. For the most part, the pre-requisites are figured out and tasks are executed when needed using pre-assigned (or default) parameters. One exception is that any command with a “parameterize” pre-requisite will issue an error message if not already performed (mind-reading is not an option!).
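The just-in-time pattern can be sketched in a few lines of Python. The class, attribute and method names below are hypothetical and chosen only for illustration; they are not the program's own objects.

```python
class Task:
    """Toy stand-in for a task with lazily computed prerequisites (hypothetical)."""

    def __init__(self):
        self._tables = None          # reusable object, built on first use
        self.parameterized = False   # has no sensible default; the user must set it

    @property
    def tables(self):
        # Just-in-time: calculate with default parameters only when first needed,
        # then reuse the cached result.
        if self._tables is None:
            print("calculating tables with default parameters")
            self._tables = [0.0, 1.0]
        return self._tables

    def parameterize(self):
        self.parameterized = True

    def refine(self):
        # A prerequisite that cannot be guessed produces an error instead.
        if not self.parameterized:
            raise RuntimeError("Error: parameterize must be run before refine")
        return len(self.tables)


task = Task()
try:
    task.refine()
except RuntimeError as err:
    print(err)               # reported, but the session carries on
task.parameterize()
print(task.refine())         # tables are calculated here, on demand
```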

The order that commands are entered is sometimes important, particularly when the embedded python interpreter is invoked with “py” (see below). Given the flexibility of the “py” command, there is no way to figure out the pre-requisites. Users should be especially attentive to AttributeErrors that might indicate an unmet pre-requisite dependence.

Error recovery:

Inherited from cmd2, exceptions are captured at the command level, printing at least an error message, but without aborting the whole program. In interactive use, this often conveniently offers a second chance. If run as a script, users should search the output for “Error”, lest one has scrolled by unnoticed. The default is a terse error message, but this can be changed to a full traceback using “set debug True” (which still does not abort).
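For scripted runs, the search for “Error” can be automated. The following is a minimal sketch that assumes the output has been captured as text lines; the log content shown is invented for the illustration.

```python
def find_errors(lines, marker="Error"):
    """Return (line_number, text) for every line that contains the marker."""
    return [(n, text.rstrip("\n"))
            for n, text in enumerate(lines, start=1)
            if marker in text]


# Example with an in-memory log; for a real run, pass an open file instead.
log = [
    "Superpose> pair\n",
    "Error: example message, invented for this illustration\n",
    "Superpose> evaluate\n",
]
for number, text in find_errors(log):
    print(number, text)
```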

Selected commands and implementation-specific limitations.

@FILE or load FILE:

Used to run commands from an external file. The limitation is that commands cannot descend/ascend through nested sub-commands. Thus, for example, commands within parameterize would have to be given separately. The same limitation applies to the variant _relative_load (@@).

Command list

The most up-to-date documentation is generated from python -u -m superpose /dev/null /dev/null help exit:

Superpose v0.5.6, 03/23/16 (Command: /home/chapman/Devel/RSRef/FTatom/src/superpose.py)
Superpose> help

Documented commands (type help <topic>):

_load ed history pair phipsi_diff save test
_relative_load edit l parameterize py select  
analyze evaluate li parrot r set  
backup flip list pause refine shell  
bfactors help load pdbout restore shortcuts  
cmdenvironment hi overlay phipsi_copy run show  

Miscellaneous help topics:

SELECTION_EXPR

Undocumented commands:

EOF eof exit q quit

Superpose> exit

Command Help

(See also Program control / utilities, below.) The most up-to-date documentation is generated from python -u -m superpose then help -rl:

analyze

Analyze prior refinement: shifts, gradients & impact of parameters.

analyze> convergence

Analyze the gradients and shifts of prior refinement.

Usage: convergence [options] arg

Options:
-h, --help show this help message and exit
-s, --summary Summary only without itemizing groups or dihedrals.

analyze> dihedrals

Changes in torsion angles and their impact.

dihedrals> hinges | hi

Find hinges in dihedral changes.

Usage: hinges [options] arg

Options:
-h, --help show this help message and exit
-g INT, --gap=INT
 # residues w/o dihedral changes that can be bridged in a hinge
-t FLOAT, --threshold=FLOAT
 above which dihedral rotations considered hinges. Degrees if > 1.0, else fraction of total change.
-p, --pseudo combine phi_i with psi_i-1, else individual dihedrals.
-s START, --start=START
 start of an explicitly defined hinge: chain-residue, requires --end, repeatable.
-e END, --end=END
 end of an explicitly defined hinge: chain-residue, requires --start, repeatable.
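One plausible reading of the threshold and gap options is sketched below. It assumes that a hinge is a run of residues whose dihedral change exceeds the threshold, and that runs separated by no more than "gap" sub-threshold residues are merged. The function name and the example values are illustrative, and the program's actual algorithm may differ.

```python
def find_hinges(changes, threshold=10.0, gap=2):
    """Group residues with large dihedral changes into hinges.

    changes   -- absolute dihedral change (degrees) per residue, in sequence order
    threshold -- degrees if > 1.0, otherwise a fraction of the total change
    gap       -- number of sub-threshold residues that may be bridged
    Returns a list of (first_index, last_index) tuples.
    """
    cutoff = threshold if threshold > 1.0 else threshold * sum(changes)
    hinges = []
    for i, change in enumerate(changes):
        if change <= cutoff:
            continue
        if hinges and i - hinges[-1][1] - 1 <= gap:
            hinges[-1] = (hinges[-1][0], i)   # bridge the gap, extend the hinge
        else:
            hinges.append((i, i))             # start a new hinge
    return hinges


changes = [1, 25, 2, 3, 30, 1, 1, 1, 1, 40, 2]
print(find_hinges(changes, threshold=10.0, gap=2))
print(find_hinges(changes, threshold=0.3, gap=2))   # fraction of total change
```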
dihedrals> impact

Estimate impact of refined torsion angle changes on superimposition.

Usage: impact [options] arg

Options:
-h, --help show this help message and exit
-i, --iterative
 Iteratively determines/applies highest impact dihedral changes. This rarely-needed option is compute- intensive, because it iteratively finds the single, highest impact dihedral, applies the rotation, and repeats. Otherwise, by default, impact is assessed more rapidly from the dot product of the gradient and shift vectors, integrated over refinement iterations.
-I IDENTIFY, --identify=IDENTIFY
 List this number of the top-impact dihedrals.
-p, --pseudo Use approximation to pseudo-torsion angles (phi_i + psi_{i-1}) instead of individual phi, psi. (Limits output options, but speeds up the --iterative option.)
-R, --recover Use previous calculation of impact, other options invalid.
-r, --rmsd Instead of the residual, use its square root (RMSD in superpose), decreasing influence of early cycles.
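The default (non-iterative) estimate can be pictured as a first-order sum over refinement iterations. The sketch below assumes that the product of gradient and shift for each parameter approximates the change in the residual attributable to it; the function name and data layout are hypothetical, not those of the program.

```python
def estimate_impact(gradients, shifts):
    """First-order estimate of each parameter's contribution to the fit.

    gradients[k][j] -- d(residual)/d(parameter j) at iteration k
    shifts[k][j]    -- change applied to parameter j at iteration k
    Returns the percent of the total estimated improvement for each parameter.
    """
    n_params = len(gradients[0])
    # -gradient * shift approximates the drop in residual due to that shift.
    drop = [0.0] * n_params
    for grad_k, shift_k in zip(gradients, shifts):
        for j in range(n_params):
            drop[j] -= grad_k[j] * shift_k[j]
    total = sum(drop)
    return [100.0 * d / total for d in drop]


# Two parameters followed over two iterations.
gradients = [[-4.0, -1.0], [-2.0, -1.0]]
shifts = [[0.5, 0.5], [0.5, 0.5]]
print(estimate_impact(gradients, shifts))
```

In this toy case the first parameter accounts for three quarters of the estimated improvement, because its gradient was larger throughout.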
impact> color

Output coordinates, B-factors set to percent impact on fit of dihedrals.

Usage: color [options] output PDB file

Options:
-h, --help show this help message and exit
-p, --pseudo Use pseudo-torsion angle (phi_i + psi_i-1), else individual phi, psi; ignored if impact --pseudo.
impact> pickle

Save impact-needed data, so impact.py can replot without recalculation (when tweaking plot). WARNING: objects pickled by superpose must be compatible with impact - has not been checked recently.

Usage: pickle [options] FILE.jar - pickled impact object for later analysis.

Options:
-h, --help show this help message and exit
impact> plot

Graph impact of dihedrals on fit.

Usage: plot [options] arg

Options:
-h, --help show this help message and exit
-p <impact|change>, --prior=<impact|change>
 Replace impact or dihedral change with values from previous refinement (to superimpose restrained impact upon unrestrained dihedral changes).
-c, --changeOnly
 plot only dihedral changes, not estimate of impact.
-t, --tty Display graph on terminal (not recommended for background jobs).
-f FILE, --file=FILE
 save plot in graphics file of type given by extension (.emf, eps, jpeg, jpg, pdf, png, ps, raw, rgba, svg, svgz, tif, tiff)
-A FILE, --annotation=FILE
 File containing additional plot commands, overriding command-line input.
-O, --overall Dihedral changes (not impact) relative to reference (input) structure not refinement batch.
impact> print

Tabulate changes in dihedrals & impact on fit.

Usage: print [options] arg

Options:
-h, --help show this help message and exit
-O, --overall Dihedral changes (not impact) relative to reference (input) structure not refinement batch.
impact> done

Safe sub-menu exit, returning to higher level command.

dihedrals> paint

Output coordinates, B-factors set to dihedral change for visualization.

Usage: paint [options] output PDB file (w/ corrupted B-factors!)

Options:
-h, --help show this help message and exit
-p, --pseudo Use pseudo-torsion angle (phi_i + psi_i-1), else individual phi, psi.
-n, --normalized
 Scaled between 0 & 100.
dihedrals> done

Safe sub-menu exit, returning to higher level command.

analyze> done

Safe sub-menu exit, returning to higher level command.

backup

backup <new-jar-file>: save coordinates, history, stats from refinement

bfactors

B-factor statistics comparing current model to target.

Usage: bfactors [options] SELECTION_EXPR | SELECTION | COLLECTION:

Group(s) of atoms over which average statistics are to be output.

SELECTION_EXPR: boolean expr; see help SELECTION_EXPR.

COLLECTION: a dictionary-like Group of SELECTIONs (defined by prior select command) over which statistics are iterated.

SELECTION: COLLECTION[‘ITEM’] previously defined with a select command.

COLLECTION or [‘ITEM’] can be omitted if defaults were used in the select command.

Options:
-h, --help show this help message and exit
-v, --verbose Including per-residue information.
-p OUTPUT_type, --plot=OUTPUT_type
 Plot to “screen” or “file” an analysis of anisotropic Bs and coordinate differences. Repeat for both.
-f FILE, --file=FILE
 base of the file name (for --plot=file) into which “_cmp_U” and “_U_vs_xyz” will be inserted for 2 plots. The extension designates the file-type: .emf, eps, jpeg, jpg, pdf, png, ps, raw, rgba, svg, svgz, tif, tiff.
-c PDB_out, --color=PDB_out
 Output pdb files with B-factors replaced for molecular graphic illustration of the following statistics. The argument gives the base file name, which is modified to indicate the contents of the B-factor column: “_A”- 100 x Anisotropy of moving structure; “_U_vs_U”- 100 x |cos(deviation U_moving vs. U_target)|; “_U_vs_xyz”- 100 x |cos(deviation U_moving vs. coordinate difference vector)|; “_div”- Kullback-Leibler divergence for U_moving vs. U_target; “_div_scaled”- like “_div” except scaled by B_iso for each atom (like Merritt, 2011) - this is only output if SCALING is not “individual”. [“_div” and “_div_scaled” are provided for comparison and may be deprecated.] Use -c None to disable output of all files.
-s SCALING, --scaling=SCALING
 Scaling of anisotropic Us for atom size (B_iso) by individual (atom) or group or overall (by means of all paired atoms).

evaluate

Statistics for current model compared to target.

Usage: evaluate [options] arg

Options:
-h, --help show this help message and exit
-v, --verbose Including per-residue information.
-c PDB_file, --color=PDB_file
 Output superimposed structure, B replaced by |coordinate difference|.
-s float, --scale=float
 Scale by which differences in PDB_file will be multiplied.
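As a sketch of what the --color and --scale options describe, the value placed in the B-factor column for each paired atom would be the length of the coordinate difference vector multiplied by the scale. The function name and coordinates below are invented for illustration.

```python
import math


def difference_b_factors(moving, target, scale=1.0):
    """Per-atom |coordinate difference| times scale, for the B-factor column."""
    return [scale * math.dist(m, t) for m, t in zip(moving, target)]


# Two paired atoms: the first differs by 5 Angstrom, the second coincides.
moving = [(0.0, 0.0, 0.0), (1.0, 2.0, 2.0)]
target = [(3.0, 4.0, 0.0), (1.0, 2.0, 2.0)]
print(difference_b_factors(moving, target, scale=10.0))
```

A scale larger than 1 spreads small differences over a wider range of values, which can make them easier to distinguish when coloring by B-factor in a graphics program.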

flip

Chi rotation of pseudo-symmetrical side chains to match target.

180 degree rotations about sp3-sp2 bonds.

Usage: flip [options] arg

Options:
-h, --help show this help message and exit
-R, --arg Include arginines.
-N, --asn Include asparagines.
-D, --asp Include aspartates.
-Q, --gln Include glutamines.
-E, --glu Include glutamates.
-H, --his Include histidines.
-F, --phe Include phenylalanines.
-Y, --tyr Include tyrosines.
-a, --all All amino acids, default if none specified.
-v VERBOSITY, --verbosity=VERBOSITY
 -1 (quiet) to +1 (verbose)

overlay

Apply the rigid transformation that LSQ overlays moving atoms on paired targets.

For a single rigid group, this is preferred over iterative refinement, using the Kabsch method to give the least-squares solution in a single step.

overlay transforms selected coordinates by the rigid-group operator that least-squares aligns previously paired refining and target atoms. (Note that the selection here, by default ‘all’, can be different from the selections used in pairing.) The operator is determined by Kabsch’s eigenvector method [Kabsch-1976], [Kabsch-1978]. This is a direct (linear) method that can circumvent the slow and narrow convergence of the non-linear methods in refine. However, it cannot be combined with torsion angle parameterization or external restraints. It is best used as a preliminary step, so that refine can start well within its convergence radius.
[Kabsch-1976] Kabsch, W., 1976. A solution for the best rotation to relate two sets of vectors. Acta Crystallographica A32, 922-3.
[Kabsch-1978] Kabsch, W., 1978. A discussion of the solution for the best rotation to relate two sets of vectors. Acta Crystallographica A34, 827-8.
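For readers unfamiliar with the approach, a direct least-squares overlay can be sketched with NumPy. Note that this uses the widely used singular value decomposition (SVD) formulation rather than the eigenvector formulation cited above, and it is not taken from the program's source. The function name and test coordinates are illustrative only.

```python
import numpy as np


def kabsch(moving, target):
    """Return rotation R and translation t minimizing sum |R m_i + t - x_i|^2."""
    moving = np.asarray(moving, dtype=float)
    target = np.asarray(target, dtype=float)
    m_center = moving.mean(axis=0)
    t_center = target.mean(axis=0)
    # 3x3 covariance between the centered coordinate sets.
    h = (moving - m_center).T @ (target - t_center)
    u, _, vt = np.linalg.svd(h)
    # Guard against an improper rotation (reflection).
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    translation = t_center - rotation @ m_center
    return rotation, translation


# Target = moving points rotated 90 degrees about z, then translated.
pts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]])
rz = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
tgt = pts @ rz.T + np.array([1.0, 2.0, 3.0])

rot, trans = kabsch(pts, tgt)
moved = pts @ rot.T + trans
rmsd = np.sqrt(((moved - tgt) ** 2).sum(axis=1).mean())
print(rmsd)
```

Because the solution is obtained in one step, there is no convergence radius to worry about, which is the advantage over iterative refinement noted above.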

Usage: overlay [options] arg

Options:
-h, --help show this help message and exit
-m <SELECTION_EXPR|SELECTION>, --moving=<SELECTION_EXPR|SELECTION>
 Selection of atoms to move; default: all. SELECTION_EXPR: boolean expr; see help SELECTION_EXPR. SELECTION: COLLECTION[‘ITEM’] previously designated with a select command. COLLECTION or [‘ITEM’] can be omitted if defaults used in select command. This option does not change the pairing or the transformation operator.

pair

Pair atoms between moving & target structures. Required before flip, refine, etc.

The --moving and --target option selections limit the atoms that are available for matching in (just) this pair session (defaults are all atoms). These selections are combined (logical AND) with those of the pair>within subcommand. Limits need only be defined for one structure, but either/both can be specified for convenience.

Usage: pair [options] arg

Options:
-h, --help show this help message and exit
-e, --extend Extend prior pairing, else start anew.
-m <SELECTION|SELECTION_EXPR>, --moving=<SELECTION|SELECTION_EXPR>
 Moving atoms for this pair session. SELECTION_EXPR: boolean expr; see help SELECTION_EXPR. SELECTION: COLLECTION[‘ITEM’] previously designated with a select command. COLLECTION or [‘ITEM’] can be omitted if defaults used in select command.
-t SELECTION_EXPR, --target=SELECTION_EXPR
 Target atoms for this pair session.

pair> done

Safe sub-menu exit, returning to higher level command.

pair> match

Pair atoms by attribute value (-v) and/or position (-p; order).

ATTR is specified as in “select --attr”, white-space replaced with “_”. Default: --value atom_name --value=residue_num -v conformer --position chain. (-v None (or -p None) avoids the default, disabling value (position) matching.) Through the overall (pair -m/-t) and current (within -m/-t) selections and use of sufficient paired attributes, the specified match must be unambiguous 1:1.

Usage: match [options] arg

Options:
-h, --help show this help message and exit
-v ATTR, --value=ATTR
 Match attribute by value (repeatable).
-p ATTR, --position=ATTR
 Match attribute by position (order; repeatable).

pair> print

Pairing statistics.

pair> within

Limit atom matching to moving and/or target selection until redefined.

Selections are remembered between pair commands. These selections are combined (logical AND) with those of the parent pair command. Limits need only be defined for one of the structures, but both can be used if more convenient.
Usage: within [options] SELECTION_EXPR | SELECTION:

SELECTION_EXPR: boolean expr; see help SELECTION_EXPR.

SELECTION: COLLECTION['ITEM'] previously designated with a select command. SELECTION is usually defined only for --moving atoms and will usually be invalid if applied to --target atoms. COLLECTION or ['ITEM'] can be omitted if defaults were used in the select command.

Options:
-h, --help show this help message and exit
-m, --moving Selection is for moving atoms.
-t, --target Selection is for target atoms.

pair> done

Safe sub-menu exit, returning to higher level command.

parameterize

For designated parameter type(s) (default positions), select atoms to be refined & how.

Usage: parameterize [options] arg

Options:
-h, --help show this help message and exit
-p, --position Designate which atomic positions (xyz, default) to be refined.

parameterize> SELECTION_EXPR

SELECTION_EXPR: logical expression of <Selection(s)> (string)
using array logical operators: &, |, ==, !=, ~, >, >=, (, ), ... evaluated in the task namespace. <Selection> is an existing instance of class Selection or a new one instantiated with S(<criterion>), where string <criterion> is a logical array expression using keywords such as residue number, chain, atom number, synonyms or unique abbreviations, see documentation for Selection. SELECTION_EXPR should be quoted to avoid shell commands / redirects etc.

Pre-defined selections/groups are created by top-level commands:

    > select -C collection -n name
    > select -C collection -a attribute

and are referred to in SELECTION_EXPR in one of two ways:

    > collection['name'] or collection['attribute'] (single selection).
    > collection (uses the logical OR of all selections in the Group).

Examples:

S('chain == A') & S('residue num <= 105')
S('(chai == C) & (residue_ty != HOH)')
"rigid['N_domain'] | S('resnam == ATP')"

Default: S(‘all’), i.e. all atoms.

parameterize> clear

Switch off all refinement of requested parameter type. (Individual parameterizations can be switched off with “group None”, “individual None”, “torsion None”, “overall False”.)

parameterize> done

Safe sub-menu exit, returning to higher level command.

parameterize> group

Select atoms to be refined as one or more groups.

Usage: group [options] GROUP | COLLECTION | SELECTION | SELECTION_EXPR | GROUP_EXPR:

SELECTION: a previously saved single Selection, given as: COLLECTION[‘ITEM’].
(COLLECTION or [‘ITEM’] can be omitted if defaults were used in the corresponding select command.)

SELECTION_EXPR: boolean Selection expression, see help select.

GROUP_EXPR: list, dictionary or tuple of multiple quoted SELECTION_EXPR.

GROUP | COLLECTION: name of previously saved (cmd select) Selections.

Options:
-h, --help show this help message and exit

parameterize> individual

Select atoms to be refined individually. (More apropos with exptl. data than superposition.)

Usage: individual [options] SELECTION_EXPR | SELECTION:

SELECTION_EXPR: a boolean expression, see help select; SELECTION: name of a previously saved Selection, given as: COLLECTION[ITEM]. (COLLECTION or [ITEM] can be omitted if defaults were used in the corresponding select command.)
Options:
-h, --help show this help message and exit

parameterize> overall

Refine requested parameter type as a single group.

If just refining overall position, overlay is better (wider convergence/faster). Use overall when also simultaneously refining sub-groups, torsion angles, etc.

Usage: overall [options] [True | False]

Options:
-h, --help show this help message and exit

parameterize> print

Print parameterization. (Options invoked until reset.)

Usage: print [options] arg

Options:
-h, --help show this help message and exit
-w INT, --width=INT
 Line width (def. 132).
-a SIZE, --abbreviate=SIZE
 Selections longer than SIZE abbreviated w/ ellipsis (def. 33).
-c, --count Report number of moving atoms in selections.
-b, --boolean Report selections as T/F boolean arrays (def).

parameterize> torsion

Select variable dihedrals and atoms whose positions depend on them.

Usage: torsion [options] [SELECTION_EXPR] - used by -f/-V and overrules -d default.

Options:
-h, --help show this help message and exit
-d SELECTION_EXPR, --dihedrals=SELECTION_EXPR
 optimize all variable dihedrals (phi, psi) within this selection. See help select. Default: macromolecule
-a SELECTION_EXPR, --atoms=SELECTION_EXPR
 atoms to be moved by dihedral rotations if linked to variable bonds (which must be fully enclosed by this Selection). See help select. Default: all atoms - macromolecule
-f <phi|psi|pseudo>, --fix=<phi|psi|pseudo>
 Fix all dihedrals of named type within atoms in SELECTION_EXPR (must be only option/argument).
-t <NUMBER(int)|FRACTION(float)>, --top=<NUMBER(int)|FRACTION(float)>
 Fix all but these dihedral angles (must be only option/argument).
-V <phi|psi|pseudo>, --vary=<phi|psi|pseudo>
 Vary all dihedrals of named type within atoms in SELECTION_EXPR (must be only option/argument).
-v, --verbose List selected dihedrals (--top).

parameterize> done

Safe sub-menu exit, returning to higher level command.

pdbout

Write coordinates (symmetry expansion as per command-line arguments).

Usage: pdbout [options] [Header inserted into top of PDB file]

Options:
-h, --help show this help message and exit
-o FILE, --file=FILE
 Output file name (required argument).
-a, --anisotropic
 Output anisotropic U, riding w/ coordinates from input PDB.
-c, --c_alpha Output only C_alphas.

phipsi_copy

Difference in phi, psi between current structure & target.

Usage: phipsi_copy [options] arg

Options:
-h, --help show this help message and exit
-s FLOAT, --sigma=FLOAT
 Copy only dihedrals above these standard deviations.
-p, --pseudo Sigma threshold applies to pseudo-dihedrals (phi_i + psi_i-1) rather than individual torsion angles.
-c, --confirm Confirm each change interactively before applying.

phipsi_diff

Difference in phi, psi between current structure & target.

Usage: phipsi_diff [options] [Header inserted into top of PDB file]

Options:
-h, --help show this help message and exit
-c PDB-FILE, --color=PDB-FILE
 Output coordinates for display - B-factors set to differences.
-H, --hinge hinge analysis using THRESHOLD & GAP
-g INT, --gap=INT
 # residues w/o dihedral changes that can be bridged in a hinge
-t FLOAT, --threshold=FLOAT
 above which dihedral rotations considered hinges. Degrees if > 1.0, else fraction of total change.
-s START, --start=START
 start of an explicitly defined hinge: chain-residue, requires --end, repeatable.
-e END, --end=END
 end of an explicitly defined hinge: chain-residue, requires --start, repeatable.
-n, --normalized
 Color by relative absolute differences not actual signed values, and plot absolute values.
-p, --pseudo quasi pseudo-torsion angles, combining phi_i w/ psi_i-1
-P, --plot graph the differences.
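The pseudo-torsion combination named in the --pseudo options pairs each phi_i with the psi of the preceding residue, the two dihedrals that flank one peptide plane. A minimal sketch follows, with a hypothetical function name and invented values; treating the first residue's phi on its own is an assumption of this sketch.

```python
def pseudo_changes(dphi, dpsi):
    """Combine each phi_i change with the psi_(i-1) change of the previous residue."""
    # The first residue has no predecessor, so its phi change stands alone.
    return [dphi[i] + (dpsi[i - 1] if i > 0 else 0.0) for i in range(len(dphi))]


# Changes in degrees for three consecutive residues.
dphi = [0.0, 30.0, -5.0]
dpsi = [-28.0, 4.0, 0.0]
print(pseudo_changes(dphi, dpsi))
```

In this example the large, opposite changes in psi_0 and phi_1 nearly cancel, so the combined value suggests a reorientation of the intervening peptide plane rather than a hinge in the chain.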

refine | r

Refine atomic model.

Usage: refine [options] arg

Options:
-h, --help show this help message and exit
-C INT, --max_cycles=INT
 Maximum number of cycles.
-i FLOAT, --min_improvement=FLOAT
 End when per-cycle improvement falls below this value.
-g FLOAT, --min_grad=FLOAT
 End when gradient norm falls below this value.
-n, --new_batch
 New batch (losing prior history), else continue prior (default) if exists/possible.
-r, --restart From original coordinates (implies -n).
-v INT, --verbosity=INT
 Per-cycle logging: -1 (terse) to 3 (verbose) [def. 0].
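How the three stopping options interact can be shown on a toy problem. The sketch below applies steepest descent to f(x) = sum of x_i squared, which is only a stand-in for the program's non-linear least-squares refinement. The function, its step parameter and the order in which the criteria are tested are assumptions made for illustration.

```python
def refine(x, max_cycles=20, min_improvement=1e-6, min_grad=1e-4, step=0.25):
    """Steepest descent on f(x) = sum(x_i^2) with three stopping rules.

    Returns (final x, cycles performed, name of the rule that ended the run).
    """
    residual = sum(v * v for v in x)
    for cycle in range(1, max_cycles + 1):
        grad = [2.0 * v for v in x]
        grad_norm = sum(g * g for g in grad) ** 0.5
        if grad_norm < min_grad:
            return x, cycle - 1, "min_grad"
        x = [v - step * g for v, g in zip(x, grad)]
        new_residual = sum(v * v for v in x)
        improvement = residual - new_residual
        residual = new_residual
        if improvement < min_improvement:
            return x, cycle, "min_improvement"
    return x, max_cycles, "max_cycles"


start = [1.0, -2.0]
print(refine(start, max_cycles=3))
print(refine(start, min_improvement=1.0))
print(refine(start, min_grad=10.0))
```

Whichever criterion is met first ends the run, so a generous cycle limit combined with tight thresholds lets convergence, rather than the cycle count, decide when to stop.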

restore | r

restore <jar-file>: return to previously saved refinement. (Options, parameterization must be compatible w/ file before loading to avoid unpredictable results).

select

Name selection(s) of atoms.

Usage: select [options] [SELECTION_EXPR]

SELECTION_EXPR: logical expression of <Selection(s)> (string) using array logical operators: &, |, ==, !=, ~, >, >=, (, ), ... evaluated in the task namespace. <Selection> is an existing instance of class Selection or a new one instantiated with S(<criterion>), where string <criterion> is a logical array expression using keywords such as residue number, chain, atom number, synonyms or unique abbreviations, see documentation for Selection. SELECTION_EXPR should be quoted to avoid shell commands / redirects etc, and if provided as a named argument, must be devoid of white-space.

Examples:

select -C rigid -n N_domain S('chain == A') & S('residue num <= 105')
select -C protein -n C S('(chai == C) & (residue_ty != HOH)')
select -C all -n catalytic "rigid['N_domain'] | S('resnam == ATP')"
select -C mycollection -a resnum -F protein -S C
select --collection=chains --attr=chain
Options:
-h, --help show this help message and exit
-C NAME, --collection=NAME
 Collection (dictionary) into which selection is placed (default: None --> “collection” or name of attribute if -a specified).
-n NAME, --name=NAME
 Unique name to be given to selection (default: None --> “default”). (-a (--attr) & -n (--name) are mutually exclusive.)
-a ATOM_ATTR, --attr=ATOM_ATTR
 Selections are made (and named) for each unique value of ATOM_ATTR in the coordinates (see -d, -f). ATOM_ATTR must be specified as a single-word abbreviation/synonym recognized in Selection expressions, e.g. --attr=chain might give selections A, B..., while --attr=resnum might give 23, 24,... (-a (--attr) & -n (--name) are mutually exclusive.)
-S NAME, --selection=NAME
 Selection or dictionary (group) from which --attr subset is to be drawn (default: None --> all atoms).
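The effect of --attr can be pictured as partitioning the atoms by the unique values of one attribute. In this sketch plain Python lists of booleans stand in for Selection objects; the function name and data are hypothetical.

```python
def select_by_attr(values):
    """Map each unique attribute value to a boolean mask over the atoms."""
    collection = {}
    for value in values:
        if value not in collection:
            collection[value] = [v == value for v in values]
    return collection


# One chain identifier per atom, as for --attr=chain.
chain = ["A", "A", "B", "B", "B"]
chains = select_by_attr(chain)
print(sorted(chains))
print(chains["B"])
```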

Program control / utilities

(Available from top-level and all sub-menus.) The most up-to-date documentation is generated from python -u -m superpose then help -rl:

_load

Runs script of command(s) from a file or URL.

_relative_load

Runs commands in script at file or URL; if this is called from within an already-running script, the filename will be interpreted relative to the already-running script’s directory.

cmdenvironment

Summary report of interactive parameters.

edit | ed

ed: edit most recent command in text editor

ed [N]: edit numbered command from history
ed [filename]: edit specified file name

commands are run after editor is closed. “set edit (program-name)” or set EDITOR environment variable to control which editing program is used.

help

Document command or list available commands.

Usage: help [options] [command]

Options:
-h, --help show this help message and exit
-a, --all Include commands inherited from higher levels. (Combining -a -r will be excessively repetitious.)
-l, --long Fully document all commands.
-r, --recursive
 Descend through nested command sets.

history | hi

history [arg]: lists past commands issued

no arg: list all
arg is integer: list one history item, by index
arg is string: string search
arg is /enclosed in forward-slashes/: regular expression search

Usage: history [options] (limit on which commands to include)

Options:
-h, --help show this help message and exit
-s, --script Script format; no separation lines

list | l | li

list [arg]: lists last command issued

no arg -> list most recent command
arg is integer -> list one history item, by index
a..b, a:b, a:, ..b -> list spans from a (or start) to b (or end)
arg is string -> list all commands matching string search
arg is /enclosed in forward-slashes/ -> regular expression search

load | l

Runs script of command(s) from a file or URL.

parrot

parrot [T[rue]|F[alse]|Y[es]|N[o]]: toggle or set command echoing (for log file).

pause

Displays the specified text then waits for the user to press RETURN.

py

py <command>: Executes a Python command.
py: Enters interactive Python mode. End with Ctrl-D (Unix) / Ctrl-Z (Windows), quit() or exit(). Non-python commands can be issued with cmd("your command"). Run python code from external files with run("filename.py").

run | r

run [arg]: re-runs an earlier command

no arg -> run most recent command
arg is integer -> run one history item, by index
arg is string -> run most recent command by string search
arg is /enclosed in forward-slashes/ -> run most recent by regex

save

save [N] [filename.ext]

Saves command from history to file.

N => Number of command (from history), or *;
most recent command if omitted

set

Sets named Cmd parameter or lists all; unambiguous abbreviations OK.

shell

execute a command as if at the OS prompt.

shortcuts

Lists single-key shortcuts available.

show

Shows value of a parameter.

Usage: show [options] arg

Options:
-h, --help show this help message and exit
-l, --long describe function of parameter

Python Interpreter for advanced functionality

An attempt has been made to balance flexibility with simplicity and ease of use, in deciding which functionalities and parameters are available through command-line or command interpreter control. Many others are accessible through an embedded python interpreter that is invoked with the py command. It is executed in a namespace that is local to the command interpreter (and not very useful). The command-line options are imported as attributes of the object “option”, and most other needed objects can be accessed as attributes of self.task, for which the alias “my” is provided. The following examples illustrate:

  • Changing chain name of target to match refining model: Superpose> py import numpy; my.atoms.chain = numpy.where(my.atoms.chain=='C', 'A', my.atoms.chain)
  • Removing one of the target alternative locator IDs to match refining model: Superpose> py import numpy; my.atoms.altloc=numpy.where(my.atoms.altloc=='A',' ',my.atoms.altloc)
  • Changing the torsion-restraint weight during a refinement: Superpose> py option.torsion_weight = 2.0
  • Print values of command-line options: Superpose> py print option
  • Replace all B values with their mean: Superpose> py b=my.atoms.b; b.fill(b.mean())
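The NumPy idioms used in these one-liners can be tried outside the program. In the sketch below, small arrays stand in for my.atoms.chain and my.atoms.b; the values are invented.

```python
import numpy as np

# Stand-in for my.atoms.chain: one chain identifier per atom.
chain = np.array(["C", "C", "B", "C"])

# Where the condition holds take 'A', elsewhere keep the existing value.
chain = np.where(chain == "C", "A", chain)
print(chain)

# Stand-in for my.atoms.b: replace every B value with the mean, in place.
b = np.array([10.0, 20.0, 30.0])
b.fill(b.mean())
print(b)
```

numpy.where returns a new array, which is why the result is assigned back to the attribute in the examples above, whereas fill modifies the existing array in place.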

Atom selections and groups

A parser is provided for flexible selection of atom groups within the command interpreter. It is applicable to stand-alone running (rsref.py, superpose.py etc.), but not usually accessible when modules are wrapped/embedded in other packages (eg. CNS) which then manage atom selections.

Selections

The selection syntax is terse, flexible, and (for better or worse) relies on Python evaluation of expressions. Thus, hopefully, it will be intuitive for many users.

It differs from some other programs in that selections are objects (instances of class Selection). Selection objects are an extension of boolean arrays, specific for a coordinate set (class Atoms). Selection expressions can be used directly in commands like parameterize, or user-named Selections can be pre-defined for repeated use or to simplify/clarify the definition of complicated Selections. Selections can be combined or assigned to new Selection instances through use of Python bit-wise logical operators (&,|,==,!=,~,>,>=,(,),...). This should make it convenient to write refinement scripts in which different subsets of the atomic parameters are refined at different stages, different sets of atoms are subject to positional, B-factor or occupancy refinement, and in which some atoms might be refined individually, others rigidly grouped for simultaneous refinement, etc. This frees the user of constraints embodied in other programs.

In RSRef, “S” is a synonym of Selection, the class. A selection expression is defined as:

<selection>|S(<criterion>) [<operator> <selection>|S(<criterion>)]

where:

<selection>
is a pre-existing instance of class S.
<criterion>

is a non-quoted string to select atoms for a new instance of S, where <criterion> can be in simple form or compound:

Simple

contains a single operator, and is given without parentheses, such as:

  • chain == A
  • chai==B
  • Residue number >= 30
  • resnu <= 50
Compound

an expression combining operators, using parentheses to set precedence. Spaces in names must be replaced with underscores, and expressions should be written in lower case, because the parser is case sensitive. Examples:

  • (chain == A) | (chai==B)
  • ((residue_numb >= 30) & (atnam == CA)) | (chain != C)

Warning

the compound parser will silently do unexpected things with syntax or spelling errors. Check that results are consistent with expectations!

Note

the inner operands are enclosed within parentheses, because the combination operators (&, |) have higher precedence than the comparison operators (>, >=, <, <=).

<operator>
is a bit-wise Python logical operator (&,|,==,!=,~,>,>=,(,),...). Further details of the syntax for making Selections are provided in atoms.Selection.

Selections are named and defined with the select command.
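The precedence rule in the Note above is ordinary Python behaviour and can be checked with plain integers. A minimal sketch, using a hypothetical residue number rather than a Selection:

```python
resnum = 60  # hypothetical residue number

# Intended meaning: 30 <= resnum <= 50, which is False for 60.
with_parens = (resnum >= 30) & (resnum <= 50)

# Without parentheses, & binds first, so Python evaluates
# resnum >= (30 & resnum) <= 50, a chained comparison on 28.
without_parens = resnum >= 30 & resnum <= 50

print(with_parens, without_parens)
```

The two results differ, and no error is raised in the unparenthesized case, which is consistent with the warning that mistakes can pass silently.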

Groups

Groups are dictionary-like collections of named Selections, each constrained as a group in group refinement. (In individual atom refinement, a Group is treated as the logical OR between all the named Selections.) The Group class contains methods for checking that selections do not overlap (work-in-progress). In most commands, the name of a Group can be substituted for that of a Selection.

Groups are defined with the select command, for example:

select --collection=domains --name=N S('chain == A') & S('residue num <= 105')
select -C domains -n C S('chain == A') & S('residue num > 105')

These two statements illustrate several variants of the syntax. Together they define a Group called domains, with two Selections, named N and C, that contain the N- and C-terminal parts of subunit A.
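To make the Group idea concrete, here is a small sketch using a plain dictionary of NumPy boolean masks. The dictionary, the variable names and the overlap test are illustrative assumptions; they are not the actual Group class or its methods.

```python
import numpy

# Hypothetical residue numbers for six atoms, all in chain A.
resnum = numpy.array([20, 80, 105, 106, 150, 200])

# A Group sketched as a dictionary of named boolean masks.
domains = {
    'N': resnum <= 105,
    'C': resnum > 105,
}

# Individual-atom refinement would see the logical OR of all members.
union = numpy.logical_or.reduce(list(domains.values()))

# Overlap check: no atom should belong to more than one member.
membership = numpy.sum(list(domains.values()), axis=0)
overlapping = bool((membership > 1).any())

print(union.tolist(), overlapping)
```

Counting memberships per atom is one simple way to detect overlap; any atom with a count above one would be claimed by two rigid groups at once.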

Troubleshooting

Pairing

AssertionError: Overwriting prior pairing for ... in structure B: The attempted pairing was ambiguous, with two atoms of structure A mapped to the same atom of B. Sometimes this is because of multiple conformers, which can be resolved in pair by specifying the conformer to be used in each selection, e.g. --moving ((chain==A)&(conformer==1))
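The ambiguity behind this error can be illustrated with a small sketch. The pairing key used here (chain, residue number, atom name) and the function pair_atoms are assumptions for illustration only, not the program's actual pairing code.

```python
def pair_atoms(moving, target):
    """Pair atoms by (chain, residue number, atom name); hypothetical key."""
    target_index = {}
    for i, atom in enumerate(target):
        target_index[atom[:3]] = i

    pairing = {}  # target index -> moving index
    for j, atom in enumerate(moving):
        i = target_index.get(atom[:3])
        if i is None:
            continue
        # Two moving atoms mapping to one target atom is ambiguous.
        assert i not in pairing, "Overwriting prior pairing for %s" % (atom[:3],)
        pairing[i] = j
    return pairing


# Atoms as (chain, resnum, name, conformer); the moving CA has two conformers.
target = [('A', 10, 'CA', ' ')]
moving = [('A', 10, 'CA', '1'), ('A', 10, 'CA', '2')]

# Restricting the moving set to one conformer removes the ambiguity.
moving_conf1 = [a for a in moving if a[3] == '1']
print(pair_atoms(moving_conf1, target))
```

Passing the unfiltered moving list to this sketch raises an AssertionError, because both conformers share the same key.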

Refinement

Stunted convergence: Refinement started far from convergence will sometimes do better with additional cycles specified with --new_batch. However, this restarts the history, spoiling impact analysis. Overlay can often bring the structure close enough to the target to avoid this dilemma - this is appropriate when local conformational analysis is important, but differences in absolute coordinates are not. (Stunted convergence likely arises when the starting point is so far from the answer that many of the initial partial derivatives are poorly correlated with those needed later. The history that is used by L-BFGS to accelerate well-behaved convergence can therefore become a liability, and the --new_batch option may be helping by erasing the history.)
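The effect of discarding the optimizer history can be imitated with SciPy's general-purpose L-BFGS-B minimizer. This is only an analogy: the test function (rosen), the batch size and the loop are assumptions, and nothing here reflects how Superpose's own optimizer or its batch option is implemented.

```python
import numpy
from scipy.optimize import minimize, rosen, rosen_der

x = numpy.array([-1.2, 1.0, -1.2, 1.0])  # a start far from the minimum
f_start = rosen(x)

# Run short batches. Each minimize() call builds its curvature history
# from scratch, so every new batch begins with a clean slate.
for batch in range(4):
    result = minimize(rosen, x, jac=rosen_der, method='L-BFGS-B',
                      options={'maxiter': 10})
    x = result.x
    print(batch, result.fun)

f_final = result.fun
```

Each batch resumes from the coordinates reached by the previous one, but none of the earlier gradient history carries over.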

Installation:

Superpose has been developed as part of the RSRef package. It can also be installed stand-alone, as a pure Python program with its subset of RSRef module dependencies: argparser.py, atoms.py, cmd_extensions.py, cmd2nest.py, impact.py, optimize.py, rotamer.py, superpose_user_doc.py, torsion.py & transform.py. External packages cmd2, NumPy and SciPy are also required.

Credits

Libraries used in rigid-group and torsion angle optimizations were first programmed by Brynmor K. Chapman. Other programming by Michael S. Chapman.

Citations coming soon, hopefully.

Colophon

This introductory documentation is generated from the source file: superpose_doc.rst. Its dependents are extracted from specific python modules using documentation.py, see instructions within.