Superpose: Rigid-group / Flexible Structure Overlay

Section author: Michael S. Chapman <chapmanms@missouri.edu>

Authors:

Michael S. Chapman <chapmanms@missouri.edu> & Brynmor K. Chapman,

Oregon Health & Science University and University of Missouri

Version:

1, Nov 26, 2024

Usage:
  • SuperPose [-option(s)] <moving.pdb> <target.pdb>

  • superpose.py [-option(s)] <moving.pdb> <target.pdb>

Synopsis

Constrained least-squares super-imposition of molecules parameterized as rigid-groups and/or proteins with flexible phi, psi backbone torsion angles:

Rigid-group

Multiple rigid groups can be refined simultaneously (subsets and/or non-overlapping selections), allowing inter-domain motions to be characterized independently of the overall superposition.

Torsion angle

Restriction to the phi, psi backbone dihedrals within user-defined regions, and/or a parsimony restraint, are used for conservative modeling without over-fitting. The algorithms are suited for domain-sized changes, and differ from more prevalent approaches, designed for local optimization, in which atoms from different segments refine at different times. In our global approach, all connected atoms rigidly rotate about each torsion angle. To maintain convergence, dihedral rotations (and their partial derivatives) are linked to rigid-body rotations that minimize changes to the moments of inertia. Refinement restrained by parsimony can be used to find the torsion angles most critical to a conformational transition. Side chains are currently riding, i.e. they move passively with the backbone, without rotation of chi dihedrals.
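The statement that all connected atoms rotate rigidly about each torsion angle can be illustrated with a minimal sketch. It uses Rodrigues' rotation formula on a toy four-atom chain. The function names, the coordinates and the convention that "downstream" means a higher list index are assumptions made for illustration; this is not the program's code, and it omits the compensating rigid-body rotation described above.

```python
import math

def rotate_about_axis(point, origin, axis, angle_rad):
    """Rotate `point` about the line through `origin` along `axis` (Rodrigues' formula)."""
    norm = math.sqrt(sum(a * a for a in axis))
    k = [a / norm for a in axis]                   # unit vector along the bond
    v = [p - o for p, o in zip(point, origin)]     # position relative to the axis
    cos_t, sin_t = math.cos(angle_rad), math.sin(angle_rad)
    k_dot_v = sum(ki * vi for ki, vi in zip(k, v))
    k_cross_v = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    rotated = [
        v[i] * cos_t + k_cross_v[i] * sin_t + k[i] * k_dot_v * (1.0 - cos_t)
        for i in range(3)
    ]
    return [r + o for r, o in zip(rotated, origin)]

def apply_torsion(coords, bond_start, bond_end, angle_deg):
    """Rotate every atom listed after `bond_end` about the bond_start -> bond_end axis."""
    origin = coords[bond_start]
    axis = [e - s for e, s in zip(coords[bond_end], coords[bond_start])]
    angle = math.radians(angle_deg)
    return [
        rotate_about_axis(xyz, origin, axis, angle) if i > bond_end else list(xyz)
        for i, xyz in enumerate(coords)
    ]

# Toy chain (made-up coordinates): the first bond lies along z.
chain = [[0.0, 0.0, 0.0], [0.0, 0.0, 1.5], [1.0, 0.0, 2.0], [2.0, 0.0, 2.0]]
moved = apply_torsion(chain, bond_start=0, bond_end=1, angle_deg=90.0)
```

The two downstream atoms move together, so the distance between them is unchanged, while the atoms that define the bond stay in place.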

Sources of Documentation

All of the following should be referenced:

Program options

Brief explanations of the arguments are given with superpose.py -h; further information is given below.

Commands

Within the program (pasto.superpose> prompt), commands are listed with help. Add the -v option to list synopses; use help <command> for details on one command, help -l for details on all commands, and help -rl to descend recursively through subcommands. The help text is reproduced later in this document.

Examples

examples/virus-5A has an example, compare.sh, that is documented as number 7 in examples/virus-5A/README.txt. Other examples to come.

Details, API

Details are encoded within docstrings that are accessible to programmers using Integrated Development Environments (IDEs). They are also compiled with sphinx into html files, linked from the module index on the home page. This is the searchable, cross-linked (API) reference documentation that explains the meaning of parameters, the performance of different functions, etc.

The documentation is accessed from (index.html) on-line or in the distribution directory doc/html. (Additional formats can be generated with sphinx.)

Command-line options

The most up-to-date documentation is generated from superpose.py -h:

(Command: /trihome/chapmanms/Devel/RSRef/FTatom/pasto/superpose.py -h)

(Sources: /trihome/chapmanms/Devel/RSRef/FTatom/pasto v1.0.6)

usage: superpose.py [-h] [--in_nomenclature STR] [--out_nomenclature STR] [--version] [--infile FILE] [--outfile FILE] [--torsion_limit FLOAT] [--torsion_weight FLOAT] [--units_per_A FLOAT] [--units_per_rad FLOAT] [--units_per_torsion FLOAT] [--pdb_out FILE] [--impact_annotation FILE] [MOVING.PDB] [TARGET.PDB] …

Superimposition of structures by rigid-group / torsion optimization. (c) OHSU 2010-18; University of Missouri 2018-20, Michael S. Chapman

positional arguments:
… Program commands may follow required MOVING.PDB & TARGET.PDB. Commands are space-separated, quoted if containing white space.

options:
-h, --help

show this help message and exit

--version, -v

show program’s version number and exit

--infile FILE

Redirected standard input, like “<”. (default: <_io.TextIOWrapper name='<stdin>' mode='r' encoding='utf-8'>)

--outfile FILE

Redirected standard output, like “>”. (default: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)

Nomenclature translation:
--in_nomenclature STR, -q STR

From input: bmrb, cif, cns, diana, midas, msi, pdb92, sc, sybyl, ucsf, xplor, lax; lax for no translation, None for auto-determine (default: None)

--out_nomenclature STR, -Q STR

To output: bmrb, cif, cns, diana, midas, msi, pdb92, sc, sybyl, ucsf, xplor; default: same as --in_nomenclature (-q) if defined, else internal convention if None (default: None)

Input model parameters:

MOVING.PDB

Input PDB file, moving atoms. (default: None)

TARGET.PDB

Input PDB file, target atoms. (default: None)

Optimization / target function:
--torsion_limit FLOAT, -P FLOAT

Sum of torsion angle changes (deg.) beyond which a restraint is imposed. (default: 0.0)

--torsion_weight FLOAT, -p FLOAT

Weight: restraint on total torsion angle change. (default: None)

Model parameterization for refinement:
--units_per_A FLOAT

Parameter scaling in refinement: Angstrom (positions). Decrease for sensitivity. Inverse of (physical units x value) should approximately reflect relative importance to residual. (default: 1.0)

--units_per_rad FLOAT

Parameter scaling in refinement: Rigid-group rotations. Decrease for sensitivity. Inverse of (physical units x value) should approximately reflect relative importance to residual. (default: 10.0)

--units_per_torsion FLOAT

Parameter scaling in refinement: Dihedral bond rotations (rad). Decrease for sensitivity. Inverse of (physical units x value) should approximately reflect relative importance to residual. (default: 10.0)

Output options:
--pdb_out FILE, -o FILE

Output PDB file. (default: superpose.pdb)

--impact_annotation FILE, -I FILE

Additional plot commands for annotating impact graph. (default: None)

+<file> inserts options from <file>, one per line.
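The "+<file>" convention can be sketched with Python's argparse, whose fromfile_prefix_chars setting reads one argument per line from a file. The usage text above looks like argparse output, but whether superpose.py relies on exactly this mechanism is an assumption. The parser below is a stand-in that declares only a few of the documented options, and the option values are arbitrary.

```python
import argparse
import os
import tempfile

# Stand-in parser (not the program's own): a few documented options only.
parser = argparse.ArgumentParser(prog="superpose.py", fromfile_prefix_chars="+")
parser.add_argument("--torsion_limit", "-P", type=float, default=0.0)
parser.add_argument("--torsion_weight", "-p", type=float, default=None)
parser.add_argument("--pdb_out", "-o", default="superpose.pdb")
parser.add_argument("moving", nargs="?")
parser.add_argument("target", nargs="?")

with tempfile.TemporaryDirectory() as tmp:
    options_path = os.path.join(tmp, "options.txt")
    with open(options_path, "w") as handle:
        # One option per line; the values are arbitrary illustrations.
        handle.write("--torsion_limit=30\n--torsion_weight=0.5\n--pdb_out=fitted.pdb\n")
    # The leading "+" tells argparse to splice in the file's lines as arguments.
    args = parser.parse_args(["+" + options_path, "moving.pdb", "target.pdb"])
```

Writing each option as --name=value keeps the option and its value on a single line, which is what the one-per-line rule requires.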

Command interpreter

The command interpreter is used for program control in stand-alone mode, but generally not when called from / embedded in another package. After the program has been invoked with options/arguments that are parsed, program flow is controlled with a series of user-entered interactive commands.

These run-time commands are interpreted with an extension (cmd2nest) of the cmd2 package that adds support for nested sub-commands. The following summarizes the essentials of program control and notes differences from cmd2. The cmd2 documentation provides additional detail.

Syntax:

Commands are entered at the prompt in a unix-shell style:
  • command [option(s)] [positional argument(s)]

  • where options can be provided as -x [value] (x is a single letter), or --option[[=]value] in “two-dash” long form.

  • Various standard short-cuts are pre-defined, and tab completion is available for commands and arguments.

    Error

    The abbreviation given in the help is not always long enough to be unique (a bug inherited from cmd2).

Error

Tab completion fails after exiting a sub-command.

Shell:

The cmd2 shell-like interface is inherited, offering history, command editing and redirects. Redirects should work (<, |, >).

Hierarchical structure:

Sub-commands are only available after entering the command. Higher-level commands are generally not available in sub-commands. The exceptions are general utility commands such as shell, shortcuts & set. Help, by default, is specific to the command level. The --descend (-d) option documents one sub-command level deeper & --recursive --long (-rl) descends through all levels exhaustively. Note that load (“@”) & related commands, and history, do not transcend different command levels.

Just-in-time calculation & pre-requisites:

A number of efficiencies are possible by pre-calculating and repeatedly using objects. Rather than pre-calculating at startup all objects that might be needed, the program attempts to calculate the minimal needed, just-in-time. For the most part, the pre-requisites are figured out and tasks are executed when needed using pre-assigned (or default) parameters. One exception is that any command with a “parameterize” pre-requisite will issue an error message if not already performed (mind-reading is not an option!).

The order that commands are entered is sometimes important, particularly when the embedded python interpreter is invoked with “py” (see below). Given the flexibility of the “py” command, there is no way to figure out the pre-requisites. Users should be especially attentive to AttributeErrors that might indicate an unmet pre-requisite dependence.
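The just-in-time idea can be sketched with functools.cached_property: an object is computed the first time it is needed and reused afterwards, while a step that needs a user decision raises an error rather than guessing. The Session class, its attribute names and the error text are hypothetical; this is a toy model of the behavior described, not the program's implementation.

```python
from functools import cached_property

class Session:
    """Toy task-space: objects are built on first use and then reused."""

    def __init__(self, moving, target):
        self.moving = moving
        self.target = target
        self.parameterization = None   # must be set explicitly by the user
        self.pairings_built = 0        # counts how often the pairing is computed

    @cached_property
    def pairs(self):
        # Computed only when first requested, then cached on the instance.
        self.pairings_built += 1
        return [(name, name) for name in self.moving if name in self.target]

    def parameterize(self, description):
        self.parameterization = description

    def refine(self):
        # A "parameterize" pre-requisite cannot be guessed, so it is an error.
        if self.parameterization is None:
            raise RuntimeError("Error: parameterize must be run before refine")
        return len(self.pairs)

session = Session(moving=["N", "CA", "C", "O"], target=["N", "CA", "C"])
try:
    session.refine()
except RuntimeError as err:
    first_attempt = str(err)
session.parameterize("overall")
n_first = session.refine()
n_second = session.refine()    # reuses the cached pairing
```

The second refine call does not rebuild the pairing, which is the efficiency that pre-requisite caching is meant to give.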

Error recovery:

Inherited from cmd2, exceptions are captured at the Command level, printing at least an error message, but without aborting the whole program. On interactive use, this conveniently often offers a second chance. If run as a script, users should search the output for “Error”, lest one has scrolled by. The default is a terse error message, but this can be changed to a full traceback using “set debug True” (still does not abort).
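For scripted runs, the search for “Error” can be automated. The following is a minimal sketch that scans captured output line by line. The sample output is invented for illustration, and real messages will be worded differently.

```python
def find_error_lines(output_text, marker="Error"):
    """Return (line_number, line) pairs for output lines containing `marker`."""
    return [
        (number, line)
        for number, line in enumerate(output_text.splitlines(), start=1)
        if marker in line
    ]

# Hypothetical captured output; the message text is a stand-in.
sample_output = """pasto.superpose> pair
pasto.superpose> refine
Error: example message standing in for a real one
pasto.superpose> exit
"""
hits = find_error_lines(sample_output)
```

In practice the text would come from the file named with --outfile or from a redirected log.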

Selected commands - implementation-specific extensions & limitations.

@FILE or run_script FILE

Used to run commands from an external file. The limitation is that commands cannot descend/ascend through nested sub-commands. Thus, for example, commands within parameterize would have to be given separately. The same limitation applies to variants _relative_run_script (@@).

Command-line commands

Tokens following any of the program’s required positional parameters (e.g. INPUT.PDB) will be run as top-level commands. There is no support for transcending sub-commands or for command options:

Example: rsref.py /dev/null help shortcuts exit

Options can be incorporated by using the cmd2 run_script shortcut on the command line (Example1), where cmd.txt has one command per line that can include options and whitespace. Even better, use the --infile option (imported from argparser).

Example1: rsref.py /dev/null @cmd.txt exit
Example2: rsref.py /dev/null --infile=cmd.txt
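The same pattern can be prepared from Python for superpose.py. This sketch writes a command file and assembles an argument list that could be handed to subprocess.run; nothing is executed here. The file names are placeholders, the helper functions are hypothetical, and the commands are limited to top-level ones because a command file cannot descend into sub-commands.

```python
import os
import tempfile

def write_command_file(path, commands):
    """Write one interpreter command per line."""
    with open(path, "w") as handle:
        handle.write("\n".join(commands) + "\n")

def build_argv(moving_pdb, target_pdb, command_file):
    """Assemble an argument list for subprocess.run (not executed here)."""
    return ["superpose.py", "--infile", command_file, moving_pdb, target_pdb]

with tempfile.TemporaryDirectory() as tmp:
    command_file = os.path.join(tmp, "cmd.txt")
    write_command_file(command_file, ["help -v", "exit"])
    with open(command_file) as handle:
        written = handle.read()
    argv = build_argv("moving.pdb", "target.pdb", command_file)
```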

done, exit and quit

These are near synonyms to mitigate a problem with cmd2’s error-handling in scripted runs with sub-commands. On an exception within a sub-command, the program terminates (just) the sub-command, and continues reading commands that had been intended for the terminated sub-command, but applies them mistakenly to continued execution in the higher level from where the sub-command was invoked. Should a quit (or exit), intended for a sub-command, be encountered at top-level, the program can terminate before any results are saved. To avoid this, the base cmd2 commands have been overridden:

  • exit is only available from the top level.

  • done is only available from sub-commands.

  • quit is unsafely available from both.

Thus, if done is used exclusively to finish sub-commands, an accidental done at the top level leads only to an unrecognized-command error, and the remaining top-level commands (e.g. saving results) will be executed before the program is terminated with exit. The unsafe quit can be used interactively and repeatedly to bail out of a failed run if the sub-command level is unclear.
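The hazard can be made concrete with a toy two-level interpreter. It is a simplified model of the behavior described above, not cmd2 or cmd2nest code: FAIL stands for any command that raises an exception, and save is a hypothetical stand-in for whatever top-level command writes results.

```python
def run_script(commands):
    """Replay `commands` through a toy two-level interpreter and log what happens."""
    level = 0            # 0 = top level, 1 = inside a sub-command
    log = []
    for command in commands:
        if command == "parameterize" and level == 0:
            level = 1
            log.append("entered sub-command")
        elif command == "FAIL":
            # Stand-in for a command that raises: only the current
            # sub-command is abandoned, and script reading continues.
            log.append("error: command failed")
            level = 0
        elif command in ("done", "quit") and level == 1:
            level = 0
            log.append("left sub-command")
        elif command in ("quit", "exit") and level == 0:
            log.append("program terminated by " + command)
            break
        elif command == "save" and level == 0:
            log.append("results saved")
        else:
            log.append("error: unrecognized command " + command)
    return log

# The same script, finishing the sub-command with quit or with done.
with_quit = run_script(["parameterize", "FAIL", "quit", "save", "exit"])
with_done = run_script(["parameterize", "FAIL", "done", "save", "exit"])
```

With quit, the script ends at the top level before save is reached. With done, the stray command is merely reported as unrecognized and the results are still saved.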

python interpreter

py (without statement) opens a python shell within which multiple statements can be executed, terminating the shell with Ctrl-D, exit() or quit(). These provide powerful ways of customizing the programs and extending functionality beyond the commands that are provided.

In our extension of cmd2, namespace my provides access to objects within the task-space of the program. Thus, for example, atomic B-factors could be printed or manipulated using my.atoms.b, and program options with my.option.resolution (for example).

The python shell is executed in its own namespace, so modules (such as sys or numpy) have to be imported explicitly.

Additional examples are given in sections “Python Interpreter for advanced functionality” in the documentation for specific programs.
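As a hedged illustration of the points above, the sketch below shows the kind of statements that might be typed in the python shell. So that it runs on its own, it first builds a stand-in for the my namespace with made-up B-factors; inside the program, my is provided and my.atoms.b may well be an array rather than a list. The cap of 30 is an arbitrary value.

```python
import statistics                     # modules must be imported explicitly
from types import SimpleNamespace

# Stand-in for the `my` namespace that the program supplies (assumption:
# a plain list is used here; the real attribute may be an array).
my = SimpleNamespace(atoms=SimpleNamespace(b=[12.5, 18.0, 25.5, 40.0]))

# Statements of the kind that could be entered at the python prompt:
mean_b = statistics.mean(my.atoms.b)
my.atoms.b = [min(b, 30.0) for b in my.atoms.b]   # cap B-factors at 30
print("mean B before capping:", mean_b)
```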

Limitations, bugs & work-arounds

The single-line variant, py statement, executing a single python statement, was deprecated in package cmd2 v2.4. There may be legacy scripts that will need updating.

  • Annotations

    py print("\nAnnotation for stdout") was a common use. Consider the alternative: shell echo -e "\nAnnotation for stdout" which will avoid additional output from interpreter start and termination. The shell alternative is only possible if access to python attributes is not required.

  • Redirects (> , < and |) on the py command line

    are captured by the cmd2 parser, not by any shell that might redirect I/O for the python interpreter. This means that shell here-documents and shell variables are not supported within py.

  • Support for compound single-line statements (py stmt1; stmt2)

    with a ‘;’ separator disappeared from cmd2 prior to v2.4. A common usage was to import a module, then use a module attribute. Subsequently, code following ‘;’ was ignored silently, affecting some legacy scripts.

All of these issues are by-passed by invoking the full python shell instead of the single-line py command.

Other commands available in all applications

Use application help <command> for further details:

alias

Manage aliases

edit

Run a text editor and optionally open a file with it

exit

Safe program exit from top level, not subcommands.

help

List available commands or help for one or all commands

history

View, run, edit, save, or clear previously entered commands

macro

Manage macros

parrot

Echo commands (or not)

py

Invoke Python shell, ending exit(); single line “py command” no longer supported; see comments above and also shell

quit

Exit this application

run_pyscript

Run a Python script file inside the console

run_script

Run commands in script file that is encoded as either ASCII or UTF-8 text

set

Set a settable parameter or show current settings of parameters

shell

Execute a command as if at the OS prompt

shortcuts

List available shortcuts

test

test [<name>]: run test (developers only)

Command list

The most up-to-date documentation is generated from superpose.py /dev/null help exit:

Documented commands (use 'help -v' for verbose/'help <topic>' for details):
===========================================================================
alias     evaluate  history  parameterize  phipsi_diff  run_pyscript  shell
analyze   exit      macro    parrot        py           run_script    shortcuts
bfactors  flip      overlay  pdbout        quit         select        test
edit      help      pair     phipsi_copy   refine       set

Miscellaneous help topics:
==========================
SELECTION_EXPR

Undocumented commands:
======================
backup  restore

pasto.superpose>  exit

Command Help

The most up-to-date documentation is generated from superpose.py then help -rl

alias

Usage: alias [-h] SUBCOMMAND ...

Manage aliases

An alias is a command that enables replacement of a word by another string.

optional arguments:
  -h, --help  show this help message and exit

subcommands:
  SUBCOMMAND
    create    create or overwrite an alias
    delete    delete aliases
    list      list aliases

See also:
  macro

analyze

usage: analyze [-h] [-s]

Analyze prior refinement: shifts, gradients & impact of parameters.

options:
  -h, --help     show this help message and exit
  -s, --summary  Summary only without itemizing groups or dihedrals.

analyze sub-commands

convergence
usage: convergence [-h] [-s]

Analyze the gradients and shifts of prior refinement.

options:
  -h, --help     show this help message and exit
  -s, --summary  Summary only without itemizing groups or dihedrals.
dihedrals
usage: dihedrals [-h]

Changes in torsion angles and their impact

options:
  -h, --help  show this help message and exit
dihedrals sub-commands
hinges
usage: hinges [-h] [-i] [-g INT] [-t FLOAT] [-p] [-s START] [-e END]

Find hinges in dihedral changes.

options:
  -h, --help            show this help message and exit
  -i, --information     Report statistics, then exit immediately.
  -g INT, --gap INT     # residues w/o dihedral changes that can be bridged in a hinge
  -t FLOAT, --threshold FLOAT
                        above which dihedral rotations considered hinges. Degrees if > 1.0, else fraction of total change.
  -p, --pseudo          combine phi_i with psi_i-1, else individual dihedrals.
  -s START, --start START
                        start of an explicitly defined hinge: chain-residue, requires --end, repeatable.
  -e END, --end END     end of an explicitly defined hinge: chain-residue, requires --start, repeatable.
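One possible reading of the --threshold and --gap options is sketched below. It assumes that a threshold above 1.0 is taken in degrees, that a smaller value is a fraction of the summed absolute dihedral change, and that residues above the threshold are merged into one hinge when no more than gap unchanged residues separate them. These are assumptions drawn from the help text, the function names are hypothetical, and the program's actual algorithm may differ.

```python
def threshold_in_degrees(threshold, changes_deg):
    """Interpret `threshold` as degrees if > 1.0, else as a fraction of total change."""
    if threshold > 1.0:
        return threshold
    return threshold * sum(abs(c) for c in changes_deg)

def find_hinges(changes_deg, threshold, gap=0):
    """Return (first, last) index ranges of residues forming hinges."""
    limit = threshold_in_degrees(threshold, changes_deg)
    hinges = []
    for index, change in enumerate(changes_deg):
        if abs(change) <= limit:
            continue                       # not a hinge candidate
        if hinges and index - hinges[-1][1] - 1 <= gap:
            hinges[-1][1] = index          # bridge the gap, extend current hinge
        else:
            hinges.append([index, index])  # start a new hinge
    return [tuple(h) for h in hinges]

# Made-up per-residue dihedral changes in degrees.
changes = [1.0, 25.0, 2.0, 30.0, 0.0, 0.0, 0.0, 40.0, 1.0]
bridged = find_hinges(changes, threshold=10.0, gap=1)
separate = find_hinges(changes, threshold=10.0, gap=0)
fractional = find_hinges(changes, threshold=0.2, gap=1)
```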
impact
usage: impact [-h] [-i] [-I IDENTIFY] [-p] [-R] [-r]

Estimate impact of refined torsion angle changes on superimposition.

options:
  -h, --help            show this help message and exit
  -i, --iterative       Iteratively determines/applies highest impact dihedral changes. This rarely-needed option is compute-
                        intensive, because it iteratively finds the single, highest impact dihedral, applies the rotation, and
                        repeats. Otherwise, by default, impact is assessed more rapidly from the dot product of the gradient and
                        shift vectors, integrated over refinement iterations.
  -I IDENTIFY, --identify IDENTIFY
                        List this number of the top-impact dihedrals.
  -p, --pseudo          Use approximation to pseudo-torsion angles (phi_i + psi_{i-1}) instead of individual phi, psi. (Limits
                        output options, but speeds option iterative.)
  -R, --recover         Use previous calculation of impact, other options invalid.
  -r, --rmsd            Instead of the residual, use its square root (RMSD in superpose), decreasing influence of early cycles.
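The default, non-iterative estimate can be sketched as follows, assuming that "dot product of the gradient and shift vectors, integrated over refinement iterations" is decomposed per parameter as the sum over iterations of gradient times shift. To first order this is each parameter's contribution to the change in residual, so negative values mean improvement. The function name and the numbers are illustrative only.

```python
def parameter_impact(gradients, shifts):
    """Sum, over iterations, of gradient * shift for each parameter."""
    n_params = len(gradients[0])
    impact = [0.0] * n_params
    for grad, shift in zip(gradients, shifts):    # one entry per iteration
        for k in range(n_params):
            impact[k] += grad[k] * shift[k]
    return impact

# Two refinement iterations, three dihedrals (made-up values).
gradients = [[-4.0, -1.0, 0.0], [-2.0, -0.5, 0.0]]
shifts = [[0.5, 0.25, 0.0], [0.25, 0.5, 0.0]]

impact = parameter_impact(gradients, shifts)
total = sum(impact)
percent = [100.0 * value / total for value in impact]
```

Expressing each term as a percentage of the total gives the kind of number that the color sub-command describes writing into the B-factor column.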
impact sub-commands
color
usage: color [-h] [-p] file

Output coordinates, B-factors set to percent impact on fit of dihedrals.

positional arguments:
  file          PDB output

options:
  -h, --help    show this help message and exit
  -p, --pseudo  Use pseudo-torsion angle (phi_i + psi_i-1), else individual phi, psi; ignored if impact --pseudo.
pickle
usage: pickle [-h] file

Save impact-needed data, so impact.py can replot without recalculation (when tweaking plot). WARNING: objects pickled by superpose
must be compatible with impact - has not been checked recently.

positional arguments:
  file        output jar format, pickled impact object for later analysis.

options:
  -h, --help  show this help message and exit
plot
usage: plot [-h] [-p impact|change] [-c] [-t] [-f FILE] [-A FILE] [-O]

Graph impact of dihedrals on fit.

options:
  -h, --help            show this help message and exit
  -p impact|change, --prior impact|change
                        Replace impact or dihedral change with values from previous refinement (to superimpose restrained impact
                        upon unrestrained dihedral changes).
  -c, --changeOnly      plot only dihedral changes, not estimate of impact.
  -t, --tty             Display graph on terminal (not recommended for background jobs).
  -f FILE, --file FILE  save plot in graphics file of type given by extension (.emf, eps, jpeg, jpg, pdf, png, ps, raw, rgba, svg,
                        svgz, tif, tiff)
  -A FILE, --annotation FILE
                        File containing additional plot commands, overriding command-line input.
  -O, --overall         Dihedral changes (not impact) relative to reference (input) structure not refinement batch.
print
usage: print [-h] [-O]

Tabulate changes in dihedrals & impact on fit.

options:
  -h, --help     show this help message and exit
  -O, --overall  Dihedral changes (not impact) relative to reference (input) structure not refinement batch.

(Also: alias, done, edit, help, history, macro, parrot, py, quit, run_pyscript, run_script, set, shell, shortcuts, test (documented in parent dihedrals command.) (end impact sub-commands)


paint
usage: paint [-h] [-p] [-n] file

Output coordinates, B-factors set to dihedral change for visualization.

positional arguments:
  file              PDB output

options:
  -h, --help        show this help message and exit
  -p, --pseudo      Use pseudo-torsion angle (phi_i + psi_i-1), else individual phi, psi.
  -n, --normalized  Scaled between 0 & 100.

Output PDB file has B-factors replaced by magnitude of phi/psi changes!

(Also: alias, done, edit, help, history, macro, parrot, py, quit, run_pyscript, run_script, set, shell, shortcuts, test (documented in parent analyze command.) (end dihedrals sub-commands)


done
usage: done [-h]

Safe sub-menu exit, returning to higher level command.

options:
  -h, --help  show this help message and exit

(Also: alias, edit, help, history, macro, parrot, py, quit, run_pyscript, run_script, set, shell, shortcuts, test (documented in parent pasto.superpose command.) (end analyze sub-commands)


backup

usage: backup [-h] FILE

Save state: coordinates, history, refinement stats

positional arguments:
  FILE        pickle jar format

options:
  -h, --help  show this help message and exit

bfactors

usage: bfactors [-h] [-v] [-p OUTPUT_type] [-f FILE] [-c PDB_out] [-s {individual,group,overall,None}] [selection]

B-factor statistics comparing current model to target.

positional arguments:
  selection             SELECTION_EXPR | SELECTION | COLLECTION: Group(s) of atoms over which average statistics are to be output.
                        SELECTION_EXPR: boolean expr; see help SELECTION_EXPR. COLLECTION: a dictionary-like Group of SELECTIONs
                        (defined by prior select command) over which statistics are iterated. SELECTION: COLLECTION['ITEM']
                        previously defined with a select command. COLLECTION or ['ITEM'] can be omitted if defaults were used in
                        the select command.

options:
  -h, --help            show this help message and exit
  -v, --verbose         Including per-residue information.
  -p OUTPUT_type, --plot OUTPUT_type
                        Plot to "screen" or "file" an analysis of anisotropic Bs and coordinate differences. Repeat for both.
  -f FILE, --file FILE  base of the file name (for --plot=file) into which "_cmp_U" and "_U_vs_xyz" will be inserted for 2 plots.
                        The extension designates the file-type: .emf, eps, jpeg, jpg, pdf, png, ps, raw, rgba, svg, svgz, tif,
                        tiff.
  -c PDB_out, --color PDB_out
                        Output pdb files with B-factors replaced for molecular graphic illustration of the following statistics.
                        The argument gives the base file name, which is modified to indicate the contents of the B-factor column:
                        "_A"- 100 x Anisotropy of moving structure; "_U_vs_U"- 100 x |cos(deviation U_moving vs. U_target)|;
                        "_U_vs_xyz"- 100 x |cos(deviation U_moving vs. coordinate difference vector)|; "_div"- Kullback-Leibler
                        divergence for U_moving vs. U_target; "_div_scaled"- like "_div" except scaled by B_iso for each atom
                        (like Merritt, 2011) - this is only output if SCALING is not "individual". ["_div" and "_div_scaled" are
                        provided for comparison and may be deprecated.] Use -c None to disable output of all files.
  -s {individual,group,overall,None}, --scaling {individual,group,overall,None}
                        Scaling of anisotropic Us for atom size (B_iso) by individual (atom) or group or overall (by means of all
                        paired atoms).

edit

Usage: edit [-h] [file_path]

Run a text editor and optionally open a file with it

The editor used is determined by a settable parameter. To set it:

  set editor (program-name)

positional arguments:
  file_path   optional path to a file to open in editor

optional arguments:
  -h, --help  show this help message and exit

evaluate

usage: evaluate [-h] [-v] [-c PDB_file] [-s float]

Statistics for current model compared to target.

options:
  -h, --help            show this help message and exit
  -v, --verbose         Including per-residue information.
  -c PDB_file, --color PDB_file
                        Output superimposed structure, B replaced by |coordinate difference|.
  -s float, --scale float
                        Scale by which differences in PDB_file will be multiplied.
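The quantities involved can be sketched in a few lines: a per-atom coordinate difference between paired atoms, an overall RMSD, and the scaled differences that --color would place in the B-factor column. The pairing is taken as given (the same index in both lists), and the function names, coordinates and scale factor are illustrative assumptions rather than the program's code.

```python
import math

def coordinate_differences(moving, target):
    """Distance between each pair of corresponding atoms."""
    return [math.dist(m, t) for m, t in zip(moving, target)]

def rmsd(differences):
    """Root-mean-square of the per-atom distances."""
    return math.sqrt(sum(d * d for d in differences) / len(differences))

# Two paired atoms with made-up coordinates.
moving = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
target = [(0.0, 0.0, 0.0), (1.0, 3.0, 4.0)]

differences = coordinate_differences(moving, target)
overall = rmsd(differences)
scale = 10.0                                       # arbitrary --scale value
b_column = [scale * d for d in differences]        # values for visualization
```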

exit

usage: exit [-h]

Safe program exit from top level, not subcommands.

options:
  -h, --help  show this help message and exit

flip

usage: flip [-h] [-R] [-N] [-D] [-Q] [-E] [-H] [-F] [-Y] [-a] [-v VERBOSITY]

Chi rotation of pseudo-symmetrical side chains to match target.

options:
  -h, --help            show this help message and exit
  -R, --arg             Include arginines.
  -N, --asn             Include asparagines.
  -D, --asp             Include aspartates.
  -Q, --gln             Include glutamines.
  -E, --glu             Include glutamates.
  -H, --his             Include histidines.
  -F, --phe             Include phenylalanines.
  -Y, --tyr             Include tyrosines.
  -a, --all             All amino acids, default if none specified.
  -v VERBOSITY, --verbosity VERBOSITY
                        -1 (quiet) to +1 (verbose)

help

usage: help [-h] [-v] [-d] [-l] [-r] [command] ...

List available commands or help for one or all commands

positional arguments:
  command          [command-name, optional]
  subcommands      subcommand(s) to retrieve help for

options:
  -h, --help       show this help message and exit
  -v, --verbose    single-line description of each command
  -d, --descend    Document not the command(s) but sub-commands thereof
  -l, --long       Full (multi-line) help for each command.
  -r, --recursive  Recursively descend through nested command sets.

history

Usage: history [-h] [-r | -e | -o FILE | -t TRANSCRIPT_FILE | -c] [-s] [-x] [-v] [-a] [arg]

View, run, edit, save, or clear previously entered commands

positional arguments:
  arg                   empty               all history items
                        a                   one history item by number
                        a..b, a:b, a:, ..b  items by indices (inclusive)
                        string              items containing string
                        /regex/             items matching regular expression

optional arguments:
  -h, --help            show this help message and exit
  -r, --run             run selected history items
  -e, --edit            edit and then run selected history items
  -o, --output_file FILE
                        output commands to a script file, implies -s
  -t, --transcript TRANSCRIPT_FILE
                        output commands and results to a transcript file,
                        implies -s
  -c, --clear           clear all history

formatting:
  -s, --script          output commands in script format, i.e. without command
                        numbers
  -x, --expanded        output fully parsed commands with any aliases and
                        macros expanded, instead of typed commands
  -v, --verbose         display history and include expanded commands if they
                        differ from the typed command
  -a, --all             display all commands, including ones persisted from
                        previous sessions

macro

Usage: macro [-h] SUBCOMMAND ...

Manage macros

A macro is similar to an alias, but it can contain argument placeholders.

optional arguments:
  -h, --help  show this help message and exit

subcommands:
  SUBCOMMAND
    create    create or overwrite a macro
    delete    delete macros
    list      list macros

See also:
  alias

overlay

usage: overlay [-h] [-m SELECTION_EXPR|SELECTION]

Apply the rigid transformation that LSQ overlays moving atoms on paired targets.

options:
  -h, --help            show this help message and exit
  -m SELECTION_EXPR|SELECTION, --moving SELECTION_EXPR|SELECTION
                        Selection of atoms to move; default: all. SELECTION_EXPR: boolean expr; see help SELECTION_EXPR.
                        SELECTION: COLLECTION['ITEM'] previously designated with a select command. COLLECTION or ['ITEM'] can be
                        omitted if defaults used in select command. This option does not change the pairing or the transformation
                        operator.

pair

usage: pair [-h] [-e] [-m SELECTION|SELECTION_EXPR] [-t SELECTION_EXPR]

Pair atoms between moving & target structures; pre-requisite of flip, refine.

options:
  -h, --help            show this help message and exit
  -e, --extend          Extend prior pairing, else start anew.
  -m SELECTION|SELECTION_EXPR, --moving SELECTION|SELECTION_EXPR
                        Moving atoms for this pair session. SELECTION_EXPR: boolean expr; see help SELECTION_EXPR. SELECTION:
                        COLLECTION['ITEM'] previously designated with a select command. COLLECTION or ['ITEM'] can be omitted if
                        defaults used in select command.
  -t SELECTION_EXPR, --target SELECTION_EXPR
                        Target atoms for this pair session.

pair sub-commands

done
usage: done [-h]

Safe sub-menu exit, returning to higher level command.

options:
  -h, --help  show this help message and exit
match
usage: match [-h] [-v ATTR] [-p ATTR]

Pair atoms by attribute value (-v) and/or position (-p; order).

options:
  -h, --help            show this help message and exit
  -v ATTR, --value ATTR
                        Match attribute by value (repeatable).
  -p ATTR, --position ATTR
                        Match attribute by position (order; repeatable).

ATTR is specified as in "select --attr", white-space replaced with "_". Default: --value atom_name --value=residue_num -v
conformer --position chain, (-v None (or -p None) avoids default, disabling value (position) matching.)
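The default matching can be sketched as a dictionary lookup. It assumes that "by value" means the attribute must be equal in both structures, and that "by position" means chains are compared by their order of first appearance, so the first chain of the moving structure pairs with the first chain of the target whatever their identifiers. This is an interpretation of the help text, and the function names and atom records are hypothetical.

```python
def chain_rank(atoms):
    """Map each chain ID to its order of first appearance (0, 1, 2, ...)."""
    ranks = {}
    for atom in atoms:
        ranks.setdefault(atom["chain"], len(ranks))
    return ranks

def pair_atoms(moving, target, by_value=("atom_name", "residue_num", "conformer")):
    """Pair atoms whose chain order and value attributes all agree."""
    m_rank, t_rank = chain_rank(moving), chain_rank(target)

    def key(atom, rank):
        return (rank[atom["chain"]],) + tuple(atom[attr] for attr in by_value)

    lookup = {key(atom, t_rank): atom for atom in target}
    pairs = []
    for atom in moving:
        partner = lookup.get(key(atom, m_rank))
        if partner is not None:
            pairs.append((atom, partner))
    return pairs

def make_atom(chain, residue_num):
    return {"chain": chain, "residue_num": residue_num, "atom_name": "CA", "conformer": ""}

# Chain A of the moving structure corresponds to chain X of the target.
moving = [make_atom("A", 10), make_atom("A", 11), make_atom("A", 12)]
target = [make_atom("X", 10), make_atom("X", 11)]
pairs = pair_atoms(moving, target)
```

Residue 12 has no counterpart in the target and is left unpaired.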
print
usage: print [-h]

Report numbers of paired atoms.

options:
  -h, --help  show this help message and exit
within
usage: within [-h] [-m] [-t] selection

Limit atom matching to moving and/or target selection until redefined.

positional arguments:
  selection     SELECTION_EXPR | SELECTION

options:
  -h, --help    show this help message and exit
  -m, --moving  Selection is for moving atoms.
  -t, --target  Selection is for target atoms.

SELECTION_EXPR: boolean expr; see help SELECTION_EXPR. SELECTION: COLLECTION['ITEM'] previously designated with a select command.
SELECTION is usually defined only for --moving atoms and will usually be invalid if applied to --target atoms. COLLECTION or
['ITEM'] can be omitted if defaults were used in the select command.

(Also: alias, edit, help, history, macro, parrot, py, quit, run_pyscript, run_script, set, shell, shortcuts, test (documented in parent pasto.superpose command.) (end pair sub-commands)


parameterize

usage: parameterize [-h] [-p]

For designated parameter type(s) (def. positions), select atoms to be refined & how.

options:
  -h, --help      show this help message and exit
  -p, --position  Designate which atomic positions (xyz, default) to be refined.

parameterize sub-commands

SELECTION: NAME | EXPRESSION
NAME: pre-defined selection created by the top level command “select”.

Examples: rigid[‘N_domain’] or my_subdomain (if previously defined).

EXPRESSION: array expression using keywords such as residue number, chain, atom number, synonyms or unique abbreviations. Criteria can be combined with python operators: &, |, ^, ==, !=, ~, >, >=, <, <=, (, ), ... or Fortran synonyms which will be translated into python: .and., .OR., .xor., .NEQV., .eq., /=, .NOT., .gt., .GE., etc. Use quotes to escape white space, and to avoid command line parsing redirection (>) or piping (|), use .GT., .ge., .OR. or ‘^’. Quoting EXPRESSION avoids shell interpretation of special characters.

Examples: chain==B ‘(chai == C) & (residue_ty != HOH)’ ‘(residue num <= 105) ^ ((resnum .ge. 300) & (resnam != ATP))’
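The translation of Fortran-style operators can be sketched with a regular expression. The mapping below follows standard Fortran meanings and is an assumption about what the program does: in particular, mapping both .xor. and .NEQV. to ^ is a guess. Only the operators are rewritten; keyword abbreviation and evaluation of the expression are not attempted.

```python
import re

# Assumed mapping from Fortran relational/logical operators to python ones.
FORTRAN_TO_PYTHON = {
    "and": "&", "or": "|", "xor": "^", "neqv": "^", "not": "~",
    "eq": "==", "ne": "!=", "gt": ">", "ge": ">=", "lt": "<", "le": "<=",
}

# Longest names first, so ".neqv." is tried before ".ne.".
_DOTTED = re.compile(
    r"\.(" + "|".join(sorted(FORTRAN_TO_PYTHON, key=len, reverse=True)) + r")\.",
    re.IGNORECASE,
)

def translate_operators(expression):
    """Replace Fortran-style operators (any letter case) with python operators."""
    translated = _DOTTED.sub(
        lambda match: FORTRAN_TO_PYTHON[match.group(1).lower()], expression
    )
    return translated.replace("/=", "!=")

example = "(residue_num <= 105) ^ ((resnum .ge. 300) .AND. (resnam /= ATP))"
translated = translate_operators(example)
```

Because the dotted forms contain no >, < or | characters, an expression written this way passes through the command parser without being mistaken for a redirect or pipe.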

clear
usage: clear [-h]

Switch off all refinement of requested parameter type.

options:
  -h, --help  show this help message and exit

Individual parameterizations can be switched off with "group None", "individual None", "torsion None", "overall False"
done
usage: done [-h]

Safe sub-menu exit, returning to higher level command.

options:
  -h, --help  show this help message and exit
group
usage: group [-h] [group]

Select atoms to be refined as one or more groups.

positional arguments:
  group       GROUP | COLLECTION | SELECTION | SELECTION_EXPR | GROUP_EXPR

options:
  -h, --help  show this help message and exit

SELECTION: a previously saved single Selection, given as: COLLECTION['ITEM']. (COLLECTION or ['ITEM'] can be omitted if defaults
were used in the corresponding select command.) SELECTION_EXPR: boolean Selection expression, see help select. GROUP_EXPR: list,
dictionary or tuple of multiple quoted SELECTION_EXPR. GROUP | COLLECTION: name of previously saved (cmd select) Selections.
individual
usage: individual [-h] [selection]

Select atoms to be refined individually. (More apropos with experimental data than superposition.)

positional arguments:
  selection   SELECTION_EXPR | SELECTION:

options:
  -h, --help  show this help message and exit

SELECTION_EXPR: a boolean expression, see help select; SELECTION: name of a previously saved Selection, given as:
COLLECTION[ITEM]. (COLLECTION or [ITEM] can be omitted if defaults were used in the corresponding select command.)
overall
usage: overall [-h] [overall]

Refine requested parameter type as a single group.

positional arguments:
  overall     [True (default)| False]

options:
  -h, --help  show this help message and exit

If just refining overall position, command overlay is better (wider convergence/faster). Use command overall when also
simultaneously refining sub-groups, torsion angles etc.
print
usage: print [-h] [-w INT] [-i SIZE] [-a SIZE] [-c | -b]

Print parameterization. (Options invoked until reset.)

options:
  -h, --help            show this help message and exit
  -w INT, --width INT   Line width (def. 132).
  -i SIZE, --items SIZE
                        Groups are truncated at SIZE Selections, followed by summary statistics (def. 10).
  -a SIZE, --abbreviate SIZE
                        Selections longer than SIZE abbreviated w/ ellipsis (def. 33).
  -c, --count           Report number of moving atoms in selections.
  -b, --boolean         Report selections as T/F boolean arrays (def).
torsion
usage: torsion [-h] [-d SELECTION] [-a SELECTION] [-f phi|psi|pseudo] [-t NUMBER(int)|FRACTION(float)] [-V phi|psi|pseudo] [-v]
               [-s SELECTION] [-S SIZE OFFSET SIZE OFFSET] [-w SELECTION]
               [selection]

Select variable dihedrals and atoms whose positions depend on them.

positional arguments:
  selection             used by -f/-V & resets -d default; See help SELECTION.

options:
  -h, --help            show this help message and exit
  -d SELECTION, --dihedrals SELECTION
                        optimize all variable dihedrals (phi, psi) within SELECTION. Default: macromolecule
  -a SELECTION, --atoms SELECTION
                        atoms to be moved by dihedral rotations if linked to variable bonds (which must be fully enclosed in
                        SELECTION). Default: macromolecule
  -f phi|psi|pseudo, --fix phi|psi|pseudo
                        Fix all dihedrals of named type within atoms in SELECTION (must be only option/argument).
  -t NUMBER(int)|FRACTION(float), --top NUMBER(int)|FRACTION(float)
                        Fix all but these dihedral angles (must be only option/argument; requires prior refinement; to be followed
                        by refine --restart; mostly superseded by --torsion_weight command line option).
  -V phi|psi|pseudo, --vary phi|psi|pseudo
                        Vary all dihedrals of named type within atoms in SELECTION (must be only option/argument).
  -v, --verbose         List selected dihedrals (--top).
  -s SELECTION, --segment SELECTION
                        optimize all variable dihedrals (phi, psi) in a segment. (Repeat -s for each segment; incompatible w/
                        -adftvSV)
  -S SIZE OFFSET SIZE OFFSET, --auto_segment SIZE OFFSET SIZE OFFSET
                        Split SELECTION into segments of SIZE residues, starting at 0th residue + OFFSET in "within"
  -w SELECTION, --within SELECTION
                        Auto-segment only within these residues, default: macromolecule
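
The --top argument accepts either a count (int) or a fraction (float). How that choice might translate into a set of dihedrals left variable is sketched below. This is an illustration under stated assumptions, not the program's code: the function top_dihedrals is hypothetical, and ranking by the absolute value of a per-dihedral score (here, refined rotations in degrees) is an assumption, since the help text does not state the ranking criterion.

```python
def top_dihedrals(scores, top):
    """Return the indices of the dihedrals that stay variable.

    scores: per-dihedral ranking values (hypothetical; e.g. refined
            rotations in degrees). Larger absolute value ranks higher.
    top:    int   -> keep that many dihedrals;
            float -> keep that fraction of all dihedrals.
    """
    if isinstance(top, int):
        n_keep = top
    else:
        n_keep = round(top * len(scores))
    n_keep = max(0, min(n_keep, len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: abs(scores[i]), reverse=True)
    return sorted(ranked[:n_keep])

rotations = [2.0, -35.0, 0.5, 12.0, -48.0, 1.0]   # made-up values
by_count = top_dihedrals(rotations, 2)
by_fraction = top_dihedrals(rotations, 0.5)
print(by_count, by_fraction)
```

All dihedrals whose indices are not returned would then be fixed before restarting the refinement.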

(Also: alias, edit, help, history, macro, parrot, py, quit, run_pyscript, run_script, set, shell, shortcuts, test; documented in parent pasto.superpose command.) (end parameterize sub-commands)


parrot

usage: parrot [-h] [parrot [True|False|Yes|No]]

Echo commands (or not)

positional arguments:
  parrot [True|False|Yes|No]
                        abbrev. OK; default is to toggle

options:
  -h, --help            show this help message and exit

(similar to set echo True|False.)

pdbout

usage: pdbout [-h] [-o FILE] [-a] [-c] [altfile] [header]

Write coordinates (symmetry expansion as per command-line arguments).

positional arguments:
  altfile               [file name]
  header                [optional quoted text]

options:
  -h, --help            show this help message and exit
  -o FILE, --file FILE  Output file name (required argument).
  -a, --anisotropic     Output anisotropic U, riding w/ coordinates from input PDB.
  -c, --c_alpha         Output only C_alphas.

phipsi_copy

usage: phipsi_copy [-h] [-s FLOAT] [-p] [-c]

Copy phi, psi from target to moving structure

options:
  -h, --help            show this help message and exit
  -s FLOAT, --sigma FLOAT
                        Copy only dihedrals above these standard deviations.
  -p, --pseudo          Sigma threshold applies to pseudo-dihedrals (phi_i + psi_i-1) rather than individual torsion angles.
  -c, --confirm         Confirm each change interactively before applying.

phipsi_diff

usage: phipsi_diff [-h] [-c PDB-FILE] [-H] [-g INT] [-t FLOAT] [-s START] [-e END] [-n] [-p] [-P] [header ...]

Difference in phi, psi between current structure & target.

positional arguments:
  header                [Header for PDB file, optionally quoted to escape shell]

options:
  -h, --help            show this help message and exit
  -c PDB-FILE, --color PDB-FILE
                        Output coordinates for display - B-factors set to differences.
  -H, --hinge           hinge analysis using THRESHOLD & GAP
  -g INT, --gap INT     # residues w/o dihedral changes that can be bridged in a hinge
  -t FLOAT, --threshold FLOAT
                        above which dihedral rotations considered hinges. Degrees if > 1.0, else fraction of total change.
  -s START, --start START
                        start of an explicitly defined hinge: chain-residue, requires --end, repeatable.
  -e END, --end END     end of an explicitly defined hinge: chain-residue, requires --start, repeatable.
  -n, --normalized      Color by relative absolute differences not actual signed values, and plot absolute values.
  -p, --pseudo          quasi pseudo-torsion angles, combining phi_i w/ psi_i-1
  -P, --plot            graph the differences.
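
The options above combine three ideas: wrapped angular differences, pseudo-torsion angles, and hinges defined by a threshold and a bridgeable gap. The sketch below shows one way these could fit together. It is an illustration only: the function names are hypothetical, the angles are made up, and the reading of a fractional threshold as a fraction of the summed absolute change is an assumption about the help text, not a statement of how phipsi_diff is implemented.

```python
def wrap_degrees(angle):
    """Wrap an angular difference into the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0

def dihedral_differences(moving, target):
    """Signed per-residue differences (target - moving), wrapped."""
    return [wrap_degrees(t - m) for m, t in zip(moving, target)]

def pseudo_differences(dphi, dpsi):
    """Combine the change in phi_i with the change in psi_(i-1)."""
    return [wrap_degrees(dphi[i] + dpsi[i - 1]) for i in range(1, len(dphi))]

def find_hinges(diffs, threshold, gap):
    """Group residues with |difference| above threshold into hinges.

    Runs separated by at most `gap` sub-threshold residues are merged.
    threshold > 1.0 is taken as degrees, otherwise as a fraction of the
    summed absolute change (an assumed interpretation).
    """
    if threshold <= 1.0:
        threshold = threshold * sum(abs(d) for d in diffs)
    hinges = []
    for i, d in enumerate(diffs):
        if abs(d) <= threshold:
            continue
        if hinges and i - hinges[-1][1] - 1 <= gap:
            hinges[-1][1] = i            # extend the hinge across the gap
        else:
            hinges.append([i, i])        # start a new hinge
    return [tuple(h) for h in hinges]

phi_moving = [-60, -65, -120, 170, -60, -62, -58]
phi_target = [-61, -64, -60, -175, -59, -63, 20]
diffs = dihedral_differences(phi_moving, phi_target)
narrow = find_hinges(diffs, threshold=30.0, gap=1)
bridged = find_hinges(diffs, threshold=30.0, gap=3)
print(diffs, narrow, bridged)
```

Wrapping matters for the fourth residue, where 170 and -175 degrees differ by only 15 degrees rather than 345.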

py

Usage: py [-h]

Run an interactive Python shell

optional arguments:
  -h, --help  show this help message and exit

quit

Usage: quit [-h]

Exit this application

optional arguments:
  -h, --help  show this help message and exit

refine

usage: refine [-h] [-m {l_bfgs_b,bfgs,Powell,CG,Newton_CG,TNC}] [-i FLOAT] [-a FLOAT] [-g FLOAT] [-C INT] [-n] [-r] [-v INT]

Refine atomic model.

options:
  -h, --help            show this help message and exit
  -m {l_bfgs_b,bfgs,Powell,CG,Newton_CG,TNC}, --method {l_bfgs_b,bfgs,Powell,CG,Newton_CG,TNC}
                        Optimization method, see scipy.optimize.minimize documentation. (Find l_bfgs_b more stable than bfgs, more
                        efficient than others.)
  -C INT, --max_cycles INT
                        Max iterations (Powell, bfgs, Newton-CG, l_bfgs_b, TNC).
  -n, --new_batch       New batch (losing prior history), else continue prior (default) if exists/possible.
  -r, --restart         From original coordinates (implies -n).
  -v INT, --verbosity INT
                        Per-cycle logging: -1 (terse) to 3 (verbose) [def. 0].

Convergence criteria for ending:
  applicable to specified methods, else ignored

  -i FLOAT, --min_improvement FLOAT
                        Per-cycle minimal residual fractional improvement (l_bfgs_b, TNC).
  -a FLOAT, --accuracy FLOAT
                        Maximal relative estimated error for any parameter (Powell, Newton_CG).
  -g FLOAT, --min_grad FLOAT
                        Minimal gradient norm, internal (pre-conditioned) units (CG, bfgs, l_bfgs_b, TNC).
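
Since the help text refers to scipy.optimize.minimize, the idea of refining in batches can be tried outside the program on a toy problem. The sketch below is not Superpose's refinement: it minimizes SciPy's Rosenbrock test function, and the correspondence suggested in the comments between --max_cycles, --min_improvement, --min_grad and the SciPy options maxiter, ftol, gtol is an assumption.

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der

x0 = np.array([-1.2, 1.0, -0.5, 0.8])   # arbitrary starting point

# First batch: stop after a few iterations (compare --max_cycles).
first = minimize(rosen, x0, jac=rosen_der, method="L-BFGS-B",
                 options={"maxiter": 5})

# New batch: a fresh call that starts from the current parameters.
# The quasi-Newton history of the first call is not carried over
# (compare --new_batch). ftol and gtol are SciPy's stopping tolerances.
second = minimize(rosen, first.x, jac=rosen_der, method="L-BFGS-B",
                  options={"maxiter": 200, "ftol": 1e-12, "gtol": 1e-8})

print(first.nit, first.fun)
print(second.nit, second.fun)
```

Starting the second call from x0 instead of first.x would be the analogue of --restart.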

restore

usage: restore [-h] FILE

Restore state: coordinates, history, refinement stats

positional arguments:
  FILE        pickle jar format

options:
  -h, --help  show this help message and exit

(Options & parameterization must be preset to those saved in the backup file to avoid unpredictable results).

run_pyscript

Usage: run_pyscript [-h] script_path ...

Run a Python script file inside the console

positional arguments:
  script_path       path to the script file
  script_arguments  arguments to pass to script

optional arguments:
  -h, --help        show this help message and exit

run_script

Usage: run_script [-h] [-t TRANSCRIPT_FILE] script_path

Run commands in script file that is encoded as either ASCII or UTF-8 text

Script should contain one command per line, just like the command would be
typed in the console.

If the -t/--transcript flag is used, this command instead records
the output of the script commands to a transcript for testing purposes.

positional arguments:
  script_path           path to the script file

optional arguments:
  -h, --help            show this help message and exit
  -t, --transcript TRANSCRIPT_FILE
                        record the output of the script as a transcript file
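
A script file is plain text with one command per line, so it can be written by hand or generated. The sketch below generates one with Python. The command sequence is hypothetical: it is assembled from commands documented in this section, the collection, selection and file names are placeholders, and both the order of steps and the entry of parameterize sub-commands on separate lines are assumptions to be checked against the program.

```python
import tempfile
from pathlib import Path

# Hypothetical command sequence; names and ordering are placeholders.
commands = [
    "select -C domains -n N S('chain == A') & S('residue num <= 105')",
    "select -C domains -n C S('chain == A') & S('residue num .gt. 105')",
    "parameterize",
    "group domains",
    "done",
    "refine -m l_bfgs_b -C 50",
    "pdbout -o refined.pdb",
]

# Write one command per line, UTF-8 encoded, into a scratch directory.
script_path = Path(tempfile.mkdtemp()) / "two_domains.txt"
script_path.write_text("\n".join(commands) + "\n", encoding="utf-8")
print(script_path.read_text(encoding="utf-8"))
```

The resulting file would then be passed to run_script as its script_path argument.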

select

usage: select [-h] [-C NAME] [-n NAME] [-a ATOM_ATTR] [-S NAME] [selection_expr]

Name selection(s) of atoms.

positional arguments:
  selection_expr        [SELECTION_EXPR] SELECTION_EXPR: logical expression of <Selection(s)> (string) using array logical
                        operators: &, |, ==, !=, ~, >, >=, (, ), ... or Fortran synonyms, which circumvent the cmd2 parsing of '>'
                        & '|' as redirects before translation into python: .and., .OR., .xor., .NEQV., .eq., /=, .NOT., .gt.,
                        .GE., etc. evaluated in the task namespace. <Selection> is an existing instance of class Selection or a
                        new one instantiated with S(<criterion>), where string <criterion> is a logical array expression using
                        keywords such as residue number, chain, atom number, synonyms or unique abbreviations, see documentation
                        for Selection. SELECTION_EXPR should be quoted to avoid shell commands / redirects etc, and if provided as
                        a named argument, must be devoid of white-space. Examples:: select -C rigid -n N_domain S('chain == A') &
                        S('residue num <= 105') select -C protein -n C S('(chai == C) & (residue_ty != HOH)') select -C all -n
                        catalytic "rigid['N_domain'] | S('resnam == ATP')" select -C mycollection -a resnum -F protein -S C select
                        --collection=chains --attr=chain

options:
  -h, --help            show this help message and exit
  -C NAME, --collection NAME
                        Collection (dictionary) into which selection is placed (default: None --> "collection" or name of
                        attribute if -a specified).
  -n NAME, --name NAME  Unique name to be given to selection (default: None --> "default"). (-a (--attr) & -n (--name) are
                        mutually exclusive.)
  -a ATOM_ATTR, --attr ATOM_ATTR
                        Selections are made (and named) for each unique value of ATOM_ATTR in the coordinates (see -d, -f).
                        ATOM_ATTR must be specified as a single-word abbreviation/synonym recognized in Selection expressions, eg.
                        --attr=chain might give selections A, B..., while --attr=resnum might give 23, 24,... (-a (--attr) & -n
                        (--name) are mutually exclusive.)
  -S NAME, --selection NAME
                        Selection or dictionary (group) from which --attr subset is to be drawn (default: None --> all atoms).
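
The effect of --attr, one selection per unique attribute value, can be pictured with NumPy boolean masks. This is a minimal sketch with made-up data, assuming that selections behave like boolean arrays; the variable names are hypothetical and the code is not taken from the program.

```python
import numpy as np

# Made-up per-atom attributes for six atoms.
chain = np.array(["A", "A", "B", "B", "B", "C"])
resnum = np.array([1, 2, 1, 2, 3, 1])

# One boolean mask per unique chain identifier (compare --attr=chain).
per_chain = {str(value): chain == value for value in np.unique(chain)}

# Restricting to a parent selection first (compare --selection).
subset = resnum >= 2
per_chain_in_subset = {name: mask & subset for name, mask in per_chain.items()}

print(sorted(per_chain))
print(per_chain["B"])
print(per_chain_in_subset["B"])
```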

set

usage: set [-h] [-v] [param] [value]

Set a settable parameter or show current settings of parameters

positional arguments:
  param          parameter to set or view
  value          new value for settable

options:
  -h, --help     show this help message and exit
  -v, --verbose  include description of parameters when viewing

shell

Usage: shell [-h] command ...

Execute a command as if at the OS prompt

positional arguments:
  command       the command to run
  command_args  arguments to pass to command

optional arguments:
  -h, --help    show this help message and exit

shortcuts

Usage: shortcuts [-h]

List available shortcuts

optional arguments:
  -h, --help  show this help message and exit

test

usage: test [-h] test-name [argument(s) ...]

Run a test (program development only).

positional arguments:
  test-name    choose from those defined in task.run_test()
  argument(s)

options:
  -h, --help   show this help message and exit
pasto.superpose>  exit

Python Interpreter for advanced functionality

In deciding which functionalities and parameters are available through the command line or the command interpreter, an attempt has been made to balance flexibility with simplicity and ease of use. Many others are accessible through an embedded Python interpreter that is invoked with the py command. It is executed in a namespace that is local to the command interpreter (and, by itself, not very useful). The command-line options are imported as attributes of the object “option”, and most other needed objects can be accessed as attributes of self.task, for which the alias “my” is provided. The following examples illustrate:

  • Change chain name of target to match refining model: RSRef ‣ py

    import numpy
    my.atoms.chain = numpy.where(my.atoms.chain=='C', 'A', my.atoms.chain)
    exit()
    
  • Remove one of the target alternative locator IDs to match refining model: RSRef ‣ py

    import numpy
    my.atoms.altloc=numpy.where(my.atoms.altloc=='A',' ',my.atoms.altloc)
    exit()
    
  • Change the torsion-restraint weight during a refinement: RSRef ‣ py option.torsion_weight = 2.0

  • Print values of command-line options: RSRef ‣ py print(option)

  • Replace all B values with their mean: RSRef ‣ py my.atoms.b.fill(my.atoms.b.mean())

Atom selections and groups

A parser is provided for flexible selection of atom groups within the command interpreter. It is applicable to stand-alone running (rsref.py, superpose.py etc.), but is not usually accessible when modules are wrapped/embedded in other packages (e.g. CNS), which then manage atom selections.

Selections

The selection syntax is terse, flexible, and (for better or worse) relies on Python evaluation of expressions. Thus, hopefully, it will be intuitive for many users.

It differs from some other programs in that selections are objects (instances of class Selection). Selection objects are an extension of boolean arrays, specific to a coordinate set (class Atoms). Selection expressions can be used directly in commands like parameterize, or user-named Selections can be pre-defined for repeated use or to simplify/clarify the definition of complicated Selections. Selections can be combined or assigned to new Selection instances through use of Python bit-wise logical operators (&,|,==,!=,~,>,>=,(,),...). This should make it convenient to write refinement scripts in which different subsets of the atomic parameters are refined at different stages, different sets of atoms are subject to positional, B-factor or occupancy refinement, and in which some atoms might be refined individually, others rigidly grouped for simultaneous refinement, etc. This frees the user of constraints embodied in other programs.
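
The behaviour described above, selections as boolean arrays combined with bit-wise operators, can be tried with plain NumPy. This is a sketch with made-up data: it uses ordinary arrays as stand-ins and does not use the Selection class itself, whose constructor is not reproduced here.

```python
import numpy as np

# Made-up per-atom attributes for five atoms.
chain = np.array(["A", "A", "A", "B", "B"])
resnum = np.array([104, 105, 106, 1, 2])
atnam = np.array(["CA", "CA", "CA", "CA", "N"])

# Each comparison yields one boolean per atom; & | ~ combine them.
n_domain = (chain == "A") & (resnum <= 105)
c_alpha = atnam == "CA"
either = n_domain | (chain == "B")
other_c_alphas = ~n_domain & c_alpha

print(n_domain, int(n_domain.sum()))
print(either)
print(other_c_alphas)
```

Because each result is itself a boolean array, it can be stored under a name and reused in later expressions, which is the pattern that named Selections support.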

In RSRef, “S” is a synonym of Selection, the class. A selection expression is defined as:

<selection>|S(<criterion>) [<operator> <selection>|S(<criterion>)]

where:

<selection>

is a pre-existing instance of class S.

<criterion>

is a non-quoted string to select atoms for a new instance of S, where <criterion> can be in simple form or compound:

Simple

contains a single operator, and is given without parentheses, such as:

  • chain == A

  • chai==B

  • Residue number >= 30 (or Residue number .GE. 30, avoiding cmd2 redirect parsing)

  • resnu <= 50

Compound

an expression combining operators, using parentheses to set precedence. Spaces in names must be replaced with underscores, and expressions should be written in lower case, because parsing is case sensitive. Examples:

  • (chain == A) | (chai==B)

  • ((residue_numb >= 30) & (atnam == CA)) | (chain != C)

Warning

the compound parser will silently do unexpected things with syntax or spelling errors. Check that results are consistent with expectations!

Note

the inner operands are enclosed within parentheses, because the combination operators (&, |) have higher precedence than the comparison operators (>, >=, <, <=).

<operator>

is a bit-wise Python logical operator (&,|,==,!=,~,>,>=,(,),...). Further details of the syntax for making Selections are provided in atoms.Selection.
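
The precedence rule noted above is ordinary Python behaviour and can be checked with plain integers. The sketch below is independent of the Selection parser; the value of resnum is made up.

```python
resnum = 20

# Intended test: is resnum between 30 and 50?
correct = (resnum >= 30) & (resnum <= 50)

# Without parentheses, & binds first, so Python evaluates the chained
# comparison  resnum >= (30 & resnum) <= 50  instead.
unparenthesized = resnum >= 30 & resnum <= 50

print(correct, unparenthesized)   # the two results differ
```

With NumPy arrays in place of single integers, the unparenthesized form raises a ValueError rather than returning a wrong answer, because the chained comparison needs a single truth value.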

Selections are named and defined with the select command.

Groups

Groups are dictionary-like collections of named Selections, each constrained as a group in group refinement. (In individual atom refinement, a Group is treated as the logical OR between all the named Selections.) The Group class contains methods for checking that selections do not overlap (work-in-progress). In most commands, the name of a Group can be substituted for that of a Selection.

Groups are defined with the select command, for example:

select --collection=domains --name=N S('chain == A') & S('residue num <= 105')
select -C domains -n C S('chain == A') & S('residue num .gt. 105')

These two statements illustrate several variants of the syntax. Together they define a Group called domains, with two Selections, named N and C, that contain the N- and C-terminal parts of subunit A.
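
The two properties of a Group mentioned in this section, the logical OR used in individual refinement and the check for overlapping selections, can be pictured with a dictionary of boolean masks. This is a sketch with made-up data; a plain dict stands in for the Group class, whose actual methods are not shown here.

```python
from itertools import combinations

import numpy as np

# Made-up per-atom attributes for five atoms.
chain = np.array(["A", "A", "A", "A", "B"])
resnum = np.array([104, 105, 106, 107, 1])

# A plain dict of named boolean masks standing in for a Group.
domains = {
    "N": (chain == "A") & (resnum <= 105),
    "C": (chain == "A") & (resnum > 105),
}

# Individual-atom refinement: logical OR over all member selections.
union = np.logical_or.reduce(list(domains.values()))

# Overlap check: list any pair of selections sharing at least one atom.
overlaps = [(a, b) for a, b in combinations(domains, 2)
            if np.any(domains[a] & domains[b])]

print(union)
print(overlaps)
```

An empty overlap list means every atom belongs to at most one rigid group.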

Troubleshooting

Pairing

AssertionError: Overwriting prior pairing for ... in structure B: The attempted pairing was ambiguous, with two atoms of structure A mapped to the same atom of B. Sometimes this is because of multiple conformers, which can be resolved with pair options specifying which conformer to use. In the following example, the highest occupancy conformer (aka #1) will be used if there is more than one: pasto.superpose ‣ pair --moving ((chain==A)&(conformer==1))

Refinement

Stunted convergence: Refinement that is started far from convergence will sometimes do better with additional cycles specified with --new_batch. However, this restarts the history, spoiling impact analysis. Overlay can often bring the structure close enough to the target to avoid this dilemma; this is appropriate when local conformational analysis is important, but not differences in absolute coordinates. (Stunted convergence likely arises when the starting point is so far from the answer that many of the initial partial derivatives are poorly correlated with those needed later. The history that is used by l_bfgs_b to accelerate well-behaved convergence can therefore become a liability, and the --new_batch option may be helping by erasing the history.)

Installation:

Superpose has been developed as part of the RSRef package. It can also be installed, stand-alone, as a pure Python program with its subset of RSRef module dependencies: argparser.py, atoms.py, cmd2nest.py, impact.py, optimize.py, rotamer.py, superpose_user_doc.py, torsion.py & transform.py. External packages cmd2, NumPy and SciPy are also required.

Changed in version 0.1: 10/23/2012 Start

Changed in version 0.5: 05/02/2015 ReStructuredText docs

Changed in version 0.5.5: 10/13/15 Anisotropic Us ride with atom rotations.

Changed in version 0.5.6: 03/23/16 Comparative analysis of anisotropic Us.

Changed in version 1.0.0: 09/26/20 Python 2.7 -> 3.6; cmd2: optparse -> argparse

Credits

Libraries used in rigid-group and torsion angle optimizations were first programmed by Brynmor K. Chapman. Other programming by Michael S. Chapman.

Citations

Publications and database entries should acknowledge use of Superpose by citing [Chapman-2015]. Those analyzing anisotropic B-factors should also cite [Godsey-2016].

[Chapman-2015]

Chapman, B.K., Davulcu, O., Skalicky, J.J., Brüschweiler, R.P. and Chapman, M.S. (2015). Parsimony in Protein Conformational Change, Structure, 23: 1190-1198. doi:10.1016/j.str.2015.05.011

[Godsey-2016]

Godsey, M.H., Davulcu, O., Nix, J., Skalicky, J.J., Brüschweiler, R.P. and Chapman, M.S. (2016). The Sampling of Conformational Dynamics in Ambient-Temperature Crystal Structure of Arginine Kinase, Structure, 24: 1658-1667. doi:10.1016/j.str.2016.07.013

Colophon

This introductory documentation is generated from the source file: superpose_doc.rst. Its dependents are extracted from specific python modules using documentation.py, see instructions within.

Changed in version 1.0.0: 12/5/20, python3.

Changed in version 0.5.0: 2/18/15, converted to reStructuredText from Epydoc.

Changed in version 0.1: 12/10/10 Start.