SymExp: Symmetry expansion of molecular coordinates

Section author: Michael S. Chapman <chapmanms@missouri.edu>

Authors:

Michael S. Chapman <chapmanms@missouri.edu>,

Oregon Health & Science University and University of Missouri

Version:

1, Nov 26, 2024

Usage:
  • SymExp [options] <input-pdb-file>

  • symexp.py [options] <input-pdb-file>

Synopsis

Expands coordinates by both molecular (non-crystallographic) and crystallographic (lattice) symmetry:

  • Finds atoms in the neighborhood of a point (sphere) or a molecular fragment (neighbor).

  • Optionally completes structures of neighboring atoms into residues or chains.

  • Does not attempt to fill a crystal cell or asymmetric unit.

  • For efficient analysis of interactions.
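The sphere search can be pictured with a minimal Python sketch. It is a brute-force illustration only, not the algorithm in symmetry.py; the function names (apply_operator, atoms_in_sphere), the two-atom model and the two-fold operator are all assumptions made for the example. Each operator is a (name, rotation, translation) tuple in Cartesian Angstrom, as in the local_symmetry file described under Input files.

```python
import math

# The unit operator, written in the same (name, rotation, translation) form.
IDENTITY = ("unit", ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)), (0.0, 0.0, 0.0))


def apply_operator(operator, xyz):
    """Return rotation @ xyz + translation for one operator."""
    _, rotation, translation = operator
    return tuple(
        sum(r * x for r, x in zip(row, xyz)) + t
        for row, t in zip(rotation, translation)
    )


def atoms_in_sphere(atoms, operators, center, radius):
    """Brute force: (operator name, atom name, xyz) for every copy within radius."""
    hits = []
    for operator in operators:
        for name, xyz in atoms:
            moved = apply_operator(operator, xyz)
            if math.dist(moved, center) <= radius:
                hits.append((operator[0], name, moved))
    return hits


# Hypothetical model: two atoms, plus a 180-degree rotation about z.
atoms = [("CA", (5.0, 0.0, 0.0)), ("CB", (20.0, 0.0, 0.0))]
two_fold = ("2z", ((-1.0, 0.0, 0.0), (0.0, -1.0, 0.0), (0.0, 0.0, 1.0)), (0.0, 0.0, 0.0))

hits = atoms_in_sphere(atoms, [IDENTITY, two_fold], center=(0.0, 0.0, 0.0), radius=10.0)
for op_name, atom_name, xyz in hits:
    print(op_name, atom_name, xyz)
```

Only the CA atom and its two-fold copy fall within 10 A of the origin; both copies of CB lie 20 A away and are skipped.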

SymExp is the stand-alone front end for symmetry.py, which is called by several scripts; this page also serves as supplementary documentation for those scripts.

Sources of Documentation

All of the following should be referenced:

Program options

Brief explanations of the arguments for stand-alone runs are given with symexp -h; further information is given below.

Commands

Within the program (pasto.symexp> prompt), commands are listed with help. Add the -v option to list synopses; use help command for details on a single command, help -l for details on all commands, and help -rl to descend recursively through subcommands.

Examples

The scripts directory in the installation contains run_symexp.sh, and examples/virus-5A contains sun_symmetry.sh. These exemplify two types of usage, with coordinate files that are provided.

Details, API

Details are encoded within docstrings that are accessible to programmers using Integrated Development Environments (IDEs). They are also compiled with sphinx into html files, linked from the module index on the home page. This is the searchable, cross-linked (API) reference documentation that explains the meaning of parameters, the behavior of the different functions, etc.

The documentation is accessed from (index.html) on-line or in the distribution directory doc/html. (Additional formats can be generated with sphinx.)

Command-line options

The most up-to-date documentation is generated from symexp.py -h:

(Command: /trihome/chapmanms/Devel/RSRef/FTatom/pasto/symexp.py -h)

(Sources: /trihome/chapmanms/Devel/RSRef/FTatom/pasto v1.0.6)

usage: symexp.py [-h] [--max_shift DIST] [--b_overall B_OVERALL] [--local_symmetry FILE] [--space_group SYMBOL] [--lattice_translations UNIT_CELLS] [--completion residue_or_chain_or_none] [--output unique_or_local_or_full_or_none] [--unit_cell a b c alpha beta gamma] [--in_nomenclature STR] [--out_nomenclature STR] [--version] [--infile FILE] [--outfile FILE] [INPUT.PDB] …

Expansion of molecular & crystal symmetry. (c) OHSU 2010-18; University of Missouri 2018-20, Michael S. Chapman

positional arguments:
… Program commands may follow any required positional arguments, or optional positional arguments terminated by ‘--’. Commands are space-separated, quoted if containing white space.

options:
-h, --help

show this help message and exit

--version, -v

show program’s version number and exit

--infile FILE

Redirected standard input, like “<”. (default: <_io.TextIOWrapper name=’<stdin>’ mode=’r’ encoding=’utf-8’>)

--outfile FILE

Redirected standard output, like “>”. (default: <_io.TextIOWrapper name=’<stdout>’ mode=’w’ encoding=’utf-8’>)

Input model parameters:
--max_shift DIST, -s DIST

Atomic shift that will trigger recalculations (neighbors etc., A, default (None) --> high_resolution/2). (default: None)

--b_overall B_OVERALL, -B B_OVERALL

Isotropic B-factor to be added to all atoms (A^2, equivalent to EM_ENVELOPE * 4). (default: 0.0)

INPUT.PDB Input PDB file. (default: None)

Symmetry (local & crystal lattice):
--local_symmetry FILE, -l FILE

File of local (molecular) symmetry operators (Cartesian Angstrom). (default: None)

--space_group SYMBOL, -G SYMBOL

Hermann-Mauguin symbol (Int. Tables; None if isolated particle (EM)). (default: None)

--lattice_translations UNIT_CELLS

Number of translations to search in each direction for neighbors. Usually 1 suffices, 2 sometimes adds more, but takes longer (check). Zero disables. (default: 1)

--completion residue_or_chain_or_none

Expand symmetry neighbors to include all atoms within each “residue” or “chain”. (default: None)

--output unique_or_local_or_full_or_none

Select which previously expanded symmetry equivalent atoms to output to PDB. “unique” (“None”) for no equivalents; “local” for NCS symmetry; or “full” for local + neighbors that are related by crystal symmetry operators & lattice translations (that will be output together in a non-standard PDB file). (default: local)

--unit_cell a_and_b_and_c_and_alpha_and_beta_and_gamma, -U a_and_b_and_c_and_alpha_and_beta_and_gamma

Unit cell parameters (over-rides those in header of some maps). (default: None)
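As a rough guide to why larger values of --lattice_translations take longer, the sketch below counts the lattice translations to be tested. It assumes that a value n means offsets of -n to +n unit cells along each axis; this is an interpretation of the help text, not a description of the program's internals, and the function name is hypothetical.

```python
from itertools import product


def lattice_translations(n):
    """All integer unit-cell offsets with each component in -n..n (assumed meaning)."""
    return list(product(range(-n, n + 1), repeat=3))


for n in (0, 1, 2):
    print(n, len(lattice_translations(n)))
```

Under this assumption the count grows as (2n+1)^3: 1, 27 and 125 translations for n = 0, 1 and 2, each combined with every symmetry operator.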

Nomenclature translation:
--in_nomenclature STR, -q STR

From input: bmrb, cif, cns, diana, midas, msi, pdb92, sc, sybyl, ucsf, xplor, lax; lax for no translation, None for auto-determine (default: None)

--out_nomenclature STR, -Q STR

To output: bmrb, cif, cns, diana, midas, msi, pdb92, sc, sybyl, ucsf, xplor; default: same as --in_nomenclature (-q) if defined, else internal convention if None (default: None)

+<file> inserts options from <file>, one per line.
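The +<file> behavior resembles the fromfile_prefix_chars feature of Python's argparse module. The sketch below assumes that this is the mechanism in use (not confirmed here) and builds a small stand-in parser with three of the options listed above; the parser name, the options file name and model.pdb are hypothetical. With argparse, each line of the file is read as exactly one argument, so an option and its value are written either as --option=value or on two separate lines.

```python
import argparse
import os
import tempfile

# Stand-in parser (not the real symexp parser) with a few options from the help.
parser = argparse.ArgumentParser(prog="symexp-sketch", fromfile_prefix_chars="+")
parser.add_argument("--space_group", "-G", metavar="SYMBOL")
parser.add_argument("--lattice_translations", type=int, default=1, metavar="UNIT_CELLS")
parser.add_argument("pdb", nargs="?", metavar="INPUT.PDB")

with tempfile.TemporaryDirectory() as tmp:
    options_file = os.path.join(tmp, "symexp.opts")
    with open(options_file, "w") as handle:
        # One argument per line.
        handle.write("--space_group=P212121\n--lattice_translations=2\n")
    # "+path" is expanded in place into the arguments read from the file.
    args = parser.parse_args(["+" + options_file, "model.pdb"])

print(args.space_group, args.lattice_translations, args.pdb)
```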

Input files

local_symmetry

A file that will be evaluated (in Python) as a tuple of operators. Each operator is a tuple of name (str), rotation (tuple) and translation (tuple of 3 floats, in Angstrom). The rotation matrix is specified as three row-tuples, each a tuple of 3 floats. The unit operator is implied and therefore optional. The example below specifies two additional symmetry equivalents:

(   ("p",    (
        (0.500000,       0.809000,      -0.309000),
        (-0.809000,       0.309000,      -0.500000),
        (-0.309000,       0.500000,       0.809000)
            ),
        (0.000000,       0.000000,       0.000000)),
    ("e",    (
        (0.309000,      -0.500000,       0.809000),
        (0.500000,       0.809000,       0.309000),
        (-0.809000,       0.309000,       0.500000)
            ),
       (0.000000,       0.000000,       0.000000)))
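Before a run, a file in this format can be checked with a few lines of Python. The sketch below is an assumption-laden stand-in, not the program's own reader: it uses ast.literal_eval (a restricted evaluator for literals) and a hypothetical function read_operators that verifies the tuple shapes and that the rows of each rotation are orthonormal to within a loose tolerance suited to 3-decimal matrices.

```python
import ast


def read_operators(text, tolerance=1e-2):
    """Parse operator tuples from text and check that each rotation is orthonormal."""
    operators = ast.literal_eval(text.strip())
    for name, rotation, translation in operators:
        if len(rotation) != 3 or any(len(row) != 3 for row in rotation):
            raise ValueError(f"{name}: rotation must be 3 rows of 3 floats")
        if len(translation) != 3:
            raise ValueError(f"{name}: translation must have 3 components")
        for i in range(3):
            for j in range(3):
                dot = sum(a * b for a, b in zip(rotation[i], rotation[j]))
                expected = 1.0 if i == j else 0.0
                if abs(dot - expected) > tolerance:
                    raise ValueError(f"{name}: rows {i} and {j} are not orthonormal")
    return operators


# The first operator of the example above; note the trailing comma that keeps
# a one-operator file a tuple of operators.
sample = """
(   ("p",    (
        (0.500000,       0.809000,      -0.309000),
        (-0.809000,       0.309000,      -0.500000),
        (-0.309000,       0.500000,       0.809000)
            ),
        (0.000000,       0.000000,       0.000000)),
)
"""
operators = read_operators(sample)
print([name for name, _, _ in operators])
```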

Command interpreter

The command interpreter is used for program control in stand-alone mode, but generally not when called from / embedded in another package. After the program has been invoked with options/arguments that are parsed, program flow is controlled with a series of user-entered interactive commands.

These run-time commands are interpreted with an extension (cmd2nest) of the cmd2 package that adds support for nested sub-commands. The following summarizes the essentials of program control and notes differences from cmd2. The cmd2 documentation provides additional detail.

Syntax:

Commands are entered at the prompt in a unix-shell style:
  • command [option(s)] [positional argument(s)]

  • where options can be provided as -x [value] (x is a single letter), or --option[[=]value] in “two-dash” long form.

  • Various standard short-cuts are pre-defined, and tab completion is available for commands and arguments.

    Error

The abbreviation given in the help is not always long enough to be unique (a bug inherited from cmd2).

Error

Tab completion fails after exiting a sub-command.

Shell:

The cmd2 shell-like interface is inherited, offering history, command editing and redirects. Redirects should work (<, |, >).

Hierarchical structure:

Sub-commands are only available after entering the command. Higher-level commands are generally not available in sub-commands. The exceptions are general utility commands such as shell, shortcuts & set. Help, by default, is specific to the command level. The --descend (-d) option documents one sub-command level deeper & --recursive --long (-rl) descends through all levels exhaustively. Note that load (“@”) & related commands, and history, do not transcend different command levels.

Just-in-time calculation & pre-requisites:

A number of efficiencies are possible by pre-calculating and repeatedly using objects. Rather than pre-calculating at startup all objects that might be needed, the program attempts to calculate the minimal needed, just-in-time. For the most part, the pre-requisites are figured out and tasks are executed when needed using pre-assigned (or default) parameters. One exception is that any command with a “parameterize” pre-requisite will issue an error message if not already performed (mind-reading is not an option!).

The order that commands are entered is sometimes important, particularly when the embedded python interpreter is invoked with “py” (see below). Given the flexibility of the “py” command, there is no way to figure out the pre-requisites. Users should be especially attentive to AttributeErrors that might indicate an unmet pre-requisite dependence.

Error recovery:

Inherited from cmd2, exceptions are captured at the Command level, printing at least an error message, but without aborting the whole program. On interactive use, this conveniently often offers a second chance. If run as a script, users should search the output for “Error”, lest one has scrolled by. The default is a terse error message, but this can be changed to a full traceback using “set debug True” (still does not abort).
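For scripted runs, the search for “Error” can be automated. The following is a minimal sketch; the function name find_errors and the sample output (including the wording of the error line) are hypothetical, and real messages may differ.

```python
def find_errors(output_text, marker="Error"):
    """Return (line number, line) pairs for output lines that contain marker."""
    return [
        (number, line)
        for number, line in enumerate(output_text.splitlines(), start=1)
        if marker in line
    ]


# Hypothetical captured output from a scripted run.
captured = "\n".join([
    "pasto.symexp>  sphere -r 8",
    "Error: hypothetical message for illustration",
    "pasto.symexp>  exit",
])

for number, line in find_errors(captured):
    print(f"line {number}: {line}")
```

In practice the text would come from a file written with --outfile or from a shell redirect.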

Selected commands - implementation-specific extensions & limitations.

@FILE or run_script FILE

Used to run commands from an external file. The limitation is that commands cannot descend/ascend through nested sub-commands. Thus, for example, commands within parameterize would have to be given separately. The same limitation applies to variants _relative_run_script (@@).

Command-line commands

Tokens following any of the program’s required positional parameters (e.g. INPUT.PDB) will be run as top-level commands. There is no support for transcending sub-commands or for command options:

Example: rsref.py /dev/null help shortcuts exit

Options can be incorporated by using the cmd2 run_script shortcut on the command line (Example1), where cmd.txt has one command per line that can include options and whitespace. Even better, use the --infile option (imported from argparser), as in Example2.

Example1: rsref.py /dev/null @cmd.txt exit
Example2: rsref.py /dev/null --infile=cmd.txt
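The same pattern can be prepared from Python. The sketch below writes a command file and assembles a command line in the style of Example2, applied to symexp.py; it does not run the program. The helper build_run, the file names model.pdb and near.pdb, and the sphere parameters are hypothetical, while the sphere and pdbout options are those listed in the command help.

```python
import shlex
import tempfile
from pathlib import Path


def build_run(pdb_file, commands, script_path):
    """Write one command per line to script_path; return the argv for the run."""
    Path(script_path).write_text("\n".join(commands) + "\n")
    return ["symexp.py", str(pdb_file), f"--infile={script_path}"]


commands = [
    "sphere -c 10 20 30 -r 8",   # atoms within 8 A of a hypothetical point
    "pdbout -o near.pdb",        # write the selected coordinates
    "exit",                      # safe exit from the top level
]

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "cmd.txt"
    argv = build_run("model.pdb", commands, script)
    written = script.read_text()
    print(shlex.join(argv))
```

Where the program is installed, argv could be passed to subprocess.run(argv, check=True).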

done, exit and quit

These are near synonyms to mitigate a problem with cmd2’s error-handling in scripted runs with sub-commands. On an exception within a sub-command, the program terminates (just) the sub-command, and continues reading commands that had been intended for the terminated sub-command, but applies them mistakenly to continued execution at the higher level from which the sub-command was invoked. Should a quit (or exit), intended for a sub-command, be encountered at top-level, the program can terminate before any results are saved. To avoid this, the base cmd2 commands have been overridden:

  • exit is only available from the top level.

  • done is only available from sub-commands.

  • quit is unsafely available from both.

Thus, if done is used exclusively to finish sub-commands and is invoked accidentally at the top level, it leads only to an unrecognized-command error, and the remaining top-level commands (e.g. saving results) will be executed before the program is terminated with exit. The unsafe quit can be used interactively and repeatedly to bail out of a failed run if the sub-command level is unclear.
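The reasoning can be traced with a toy model of a scripted run. This is an illustration of the behavior described above, not the cmd2nest implementation; enter, bad, tweak and save are hypothetical placeholder names standing for "enter a sub-command", "a sub-command line that raises an exception", "another sub-command line" and "a top-level command that saves results".

```python
def run_script(commands):
    """Toy model of a scripted run; returns a log of what happened."""
    level = "top"
    log = []
    for command in commands:
        if level == "sub":
            if command in ("done", "quit"):
                level = "top"                      # normal return to the top level
            elif command == "bad":
                log.append("error: sub-command terminated")
                level = "top"                      # exception ends (just) the sub-command
            elif command == "exit":
                log.append("unrecognized in sub-command: exit")
            else:
                log.append(f"sub: {command}")
        elif command == "enter":
            level = "sub"
        elif command in ("exit", "quit"):
            log.append("program terminated")
            break
        elif command == "save":
            log.append("saved")
        else:
            log.append(f"unrecognized at top level: {command}")
    return log


with_quit = run_script(["enter", "bad", "tweak", "quit", "save", "exit"])
with_done = run_script(["enter", "bad", "tweak", "done", "save", "exit"])
print("saved" in with_quit, "saved" in with_done)   # False True
```

With quit closing the sub-command, the stray quit ends the program before save is reached; with done, the stray done is merely rejected at the top level and save still runs.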

python interpreter

py (without statement) opens a python shell within which multiple statements can be executed, terminating the shell with Ctrl-D, exit() or quit(). This provides a powerful way of customizing the programs and extending functionality beyond the commands that are provided.

In our extension of cmd2, namespace my provides access to objects within the task-space of the program. Thus, for example, atomic B-factors could be printed or manipulated using my.atoms.b, and program options with my.option.resolution (for example).

The python shell is executed in its own namespace, so modules (such as sys or numpy) have to be imported explicitly.
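A sketch of the kind of statements that might be entered in the py shell follows. So that it also runs outside the program, it defines a stand-in for my with types.SimpleNamespace; inside the py shell, my is supplied by the program and should not be redefined. The B-factor values are invented, and it is assumed (not confirmed here) that my.atoms.b behaves as a sequence of floats.

```python
import statistics
from types import SimpleNamespace

# Stand-in for the program-supplied namespace (for running outside the py shell only).
my = SimpleNamespace(
    atoms=SimpleNamespace(b=[12.5, 20.0, 31.0]),   # invented B-factors (A^2)
    option=SimpleNamespace(resolution=3.0),        # invented option value
)

# Statements of the kind typed in the py shell; modules must be imported
# explicitly there too, because the shell has its own namespace.
mean_b = statistics.mean(my.atoms.b)
print(f"mean B = {mean_b:.2f} A^2, resolution option = {my.option.resolution}")
```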

Additional examples are given in sections “Python Interpreter for advanced functionality” in the documentation for specific programs.

Limitations, bugs & work-arounds

The single-line variant, py statement, executing a single python statement, was deprecated in package cmd2 v2.4. There may be legacy scripts that will need updating.

  • Annotations

    py print("\nAnnotation for stdout") was a common use. Consider the alternative: shell echo -e "\nAnnotation for stdout" which will avoid additional output from interpreter start and termination. The shell alternative is only possible if access to python attributes is not required.

  • Redirects (> , < and |) on the py command line

    are captured by the cmd2 parser, not by any shell that might redirect I/O for the python interpreter. This means that shell heredocs and shell variables are not supported within py.

  • Support for compound single-line statements (py stmt1; stmt2)

    with a ‘;’ separator disappeared from cmd2 prior to v2.4. A common usage was to import a module, then use a module attribute. Subsequently, code following ‘;’ was ignored silently, affecting some legacy scripts.

All of these issues are by-passed by invoking the full python shell instead of the single-line py command.

Other commands available in all applications

Use application help <command> for further details:

alias

Manage aliases

edit

Run a text editor and optionally open a file with it

exit

Safe program exit from top level, not subcommands.

help

List available commands or help for one or all commands

history

View, run, edit, save, or clear previously entered commands

macro

Manage macros

parrot

Echo commands (or not)

py

Invoke Python shell, ending exit(); single line “py command” no longer supported; see comments above and also shell

quit

Exit this application

run_pyscript

Run a Python script file inside the console

run_script

Run commands in script file that is encoded as either ASCII or UTF-8 text

set

Set a settable parameter or show current settings of parameters

shell

Execute a command as if at the OS prompt

shortcuts

List available shortcuts

test

test [<name>]: run test (developers only)

Command list

The most up-to-date documentation is generated from symexp.py /dev/null help exit:

Documented commands (use 'help -v' for verbose/'help <topic>' for details):
===========================================================================
alias  help     neighbors  py            run_script  shortcuts
edit   history  parrot     quit          set         sphere
exit   macro    pdbout     run_pyscript  shell       test

pasto.symexp>  exit

Command Help

The most up-to-date documentation is generated from symexp.py then help -rl

alias

Usage: alias [-h] SUBCOMMAND ...

Manage aliases

An alias is a command that enables replacement of a word by another string.

optional arguments:
  -h, --help  show this help message and exit

subcommands:
  SUBCOMMAND
    create    create or overwrite an alias
    delete    delete aliases
    list      list aliases

See also:
  macro

edit

Usage: edit [-h] [file_path]

Run a text editor and optionally open a file with it

The editor used is determined by a settable parameter. To set it:

  set editor (program-name)

positional arguments:
  file_path   optional path to a file to open in editor

optional arguments:
  -h, --help  show this help message and exit

exit

usage: exit [-h]

Safe program exit from top level, not subcommands.

options:
  -h, --help  show this help message and exit

help

usage: help [-h] [-v] [-d] [-l] [-r] [command] ...

List available commands or help for one or all commands

positional arguments:
  command          [command-name, optional]
  subcommands      subcommand(s) to retrieve help for

options:
  -h, --help       show this help message and exit
  -v, --verbose    single-line description of each command
  -d, --descend    Document not the command(s) but sub-commands thereof
  -l, --long       Full (multi-line) help for each command.
  -r, --recursive  Recursively descend through nested command sets.

history

Usage: history [-h] [-r | -e | -o FILE | -t TRANSCRIPT_FILE | -c] [-s] [-x] [-v] [-a] [arg]

View, run, edit, save, or clear previously entered commands

positional arguments:
  arg                   empty               all history items
                        a                   one history item by number
                        a..b, a:b, a:, ..b  items by indices (inclusive)
                        string              items containing string
                        /regex/             items matching regular expression

optional arguments:
  -h, --help            show this help message and exit
  -r, --run             run selected history items
  -e, --edit            edit and then run selected history items
  -o, --output_file FILE
                        output commands to a script file, implies -s
  -t, --transcript TRANSCRIPT_FILE
                        output commands and results to a transcript file,
                        implies -s
  -c, --clear           clear all history

formatting:
  -s, --script          output commands in script format, i.e. without command
                        numbers
  -x, --expanded        output fully parsed commands with any aliases and
                        macros expanded, instead of typed commands
  -v, --verbose         display history and include expanded commands if they
                        differ from the typed command
  -a, --all             display all commands, including ones persisted from
                        previous sessions

macro

Usage: macro [-h] SUBCOMMAND ...

Manage macros

A macro is similar to an alias, but it can contain argument placeholders.

optional arguments:
  -h, --help  show this help message and exit

subcommands:
  SUBCOMMAND
    create    create or overwrite a macro
    delete    delete macros
    list      list macros

See also:
  alias

neighbors

usage: neighbors [-h] [-d FLOAT]

Identify neighbors within distance of (selected) atoms.

options:
  -h, --help            show this help message and exit
  -d FLOAT, --distance FLOAT
                        Searches for neighbors within distance of any atom (default: 3.5 A).

parrot

usage: parrot [-h] [parrot [True|False|Yes|No]]

Echo commands (or not)

positional arguments:
  parrot [True|False|Yes|No]
                        abbrev. OK; default is to toggle

options:
  -h, --help            show this help message and exit

(similar to set echo True|False.)

pdbout

usage: pdbout [-h] -o FILE [-a] [header]

Write coordinates (symmetry expansion as per command-line arguments).

positional arguments:
  header                [Header inserted into top of PDB file]

options:
  -h, --help            show this help message and exit
  -o FILE, --file FILE  Output file name.
  -a, --anisotropic     Output anisotropic U, riding w/ coordinates from input PDB.

py

Usage: py [-h]

Run an interactive Python shell

optional arguments:
  -h, --help  show this help message and exit

quit

Usage: quit [-h]

Exit this application

optional arguments:
  -h, --help  show this help message and exit

run_pyscript

Usage: run_pyscript [-h] script_path ...

Run a Python script file inside the console

positional arguments:
  script_path       path to the script file
  script_arguments  arguments to pass to script

optional arguments:
  -h, --help        show this help message and exit

run_script

Usage: run_script [-h] [-t TRANSCRIPT_FILE] script_path

Run commands in script file that is encoded as either ASCII or UTF-8 text

Script should contain one command per line, just like the command would be
typed in the console.

If the -t/--transcript flag is used, this command instead records
the output of the script commands to a transcript for testing purposes.

positional arguments:
  script_path           path to the script file

optional arguments:
  -h, --help            show this help message and exit
  -t, --transcript TRANSCRIPT_FILE
                        record the output of the script as a transcript file

set

usage: set [-h] [-v] [param] [value]

Set a settable parameter or show current settings of parameters

positional arguments:
  param          parameter to set or view
  value          new value for settable

options:
  -h, --help     show this help message and exit
  -v, --verbose  include description of parameters when viewing

shell

Usage: shell [-h] command ...

Execute a command as if at the OS prompt

positional arguments:
  command       the command to run
  command_args  arguments to pass to command

optional arguments:
  -h, --help    show this help message and exit

shortcuts

Usage: shortcuts [-h]

List available shortcuts

optional arguments:
  -h, --help  show this help message and exit

sphere

usage: sphere [-h] [-c x y z] [-r FLOAT]

Find atoms & symmetry equivalents near given point.

options:
  -h, --help            show this help message and exit
  -c x y z, --center x y z
                        Center of sphere (default: [0.0, 0.0, 0.0]).
  -r FLOAT, --radius FLOAT
                        Radius (default: 10.0)

test [<name>]: run test (developers only)
pasto.symexp>  exit

Python Interpreter for advanced functionality

Rarely or never needed: parameters that are not accessible through options or commands can be changed through the embedded python interpreter that is invoked with the py command. See RSRef or Superpose for details.

Credits

The underlying algorithms were coded in Fortran by Michael S. Chapman in 1991. After several revisions by the same author, expcoord was given a major overhaul as it was refactored into C++ in 1995 by Eric Blanc, in the Chapman lab, who introduced various efficiencies. The methods were refactored by Michael into Python in 2011, with changes in library dependence and user interface, inclusion of space group symmetry as well as non-crystallographic (molecular) symmetry, and dual support for stand-alone and embedded use.

Colophon

This introductory documentation is generated from the source file: symexp_doc.rst. Its dependents are extracted from specific python modules using documentation.py, see instructions within.

Changed in version 0.1: 12/10/10 Start.

Changed in version 0.1.1: 09/09/11 refactoring into python, support of rsref / use of modules

Changed in version 0.4.1: 07/01/13 Space group / lattice symmetry

Changed in version 0.5.0: 03/12/15, converted to reStructuredText from Epydoc.

Changed in version 0.5.5: 10/20/15 Riding anisotropic Us

Changed in version 1.0.0: 11/06/20 Python 2.7 --> 3.6