The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

Chemistry::Bond::Find - Detect bonds in a molecule from atomic 3D coordinates and assign formal bond orders

SYNOPSIS

    use Chemistry::Bond::Find ':all'; # export all available functions

    # $mol is a Chemistry::Mol object
    find_bonds($mol);
    assign_bond_orders($mol);

DESCRIPTION

This module provides functions for detecting the bonds in a molecule from its 3D coordinates by using simple cutoffs, and for guessing the formal bond orders.

This module is part of the PerlMol project, http://www.perlmol.org/.

FUNCTIONS

These functions may be exported, although nothing is exported by default.

find_bonds($mol, %options)

Finds and adds the bonds in a molecule. Only use it in molecules that have no explicit bonds; for example, after reading a file with 3D coordinates but no bond orders.

Available options:

tolerance

Defaults to 1.1. Two atoms are considered to be bound if the distance between them is less than the sum of their covalent radii multiplied by the tolerance.

margin

NOTE: in general setting this option is not recommended, unless you know what you are doing. It is used by the space partitioning algorithm to determine the "bucket size". It defaults to 2 * Rmax * tolerance, where Rmax is the largest covalent radius among the elements found in the molecule. For example, if a molecule has C, H, N, O, and I, Rmax = R(I) = 1.33, so the margin defaults to 2 * 1.33 * 1.1 = 2.926. This margin ensures that no bonds are missed by the partitioning algorithm.

Using a smaller value gives faster results, but at the risk of missing some bonds. In this example, if you are certain that your molecule doesn't contain I-I bonds (but it has C-I bonds), you can set margin to (0.77 + 1.33) * 1.1 = 2.31 and you still won't miss any bonds (0.77 is the radius of carbon). This only has a significant impact for molecules with a thousand atoms or more, but it can reduce the execution time by 50% in some cases.

orders

If true, assign the bond orders after finding them, by calling assign_bond_orders($mol, %opts).

bond_class

The class that will be used for creating the new bonds. The default is the bond class returned by $mol->bond_class.

assign_bond_orders($mol, %opts)

Assign the formal bond orders in a molecule. The bonds must already be defined, either by find_bonds or because the molecule was read from a file that included bonds but no bond orders. If the bond orders were already defined (maybe the molecule came from a file that did include bond orders after all), the original bond orders are erased and the process begins from scratch. Two different algorithms are available, and may be selected by using the "method" option:

    assign_bond_orders($mol, method => 'itub');
    assign_bond_orders($mol, method => 'baber');
itub

This is the default if no method is specified. Developed from scratch by the author of this module, this algorithm requires only the connection table information, and it requires that all hydrogen atoms be explicit. It looks for an atom with unsatisfied valence, increases a bond order, and then does the same recursively for the neighbors. If everybody's not happy at the end, it backtracks and tries another bond. The recursive part does not cover the whole molecule, but only the contiguous region of "unhappy" atoms next to the starting atom and their neighbors. This permits separating the molecule into independent regions, so that if one is solved and there's a problem in another, we don't have to backtrack to the first one.

The itub algorithm has the following additional options:

use_coords

Although the algorithm does not require 3D coordinates, it uses them by default to improve the initial guesses of which bond orders should be increased. To avoid using coordinates, add the use_coords option with a false value:

    assign_bond_orders($mol, use_coords => 0);

The results are the same most of the time, but using good coordinates improves the results for complicated cases such as fused heteroaromatic systems.

scratch

If true, start the bond order assignment from scratch by assuming that all bond orders are 1. If false, start from the current bond orders and try to fix the unsatisfied valences. This option is true by default.

baber

A bond order assignment algorithm based on Baber, J. C.; Hodgkin, E. E. J. Chem. Inf. Comp. Sci. 1992, 32, 401-406 (with some interpretation).

This algorithm uses the 3D coordinates along with various cutoffs and confidence formulas to guess the bond orders. It then tries to resolve conflicts by looping through the atoms (but is not recursive or backtracking). It does not require explicit hydrogens (although it's better when they are available) because it was designed for use with real crystallographic data which often doesn't have hydrogen atoms.

This method doesn't always give a good answer, especially for conjugated and aromatic systems. The variation used in this module adds some random numbers to resolve some ambiguities and break loops, so the results are not even entirely deterministic (the 'itub' method is deterministic but the result may depend on the input order of the atoms).

VERSION

0.23

TO DO

Some future version should let the user specify the desired cutoffs, and not always create a bond but call a user-supplied function instead. This way these functions could be used for other purposes such as finding hydrogen bonds or neighbor lists.

Add some tests.

SEE ALSO

Chemistry::Mol, Chemistry::Atom, Chemistry::Bond, http://www.perlmol.org/.

AUTHOR

Ivan Tubert-Brohman<itub@cpan.org>

The new find_bonds algorithm was loosely based on a suggestion by BrowserUK on perlmonks.org (http://perlmonks.org/index.pl?node_id=352838).

COPYRIGHT

Copyright (c) 2009 Ivan Tubert-Brohman. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.