PyVOL Package

Submodules

PyVOL Cluster Module

Contains functions to cluster Spheres objects in memory; used in subpocket clustering.

pyvol.cluster.cluster_within_r(spheres, radius, allow_new=True)

Cluster spheres with the same radius using DBSCAN, modifying input data in situ

Args:

spheres (Spheres): complete set of input spheres
radius (float): radius at which clustering is to occur
allow_new (bool): permit new clusters? (Default value = True)
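
As a rough illustration of what clustering at a single radius involves, the sketch below implements a minimal DBSCAN in pure Python and writes the resulting labels into the group column of the matching rows in place. It is not PyVOL's implementation: the row layout `[x, y, z, r, g]`, the `eps` and `min_samples` parameters, and the names `dbscan` and `cluster_at_radius` are assumptions made for the example, and the `allow_new` option is not modeled.

```python
import math


def dbscan(points, eps, min_samples):
    """Minimal DBSCAN; returns one label per point, with -1 meaning noise."""
    n = len(points)
    labels = [None] * n  # None marks an unvisited point

    def neighbors(i):
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point joins as border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # j is a core point, so keep expanding
    return labels


def cluster_at_radius(rows, radius, eps, min_samples=1):
    """Label rows whose radius matches `radius`; modifies `rows` in place."""
    selected = [i for i, row in enumerate(rows) if math.isclose(row[3], radius)]
    labels = dbscan([rows[i][:3] for i in selected], eps, min_samples)
    for i, label in zip(selected, labels):
        rows[i][4] = label


# Hypothetical data: three spheres at radius 2.0 and one at radius 1.0
rows = [
    [0.0, 0.0, 0.0, 2.0, -1],
    [0.5, 0.0, 0.0, 2.0, -1],
    [10.0, 0.0, 0.0, 2.0, -1],
    [0.0, 0.0, 0.0, 1.0, -1],
]
cluster_at_radius(rows, radius=2.0, eps=1.0)
```

Only rows at the requested radius are relabeled; the sphere at radius 1.0 keeps its original group value.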

pyvol.cluster.cluster_between_r(spheres, ref_radius, target_radius)

Cluster spheres from a target radius to a reference radius, modifying input data in situ

Args:

spheres (Spheres): complete set of input spheres
ref_radius (float): radius from which cluster identities will be drawn
target_radius (float): radius to which cluster identities will be propagated
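
One plausible way to propagate identities between radii is a nearest-neighbor lookup: each sphere at the target radius takes the group of the closest sphere at the reference radius. The sketch below assumes that rule, along with a hypothetical `[x, y, z, r, g]` row layout and the name `propagate_groups`; the actual propagation criterion used by PyVOL may differ.

```python
import math


def propagate_groups(rows, ref_radius, target_radius):
    """Copy the group of the nearest reference sphere onto each target sphere, in place."""
    refs = [row for row in rows if math.isclose(row[3], ref_radius)]
    if not refs:
        return  # nothing to draw identities from
    for row in rows:
        if math.isclose(row[3], target_radius):
            nearest = min(refs, key=lambda ref: math.dist(ref[:3], row[:3]))
            row[4] = nearest[4]


# Hypothetical data: two labeled spheres at radius 3.0, two unlabeled at 2.5
rows = [
    [0.0, 0.0, 0.0, 3.0, 0],
    [8.0, 0.0, 0.0, 3.0, 1],
    [1.0, 0.0, 0.0, 2.5, -1],
    [7.0, 0.0, 0.0, 2.5, -1],
]
propagate_groups(rows, ref_radius=3.0, target_radius=2.5)
```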

pyvol.cluster.cluster_improperly_grouped(spheres, radius, min_cluster_size=1, max_clusters=None)

Reassigns improperly clustered spheres to ‘proper’ clusters, modifying input data in situ

Args:

spheres (Spheres): complete set of input spheres
radius (float): radius at which closest groups are identified
min_cluster_size (int): minimum number of spheres in a ‘proper’ cluster (Default value = 1)
max_clusters (int): maximum number of ‘proper’ clusters (Default value = None)

pyvol.cluster.extract_groups(spheres, surf_radius=None, prefix=None, group_names=None)

Extracts spheres belonging to each cluster from the complete input set and optionally calculates bounded surfaces

Args:

spheres (Spheres): complete set of input spheres
surf_radius: radius used to calculate bounding spheres for individual groups (Default value = None)
prefix: prefix to identify new surfaces (Default value = None)

Returns:

group_list ([Spheres]): a list of Spheres objects each corresponding to a different cluster

pyvol.cluster.hierarchically_cluster_spheres(spheres, ordered_radii, min_new_radius=None, min_cluster_size=10, max_clusters=None)

Cluster spheres by grouping spheres at large radius and propagating those assignments down to smaller radii

Args:

spheres (Spheres): complete set of input spheres
ordered_radii ([float]): list of radii ordered from largest to smallest
min_new_radius (float): smallest spheres to keep (Default value = None)
min_cluster_size (int): minimum number of spheres in a cluster (Default value = 10)
max_clusters (int): maximum number of clusters (Default value = None)

pyvol.cluster.identify_closest_grouped(spheres, group, radius)

Identifies the closest ‘properly’ grouped cluster to a specified group

Args:

spheres (Spheres): complete set of input spheres
group (float): group for which to identify the closest clusters
radius (float): radius at which to perform the search

Returns:

group (float): passthrough of input group
closest (float): id of the closest cluster
magnitude (int): number of pairwise closest connections between the queried group and the closest identified cluster
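
The three return values can be pictured with a small voting scheme: every sphere in the queried group votes for the group of its nearest neighbor outside the group, and the winner's vote count is reported as the magnitude. This is only one reading of "pairwise closest connections"; the `[x, y, z, r, g]` row layout and the name `closest_group` are assumptions, and the distinction between ‘properly’ and improperly grouped clusters is not modeled.

```python
import math
from collections import Counter


def closest_group(rows, group, radius):
    """Return (group, closest, magnitude) using nearest-neighbor votes at one radius."""
    at_radius = [row for row in rows if math.isclose(row[3], radius)]
    members = [row for row in at_radius if row[4] == group]
    others = [row for row in at_radius if row[4] != group]
    if not members or not others:
        return group, None, 0
    votes = Counter()
    for member in members:
        nearest = min(others, key=lambda other: math.dist(other[:3], member[:3]))
        votes[nearest[4]] += 1
    closest, magnitude = votes.most_common(1)[0]
    return group, closest, magnitude


# Hypothetical data: group 2 sits mostly next to group 0
rows = [
    [0.0, 0.0, 0.0, 2.0, 2],
    [1.0, 0.0, 0.0, 2.0, 2],
    [9.0, 0.0, 0.0, 2.0, 2],
    [0.5, 1.0, 0.0, 2.0, 0],
    [10.0, 0.0, 0.0, 2.0, 1],
]
result = closest_group(rows, group=2, radius=2.0)
```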

pyvol.cluster.merge_sphere_list(s_list, r=None, g=None)

Merges a list of Spheres objects into a single Spheres object

Args:

s_list ([Spheres]): list of input spheres
r (float): radius value to assign to output Spheres (Default value = None)
g (float): group value to assign to output Spheres (Default value = None)

Returns:

merged_spheres (Spheres): a single Spheres object containing the merged input lists

pyvol.cluster.reassign_group(spheres, source_group, target_group)

Reassign a group in place

Args:

spheres (Spheres): complete set of input spheres
source_group (float): group to change
target_group (float): new group id

pyvol.cluster.reassign_groups_to_closest(spheres, group_list, radius, iterations=None, preserve_largest=False)

Reassign a group to the closest group as identified by maximum linkage; operates in place

Args:

spheres (Spheres): complete set of input spheres
group_list ([float]): list of group ids which are to be iteratively reassigned
radius (float): radius at which searches are to take place
iterations (int): number of times to attempt to reassign groups (Default value = None)
preserve_largest: keep the group id of the group with more members? (Default value = False)

pyvol.cluster.remove_interior(spheres)

Remove all spheres which are completely enclosed in larger spheres; operates in place

Args:

spheres (Spheres): complete set of input spheres
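
The enclosure test behind this kind of filter is simple geometry: a sphere lies completely inside a larger one when the distance between centers plus the smaller radius does not exceed the larger radius. A minimal sketch of that test follows, assuming a hypothetical `[x, y, z, r]` row layout and the illustrative name `drop_enclosed`; it uses a brute-force pairwise comparison and is not taken from the PyVOL source.

```python
import math


def drop_enclosed(rows):
    """Remove, in place, every sphere that lies entirely inside a larger sphere."""

    def enclosed(inner, outer):
        # Strictly smaller radius, and the far edge of `inner` stays within `outer`
        gap = math.dist(inner[:3], outer[:3])
        return inner[3] < outer[3] and gap + inner[3] <= outer[3]

    rows[:] = [s for s in rows if not any(enclosed(s, other) for other in rows)]


# Hypothetical data: the second sphere is interior, the third pokes out
rows = [
    [0.0, 0.0, 0.0, 3.0],
    [0.5, 0.0, 0.0, 1.0],
    [2.5, 0.0, 0.0, 1.0],
]
drop_enclosed(rows)
```

Assigning through `rows[:]` keeps the same list object, which mirrors the "operates in place" behavior described above.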

pyvol.cluster.remove_included_spheres(spheres, ref_spheres, radius)

Removes all spheres with centers within radius of ref_spheres

pyvol.cluster.remove_overlap(spheres, radii=None, spacing=0.1, iterations=20, tolerance=0.02, static_last_group=False)

Remove overlap between groups; operates in place

Args:

spheres (Spheres): complete set of input spheres
radii ([float]): radii at which to perform searches for overlap (Default value = None)
spacing (float): binning radius (Default value = 0.1)
iterations (int): number of times to attempt overlap removal (Default value = 20)
tolerance (float): overlap tolerance (Default value = 0.02)
static_last_group (bool): don’t move the ‘other’ group but rather the first group twice as much (effectively leaves the group with the highest index in place while moving everything else around it) (Default value = False)

PyVOL Identify Module

pyvol.identify.load_calculation(data_dir, input_opts=None)

Load the results of a calculation from file

Args:

data_dir (str): directory where previous calculation results are stored
input_opts (dict): dictionary of pyvol options that is used to update the options read in from file

Returns:

pockets ([Spheres]): a list of Spheres objects each of which contains the geometric information describing a distinct pocket or subpocket
opts (dict): updated PyVOL options dictionary

pyvol.identify.pocket(**opts)

Calculates the SES for a binding pocket

Args:

opts (dict): dictionary containing all PyVOL options (see pyvol.pymol_interface.pymol_pocket_cmdline for details)

Returns:

pockets ([Spheres]): a list of Spheres objects each of which contains the geometric information describing a distinct pocket or subpocket

pyvol.identify.pocket_wrapper(**opts)

Wrapper for pocket that configures the logger, sanitizes inputs, and catches errors; useful when running from the command line or PyMOL but split from the core code for programmatic usage

Args:

opts (dict): dictionary containing all PyVOL options (see pyvol.pymol_interface.pymol_pocket_cmdline for details)

Returns:

pockets ([Spheres]): a list of Spheres objects each of which contains the geometric information describing a distinct pocket or subpocket
output_opts (dict): dictionary containing the actual options used in the pocket calculation

pyvol.identify.subpockets(bounding_spheres, ref_spheres, **opts)

Subdivides a pocket into subpockets

Args:

bounding_spheres (Spheres): a Spheres object containing both the peptide and solvent exposed face external spheres
ref_spheres (Spheres): a Spheres object holding the interior spheres that define the pocket to be subdivided
opts (dict): a dictionary containing all PyVOL options (see pyvol.configuration.clean_opts for details)

Returns:

grouped_list ([Spheres]): a list of Spheres objects each of which contains the geometric information describing a distinct subpocket

pyvol.identify.write_cfg(**opts)

Write the processed configuration to file

Args:

output_dir (str): output directory, relative or absolute
prefix (str): identifying prefix for the output files

pyvol.identify.write_report(all_pockets, **opts)

Write a brief report of calculated volumes to file

Args:

all_pockets ([Spheres]): a list of Spheres objects each of which contains the complete information about a distinct pocket or subpocket
output_dir (str): output directory, relative or absolute
prefix (str): identifying prefix for output files

PyVOL Pymol Interface Module

Front facing PyMOL functions

pyvol.pymol_interface.display_pockets(pockets, **opts)

Display a list of pockets

Args:

pockets ([Spheres]): list of Spheres objects to display
opts (dict): a dictionary containing all PyVOL options (see pyvol.pymol_interface.pymol_pocket_cmdline for details)

pyvol.pymol_interface.load_calculation_cmdline(data_dir, prefix=None, display_mode=None, palette=None, alpha=None)

Loads a previously calculated pocket from disk and displays it in PyMOL

Args:

data_dir (str): directory containing PyVOL output (by default ends in .pyvol)
prefix (str): internal display name (Default value = None)
display_mode (str): display mode (Default value = “solid”)
palette (str): comma-separated list of PyMOL color strings (Default value = None)
alpha (float): transparency value (Default value = 1.0)

pyvol.pymol_interface.pymol_pocket_cmdline(protein=None, ligand=None, prot_file=None, lig_file=None, min_rad=1.4, max_rad=3.4, constrain_radii=True, mode='largest', coordinates=None, residue=None, resid=None, lig_excl_rad=None, lig_incl_rad=None, min_volume=200, subdivide=False, max_clusters=None, min_subpocket_rad=1.7, max_subpocket_rad=3.4, min_subpocket_surf_rad=1.0, radial_sampling=0.1, inclusion_radius_buffer=1.0, min_cluster_size=50, project_dir=None, output_dir=None, prefix=None, logger_stream_level='INFO', logger_file_level='DEBUG', protein_only=False, display_mode='solid', alpha=1.0, palette=None)

PyMOL-compatible command line entry point

Args:

protein (str): PyMOL-only; PyMOL selection string for the protein (Default value = None)
ligand (str): PyMOL-only; PyMOL selection string for the ligand (Default value = None)
prot_file (str): filename for the input pdb file containing the peptide; redundant with the protein argument (Default value = None)
lig_file (str): filename for the input pdb file containing a ligand; redundant with the ligand argument (Default value = None)
min_rad (float): radius for SES calculations (Default value = 1.4)
max_rad (float): radius used to identify the outer, bulk solvent exposed surface (Default value = 3.4)
constrain_radii (bool): restrict input radii to tested values? (Default value = True)
mode (str): pocket identification mode (can be largest, all, or specific) (Default value = “largest”)
coordinates ([float]): 3D coordinate used for pocket specification (Default value = None)
residue (str): PyMOL-only; PyMOL selection string for a residue to use for pocket specification (Default value = None)
resid (str): residue identifier for pocket specification (Default value = None)
lig_excl_rad (float): maximum distance from a provided ligand that can be included in calculated pockets (Default value = None)
lig_incl_rad (float): minimum distance from a provided ligand that should be included in calculated pockets when solvent border is ambiguous (Default value = None)
min_volume (float): minimum volume of pockets returned when running in ‘all’ mode (Default value = 200)
subdivide (bool): calculate subpockets? (Default value = False)
max_clusters (int): maximum number of clusters (Default value = None)
min_subpocket_rad (float): minimum radius that identifies distinct subpockets (Default value = 1.7)
max_subpocket_rad (float): maximum sampling radius used in subpocket identification (Default value = 3.4)
min_subpocket_surf_rad (float): radius used to calculate subpocket surfaces (Default value = 1.0)
inclusion_radius_buffer (float): buffer radius in excess of the nonextraneous radius from the identified pocket used to identify atoms pertinent to subpocket clustering (Default value = 1.0)
radial_sampling (float): radial sampling used for subpocket clustering (Default value = 0.1)
min_cluster_size (int): minimum number of spheres in a proper cluster; used to eliminate insignificant subpockets (Default value = 50)
project_dir (str): parent directory in which to create the output directory if the output directory is unspecified (Default value = None)
output_dir (str): filename of the directory in which to place all output; can be absolute or relative (Default value = None)
prefix (str): identifying string for output (Default value = None)
logger_stream_level (str): sets the logger level for stdio output (Default value = “INFO”)
logger_file_level (str): sets the logger level for file output (Default value = “DEBUG”)
protein_only (bool): PyMOL-only; include only peptides in protein file (Default value = False)
display_mode (str): PyMOL-only; display mode for calculated pockets (Default value = “solid”)
alpha (float): PyMOL-only; display option specifying translucency of CGO objects (Default value = 1.0)
palette (str): PyMOL-only; display option representing a comma separated list of PyMOL color strings (Default value = None)

pyvol.pymol_interface.pymol_pocket(**opts)

Perform PyMOL-dependent processing of inputs to generate input files for PyVOL pocket processing

Args:

opts (dict): dictionary containing all PyVOL options (see pyvol.pymol_interface.pymol_pocket_cmdline for details)

Returns:

pockets ([Spheres]): a list of Spheres objects each of which contains the geometric information describing a distinct pocket or subpocket
output_opts (dict): dictionary containing the actual options used in the pocket calculation

PyVOL Pymol Utilities Module

PyMOL convenience functions used by the front-end contained in pymol_interface.

pyvol.pymol_utilities.construct_palette(color_list=None, max_value=7, min_value=0)

Construct a palette

Args:

color_list ([str]): list of PyMOL color strings (Default value = None)
max_value (int): max palette index (Default value = 7)
min_value (int): min palette index (Default value = 0)

Returns:

palette ([str]): list of color definitions

pyvol.pymol_utilities.display_pseudoatom_group(spheres, name, color='gray60', palette=None)

Displays a collection of pseudoatoms

Args:

spheres (Spheres): Spheres object holding pocket geometry
name (str): display name
color (str): PyMOL color string (Default value = ‘gray60’)
palette ([str]): palette (Default value = None)

pyvol.pymol_utilities.display_spheres_object(spheres, name, state=1, color='marine', alpha=1.0, mode='solid')

Loads a mesh object into a cgo list for display in PyMOL

Args:

spheres (Spheres): Spheres object containing all geometry
name (str): display name
state (int): model state (Default value = 1)
color (str): PyMOL color string (Default value = ‘marine’)
alpha (float): transparency value (Default value = 1.0)
mode (str): display mode (Default value = “solid”)
palette ([str]): palette (Default value = None)

pyvol.pymol_utilities.mesh_to_solid_CGO(mesh, color, alpha=1.0)

Creates a solid CGO object for a mesh for display in PyMOL

Args:

mesh (Trimesh): Trimesh mesh object
color (str): PyMOL color string
alpha (float): transparency value (Default value = 1.0)

Returns:

cgobuffer (str): CGO buffer that contains the instruction to load a solid object

pyvol.pymol_utilities.mesh_to_wireframe_CGO(mesh, color_tuple, alpha=1.0)

Creates a wireframe CGO object for a mesh for display in PyMOL

Args:

mesh (Trimesh): Trimesh mesh object
color (str): PyMOL color string
alpha (float): transparency value (Default value = 1.0)

Returns:

cgobuffer (str): CGO buffer that contains the instruction to load a wireframe object

PyVOL Spheres Module

Defines the Spheres class which holds geometric information and performs basic operations on its data

class pyvol.spheres.Spheres(xyz=None, r=None, xyzr=None, xyzrg=None, g=None, pdb=None, bv=None, mesh=None, name=None, spheres_file=None)

Bases: object

copy()

Creates a copy in memory of itself

calculate_surface(probe_radius=1.4, cavity_atom=None, coordinate=None, all_components=False, exclusionary_radius=2.5, largest_only=False, noh=True, min_volume=200)

Calculate the SAS for a given probe radius

Args:

probe_radius (float): radius for surface calculations (Default value = 1.4)
cavity_atom (int): id of a single atom which lies on the surface of the interior cavity of interest (Default value = None)
coordinate ([float]): 3D coordinate to identify a cavity atom (Default value = None)
all_components (bool): return all pockets? (Default value = False)
exclusionary_radius (float): maximum permissible distance to the closest identified surface element from the supplied coordinate (Default value = 2.5)
largest_only (bool): return only the largest pocket? (Default value = False)
noh (bool): remove waters before surface calculation? (Default value = True)
min_volume (int): minimum volume of pockets returned when using ‘all_components’ (Default value = 200)

identify_nonextraneous(ref_spheres, radius)

Returns all spheres less than radius away from any center in ref_spheres using cKDTree search built on the non-reference set

Args:

ref_spheres (Spheres): object that defines the pocket of interest
radius (float): maximum distance to sphere centers to be considered nonextraneous

Returns:

nonextraneous (Spheres): a filtered Spheres object
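
The filter itself reduces to a center-to-center distance test. The sketch below shows that test with a brute-force loop instead of the cKDTree search mentioned above, so it scales as the product of the two set sizes and is only meant to make the selection rule concrete. The `[x, y, z, r]` row layout and the name `within_radius` are assumptions for the example.

```python
import math


def within_radius(rows, ref_rows, radius):
    """Return the rows whose centers lie less than `radius` from any reference center."""
    return [
        row
        for row in rows
        if any(math.dist(row[:3], ref[:3]) < radius for ref in ref_rows)
    ]


# Hypothetical data: one reference center and three candidate spheres
ref_rows = [[0.0, 0.0, 0.0, 1.5]]
rows = [
    [1.0, 0.0, 0.0, 1.8],
    [0.0, 4.9, 0.0, 1.8],
    [6.0, 0.0, 0.0, 1.8],
]
kept = within_radius(rows, ref_rows, radius=5.0)
```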

nearest(coordinate, max_radius=None)

Returns the index of the sphere closest to a coordinate; if max_radius is specified, the sphere returned must have a radius <= max_radius

Args:

coordinate (float nx3): 3D input coordinate
max_radius (float): maximum permissible distance to the nearest sphere (Default value = None)

Returns:

nearest_index: index of the closest sphere

propagate_groups_to_external(coordinates, tolerance=3)

Propagates group identifications to an external set of coordinates

Args:

coordinates (Nx3 ndarray): coordinates of the external spheres
tolerance (float): maximum distance exclusive of the radii of the internal spheres

Returns:

prop_groups ([int]): list of group identifications for the supplied external coordinates

nearest_coord_to_external(coordinates)

Returns the coordinate of the sphere closest to the supplied coordinates

Args:

coordinates (float nx3): set of coordinates

Returns:

coordinate (float 1x3): coordinate of internal sphere closest to the supplied coordinates

remove_duplicates(eps=0.01)

Remove duplicate spheres by identifying centers closer together than eps using DBSCAN

Args:

eps (float): DBSCAN input parameter (Default value = 0.01)

remove_ungrouped()

Remove all spheres that did not adequately cluster with the remainder of the set

remove_groups(groups)

Remove all spheres with specified group affiliations

Args:

groups ([float]): list of groups to remove

write(filename, contents='xyzrg', output_mesh=True)

Writes the contents of _xyzrg to a space delimited file

Args:

filename (str): filename to write the report and mesh if indicated
contents (str): string describing which columns to write to file (Default value = “xyzrg”)
output_mesh (bool): write mesh to file? (Default value = True)
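
To make the role of the `contents` string concrete, the sketch below treats each character as a column selector and writes the chosen columns as space-delimited text. The column mapping, the `[x, y, z, r, g]` row layout, the file name, and the function name `write_rows` are assumptions for illustration; mesh output is not modeled, and the exact number formatting PyVOL uses may differ.

```python
import os
import tempfile

# Assumed mapping from a column letter to its position in a row
COLUMNS = {"x": 0, "y": 1, "z": 2, "r": 3, "g": 4}


def write_rows(filename, rows, contents="xyzrg"):
    """Write the columns named in `contents` as space-delimited lines."""
    selected = [COLUMNS[letter] for letter in contents]
    with open(filename, "w") as handle:
        for row in rows:
            handle.write(" ".join(str(row[i]) for i in selected) + "\n")


rows = [[0.0, 1.0, 2.0, 1.4, 0], [3.0, 4.0, 5.0, 1.4, 1]]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "pocket.txt")
    write_rows(path, rows, contents="xyzr")  # leave out the group column
    with open(path) as handle:
        text = handle.read()
```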

property xyzrg

Retrieve the coordinates, radii, and group ids

property xyzr

Retrieve coordinates and radii

property xyz

Retrieve the coordinates

property r

Retrieve the radii

property g

Retrieve the group indices

PyVOL Utilities Module

pyvol.utilities.calculate_rotation_matrix(ref_vector, new_vector)

Calculates the 3D rotation matrix to convert from ref_vector to new_vector; not used in main PyVOL calculations

Args:

ref_vector (3x1 ndarray): original vector
new_vector (3x1 ndarray): target vector

Returns:

rot_matrix (3x3 ndarray): rotation matrix to convert the original vector to the target vector
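
A standard way to build such a matrix is Rodrigues' rotation formula, R = I + K + K²/(1 + c), where K is the cross-product matrix of a × b for unit vectors a and b, and c = a · b. The sketch below applies that formula with plain Python lists; whether PyVOL uses this exact construction is not stated here, and the name `rotation_matrix` is chosen for the example only.

```python
import math


def rotation_matrix(ref_vector, new_vector):
    """Return the 3x3 matrix rotating ref_vector onto new_vector (Rodrigues' formula)."""

    def unit(v):
        norm = math.sqrt(sum(c * c for c in v))
        return [c / norm for c in v]

    a, b = unit(ref_vector), unit(new_vector)
    # Rotation axis (unnormalized) and cosine of the rotation angle
    v = [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]
    c = sum(x * y for x, y in zip(a, b))
    if math.isclose(c, -1.0):
        raise ValueError("antiparallel vectors: the rotation axis is not unique")
    k = [[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]]
    k2 = [[sum(k[i][m] * k[m][j] for m in range(3)) for j in range(3)] for i in range(3)]
    factor = 1.0 / (1.0 + c)
    return [
        [(1.0 if i == j else 0.0) + k[i][j] + factor * k2[i][j] for j in range(3)]
        for i in range(3)
    ]


def apply(matrix, vector):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(matrix[i][j] * vector[j] for j in range(3)) for i in range(3)]


rot = rotation_matrix([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
rotated = apply(rot, [1.0, 0.0, 0.0])
```

The antiparallel case is rejected explicitly because the formula divides by 1 + c, which vanishes when the vectors point in opposite directions.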

pyvol.utilities.closest_vertex_normals(ref_mesh, query_mesh, ref_coordinates=None, ref_radius=2, interface_gap=2)

Returns the location and normal for the closest point between two meshes

Args:

ref_mesh (trimesh): origin mesh
query_mesh (trimesh): target mesh
ref_coordinates (3xN ndarray): coordinates used to specify the pertinent subregion on the ref_mesh
ref_radius (float): radius used to identify points on the ref_mesh that are sufficiently close to the ref_coordinates
interface_gap (float): maximum distance between the ref and query meshes at the identified point

Returns:

mean_pos (3x1 ndarray): coordinate of the central point between the meshes
mean_normal (3x1 ndarray): normalized vector pointing from the ref_mesh to the query_mesh

pyvol.utilities.check_dir(location)

Ensure that a specified directory exists

Args:

location (str): target directory
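
A common way to guarantee that a directory exists is `os.makedirs` with `exist_ok=True`, which creates missing parents and does nothing if the path is already there. The sketch below assumes that behavior is what is wanted; the name `ensure_dir` is hypothetical and the example does not claim to reproduce `check_dir` itself.

```python
import os
import tempfile


def ensure_dir(location):
    """Create `location` (and any missing parents) if it does not already exist."""
    os.makedirs(location, exist_ok=True)
    return location


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "project", "output.pyvol")
    ensure_dir(target)
    ensure_dir(target)  # a second call is harmless
    created = os.path.isdir(target)
```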

pyvol.utilities.configure_logger(filename=None, stream_level=None, file_level=None)

Configures the base logger

Args:

filename (str): target filename if the log is to be written to file (Default value = None)
stream_level (str): log level for the stream handler (Default value = None)
file_level (str): log level for the file handler (Default value = None)
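
The separate stream and file levels map naturally onto two handlers attached to one logger, each with its own threshold. The following sketch shows that arrangement with the standard `logging` module; the logger name, the function name `configure`, and the choice to skip a handler when its level is None are assumptions rather than documented PyVOL behavior. Existing handlers are removed first so repeated configuration does not duplicate output.

```python
import logging
import os
import tempfile


def configure(name, filename=None, stream_level=None, file_level=None):
    """Attach stream and file handlers with independent levels to a named logger."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)  # let the handlers decide what to emit
    for handler in list(logger.handlers):
        logger.removeHandler(handler)  # start from a clean logger
        handler.close()
    if stream_level is not None:
        stream_handler = logging.StreamHandler()
        stream_handler.setLevel(stream_level)
        logger.addHandler(stream_handler)
    if filename is not None and file_level is not None:
        file_handler = logging.FileHandler(filename)
        file_handler.setLevel(file_level)
        logger.addHandler(file_handler)
    return logger


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "example.log")
    logger = configure("pyvol_sketch_logger", log_path, "INFO", "DEBUG")
    logger.debug("written to file only")  # below INFO, so the stream handler skips it
    handler_count = len(logger.handlers)
    configure("pyvol_sketch_logger")  # remove and close handlers before cleanup
    with open(log_path) as handle:
        log_text = handle.read()
```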

pyvol.utilities.clean_logger()

Removes current handlers from the main PyVOL logger so that new ones can be assigned

pyvol.utilities.coordinates_for_resid(pdb_file, resid, chain=None, model=0, sidechain_only=True)

Extract the 3D coordinates for all atoms in a specified residue from a pdb file

Args:

pdb_file (str): filename of the specified pdb file
resid (int): residue number
chain (str): chain identifier (Default value = None)
model (int): model identifier (Default value = 0)
sidechain_only (bool): return only sidechain atom coordinates? (Default value = True)

Returns:

coordinates ([[float]]): 3xN array containing all atomic positions

pyvol.utilities.run_cmd(options, in_directory=None)

Run a program using the command line

Args:

options ([str]): list of command line options
in_directory (str): directory in which to run the command (Default value = None)
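
Running an external program from a list of options and an optional working directory is typically done with `subprocess.run` and its `cwd` argument. The sketch below assumes that approach; the name `run`, the decision to capture and return standard output, and the use of `check=True` to raise on a nonzero exit code are choices made for the example and may not match `run_cmd`.

```python
import subprocess
import sys


def run(options, in_directory=None):
    """Run a command given as a list of strings and return its standard output."""
    result = subprocess.run(
        options,
        cwd=in_directory,  # None means the current working directory
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a nonzero exit code
    )
    return result.stdout


# Use the running interpreter as a stand-in for an external program
output = run([sys.executable, "-c", "print('ok')"])
```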

pyvol.utilities.surface_multiprocessing(args)

A single surface calculation designed to be run in parallel

Args:
args: a tuple containing:

spheres (Spheres): a Spheres object containing all surface producing objects
probe_radius (float): radius to use for probe calculations
kwargs (dict): all remaining arguments accepted by the surface calculation algorithm

Returns:

surface (Spheres): the input Spheres object but with calculated surface parameters

pyvol.utilities.sphere_multiprocessing(spheres, radii, workers=None, **kwargs)

A wrapper function to calculate multiple surfaces using multiprocessing

Args:

spheres (Spheres): input Spheres object
radii ([float]): list of radii at which surfaces will be calculated
workers (int): number of workers (Default value = None)
kwargs (dict): all remaining arguments accepted by surface calculation that are constant across parallel calculations

Returns:

surfaces ([Spheres]): a list of Spheres objects each with its surface calculated
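
The pairing of a single-argument worker with a wrapper that builds one tuple per radius is a common pattern for `pool.map`, which passes exactly one argument to each call. The sketch below illustrates the pattern with `multiprocessing.pool.ThreadPool` so that it runs in any context; a process-based `multiprocessing.Pool` exposes the same `map` interface. The worker body is a placeholder that returns a small dictionary instead of computing a surface, and the names `surface_worker` and `run_over_radii` are hypothetical.

```python
from multiprocessing.pool import ThreadPool


def surface_worker(args):
    """Unpack one (spheres, probe_radius, kwargs) tuple and process it."""
    spheres, probe_radius, kwargs = args
    # Placeholder for a real surface calculation
    return {
        "probe_radius": probe_radius,
        "n_spheres": len(spheres),
        "options": dict(kwargs),
    }


def run_over_radii(spheres, radii, workers=None, **kwargs):
    """Build one task tuple per radius and map the worker over them."""
    tasks = [(spheres, radius, kwargs) for radius in radii]
    with ThreadPool(workers) as pool:
        return pool.map(surface_worker, tasks)  # results keep the order of `radii`


results = run_over_radii(
    [(0.0, 0.0, 0.0, 1.5)], [1.4, 2.0, 3.4], workers=2, largest_only=True
)
```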