Utilities¶

Overview

This module contains various utility functions that are used throughout the htf module. These are imported automatically when htf is imported.

Details

utils.center_of_mass(positions, mapping, box_size, name='center-of-mass')¶

Computes mapped positions given positions and system-level mapping by considering PBC.

Parameters:	positions (N x 3 tensor) – The tensor of particle positions mapping (M x N tensor) – The coarse-grain system-level mapping used to produce the particles in system box_size (list) – A list that contains the size of the box `[Lx, Ly, Lz]` name (string) – The name of the op to add to the TF graph
Returns:	An [M x 3] mapped particles

utils.compute_adj_mat(obj)¶

Given a CG mapping file in json format, outputs the adjacency matrix. See utils.compute_cg_graph().

Parameters:	obj (dict) – mapping output from DSGPM
Returns:	Adjacency matrix of the mapping

utils.compute_cg_graph(DSGPM=True, infile=None, adj_mat=None, cg_beads=None, group_atoms=False, u_no_H=None, u_H=None)¶

Given a CG mapping in JSON format(from DSGPM model) OR adjacency matrix, outputs indices of connected CG beads to compute CG bond distances, CG angles and CG dihedrals. If DSGPM is True, path to jsonfiles must be specified. If DSGPM is False, adjacency matrix and the number of CG beads must be specified. If group_atoms is given as True outputs CG coordinates as well. If group_atoms flag is set to True, two MDAnalysis universes with Hydrogens and without Hydrogens must be given as arguments.

Optional dependencies: MDAnalysis, networkx

Parameters:

DSGPM (bool) – flag to identify if mapping in json format is used or not
infile (str) – path to the CG mapping in JSON format
adj_matrix (numpy array) – adjacency matrix (if DSGPM=False)
cg_beads (int) – number of CG beads per molecule
group_atoms (bool) – flag to output CG coordinates
u_no_H (MDAnalysis universe) – All atom structure without hydrogens
u_H (MDAnalysis universe) – All atom structure with hydrogens

Returns:

List of indices bonded CG bead pairs, list of indices of CG beads making angles, list of indices of CG beads making dihedrals, and/or CG coordinates

utils.compute_nlist(positions, r_cut, NN, box_size, sorted=False, return_types=False, exclusion_matrix=None)¶

Computes particle pairwise neighbor lists. This is intended to be used with parsing a trajectory, not while running a simulation. It is much slower than to SimModel.mapped_nlist(), but doesn’t require running hoomd.

Parameters:

positions (N x 4 or N x 3 tensor) – Positions of the particles
r_cut (float) – Cutoff radius (Hoomd units)
NN (int) – Maximum number of neighbors per particle
box_size (list or shape 3 tensor) – A list contain the size of the box [Lx, Ly, Lz]
sorted (bool) – Whether to sort neighbor lists by distance
return_types (bool) – If true, requires N x 4 positions array and last element of nlist is type. Otherwise last element is index of neighbor
exclusion_matrix (Tensor of dtype bools) – Matrix (True = exclude) with size B x B (where B is the total number of bead particles in the system) indicating which pairs should be excluded from nlist

Returns:

An [N x NN x 4] tensor containing neighbor lists of all particles and index

utils.compute_ohe_bead_type_interactions(pos_btype, nlist_btype, n_btypes)¶

Computes bead type interactions as a one-hot encoding.

Parameters:	pos_btype (N tensor with dtype tf.int32) – type of the beads based on the mapped positions[…,-1] nlist_btype (N x M tensor with dtype tf.int32) – type of the beads based on the mapped neighborlist[…,-1] n_btypes (int) – number of unique bead types in the CG molecule
Returns:	An [N x M x I] array, where M is the total number of beads in the system, N is the size of CG neighborlist and I is the total number of possible interactions between two beads

utils.compute_pairwise(model, r, type_i=0, type_j=0)¶

Computes model output for a 2 particle system of type_i and type_j at distances set by r. If the model outputs two tensors of shape L x M and K, then the output will be a tuple of numpy arrays of size N x L x M and N x K, where N is number of points in r.

Parameters:	model (`SimModel`) – The model r (numpy array) – A 1D grid of points at which to compute the potential. type_i (int) – First particle (bead) type type_j (int) – Second particle (bead) type
Returns:	A tuple of numpy arrays as output from `model`

utils.create_frame(frame_number, N, types, typeids, positions, box)¶

Creates snapshots of a system state.

Parameters:	frame_number (int) – Frame number in a trajectory N (int) – Number of CG beads types (List of strings (N,)) – Names of particle types typeids (Numpy array (N,)) – CG bead type id positions (Numpy array (N,3)) – CG beads positions box (Numpy array (6,)) – System box dimensions
Returns:	Snapshot of a system state

utils.find_cgnode_id(atm_id, cg)¶

Computes the CG bead index. Supports only outputs formats from DSGPM model. Called by compute_adj_mat function.

Parameters:	atm_id (int) – index of the atom to find its CG node id cg (numpy array) – array of cg beads
Returns:	CG bead index of the given atom

utils.find_molecules(system)¶

Given a hoomd system, return a mapping from molecule index to particle index. This is a slow function and should only be called once.

Parameters:	system – The molecular system in Hoomd.
Returns:	A list of length L (number of molecules) whose elements are lists of atom indices

utils.find_molecules_from_topology(universe, atoms_in_molecule_list, selection='all')¶

Given a universe from MDAnaylis and list of atoms in every molecule type in the system, return a mapping from molecule index to particle index. Depending on the size of your system, this fuction might be slow to run.

Parameters:	universe (MDAnalysis Universe object) – Use MDAnalysis universe to read the tpr topology file from GROMACS. selection – The atom groups to extract from universe atoms_in_molecule_list – This is a list of atoms lists in every molecule type in the system.
Returns:	A list of length L (number of molecules) whose elements are lists of atom indices.

Here’s an example:

TPR = 'nvt_prod.tpr'
TRAJECTORY = 'Molecules_CG_Mapping/traj.trr'
u = mda.Universe(TPR, TRAJECTORY)
atoms_in_molecule_list = [
    u.select_atoms("resname PHE and resid 0:1").names]
find_molecules_from_topology(
    u, atoms_in_molecule_list, selection = "resname PHE")

utils.gen_bonds_group(mapped_exclusion_list)¶

Generates HOOMD snapshot bonds group based on bead particles’ exclusion list.

Parameters:	mapped_exclusion_list – A [B x B] array of dtype bools, indicating which pairs should be excluded from nlist (True = exclude)
Returns:	`hoomd.data.make_snapshot.bonds.group()`

utils.gen_mapped_exclusion_list(universe, atoms_in_molecule, beads_mappings, selection='all')¶

Generates mapped exclusion list to compute mapped_nlist for non-bonded bead-type interactions.

Parameters:

universe (MDAnalysis Universe object) – MDAnalysis Universe that contains bond information
atoms_in_molecule (MDAnalysis.core.groups.AtomGroup) – Selection of atoms in the molecule from MDAnalysis universe
beads_mappings (Array) – List of lists of beads mappings. Note that each list should contain atoms as strings just like how they appear in the topology file.
selection (string) – The atom groups to extract from universe

Returns:

A [B x B] array of dtype bools, indicating which pairs should be excluded from nlist (True = exclude). B is the total number of bead particles in the system

utils.iter_from_trajectory(nneighbor_cutoff, universe, selection='all', r_cut=10.0, period=1, start=0.0, end=None)¶

This generator will process information from a trajectory and yield a tuple of [nlist, positions, box] and MDAnalysis.TimeStep object. The first list can be directly used to call a SimModel (e.g., model(inputs)). See SimModel.compute() for details of these terms.

Here’s an example:

model = MyModel(16)
for inputs, ts in iter_from_trajectory(16, universe):
    result = model(inputs)

Parameters:

nneighbor_cutoff (int) – The maximum size of neighbor list
universe – The MDAnalysis universe
selection (string) – The atom groups to extract from universe
r_cut (float) – The cutoff raduis to use in neighbor list calculations
period (int) – Period of reading the trajectory frames
start – Starting frame for reading the trajectory frames
period – Ending frame for reading the trajectory frames

utils.matrix_mapping(molecule, beads_mappings, mass_weighted=True)¶

Creates a M x N mass weighted mapping matrix where M is the number of atoms in the molecule and N is the number of mapping beads.

Parameters:	molecule (MDAnalysis Atoms object) – This is atom selection in the molecule. beads_mappings (Array) – List of lists of beads mapping. Note that each list should contain atoms as strings just like how they appear in the topology file. mass_weighted – Returns mass weighted mapping matrix (if True) or both mass weighted and non-mass weighted matrices (if False)
Returns:	Mappying operator at the molecule level. (Array/arrays) of shape M x N. Use `utils.sparse_mapping()` to get mapping operator at the system level

utils.mol_angle(mol_positions=None, type_i=None, type_j=None, type_k=None, CG=False, cg_positions=None, b1=None, b2=None, b3=None, box=None)¶

This method calculates the bond angle given three atoms batched by molecule. Or to output CG angles input CG=True and indices of the CG beads making the angles. cg_positions and bead indices can be computed by calling generate_cg_graph()

Parameters:

mol_positions (float) – Positions tensor of atoms batched by molecules. Can be created by calling build_mol_rep() method in simmodel
type_i (int) – Index of the first atom
type_j (int) – Index of the second atom
type_k (int) – Index of the third atom
CG (bool) – flag to compute CG angles must be given with b1, b2 and b3
cg_positions (float) – array of CG coordinates
b1 (list of ints) – indices of first set of CG beads
b2 (list of int) – indices of second set of CG beads
b3 (list of ints) – indices of third set of CG beads
box ([3,3] array) – low, high coordinates and tilt vectors of the box

Returns:

angles:Tensor containing all atom angles (CG=False) or cg_angles: list containing CG angles (CG=True)

utils.mol_bond_distance(mol_positions=None, type_i=None, type_j=None, CG=False, cg_positions=None, b1=None, b2=None, box=None)¶

This method calculates the bond distance given two atoms batched by molecule. Or to output CG bond distances, input CG=True and indices of the CG bead pairs cg_positions and bead indices can be computed by calling generate_cg_graph()

Parameters:

mol_positions (float) – Positions tensor of atoms batched by molecules. Can be created by calling build_mol_rep() method in simmodel
type_i (int) – Index of the first atom
type_j (int type) – Index of the second atom
CG (bool) – flag to compute CG bond distances (must be given with b1 and b2)
cg_positions (float) – array of CG coordinates
b1 (list of ints) – indcies of first set of CG beads
b2 (list of ints) – indices of second set of CG beads
box ([3,3] array) – low, high coordinates and tilt vectors of the box

Returns:

v_ij: Tensor containing bond distances(CG=False) or u_ij: Array containig CG bond distances(CG=True)

utils.mol_dihedral(mol_positions=None, type_i=None, type_j=None, type_k=None, type_l=None, CG=False, cg_positions=None, b1=None, b2=None, b3=None, b4=None, box=None)¶

This method calculates the dihedral angles given three atoms batched by molecule. Or to output CG dihedral angles input CG=True and indices of the CG beads making the angles. cg_positions and bead indices can be computed by calling generate_cg_graph()

Parameters:

mol_positions (float) – Positions tensor of atoms batched by molecules. Can be created by calling build_mol_rep() method in simmodel
type_i (int) – Index of the first atom
type_j (int) – Index of the second atom
type_k (int) – Index of the third atom
type_l – Index of the fourth atom
CG (bool) – flag to compute CG dihedral angles must be given with b1,b2,b3 and b4
cg_positions (float) – array of CG coordinates
b1 (list of ints) – indices of first set of CG beads
b2 (list of ints) – indices of second set of CG beads
b3 (list of ints) – indices of third set of CG beads
b4 (list of ints) – indices of fourth set of CG beads
box ([3,3] array) – low, high coordinates and tilt vectors of the box

Returns:

dihedrals:Tensor containing all atom dihedral angles (CG=False) or cg_dihedrals: list containing CG dihedral angles (CG=True)

utils.mol_features_multiple(bnd_indices=None, ang_indices=None, dih_indices=None, molecules=None, beads=None)¶

Use to apply the result of compute_cg_graph to multiple molecules in a system. Computes the atom/cg bead indices of bonds,angles and dihedrals for multiple molecules.

Parameters:	bnd_indices (numpy array of type int) – indices of atoms making bonds ang_indices (numpy array of type int) – indices of atoms making angles dih_indices (numpy array of type int) – indices of atoms making dihedrals molecules (int) – number of molecules in the system beads (int) – number of atoms/CG beads in a molecule
Returns:	arrays of indices of atoms/cg beads making bonds,

angles and dihedrals in a system of molecules.

utils.sparse_mapping(molecule_mapping, molecule_mapping_index, system=None)¶

This will create the necessary indices and values for defining a sparse tensor in tensorflow that is a mass-weighted B x N mapping operator. where B is the number of coarse-grained beads. This is a slow function and should not be called frequently.

Parameters:

molecule_mapping (list of numpy arrays) – This is a list of L x M matrices, where M is the number of atoms in the molecule and L is the number of coarse-grained beads. These dimensions can be different for different molecules. There should be one matrix per molecule. The ordering of the atoms should follow what is passed in from molecule_mapping_index
molecule_mapping_index (list of lists) – This is the output from find_molecules. A list of the same length as molecule_mapping whose elements are lists of atom indices
system – The hoomd system. This is used to get mass values for the mapping, if you would like to weight by mass of the atoms.

Returns:

A sparse tensorflow tensor of dimension M x N, where M is the number of molecules and N is number of atoms