Utilities

Overview

This module contains various utility functions that are used throughout the htf module. These are imported automatically when htf is imported.

Details

utils.center_of_mass(positions, mapping, box_size, name='center-of-mass')

Computes mapped positions given positions and system-level mapping by considering PBC.

Parameters:
  • positions (N x 3 tensor) – The tensor of particle positions
  • mapping (M x N tensor) – The coarse-grain system-level mapping used to produce the particles in system
  • box_size (list) – A list that contains the size of the box [Lx, Ly, Lz]
  • name (string) – The name of the op to add to the TF graph
Returns:

An [M x 3] mapped particles

utils.compute_adj_mat(obj)

Given a CG mapping file in json format, outputs the adjacency matrix. See utils.compute_cg_graph().

Parameters:obj (dict) – mapping output from DSGPM
Returns:Adjacency matrix of the mapping
utils.compute_cg_graph(DSGPM=True, infile=None, adj_mat=None, cg_beads=None, group_atoms=False, u_no_H=None, u_H=None)

Given a CG mapping in JSON format(from DSGPM model) OR adjacency matrix, outputs indices of connected CG beads to compute CG bond distances, CG angles and CG dihedrals. If DSGPM is True, path to jsonfiles must be specified. If DSGPM is False, adjacency matrix and the number of CG beads must be specified. If group_atoms is given as True outputs CG coordinates as well. If group_atoms flag is set to True, two MDAnalysis universes with Hydrogens and without Hydrogens must be given as arguments.

Optional dependencies: MDAnalysis, networkx

Parameters:
  • DSGPM (bool) – flag to identify if mapping in json format is used or not
  • infile (str) – path to the CG mapping in JSON format
  • adj_matrix (numpy array) – adjacency matrix (if DSGPM=False)
  • cg_beads (int) – number of CG beads per molecule
  • group_atoms (bool) – flag to output CG coordinates
  • u_no_H (MDAnalysis universe) – All atom structure without hydrogens
  • u_H (MDAnalysis universe) – All atom structure with hydrogens
Returns:

List of indices bonded CG bead pairs, list of indices of CG beads making angles, list of indices of CG beads making dihedrals, and/or CG coordinates

utils.compute_nlist(positions, r_cut, NN, box_size, sorted=False, return_types=False, exclusion_matrix=None)

Computes particle pairwise neighbor lists. This is intended to be used with parsing a trajectory, not while running a simulation. It is much slower than to SimModel.mapped_nlist(), but doesn’t require running hoomd.

Parameters:
  • positions (N x 4 or N x 3 tensor) – Positions of the particles
  • r_cut (float) – Cutoff radius (Hoomd units)
  • NN (int) – Maximum number of neighbors per particle
  • box_size (list or shape 3 tensor) – A list contain the size of the box [Lx, Ly, Lz]
  • sorted (bool) – Whether to sort neighbor lists by distance
  • return_types (bool) – If true, requires N x 4 positions array and last element of nlist is type. Otherwise last element is index of neighbor
  • exclusion_matrix (Tensor of dtype bools) – Matrix (True = exclude) with size B x B (where B is the total number of bead particles in the system) indicating which pairs should be excluded from nlist
Returns:

An [N x NN x 4] tensor containing neighbor lists of all particles and index

utils.compute_ohe_bead_type_interactions(pos_btype, nlist_btype, n_btypes)

Computes bead type interactions as a one-hot encoding.

Parameters:
  • pos_btype (N tensor with dtype tf.int32) – type of the beads based on the mapped positions[…,-1]
  • nlist_btype (N x M tensor with dtype tf.int32) – type of the beads based on the mapped neighborlist[…,-1]
  • n_btypes (int) – number of unique bead types in the CG molecule
Returns:

An [N x M x I] array, where M is the total number of beads in the system, N is the size of CG neighborlist and I is the total number of possible interactions between two beads

utils.compute_pairwise(model, r, type_i=0, type_j=0)

Computes model output for a 2 particle system of type_i and type_j at distances set by r. If the model outputs two tensors of shape L x M and K, then the output will be a tuple of numpy arrays of size N x L x M and N x K, where N is number of points in r.

Parameters:
  • model (SimModel) – The model
  • r (numpy array) – A 1D grid of points at which to compute the potential.
  • type_i (int) – First particle (bead) type
  • type_j (int) – Second particle (bead) type
Returns:

A tuple of numpy arrays as output from model

utils.create_frame(frame_number, N, types, typeids, positions, box)

Creates snapshots of a system state.

Parameters:
  • frame_number (int) – Frame number in a trajectory
  • N (int) – Number of CG beads
  • types (List of strings (N,)) – Names of particle types
  • typeids (Numpy array (N,)) – CG bead type id
  • positions (Numpy array (N,3)) – CG beads positions
  • box (Numpy array (6,)) – System box dimensions
Returns:

Snapshot of a system state

utils.find_cgnode_id(atm_id, cg)

Computes the CG bead index. Supports only outputs formats from DSGPM model. Called by compute_adj_mat function.

Parameters:
  • atm_id (int) – index of the atom to find its CG node id
  • cg (numpy array) – array of cg beads
Returns:

CG bead index of the given atom

utils.find_molecules(system)

Given a hoomd system, return a mapping from molecule index to particle index. This is a slow function and should only be called once.

Parameters:system – The molecular system in Hoomd.
Returns:A list of length L (number of molecules) whose elements are lists of atom indices
utils.find_molecules_from_topology(universe, atoms_in_molecule_list, selection='all')

Given a universe from MDAnaylis and list of atoms in every molecule type in the system, return a mapping from molecule index to particle index. Depending on the size of your system, this fuction might be slow to run.

Parameters:
  • universe (MDAnalysis Universe object) – Use MDAnalysis universe to read the tpr topology file from GROMACS.
  • selection – The atom groups to extract from universe
  • atoms_in_molecule_list – This is a list of atoms lists in every molecule type in the system.
Returns:

A list of length L (number of molecules) whose elements are lists of atom indices.

Here’s an example:

TPR = 'nvt_prod.tpr'
TRAJECTORY = 'Molecules_CG_Mapping/traj.trr'
u = mda.Universe(TPR, TRAJECTORY)
atoms_in_molecule_list = [
    u.select_atoms("resname PHE and resid 0:1").names]
find_molecules_from_topology(
    u, atoms_in_molecule_list, selection = "resname PHE")
utils.gen_bonds_group(mapped_exclusion_list)

Generates HOOMD snapshot bonds group based on bead particles’ exclusion list.

Parameters:mapped_exclusion_list – A [B x B] array of dtype bools, indicating which pairs should be excluded from nlist (True = exclude)
Returns:hoomd.data.make_snapshot.bonds.group()
utils.gen_mapped_exclusion_list(universe, atoms_in_molecule, beads_mappings, selection='all')

Generates mapped exclusion list to compute mapped_nlist for non-bonded bead-type interactions.

Parameters:
  • universe (MDAnalysis Universe object) – MDAnalysis Universe that contains bond information
  • atoms_in_molecule (MDAnalysis.core.groups.AtomGroup) – Selection of atoms in the molecule from MDAnalysis universe
  • beads_mappings (Array) – List of lists of beads mappings. Note that each list should contain atoms as strings just like how they appear in the topology file.
  • selection (string) – The atom groups to extract from universe
Returns:

A [B x B] array of dtype bools, indicating which pairs should be excluded from nlist (True = exclude). B is the total number of bead particles in the system

utils.iter_from_trajectory(nneighbor_cutoff, universe, selection='all', r_cut=10.0, period=1, start=0.0, end=None)

This generator will process information from a trajectory and yield a tuple of [nlist, positions, box] and MDAnalysis.TimeStep object. The first list can be directly used to call a SimModel (e.g., model(inputs)). See SimModel.compute() for details of these terms.

Here’s an example:

model = MyModel(16)
for inputs, ts in iter_from_trajectory(16, universe):
    result = model(inputs)
Parameters:
  • nneighbor_cutoff (int) – The maximum size of neighbor list
  • universe – The MDAnalysis universe
  • selection (string) – The atom groups to extract from universe
  • r_cut (float) – The cutoff raduis to use in neighbor list calculations
  • period (int) – Period of reading the trajectory frames
  • start – Starting frame for reading the trajectory frames
  • period – Ending frame for reading the trajectory frames
utils.matrix_mapping(molecule, beads_mappings, mass_weighted=True)

Creates a M x N mass weighted mapping matrix where M is the number of atoms in the molecule and N is the number of mapping beads.

Parameters:
  • molecule (MDAnalysis Atoms object) – This is atom selection in the molecule.
  • beads_mappings (Array) – List of lists of beads mapping. Note that each list should contain atoms as strings just like how they appear in the topology file.
  • mass_weighted – Returns mass weighted mapping matrix (if True) or both mass weighted and non-mass weighted matrices (if False)
Returns:

Mappying operator at the molecule level. (Array/arrays) of shape M x N. Use utils.sparse_mapping() to get mapping operator at the system level

utils.mol_angle(mol_positions=None, type_i=None, type_j=None, type_k=None, CG=False, cg_positions=None, b1=None, b2=None, b3=None, box=None)

This method calculates the bond angle given three atoms batched by molecule. Or to output CG angles input CG=True and indices of the CG beads making the angles. cg_positions and bead indices can be computed by calling generate_cg_graph()

Parameters:
  • mol_positions (float) – Positions tensor of atoms batched by molecules. Can be created by calling build_mol_rep() method in simmodel
  • type_i (int) – Index of the first atom
  • type_j (int) – Index of the second atom
  • type_k (int) – Index of the third atom
  • CG (bool) – flag to compute CG angles must be given with b1, b2 and b3
  • cg_positions (float) – array of CG coordinates
  • b1 (list of ints) – indices of first set of CG beads
  • b2 (list of int) – indices of second set of CG beads
  • b3 (list of ints) – indices of third set of CG beads
  • box ([3,3] array) – low, high coordinates and tilt vectors of the box
Returns:

angles:Tensor containing all atom angles (CG=False) or cg_angles: list containing CG angles (CG=True)

utils.mol_bond_distance(mol_positions=None, type_i=None, type_j=None, CG=False, cg_positions=None, b1=None, b2=None, box=None)

This method calculates the bond distance given two atoms batched by molecule. Or to output CG bond distances, input CG=True and indices of the CG bead pairs cg_positions and bead indices can be computed by calling generate_cg_graph()

Parameters:
  • mol_positions (float) – Positions tensor of atoms batched by molecules. Can be created by calling build_mol_rep() method in simmodel
  • type_i (int) – Index of the first atom
  • type_j (int type) – Index of the second atom
  • CG (bool) – flag to compute CG bond distances (must be given with b1 and b2)
  • cg_positions (float) – array of CG coordinates
  • b1 (list of ints) – indcies of first set of CG beads
  • b2 (list of ints) – indices of second set of CG beads
  • box ([3,3] array) – low, high coordinates and tilt vectors of the box
Returns:

v_ij: Tensor containing bond distances(CG=False) or u_ij: Array containig CG bond distances(CG=True)

utils.mol_dihedral(mol_positions=None, type_i=None, type_j=None, type_k=None, type_l=None, CG=False, cg_positions=None, b1=None, b2=None, b3=None, b4=None, box=None)

This method calculates the dihedral angles given three atoms batched by molecule. Or to output CG dihedral angles input CG=True and indices of the CG beads making the angles. cg_positions and bead indices can be computed by calling generate_cg_graph()

Parameters:
  • mol_positions (float) – Positions tensor of atoms batched by molecules. Can be created by calling build_mol_rep() method in simmodel
  • type_i (int) – Index of the first atom
  • type_j (int) – Index of the second atom
  • type_k (int) – Index of the third atom
  • type_l – Index of the fourth atom
  • CG (bool) – flag to compute CG dihedral angles must be given with b1,b2,b3 and b4
  • cg_positions (float) – array of CG coordinates
  • b1 (list of ints) – indices of first set of CG beads
  • b2 (list of ints) – indices of second set of CG beads
  • b3 (list of ints) – indices of third set of CG beads
  • b4 (list of ints) – indices of fourth set of CG beads
  • box ([3,3] array) – low, high coordinates and tilt vectors of the box
Returns:

dihedrals:Tensor containing all atom dihedral angles (CG=False) or cg_dihedrals: list containing CG dihedral angles (CG=True)

utils.mol_features_multiple(bnd_indices=None, ang_indices=None, dih_indices=None, molecules=None, beads=None)

Use to apply the result of compute_cg_graph to multiple molecules in a system. Computes the atom/cg bead indices of bonds,angles and dihedrals for multiple molecules.

Parameters:
  • bnd_indices (numpy array of type int) – indices of atoms making bonds
  • ang_indices (numpy array of type int) – indices of atoms making angles
  • dih_indices (numpy array of type int) – indices of atoms making dihedrals
  • molecules (int) – number of molecules in the system
  • beads (int) – number of atoms/CG beads in a molecule
Returns:

arrays of indices of atoms/cg beads making bonds,

angles and dihedrals in a system of molecules.

utils.sparse_mapping(molecule_mapping, molecule_mapping_index, system=None)

This will create the necessary indices and values for defining a sparse tensor in tensorflow that is a mass-weighted B x N mapping operator. where B is the number of coarse-grained beads. This is a slow function and should not be called frequently.

Parameters:
  • molecule_mapping (list of numpy arrays) – This is a list of L x M matrices, where M is the number of atoms in the molecule and L is the number of coarse-grained beads. These dimensions can be different for different molecules. There should be one matrix per molecule. The ordering of the atoms should follow what is passed in from molecule_mapping_index
  • molecule_mapping_index (list of lists) – This is the output from find_molecules. A list of the same length as molecule_mapping whose elements are lists of atom indices
  • system – The hoomd system. This is used to get mass values for the mapping, if you would like to weight by mass of the atoms.
Returns:

A sparse tensorflow tensor of dimension M x N, where M is the number of molecules and N is number of atoms