Artificial Neural Networks for Molecular Systems — molann.ann

Author:

Wei Zhang

Year:

2022

Copyright:

GNU Public License v3

This module implements several PyTorch artificial neural network (ANN) classes, i.e. derived classes of torch.nn.Module, which take alignment into account or which use features of a molecular system as input.

Classes

class molann.ann.AlignmentLayer(align_atom_group, input_atom_group)[source]

ANN layer that performs alignment based on the Kabsch algorithm

Parameters:

  • align_atom_group (MDAnalysis.core.groups.AtomGroup) – atom group whose reference positions are used to perform alignment

  • input_atom_group (MDAnalysis.core.groups.AtomGroup) – atom group whose coordinates form the input of the layer

Let \(x_{ref}\in \mathbb{R}^{n_r\times 3}\) be the coordinates of the reference atoms, where \(n_r\) is the number of atoms in the atom group align_atom_group. Then, this class defines the map

\[x \in \mathbb{R}^{n_{inp} \times 3} \longrightarrow (x-c(x))A(x) \in \mathbb{R}^{n_{inp} \times 3}\,,\]

where, given coordinates \(x\) of \(n_{inp}\) atoms, \(A(x)\in \mathbb{R}^{3\times 3}\) and \(c(x)\in \mathbb{R}^{n_{inp}\times 3}\) (\(n_{inp}\) repetitions of a vector in \(\mathbb{R}^{3}\)) are respectively the optimal rotation and translation determined (with respect to \(x_{ref}\)) using the Kabsch algorithm.

Note that \(x_{ref}\) will be shifted to have zero mean before it is used to align states.
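For intuition, the per-state computation behind this map can be sketched as follows, for the simple case where the alignment atoms coincide with the input atoms (an illustrative sketch only, not the layer's actual batched implementation; the function kabsch_align is hypothetical):

import torch

def kabsch_align(x, x_ref):
    # x: [n, 3] coordinates of one state; x_ref: [n, 3] reference coordinates, shifted to zero mean
    c = x.mean(dim=0, keepdim=True)                         # optimal translation c(x): the centroid
    xc = x - c
    h = xc.t() @ x_ref                                      # 3x3 cross-covariance matrix
    u, s, vt = torch.linalg.svd(h)
    d = torch.det(u @ vt).sign().item()                     # correct a possible reflection
    a = u @ torch.diag(torch.tensor([1.0, 1.0, d])) @ vt    # optimal rotation A(x)
    return xc @ a                                           # (x - c(x)) A(x)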

Example:

import torch
import MDAnalysis as mda
from molann.ann import AlignmentLayer

# pdb file of the system
pdb_filename = '/path/to/system.pdb'
ref = mda.Universe(pdb_filename)
ag = ref.select_atoms('bynum 1 2 5')
input_ag = ref.atoms

align = AlignmentLayer(ag, input_ag)
align.show_info()

# for illustration, use the state in the pdb file (length 1)
x = torch.tensor(ref.atoms.positions).unsqueeze(0)
print (align(x))

# save the model to file
align_model_name = 'align.pt'
torch.jit.script(align).save(align_model_name)
align_atom_indices

(0-based) indices of atoms used to align coordinates.

Type:

list of int

input_atom_indices

(0-based) indices of atoms in the input tensor.

Type:

list of int

input_atom_num

number of atoms (i.e. \(n_{inp}\)) in the input tensor.

Type:

int

ref_x

reference coordinates \(x_{ref}\).

Type:

torch.Tensor

Raises:

ValueError – if some reference atom is not in the atom group input_atom_group.

forward(x)[source]

align states by translation and rotation.

Parameters:

x (torch.Tensor) – states to be aligned

Returns:

torch.Tensor that stores the aligned states

Raises:

AssertionError – if x is not a Torch tensor with sizes \([*, n_{inp},3]\).

x should be a 3d tensor, whose shape is \([l, n_{inp}, 3]\), where \(l\) is the number of states in x and \(n_{inp}\) is the total number of atoms in the atom group input_atom_group. The returned tensor has the same shape.

This method implements the Kabsch algorithm.
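For instance, reusing align and input_ag from the class example above, a whole batch of states can be aligned in one call (a usage sketch; the random coordinates are placeholders):

import torch

x = torch.rand(10, input_ag.n_atoms, 3)   # 10 states, each with n_inp atoms
y = align(x)                              # aligned states, same shape [10, n_inp, 3]
assert y.shape == x.shape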

show_info()[source]

display the indices of the input atoms, as well as the indices and positions of the reference atoms used to perform alignment

class molann.ann.FeatureMap(feature, input_atom_group, use_angle_value=False)[source]

ANN that maps coordinates to a feature

Parameters:

  • feature (molann.feature.Feature) – the feature to be computed

  • input_atom_group (MDAnalysis.core.groups.AtomGroup) – atom group used as input. This atom group must include all the atoms used to define feature.

  • use_angle_value (bool) – if true, use angle values in radians; otherwise, use sine and/or cosine values. It has no effect if the type of feature is ‘position’.

This class defines the feature map

\[f: x \in \mathbb{R}^{n_{inp} \times 3} \longrightarrow f(x) \in \mathbb{R}^{d}\,,\]

corresponding to the input feature, where \(n_{inp}\) is the number of atoms in the input atom group.

Example:

import MDAnalysis as mda
from molann.ann import FeatureMap
from molann.feature import Feature
import torch

# pdb file of the system
pdb_filename = '/path/to/system.pdb'
ref = mda.Universe(pdb_filename)

f = Feature('name', 'dihedral', ref.select_atoms('bynum 1 3 2 4'))
input_ag = ref.select_atoms('bynum 1 2 3 4 5')
fmap = FeatureMap(f, input_ag, use_angle_value=False)
print ('dim=', fmap.dim())

x = torch.tensor(input_ag.positions).unsqueeze(0)
print (fmap(x))
feature_model_name = 'feature_map.pt'
torch.jit.script(fmap).save(feature_model_name)
Raises:

ValueError – if some atom used to define feature is not in the atom group for input.

dim()[source]
Returns:

d (int), total dimension of features or, equivalently, dimension of the output layer of the ANN.

\(d=1\) for ‘angle’ and ‘bond’, as well as for ‘dihedral’ when use_angle_value=True.

\(d=2\) for ‘dihedral’, when use_angle_value=False.

\(d=3n\) for ‘position’, where \(n\) is the number of atoms involved in feature (note: it is possible that \(n<n_{inp}\)).

forward(x)[source]

map position to feature

Parameters:

x (torch.Tensor) – 3d tensor that contains coordinates of states

Returns:

torch.Tensor, 2d tensor that contains features of the states

x should be a 3d tensor with shape \([l, n_{inp}, 3]\), where \(l\) is the number of states in x and \(n_{inp}\) is the total number of atoms in the atom group input_atom_group.

The output is a tensor with shape \([l, d]\), where \(l\) is the number of states in x and \(d\) is the dimension returned by dim().

For ‘angle’, if use_angle_value=True, it returns angle values in \([0, \pi]\); otherwise, it returns the cosine values of the angles.

For ‘dihedral’, if use_angle_value=True, it returns angle values in \([-\pi, \pi]\); otherwise, it returns [cosine, sine] of the angles.

For ‘position’, it returns the coordinates of all the atoms in the feature.
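As an illustration, reusing f, input_ag and x from the class example above, the two settings of use_angle_value give outputs of different dimensions for the dihedral feature (a sketch):

fmap_cos_sin = FeatureMap(f, input_ag, use_angle_value=False)   # output of shape [l, 2]: [cosine, sine]
fmap_angle = FeatureMap(f, input_ag, use_angle_value=True)      # output of shape [l, 1]: angle in [-pi, pi]
print (fmap_cos_sin(x), fmap_angle(x))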

Raises:

AssertionError – if x is not a Torch tensor with sizes \([*, n_{inp},3]\).

class molann.ann.FeatureLayer(feature_list, input_atom_group, use_angle_value=False)[source]

ANN layer that maps coordinates to all features in a feature list

Parameters:

  • feature_list (list of molann.feature.Feature) – list of features to be computed

  • input_atom_group (MDAnalysis.core.groups.AtomGroup) – atom group used as input. It must include all the atoms used to define the features in the list.

  • use_angle_value (bool) – if true, use angle values in radians; otherwise, use sine and/or cosine values

This class encapsulates FeatureMap and maps input coordinates to multiple features. More concretely, it defines the map

\[x \in \mathbb{R}^{n_{inp} \times 3} \longrightarrow (f_1(x), f_2(x), \dots, f_l(x))\,,\]

where \(n_{inp}\) is the number of atoms in the atom group input_atom_group, \(l\) is the number of features in the feature list, and each \(f_i\) is the feature map defined by the class FeatureMap.

Raises:

AssertionError – if feature_list is empty.

Example:

import MDAnalysis as mda
from molann.ann import FeatureLayer
from molann.feature import Feature
import torch

# pdb file of the system
pdb_filename = '/path/to/system.pdb'
ref = mda.Universe(pdb_filename)

f1 = Feature('name', 'dihedral', ref.select_atoms('bynum 1 3 2 4'))
f2 = Feature('name', 'angle', ref.select_atoms('bynum 1 3 2'))
f3 = Feature('name', 'bond', ref.select_atoms('bynum 1 3'))

input_ag = ref.select_atoms('bynum 1 2 3 4 5 6')
# define feature layer using features f1, f2 and f3
f_layer = FeatureLayer([f1, f3, f2], input_ag, use_angle_value=False)

print ('output dim=', f_layer.output_dimension())
x = torch.tensor(input_ag.positions).unsqueeze(0)
print (f_layer(x))
ff = f_layer.get_feature(0)
print (f_layer.get_feature_info())

feature_layer_model_name = 'feature_layer.pt'
torch.jit.script(f_layer).save(feature_layer_model_name)

The following code defines an identity feature layer (for the first three atoms).

ag = ref.select_atoms('bynum 1 2 3')
f4 = Feature('identity', 'position', ag)
identity_f_layer = FeatureLayer([f4], ag, use_angle_value=False)
forward(x)[source]

forward map

Parameters:

x (torch.Tensor) – 3d tensor that contains coordinates of states

Returns:

torch.Tensor, 2d tensor that contains all features (in the feature list) of states

This function simply calls FeatureMap.forward() for each feature in the feature list and then concatenates the tensors.
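As a rough check, reusing f1, f2, f3, input_ag, f_layer and x from the class example above, the output of the layer should match the concatenation of the outputs of the individual feature maps (an illustrative sketch):

import torch
from molann.ann import FeatureMap

cols = [FeatureMap(f, input_ag, use_angle_value=False)(x) for f in (f1, f3, f2)]
print (torch.allclose(f_layer(x), torch.cat(cols, dim=1)))   # expected: True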

get_feature(idx)[source]
Parameters:

idx (int) – index of feature in feature list

Returns:

molann.feature.Feature, the \(idx\)-th feature in the feature list

get_feature_info()[source]

display information of features

Returns:

pandas.DataFrame, information of features

output_dimension()[source]
Returns:

int, total dimension of features in the feature list, or, equivalently, the size of the output layer of the ANN
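For instance, for the feature layer in the class example above, this value should equal the sum of the dimensions of the individual feature maps (an illustrative check, reusing names from that example):

from molann.ann import FeatureMap

dims = [FeatureMap(f, input_ag, use_angle_value=False).dim() for f in (f1, f3, f2)]
print (f_layer.output_dimension() == sum(dims))   # expected: True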

class molann.ann.PreprocessingANN(align_layer, feature_layer)[source]

ANN that performs preprocessing of states

Parameters:

  • align_layer (AlignmentLayer) – the alignment layer; pass None if alignment is not needed

  • feature_layer (FeatureLayer) – the feature layer

Example:

import MDAnalysis as mda
from molann.ann import AlignmentLayer, FeatureLayer, PreprocessingANN
from molann.feature import Feature
import torch

# pdb file of the system
pdb_filename = '/path/to/system.pdb'
ref = mda.Universe(pdb_filename)

ag = ref.select_atoms('bynum 1 2 3')
input_ag = ref.select_atoms('bynum 1 2 3 4 5 6 7')

# define alignment layer
align = AlignmentLayer(ag, input_ag)

# features are just positions of atoms 1,2 and 3.
f1 = Feature('name', 'position', ag)
f_layer = FeatureLayer([f1], input_ag, use_angle_value=False)

# put together to get the preprocessing layer
pp_layer = PreprocessingANN(align, f_layer)

x = torch.tensor(input_ag.positions).unsqueeze(0)
print (pp_layer(x))

When the feature is already both translation- and rotation-invariant, alignment is not necessary:

# define feature as dihedral angle
f1 = Feature('name', 'dihedral', ref.select_atoms('bynum 1 3 2 4'))
f_layer = FeatureLayer([f1], input_ag, use_angle_value=False)

# since the feature is both translation- and rotation-invariant, alignment is not necessary
pp_layer = PreprocessingANN(None, f_layer)

If only alignment is desired, one can provide an identity feature layer when defining PreprocessingANN.

f = Feature('identity', 'position', input_ag)
identity_f_layer = FeatureLayer([f], input_ag, use_angle_value=False)
pp_layer = PreprocessingANN(align, identity_f_layer)
forward(x)[source]

forward map that aligns states and then maps to features

Parameters:

x (torch.Tensor) – 3d tensor that contains coordinates of states

Returns:

2d torch.Tensor
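As a rough check of this composition, reusing align, f_layer, x and the first pp_layer = PreprocessingANN(align, f_layer) from the class example above, one would expect (an illustrative sketch):

import torch

print (torch.allclose(pp_layer(x), f_layer(align(x))))   # expected: True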

output_dimension()[source]
Returns:

int, the dimension of the output layer

class molann.ann.MolANN(preprocessing_layer, ann_layers)[source]

ANN that incorporates the preprocessing layer and the remaining layers, which contain the training parameters.

Parameters:

  • preprocessing_layer (PreprocessingANN) – the preprocessing layer

  • ann_layers (torch.nn.Module) – the remaining layers, which contain the training parameters

Example:

import MDAnalysis as mda
from molann.ann import FeatureLayer, PreprocessingANN, MolANN, create_sequential_nn
from molann.feature import Feature

# pdb file of the system
pdb_filename = '/path/to/system.pdb'
ref = mda.Universe(pdb_filename)

f1 = Feature('name', 'dihedral', ref.select_atoms('bynum 1 3 2 4'))
input_ag = ref.select_atoms('bynum 1 2 3 4 5 6 7')

f_layer = FeatureLayer([f1], input_ag, use_angle_value=False)
pp_layer = PreprocessingANN(None, f_layer)

output_dim = pp_layer.output_dimension()

# neural network layers that contain the training parameters
nn = create_sequential_nn([output_dim, 5, 3])

model = MolANN(pp_layer, nn)
preprocessing_layer
Type:

PreprocessingANN

ann_layers
Type:

torch.nn.Module

forward(x)[source]

the forward map
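For instance, reusing model and input_ag from the class example above (a usage sketch):

import torch

x = torch.tensor(input_ag.positions).unsqueeze(0)
# expected shape [1, 3], assuming the forward map composes pp_layer and nn
print (model(x))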

get_preprocessing_layer()[source]
Returns:

PreprocessingANN, the preprocessing_layer

molann.ann.create_sequential_nn(layer_dims, activation=Tanh())[source]

Construct a feedforward PyTorch neural network

Parameters:
  • layer_dims (list of int) – dimensions of layers

  • activation – PyTorch non-linear activation function

Raises:

AssertionError – if the length of layer_dims is not larger than 1.

Example:

from molann.ann import create_sequential_nn
import torch

nn1 = create_sequential_nn([10, 5, 1])
nn2 = create_sequential_nn([10, 2], activation=torch.nn.ReLU())
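A quick shape check of the returned networks (illustrative; the output dimension is the last entry of layer_dims):

y1 = nn1(torch.rand(4, 10))   # a batch of 4 inputs of dimension 10
print (y1.shape)              # expected: torch.Size([4, 1])
y2 = nn2(torch.rand(4, 10))
print (y2.shape)              # expected: torch.Size([4, 2])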