Artificial Neural networks for Molecular System — molann.ann
- Author:
Wei Zhang
- Year:
2022
- Copyright:
GNU Public License v3
This module implements several PyTorch artificial neural network (ANN) classes, i.e. derived classes of torch.nn.Module, which take into account alignment, or which use features of a molecular system as input.
Classes
- class molann.ann.AlignmentLayer(align_atom_group, input_atom_group)[source]
ANN layer that performs alignment based on Kabsch algorithm
- Parameters:
align_atom_group (MDAnalysis.core.groups.AtomGroup) – specifies the atom group whose coordinates are taken as reference when performing alignment.
input_atom_group (MDAnalysis.core.groups.AtomGroup) – specifies the atoms that are used as input of the neural network.
Let \(x_{ref}\in \mathbb{R}^{n_r\times 3}\) be the coordinates of the reference atoms, where \(n_r\) is the number of atoms in the atom group align_atom_group. Then, this class defines the map
\[x \in \mathbb{R}^{n_{inp} \times 3} \longrightarrow (x-c(x))A(x) \in \mathbb{R}^{n_{inp} \times 3}\,,\]where, given coordinates \(x\) of \(n_{inp}\) atoms, \(A(x)\in \mathbb{R}^{3\times 3}\) and \(c(x)\in \mathbb{R}^{n_{inp}\times 3}\) (\(n_{inp}\) repetitions of a vector in \(\mathbb{R}^{3}\)) are respectively the optimal rotation and translation determined (with respect to \(x_{ref}\)) using the Kabsch algorithm.
Note that \(x_{ref}\) will be shifted to have zero mean before it is used to align states.
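To make the map \(x \mapsto (x-c(x))A(x)\) concrete, here is a minimal sketch of the Kabsch step for a single state. It is written with NumPy rather than PyTorch for brevity and is not the code used by AlignmentLayer. The function name kabsch_align, the argument align_idx, and the choice of \(c(x)\) as the centroid of the alignment atoms are assumptions made for illustration.

```python
import numpy as np


def kabsch_align(x, ref_x, align_idx):
    """Align one state x of shape (n_inp, 3) to ref_x of shape (n_r, 3).

    align_idx lists the rows of x that correspond to the reference atoms.
    Returns (x - c) @ A, with c the centroid of the alignment atoms of x
    and A the optimal rotation from the Kabsch algorithm.
    """
    x = np.asarray(x, dtype=float)
    ref = np.asarray(ref_x, dtype=float)
    ref = ref - ref.mean(axis=0)          # shift the reference to zero mean

    c = x[align_idx].mean(axis=0)         # translation c(x) (assumed: centroid)
    p = x[align_idx] - c                  # centered alignment atoms

    h = p.T @ ref                         # 3x3 cross-covariance matrix
    u, _, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(u @ vt))    # -1 would mean a reflection
    a = u @ np.diag([1.0, 1.0, d]) @ vt   # proper rotation, applied on the right

    return (x - c) @ a
```

If a state is a rigidly rotated and translated copy of the reference configuration, the aligned output should coincide with the reference configuration shifted so that its alignment atoms have zero mean.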
Example:
    import torch
    import MDAnalysis as mda
    from molann.ann import AlignmentLayer

    # pdb file of the system
    pdb_filename = '/path/to/system.pdb'
    ref = mda.Universe(pdb_filename)
    ag = ref.select_atoms('bynum 1 2 5')
    input_ag = ref.atoms

    align = AlignmentLayer(ag, input_ag)
    align.show_info()

    # for illustration, use the state in the pdb file (length 1)
    x = torch.tensor(ref.atoms.positions).unsqueeze(0)
    print(align(x))

    # save the model to file
    align_model_name = 'align.pt'
    torch.jit.script(align).save(align_model_name)
- align_atom_indices
(0-based) indices of atoms used to align coordinates.
- Type:
list of int
- input_atom_indices
(0-based) indices of atoms in the input tensor.
- Type:
list of int
- input_atom_num
atom number (i.e. \(n_{inp}\)) in the input tensor.
- Type:
int
- ref_x
reference coordinates \(x_{ref}\).
- Type:
- Raises:
ValueError – if some reference atom is not in the atom group input_atom_group.
- forward(x)[source]
align states by translation and rotation.
- Parameters:
x (torch.Tensor) – states to be aligned
- Returns:
torch.Tensor that stores the aligned states
- Raises:
AssertionError – if x is not a Torch tensor with sizes \([*, n_{inp},3]\).
x should be a 3d tensor, whose shape is \([l, n_{inp}, 3]\), where \(l\) is the number of states in x and \(n_{inp}\) is the total number of atoms in the atom group input_atom_group. The returned tensor has the same shape.
This method implements the Kabsch algorithm.
- class molann.ann.FeatureMap(feature, input_atom_group, use_angle_value=False)[source]
ANN that maps coordinates to a feature
- Parameters:
feature (molann.feature.Feature) – feature that defines the map
input_atom_group (MDAnalysis.core.groups.AtomGroup) – atom group used as input. This atom group must include all the atoms used to define feature.
use_angle_value (boolean) – if true, use the angle value in radians; otherwise, use sine and/or cosine values. It has no effect if the type of feature is ‘position’.
This class defines the feature map
\[f: x \in \mathbb{R}^{n_{inp} \times 3} \longrightarrow f(x) \in \mathbb{R}^{d}\,,\]corresponding to the input feature, where \(n_{inp}\) is the number of atoms in the input atom group.
Example:
    import MDAnalysis as mda
    from molann.ann import FeatureMap
    from molann.feature import Feature
    import torch

    # pdb file of the system
    pdb_filename = '/path/to/system.pdb'
    ref = mda.Universe(pdb_filename)

    f = Feature('name', 'dihedral', ref.select_atoms('bynum 1 3 2 4'))
    input_ag = ref.select_atoms('bynum 1 2 3 4 5')

    fmap = FeatureMap(f, input_ag, use_angle_value=False)
    print('dim=', fmap.dim())

    x = torch.tensor(input_ag.positions).unsqueeze(0)
    print(fmap(x))

    feature_model_name = 'feature_map.pt'
    torch.jit.script(fmap).save(feature_model_name)
- Raises:
ValueError – if some atom used to define feature is not in the atom group for input.
- dim()[source]
- Returns:
d (int), total dimension of features or, equivalently, dimension of the output layer of the ANN.
\(d=1\) for ‘angle’ and ‘bond’, as well as for ‘dihedral’ when use_angle_value=True.
\(d=2\) for ‘dihedral’, when use_angle_value=False.
\(d=3n\) for ‘position’, where \(n\) is the number of atoms involved in feature (note: it is possible that \(n<n_{inp}\)).
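The dimension rules above can be restated as a small lookup. The helper below is a hypothetical illustration, not part of molann; the name feature_dim and its arguments are assumptions.

```python
def feature_dim(feature_type, n_atoms, use_angle_value=False):
    """Output dimension d of a single feature, following the rules above.

    feature_type: 'angle', 'bond', 'dihedral' or 'position'
    n_atoms: number of atoms involved in the feature (only used for 'position')
    """
    if feature_type in ("angle", "bond"):
        return 1
    if feature_type == "dihedral":
        # one angle value, or a (cosine, sine) pair
        return 1 if use_angle_value else 2
    if feature_type == "position":
        return 3 * n_atoms
    raise ValueError(f"unknown feature type: {feature_type!r}")
```

For instance, a ‘position’ feature on three atoms gives \(d=9\), regardless of use_angle_value.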
- forward(x)[source]
map position to feature
- Parameters:
x (torch.Tensor) – 3d tensor that contains coordinates of states
- Returns:
torch.Tensor, 2d tensor that contains features of the states
x should be a 3d tensor with shape \([l, n_{inp}, 3]\), where \(l\) is the number of states in x and \(n_{inp}\) is the total number of atoms in the atom group input_atom_group.
The output is a tensor with shape \([l, d]\), where \(l\) is the number of states in x and \(d\) is the dimension returned by dim().
For ‘angle’, if use_angle_value=True, it returns angle values in \([0, \pi]\); otherwise, it returns the cosine values of the angles.
For ‘dihedral’, if use_angle_value=True, it returns angle values in \([-\pi, \pi]\); otherwise, it returns [cosine, sine] of the angles.
For ‘position’, it returns the coordinates of all the atoms in the feature.
- Raises:
AssertionError – if x is not a Torch tensor with sizes \([*, n_{inp},3]\).
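As an illustration of the ‘angle’ and ‘dihedral’ outputs of FeatureMap.forward(), the following pure-Python sketch computes both quantities for a single state. It is not molann's implementation. The function names, the convention that the angle is measured at the middle atom, and the sign convention of the dihedral are assumptions.

```python
import math


def _sub(a, b):
    return [a[k] - b[k] for k in range(3)]


def _dot(a, b):
    return sum(a[k] * b[k] for k in range(3))


def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def _norm(a):
    return math.sqrt(_dot(a, a))


def angle_feature(p0, p1, p2, use_angle_value=False):
    """Angle at p1 (assumed vertex): value in [0, pi], or its cosine."""
    v1, v2 = _sub(p0, p1), _sub(p2, p1)
    cos_t = _dot(v1, v2) / (_norm(v1) * _norm(v2))
    cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding errors
    return [math.acos(cos_t)] if use_angle_value else [cos_t]


def dihedral_feature(p0, p1, p2, p3, use_angle_value=False):
    """Dihedral of p0-p1-p2-p3: value in [-pi, pi], or [cosine, sine]."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    scale = _norm(n1) * _norm(n2)
    cos_t = _dot(n1, n2) / scale
    sin_t = _dot(_cross(n1, n2), b2) / (scale * _norm(b2))
    return [math.atan2(sin_t, cos_t)] if use_angle_value else [cos_t, sin_t]
```

The [cosine, sine] pair avoids the jump at \(\pm\pi\) that the raw dihedral value has, which is a common reason to prefer it as network input.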
- class molann.ann.FeatureLayer(feature_list, input_atom_group, use_angle_value=False)[source]
ANN layer that maps coordinates to all features in a feature list
- Parameters:
feature_list (list of molann.feature.Feature) – list of features
input_atom_group (MDAnalysis.core.groups.AtomGroup) – atom group used as input.
use_angle_value (boolean) – whether to use angle value in radians
This class encapsulates FeatureMap and maps input coordinates to multiple features. More concretely, it defines the map
\[x \in \mathbb{R}^{n_{inp} \times 3} \longrightarrow (f_1(x), f_2(x), \dots, f_l(x))\,,\]
where \(n_{inp}\) is the number of atoms in the atom group input_atom_group, \(l\) is the number of features in the feature list, and each \(f_i\) is the feature map defined by the class FeatureMap.
- Raises:
AssertionError – if feature_list is empty.
Example:
    import MDAnalysis as mda
    from molann.ann import FeatureLayer
    from molann.feature import Feature
    import torch

    # pdb file of the system
    pdb_filename = '/path/to/system.pdb'
    ref = mda.Universe(pdb_filename)

    f1 = Feature('name', 'dihedral', ref.select_atoms('bynum 1 3 2 4'))
    f2 = Feature('name', 'angle', ref.select_atoms('bynum 1 3 2'))
    f3 = Feature('name', 'bond', ref.select_atoms('bynum 1 3'))

    input_ag = ref.select_atoms('bynum 1 2 3 4 5 6')

    # define feature layer using features f1, f2 and f3
    f_layer = FeatureLayer([f1, f3, f2], input_ag, use_angle_value=False)
    print('output dim=', f_layer.output_dimension())

    x = torch.tensor(input_ag.positions).unsqueeze(0)
    print(f_layer(x))

    ff = f_layer.get_feature(0)
    print(f_layer.get_feature_info())

    feature_layer_model_name = 'feature_layer.pt'
    torch.jit.script(f_layer).save(feature_layer_model_name)
The following code defines an identity feature layer (for the first three atoms).
    ag = ref.select_atoms('bynum 1 2 3')
    f4 = Feature('identity', 'position', ag)
    identity_f_layer = FeatureLayer([f4], ag, use_angle_value=False)
- forward(x)[source]
forward map
- Parameters:
x (torch.Tensor) – 3d tensor that contains coordinates of states
- Returns:
torch.Tensor, 2d tensor that contains all features (in the feature list) of states
This function simply calls
FeatureMap.forward()
for each feature in the feature list and then concatenates the tensors.
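The call-then-concatenate pattern can be sketched as follows. NumPy is used here instead of PyTorch to keep the example short, and the helpers bond_map, position_map and feature_layer_forward are hypothetical stand-ins for FeatureMap and FeatureLayer.forward(), not molann code. The ordering of the flattened coordinates for ‘position’ is likewise an assumption.

```python
import numpy as np


def bond_map(i, j):
    """Feature map returning the distance between atoms i and j, shape [l, 1]."""
    def f(x):
        return np.linalg.norm(x[:, i] - x[:, j], axis=1, keepdims=True)
    return f


def position_map(indices):
    """Feature map returning flattened coordinates of the atoms, shape [l, 3n]."""
    def f(x):
        return x[:, indices].reshape(x.shape[0], -1)
    return f


def feature_layer_forward(x, feature_maps):
    """Apply each feature map to x of shape [l, n_inp, 3] and concatenate."""
    x = np.asarray(x, dtype=float)
    assert x.ndim == 3 and x.shape[2] == 3, "expected shape [l, n_inp, 3]"
    return np.concatenate([f(x) for f in feature_maps], axis=1)
```

The width of the result is the sum of the individual feature dimensions, so the order of the feature list determines the column layout of the output.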
- get_feature(idx)[source]
- Parameters:
idx (int) – index of feature in feature list
- Returns:
molann.feature.Feature, the \(idx\)-th feature in the feature list
- get_feature_info()[source]
display information of features
- Returns:
pandas.DataFrame
, information of features
- class molann.ann.PreprocessingANN(align_layer, feature_layer)[source]
ANN that performs preprocessing of states
- Parameters:
align_layer (AlignmentLayer or None) – alignment layer
feature_layer (FeatureLayer) – feature layer
Example:
    import MDAnalysis as mda
    from molann.ann import AlignmentLayer, FeatureLayer, PreprocessingANN
    from molann.feature import Feature
    import torch

    # pdb file of the system
    pdb_filename = '/path/to/system.pdb'
    ref = mda.Universe(pdb_filename)

    ag = ref.select_atoms('bynum 1 2 3')
    input_ag = ref.select_atoms('bynum 1 2 3 4 5 6 7')

    # define alignment layer
    align = AlignmentLayer(ag, input_ag)

    # features are just positions of atoms 1, 2 and 3.
    f1 = Feature('name', 'position', ag)
    f_layer = FeatureLayer([f1], input_ag, use_angle_value=False)

    # put together to get the preprocessing layer
    pp_layer = PreprocessingANN(align, f_layer)

    x = torch.tensor(input_ag.positions).unsqueeze(0)
    print(pp_layer(x))
When the feature is already both translation- and rotation-invariant, alignment is not necessary:
    # define feature as dihedral angle
    f1 = Feature('name', 'dihedral', ref.select_atoms('bynum 1 3 2 4'))
    f_layer = FeatureLayer([f1], input_ag, use_angle_value=False)

    # since the feature is both translation- and rotation-invariant, alignment is not necessary
    pp_layer = PreprocessingANN(None, f_layer)
If only alignment is desired, one can provide an identity feature layer when defining PreprocessingANN.
    f = Feature('identity', 'position', input_ag)
    identity_f_layer = FeatureLayer([f], input_ag, use_angle_value=False)
    pp_layer = PreprocessingANN(align, identity_f_layer)
- forward(x)[source]
forward map that aligns states and then maps to features
- Parameters:
x (torch.Tensor) – 3d tensor that contains coordinates of states
- Returns:
2d torch.Tensor
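The control flow of this forward map, including the case where align_layer is None, might look like the sketch below. It uses plain Python callables in place of the actual layers, and the name make_preprocessing is hypothetical.

```python
def make_preprocessing(align_layer, feature_layer):
    """Compose an optional alignment step with a feature step.

    align_layer: callable or None; feature_layer: callable.
    Both are placeholders for AlignmentLayer and FeatureLayer.
    """
    def forward(x):
        if align_layer is not None:
            x = align_layer(x)       # align states first, if requested
        return feature_layer(x)      # then map (aligned) states to features
    return forward
```

Passing None for the alignment step leaves the input untouched before the features are computed, which is appropriate when the features are already translation- and rotation-invariant.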
- class molann.ann.MolANN(preprocessing_layer, ann_layers)[source]
ANN that incorporates a preprocessing layer and the remaining layers, which contain the trainable parameters.
- Parameters:
preprocessing_layer (PreprocessingANN) – preprocessing layer
ann_layers (torch.nn.Module) – remaining layers
Example:
    import MDAnalysis as mda
    from molann.ann import FeatureLayer, PreprocessingANN, MolANN, create_sequential_nn
    from molann.feature import Feature

    # pdb file of the system
    pdb_filename = '/path/to/system.pdb'
    ref = mda.Universe(pdb_filename)

    f1 = Feature('name', 'dihedral', ref.select_atoms('bynum 1 3 2 4'))
    input_ag = ref.select_atoms('bynum 1 2 3 4 5 6 7')

    f_layer = FeatureLayer([f1], input_ag, use_angle_value=False)
    pp_layer = PreprocessingANN(None, f_layer)
    output_dim = pp_layer.output_dimension()

    # neural network layers that contain the trainable parameters
    nn = create_sequential_nn([output_dim, 5, 3])

    model = MolANN(pp_layer, nn)
- preprocessing_layer
- Type:
- ann_layers
- Type:
- get_preprocessing_layer()[source]
- Returns:
PreprocessingANN, the preprocessing_layer
- molann.ann.create_sequential_nn(layer_dims, activation=Tanh())[source]
Construct a feedforward PyTorch neural network
- Parameters:
layer_dims (list of int) – dimensions of layers
activation – PyTorch non-linear activation function
- Raises:
AssertionError – if length of layer_dims is not larger than 1.
Example
    from molann.ann import create_sequential_nn
    import torch

    nn1 = create_sequential_nn([10, 5, 1])
    nn2 = create_sequential_nn([10, 2], activation=torch.nn.ReLU())
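To see how a list such as [10, 5, 1] could translate into layers, the sketch below pairs consecutive entries of layer_dims. It assumes, as is usual for feedforward networks, that each pair gives the input and output size of a fully connected layer with bias. The helpers layer_shapes and count_parameters are hypothetical and do not reflect how create_sequential_nn is actually implemented.

```python
def layer_shapes(layer_dims):
    """Return (in_features, out_features) pairs for consecutive layers."""
    # mirrors the documented requirement that layer_dims has more than one entry
    assert len(layer_dims) > 1, "layer_dims must contain at least two entries"
    return list(zip(layer_dims[:-1], layer_dims[1:]))


def count_parameters(layer_dims):
    """Weights plus biases, assuming fully connected layers with bias."""
    return sum(n_in * n_out + n_out for n_in, n_out in layer_shapes(layer_dims))
```

Under these assumptions, [10, 5, 1] describes two layers of shapes (10, 5) and (5, 1), and [10, 2] describes a single layer.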