Artificial Neural networks for Molecular System — molann.ann
- Author:
Wei Zhang
- Year:
2022
- Copyright:
GNU Public License v3
This module implements several PyTorch artificial neural network (ANN)
classes, i.e. derived classes of torch.nn.Module,
which take alignment into account, or which use features of a molecular system as input.
Classes
- class molann.ann.AlignmentLayer(align_atom_group, input_atom_group)[source]
ANN layer that performs alignment based on Kabsch algorithm
- Parameters:
align_atom_group (MDAnalysis.core.groups.AtomGroup) – specifies atom group whose coordinates are taken as reference when performing alignment.
input_atom_group (MDAnalysis.core.groups.AtomGroup) – specifies atoms that are used as input of the neural network.
Let \(x_{ref}\in \mathbb{R}^{n_r\times 3}\) be the coordinates of the reference atoms, where \(n_r\) is the number of atoms in the atom group align_atom_group. Then, this class defines the map
\[x \in \mathbb{R}^{n_{inp} \times 3} \longrightarrow (x-c(x))A(x) \in \mathbb{R}^{n_{inp} \times 3}\,,\]
where, given coordinates \(x\) of \(n_{inp}\) atoms, \(A(x)\in \mathbb{R}^{3\times 3}\) and \(c(x)\in \mathbb{R}^{n_{inp}\times 3}\) (\(n_{inp}\) repetitions of a vector in \(\mathbb{R}^{3}\)) are respectively the optimal rotation and translation determined (with respect to \(x_{ref}\)) using the Kabsch algorithm.
Note that \(x_{ref}\) will be shifted to have zero mean before it is used to align states.
Example:
    import torch
    import MDAnalysis as mda
    from molann.ann import AlignmentLayer

    # pdb file of the system
    pdb_filename = '/path/to/system.pdb'
    ref = mda.Universe(pdb_filename)
    ag = ref.select_atoms('bynum 1 2 5')
    input_ag = ref.atoms
    align = AlignmentLayer(ag, input_ag)
    align.show_info()

    # for illustration, use the state in the pdb file (length 1)
    x = torch.tensor(ref.atoms.positions).unsqueeze(0)
    print(align(x))

    # save the model to file
    align_model_name = 'align.pt'
    torch.jit.script(align).save(align_model_name)
- align_atom_indices
(0-based) indices of atoms used to align coordinates.
- Type:
list of int
- input_atom_indices
(0-based) indices of atoms in the input tensor.
- Type:
list of int
- input_atom_num
atom number (i.e. \(n_{inp}\)) in the input tensor.
- Type:
int
- ref_x
reference coordinates \(x_{ref}\).
- Type:
- Raises:
ValueError – if some reference atom is not in the atom group input_atom_group.
- forward(x)[source]
align states by translation and rotation.
- Parameters:
x (torch.Tensor) – states to be aligned
- Returns:
torch.Tensor that stores the aligned states
- Raises:
AssertionError – if x is not a Torch tensor with sizes \([*, n_{inp},3]\).
x should be a 3d tensor, whose shape is \([l, n_{inp}, 3]\), where \(l\) is the number of states in x and \(n_{inp}\) is the total number of atoms in the atom group input_atom_group. The returned tensor has the same shape.
This method implements the Kabsch algorithm.
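The alignment map \(x \mapsto (x-c(x))A(x)\) described above can be sketched for a single state as follows. This is a minimal NumPy illustration of the Kabsch algorithm, not the code used by AlignmentLayer; the function name kabsch_align and the argument align_idx (rows of x that correspond to the reference atoms) are hypothetical.

```python
import numpy as np

def kabsch_align(x, ref_x, align_idx):
    """Align one state x (shape [n_inp, 3]) to ref_x (shape [n_r, 3]).

    Hypothetical helper: align_idx lists the rows of x that correspond
    to the reference atoms, in the same order as the rows of ref_x.
    """
    # Shift the reference to zero mean before using it for alignment.
    ref = ref_x - ref_x.mean(axis=0)
    # c(x): centroid of the alignment atoms, subtracted from every atom.
    c = x[align_idx].mean(axis=0)
    x_centered = x - c
    # 3x3 covariance between the centered alignment atoms and the reference.
    h = x_centered[align_idx].T @ ref
    u, _, vt = np.linalg.svd(h)
    # Flip the last axis if needed so that A is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(u @ vt))
    a = u @ np.diag([1.0, 1.0, d]) @ vt
    # Rows are atoms, so the rotation is applied from the right.
    return x_centered @ a
```

A batch of states with shape \([l, n_{inp}, 3]\) could be handled by applying the same steps to each state in turn.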
- class molann.ann.FeatureMap(feature, input_atom_group, use_angle_value=False)[source]
ANN that maps coordinates to a feature
- Parameters:
feature (molann.feature.Feature) – feature that defines the map
input_atom_group (MDAnalysis.core.groups.AtomGroup) – atom group used as input. This atom group must include all the atoms used to define feature.
use_angle_value (boolean) – if true, use angle value in radians, else use sine and/or cosine values. It has no effect if the type of feature is ‘position’.
This class defines the feature map
\[f: x \in \mathbb{R}^{n_{inp} \times 3} \longrightarrow f(x) \in \mathbb{R}^{d}\,,\]
corresponding to the input feature, where \(n_{inp}\) is the number of atoms in the input atom group.
Example:
    import MDAnalysis as mda
    from molann.ann import FeatureMap
    from molann.feature import Feature
    import torch

    # pdb file of the system
    pdb_filename = '/path/to/system.pdb'
    ref = mda.Universe(pdb_filename)

    f = Feature('name', 'dihedral', ref.select_atoms('bynum 1 3 2 4'))
    input_ag = ref.select_atoms('bynum 1 2 3 4 5')
    fmap = FeatureMap(f, input_ag, use_angle_value=False)
    print('dim=', fmap.dim())

    x = torch.tensor(input_ag.positions).unsqueeze(0)
    print(fmap(x))

    feature_model_name = 'feature_map.pt'
    torch.jit.script(fmap).save(feature_model_name)
- Raises:
ValueError – if some atom used to define feature is not in the atom group for input.
- dim()[source]
- Returns:
d (int), total dimension of features or, equivalently, dimension of the output layer of the ANN.
\(d=1\) for ‘angle’ and ‘bond’, as well as for ‘dihedral’ when use_angle_value=True.
\(d=2\) for ‘dihedral’, when use_angle_value=False.
\(d=3n\) for ‘position’, where \(n\) is the number of atoms involved in feature (note: it is possible that \(n<n_{inp}\)).
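The dimension rules above can be summarised in a few lines of plain Python. The helper feature_dim below is hypothetical and only restates the three cases listed for dim(); it is not part of molann.

```python
def feature_dim(feature_type, n_atoms, use_angle_value=False):
    """Output dimension d for one feature, following the rules listed for dim().

    Hypothetical helper: feature_type is one of 'angle', 'bond',
    'dihedral' or 'position'; n_atoms is the number of atoms in the feature.
    """
    if feature_type in ("angle", "bond"):
        return 1
    if feature_type == "dihedral":
        # A single angle in radians, or the pair [cosine, sine].
        return 1 if use_angle_value else 2
    if feature_type == "position":
        return 3 * n_atoms
    raise ValueError(f"unknown feature type: {feature_type!r}")
```

For a list of features, the total output dimension would then be the sum of the individual values.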
- forward(x)[source]
map position to feature
- Parameters:
x (torch.Tensor) – 3d tensor that contains coordinates of states
- Returns:
torch.Tensor, 2d tensor that contains features of the states
x should be a 3d tensor with shape \([l, n_{inp}, 3]\), where \(l\) is the number of states in x and \(n_{inp}\) is the total number of atoms in the atom group input_atom_group.
The output is a tensor with shape \([l, d]\), where \(l\) is the number of states in x and \(d\) is the dimension returned by dim().
For ‘angle’, if use_angle_value=True, it returns angle values in \([0, \pi]\); otherwise, it returns the cosine values of the angles.
For ‘dihedral’, if use_angle_value=True, it returns angle values in \([-\pi, \pi]\); otherwise, it returns [cosine, sine] of the angles.
For ‘position’, it returns the coordinates of all the atoms in the feature.
- Raises:
AssertionError – if x is not a Torch tensor with sizes \([*, n_{inp},3]\).
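To make the ‘angle’ and ‘dihedral’ outputs concrete, here is a minimal NumPy sketch for a single state. It is an illustration under stated assumptions, not molann's implementation: the function names are hypothetical, the angle is assumed to be measured at the middle atom, and the sign convention of the dihedral (the common atan2 formula) may differ from the one FeatureMap uses.

```python
import numpy as np

def angle_value(p0, p1, p2):
    """Angle at p1 between the directions to p0 and p2, in [0, pi]."""
    r1, r2 = p0 - p1, p2 - p1
    cos_a = np.dot(r1, r2) / (np.linalg.norm(r1) * np.linalg.norm(r2))
    # Clip guards against round-off pushing the cosine outside [-1, 1].
    return np.arccos(np.clip(cos_a, -1.0, 1.0))

def dihedral_value(p0, p1, p2, p3):
    """Dihedral angle of four points, in [-pi, pi] (atan2 convention)."""
    b1, b2, b3 = p1 - p0, p2 - p1, p3 - p2
    n1, n2 = np.cross(b1, b2), np.cross(b2, b3)
    y = np.dot(b2, np.cross(n1, n2))
    x = np.linalg.norm(b2) * np.dot(n1, n2)
    return np.arctan2(y, x)

def dihedral_feature(p0, p1, p2, p3, use_angle_value=False):
    """Either [angle] or [cosine, sine], mirroring the two output modes."""
    phi = dihedral_value(p0, p1, p2, p3)
    if use_angle_value:
        return np.array([phi])
    return np.array([np.cos(phi), np.sin(phi)])
```

Both quantities depend only on differences of coordinates and on dot and cross products, so they do not change when the whole state is rotated and translated.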
- class molann.ann.FeatureLayer(feature_list, input_atom_group, use_angle_value=False)[source]
ANN layer that maps coordinates to all features in a feature list
- Parameters:
feature_list (list of molann.feature.Feature) – list of features
input_atom_group (MDAnalysis.core.groups.AtomGroup) – atom group used as input.
use_angle_value (boolean) – whether to use angle value in radians
This class encapsulates FeatureMap and maps input coordinates to multiple features. More concretely, it defines the map
\[x \in \mathbb{R}^{n_{inp} \times 3} \longrightarrow (f_1(x), f_2(x), \dots, f_l(x))\,,\]
where \(n_{inp}\) is the number of atoms in the atom group input_atom_group, \(l\) is the number of features in the feature list, and each \(f_i\) is the feature map defined by the class FeatureMap.
- Raises:
AssertionError – if feature_list is empty.
Example:
    import MDAnalysis as mda
    from molann.ann import FeatureLayer
    from molann.feature import Feature
    import torch

    # pdb file of the system
    pdb_filename = '/path/to/system.pdb'
    ref = mda.Universe(pdb_filename)

    f1 = Feature('name', 'dihedral', ref.select_atoms('bynum 1 3 2 4'))
    f2 = Feature('name', 'angle', ref.select_atoms('bynum 1 3 2'))
    f3 = Feature('name', 'bond', ref.select_atoms('bynum 1 3'))
    input_ag = ref.select_atoms('bynum 1 2 3 4 5 6')

    # define feature layer using features f1, f2 and f3
    f_layer = FeatureLayer([f1, f3, f2], input_ag, use_angle_value=False)
    print('output dim=', f_layer.output_dimension())

    x = torch.tensor(input_ag.positions).unsqueeze(0)
    print(f_layer(x))

    ff = f_layer.get_feature(0)
    print(f_layer.get_feature_info())

    feature_layer_model_name = 'feature_layer.pt'
    torch.jit.script(f_layer).save(feature_layer_model_name)
The following code defines an identity feature layer (for the first three atoms).
    ag = ref.select_atoms('bynum 1 2 3')
    f4 = Feature('identity', 'position', ag)
    identity_f_layer = FeatureLayer([f4], ag, use_angle_value=False)
- forward(x)[source]
forward map
- Parameters:
x (torch.Tensor) – 3d tensor that contains coordinates of states
- Returns:
torch.Tensor, 2d tensor that contains all features (in the feature list) of states
This function simply calls FeatureMap.forward() for each feature in the feature list and then concatenates the tensors.
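The call-then-concatenate pattern can be sketched with NumPy as follows. The helpers bond_length, positions and feature_layer_forward are hypothetical stand-ins for per-feature maps and for the layer's forward pass; they only illustrate how outputs of shape \([l, d_i]\) are joined into one tensor of shape \([l, \sum_i d_i]\).

```python
import numpy as np

def bond_length(x, i, j):
    """Distance between atoms i and j for every state; shape [l, 1]."""
    return np.linalg.norm(x[:, i] - x[:, j], axis=1, keepdims=True)

def positions(x, atom_idx):
    """Flattened coordinates of the chosen atoms; shape [l, 3 * len(atom_idx)]."""
    return x[:, atom_idx, :].reshape(x.shape[0], -1)

def feature_layer_forward(x, feature_maps):
    """Apply each map to x and concatenate the results along the feature axis."""
    return np.concatenate([f(x) for f in feature_maps], axis=1)

# Two states of four atoms each; the values are arbitrary illustration data.
x = np.arange(2 * 4 * 3, dtype=float).reshape(2, 4, 3)
maps = [lambda s: bond_length(s, 0, 1), lambda s: positions(s, [1, 2])]
out = feature_layer_forward(x, maps)  # shape (2, 1 + 6)
```

In this sketch the columns of the output follow the order of the maps in the list.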
- get_feature(idx)[source]
- Parameters:
idx (int) – index of feature in feature list
- Returns:
molann.feature.Feature, the \(idx\)-th feature in the feature list
- get_feature_info()[source]
display information of features
- Returns:
pandas.DataFrame, information of features
- class molann.ann.PreprocessingANN(align_layer, feature_layer)[source]
ANN that performs preprocessing of states
- Parameters:
align_layer (AlignmentLayer or None) – alignment layer
feature_layer (FeatureLayer) – feature layer
Example:
    import MDAnalysis as mda
    from molann.ann import AlignmentLayer, FeatureLayer, PreprocessingANN
    from molann.feature import Feature
    import torch

    # pdb file of the system
    pdb_filename = '/path/to/system.pdb'
    ref = mda.Universe(pdb_filename)
    ag = ref.select_atoms('bynum 1 2 3')
    input_ag = ref.select_atoms('bynum 1 2 3 4 5 6 7')

    # define alignment layer
    align = AlignmentLayer(ag, input_ag)

    # features are just positions of atoms 1, 2 and 3.
    f1 = Feature('name', 'position', ag)
    f_layer = FeatureLayer([f1], input_ag, use_angle_value=False)

    # put together to get the preprocessing layer
    pp_layer = PreprocessingANN(align, f_layer)

    x = torch.tensor(input_ag.positions).unsqueeze(0)
    print(pp_layer(x))
When the feature is already both translation- and rotation-invariant, alignment is not necessary:
    # define feature as dihedral angle
    f1 = Feature('name', 'dihedral', ref.select_atoms('bynum 1 3 2 4'))
    f_layer = FeatureLayer([f1], input_ag, use_angle_value=False)

    # since the feature is both translation- and rotation-invariant, alignment is not necessary
    pp_layer = PreprocessingANN(None, f_layer)
If only alignment is desired, one can provide an identity feature layer when defining PreprocessingANN.
    f = Feature('identity', 'position', input_ag)
    identity_f_layer = FeatureLayer([f], input_ag, use_angle_value=False)
    pp_layer = PreprocessingANN(align, identity_f_layer)
- forward(x)[source]
forward map that aligns states and then maps to features
- Parameters:
x (torch.Tensor) – 3d tensor that contains coordinates of states
- Returns:
2d torch.Tensor
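The composition performed by this forward map, including the case where no alignment layer is given, can be sketched as below. The function preprocess and the two stand-in callables are hypothetical; in particular, the stand-in alignment only removes the centroid and does not perform a Kabsch rotation.

```python
import numpy as np

def preprocess(x, feature_layer, align_layer=None):
    """Align the states if an alignment layer is given, then map to features."""
    if align_layer is not None:
        x = align_layer(x)
    return feature_layer(x)

# Stand-ins for illustration only (not molann classes).
center_only = lambda s: s - s.mean(axis=1, keepdims=True)  # translation part only
flatten_positions = lambda s: s.reshape(s.shape[0], -1)    # identity-like feature layer

x = np.arange(2 * 3 * 3, dtype=float).reshape(2, 3, 3)  # 2 states, 3 atoms
with_alignment = preprocess(x, flatten_positions, center_only)
without_alignment = preprocess(x, flatten_positions)
```

Both results are 2d arrays with one row per state, matching the 3d-in, 2d-out contract described above.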
- class molann.ann.MolANN(preprocessing_layer, ann_layers)[source]
ANN that incorporates a preprocessing layer and the remaining layers, which contain the trainable parameters.
- Parameters:
preprocessing_layer (PreprocessingANN) – preprocessing layer
ann_layers (torch.nn.Module) – remaining layers
Example:
    import MDAnalysis as mda
    from molann.ann import FeatureLayer, PreprocessingANN, MolANN, create_sequential_nn
    from molann.feature import Feature

    # pdb file of the system
    pdb_filename = '/path/to/system.pdb'
    ref = mda.Universe(pdb_filename)

    f1 = Feature('name', 'dihedral', ref.select_atoms('bynum 1 3 2 4'))
    input_ag = ref.select_atoms('bynum 1 2 3 4 5 6 7')
    f_layer = FeatureLayer([f1], input_ag, use_angle_value=False)
    pp_layer = PreprocessingANN(None, f_layer)
    output_dim = pp_layer.output_dimension()

    # neural network layers that contain the trainable parameters
    nn = create_sequential_nn([output_dim, 5, 3])
    model = MolANN(pp_layer, nn)
- preprocessing_layer
- Type:
- ann_layers
- Type:
- get_preprocessing_layer()[source]
- Returns:
PreprocessingANN, the preprocessing_layer
- molann.ann.create_sequential_nn(layer_dims, activation=Tanh())[source]
Construct a feedforward PyTorch neural network
- Parameters:
layer_dims (list of int) – dimensions of layers
activation – PyTorch non-linear activation function
- Raises:
AssertionError – if length of layer_dims is not larger than 1.
Example
    from molann.ann import create_sequential_nn
    import torch

    nn1 = create_sequential_nn([10, 5, 1])
    nn2 = create_sequential_nn([10, 2], activation=torch.nn.ReLU())
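To see what a given layer_dims implies, the following plain-Python sketch lists the (input, output) sizes of consecutive layers and counts parameters. Both helpers are hypothetical, and the count assumes that every layer is a fully connected linear layer with a bias term, which this page does not state explicitly.

```python
def linear_layer_shapes(layer_dims):
    """Pairs (n_in, n_out) of consecutive entries in layer_dims."""
    if len(layer_dims) < 2:
        raise ValueError("layer_dims needs at least two entries")
    return list(zip(layer_dims[:-1], layer_dims[1:]))

def count_parameters(layer_dims):
    """Weights plus biases, assuming dense linear layers with bias."""
    return sum(n_in * n_out + n_out for n_in, n_out in linear_layer_shapes(layer_dims))

shapes = linear_layer_shapes([10, 5, 1])  # [(10, 5), (5, 1)]
n_params = count_parameters([10, 5, 1])   # (10*5 + 5) + (5*1 + 1) = 61
```

Under the same assumption, layer_dims = [10, 2] describes a single layer, so no hidden layer is present.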