The proto-Nucleic-Acid Builder (pNAB)
Public Member Functions | Private Member Functions | Private Attributes | List of all members
PNAB::ConformationSearch Class Reference

A rotor search function used to find acceptable conformations of arbitrary backbone and helical parameter combinations. The main class of the proto-Nucleic Acid Builder. More...

#include <ConformationSearch.h>

Collaboration diagram for PNAB::ConformationSearch:
Collaboration graph
[legend]

Public Member Functions

 ConformationSearch (PNAB::RuntimeParameters &runtime_params, PNAB::Backbone &backbone, PNAB::HelicalParameters &helical_params, PNAB::Bases bases, std::string prefix="test", bool verbose=true)
 Constructor for the conformation search class. More...
 
std::string run ()
 A function to call the appropriate search algorithm using the provided RuntimeParameters::search_algorithm. More...
 

Private Member Functions

void SystematicSearch ()
 Given a step size, the algorithm exhaustively searches over all the rotatable dihedral angles in the backbone. More...
 
void RandomSearch (bool weighted)
 The algorithm randomly changes all the dihedral angles in the backbone and evaluates whether they are acceptable. More...
 
void MonteCarloSearch (bool weighted)
 The algorithm utilizes the Metropolis Monte Carlo scheme to improve the choice of the dihedral angle. More...
 
void GeneticAlgorithmSearch ()
 This algorithm utilizes the genetic algorihtm procedure to improve the choice of the dihedral angle. More...
 
std::vector< std::piecewise_linear_distribution< double > > WeightedDistributions ()
 Produces weighted distributions for each rotatable dihedral angle in the backbone. More...
 
double measureDistance (double *coords, unsigned head, unsigned tail)
 Compute the distance between the head backbone atom and the tail backbone atom of the next nucleotide. More...
 
void reportData (PNAB::ConformerData &conf_data)
 A function to report the data on the accepted candidates. More...
 
void printProgress (std::size_t search_index, std::size_t search_size)
 Prints the percentage of search completed and the best accepted candidate. More...
 

Private Attributes

bool verbose_
 Whether to print progress report to screen. More...
 
PNAB::RuntimeParameters runtime_params_
 The runtime parameters instance, RuntimeParameters. More...
 
std::array< unsigned, 2 > backbone_range_
 The Backbone index range for the first nucleotide. More...
 
PNAB::Backbone backbone_
 The backbone molecule. More...
 
PNAB::HelicalParameters helical_params_
 the helical parameters More...
 
PNAB::Bases bases_
 the list of the defined bases More...
 
std::mt19937_64 rng_
 A random number generator. More...
 
int number_of_candidates = 0
 The number of accepted candiates. More...
 
OpenBabel::matrix3x3 step_rot_
 The step rotation matrix, HelicalParameters::getStepRotationMatrix. More...
 
OpenBabel::matrix3x3 glbl_rot_
 The global rotation matrix, HelicalParameters::getGlobalRotationMatrix. More...
 
OpenBabel::vector3 step_translate_
 The step translation vector, HelicalParameters::getStepTranslationVec. More...
 
OpenBabel::vector3 glbl_translate_
 The global translation vector, HelicalParameters::getGlobalTranslationVec. More...
 
OpenBabel::OBConversion conv_
 An openbabel conversion object used to write accepted candidates. More...
 
unsigned monomer_num_coords_
 3*The number of atoms in the first nucleotide More...
 
std::string prefix_
 A string to prepend the name of the accepted backbone candidates. More...
 
BaseUnit unit
 The nucleotide unit for the first nucleotide in the strand. Used for searching for conformations. More...
 
OpenBabel::OBMol bu_a_mol
 The molecule of the first nucleotide. More...
 
unsigned head
 The first terminal atom in the backbone. More...
 
unsigned tail
 The second terminal atom in the backbone. More...
 
std::string output_string
 A string in CSV format containing the properties of the accepted candidates. More...
 
double * coords
 The coordinates of the first nucleotide. More...
 
OpenBabel::OBRotorList rl
 The list of the all rotatable dihedral angles in the backbone. More...
 
std::vector< OpenBabel::OBRotor * > rotor_vector
 A vector of dihedral angles to be rotated in the search. Excludes fixed angles. More...
 

Detailed Description

A rotor search function used to find acceptable conformations of arbitrary backbone and helical parameter combinations. The main class of the proto-Nucleic Acid Builder.

The general algorithm follows the following steps:

  1. Define the input parameters:
    • The three-dimensional structure of the backbone, the atoms that form the connection to the base, and the two atoms (head and tail) that link the backbone with two adjacent backbones.
    • The helical configuration of the nucleobases.
    • The runtime parameters:
      • The specific search algorithm (see below).
      • The distance and energy thresholds (see below).
      • The sequence of the nucleobases and whether the system is single-stranded, a duplex, or a hexad.
  2. Connect the backbone with the first nucleobase in the sequence.
  3. Fix all the bond distances, bond angles, and the rigid dihedral angles in the backbone. Fix all the atoms of the nucleobase.
  4. Change the dihdedal angles in the backbone using one of the search algorithm.
  5. Translate and rotate the tail atom of the backbone using the given helical parameters.
  6. Measure the distance between the head and the tail atoms of the backbone.
  7. If the distance is greater than the distance threshold, go to 4. Else, proceed.
  8. Arrange the nucleobases using the given helical parameters. Replicate the backbone structure across all the nucleotides.
  9. Evaluate various energy terms for the system.
  10. If the energy terms are less than the energy thresholds, save the structure. Else, reject the structure.
  11. Repeat steps 4-10 or quit.

The various search algorithm differ in the implementation of step 4. Four classes of search algorithms are implemented here, namely:

Details on the specific implementation of each algorithm are discussed below.

See also
SystematicSearch
RandomSearch
MonteCarloSearch
GeneticAlgorithmSearch
HelicalParameters
RuntimeParameters
Backbone
BaseUnit
Chain

Constructor & Destructor Documentation

◆ ConformationSearch()

ConformationSearch::ConformationSearch ( PNAB::RuntimeParameters runtime_params,
PNAB::Backbone backbone,
PNAB::HelicalParameters helical_params,
PNAB::Bases  bases,
std::string  prefix = "test",
bool  verbose = true 
)

Constructor for the conformation search class.

The constructor takes the input parameters and constructs a single nucleotide. Then, it determines the rotatable dihedral angles in the backbone.

Parameters
runtime_paramsSeries of parameters that control how the search algorithm runs
backboneThe backbone molecule over which the algorithm searches
helical_paramsThe geometric parameters constraining the possible conformations of the backbone
basesList of defined bases that serves of as the library used for building the strand
prefixA string to prepend the name of the accepted backbone candidates
verboseWhether to print progress report to screen

Member Function Documentation

◆ GeneticAlgorithmSearch()

void ConformationSearch::GeneticAlgorithmSearch ( )
private

This algorithm utilizes the genetic algorihtm procedure to improve the choice of the dihedral angle.

In the genetic algorithm search, a population of rotamers is initialized with random dihedral angles. At each iteration (or generation), certain rotamers (parents) are chosen for reproduction based on their fitness. The offsprings are generated by allowing the two parents to exchange (crossover) one dihedral angle. Then, a random mutation (or a new value for the dihedral angle) may be introduced for one dihedral angle. The new population consisting of the offsprings then follow the same procedure. The algorithm finishes after the specified number of generations. The length of the search is proportional to the number of generations multiplied by the size of the population.

The fitness of the individuals is computed as the inverse of the distance between the head atom of the backbone and the tail atom of the adjacent backbone. The individuals in the population are ranked by their fitness score, and the probability of choosing individuals for mating is proportional to their ranking.

The exchange of the dihedral angles between the parents and the introduction of new dihedral angles is determined by the crossover and mutation probabilities.

The algorithm significantly accelerates the finding of dihedral angles that satisfy the distance threshold. However, the distance threshold only indicates that the candidate backbone configuration can form a periodic structure. It does not indicates whether the backbone configuration is low or high in energy. Therefore, introducing mutations with higher probability is useful in preventing the formation of a homogeneous population that has low distances but high energies.

The probability of choosing a random angle between 0 and 360° is uniform.

See also
RuntimeParameters::num_steps
RuntimeParameters::population_size
RuntimeParameters::crossover_rate
RuntimeParameters::mutation_rate
measureDistance
Chain::generateConformerData

◆ measureDistance()

double ConformationSearch::measureDistance ( double *  coords,
unsigned  head,
unsigned  tail 
)
private

Compute the distance between the head backbone atom and the tail backbone atom of the next nucleotide.

This function applies the global translation and rotation to both the head and the tail atoms of the first nucleotide. Then, it applies the step translation and rotation to the tail atom to get its coordinates in the second nucleotide. Then, it computes the distance between the two atoms. This function is called by all the search algorithms.

Parameters
coordsThe coordinates of the first nucleotide
headThe index of the first terminal atom in the backbone of the first nucleotide
tailThe index of the second terminal atom the backbone of the first nucleotide
Returns
The distance between the two atoms in Angstroms
See also
RuntimeParameters::max_distance
ConformationSearch

◆ MonteCarloSearch()

void ConformationSearch::MonteCarloSearch ( bool  weighted)
private

The algorithm utilizes the Metropolis Monte Carlo scheme to improve the choice of the dihedral angle.

At every iteration, this algorithm randomly chooses two dihedrals and set them to random values. If the new configuration decreases the distance between the head atom of the backbone and the tail atom of the adjacent backbone or if the new distance is less than 1 Angstrom, this step is accepted. If the new configuration does not decrease the distance, the step is provisionally accpeted if a random number between 0 and 1 is less than \(exp(\frac{E}{k_{B}T})\), where \(E\) is a penalty term based on the distance, and \(k_{B}T\) is the Boltzmann constant multiplied by the temperature. If the step is provisionally accepted, the system is constructed and the energy terms of the system are computed. If the energy terms are higher than the thresholds, the step is finally rejected. The algorithm finishes after the specified number of iterations.

The penalty term \(E\) is given by \(E=-k(\textrm{currrent distance} - \textrm{previous distance})^2\). \(k\) is a bond stretching constant fixed at 300 kcal/mol/Angstrom \(^2\). The temperature controls how agressively the algorithm should provisionally accept or reject steps. An infinite temperature will reduce the algorithm to a random search with two dihedral angles changed at every step.

The algorithm significantly accelerates the finding of dihedral angles that satisfy the distance threshold. However, the distance threshold only indicates that the candidate backbone configuration can form a periodic structure. It does not indicates whether the backbone configuration is low or high in energy. Therefore, the ultimate criteria for accepting or rejecting a step, if the distance is not decreased in the step, is whether it satisfies the energy thresholds.

The probability of choosing a random angle between 0 and 360° can be uniform or can be weighted. See the description in WeightedDistributions for an explanation on the weighting scheme for the dihedral angles.

Parameters
weightedWhether to use a weighted distribution for the dihedral angles
See also
RuntimeParameters::num_steps
RuntimeParameters::weighting_temperature
RuntimeParameters::monte_carlo_temperature
WeightedDistributions
measureDistance
Chain::generateConformerData

◆ printProgress()

void ConformationSearch::printProgress ( std::size_t  search_index,
std::size_t  search_size 
)
private

Prints the percentage of search completed and the best accepted candidate.

Prints to the standard output the percentage of the search completed, the name of the PDB file containing the best accepted candidte, its distance and energy, and the number of accepted candidates.

Parameters
search_indexThe step number in the search
search_sizeThe total number of steps

◆ RandomSearch()

void ConformationSearch::RandomSearch ( bool  weighted)
private

The algorithm randomly changes all the dihedral angles in the backbone and evaluates whether they are acceptable.

At every iteration, this algorithm sets all the dihedral angles in the backbone to random values. This algorithm is purely random and each iteration is completely independent from the previous iteration. This algorithm can sample the dihedral angle space much faster than the Systemtic Search algorithm. Therefore, it may find an acceptable candidate faster. However, it is not deterministic. As each iteration is completely independent, this algorithm does not become trapped in unfavorable configurations. Nevertheless, the search does not improve with time. The algorithm finishes after the specified number of iterations.

The probability of choosing a random angle between 0 and 360° can be uniform or can be weighted. See the description in WeightedDistributions for an explanation on the weighting scheme for the dihedral angles.

Parameters
weightedWhether to use a weighted distribution for the dihedral angles
See also
RuntimeParameters::num_steps
RuntimeParameters::weighting_temperature
WeightedDistributions
measureDistance
Chain::generateConformerData

◆ reportData()

void ConformationSearch::reportData ( PNAB::ConformerData conf_data)
private

A function to report the data on the accepted candidates.

This function saves the structure of each accepted candidate in PDB format. It also reports the properties of the candidates, ordered by their total energies, in a string with the CSV format. This is called by all search algorithms when a candidate is accepted. The ConformationSearch::output_string variable is updated.

Parameters
conf_dataThe properties of the accepted candiate

◆ run()

std::string ConformationSearch::run ( )

A function to call the appropriate search algorithm using the provided RuntimeParameters::search_algorithm.

Returns
A string in CSV format containing the properties of the accepted candidates. The structure of the accepted candidates are saved as PDB files.
See also
SystematicSearch
RandomSearch
MonteCarloSearch
GeneticAlgorithmSearch

◆ SystematicSearch()

void ConformationSearch::SystematicSearch ( )
private

Given a step size, the algorithm exhaustively searches over all the rotatable dihedral angles in the backbone.

The systematic search algorithm works by exhaustively rotating all the dihedral angles between 0 and 360° using the given step size. This is the simplest search algorithm. However, it is the most time-consuming as the time grows exponentially with the number of rotatable bonds. The number of steps required for the search is \((\frac{360.0}{\textrm{dihedral step}})^{\textrm{number of rotatable dihedrals}}\). This algorithm is deterministic and reproducible. It is guranteed to find an acceptable candidate, if any exists, to within the given resolution. This algorithm can be used to validate the results of the other algorithms and to determine whether the other algorithms found all the families of the acceptable candidates.

See also
RuntimeParameters::dihedral_step
measureDistance
Chain::generateConformerData

◆ WeightedDistributions()

std::vector< std::piecewise_linear_distribution< double > > ConformationSearch::WeightedDistributions ( )
private

Produces weighted distributions for each rotatable dihedral angle in the backbone.

Instead of using a uniform distrbution for the dihedral angles between 0 and 360°, this function generates weighted distributions for each rotatable dihedral angle in the backbone. The energy of each dihedral angle in isolation is computed every 5°. Then, the probability for a given angle is computed as the \(\frac{exp(\frac{E_i}{k_{B}T})}{\sum_i exp(\frac{E_i}{k_{B}T})}\), where \(E_i\) is the dihedral energy at angle \(i\). The probability in each interval is the linear interpolation of the probabilities at the limits of the interval. The temperature controls how agressively the algorithm should weight the dihedral angles. An infinite temperature will reduce the probabilities to uniformly random distributions.

This weighting scheme is useful in increasing the sampling of the dihedral angles where the energy of the dihedral angle is low. However, the energy of the dihedral angle in isolation does not necessarily mean the nucleotide energy is going to be low. This weighting scheme may improve the search if the acceptable candidates adopt structures where the dihedral angles are not strained. However, it may worsen the search if the unstrained dihedral angles do not produce low-energy candidates.

Returns
A vector of the weighted distributions for each rotatable dihedral angle
See also
RuntimeParameters::weighting_temperature
RandomSearch
MonteCarloSearch

Member Data Documentation

◆ backbone_

PNAB::Backbone PNAB::ConformationSearch::backbone_
private

The backbone molecule.

◆ backbone_range_

std::array<unsigned, 2> PNAB::ConformationSearch::backbone_range_
private

The Backbone index range for the first nucleotide.

◆ bases_

PNAB::Bases PNAB::ConformationSearch::bases_
private

the list of the defined bases

◆ bu_a_mol

OpenBabel::OBMol PNAB::ConformationSearch::bu_a_mol
private

The molecule of the first nucleotide.

◆ conv_

OpenBabel::OBConversion PNAB::ConformationSearch::conv_
private

An openbabel conversion object used to write accepted candidates.

◆ coords

double* PNAB::ConformationSearch::coords
private

The coordinates of the first nucleotide.

◆ glbl_rot_

OpenBabel::matrix3x3 PNAB::ConformationSearch::glbl_rot_
private

The global rotation matrix, HelicalParameters::getGlobalRotationMatrix.

◆ glbl_translate_

OpenBabel::vector3 PNAB::ConformationSearch::glbl_translate_
private

The global translation vector, HelicalParameters::getGlobalTranslationVec.

◆ head

unsigned PNAB::ConformationSearch::head
private

The first terminal atom in the backbone.

◆ helical_params_

PNAB::HelicalParameters PNAB::ConformationSearch::helical_params_
private

the helical parameters

◆ monomer_num_coords_

unsigned PNAB::ConformationSearch::monomer_num_coords_
private

3*The number of atoms in the first nucleotide

◆ number_of_candidates

int PNAB::ConformationSearch::number_of_candidates = 0
private

The number of accepted candiates.

◆ output_string

std::string PNAB::ConformationSearch::output_string
private

A string in CSV format containing the properties of the accepted candidates.

◆ prefix_

std::string PNAB::ConformationSearch::prefix_
private

A string to prepend the name of the accepted backbone candidates.

◆ rl

OpenBabel::OBRotorList PNAB::ConformationSearch::rl
private

The list of the all rotatable dihedral angles in the backbone.

◆ rng_

std::mt19937_64 PNAB::ConformationSearch::rng_
private

A random number generator.

◆ rotor_vector

std::vector<OpenBabel::OBRotor*> PNAB::ConformationSearch::rotor_vector
private

A vector of dihedral angles to be rotated in the search. Excludes fixed angles.

◆ runtime_params_

PNAB::RuntimeParameters PNAB::ConformationSearch::runtime_params_
private

The runtime parameters instance, RuntimeParameters.

◆ step_rot_

OpenBabel::matrix3x3 PNAB::ConformationSearch::step_rot_
private

The step rotation matrix, HelicalParameters::getStepRotationMatrix.

◆ step_translate_

OpenBabel::vector3 PNAB::ConformationSearch::step_translate_
private

The step translation vector, HelicalParameters::getStepTranslationVec.

◆ tail

unsigned PNAB::ConformationSearch::tail
private

The second terminal atom in the backbone.

◆ unit

BaseUnit PNAB::ConformationSearch::unit
private

The nucleotide unit for the first nucleotide in the strand. Used for searching for conformations.

◆ verbose_

bool PNAB::ConformationSearch::verbose_
private

Whether to print progress report to screen.


The documentation for this class was generated from the following files: