GUI-HBDock: Reference Manual
General Information
GUI-HBDock program modules
Additional Sections
This section contains general information about the HBDock and GUI-HBDock programs, their execution environment, and usage principles.
The HBDock program performs computations on three-dimensional protein and nucleic acid structures and is designed to search for docking sites of low-molecular-weight ligands. It uses a mathematical model within the framework of classical molecular mechanics. Atoms are treated as classical point particles that interact with each other and with the environment through covalent and non-covalent interatomic forces. The physical model of these forces is defined by the force field model for interatomic interactions. The program is designed to work with protein molecules consisting of the 20 types of natural amino acid residues and with nucleic acid molecules (DNA and RNA) consisting of the 4 types of natural nucleotides and several types of minor nucleotides that occur in natural tRNA. The total number of atoms in a protein molecule, a nucleic acid, or their complex must not exceed 10000. This limit accommodates most known protein molecules and nucleic acid fragments.
The blind docking method used by the HBDock program was published in the Journal of Computational Chemistry in 2010. The author can be contacted by e-mail: ynvorob@niboch.nsc.ru
The program supports Linux and Microsoft Windows operating systems.
The GUI-HBDock program is designed to create control files for the HBDock program interactively. Interactive mode here means that parameters are set through familiar, clearly labeled controls rather than typed manually as text, which would require exact knowledge of the keywords and strict formatting. The program also checks the PDB file for mistakes, modifies it so that HBDock can read it without errors, and performs basic checks that input values are of the required type and within the allowed range. The program creates a command file (for Windows and Linux operating systems) that makes it possible to start computations on the client side immediately. The operating system is detected automatically, so users do not need to deal with operating-system-specific details.
The GUI-HBDock program fully supports all widespread web browsers: Microsoft Internet Explorer (version 6.0 and above) and all browsers based on its core (e.g., Maxthon), Mozilla Firefox (versions 2 and 3, including Linux versions), Opera, and Google Chrome. Minor differences in page layout and in the look and feel of controls may be noticeable, but they do not interfere with using the program.
One can start working with the program right after opening it in a web browser window. The main menu on the left switches between task modules. The modules are grouped for the user's convenience and belong to a common subject area. The title of the selected module is highlighted in white. Work within a module proceeds step by step: the next step becomes enabled once the previous one is completed successfully. All intermediate data are stored. After all steps are completed successfully, an archive containing all the control files is generated. The archive includes a command file to launch the HBDock program, a main control file, and additional files if they are required. The archive is in tar.gz format and can be opened with suitable archivers in both Windows and Linux operating systems.
To identify users, the program uses its own session mechanism based on users' IP addresses (both external and internal, if the latter can be resolved) and unique identifiers stored in cookies. Cookies should therefore be enabled, and the program should be allowed to write them. No data other than those directly related to the program are written.
A user session expires after 15 minutes. If the user is away for longer than that, all entered data may be lost. The same may happen if the program cannot work with cookies: in that case the unique identifier is lost, the program allocates a new one, and a new session is initialized.
Starting a new module resets all entered data. To clear the data, it is therefore enough to select the first step of any module in the main menu.
The GUI-HBDock program supports multiple languages. At the moment only English and Russian are available, but more can be added at the administrator's discretion. English is the default language. To switch languages, click the icon with the corresponding national flag in the top right corner of the program's layout. A hint with the language name is shown when the mouse pointer is moved over the icon. The selected language is stored in cookies and restored automatically on subsequent runs.
The protein preparation module makes it possible to check and, if necessary, correct a protein PDB file, set up preparatory modeling parameters, and generate all required files.
1. Protein PDB file check
A PDB validation function determines whether the HBDock program can work with the user's PDB file. The file to be validated is uploaded to the server and analyzed. If it contains critical errors, a detailed description of them is displayed. If the errors are insignificant, the validation program corrects them and offers a valid PDB file for download. Besides correcting insignificant errors (see below), the validation program removes all information that is not parsed by the HBDock program (in particular, remarks and heteroatoms; removing heteroatoms may trigger some of the errors described below). If alternate atom coordinates are present, the program keeps only the atoms of group A, and if there are multiple models, it keeps only the atoms of model 1. Therefore, do not overwrite the original PDB file with the valid one; save it separately. The program also searches for cysteine residues that can potentially form an S-S bond and lets the user decide whether each particular bond should be formed (this choice takes effect when modeling with the BioPASED program; HBDock resolves S-S bonds automatically).
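The filtering and the cysteine search described above can be sketched in Python as follows. This is only an illustration, not the validator's actual code: the function names are hypothetical, the column positions follow the standard fixed-width PDB layout, and the 3.0 angstrom cut-off between SG atoms is an assumed value that the manual does not specify.

```python
import math

def keep_model1_altloc_a(pdb_lines):
    """Keep ATOM records of the first model whose altLoc is blank or 'A'."""
    kept = []
    models_seen = 0
    for line in pdb_lines:
        record = line[:6].strip()
        if record == "MODEL":
            models_seen += 1
        elif record == "ENDMDL" and models_seen >= 1:
            break  # everything after the first model is ignored
        elif record == "ATOM" and len(line) > 16 and line[16] in (" ", "A"):
            kept.append(line)  # remarks and HETATM records are dropped
    return kept

def find_ss_candidates(atom_lines, max_dist=3.0):
    """Return pairs of cysteine residue numbers whose SG atoms are close.

    max_dist (angstroms) is an assumed cut-off, not the validator's value.
    """
    sg_atoms = []
    for line in atom_lines:
        if line[17:20] == "CYS" and line[12:16].strip() == "SG":
            res_no = int(line[22:26])
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            sg_atoms.append((res_no, xyz))
    pairs = []
    for i in range(len(sg_atoms)):
        for j in range(i + 1, len(sg_atoms)):
            if math.dist(sg_atoms[i][1], sg_atoms[j][1]) <= max_dist:
                pairs.append((sg_atoms[i][0], sg_atoms[j][0]))
    return pairs
```

Each returned pair is only a candidate; as described above, the user decides whether the bond is actually formed.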
The following errors are considered critical:
- Too many atoms: the HBDock program can currently handle no more than 10000 atoms. If the number of atoms in the PDB file exceeds the limit, the program will not be able to load it. Some atoms must be deleted before the program can work with the file. The number of atoms is shown in the error text; heteroatoms are not counted.
- Broken protein backbone in residue: this error is raised if an amino acid residue lacks any of the four atoms that form the protein backbone (C, CA, N and O). The number of the broken residue is shown in the error text. The PDB file must be edited to eliminate the break.
- Broken chain between residues: the residue sequence numbering is disrupted and the distance between two adjacent amino acid residues exceeds 4.2 angstroms. In this case the validator assumes that a residue is missing from the PDB file. This may happen if non-standard residues were written to the PDB file as a list of heteroatoms, which are not parsed. The numbers of the residues between which the disruption occurs are shown in the error text. One must insert the missing residue or residues by editing the PDB file or, if the error is caused by heteroatom removal, replace the heteroatoms with normal atoms of the proper residue.
- Undefined residue: the PDB file contains a nucleotide residue or a modification group that the program does not recognize. This residue should be replaced with one of the common residues by editing the PDB file manually. Undefined amino acid residues do not raise this error (see below).
- Incomplete residue: the PDB file contains a nucleotide residue or a modification group whose heavy atom count differs from the expected atom count. The missing atoms, with correct coordinates, should be added to the structure by editing the PDB file manually.
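The first three critical checks can be approximated with the short Python sketch below. It is a simplified illustration under stated assumptions, not the validator itself: it expects the ATOM records of a single protein chain, the function name is hypothetical, and the gap is measured between CA atoms of neighbouring residues, which is an assumption because the manual does not say which atoms are compared.

```python
import math

MAX_ATOMS = 10000   # atom limit stated in this manual
MAX_GAP = 4.2       # angstroms, threshold stated in this manual
BACKBONE = {"N", "CA", "C", "O"}

def check_critical(atom_lines):
    """Return messages for critical problems found in ATOM records."""
    errors = []
    if len(atom_lines) > MAX_ATOMS:
        errors.append("Too many atoms: %d" % len(atom_lines))

    # Collect atom coordinates per residue, in order of first appearance.
    residues = {}
    for line in atom_lines:
        res_no = int(line[22:26])
        name = line[12:16].strip()
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        residues.setdefault(res_no, {})[name] = xyz

    # A residue is broken if any of the four backbone atoms is absent.
    for res_no, atoms in residues.items():
        if BACKBONE - set(atoms):
            errors.append("Broken protein backbone in residue %d" % res_no)

    # A chain break needs both a numbering jump and a large distance.
    order = list(residues)
    for prev, curr in zip(order, order[1:]):
        ca_prev = residues[prev].get("CA")
        ca_curr = residues[curr].get("CA")
        if curr == prev + 1 or ca_prev is None or ca_curr is None:
            continue
        # Assumption: the gap is measured between CA atoms.
        if math.dist(ca_prev, ca_curr) > MAX_GAP:
            errors.append("Broken chain between residues %d and %d" % (prev, curr))
    return errors
```

Running such a check locally before uploading can show in advance whether a large structure has to be trimmed or repaired.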
The following errors are considered insignificant and are corrected automatically (a corresponding warning is shown):
- Invalid atoms' serial numbers: atom serial numbering must be sequential and start from 1. If it does not, the validation program corrects the atom serial numbers. This error usually appears after a PDB file has been processed (deletion of heteroatoms, amino acid side groups, etc.).
- Invalid residues' serial numbers: residue serial numbering must start from 1. However, sometimes not all residues are resolved in the PDB structure and the first few residues are missing. In that case residue numbering does not start from 1, and the validation program corrects it. Incorrect residue numbering inside the chain leads to a different, critical error: «Broken chain between residues». This correction is performed right before the PDB file is downloaded, so no warning is printed.
- Asterisks in atom names: in some PDB files the atoms of ribose and deoxyribose contain an asterisk in their names instead of a single quote. Such atom names are corrected automatically.
- Undefined aminoacid: if the PDB file contains an unknown amino acid with an intact backbone, the validation program corrects this error by replacing it with a known amino acid and deleting all atoms except the four backbone atoms. All other atoms are restored by the HBDock program during the computation. The user may select the amino acid that replaces the undefined one, or choose not to replace it at all.
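The first three automatic corrections are simple text operations on fixed-width ATOM records, and a minimal Python sketch of them is given below. The function names are hypothetical, and shifting all residue numbers by a constant offset is an assumption about how the numbering is corrected; the validator may do it differently.

```python
def fix_atom_records(atom_lines):
    """Renumber atom serials from 1 and replace '*' with "'" in atom names."""
    fixed = []
    for serial, line in enumerate(atom_lines, start=1):
        name = line[12:16].replace("*", "'")
        # Columns 7-11 hold the serial number, columns 13-16 the atom name.
        fixed.append(line[:6] + "%5d" % serial + line[11:12] + name + line[16:])
    return fixed

def renumber_residues(atom_lines):
    """Shift residue numbers so that the first residue becomes 1.

    Assumption: a constant shift is applied, so gaps inside the chain
    are preserved and can still be reported as chain breaks.
    """
    if not atom_lines:
        return []
    offset = int(atom_lines[0][22:26]) - 1
    return [
        line[:22] + "%4d" % (int(line[22:26]) - offset) + line[26:]
        for line in atom_lines
    ]
```

Both functions return new lists and leave the input untouched, which matches the advice to keep the original PDB file alongside the corrected one.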
The program may also raise errors that are not related to the PDB structure:
- PDB File is empty: there are no atoms in the PDB file, or the specified file is not a PDB file.
- PDB File upload error: the file could not be uploaded to the server. The cause may be an internal server error or a file that is too large.
- Program internal error: an internal error occurred in the validator. This should never happen. If it does, please contact the program's developers.
2. Setting up Parameters
At this step one sets up the parameters of the preparatory modeling, which optimizes the protein structure that is later used in the docking calculation step.
- Hydrogen Atoms: specifies how information about hydrogen atoms is obtained. By default the hydrogen restoration module is used. If only some hydrogen atoms are present, they are ignored and all hydrogen atoms are restored according to the polymer residue topology library. Missing heavy atoms of amino acid side chains are restored automatically by comparing the PDB file read with the residue topology in the library. Hydrogen atoms can also be read from the source structure if it contains all of them; this corresponds to specifying the «$Hread» keyword in the control file.
- Molecule Optimization: specifies whether the whole molecule is optimized (which corresponds to the «$FullProtMD» keyword of the control file) or only fragments of it (which corresponds to the «$MovingRes» keyword). If the «Selected Atoms Only» option is selected, one should specify the segments to be optimized (see «Segment Definition File»). By default the whole molecule is optimized.
- Segment Definition File: opens a separate editor for the segments to be optimized. A segment is defined by its first and last residues, given by the serial numbers of the residues in the protein molecule. The segments to be optimized are stored in a separate input file that is added to the archive downloaded by the user. The file name is passed to HBDock with the «-mv» command-line argument.
- Calculate Energy for Source Structure: specifies whether to perform energy calculation for the source structure. By default it is performed. This parameter corresponds to the «$EngCalc» keyword of the control file.
- Energy Minimization using Local Optimization Method: specifies whether to perform energy optimization using local optimization method. During the computation process the file named MolName.molEnOpt.pdb will be created – this is the result of molecule's structure optimization. By default the optimization is performed. This parameter corresponds to the «$EngOptim» keyword of the control file.
- Minimization Step Count: specifies the number of steps the program takes to minimize energy using the local optimization method. Each minimization step computes a local minimum along a single descent direction of a function of many variables. The acceptable value range is from 1 to 999. By default 10 steps are performed. A value of 1 is reasonable if a previously optimized structure is used for the computations. This parameter corresponds to the «$nOptStep» keyword of the control file.
- Molecular Dynamics Simulation: a general parameter that specifies whether the molecule is optimized using the molecular dynamics method. If it is disabled, all subsequent parameters become disabled and are not used. By default the molecular dynamics simulation is enabled. This parameter corresponds to the «$doMDyn» keyword of the control file.
- Initial Temperature of Molecule, K: specifies the initial temperature of the molecule in kelvins, used to set up the initial distribution of atomic velocities for the molecular dynamics method. The acceptable value range is from 1 to 1000. By default the value of 10 is used. This parameter corresponds to the «$initMDTemp» keyword of the control file.
- Thermostat Temperature, K: specifies the initial temperature of the environment (thermostat) surrounding the molecule, in kelvins. The acceptable value range is from 1 to 1000. By default the value of 100 is used. This parameter corresponds to the «$bathMDTemp» keyword of the control file.
- Modelling Timestep, ps: specifies the integration timestep in picoseconds. The acceptable value range is from 0.0001 to 0.002. The recommended value range is from 0.001 to 0.002. By default the value of 0.001 is used. This parameter corresponds to the «$mdStepTime» keyword of the control file.
- Modelling Step Count: specifies the number of integration steps of the atomic equations of motion for the molecular dynamics method, using the integration timestep set by the previous parameter. The minimum value is 1 step. By default 1000 steps are used, although one will very likely want to increase this value. This parameter corresponds to the «$runMDnstep» keyword of the control file.
- Structure Save Interval, in steps: specifies the number of steps after which the molecule's structure snapshot is written to files named like MolName.molMdRes0001.pdb ... .0xxx.pdb. The minimum value is 1 step. By default the snapshot is written every 100 steps. This parameter corresponds to the «$nwtra» keyword of the control file.
- Simulated Annealing: enables optimization of the molecule using the simulated annealing method. If this mode is enabled, a simulated annealing protocol must be created (see below). By default this mode is disabled. This parameter corresponds to the «$MDSA» keyword of the control file.
- Annealing Protocol: opens a separate editor for creating a simulated annealing protocol. Each line of the protocol contains 5 parameter values:
- number of modelling time steps, that is, the duration in integration steps;
- temperature that the molecule approaches during the molecular dynamics simulation of the specified duration;
- weight coefficient for the repulsive branch of the van der Waals potential;
- weight coefficient for the hydrogen bonds between atoms in the backbone of amino acid residues;
- weight coefficient for the hydrogen bonds between side chain atoms and any other atoms.
The simulated annealing protocol is stored in a separate input file that is added to the archive downloaded by the user. The file name is passed to HBDock with the «-sa» command-line argument.
- Hydrogen Bond Factor: specifies maximum energy of hydrogen bonds (in kcal/mol). Acceptable value range is from 0.0 to 5.0. By default the value of 2.0 is used. This parameter corresponds to the «$hBond128» keyword of the control file.
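The numeric ranges and defaults listed above can be collected into a small table and checked before a control file is written. The Python sketch below does only that; the dictionary layout and the function names are hypothetical, and it does not reproduce the control file syntax, which this section does not describe. The second function shows the arithmetic behind the defaults: 1000 steps of 0.001 ps cover 1 ps, and a save interval of 100 steps gives roughly 10 snapshots.

```python
# keyword: (minimum, maximum or None, default), as listed in this section
RANGES = {
    "$nOptStep":   (1, 999, 10),
    "$initMDTemp": (1, 1000, 10),
    "$bathMDTemp": (1, 1000, 100),
    "$mdStepTime": (0.0001, 0.002, 0.001),
    "$runMDnstep": (1, None, 1000),
    "$nwtra":      (1, None, 100),
    "$hBond128":   (0.0, 5.0, 2.0),
}

def validate(params):
    """Fill in defaults and return (values, errors)."""
    values, errors = {}, []
    for key, (low, high, default) in RANGES.items():
        value = params.get(key, default)
        if value < low or (high is not None and value > high):
            errors.append("%s=%r is outside the allowed range" % (key, value))
        values[key] = value
    return values, errors

def summarize(values):
    """Return simulated time in ps and the approximate snapshot count."""
    total_ps = values["$runMDnstep"] * values["$mdStepTime"]
    snapshots = values["$runMDnstep"] // values["$nwtra"]
    return total_ps, snapshots
```

Calling `validate({})` returns the defaults with an empty error list, so the same function can be used both to fill in missing values and to reject out-of-range input.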
3. Input File Download
At this step one selects which of the generated files are packed into the archive and offered for download to the client side. By default all files are included. If a certain file is not needed for some reason, one may select “No” to the right of its name, and the file will not be included in the archive. The following files are generated at this step.
- Corrected PDB file – PDB file of a protein uploaded to the server at the first step of this module.
- Control file – file containing preparatory modeling settings selected at the second step of this module.
- Segment definition file – file is created only if certain molecule’s fragments were selected to be optimized.
- Annealing protocol – file is created only if simulated annealing was enabled in the settings.
- Command file – file containing system commands that ensure an execution of a preparatory modeling and a subsequent docking calculation.
The files are packed into a tar.gz archive, a format that most general-purpose archivers can open. The archive must be extracted into the project directory (the same place where the archives of other modules are extracted; there should be no name conflicts).
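For users who prefer a script to an archiver, the extraction can be sketched with Python's standard tarfile module. This is an illustrative helper, not part of GUI-HBDock: the function name is hypothetical, and the refusal to overwrite existing files is a design choice made here to enforce the "no name conflicts" requirement.

```python
import tarfile
from pathlib import Path

def unpack(archive_path, project_dir):
    """Extract a tar.gz archive into project_dir, refusing name conflicts."""
    project = Path(project_dir)
    project.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        members = [m for m in archive.getmembers() if m.isfile()]
        for member in members:
            parts = Path(member.name).parts
            # Reject absolute paths and parent-directory references.
            if Path(member.name).is_absolute() or ".." in parts:
                raise ValueError("unsafe path in archive: " + member.name)
        clashes = [m.name for m in members if (project / m.name).exists()]
        if clashes:
            raise FileExistsError("already present: " + ", ".join(clashes))
        archive.extractall(path=project, members=members)
    return sorted(m.name for m in members)
```

The function returns the sorted list of extracted file names, which makes it easy to confirm that the command file and control file arrived in the project directory.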
1. Ligand PDB file check
The first step of this module is similar to the protein PDB file check step of the first module (see above). However, in this case the PDB file must contain the structure definition of a low-molecular-weight ligand, so some error checks are skipped. After the check is done, one is asked to select the number of hydrogen atoms to add to each «heavy» atom according to its valence.
2. Setting up Parameters
At this step one sets up the parameters of the ligand topology calculation, which is needed to adjust the ligand structure and to create the topology file used in the docking calculation step.
- Ligand name: name of the ligand residue in the PDB file (no more than three characters).
- Add hydrogen atoms: the number of hydrogen atoms specified at the previous step is added to the «heavy» atoms of the ligand. Existing hydrogen atoms are ignored. This option is usually enabled.
- Rename atoms: changes atom names to match the names in the standard library. This option is usually enabled.
- Total charge: a total charge of ligand molecule. Leave 0 here if molecule is not charged (neutral).
3. Input File Download
As in the third step of the first module, one selects which of the generated files are packed into the archive and offered for download to the client side. The following files are generated at this step.
- Corrected PDB file – PDB file of the ligand uploaded to the server at the first step of this module, in which each line describing a «heavy» atom is supplemented with the number of hydrogen atoms to add to that atom.
- Control file – file containing topology calculation settings selected at the second step of this module.
- Command file – file containing system commands that ensure an execution of a topology calculation and a subsequent docking calculation.
1. Setting up File Names
At this step one must specify the original PDB file names of the protein and the ligand that were used in the previous modules. The file names must not contain the program-specific suffixes used in the names of derivative PDB files; the module adds these suffixes automatically. Note that the suffixes appear in the actual file names only if the command files generated by the previous modules are used for the preparatory modeling and the ligand topology calculation. All modules are thus connected with each other, and the command files should be executed sequentially.
In general it does not matter which command file, from the first module or from the second, is executed first, but the docking calculation must be executed last. It is nevertheless recommended to execute the command files in the order in which the modules are numbered.
2. Setting up Parameters
At this step one sets up the parameters of the docking calculation. Most of them are identical to the parameters of the preparatory modeling of a protein; they are described above and are not reviewed again. Only the parameters specific to the docking calculation are described below.
- Docking type: selects the most suitable docking calculation mode. One may choose either blind or single-position docking, as well as the precision (and, with it, the duration) of the calculation: rough, medium, or precise.
- Annealing protocol: selects one of the predefined annealing protocols for docking. The length of the protocol determines the precision (and duration) of the calculation.
- Atomic softness: adjusts the «softness» of the van der Waals potential at small (contact) distances between atoms, from 0 (very soft) to 1 (standard potential). A «soft» van der Waals potential helps to optimize «bad» structures with many spatial interatomic collisions.
3. Input File Download
As in the third step of the first and second modules, one selects which of the generated files are packed into the archive and offered for download to the client side. The following files are generated at this step.
- Control file – file containing docking calculation settings selected at the second step of this module.
- Annealing protocol – one of predefined annealing protocols selected at the second step of this module.
- Command file – file containing system commands that ensure an execution of docking calculation.
Before running a command file generated by GUI-HBDock, one must install HBDock, the actual computing program. To do so, select the «Download HBDock» menu item and click the download link provided. The user's operating system is detected automatically: Windows users are offered an archive with HBDock for Windows, and Linux users are offered the Linux version of the program. Setup instructions for the detected operating system are also shown.
On Windows, the setup process is as follows:
- Download the program distribution package.
- Install it into some directory, e.g., C:\HBDock.
- Add this path to the PATH environment variable, and create another environment variable, MDYN09HOMED, set to the same directory followed by the «dat» subdirectory («\dat»). This can be done from the Control Panel («System» > «Advanced» > «Environment Variables»).
On Linux, do the following:
- Download the program distribution package.
- Install it into some directory, e.g., /home/yourname/HBDock.
- Add this path to the PATH environment variable, and create another environment variable, MDYN09HOMED, set to the same directory followed by the «dat» subdirectory («/dat»). This can be done by editing the «.bash_profile» file (if you use the Bash command processor). The file may look like this:
#!/bin/sh
MDYN09HOMED=/home/yourname/HBDock/dat
PATH=$PATH:/home/yourname/HBDock
# Export both variables so that programs started from the shell see them
export MDYN09HOMED PATH
- Bash reads the «.bash_profile» file at login, so the file does not have to be executable. After making changes, restart your command processor (or start a new login session) for them to take effect.
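After setup on either operating system, the two environment variables can be verified with a short Python script such as the sketch below. The function name is hypothetical and the check is not part of HBDock; it only compares PATH and MDYN09HOMED with the installation directory chosen above.

```python
import os

def check_hbdock_env(install_dir, env=None):
    """Return a list of problems with PATH and MDYN09HOMED for install_dir."""
    env = os.environ if env is None else env
    problems = []
    install = os.path.normpath(install_dir)
    path_dirs = [
        os.path.normpath(p)
        for p in env.get("PATH", "").split(os.pathsep)
        if p
    ]
    if install not in path_dirs:
        problems.append("PATH does not contain " + install)
    # MDYN09HOMED must point at the "dat" subdirectory of the installation.
    expected = os.path.join(install, "dat")
    actual = env.get("MDYN09HOMED")
    if actual is None:
        problems.append("MDYN09HOMED is not set")
    elif os.path.normpath(actual) != expected:
        problems.append("MDYN09HOMED should be " + expected)
    return problems
```

An empty list means both variables look consistent; for example, `check_hbdock_env("/home/yourname/HBDock")` inspects the current environment on Linux.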
In the «Links» section the administrator may publish links to any web pages that may be useful for GUI-HBDock users. Note that these links are not directly related to the program, and the program's developers take no responsibility for the content of the pages referenced by these external links.