Tinkergpu:Tinker-tut

From tinkergpu
Jump to: navigation, search

Purposes

This page provide tutorials on using AMOEBA force field via Tinker (CPU) and Tinker-OpenMM (GPU) programs.

 

 

Download & Install

Tinker GPU (OpenMM)

See Tinkergpu:Openmm-install for instructions

For a quick guide, see Dr. CW Liu's instrcution http://leucinw.com/posts/2020/03/Tinker-OpenMM/

 

Tinker CPU

For Tinker CPU code, download from github: https://github.com/TinkerTools  (git clone recommended)

Then follow the compilation instructions:

https://biomol.bme.utexas.edu/tinkergpu/index.php?title=Poltype:Poltype#Notes_about_compiling_tinker_8_with_OpenMP 

You may also compile Tinker for Windows Subsystem for Linux:

Tinkergpu:WSLinstall

Tinker with APBS

Notes created by Rae Corrigan (2020)

https://docs.google.com/document/d/1c2r4DJq20rXc3n15jS-KSh6P6Z9Uz3LT642kB8EG5Ag/edit?usp=sharing 

Tutorials

Some simple tutorials about Tinker I used in my teaching below. Note you can directly download tinker executables from https://dasher.wustl.edu/tinker/ (Windows, mac. linux) for the following tutorials.

You can either use command lines in Windows CMD window or Linux terminals, or use FFX interface for some exercises. Actually applications always use command lines to operate in Linux OS. If you are not familiar with Linux, this is the oppurtunity to practice.

 https://biomol.bme.utexas.edu/~pren/courses/tinker-tut/

Below you will find more detailed instructions and discussions.

Input files

You can find examples of input files in tinker distribution under tinker/bench, tinker/example or tinker/test. Two files are required to run TINKER calculations: *.xyz and *.key For example, open butane.xyz and butane.key in tinker/example/ (available in all official distributions) to see what’s inside.

*.xyz file

  • The first number on the 1st line is how many atoms total.
  • There may be a second line that specify the box dimensions if the system is periodic (newer format since tinker 6)
  • The first column is the atomic index
  • The second column is the atomic symbol
  • The 3 -5th columns are x,y,z coordinates in Angstrom
  • The 6th is the “atom type” defined in the *.key file. This is the index tinker uses to assign parameters from the key/parameter file.
  • The 7th – last columns are lists of atoms that are connected to the current atom

 

 

 

 

*.key file

The key file may have all the actual parameters or a link to the actual parameters file specified in the first line. The parameters specify the bond, angle, torsion, vdW and electrostatic interactions between atoms based on the “atom type”. If you see an error related to OMP, please set the OPENMP-THREADS in the key file to a number less than the # of CPU cores on your computer. This sets how many CPU cores are used in the parallel execution.

For example, in protein.key below, borrowed from tinker/bench/bench7.key, the first line specific the actual parameters are contain in the amoebapro13.prm. More examples can be found in tinker/bench, tinker/example, or tinker/test.

 

parameters           $TINKERDIR/params/amoebapro13.prm
# 
# the amoebapro13.prm is the AMOEBA protein force field in tinker/params/. If you have a ligand you can add the parameters below. More info below
#
# verbose                                  # printing info for every step, for debugging mostly
#
#randomseed            123456789
integrator            respa                #this multi-time step integrator allows TINKER to use 2 fs time step
#HEAVY-HYDROGEN                    #This will increase H atom mass automatically so 3 or 3.5 fs time step can be used.                          
                                   #Kinetics will  be affected               
neighbor-list                              # this below requires your box is twice the cutoff plus 2-3 Ang.
                                          # If your box is too small for vdw cutoff but OK for Ewald, you can use "mpole-list" here.
#openmp-threads    16                       # how many core you want to use on the node.
#
#  Define the Periodic Box and Cutoffs
#
a-axis                62.23
vdw-cutoff            12.0
vdw-correction
#
#  Set Parameters for Ewald Summation
#
ewald
ewald-cutoff          7.0                  # we use such small cutoff because dipole/quadrupole die off faster than point charge.
# pme-grid  64 64 64                         #If speed is a concern, set this mannually to be slightly bigger than simulation box size. E.g. use 64 is box is 62.23 (1.2x is the best but slower). 
                                           Must be even with factors of only 2, 3 and 5. see source/kewald.f for a list. Sometimes the default grid size is conservative 
                                  

#fft-package           FFTW
#
#  Set Parameters for Induced Dipole Convergence
#
polarization OPT3                         #OPT4 is mre accurate but OPT3 is faster
#polar-eps             0.0001                # the induced dipole convergence threshold
#polar-predict
#
# Example of overwriting the parameters in the .prm file. The bond parameters between 
# atom classes 1 and 4 are redefined below, which
# will overwrite those already in the amoebapro13.prm. 
#this is only for illustration purpose
# bond          1    4          200.00     1.1

How to generate input files of your own

proteins, nucleic acids, common organics

In tinker/params, you can find pre-existing parameters for certain molecular systems, both AMOEBA, amber, charmm, opls, mmff, and mm2/3.

 

PDB to xyz

If you have a pdb file you can convert it tinker xyz file by specifying a prm file above. It will remove the heteroatoms such as ligand, which you can use POLTYPE to generate parameters for.

First you need to determine the protonation state of charged residue. Tools like propka can do this quikly: http://nbcr-222.ucsd.edu/pdb2pqr_2.0.0/

Once you decided the protonation state, make sure the residue name in PDB file match the protonation state (list below). For example, ASH is the neutral form of ASP. Then you can run "pdbxyz.x"

'GLY' 'ALA' 'VAL' 'LEU' 'ILE' 'SER' 'THR' 'CYS' 'CYX' 'CYD' 'PRO' 'PHE' 'TYR' 'TYD' 'TRP' 'HIS' 'HID' 'HIE' 'ASP' 'ASH' 'ASN' 'GLU' 'GLH' 'GLN' 'MET' 'LYS' 'LYD' 'ARG' 'ORN' 'AIB' 'PCA' 'UNK'

UNK means unownk; AIB, ORN and PCA  are modified AA.

https://www.ebi.ac.uk/pdbe/entry/pdb/3q9g/modified/ORN 

http://www.ebi.ac.uk/pdbe/entry/pdb/4ua7/modified/PCA

Terminal residue names:

COH ACE NH2 NME  FOR

HETATM (residue name) recoginzed by pdbxyz:

HOH K CA MG NA CL

AMOEBA library

For AMOEBA, please use amoeba09.prm for common small molecules, amoebapro13.prm for proteins. Nucleic acid parameters coming soon (end of 2017). More information here: http://biomol.bme.utexas.edu/tinker-openmm/index.php/TINKER-OPENMM:Amoeba

If you have a protein, you use "pdbxyz" to convert it to tinker xyz file. It will ask you for a key file (1bty.key below) or you can create a key file with "parameters $TINKERDIR/params/amoebapro13.prm" in it:

pdbxyz 1bty.pdb -k ./1bty.key

AMOEBA for a new ligand

Note that pdbxyz recognize proteins, water (res name HOH and some ions). The ligand (benzamidine above) is stripped. For that you need to derive your own parameters. For AMOEBA this can be done using POLTYPE: https://biomol.bme.utexas.edu/wiki/index.php/Research_ex:Poltype

Input: ligand.sdf or ligand.pdb (for example, you can truncate the ligand "BEN" out of the 1bty.pdb) 
Output: will produce the xyz and corresponding key files (ttt.xyz and ttt.key):

If you will merge the ligand xyz file with another molecule, make sure you set the suitable range for atom types (an option for pOLTYPE) so that they won't overlap. See the "Check the results section" on POLTYPE website before using them. Use "analyze.x" to make sure the xyz and key files work correctly: "analyze.x ttt.xyz -k ttt.key ep"

 

 

build a solvent box

arbirary solvent

Use xyzedit.x to build a water (or any solvent) box starting from a water monomer (e.g. tinker/test/water.xyz). You can specify many monomers to add and how big the box is. Type xyzedit.x at the command line and you will be asked to enter relevant inputs (or you may type all the parameters in one line).

Make sure edit the key file to add box size and Ewald related keywords (see http://biomol.bme.utexas.edu/tinker-openmm/index.php/TINKER-OPENMM:Tinker-tut#.2A.key_file)

 

prebuilt water box

Some prebuilt waterboxs: http://biomol.bme.utexas.edu/~pren/downloads/waterbox

Larger boxes can be created as supercells of smaller boxes, e.g. this following command will create a box 64x larger than the orignial box.

#!/bin/bash
echo -e "5\n4\n\n" | crystal.x watersmall.xyz -k tinker.key

Combining two xyz files

Tinker "xyzedit.x" program (option 20) can be used to combine two xyz files into one (matching key file for each xyz is required). You can merge the key or parameter files by appending one to the other, but please make sure the atom types are not overlapping between the two.

For the above 1BTY example, poltype will translate/rotate the ligand in standard orientation. You may need a script to copy the coordinates from ligand.pdb into ttt.xyz (hwich you get out of the POLTYPE run).

Then you can use xyzedit.x to combine 1bty.xyz and lig.xyz into one xyz file with the two molecules orient/position as in PDB. Merge the key file by appending ttt.key (except the first line which contains header already in amoebapro13.prm of 1bty.key) to 1bty.key; again avoid overlapping atom types between the ligand and protein.

 

 

 

 

Soaking solute in solvent

To soak a molecule/complex in a box water, you can use this utility: http://www.ime.unicamp.br/~martinez/packmol/home.shtml

Alternatively, tinker "xyzedit.x" program (option 20) can be used for this as well. You just need a molecule.xyz and waterbox.xyz along with matching key files (actual names do not matter). One can make a water box of different size using this program too (starting from one water or a cluster of water and use option 19). Option 21 can add ions to solvent box.

After construct the simulations, relax the system using minimize and dynamics, with proper restraints and heating. See Tinkergpu:Tinker-tut#MD setup for details

Modified PRO & NA residues

Build a structure with modified residues

Modify residues in a PDB file.

Input: PDB files of the biopolymer, an original residue and a modified residue. The original residue must be present in the biopolymer, and the modified residue must share at least 3 atoms with the original residue with identical coordinates. The modified residue is usually obtained by manually modifying the original residue.

Output: A PDB file with specified residues modified to the new residue. link

# change residues 2 and 14 in dna15.pdb from res_A to res_pA
morphling.py -i dna15.pdb -o pdna15.pdb -t0 res_A.pdb -t1 res_pA.pdb -n 2,14


How to run Tinker9 using GPU

To run tinker9 is almost the same as to run the canonical tinker. After compilation, an executable called tinker9 exists in the build directory. Here is the help information if you directly execute ./tinker9

 SYNOPSIS
       tinker9 PROGRAM [ args... ]

 PROGRAMS AVAILABLE
       analyze
       bar
       dynamic
       info
       minimize
       testgrad
       help

So it is just adding tinker9 in front of the canonical tinker run commands.

Here are the compiled executables ready to use on our clusters.

/home/liuchw/Documents/Github.leucinw/Tinker9/build/tinker9 #this one runs on 3070/3080 cards
/home/liuchw/Documents/Github.leucinw/Tinker9/build_2/tinker9 #this one runs on the rest of the cards (not all of the cards have been tested)

!!! Note: as of today (Feb 22, 2021), tinker9 has to be compiled on an older machine (CPU) in order to generally run on newer machines. An alternative is to build an executable on each machine. 

About the compilation of Tinker9, please refer to the successful build on the GitHub site (https://github.com/TinkerTools/tinker9/discussions/121)

 

How to run Tinker and tinker-openmm (GPU)

Command line

Tinker programs can be run interactively, which is the best way to learn what are the required inputs. Tinker programs can also run in background with all parameters specified, for the purpose of automation:

analyze.x ttt.xyz -k ttt.key ep
dynamic.x bench7 100000 3.0 6.0 2 298.0 N > ben7.log &
# the bench7.xyz above can be found in tinker official distribution inside tinker/bench/
#2-fs time step for MD here is ok because of the "integrator respa" in the key file

Or for tinker-openmm:

 export CUDA_DEVICE_ORDER=PCI_BUS_ID
 export CUDA_VISIBLE_DEVICES=0 # device number, 0, 1...is the 1st, 2nd...
 #source cuda, tinker-openmm, gcc related env var (see tinker-openmm install)
 nohup dynamic_omm.x bench7 100000 3.0 6.0 2 298.0 N > ben7.log &

Pause and resume MD

Tinker creates .dyn file that contains coordinates and velocities needed for restart MD. To stop MD, simply create a .end file (e.g. touch myrun.end) in the folder where MD is running. At the next time MD frame was written, the end file will signal Tinker to stop. To resume later simply rerun dynamics with the presence of the .dyn file. Note the output file of dynamics does not resume the count of MD steps/frames.

Additional notes for "tinker-openmm"

  • It s recommended to use the respa inetgrator and 2-fs time step
  • "heavy-hydrogen" in key file allows a 3-fs time step
  • Default and only thermostat is Anderson
  • Only MC barostat is available for now. We are adding virial/Langevin piston pressure to openmm.

How to specify different ensembles for MD simulation

For TINKER-openmm

Not many options available. 

For NVT, use Bussi for thermostat, RESPA integrator (2fs)

For NPT, use Montecarlo for Barostat, Bussi thermostat. Verlet (1 fs) is safer than RESPA for use with MC barostat if very accurate density is needed.

See below for keyword syntax.

Available thermostat and barostat in TINKER (2/4/2016 Ponder)

Thermostats:

METHOD                     KEYWORD (lines you add to .key file)
Bussi-Parrinello           thermostat bussi
Berendsen                  thermostat berendsen
Andersen Stochastic        thermostat andersen
Nose-Hoover                thermostat nose-hoover

There are only two barostats available via the “barostat” keyword:

METHOD                     KEYWORD
Berendsen                  barostat berendsen
Monte-Carlo                barostat montecarlo

The above thermostat and barostats are available for the Verlet, Beeman and RESPA integrators, and can be used in any combination with those integrators. These integrators are available via:

METHOD                     KEYWORD
Verlet                     integrator verlet
Beeman                     integrator beeman
RESPA                      integrator respa

Note that the defaults are Bussi for thermostat, Berendsen for barostat, and Beeman for integrator.

Then there are two special integrators, a stochastic one, and a Nose-Hoover that does NPT. The stochastic integrator uses a kind of Langevin temperature bath for thermostating, and does listen to the barostat keyword. The Nose-Hoover integrator uses a separate code branch and does only NPT with Nose-Hoover methods following Martyna-Tuckerman-Klein. You can get these via:

METHOD                      KEYWORD
Stochastic                   integrator stochastic (no need for other T control)
Nose-Hoover NPT             integrator nose-hoover (no need other keywords for T or P; starting structures need to reasonable)

Recommended keywords (add somewhere in the .key file) for NVT

integrator respa
thermostat bussi

Command:

dynamic xxx.xyz 100000 2.0 1.0 2 298  (100000 steps, 2.0 fs time step, dump structure every 1.0 ps, option 2 is NVT, 298 is the target T)

It is also possible to combine "integrator Beeman" or "thermostat Berendsen” or "thermostat Andersen". But RESPA allows large time steps (2.0 or 2.5 fs) than Beeman. Berendsen thermostat does not provide canonical ensemble fluctuation.

 

Recommended keywords for NPT

Integrator nose-hoover # this will trigger also nose-hoover P and T control

Note with Nose-Hoover you can not use a time step bigger than 1. 0.5fs is the most stable

Good alternative: Berendsen produces correct average but not "correct" ensemble fluctuation (see 2003 water paper)

integrator stochastic
thermostat bussi
barostat berendsen

Command:

dynamic xxx.xyz 100000 2.0 1.0 4 298 1.0 (option 4 is NPT, 298 is T and the last 1.0 is the target pressure 1atm)

Barostat Berendsen is available and useful for equilibrate system but again not recommended for production since the ensemble fluctuation is incorrect.

NVE

integrator respa

command:

dynamic xxx.xyz 100000 2.0 1.0 1

No need for thermostat or barostat of course. It is best to use smaller time step such as 1.0fs to conserve energy better. May even use smaller polar-eps (10^-6) than default (10^-5) in the key file.

 

 

Non periodic system, e.g. gas molecules not in a box

integrate stochastic

If there is no box (a-axis) in the key file, or no box dimensions in the den file, the system is non-periodic. This will set the stochastic temperature control along with the stochastic MD integration. For gas phase molecular cluster (very few atoms), the recommended time step is 0.1 fs.

 

Free energy calculations

MD setup-initial equilibration

  •  Determine protonation state of ionizable groups (ASP, GLU, LYS, ARG, HIS). Use propka here http://nbcr-222.ucsd.edu/pdb2pqr_2.0.0/ for proteins. For ligand, you may use pka predicction tool from Chemaxon.
  • Add water and counter ions (neutralize system) so that the density is ~1.0 g/cc. The distance between protein and box wall should be 10-15 A. Check/remove extra water in the binding pocket as necessary (keep the crystal water molecules in the pocket)
  • In the key file, set PME-grid to be 1.2x box size in Ang. For example, if box size is 55, “pme-grid 64 64 64” is enough. The default of Tinker is usually more conservative. See kewald.f for allowed grid values. Add “neighbor-list”, “polar-eps 0.0001”, “vdw-cutoff 12”, "vdw-correction" “integrator respa”, “ewald”, “ewald-cutoff 7.0”  to .key file.
  • Minimize the box before MD. If you see errors related to polarization (induced dipole not converge), do this in two steps: first minimize with electrostatic (multipoleterm NONE) and polarization turned off (polarizeterm NONE in .key file), to ~5.0 or lower; then minimize again with ele then ele+polarization back on to ~2.0 or lower. You may use position-restraints if you don't want your solute to undergo dramtic changes. (RESTRAIN-POSITION -1 200 50.0, means restrain atoms 1 to 200 using a force constant of K=50 kcal/mol). 
  • MD relaxation. You may use GPU for MD now (dynamic_omm.x). Use repsa integrator and 2fs time step. Add position-restraints to restrain protein & ligands in the key file initially; For metal ion-protein binding or weak binding ligand, we also suggest to use ~3 distance restraints between ion/ligand and first shell atoms. This is to prevent water disrupt the initial solute structure during the equlibration. 
    • Use 2 fs for all MD below. If you want to use 3fs (e.g for large systems), you need to "heavy-hydrogen" in the key file. Another way to spede up is to use the OPTx: "polarization OPT4" or "polarization OPT3" OPT3 is faster but bigger error.
    • With the positional/distance restraints, run ~2ns  NVT MD to gradually (e.g. exponentially) heat up the system from 10K  to 298K or whatever expt T should be. Water/ions are relaxed after this step.
    • Run NVT at 298K or expt T for ~2ns while gradually turning off all the position and distance restraints on protein-ligands.
    • Run 1.5 ns NPT to compute the average density/box size (ignore the first 300ps). For GPU, only "barostat MonteCarlo" is available. 
    • NVT MD for ~2ns with box fixed at the average lengths from above.
    • Check protein RMSD (esp. around binding pocket) from crystal structure after every step above. If any step gives large RMSD, redo that (and previous step) with longer/slower MD to correct the problems.
  • MD production run. For alchemical free energy, this involves setting the ligand group and various lambda values for ele and vdw to scale the interactions between ligand and surrounding (see below BAR section).

 

BAR

Alchemical free energy calculations are available in TINKER and TINKER-OpenMM. One needs to specify the ligand or solute in the key file (example below). The lambda scaling schedule can be specified by user, automated by the "bar.x" in TINKER. See this reference for examples: J Comput Chem. 2017 Sep 5;38(23):2047-2055

For host-guest, one can also apply a bond restraint between host and guest is graduately turned on when lambda goes from 1 to 0. A correction is needed for this restraint and standard volume.Tinker/utiity/freefix.f can be used to caclulae the correction.

*The correction for lambda=0 from RTln(C0*V), where C0=1/1660 A^3 and V=integrate{4*pi*r^2*exp[-k*(r-r0)^2/RT]dr}. k=15 kcal/mol in the example below. If you are using a harmonic restraint (example below) and the equilibrium r0 is not 0, numerical integration is necessary. A good reference is JACS v126, NO. 24, 2004.

If you use distance restraint when lambda =1 for both ele and vdw, you can remove the effect of restraint from FEP or BAR; or you can avoid this correction if the restraint strength is set to 0 when the lambda =1 (gradually turned on when lambda-> 0 for both vdw and ele).

ligand                  -1 24, 26    #negative means range: atoms 1 to 24 plus 26
vdw-lambda              1.00
ele-lambda              0.50
GROUP 1 -1 24
GROUP 2 -25 150
RESTRAIN-GROUPS  1  2  15.0  0 2.0
#lower bound & upper bound do not have to be the same

Typical lambda schedule (ele, vdw): (1.0, 1.0), (0.9 1.0)....(0, 1.0), (0, 0.9)....(0, 0)

Each set correspond to one separate MD simulations of 5-10ns or longer. Save MD frames at every 3-5 ps. Use bar.x (bar_omm.x for GPU) to analyze the free energy dG between the neighboring lambda values.

Some scripts fron CW: https://github.com/leucinw/ComputTools/tree/master/bardemo 

hydration and binding free energy examples

Please unzip the files and see README for instructions.

OSRW

Only implemented in tinker CPU. GPU version is under development.

Visualization

Force Field Explorer

By Mike Schneider and Jay Ponder, http://dasher.wustl.edu Can visualize the xyz and arc (MD trajectory) files; create and start TINKER calcualtions

VMD

Choose TINKER format when open a xyz file. Trajectory file (.arc) also works.

Pymol

Sometimes the xyz file can not be displayed correctly

Other resources

https://sites.google.com/site/biomolsimstw/workshop-materials/tutorials

Please email me if you have tutorials related to AMOEBA or Tinker you would like to share.

 

 

 

AMOEBA Force Field Papers:

AMEOBA FF:

  1. water model: Ren, P. Y.; Ponder, J. W., Polarizable atomic multipole water model for molecular mechanics simulation. Journal of Physical Chemistry B 2003, 107 (24), 5933-5947.
  2. Small molecules: Ren, P.; Wu, C.; Ponder, J. W., Polarizable Atomic Multipole-based Molecular Mechanics for Organic Molecules. J Chem Theory Comput 2011, 7 (10), 3143-3161.
  3. Proteins: Shi, Y.; Xia, Z.; Zhang, J.; Best, R.; Wu, C.; Ponder, J. W.; Ren, P., The Polarizable Atomic Multipole-based AMOEBA Force Field for Proteins. J Chem Theory Comput 2013, 9 (9), 4046-4063.
  4. DMP/TMP/Base & nucleic acids: 
  •      Zhang, C.; Lu, C.; Wang, Q.; Ponder, J. W.; Ren, P., Polarizable Multipole-Based Force Field for Dimethyl and Trimethyl Phosphate. J Chem Theory Comput 2015, 11 (11), 5326-39.
  •     Zhang, C.; Bell, D.; Harger, M.; Ren, P., Polarizable Multipole-Based Force Field for Aromatic Molecules and Nucleobases. J Chem Theory Comput 2017, 13 (2), 666-678.
  •     Zhang, C.; Lu, C.; Jing, Z.; Wu, C.; Piquemal, J. P.; Ponder, J. W.; Ren, P., AMOEBA Polarizable Atomic Multipole Force Field for Nucleic Acids. J Chem Theory Comput 2018, 14 (4), 2084-2108.


AMOEBA+ model (water only, new model)

  1. Liu, C.; Piquemal, J. P.; Ren, P., AMOEBA+ Classical Potential for Modeling Molecular Interactions. J Chem Theory Comput 2019.
  2. Liu, C.; Piquemal, J. P.; Ren, P., Implementation of Geometry-Dependent Charge Flux into the Polarizable AMOEBA+ Potential. J Phys Chem Lett 2019, 11, 419-426.

More Explicit Free Energy Calc Steps - Brandon Walker

Old/obsolete description here: Obsolete description here