Research ex:Poltype

From renlab
Jump to: navigation, search

Readme First

POLTYPE site has been moved to (as of 2019) 

  • This typer uses QM calculations to obtain electrostatic multipoles and some valence parameters (bond length etc), and a database to assign some valence parameters (force constants) and non-bonded parameters such as vdW and polarizability. While the electrostatic parameters follow the AMOEBA protocol, valence parameters should also be reasonable, the vdW and torsion parameters are only estimates. Should you need accurate parameters for simulations etc please pay attention to the torison and vdW parms.
  • For large molecules (e.g. >100 atoms), one may need to split the molecule into small model compounds, derive parameters and merge them. When manually merging multipoles at boundary, it is recommended to use POTENTIAL program in Tinker to re-optimize these boundary atomic multipoles. Automation of this progress is under way.
  • Check the log file to make sure the QM electrostatic potential, molecular dipole and quadrupole moments are well reproduced by atomic multipoles. Typically molecular moments should be automatically reproduced by atomic multipoles unless for some highly symmetric molecules for which the program may pick wrong local frames and force certain multipole components to 0. This can be remedied by running poledit.x of Tinker and change local frame definitions for specific atoms and/or adding keywords such as "target-dipole x y z" to tinker.key file before re-running ESP fitting.
  • For high accuracy, further QM should be used to adjust valence force constants (valence program in TINKER) and torsion parameters; and liquid simulations (ForceBalance) to refine vdw parameters, especially if the functional group has not been covered by amoeba09.prm or amoebapro13.prm.
  • Example (recommended) usage (the last option turns on rigorous torsion fitting): -s *.sdf -n 8 -m 6000MB -M 120GB --do-tor-qm-opt --espbasisset=aug-cc-pvtz
  • For alchol, the quadrupole on O and H should be mannually scaled by 0.6. This only applies to OH that connect to sp3 Carbon. Similarly for NH in amine (that connects to a sp3 C), scale the Q by 0.75 or 75%. See JCC 2011 32(5):967-77.
  • The default QM basis set for ESP fitting is MP2/6-311++G2d,2p. To use aug-cc-pvtz, add option --espbasisset=aug-cc-pvtz and use, where the ESP fitting convergence is changed from 0.5 in to 0.1:
 ...." + qmesp2fname + " N  0.5 "
See the example command above.
  • Halogen parameters:
    • Mono Cl vdw parameters updated according to JPCB 2014, 118 (24), pp 6456–6465.
    • Can NOT detect multiple Cl and assign the corresponding parameters yet. Please annually update vdw parameters according to the paper.
    • F, Br, I parameters are merely educated guesses, not yet carefully parameterized & validated like Cl.
    • F vdw parameters for fluorobenzene should be manually changed to 3.2020 0.2500. For details see /users/pren/CCP/FLUORO/
    • Cl vdw parameters for o-dicholorobenzene should be manually changed to 3.860 0.446.
  • To use with amber, you need tinker v4.3 format by using the proper option (see below). For details read here
For instruction about using AMBER,


Wu, J.C., G. Chattree, and P. Ren, Automation of AMOEBA polarizable force field parameterization for small molecules. Theoretical chemistry accounts, 2012. 131(3): p. 1138.

Check your results

While we continue to improve POLTYPE, currently there are "tough" molecules for POLYTYPE to handle. Lots of information are printed in the output files for you to check (automation is under way)

  • and ttt.key are the resulting structure and parameter files you will need.
    • Check log file to make sure all calculations actually complete successfully!
    • If you have a charged molecule, read a section below on how to make sure the correct charge is reflected in your input structure.
    • For molecules of high symmetry, please check frame definitions and alter them as needed using POLEDIT and POTENTIAL command in Tinker to change frames/combine multipoles and refit to ESP. See here for common frames AMOEBA uses:
    • The molecular dipole moments should match between QM and MM
    • The ESP fitting RMSE should be small
    • Check the torsion plots in qm-torsion folder.
    • if you are concerned with vdw parameters, you may check again QM dimer (your_molecule-water) structure and energy.
  • obsolete:
    • This is updated in 1.1.4, the output is now consistent with amoeba09.prm and amoeba2013.prm. The following is no longer necessary.
The previous output tinker files are in tinker v6 (old) format - please watch out opbendunit headers (opbendunit  0.02191418)
when mixing with build-in parameters: the opbend force constants need to be scaled up by 71.94 before combining with amoebapro13.prm which 
does NOT use "opbendunit  0.02191418" in the header

Programming Documentation

Download and Installation

Current version (1.1.4) (works with tinker 6.2. ) Updated July 2015 Poltye from github

Previous version (1.1.2) (works with tinker 6.0, not 6.1/6.2): Poltype

Quik notes about compiling tinker 6.x with openMP: 

Use intel ifort v12 or above (set this up in .bashrc or .cshrc)
For csh ass this line is .cshrc and source it : source /opt/intel/composer_xe_2013.2.146/bin/compilervars.csh intel64
compile fftw by following 0README in tinker/fftw/ (I replace gcc and gfortran with icc and ifort)
cd tinker/source
 For latest TINKEr 7.x, once you set F77 to be ifort, you can run "make -j 6 all"
 If the makefile doesn't work for you, use compile.make, library.make, link.make
 For tinker 6.x, you may need edit Makefile (from tinker/make/) for your OS and compiler (see options below)
 you may also need to update the dependency at the bottom of makefile by running depend.make in source
 make -j 6 all

Use these options in Makefile

F77 = ifort
LIBS = -L$(TINKERDIR)/fftw/lib -lfftw3_omp -lfftw3
F77FLAGS = -c -assume cc_omp   
OPTFLAGS = -O3 -mia32 -debug -no-ipo -no-prec-div -openmp -traceback 
LIBFLAGS = -crusv
LINKFLAGS = $(OPTFLAGS) -static-intel

This works too:

 F77 = ifort
LIBS = -L$(TINKERDIR)/fftw/lib -lfftw3_omp -lfftw3
F77FLAGS = -c -assume cc_omp  
OPTFLAGS = -O3 -no-ipo -no-prec-div -openmp -traceback 
LIBFLAGS = -crusv
LINKFLAGS = $(OPTFLAGS) -static-intel


Previous version (1.1.1) (works with tinker 5) Poltype
</br> Previous version (1.1.0) (works with tinker 5) Poltype

Babel/Python binding installation

$ tar zxf openbabel-2.3.0.tar.gz   # (this creates openbabel-2.3.0)
$ mkdir build
$ cd build
$ cmake ../openbabel-2.3.0 -DPYTHON_BINDINGS=ON -DCMAKE_INSTALL_PREFIX=/opt/openbabel-2.3.0
$ make
# make install

Environment variables

The following environment variables (with the appropriate paths) need to be exported. They may be saved in your ~/.bashrc or a ~/.cshrc appropriate format may be used:

cshrc example:

set path = ( $path /opt/software/CHEM/openbabel-2.3.0/bin)
setenv PYTHONPATH /opt/software/CHEM/openbabel-2.3.0/lib:/opt/software/CHEM/openbabel-2.3.0/lib64/python2.6/site-packages
setenv GDMADIR /opt/gdma-2.2/bin
setenv LD_LIBRARY_PATH /opt/software/CHEM/openbabel-2.3.0/lib
setenv OPENBABEL_INSTALL /opt/software/CHEM/openbabel-2.3.0
setenv TINKERDIR=/users/wuch/tinker/source
plus those for Gaussian...


$ export PYTHONPATH=/opt/openbabel-2.3.0/lib/python2.6/site-packages:$PYTHONPATH
$ export GDMADIR=/opt/gdma2.2/bin
$ export LD_LIBRARY_PATH=/opt/openbabel-2.3.0/lib:$LD_LIBRARY_PATH
$ export TINKERDIR=/users/wuch/tinker/source

You may need to add intel compiler/lib to your path to run tinker commands. You can try a TINKER command first to make sure it can run.



-s, -n, -M, -m are all required flags to run the program

-s,--structure  -- Initial structure (i.e. SDF, MOL2, PDB, etc.) (required)
-n,--numproc   -- Number of processors to use for QM calculations (default: 1)
-m,--maxmem -- Maximum amount of memory to use for QM calculations (default: 700MB)
-M,--maxdisk   -- Maximum amount of diskspace to use for QM calculations (300GB or larger recommended for molecules with ~30,40 heavy atoms) (default 100GB)
-a,--atmidx      -- Starting index of atom type (default: 401)
--omit-torsion -- Skips QM torsion scan calculations around rotatable bonds
--omit-torsion2, and a file *.toromit which lists the specific torsion scans you would like to skip
--do-tor-qm-opt, during torsion scanning, use qm to optimize the molecule at each angle (by default this is off, TINKER optimization is used instead)
--tinker4format -- to output tinker v4.3 format (use with AMBER). Default output is tinker5format. Note this is no longer necessary as the latest analyze.f modification allow you to use the v6 tinker exe and parameters with amber. See Readme first above.
-h,--help          -- displays this help message

Additional options to select basis sets in QM calculations

--optbasisset= (Default now is MP2/6-31G*. Used to be HF/6-31G*; for torsion scan, HF is still used for geom opt and MP2/6-311++G** for sp energy) 
--dmabasisset= (6-311G** is default)
--espbasisset= (default MP2/6-311++G(2d,2p); aug-cc-pvtz is recommended for small molecules - see README)
  • Example -s phenol.sdf -n 8 -m 7500MB -M 300GB
  • The scratch directory used for QM calculations can be specified by exporting the GAUSS_SCRDIR environment variable.
  • Verify your molecular structure, protonation state, multiplicity here:

Poltype Usage Example

1) Log into venus (default)

2) type "ssh nova"

3) look for node with available resources, type "ssh node40", for example

4) Check memory usage, "df -h" tells you scratch space, just look for the row that says /scratch and look at third column. "free -g" tells you RAM available, look for row with buffers/cache and look under column with free on top. "nproc" is number of processors,  "top" is like a process list that also tells you CPU% usage, (100%)=1 CPU being used fully, 200%=2CPUs being used fully etc... Can also use , just click NOVA from drop down menu

5) type "source yourbashrc" , this tells bash where gaussian and tinker and babel is located

6) Navigate to directory with the drug molecule .sdf file

If dont have sdf:
babel -ipdb test.pdb -osdf -O test.sdf

If want to add H:
babel -isdf test.sdf -p 7.0 -osdf -O test1.sdf

7) open up the script with the moba text editor

8) choose something reasonable (enough memory) to have in shell script such as
"python ~/Poltype/ -s "DRUGNAME.sdf" -n 4 -m 15GB -M 50GB --do-tor-qm-opt"

Please read , poltype documentation
9) May need to type "chmod +x" to give yourself permission to execute

10) Since you may disconnect from the network (remote login) when you are not on it constantly, we need to run this shell script in the background as a subprocess. type "nohup ./ &". nohup means run
nohup is a POSIX command to ignore the HUP (hangup) signal. The HUP signal is, by convention, the way a terminal warns dependent processes of logout.

Output that would normally go to the terminal goes to a file called nohup.out if it has not already been redirected.

./ means execute

& just puts process in background

We need nohup so you can look at output from command line, output will be saved in nohup.out

11) Open another tab, ssh into your chosen node, navigate to directory with drug molecule

12) Check for errors, type "tail nohup.out", or just view nohup.out with your text editor

You can also type "ls -ltr", this is like normal list but also includes files in order last modified

SDF File Definition

  • Note that you don't need to have "correct" charges for the atom. The program will produce all the charges and dipoles etc for you. However, we need correct total molecular charge (neutral, +1, -1...) for Gaussain job. You can do so by assigning the total molecular charge to the first atom.
  • Note that the multiplicity is assumed to be "1". If you need something else, see the "RAD" option below.
  • Specify Charge
    • 1 = +3, 2 = +2, 3 = +1, 5=-1, 6=-2
    • 4 = doublet radical, 5 = -1, 6 = -2, 7 = -3
    • 0 or other value = uncharged
    • Example of N+
   -0.0007   -0.0000   -0.0000 N   0  3  0  0  0  0  0  0  0  0  0  0
xxxxx.xxxxyyyyy.yyyyzzzzz.zzzz aaaddcccssshhhbbbvvvHHHrrriiimmmnnneee
  • Alternative charge specification
    • After the bond block
    • Following example means there is 1 atom with a formal charge: atom 2 has a +1 charge
M  CHG  1   2   1
  • Multiplicity specification
    • After the bond block
    • Following example means there is 1 atom with a radical: atom 4 has a radical of 1
M  RAD  1   4   1

Bond Order

  • Bond orders are determined from the bond block of the SDF file. If atoms are in a ring, aromaticity is determined based on Kekule's 4n+2 rule.


The following list provides an overview of the procedure.

  • Input molecular structure
    • Formats of the molecular structure can be SDF or Gaussian output files (single point calculations)
    • Any file format can potentially be used
    • Protonation states need to be considered carefully (i.e. consistent with experimental values, etc.)
  • QM optimization - The molecular structure is optimized using HF/6-31G*.
  • Single point calculations
    1. Single point (SP) calculations are performed using MP2/6-311++G**
  • Valence
    1. Obtained from bond orders of optimization or SP calculations.
  • Bonded parameters
    1. Bonds (force constant, equilibrium bond length)
    2. Angles (force constant, equilibrium angle)
    3. Torsion (nth-fold, force constant, phase angle)
    4. Out-of-plane (constant)
    5. Assigned based on element type and valence target atom and its neighbors
  • vdW parameters
    1. Radius (r) and well-depth (ε)
    2. Assigned based on element type and valence
  • Multipoles and symmetry
    1. Atomic partial charge, dipole, and quadropole
    2. Obtained from distributed multipole analysis (Stone et al)
    3. Reference frame based on bonded atoms
      • Preference given to heavier atoms and those that are bonded to more atoms
    4. Molecule is analyzed for symmetry groups
      • Atoms that are equivalent based on symmetry group share multipole parameters
  • Polarization groups
    1. Polarization groups are identified by splitting rotatable bonds
    2. Hydroxyl -C-OH is one group
  • Torsion fitting
  • Output structure and key
    1. Output structure obtained from QM optimization
    2. Parameters located in Tinker key file
      • Available for Tinker v5
      • Possible to convert to Tinker v4.3 and Amber parameters files

Currently Gaussian package is required. GAMESS and other QM programs that implement Distributed Multipole Analysis can also be used but development is under way.

Future Updates

Torsional fitting

POLTYPE tries to detect flexible torsions, either transfer or parameterize on the fly. We are working on improving the robustness. In case neither is satisfactory for you, and you wish to manually derive torsion parameters for certain bonds, here is a general procedure:

For manual fitting,

  1.  First, scan the torsion around a given bond using QM restrained optimization (every ~20 degrees, 0-180 or 360 depending on the symmetry). Gaussian with DFT/M06L is reasonably good for this.
  2.  Identify all the relevant torsions in AMOEBA key file, set the parameters to 0. There could be easily 3x3=9 torsions across one rotatable bond. You need to identify their relationship (e.g. one is always +/-120 degree away from another if it is sp3). Each torsion can have up to 6 cosine terms, with V1, V2, V3...V6 as parameters. We typically use up to 3-fold (V1-3) and use 0, 180, and 0 for the phase angles due to symmetry. You could use all 6-fold and change phase anglesto match a difficult, non-symmetrical torsional energy profile, but not reocmmended.  If there are too many torsions/parameters, you can decide some share the same parameters or put the torsion energy in just 2 or 3 of them (e.g. heavy atom only); keep the rest to 0 and don’t use in fitting below
  3.  Retrained optimization using tinker/AMOEBA key from above. Starting from QM structures you hade from (1).
  4.  Compute QM (from 1)-MM(from 3) energy and shift the minimum to 0
  5.  Fit torsion parameters to the QM-MM target of (4). I use gnuplot (example below) 6) add the torsion parameters to the AMOEBA key file, run (3) again and compare with QM (1) to check.

Other tips:

  • If the shape of QM-MM profile is not symmetric with 0-360, that usually means the molecules "flipped" during scan so the structure is no longer symmetric.
  • If you can’t fit the whole curve, the minima are the most important and you can use "weight" to force them in gnuplot fit at step (5). For barriers, it is fine as long as there is high barrier in AMOEBA at the right angle location (e.g. 15 kcal/mol in MM vs 20 kcal/mol QM is not that "bad").

File "fit.gpt":

fit f(x) 'QM.txt'  using 1:2  via a,b,c,a1,b1,c1,h   
plot 'QM.txt' using 1:2, f(x)

In the above example, there two torsions to be fit. X is one of the torsion angle; another angle is x+120 or x+1.05 in radian). QM.txt is a text file with 2 columns: angle (in radian) and QM-MM energy

At the command line I run: "gnuplot fit.gpt"



Multipole definition and symmetry

frame definition (in kmpole.f of tinker source)

    • Z-then-X (M are atom types in the parameter file)
 M1 M2 M3 q
              Dx  Dy  Dz
              Qxy  Qyy
              Qxz  Qyz  Qzz

Note the traceless requirements: Qxx+Qyy+Qzz=0

    • bisector:
M1 M2 -M2, or M1 -M2 M2, or M1 -M2 -M2 are equivalent

    • bisector-then-X:
 M1 M2 -M3 -M3 q

Local frame.png


    • typically Dy, Qxy, Qyz components are zero, unless it is chiral such as C_alpha in protein
    • for chiral atom, use a forth atom type;
   M1 M2 M3 M4

Tinker automatically detect the change of chirality and assign the correct parameters by flipping the sign of Dy, Qxy, Qyz. For example, in the amoebapro.prm the multipole for Ca was derived for L form AA (multipole Ca N C Cb). If the parameter file is applied to D-form AA, the multipoles will be converted internally. If one derive the multipole using D- form, you need to put a negative sign in front of the y axis defining atom type (e.g. -Cb). When applied to L form, TINKER again will convert internally.

The chkpole.f in tinker/source handles this. The L form of Ca is defined by using N as the Z-axis (the Ca- N bond), C as the X-axis defining atom (in the C-Ca-N plane and perpendicular to Ca-N), and the Y-axis is along (acute angle with) the Ca-Cb bond. The sign of the parallelepiped volume below allows to check where is the Cb. Note that the Z-X definition can not be switched here.

   if ( .or.
& then
          yaxis(i) = -k
          pole(3,i) = -pole(3,i)
          pole(6,i) = -pole(6,i)
          pole(8,i) = -pole(8,i)
          pole(10,i) = -pole(10,i)
          pole(12,i) = -pole(12,i)
       end if
    • for EAB3 (e.g. C-CH3), three of the same B atom type connects to A. The Qxx and Qyy would be exactly the same for multipoles of A due to the symmetry, no matter bisector ( A B -B) or Z-then-X (A E B ) frames are used.


Q: I get the following error:

              missed_torsions = v1.get_mt() 
          AttributeError: Valence instance has no attribute get_mt 

A: Try commenting out the following lines in (~ lines 1881 - 1890):

           v1 = valence.Valence(output_format)
          idxtoclass = []
          for i in range(mol.NumAtoms()):
          # dorot is set as false in
          missed_torsions = v1.get_mt()
          if(not sorttorsion([t1.GetIdx(),t2.GetIdx(),t3.GetIdx(),t4.GetIdx()]) in missed_torsions):
              skiptorsion = True

This shouldn't create any issues with the result, but it will take longer to run

Previous Instructions



  • Johnny Wu (
  • Gaurav Chattree (
  • Advised by: Dr. Pengyu Ren (


Missing torsions are identified in valence by zero for all coefficeints. If no matches found, *~*~*~* will match any torsion with 0's as parameter guesses

Elements can be searched with numbers as well as letters

torsunit corrects for fact that torsion energy equations has /2 in it, so must multiply by 2 to cancel out that effect in valence under torguess


Undergraduate Projects

Create a python script that acts as  a server to let people external to our cluster run poltype jobs