Difference between revisions of "Poltype:Poltype"
|Line 84:||Line 84:|
= POLTYPE 2.0 =
= POLTYPE 2.0 =
: [https://pren.github./poltype/ https://pren.github./poltype/]
== Use POLTYPE 2.0 in Ren lab (07/2020) ==
== Use POLTYPE 2.0 in Ren lab (07/2020) ==
Revision as of 17:23, 22 July 2020
- WHAT DOES POLTYPE DO? The program uses QM calculations to obtain electrostatic multipoles and some valence parameters (equilibrium bond length and angle), and a database to assign the rest of valence parameters (inclduing force constants) and non-bonded parameters such as vdW and polarizability. It will attempt transfer (if in database) or fit the torsion parameters for "flexible rotatable bonds." While the electrostatic parameters follow the AMOEBA protocol, valence parameters should also be reasonable, the vdW and torsion parameters are only estimates. Should you need accurate parameters for simulations etc please refine/pay attention to the torison and vdW parms (work in progress to automate these).
- For large molecules (e.g. >100 atoms), one may need to split the molecule into small model compounds, derive parameters and merge them. When manually merging multipoles at boundary, it is recommended to use POTENTIAL program in Tinker to re-optimize these boundary atomic multipoles. Automation of this progress is under way.
- Check the poltype log file to make sure the QM electrostatic potential, molecular dipole and quadrupole moments are well reproduced by atomic multipoles. Typically molecular moments should be automatically reproduced by atomic multipoles unless for some highly symmetric molecules for which the program may pick wrong local frames and force certain multipole components to 0. This can be remedied by running poledit.x of Tinker and change local frame definitions for specific atoms and/or adding keywords such as "target-dipole x y z" to tinker.key file before re-running ESP fitting.
- For high accuracy, further QM should be used to adjust valence force constants (valence program in TINKER) and QM dimer or liquid simulations to refine vdw parameters, especially if the functional group has not been covered by amoeba09.prm or amoebapro13.prm.
- The input file needs to have correct total molecular charge specification! POLTYPE does NOT predict if the molecule is charged or neutral, users set them. See below for details on how to set total charge.
- The default QM basis set for ESP fitting is MP2/6-311++G2d,2p. To use aug-cc-pvtz, add option --espbasisset=aug-cc-pvtz (slower but better)
- Halogen parameters:
- Mono Cl vdw parameters updated according to JPCB 2014, 118 (24), pp 6456–6465.
- Can NOT detect multiple Cl and assign the corresponding parameters yet. Please annually update vdw parameters according to the paper.
- F, Br, I parameters are merely educated guesses, not yet carefully parameterized & validated like Cl.
- F vdw parameters for fluorobenzene should be manually changed to 3.2020 0.2500. For details see /users/pren/CCP/FLUORO/
- Cl vdw parameters for o-dicholorobenzene should be manually changed to 3.860 0.446.
Obsoltete: to use with amber, you need tinker v4.3 format by using the proper option (see below). For details read here
For instruction about using AMOEBA in AMBER (pmemd), http://biomol.bme.utexas.edu/wiki/index.php/Research_ex:Amber
Wu, J.C., G. Chattree, and P. Ren, Automation of AMOEBA polarizable force field parameterization for small molecules. Theoretical chemistry accounts, 2012. 131(3): p. 1138.
Check your results
While we continue to improve POLTYPE, currently there are "tough" molecules for POLYTYPE to handle. Lots of information are printed in the output files for you to check (automation is under way)
- ttt.xyz and ttt.key are the resulting structure and parameter files you will need.
- Check log file to make sure all calculations actually complete successfully!
- If you have a charged molecule, read a section below on how to make sure the correct charge is reflected in your input structure.
- For molecules of high symmetry, please check frame definitions and alter them as needed using POLEDIT and POTENTIAL command in Tinker to change frames/combine multipoles and refit to ESP. See here for common frames AMOEBA uses: http://biomol.bme.utexas.edu/wiki/index.php/ABS:Mm_mpole#Local_frame_definition
- The molecular dipole moments should match between QM and MM
- The ESP fitting RMSE should be small
- Check the torsion plots in qm-torsion folder.
- if you are concerned with vdw parameters, you may check again QM dimer (your_molecule-water) structure and energy.
- This is updated in 1.1.4, the output is now consistent with amoeba09.prm and amoeba2013.prm. The following is no longer necessary.
The previous output tinker files are in tinker v6 (old) format - please watch out opbendunit headers (opbendunit 0.02191418) when mixing with build-in parameters: the opbend force constants need to be scaled up by 71.94 before combining with amoebapro13.prm which does NOT use "opbendunit 0.02191418" in the header
INPUT SDF file
The SDF file needs to specificy the correct total charge of the molecule!
SDF File Definition and Setting total charge
1. Note that you don't need to have "correct" charges for the atom. The program will produce all the charges and dipoles etc for you. However, we need correct total molecular charge (neutral, +1, -1...) for Gaussain job. You can do so by assigning the total molecular charge to the first atom.
- Note that the multiplicity is assumed to be "1". If you need something else, see the "RAD" option below.
- Specify Molecular Charge
- 1 = +3, 2 = +2, 3 = +1, 5=-1, 6=-2
- 4 = doublet radical, 5 = -1, 6 = -2, 7 = -3
- 0 or other value = uncharged
- Example of N+
-0.0007 -0.0000 -0.0000 N 0 3 0 0 0 0 0 0 0 0 0 0 xxxxx.xxxxyyyyy.yyyyzzzzz.zzzz aaaddcccssshhhbbbvvvHHHrrriiimmmnnnee
2. Alternative charge specification (if use both approaches to specify charges, need to be consistent)
- After the bond block
- Following example means there is 1 atom with a formal charge: atom 2 has a +1 charge
- If you have both CHG and atomic charges specified as above, CHG takes precedence.
M CHG 1 2 1
- Multiplicity specification
- After the bond block
- Following example means there is 1 atom with a radical: atom 4 has a radical of 1
M RAD 1 4 1
- SDF specification or search "SDF CTFILE"
Poltype 2.0 github site: https://github.com/pren/poltype/tree/poltype2
Poltype 2.0 documentation website: https://pren.github.io/poltype/
Use POLTYPE 2.0 in Ren lab (07/2020)
To use Chengwen's POLTYPE, source the following:
#!/usr/bin/bash -f #Python & openbabel export PATH="/home/liuchw/anaconda3/bin:$PATH" conda activate poltype #Gaussian, GDMA and scratch export g09root=/opt/g09gh/gaussian source $g09root/g09/bsd/g09.profile export GAUSS_SCRDIR=/scratch/$USER export GDMADIR=/home/liuchw/Softwares/gdma-2.3.3/bin export PATH=$GDMADIR:$PATH export PSI_SCRATCH=/scratch/$USER #Tinker export PATH=/home/liuchw/Softwares/tinkers/Tinker-latest/source/:$PATH
Updated July 2015
Download from Poltype from github (master)
- Gaussian (need to source corresponding .g09* files to set environment variables)
- GDMA -- Stone's Distributed Multipole Analysis software
- Python bindings for OpenBabel, and matching python installed
- Tinker 8.2 (Don't use >= 8.7) from dasher.wustl.edu
Babel/Python binding installation Install Open Babel from source code by following these steps (from Open Babel installation instructions):
$ tar zxf openbabel-2.3.0.tar.gz # (this creates openbabel-2.3.0) $ mkdir build $ cd build $ cmake ../openbabel-2.3.0 -DPYTHON_BINDINGS=ON -DCMAKE_INSTALL_PREFIX=/opt/openbabel-2.3.0 $ make # make install
The following environment variables (with the appropriate paths) need to be exported. They may be saved in your ~/.bashrc or a ~/.cshrc appropriate format may be used:
set path = ( $path /opt/software/CHEM/openbabel-2.3.0/bin) setenv PYTHONPATH /opt/software/CHEM/openbabel-2.3.0/lib:/opt/software/CHEM/openbabel-2.3.0/lib64/python2.6/site-packages setenv GDMADIR /opt/gdma-2.2/bin setenv LD_LIBRARY_PATH /opt/software/CHEM/openbabel-2.3.0/lib setenv OPENBABEL_INSTALL /opt/software/CHEM/openbabel-2.3.0 setenv TINKERDIR=/users/wuch/tinker/source plus those for Gaussian & Tinker...
# .oldpoltype.bashrc export g09root=/opt/g09gh/gaussian # where software is installed source $g09root/g09/bsd/g09.profile # sets up enviormental variable for running '''Gaussian''' export GAUSS_SCRDIR=/scratch/bdw2292/ # defines where scratch space if for QM computations, for gaussian export GDMADIR=/opt/gdma-2.2-ifort/bin/gdma # path to '''GDMA '''binary export PATH=/opt/gdma-2.2-ifort/bin/gdma:$PATH # add GDAM binary to variable PATH export PSI_SCRATCH=/scratch/bdw2292/ # define scratch space for Psi4 export PATH=/home/bdw2292/Tinker-8.4.3/bin/:$PATH # add '''Tinker '''executables/binary to the variable PATH export PYTHONPATH=/home/bdw2292/Python2.7/lib/python2.7:/opt/software/CHEM/openbabel-2.3.0/lib:/opt/software/CHEM/openbabel-2.3.0/lib64/python2.6/site-packages:/usr/lib64/python2.6/lib-tk:/usr/lib64/python2.6/lib-dynload/_tkinter.so:$PYTHONPATH # add python libraries, including openbabel python bindings for POLTYPE; if you python2.7 for openbabel export PATH=/home/bdw2292/Python2.7/bin/:$PATH alias python=/home/bdw2292/Python2.7/bin/python2.7 # shortcut for calling python2.7 alias pip=/home/bdw2292/Python2.7/bin/"python2.7 -m pip" # shortcut for calling pip export PATH=$PATH:/opt/software/CHEM/openbabel-2.3.0/bin # openbabel binary for use on commandline export LD_LIBRARY_PATH=/opt/software/CHEM/openbabel-2.3.0/lib:$LD_LIBRARY_PATH # linked libraries for openbabel
You may need to add intel compiler/lib to your path to run tinker commands. You can try a TINKER command first to make sure it can run.
poltype cpmmand line options: -s,--structure -- Initial structure (i.e. SDF, MOL2, PDB, etc.) (required) -n,--numproc -- Number of processors to use for QM calculations (default: 1) -m,--maxmem -- Maximum amount of memory to use for QM calculations (default: 700MB) -M,--maxdisk -- Maximum amount of diskspace to use for QM calculations (300GB or larger recommended for molecules with ~30,40 heavy atoms) (default 100GB) -a,--atmidx -- Starting index of atom type (default: 401) --omit-torsion -- Skips QM torsion scan calculations around rotatable bonds --omit-torsion2, and a file *.toromit which lists the specific torsion scans you would like to skip --do-tor-qm-opt, during torsion scanning, use qm to optimize the molecule at each angle (by default this is off, TINKER optimization is used instead) --tinker4format -- to output tinker v4.3 format (use with AMBER). Default output is tinker5format. Note this is no longer necessary as the latest analyze.f modification allow you to use the v6 tinker exe and parameters with amber. See Readme first above. -h,--help -- displays this help message
Additional options to select basis sets in QM calculations
--optbasisset= (Default now is MP2/6-31G*. Used to be HF/6-31G*; for torsion scan, HF is still used for geom opt and MP2/6-311++G** for sp energy) --dmabasisset= (6-311G** is default) --espbasisset= (default MP2/6-311++G(2d,2p); aug-cc-pvtz is recommended for small molecules - see README)
poltype.py -s phenol.sdf -n 8 -m 7500MB -M 300GB
- The scratch directory used for QM calculations can be specified by exporting the GAUSS_SCRDIR environment variable.
- Verify your molecular structure, protonation state, multiplicity here:
We define small fragment as groups (e.g. a water molecule, benzyl group) and there is no intramolecular permanent multipole polarization within the group. The group definition needs to be set before deriving permanent multipole from QM since we remove the intramolecular polarization from QM multipoles. POLTYPE allow users enter bonds that separate a molecule into groups.
polarize 1 1.0730 0.3900 3 4 9 52 223 242
polarize 3 1.3340 0.3900 1 5 7 50 225 227
In the above example, 1.073 or 1.334 is atomic polarizability, 0.39 is the damping coefficient. 3 and the rest of the number are atom types.
For example, atom type 1 will be in the same group with atom type 3, 4, 9, 52, 223 or 242, if any of them is bonded to 1. Tinker will search the molecule and match the definition above to define groups.
Multipole definition and symmetry
Local frame definition (in kmpole.f of tinker source). Currently POLTYPE force to use the first two types.
- Z-then-X (M are atom types in the parameter file)
M1 MZ MX q Dx Dy Dz Qxx Qxy Qyy Qxz Qyz Qzz
Note the traceless requirements: Qxx+Qyy+Qzz=0
- bisector: (e.g. for O in water)
M1 M2 -M2, or M1 -M2 M2, or M1 -M2 -M2 are equivalent (e.g. in the fig below the bisector the two H atoms define the Z for O). Note that you can also define bisector even the the last two atoms are not of the same type: M1 M2 -M3 when M2 and M3 are different types.
M1 MZ -M2 -M3 q (note th elast two frame defining atom types M2 and M3 both need to be negative!)
The two frames below are supported in Beta version of POLTYPE as of Nov 2018
- Z-only: (this is useful for molecules such ash N2, O2 etc)
M1 MZ 0 q
M1 MZ q
- 3-fold (trisector). This is useful for PO3 or NH3, where the trisectioe of P-O or N-H bonds is used as Z for P or N. For P in CH3PO3, we can also use z only above (C would be the MZ)
M1 -M2 -M3 -M4 q
Symmetry in multipole components (need to enforced)
- Typically for all frame definitions, including z-then-x, Dy, Qxy, Qyz components are zero, unless it is chiral such as C_alpha in protein (see below)
- For bisector frame, if the two atom types in the frame def are the same (e.g. M N -N or M -N -N), POLTYPE forces dx, dy, Qxz, Qxy, Qyz =0 for symmetry reason. Nete that it is possible to use bisector when the two frame-def atoms are not the same (M -N -O); in that case, Tinker poledit.x does not force dx=Qxz=0 but POLTYPE always do.
- for chiral atom, use a forth atom type;
M1 M2 M3 M4
Tinker automatically detect the change of chirality and assign the correct parameters by flipping the sign of Dy, Qxy, Qyz. For example, in the amoebapro.prm the multipole for Ca was derived for L form AA (multipole Ca N C Cb). If the parameter file is applied to D-form AA, the multipoles will be converted internally. If one derive the multipole using D- form, you need to put a negative sign in front of the y axis defining atom type (e.g. -Cb). When applied to L form, TINKER again will convert internally.
The chkpole.f in tinker/source handles this. The L form of Ca is defined by using N as the Z-axis (the Ca- N bond), C as the X-axis defining atom (in the C-Ca-N plane and perpendicular to Ca-N), and the Y-axis is along (acute angle with) the Ca-Cb bond. The sign of the parallelepiped volume below allows to check where is the Cb. Note that the Z-X definition can not be switched here.
if (k.lt.0.and.vol.gt.0.0d0 .or. & k.gt.0.and.vol.lt.0.0d0) then yaxis(i) = -k pole(3,i) = -pole(3,i) pole(6,i) = -pole(6,i) pole(8,i) = -pole(8,i) pole(10,i) = -pole(10,i) pole(12,i) = -pole(12,i) end if
- for EAB3 (e.g. -PO32-), three of the same B atom type connects to A. It is best to use Z-only frame. The Qxx and Qyy would be exactly the same for multipoles of A due to the symmetry (Qxx=Qyy=-1/2*Qzz), in addiiton to dx, dy, Qxz, Qxy, Qyz =0 (bisector frame requirement).
Example of Nitrate NO3-
Note NO3- is fairly flat and rigid so we are able to use Z-then-X for both N and O. For N, the Qxx and Qzz are equal (x and Z are in the plane, y is normal to the plane). For O, the "2 1 2" also works because the direction of "X" or "Y" is not affecting the (sign) of Qxx or Qyy.
######################################### ## AMOEBA Parameters for Nitrate Ion ## ######################################### atom 1 1 N "Nitrate N" 7 14.007 3 atom 2 2 O "Nitrate O" 8 15.999 1 vdw 1 3.7100 0.1100 vdw 2 3.5100 0.1120 bond 1 2 390.0 1.2606 angle 2 1 2 155.00 120.00 strbnd 2 1 2 18.70 18.70 opbend 2 1 2 2 172.00 multipole 1 2 2 1.07363 0.00000 0.00000 0.00000 0.14407 0.00000 -0.28814 0.00000 0.00000 0.14407 multipole 2 1 2 -0.69121 0.00000 0.00000 0.04941 -0.50067 0.00000 -0.08271 0.00000 0.00000 0.58338 polarize 1 1.0730 0.3900 2 polarize 2 0.8370 0.3900 1
POLTYPE tries to detect flexible torsions, either transfer or parameterize on the fly. We are working on improving the robustness. In case neither is satisfactory for you, and you wish to manually derive torsion parameters for certain bonds, here is a general procedure:
For manual fitting,
- First, scan the torsion around a given bond using QM restrained optimization (every ~20 degrees, 0-180 or 360 depending on the symmetry). Gaussian with DFT/M06L is reasonably good for this.
- Identify all the relevant torsions in AMOEBA key file, set the parameters to 0. There could be easily 3x3=9 torsions across one rotatable bond. You need to identify their relationship (e.g. one is always +/-120 degree away from another if it is sp3). Each torsion can have up to 6 cosine terms, with V1, V2, V3...V6 as parameters. We typically use up to 3-fold (V1-3) and use 0, 180, and 0 for the phase angles due to symmetry. You could use all 6-fold and change phase anglesto match a difficult, non-symmetrical torsional energy profile, but not reocmmended. If there are too many torsions/parameters, you can decide some share the same parameters or put the torsion energy in just 2 or 3 of them (e.g. heavy atom only); keep the rest to 0 and don’t use in fitting below
- Retrained optimization using tinker/AMOEBA key from above. Starting from QM structures you hade from (1).
- Compute QM (from 1)-MM(from 3) energy and shift the minimum to 0
- Fit torsion parameters to the QM-MM target of (4). I use gnuplot (example below) 6) add the torsion parameters to the AMOEBA key file, run (3) again and compare with QM (1) to check.
- If the shape of QM-MM profile is not symmetric with 0-360, that usually means the molecules "flipped" during scan so the structure is no longer symmetric.
- If you can’t fit the whole curve, the minima are the most important and you can use "weight" to force them in gnuplot fit at step (5). For barriers, it is fine as long as there is high barrier in AMOEBA at the right angle location (e.g. 15 kcal/mol in MM vs 20 kcal/mol QM is not that "bad").
f(x)=a/2.0*(1+cos(x))+b/2.0*(1-cos(2*x))+c/2.0*(1+cos(3*x))+a1/2.0*(1+cos(x+1.05))+b1/2.0*(1-cos(2*(x+1.05)))+c1/2.0*(1+cos(3*(x+1.05)))+h a=1;b=1;c=1;a1=1;b1=1;c1=1;;h=1 fit f(x) 'QM.txt' using 1:2 via a,b,c,a1,b1,c1,h plot 'QM.txt' using 1:2, f(x)
In the above example, there two torsions to be fit. X is one of the torsion angle; another angle is x+120 or x+1.05 in radian). QM.txt is a text file with 2 columns: angle (in radian) and QM-MM energy
At the command line I run: "gnuplot fit.gpt"
Notes about compiling tinker 8 with OpenMP
1. Set up gcc/gfortran or intel ifort v12 or above compiler (set this up in .bashrc or .cshrc). If use intel, for csh add this line is .cshrc and source it :
source /opt/intel/composer_xe_2013.2.146/bin/compilervars.csh intel64
Or .bashrc if you use bash:
source /opt/intel/composer_xe_2013.2.146/bin/compilervars.sh intel64
Intel compiled programs perform better than GNU-compiled ones on multiple intel CPU cores.
2. Compile fftw by following 0README in tinker/fftw/. Use gcc and gfortran or icc and ifort.
3. Compile executables under tinker/source/
You may use the compile.make, library.make and link.make in tinker/linux or tinker/macosx. Copy these 3 files into tinker/source and extecute the 3 on that folder seuqentially. You may need to change the options in the .make files if you have trouble.
You may also use the Makefile provided in tinker/make/
cd tinker/source cp../make/Makefile ./ #Modify the top part and compiler section (see below). make -j 6 all #If you modified Tinker source files, you may also need to update the dependency at the bottom of makefile by running depend.make in source
In the Makefile, define the TINKERDIR and FFTWDIR on the top properly.
F77 = gfortran F77FLAGS = -c OPTFLAGS = -O3 -ffast-math -fopenmp LIBDIR = -L. -L$(TINKER_LIBDIR)/linux LIBS = LIBFLAGS = -crusv RANLIB = ranlib LINKFLAGS = $(OPTFLAGS) -static-libgcc RENAME = rename_bin
Or this if you use intel:
F77 = ifort F77FLAGS = -c -xHost OPTFLAGS = -O3 -no-ipo -no-prec-div -recursive -openmp LIBDIR = -L. -L$(TINKER_LIBDIR)/linux LIBS = -L$(TINKERDIR)/fftw/lib -lfftw3_threads -lfftw3 LIBFLAGS = -crusv RANLIB = echo LINKFLAGS = $(OPTFLAGS) -static-libgcc -static-intel RENAME = rename_bin
You can also use intel mkl instead of the fftw library above. In Makefile make the following changes (/home/pren/tinker6/source-2014-xz-node/Makefile). You also need to remove any "$(TINKER)/fftw/lib" in the Makefile
F77 = ifort LIBS = -L/opt/intel/composer_xe_2013/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core F77FLAGS = -c -xHost -assume cc_omp OPTFLAGS = -O3 -no-ipo -no-prec-div -openmp LIBFLAGS = -crusv LINKFLAGS = $(OPTFLAGS)
Visualizing Tinker XYZ
You can use pymol or vmd to visualize tinker xyz file.
Q: I get the following error:
get_torlist missed_torsions = v1.get_mt() AttributeError: Valence instance has no attribute get_mt
A: Try commenting out the following lines in Poltype.py (~ lines 1881 - 1890):
v1 = valence.Valence(output_format) idxtoclass =  for i in range(mol.NumAtoms()): idxtoclass.append(i+1) v1.setidxtoclass(idxtoclass) # dorot is set as false in valence.py v1.torguess(mol,False,) missed_torsions = v1.get_mt() if(not sorttorsion([t1.GetIdx(),t2.GetIdx(),t3.GetIdx(),t4.GetIdx()]) in missed_torsions): skiptorsion = True
This shouldn't create any issues with the result, but it will take longer to run
Q: What is Cl radius in DMA A:
For Cl we use a radii of “1.1”, which should be the default value of GDMAv2.2 (it is mentioned in the manual). To be sure, you can also set it explicitly in POLTYPE to 1.1 (Radius Cl 1.10).
For Cl polarizability (2.5 A3) and vdw parameters, POLTYPE does not assign the lasts parameter yet. Please use the parameters from "J. Phys. Chem. B, 2014, 118 (24), pp 6456–6465," where Table 3 lists all the vdw parameters for Cl, Cl2, Cl3…You can update the Cl vdw values in “valence.py” inside the POLTYPE folder. We will soon update the official POLTYPE to take care of all halogen parameters.
Q: Why use original DMA, not the latest ver 2.0
We always use the original DMA procedure (even with newer v2.2 program) by setting “Switch” to 0 and Radius H to 0.650. In the original DMA, all atoms (H,C,N,O) have the same radius but we do find the need for large radius for S, P and some halogen. The new GDMA use a grid base grid-based quadrature algorithm ("Switch >4" and "Radius H 0.325") approach and use different radius to partition electron density to different atoms. We found some transferability issues with the new algorithm as we discussed in our POLTYPE (Theor Chem Acc 2012) and AMOEBA small organic paper (JCTC 2011, 7, p 3143).
- Matthew Harger (2016-2019)
- Brandon Walker (2018- )
- Johnny Wu (2008-2012)
- Gaurav Chattree (2013)
- Pengyu Ren (PI)
Click to expand: todo list