接上文
NVT平衡
EM可保证我们的初始结构在几何构型和溶剂分子取向等方面都合理. 为了开始真正的动力学模拟, 我们必须对蛋白质周围的溶剂和离子进行平衡. 如果我们在这时就尝试进行非限制的动力学模拟, 体系可能会崩溃. 原因在于我们基本上只是优化了溶剂分子自身, 而没有考虑溶质. 我们需要将体系置于设定的模拟温度下, 以确定溶质(蛋白质)的合理取向. 达到正确的温度(基于动能)之后, 我们要对体系施加压力直到它达到合适的密度.
还记得好久以前我们用db2gmx
生成的osre.itp
文件么? 现在它要派上用场了. posre.itp
文件的目的在于对蛋白质中的重原子(非氢原子)施加位置限制(position restraining)力. 这些原子不会移动, 除非增加非常大的能量. 位置限制的用途在于, 我们可以平衡蛋白质周围的溶剂分子, 而不引起蛋白质结构的变化.
平衡往往分两个阶段进行. 第一个阶段在NVT系综(粒子数, 体积和温度都是恒定的)下进行. 这个系综也被称为等温等容系综或正则系综. 这个过程的需要的时间与体系的构成有关, 但在NVT系综中, 体系的温度应达到预期值并基本保持不变. 如果温度仍然没有稳定, 那就需要更多的时间. 通常情况下, 50 ps到100 ps就足够了,为了节约时间,我们仅进行2ps的NVT。
nvt.mdp
的设置如下:
title = OPLS NVT equilibration
define = -DPOSRES ; position restrain the protein
; Run parameters
integrator = md ; leap-frog integrator
nsteps = 1000 ; 2 * 1000 = 2 ps
dt = 0.002 ; 2 fs
; Output control
nstxout = 500 ; save coordinates every 1.0 ps
nstvout = 500 ; save velocities every 1.0 ps
nstenergy = 500 ; save energies every 1.0 ps
nstlog = 500 ; update log file every 1.0 ps
; Bond parameters
continuation = no ; first dynamics run
constraint_algorithm = lincs ; holonomic constraints
constraints = all-bonds ; all bonds (even heavy atom-H bonds) constrained
lincs_iter = 1 ; accuracy of LINCS
lincs_order = 4 ; also related to accuracy
; Neighborsearching
cutoff-scheme = Verlet
ns_type = grid ; search neighboring grid cells
nstlist = 10 ; 20 fs, largely irrelevant with Verlet
rcoulomb = 1.0 ; short-range electrostatic cutoff (in nm)
rvdw = 1.0 ; short-range van der Waals cutoff (in nm)
; Electrostatics
coulombtype = PME ; Particle Mesh Ewald for long-range electrostatics
pme_order = 4 ; cubic interpolation
fourierspacing = 0.16 ; grid spacing for FFT
; Temperature coupling is on
tcoupl = V-rescale ; modified Berendsen thermostat
tc-grps = Protein Non-Protein ; two coupling groups - more accurate
tau_t = 0.1 0.1 ; time constant, in ps
ref_t = 300 300 ; reference temperature, one for each group, in K
; Pressure coupling is off
pcoupl = no ; no pressure coupling in NVT
; Periodic boundary conditions
pbc = xyz ; 3-D PBC
; Dispersion correction
DispCorr = EnerPres ; account for cut-off vdW scheme
; Velocity generation
gen_vel = yes ; assign velocities from Maxwell distribution
gen_temp = 300 ; temperature for Maxwell distribution
gen_seed = -1 ; generate a random seed
file=open('nvt.mdp','w')
file.write("""
title = OPLS NVT equilibration
define = -DPOSRES ; position restrain the protein
; Run parameters
integrator = md ; leap-frog integrator
nsteps = 1000 ; 2 * 1000 = 2 ps
dt = 0.002 ; 2 fs
; Output control
nstxout = 500 ; save coordinates every 1.0 ps
nstvout = 500 ; save velocities every 1.0 ps
nstenergy = 500 ; save energies every 1.0 ps
nstlog = 500 ; update log file every 1.0 ps
; Bond parameters
continuation = no ; first dynamics run
constraint_algorithm = lincs ; holonomic constraints
constraints = all-bonds ; all bonds (even heavy atom-H bonds) constrained
lincs_iter = 1 ; accuracy of LINCS
lincs_order = 4 ; also related to accuracy
; Neighborsearching
cutoff-scheme = Verlet
ns_type = grid ; search neighboring grid cells
nstlist = 10 ; 20 fs, largely irrelevant with Verlet
rcoulomb = 1.0 ; short-range electrostatic cutoff (in nm)
rvdw = 1.0 ; short-range van der Waals cutoff (in nm)
; Electrostatics
coulombtype = PME ; Particle Mesh Ewald for long-range electrostatics
pme_order = 4 ; cubic interpolation
fourierspacing = 0.16 ; grid spacing for FFT
; Temperature coupling is on
tcoupl = V-rescale ; modified Berendsen thermostat
tc-grps = Protein Non-Protein ; two coupling groups - more accurate
tau_t = 0.1 0.1 ; time constant, in ps
ref_t = 300 300 ; reference temperature, one for each group, in K
; Pressure coupling is off
pcoupl = no ; no pressure coupling in NVT
; Periodic boundary conditions
pbc = xyz ; 3-D PBC
; Dispersion correction
DispCorr = EnerPres ; account for cut-off vdW scheme
; Velocity generation
gen_vel = yes ; assign velocities from Maxwell distribution
gen_temp = 300 ; temperature for Maxwell distribution
gen_seed = -1 ; generate a random seed
""")
file.close()
接下来与能量最小化情况类似
out=!{"gmx grompp -f nvt.mdp -c em.gro -p topol.top -o nvt.tpr"}
out=!{"gmx mdrun -v -deffnm nvt"}
由于模拟的时间比较短,我一会儿就模拟好了
让我们来分析温度变化情况, 再次使用energy
模块:
out=!{"gmx energy -f nvt.edr -o nvt.xvg<}
fig,ax=plt.subplots()
plotXVG(ax,"nvt.xvg")
ppl.legend(ax)
plt.show()
由于这里nstxout
记录值过大所以只有两个点,应该把值设置更小,从而产生如下结果
前一步的NVT平衡稳定了体系的温度. 在采集数据之前, 我们还需要稳定体系的压力(因此还包括密度). 压力平衡是在NPT系综下进行的, 其中粒子数, 压力和温度都保持不变. 这个系综也被称为等温等压系综, 最接近实验条件.
一般建议设置的模拟时长为1ns,但是由于时间关系,这里仅模拟10ps
npt.mdp
设置如下:
title = OPLS NPT equilibration
define = -DPOSRES ; position restrain the protein
; Run parameters
integrator = md ; leap-frog integrator
nsteps = 5000 ; 2 * 5000 = 10 ps
dt = 0.002 ; 2 fs
; Output control
nstxout = 50 ; save coordinates every 0.1 ps
nstvout = 50 ; save velocities every 0.1 ps
nstenergy = 50 ; save energies every 0.1 ps
nstlog = 500 ; update log file every 1.0 ps
; Bond parameters
continuation = yes ; Restarting after NVT
constraint_algorithm = lincs ; holonomic constraints
constraints = all-bonds ; all bonds (even heavy atom-H bonds) constrained
lincs_iter = 1 ; accuracy of LINCS
lincs_order = 4 ; also related to accuracy
; Neighborsearching
cutoff-scheme = Verlet
ns_type = grid ; search neighboring grid cells
nstlist = 10 ; 20 fs, largely irrelevant with Verlet scheme
rcoulomb = 1.0 ; short-range electrostatic cutoff (in nm)
rvdw = 1.0 ; short-range van der Waals cutoff (in nm)
; Electrostatics
coulombtype = PME ; Particle Mesh Ewald for long-range electrostatics
pme_order = 4 ; cubic interpolation
fourierspacing = 0.16 ; grid spacing for FFT
; Temperature coupling is on
tcoupl = V-rescale ; modified Berendsen thermostat
tc-grps = Protein Non-Protein ; two coupling groups - more accurate
tau_t = 0.1 0.1 ; time constant, in ps
ref_t = 300 300 ; reference temperature, one for each group, in K
; Pressure coupling is on
pcoupl = Parrinello-Rahman ; Pressure coupling on in NPT
pcoupltype = isotropic ; uniform scaling of box vectors
tau_p = 2.0 ; time constant, in ps
ref_p = 1.0 ; reference pressure, in bar
compressibility = 4.5e-5 ; isothermal compressibility of water, bar^-1
refcoord_scaling = com
; Periodic boundary conditions
pbc = xyz ; 3-D PBC
; Dispersion correction
DispCorr = EnerPres ; account for cut-off vdW scheme
; Velocity generation
gen_vel = no ; Velocity generation is off
file=open('npt.mdp','w')
file.write("""
title = OPLS NPT equilibration
define = -DPOSRES ; position restrain the protein
; Run parameters
integrator = md ; leap-frog integrator
nsteps = 5000 ; 2 * 5000 = 10 ps
dt = 0.002 ; 2 fs
; Output control
nstxout = 50 ; save coordinates every 0.1 ps
nstvout = 50 ; save velocities every 0.1 ps
nstenergy = 50 ; save energies every 0.1 ps
nstlog = 500 ; update log file every 1.0 ps
; Bond parameters
continuation = yes ; Restarting after NVT
constraint_algorithm = lincs ; holonomic constraints
constraints = all-bonds ; all bonds (even heavy atom-H bonds) constrained
lincs_iter = 1 ; accuracy of LINCS
lincs_order = 4 ; also related to accuracy
; Neighborsearching
cutoff-scheme = Verlet
ns_type = grid ; search neighboring grid cells
nstlist = 10 ; 20 fs, largely irrelevant with Verlet scheme
rcoulomb = 1.0 ; short-range electrostatic cutoff (in nm)
rvdw = 1.0 ; short-range van der Waals cutoff (in nm)
; Electrostatics
coulombtype = PME ; Particle Mesh Ewald for long-range electrostatics
pme_order = 4 ; cubic interpolation
fourierspacing = 0.16 ; grid spacing for FFT
; Temperature coupling is on
tcoupl = V-rescale ; modified Berendsen thermostat
tc-grps = Protein Non-Protein ; two coupling groups - more accurate
tau_t = 0.1 0.1 ; time constant, in ps
ref_t = 300 300 ; reference temperature, one for each group, in K
; Pressure coupling is on
pcoupl = Parrinello-Rahman ; Pressure coupling on in NPT
pcoupltype = isotropic ; uniform scaling of box vectors
tau_p = 2.0 ; time constant, in ps
ref_p = 1.0 ; reference pressure, in bar
compressibility = 4.5e-5 ; isothermal compressibility of water, bar^-1
refcoord_scaling = com
; Periodic boundary conditions
pbc = xyz ; 3-D PBC
; Dispersion correction
DispCorr = EnerPres ; account for cut-off vdW scheme
; Velocity generation
gen_vel = no ; Velocity generation is off
""")
file.close()
out=!{"gmx grompp -f npt.mdp -c nvt.gro -t nvt.cpt -p topol.top -o npt.tpr"}
out=!{"gmx mdrun -v -deffnm npt"}
我们可以通过压力和密度检测来看模拟是否平衡
out=!{"gmx energy -f npt.edr -o pressure.xvg<}out=!{"gmx energy -f npt.edr -o density.xvg<}
fig,ax=plt.subplots()
plotXVG(ax,"pressure.xvg")
ppl.legend(ax)
plt.show()
fig,ax=plt.subplots()
plotXVG(ax,"density.xvg")
ppl.legend(ax)
plt.show()
成品MD
随着两个平衡阶段的完成, 体系已经在需要的温度和压强下平衡好了. 我们现在可以放开位置限制并进行成品MD以收集数据了. 这个过程跟前面的类似. 运行grompp
时, 我们还要用到检查点文件(在这种情况下,其中包含了压力耦合信息). 我们要进行一个1 ns的MD模拟, 所用的参数文件如下:
title = OPLS MD simulation
; Run parameters
integrator = md ; leap-frog integrator
nsteps = 500000 ; 2 * 500000 = 1000 ps (1 ns)
dt = 0.002 ; 2 fs
; Output control
nstxout = 5000 ; save coordinates every 10.0 ps
nstvout = 5000 ; save velocities every 10.0 ps
nstenergy = 5000 ; save energies every 10.0 ps
nstlog = 5000 ; update log file every 10.0 ps
nstxout-compressed = 5000 ; save compressed coordinates every 10.0 ps
; nstxout-compressed replaces nstxtcout
compressed-x-grps = System ; replaces xtc-grps
; Bond parameters
continuation = yes ; Restarting after NPT
constraint_algorithm = lincs ; holonomic constraints
constraints = all-bonds ; all bonds (even heavy atom-H bonds) constrained
lincs_iter = 1 ; accuracy of LINCS
lincs_order = 4 ; also related to accuracy
; Neighborsearching
cutoff-scheme = Verlet
ns_type = grid ; search neighboring grid cells
nstlist = 10 ; 20 fs, largely irrelevant with Verlet scheme
rcoulomb = 1.0 ; short-range electrostatic cutoff (in nm)
rvdw = 1.0 ; short-range van der Waals cutoff (in nm)
; Electrostatics
coulombtype = PME ; Particle Mesh Ewald for long-range electrostatics
pme_order = 4 ; cubic interpolation
fourierspacing = 0.16 ; grid spacing for FFT
; Temperature coupling is on
tcoupl = V-rescale ; modified Berendsen thermostat
tc-grps = Protein Non-Protein ; two coupling groups - more accurate
tau_t = 0.1 0.1 ; time constant, in ps
ref_t = 300 300 ; reference temperature, one for each group, in K
; Pressure coupling is on
pcoupl = Parrinello-Rahman ; Pressure coupling on in NPT
pcoupltype = isotropic ; uniform scaling of box vectors
tau_p = 2.0 ; time constant, in ps
ref_p = 1.0 ; reference pressure, in bar
compressibility = 4.5e-5 ; isothermal compressibility of water, bar^-1
; Periodic boundary conditions
pbc = xyz ; 3-D PBC
; Dispersion correction
DispCorr = EnerPres ; account for cut-off vdW scheme
; Velocity generation
gen_vel = no ; Velocity generation is off
file=open('md.mdp','w')
file.write("""
title = OPLS Lysozyme MD simulation
; Run parameters
integrator = md ; leap-frog integrator
nsteps = 500000 ; 2 * 500000 = 1000 ps (1 ns)
dt = 0.002 ; 2 fs
; Output control
nstxout = 5000 ; save coordinates every 10.0 ps
nstvout = 5000 ; save velocities every 10.0 ps
nstenergy = 5000 ; save energies every 10.0 ps
nstlog = 5000 ; update log file every 10.0 ps
nstxout-compressed = 5000 ; save compressed coordinates every 10.0 ps
; nstxout-compressed replaces nstxtcout
compressed-x-grps = System ; replaces xtc-grps
; Bond parameters
continuation = yes ; Restarting after NPT
constraint_algorithm = lincs ; holonomic constraints
constraints = all-bonds ; all bonds (even heavy atom-H bonds) constrained
lincs_iter = 1 ; accuracy of LINCS
lincs_order = 4 ; also related to accuracy
; Neighborsearching
cutoff-scheme = Verlet
ns_type = grid ; search neighboring grid cells
nstlist = 10 ; 20 fs, largely irrelevant with Verlet scheme
rcoulomb = 1.0 ; short-range electrostatic cutoff (in nm)
rvdw = 1.0 ; short-range van der Waals cutoff (in nm)
; Electrostatics
coulombtype = PME ; Particle Mesh Ewald for long-range electrostatics
pme_order = 4 ; cubic interpolation
fourierspacing = 0.16 ; grid spacing for FFT
; Temperature coupling is on
tcoupl = V-rescale ; modified Berendsen thermostat
tc-grps = Protein Non-Protein ; two coupling groups - more accurate
tau_t = 0.1 0.1 ; time constant, in ps
ref_t = 300 300 ; reference temperature, one for each group, in K
; Pressure coupling is on
pcoupl = Parrinello-Rahman ; Pressure coupling on in NPT
pcoupltype = isotropic ; uniform scaling of box vectors
tau_p = 2.0 ; time constant, in ps
ref_p = 1.0 ; reference pressure, in bar
compressibility = 4.5e-5 ; isothermal compressibility of water, bar^-1
; Periodic boundary conditions
pbc = xyz ; 3-D PBC
; Dispersion correction
DispCorr = EnerPres ; account for cut-off vdW scheme
; Velocity generation
gen_vel = no ; Velocity generation is off
""")
file.close()
out=!{"gmx grompp -f md.mdp -c npt.gro -t npt.cpt -p topol.top -o md_0_1.tpr"}
!gmx mdrun -v -deffnm md_0_1
具体的分析方法与内容我们下次分享
更多原创精彩视频敬请关注生信杂谈: