专栏名称: 生信杂谈
生物信息学;生物信息;计算机辅助药物设计;测序分析;Python;R;机器学习;论文写作;网站制作;LOL;dota2。
目录
相关文章推荐
中国旅游报  ·  逐梦亚冬,黑龙江冰雪游新亮点多多 ·  6 天前  
KLOOK客路旅行  ·  圣诞预告!东京哈利波特1:1还原雪中霍格沃茨 ·  1 周前  
51好读  ›  专栏  ›  生信杂谈

#MD#模拟教程:GROMACS之載脂蛋白模拟2

生信杂谈  · 公众号  ·  · 2017-07-10 23:33

正文

接上文

NVT平衡

EM可保证我们的初始结构在几何构型和溶剂分子取向等方面都合理. 为了开始真正的动力学模拟, 我们必须对蛋白质周围的溶剂和离子进行平衡. 如果我们在这时就尝试进行非限制的动力学模拟, 体系可能会崩溃. 原因在于我们基本上只是优化了溶剂分子自身, 而没有考虑溶质. 我们需要将体系置于设定的模拟温度下, 以确定溶质(蛋白质)的合理取向. 达到正确的温度(基于动能)之后, 我们要对体系施加压力直到它达到合适的密度.

还记得好久以前我们用db2gmx生成的osre.itp文件么? 现在它要派上用场了. posre.itp文件的目的在于对蛋白质中的重原子(非氢原子)施加位置限制(position restraining)力. 这些原子不会移动, 除非增加非常大的能量. 位置限制的用途在于, 我们可以平衡蛋白质周围的溶剂分子, 而不引起蛋白质结构的变化.

平衡往往分两个阶段进行. 第一个阶段在NVT系综(粒子数, 体积和温度都是恒定的)下进行. 这个系综也被称为等温等容系综或正则系综. 这个过程的需要的时间与体系的构成有关, 但在NVT系综中, 体系的温度应达到预期值并基本保持不变. 如果温度仍然没有稳定, 那就需要更多的时间. 通常情况下, 50 ps到100 ps就足够了,为了节约时间,我们仅进行2ps的NVT。
nvt.mdp的设置如下:

title       = OPLS  NVT equilibration
define      = -DPOSRES  ; position restrain the protein
; Run parameters
integrator  = md        ; leap-frog integrator
nsteps      = 1000     ; 2 * 1000 = 2 ps
dt          = 0.002     ; 2 fs
; Output control
nstxout     = 500       ; save coordinates every 1.0 ps
nstvout     = 500       ; save velocities every 1.0 ps
nstenergy   = 500       ; save energies every 1.0 ps
nstlog      = 500       ; update log file every 1.0 ps
; Bond parameters
continuation            = no        ; first dynamics run
constraint_algorithm    = lincs     ; holonomic constraints
constraints             = all-bonds ; all bonds (even heavy atom-H bonds) constrained
lincs_iter              = 1         ; accuracy of LINCS
lincs_order             = 4         ; also related to accuracy
; Neighborsearching
cutoff-scheme   = Verlet
ns_type         = grid      ; search neighboring grid cells
nstlist         = 10        ; 20 fs, largely irrelevant with Verlet
rcoulomb        = 1.0       ; short-range electrostatic cutoff (in nm)
rvdw            = 1.0       ; short-range van der Waals cutoff (in nm)
; Electrostatics
coulombtype     = PME   ; Particle Mesh Ewald for long-range electrostatics
pme_order       = 4     ; cubic interpolation
fourierspacing  = 0.16  ; grid spacing for FFT
; Temperature coupling is on
tcoupl      = V-rescale             ; modified Berendsen thermostat
tc-grps     = Protein Non-Protein   ; two coupling groups - more accurate
tau_t       = 0.1     0.1           ; time constant, in ps
ref_t       = 300     300           ; reference temperature, one for each group, in K
; Pressure coupling is off
pcoupl      = no        ; no pressure coupling in NVT
; Periodic boundary conditions
pbc     = xyz           ; 3-D PBC
; Dispersion correction
DispCorr    = EnerPres  ; account for cut-off vdW scheme
; Velocity generation
gen_vel     = yes       ; assign velocities from Maxwell distribution
gen_temp    = 300       ; temperature for Maxwell distribution
gen_seed    = -1        ; generate a random seed
#演示使用
file=open('nvt.mdp','w') file.write(""" title       = OPLS  NVT equilibration define      = -DPOSRES  ; position restrain the protein ; Run parameters integrator  = md        ; leap-frog integrator nsteps      = 1000     ; 2 * 1000 = 2 ps dt          = 0.002     ; 2 fs ; Output control nstxout     = 500       ; save coordinates every 1.0 ps nstvout     = 500       ; save velocities every 1.0 ps nstenergy   = 500       ; save energies every 1.0 ps nstlog      = 500       ; update log file every 1.0 ps ; Bond parameters continuation            = no        ; first dynamics run constraint_algorithm    = lincs     ; holonomic constraints constraints             = all-bonds ; all bonds (even heavy atom-H bonds) constrained lincs_iter              = 1         ; accuracy of LINCS lincs_order             = 4         ; also related to accuracy ; Neighborsearching cutoff-scheme   = Verlet ns_type         = grid      ; search neighboring grid cells nstlist         = 10        ; 20 fs, largely irrelevant with Verlet rcoulomb        = 1.0       ; short-range electrostatic cutoff (in nm) rvdw            = 1.0       ; short-range van der Waals cutoff (in nm) ; Electrostatics coulombtype     = PME   ; Particle Mesh Ewald for long-range electrostatics pme_order       = 4     ; cubic interpolation fourierspacing  = 0.16  ; grid spacing for FFT ; Temperature coupling is on tcoupl      = V-rescale             ; modified Berendsen thermostat tc-grps     = Protein Non-Protein   ; two coupling groups - more accurate tau_t       = 0.1     0.1           ; time constant, in ps ref_t       = 300     300           ; reference temperature, one for each group, in K ; Pressure coupling is off pcoupl      = no        ; no pressure coupling in NVT ; Periodic boundary conditions pbc     = xyz           ; 3-D PBC ; Dispersion correction DispCorr    = EnerPres  ; account for cut-off vdW scheme ; Velocity generation gen_vel     = yes       ; assign velocities from Maxwell distribution gen_temp    = 300       ; temperature for Maxwell distribution gen_seed    = -1        ; generate a random seed """) file.close()

接下来与能量最小化情况类似

out=!{"gmx grompp -f nvt.mdp -c em.gro -p topol.top -o nvt.tpr"}
out=!{"gmx mdrun -v -deffnm nvt"}

由于模拟的时间比较短,我一会儿就模拟好了
让我们来分析温度变化情况, 再次使用energy模块:

out=!{"gmx energy -f nvt.edr -o nvt.xvg<}
#演示使用
fig,ax=plt.subplots() plotXVG(ax,"nvt.xvg") ppl.legend(ax) plt.show()

由于这里nstxout记录值过大所以只有两个点,应该把值设置更小,从而产生如下结果

前一步的NVT平衡稳定了体系的温度. 在采集数据之前, 我们还需要稳定体系的压力(因此还包括密度). 压力平衡是在NPT系综下进行的, 其中粒子数, 压力和温度都保持不变. 这个系综也被称为等温等压系综, 最接近实验条件.
一般建议设置的模拟时长为1ns,但是由于时间关系,这里仅模拟10ps
npt.mdp设置如下:

title       = OPLS NPT equilibration
define      = -DPOSRES  ; position restrain the protein
; Run parameters
integrator  = md        ; leap-frog integrator
nsteps      = 5000     ; 2 * 5000 = 10 ps
dt          = 0.002     ; 2 fs
; Output control
nstxout     = 50       ; save coordinates every 0.1 ps
nstvout     = 50       ; save velocities every 0.1 ps
nstenergy   = 50       ; save energies every 0.1 ps
nstlog      = 500       ; update log file every 1.0 ps
; Bond parameters
continuation            = yes       ; Restarting after NVT
constraint_algorithm    = lincs     ; holonomic constraints
constraints             = all-bonds ; all bonds (even heavy atom-H bonds) constrained
lincs_iter              = 1         ; accuracy of LINCS
lincs_order             = 4         ; also related to accuracy
; Neighborsearching
cutoff-scheme   = Verlet
ns_type         = grid      ; search neighboring grid cells
nstlist         = 10        ; 20 fs, largely irrelevant with Verlet scheme
rcoulomb        = 1.0       ; short-range electrostatic cutoff (in nm)
rvdw            = 1.0       ; short-range van der Waals cutoff (in nm)
; Electrostatics
coulombtype     = PME       ; Particle Mesh Ewald for long-range electrostatics
pme_order       = 4         ; cubic interpolation
fourierspacing  = 0.16      ; grid spacing for FFT
; Temperature coupling is on
tcoupl      = V-rescale             ; modified Berendsen thermostat
tc-grps     = Protein Non-Protein   ; two coupling groups - more accurate
tau_t       = 0.1     0.1           ; time constant, in ps
ref_t       = 300     300           ; reference temperature, one for each group, in K
; Pressure coupling is on
pcoupl              = Parrinello-Rahman     ; Pressure coupling on in NPT
pcoupltype          = isotropic             ; uniform scaling of box vectors
tau_p               = 2.0                   ; time constant, in ps
ref_p               = 1.0                   ; reference pressure, in bar
compressibility     = 4.5e-5                ; isothermal compressibility of water, bar^-1
refcoord_scaling    = com
; Periodic boundary conditions
pbc     = xyz       ; 3-D PBC
; Dispersion correction
DispCorr    = EnerPres  ; account for cut-off vdW scheme
; Velocity generation
gen_vel     = no        ; Velocity generation is off
#演示使用
file=open('npt.mdp','w') file.write(""" title       = OPLS NPT equilibration define      = -DPOSRES  ; position restrain the protein ; Run parameters integrator  = md        ; leap-frog integrator nsteps      = 5000     ; 2 * 5000 = 10 ps dt          = 0.002     ; 2 fs ; Output control nstxout     = 50       ; save coordinates every 0.1 ps nstvout     = 50       ; save velocities every 0.1 ps nstenergy   = 50       ; save energies every 0.1 ps nstlog      = 500       ; update log file every 1.0 ps ; Bond parameters continuation            = yes       ; Restarting after NVT constraint_algorithm    = lincs     ; holonomic constraints constraints             = all-bonds ; all bonds (even heavy atom-H bonds) constrained lincs_iter              = 1         ; accuracy of LINCS lincs_order             = 4         ; also related to accuracy ; Neighborsearching cutoff-scheme   = Verlet ns_type         = grid      ; search neighboring grid cells nstlist         = 10        ; 20 fs, largely irrelevant with Verlet scheme rcoulomb        = 1.0       ; short-range electrostatic cutoff (in nm) rvdw            = 1.0       ; short-range van der Waals cutoff (in nm) ; Electrostatics coulombtype     = PME       ; Particle Mesh Ewald for long-range electrostatics pme_order       = 4         ; cubic interpolation fourierspacing  = 0.16      ; grid spacing for FFT ; Temperature coupling is on tcoupl      = V-rescale             ; modified Berendsen thermostat tc-grps     = Protein Non-Protein   ; two coupling groups - more accurate tau_t       = 0.1     0.1           ; time constant, in ps ref_t       = 300     300           ; reference temperature, one for each group, in K ; Pressure coupling is on pcoupl              = Parrinello-Rahman     ; Pressure coupling on in NPT pcoupltype          = isotropic             ; uniform scaling of box vectors tau_p               = 2.0                   ; time constant, in ps ref_p               = 1.0                   ; reference pressure, in bar compressibility     = 4.5e-5                ; isothermal compressibility of water, bar^-1 refcoord_scaling    = com ; Periodic boundary conditions pbc     = xyz       ; 3-D PBC ; Dispersion correction DispCorr    = EnerPres  ; account for cut-off vdW scheme ; Velocity generation gen_vel     = no        ; Velocity generation is off """) file.close()
out=!{"gmx grompp -f npt.mdp -c nvt.gro -t nvt.cpt -p topol.top -o npt.tpr"}
out=!{"gmx mdrun -v -deffnm npt"}

我们可以通过压力和密度检测来看模拟是否平衡

#压力
out=!{"gmx energy -f npt.edr -o pressure.xvg<}#密度out=!{"gmx energy -f npt.edr -o density.xvg<}
#演示使用
fig,ax=plt.subplots() plotXVG(ax,"pressure.xvg") ppl.legend(ax) plt.show()

#演示使用
fig,ax=plt.subplots() plotXVG(ax,"density.xvg") ppl.legend(ax) plt.show()

成品MD

随着两个平衡阶段的完成, 体系已经在需要的温度和压强下平衡好了. 我们现在可以放开位置限制并进行成品MD以收集数据了. 这个过程跟前面的类似. 运行grompp时, 我们还要用到检查点文件(在这种情况下,其中包含了压力耦合信息). 我们要进行一个1 ns的MD模拟, 所用的参数文件如下:

title       = OPLS MD simulation
; Run parameters
integrator  = md        ; leap-frog integrator
nsteps      = 500000    ; 2 * 500000 = 1000 ps (1 ns)
dt          = 0.002     ; 2 fs
; Output control
nstxout             = 5000      ; save coordinates every 10.0 ps
nstvout             = 5000      ; save velocities every 10.0 ps
nstenergy           = 5000      ; save energies every 10.0 ps
nstlog              = 5000      ; update log file every 10.0 ps
nstxout-compressed  = 5000      ; save compressed coordinates every 10.0 ps
                                ; nstxout-compressed replaces nstxtcout
compressed-x-grps   = System    ; replaces xtc-grps
; Bond parameters
continuation            = yes       ; Restarting after NPT
constraint_algorithm    = lincs     ; holonomic constraints
constraints             = all-bonds ; all bonds (even heavy atom-H bonds) constrained
lincs_iter              = 1         ; accuracy of LINCS
lincs_order             = 4         ; also related to accuracy
; Neighborsearching
cutoff-scheme   = Verlet
ns_type         = grid      ; search neighboring grid cells
nstlist         = 10        ; 20 fs, largely irrelevant with Verlet scheme
rcoulomb        = 1.0       ; short-range electrostatic cutoff (in nm)
rvdw            = 1.0       ; short-range van der Waals cutoff (in nm)
; Electrostatics
coulombtype     = PME       ; Particle Mesh Ewald for long-range electrostatics
pme_order       = 4         ; cubic interpolation
fourierspacing  = 0.16      ; grid spacing for FFT
; Temperature coupling is on
tcoupl      = V-rescale             ; modified Berendsen thermostat
tc-grps     = Protein Non-Protein   ; two coupling groups - more accurate
tau_t       = 0.1     0.1           ; time constant, in ps
ref_t       = 300     300           ; reference temperature, one for each group, in K
; Pressure coupling is on
pcoupl              = Parrinello-Rahman     ; Pressure coupling on in NPT
pcoupltype          = isotropic             ; uniform scaling of box vectors
tau_p               = 2.0                   ; time constant, in ps
ref_p               = 1.0                   ; reference pressure, in bar
compressibility     = 4.5e-5                ; isothermal compressibility of water, bar^-1
; Periodic boundary conditions
pbc     = xyz       ; 3-D PBC
; Dispersion correction
DispCorr    = EnerPres  ; account for cut-off vdW scheme
; Velocity generation
gen_vel     = no        ; Velocity generation is off
#演示使用
file=open('md.mdp','w') file.write(""" title       = OPLS Lysozyme MD simulation ; Run parameters integrator  = md        ; leap-frog integrator nsteps      = 500000    ; 2 * 500000 = 1000 ps (1 ns) dt          = 0.002     ; 2 fs ; Output control nstxout             = 5000      ; save coordinates every 10.0 ps nstvout             = 5000      ; save velocities every 10.0 ps nstenergy           = 5000      ; save energies every 10.0 ps nstlog              = 5000      ; update log file every 10.0 ps nstxout-compressed  = 5000      ; save compressed coordinates every 10.0 ps                                ; nstxout-compressed replaces nstxtcout compressed-x-grps   = System    ; replaces xtc-grps ; Bond parameters continuation            = yes       ; Restarting after NPT constraint_algorithm    = lincs     ; holonomic constraints constraints             = all-bonds ; all bonds (even heavy atom-H bonds) constrained lincs_iter              = 1         ; accuracy of LINCS lincs_order             = 4         ; also related to accuracy ; Neighborsearching cutoff-scheme   = Verlet ns_type         = grid      ; search neighboring grid cells nstlist         = 10        ; 20 fs, largely irrelevant with Verlet scheme rcoulomb        = 1.0       ; short-range electrostatic cutoff (in nm) rvdw            = 1.0       ; short-range van der Waals cutoff (in nm) ; Electrostatics coulombtype     = PME       ; Particle Mesh Ewald for long-range electrostatics pme_order       = 4         ; cubic interpolation fourierspacing  = 0.16      ; grid spacing for FFT ; Temperature coupling is on tcoupl      = V-rescale             ; modified Berendsen thermostat tc-grps     = Protein Non-Protein   ; two coupling groups - more accurate tau_t       = 0.1     0.1           ; time constant, in ps ref_t       = 300     300           ; reference temperature, one for each group, in K ; Pressure coupling is on pcoupl              = Parrinello-Rahman     ; Pressure coupling on in NPT pcoupltype          = isotropic             ; uniform scaling of box vectors tau_p               = 2.0                   ; time constant, in ps ref_p               = 1.0                   ; reference pressure, in bar compressibility     = 4.5e-5                ; isothermal compressibility of water, bar^-1 ; Periodic boundary conditions pbc     = xyz       ; 3-D PBC ; Dispersion correction DispCorr    = EnerPres  ; account for cut-off vdW scheme ; Velocity generation gen_vel     = no        ; Velocity generation is off """) file.close()
out=!{"gmx grompp -f md.mdp -c npt.gro -t npt.cpt -p topol.top -o md_0_1.tpr"}
!gmx mdrun -v -deffnm md_0_1

具体的分析方法与内容我们下次分享


更多原创精彩视频敬请关注生信杂谈: