Re: [AMBER-Developers] gpu-tachyon branch causing error on GPU

From: Bill Miller III <brmilleriii@gmail.com>
Date: Thu, 17 Oct 2013 14:02:11 -0400

Scott,

I just tested the updated code and it works perfectly now. Thanks!

-Bill


On Thu, Oct 17, 2013 at 12:56 PM, Scott Le Grand <varelse2005@gmail.com> wrote:

> Hey Bill,
> Sync to TOT (top of tree) and try again. I think I fixed it. Thanks for the clear repro!
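>
> (In case it helps, by "sync to TOT" I just mean the usual update and
> rebuild, roughly:
>
>   cd $AMBERHOME && git checkout gpu-tachyon && git pull
>
> and then recompile pmemd.cuda.)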
>
> Scott
>
>
>
> On Wed, Oct 16, 2013 at 6:53 PM, Bill Miller III <brmilleriii@gmail.com> wrote:
>
> > Hi,
> >
> > I have been trying to run some MD using hydrogen mass repartitioning
> > (hence the 4 fs time step below) with the gpu-tachyon branch of the Amber
> > git tree. I did a 'git pull' this morning and compiled pmemd(.MPI) and
> > pmemd.cuda(.MPI). When I attempt to run the simulation with the following
> > input:
> >
> > Explicit solvent molecular dynamics constant pressure 25 ns MD
> > &cntrl
> > imin=0, irest=1, ntx=5,
> > ntpr=25000, ntwx=25000, ntwr=-50000, nstlim=12500000,
> > dt=0.004, ntt=3, tempi=335,
> > temp0=335, gamma_ln=1.0, ig=-1,
> > ntp=1, ntc=2, ntf=2, cut=9,
> > ntb=2, iwrap=1, ioutfm=1,
> > /
> >
> > with pmemd.cuda, I get the following error:
> >
> > Nonbond cells need to be recalculated, restart simulation from previous
> > checkpoint
> > with a higher value for skinnb.
> >
> > To correct this error, I increased the value of skinnb to 3.0 in the mdin
> > file by adding an &ewald namelist:
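> >
> > &ewald
> >   skinnb=3.0,
> > /
> >
> > (That is the whole namelist I added; nothing else in the mdin changed.)
> > With this change the first error goes away, but I get a different error: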
> >
> > cudaMemcpy GpuBuffer::Download failed unspecified launch failure
> >
> > and the simulation dies after only a few steps (~6 steps). The
> > temperature, pressure, and energies increase substantially after the
> > first step. I have shown the energies below for the first three steps:
> >
> > NSTEP =     1  TIME(PS) = 455500.004  TEMP(K) =    376.45  PRESS =    -171.4
> > Etot   =   -176203.3645  EKtot   =     61495.2500  EPtot      =   -237698.6145
> > BOND   =      2194.3161  ANGLE   =      5722.4810  DIHED      =      6928.1889
> > 1-4 NB =      2392.0336  1-4 EEL =     19312.4710  VDWAALS    =     23856.6106
> > EELEC  =   -298104.7157  EHBOND  =         0.0000  RESTRAINT  =         0.0000
> > EKCMT  =     23351.5596  VIRIAL  =     26376.5669  VOLUME     =    817600.0187
> >                                                    Density    =         0.9977
> > ----------------------------------------------------------------------------
> >
> > NSTEP =     2  TIME(PS) = 455500.008  TEMP(K) =       NaN  PRESS =  -52155.0
> > Etot   =            NaN  EKtot   =            NaN  EPtot      = 601744736.3088
> > BOND   =      2395.5598  ANGLE   =      7383.8134  DIHED      =      7021.7748
> > 1-4 NB =      2444.0931  1-4 EEL =     19301.1203  VDWAALS    = 601920908.4454
> > EELEC  =   -214718.4980  EHBOND  =         0.0000  RESTRAINT  =         0.0000
> > EKCMT  =     23537.8979  VIRIAL  =    944202.1688  VOLUME     =    817574.8784
> >                                                    Density    =         0.9977
> > ----------------------------------------------------------------------------
> >
> > NSTEP =     3  TIME(PS) = 455500.012  TEMP(K) =       NaN  PRESS =  622386.7
> > Etot   =            NaN  EKtot   =            NaN  EPtot      = **************
> > BOND   = **************  ANGLE   =    713224.4557  DIHED      =     20586.1594
> > 1-4 NB = **************  1-4 EEL =     12529.4645  VDWAALS    = **************
> > EELEC  =   -275698.1771  EHBOND  =         0.0000  RESTRAINT  =         0.0000
> > EKCMT  =  12582912.0000  VIRIAL  =   1698479.3771  VOLUME     =    809967.6470
> >                                                    Density    =         1.0071
> > ----------------------------------------------------------------------------
> >
> > And when I visualize the simulation, the atoms go everywhere after the
> > initial frame.
> >
> > However, when I run the same inputs (with and without increasing skinnb)
> > with the CPU code (pmemd.MPI), I do not get any of these errors and the
> > simulation appears to run smoothly (the energies and dynamics appear
> > normal).
> >
> > The last commit for the gpu-tachyon branch before I compiled was commit
> > f04c58935955826a24e5a534b9ea3446a44fbb87.
> >
> > Previously, I got this simulation to work using pmemd.cuda from the
> > gpu-tachyon branch back in August. The last commit in that build was:
> >
> > commit a0a4f71de7595c70fb0014d46f412fdfb767a134
> > Author: scott legrand <slegrand@amber.(none)>
> > Date: Wed Aug 14 16:37:20 2013 -0700
> >
> > Fix for Bug 210
> >
> > It appears as if something was recently added to the gpu-tachyon branch
> > that broke pmemd.cuda's ability to run this simulation.
> >
> > I have placed the prmtop and restart file used to run this simulation on
> > Dropbox, if anyone wants to try to reproduce the errors.
> >
> > prmtop: https://www.dropbox.com/s/aqvgskaojbgbhwv/test.prmtop
> > restart: https://www.dropbox.com/s/f2tpv8pxovy6usz/test.rst7
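> >
> > I launch the GPU run with an ordinary pmemd.cuda command line, along the
> > lines of (output file names arbitrary):
> >
> > pmemd.cuda -O -i mdin -o mdout -p test.prmtop -c test.rst7 -r restrt -x mdcrd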
> >
> > I am running these tests on a Linux workstation with Red Hat 6.4, twelve
> > 2.10 GHz Intel Xeon processors, and a GTX 780 GPU, using NVIDIA driver
> > version 325.15.
> >
> > Let me know if you have any questions about any of the details.
> >
> > Thanks,
> > Bill
> >
> > --
> > Bill Miller III
> > Post-doc
> > University of Richmond
> > 417-549-0952
>



-- 
Bill Miller III
Post-doc
University of Richmond
417-549-0952
_______________________________________________
AMBER-Developers mailing list
AMBER-Developers@ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber-developers
Received on Thu Oct 17 2013 - 11:30:03 PDT