Re: [AMBER-Developers] Parallel Test failures with CUDA 5.5

From: Ross Walker <ross.rosswalker.co.uk>
Date: Wed, 05 Feb 2014 16:02:43 -0800

I've only ever tested CUDA 5.0, since 5.5 costs at least 10% in
performance, so we never bother to use it. If this works fine with 5.0 but
fails with 5.5, though, then 5.0 may be hiding an underlying bug.

I'll try it now.
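
In the meantime, a quick way to see which energy terms go bad first is to
scan the mdout (or .dif) lines for fields that are NaN, infinite, or
overflowed to asterisks. The following is only a rough sketch and not part
of the AMBER test suite: the function names are made up for illustration,
and it assumes the "LABEL = value" layout seen in the quoted output.

```python
import math


def parse_fields(line):
    """Split an mdout-style line into {label: raw value string}.

    Assumes the 'LABEL = value' layout; a leading diff marker
    ('<' or '>') is dropped. Hypothetical helper, not an AMBER API.
    """
    tokens = line.lstrip("<> ").replace("=", " = ").split()
    fields = {}
    label = []
    i = 0
    while i < len(tokens):
        if tokens[i] == "=" and i + 1 < len(tokens):
            # Everything collected since the last value is the label.
            fields[" ".join(label)] = tokens[i + 1]
            label = []
            i += 2
        else:
            label.append(tokens[i])
            i += 1
    return fields


def is_bad(value):
    """True for an all-asterisk (overflowed) field, NaN, or infinity."""
    if value and set(value) == {"*"}:
        return True
    try:
        return not math.isfinite(float(value))
    except ValueError:
        # Not a number at all (e.g. stray text); not flagged here.
        return False


def bad_fields(line):
    """Return the labels on this line whose values are not finite."""
    return [name for name, value in parse_fields(line).items() if is_bad(value)]


if __name__ == "__main__":
    sample = "> 1-4 NB = ************** 1-4 EEL = ************** VDWAALS = NaN"
    print(bad_fields(sample))
```

Running this over each line of an mdout file and printing the first NSTEP
block with a non-empty result should show where the run starts to diverge.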



On 2/5/14, 2:33 PM, "Daniel Roe" <daniel.r.roe.gmail.com> wrote:

>Hi All,
>
>Has anyone seen really egregious test failures using pmemd.cuda.MPI/cuda5.5
>compiled from the GIT tree (updated today)? I'm getting some insane
>differences and '***' in energy fields (see below for an example, full test
>diffs attached). I do not see this problem with pmemd.cuda/cuda5.5 or
>pmemd.cuda.MPI/cuda5.0 (those diffs are attached as well and seem OK). This
>was compiled using GNU 4.8.2 compilers.
>
>Not sure if this means anything, but most of the failures seem to be with
>PME; the only GB stuff that fails is AMD-related.
>
>Any ideas?
>
>-Dan
>
>---------------------------------------
>possible FAILURE: check mdout.tip4pew_box_npt.dif
>/mnt/b/projects/sciteam/jn6/GIT/amber-gnu/test/cuda/tip4pew
>96c96
>< NSTEP = 1 TIME(PS) = 0.002 TEMP(K) = 122.92 PRESS = 42.6
>> NSTEP = 1 TIME(PS) = 0.002 TEMP(K) = 128.19 PRESS = 43.5
><snip>
>426c426
>< NSTEP = 40 TIME(PS) = 0.080 TEMP(K) = 38.69 PRESS = 659.4
>> NSTEP = 40 TIME(PS) = 0.080 TEMP(K) = NaN PRESS = NaN
>427c427
>< Etot = 18.6535 EKtot = 231.6979 EPtot = 240.1483
>> Etot = NaN EKtot = NaN EPtot = NaN
>428c428
>< BOND = 0.6316 ANGLE = 1.2182 DIHED = 0.3663
>> BOND = ************** ANGLE = 361.5186 DIHED = 5.4026
>429c429
>< 1-4 NB = 0.8032 1-4 EEL = 1.3688 VDWAALS = 100.3454
>> 1-4 NB = ************** 1-4 EEL = ************** VDWAALS = NaN
>430c430
>< EELEC = 222.4484 EHBOND = 0. RESTRAINT = 0.
>> EELEC = NaN EHBOND = 0. RESTRAINT = 0.
>431c431
>< EKCMT = 131.0089 VIRIAL = 699.4621 VOLUME = 192.3578
>> EKCMT = 1278.0524 VIRIAL = NaN VOLUME = NaN
>432c432
>< Density = 0.0030
>> Density = NaN
>### Maximum absolute error in matching lines = 2.38e+04 at line 385 field 3
>### Maximum relative error in matching lines = 1.55e+01 at line 257 field 3
>
>--
>-------------------------
>Daniel R. Roe, PhD
>Department of Medicinal Chemistry
>University of Utah
>30 South 2000 East, Room 201
>Salt Lake City, UT 84112-5820
>http://home.chpc.utah.edu/~cheatham/
>(801) 587-9652
>(801) 585-6208 (Fax)
>_______________________________________________
>AMBER-Developers mailing list
>AMBER-Developers.ambermd.org
>http://lists.ambermd.org/mailman/listinfo/amber-developers



Received on Wed Feb 05 2014 - 16:30:02 PST