Re: [AMBER-Developers] weird behavior for pmemd.cuda on Volta cards

From: Adrian Roitberg <roitberg.ufl.edu>
Date: Fri, 8 Dec 2017 12:33:28 -0500

Hi

We have not been able to reproduce this.

Delaram in my group just finished a test on our system.

I attach the output from the following script:

nvidia-smi

run jac regular

nvidia-smi

run jac long

nvidia-smi

run jac regular


As you can see, the timings were actually a little bit better for the
long run.

       [0] 1 x GPU: Note: The following floating-point exceptions are
signalling: IEEE_UNDERFLOW_FLAG IEEE_DENORMAL
|         ns/day =     927.94   seconds/ns =      93.11
       [0] 1 x GPU: Note: The following floating-point exceptions are
signalling: IEEE_UNDERFLOW_FLAG IEEE_DENORMAL
|         ns/day =     934.01   seconds/ns =      92.50
       [0] 1 x GPU: Note: The following floating-point exceptions are
signalling: IEEE_UNDERFLOW_FLAG IEEE_DENORMAL
|         ns/day =     929.69   seconds/ns =      92.93
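
For anyone who wants to compare their own runs the same way, here is a
minimal sketch that pulls the ns/day values out of timing lines like the
ones above and reports the change relative to the first run. The function
names (parse_timings, percent_change) and the run labels are only
illustrative and are not part of AMBER; the sample text is the three
timing lines quoted above.

```python
import re

# Matches timing lines of the form quoted above, e.g.
# "|         ns/day =     927.94   seconds/ns =      93.11"
TIMING_RE = re.compile(r"ns/day\s*=\s*([0-9.]+)\s+seconds/ns\s*=\s*([0-9.]+)")


def parse_timings(text):
    """Return a list of (ns_per_day, seconds_per_ns) tuples found in text."""
    return [(float(a), float(b)) for a, b in TIMING_RE.findall(text)]


def percent_change(baseline, value):
    """Relative change of value with respect to baseline, in percent."""
    return 100.0 * (value - baseline) / baseline


# The three timing lines from the attached output, in run order.
sample = """\
|         ns/day =     927.94   seconds/ns =      93.11
|         ns/day =     934.01   seconds/ns =      92.50
|         ns/day =     929.69   seconds/ns =      92.93
"""

runs = parse_timings(sample)
baseline = runs[0][0]
# Labels follow the script order: regular, long, regular.
for label, (ns_day, sec_ns) in zip(("regular", "long", "regular"), runs):
    change = percent_change(baseline, ns_day)
    print(f"{label:8s} {ns_day:8.2f} ns/day  {change:+.2f}% vs first run")
```

A slowdown of the kind Dave Case describes would show up here as a clearly
negative percentage for the long run; in our output the long run is
slightly positive.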

Dave, my guess is that the GPU temperature may be getting too high?
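
One way to check the temperature guess would be to log the GPU
temperature and SM clock while a long run is in progress. Below is a
rough sketch that wraps the nvidia-smi query interface
(--query-gpu with --format=csv,noheader,nounits). The helper names
(query_command, parse_sample, sample_gpu) and the choice of fields are my
own assumptions for illustration; nothing here has been run on the
machines discussed in this thread.

```python
import subprocess

# Fields requested from nvidia-smi; chosen here only for illustration.
FIELDS = ["timestamp", "temperature.gpu", "clocks.sm"]


def query_command(gpu_index=0):
    """Build the nvidia-smi command line for one sample from one GPU."""
    return [
        "nvidia-smi",
        f"--id={gpu_index}",
        "--query-gpu=" + ",".join(FIELDS),
        "--format=csv,noheader,nounits",
    ]


def parse_sample(line):
    """Parse one CSV line 'timestamp, temp, sm_clock' into a dict."""
    timestamp, temp_c, sm_mhz = [part.strip() for part in line.split(",")]
    return {
        "timestamp": timestamp,
        "temp_c": float(temp_c),   # degrees C
        "sm_mhz": float(sm_mhz),   # SM clock in MHz
    }


def sample_gpu(gpu_index=0):
    """Run nvidia-smi once and return the parsed sample (needs a GPU)."""
    result = subprocess.run(
        query_command(gpu_index),
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_sample(result.stdout.strip().splitlines()[0])
```

Calling sample_gpu() every few seconds during the 10x run and looking for
the SM clock dropping as the temperature climbs would tell us whether
thermal throttling is involved.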

Adrian


On 12/8/17 12:17 PM, David Cerutti wrote:
> OK that then confirms some odd things I had been seeing. With systems
> larger than JAC, and probably longer overall run times, I was also seeing
> dramatic performance decreases, to the point where our Volta was giving the
> performance of a GP100. It's good to know, then, that our Volta in Case
> lab is not unique (uniquely broken).
>
> Dave
>
>
> On Fri, Dec 8, 2017 at 9:11 AM, David A Case <david.case.rutgers.edu> wrote:
>
>> Hi folks:
>>
>> The few developers that have Volta cards have reported markedly different
>> speedups vs. Pascal for different benchmarks.
>>
>> I think these may be related to the following observation: jobs seem to
>> slow down the longer they run. You can check this on the
>> JAC_production_NVE_4fs benchmark: make nstlim ten times larger and re-run
>> (you can increase ntwx and ntwr if you like; it doesn't seem to make much
>> difference).
>>
>> For me, the default run (250000 steps) clocks at 923 ns/day (total time is
>> 95 sec.). This is in line with what others are getting.
>>
>> The 10x longer run returns 824 ns/day; if I also increase ntwx by a
>> factor of 10, I get up to 847 ns/day (total time of 0.28 hours).
>>
>> A 100x run kind of plateaus at 830 ns/day (total time of 2.9 hours).
>>
>> For larger systems, the difference between the "short run" timings (which
>> I suspect are typical of the official benchmarks) and "real" production
>> runs can be larger. For a 391000-atom system, 10000 steps (82 seconds)
>> runs at 51 ns/day, whereas 50000 steps (450 sec.) runs at 40 ns/day,
>> and 100000 steps (900 seconds) is at 39 ns/day. These are jobs with
>> ntwx=ntwr=100000, so there is no dumping of coordinates to disk, etc.
>>
>> So:
>>
>> 1. Be careful with benchmarks: the official JAC benchmark, at 250000 steps,
>> is not long enough for this platform. (!?) The same is probably true for
>> other benchmarks.
>>
>> 2. If we can figure out what is causing the slowdown, we might see a way to
>> get performance improvements in legacy mode.
>>
>> ...dac
>>
>>
>> _______________________________________________
>> AMBER-Developers mailing list
>> AMBER-Developers.ambermd.org
>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.ambermd.org_mailman_listinfo_amber-2Ddevelopers&d=DwICAg&c=pZJPUDQ3SB9JplYbifm4nt2lEVG5pWx2KikqINpWlZM&r=JAg-KQEjdZeg_E8PHDDoaw&m=YSYXelmKCIYhlT-zzctQXxjKrHFtu95JjRy3mgDfeuM&s=pgme-6nlkpAMGrSzfVQK4j1XJFCr2uet6dT8Xt28MRc&e=
>>

-- 
Dr. Adrian E. Roitberg
University of Florida Research Foundation Professor
Department of Chemistry
University of Florida
roitberg.ufl.edu
352-392-6972



_______________________________________________
AMBER-Developers mailing list
AMBER-Developers.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber-developers

Received on Fri Dec 08 2017 - 10:00:02 PST