Re: [AMBER-Developers] Oops, I introduced a bug into pmemd.cuda_SPFP (master branch), but how shall I resolve it?

From: David Cerutti <dscerutti.gmail.com>
Date: Thu, 26 Oct 2017 15:26:12 -0400

Thanks all for the helpful comments--I will use Dan's suggestion for SPFP
error checking until a longer-term solution is found.

But I'd like to get back to the main point of the thread. I introduced an
error and later caught it. I have resolved it in favor of the explicit
type interpretation that I indicated in my first message: there's an array
of PMEFloat2, but half of what it stores are integers. A further
complication is that PMEFloat means fp64 in DPFP and fp32 in SPFP or SPXP.
Whenever those integers enter the array, they do so via __uint_as_float()
(SPFP) or __longlong_as_double() (DPFP), and when they are pulled out, the
functions __float_as_uint() and __double_as_longlong() must be used. This
adds a bit to the code, like this (qljid is a PMEFloat2 pulled from the
array):

#ifdef use_DPFP
  // qljid.y carries the bit pattern of a 64-bit integer stored in a double
  atom_ljid = __double_as_longlong(qljid.y);
#else
  // qljid.y carries the bit pattern of a 32-bit unsigned integer stored in a float
  atom_ljid = __float_as_uint(qljid.y);
#endif

It would be fewer lines to simply write atom_ljid = qljid.y, but that is
rather casual and does not emphasize that the stored value is really an
integer used to index into another array. The explicit reinterpretation
becomes critical if one ever wants to store fp32 values in an integer
array. I'm trying to look forward a few years, when there will be more
people actively developing the CUDA code. Are people OK with this style?
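
To spell the whole round trip out for future readers, here is a rough,
self-contained sketch of the pattern. The typedefs, the PMEIndex name, and
the pack/unpack helpers are only illustrative (the real definitions live in
the pmemd CUDA headers); what comes straight from the code above are
PMEFloat, PMEFloat2, use_DPFP, and the reinterpretation intrinsics.

// Illustrative typedefs; the real ones are in the pmemd CUDA headers.
#ifdef use_DPFP
typedef double        PMEFloat;
typedef double2       PMEFloat2;
typedef long long int PMEIndex;   // hypothetical name for the index type
#else
typedef float         PMEFloat;
typedef float2        PMEFloat2;
typedef unsigned int  PMEIndex;   // hypothetical name for the index type
#endif

// Store side: pack a charge and an LJ type index into one PMEFloat2.
// The index goes in by bit reinterpretation, never by value conversion.
__device__ PMEFloat2 PackQLJID(PMEFloat charge, PMEIndex ljid)
{
  PMEFloat2 qljid;
  qljid.x = charge;
#ifdef use_DPFP
  qljid.y = __longlong_as_double(ljid);
#else
  qljid.y = __uint_as_float(ljid);
#endif
  return qljid;
}

// Load side: recover the index exactly, mirroring the snippet above.
__device__ PMEIndex UnpackLJID(PMEFloat2 qljid)
{
#ifdef use_DPFP
  return __double_as_longlong(qljid.y);
#else
  return __float_as_uint(qljid.y);
#endif
}

The point is that the index never goes through a value conversion, only a
bit reinterpretation, so it survives the trip through the floating-point
array exactly in both precision modes.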

Dave


On Thu, Oct 26, 2017 at 8:49 AM, Daniel Roe <daniel.r.roe.gmail.com> wrote:

> On Wed, Oct 25, 2017 at 8:58 PM, Ross Walker <ross.rosswalker.co.uk>
> wrote:
> >
> > However, that does not mean one can avoid such testing when modifying
> > the code, and certainly one can't just rely on our test suite. You have
> > to sweat blood doing the validation, I'm afraid. I never found a good
> > fix for SPFP with dacdif. The test case format really just doesn't lend
> > itself well to that situation. In the end, after any major modification
> > to the GPU code I would always run the full test suite with SPFP and
> > DPFP with dacdif set to save all the output, and then I would go
> > through and compare things by hand myself. In particular, every couple
> > of weeks I would do a full comparison of the latest SPFP output against
> > the CPU output by hand to make sure everything looked good, as well as
> > repeating the long-timescale validation tests I used in the GPU papers
> > to check for energy convergence. Once satisfied everything was good, I
> > would create a gold set of save files for each of the GPU models I had
> > in my machine (at least one from each major generation; typically I
> > would test on between 4 and 6 different models), and then, because of
> > the deterministic nature of the GPU code, I could rerun the test suite
> > and get perfect comparisons against that gold standard. For the vast
> > majority of work the answers wouldn't change, so I could just blast it
> > through my gold-standard test set. If something did change, I knew it
> > was either a bug, a rounding difference from changing the order of
> > operations in the code, or a consequence of modifying the way the
> > random number generator worked. I would then conduct a careful by-hand
> > comparison again.
>
> Could you upload these gold standards to the GIT master branch? They
> can be excluded from the Amber tarball but I think it would be great
> for devs to have access to them. This way we can at least compare
> against a GPU generation that's closer to something you have. For
> example, it looks like the majority of the current SPFP test outputs
> were generated with a GeForce GTX TITAN X. If I run the tests with a
> K20m the largest diff I get is 5.86e-01 kcal/mol absolute error,
> whereas if I run with a TITAN Xp the largest diff is 3.09e-01, and I
> end up with fewer diffs. So that's one way to cut down on the false
> positives.
>
> > Ultimately, for every hour of coding work I did I probably spent
> > upwards of 2 hours doing validation. Validation is unfortunately not
> > sexy, slows down coding, and ultimately sucks, but this is arguably the
> > price we pay for developing scientific software that thousands of
> > people will use and rely on the results from. Given that our scientific
> > reputations are on the line for the code we write, it is very important
> > to take the time to do this validation.
>
> Agreed, validation is extremely important. Adding CI testing with
> several generations of GPUs can only improve our QC.
>
> -Dan
>
> >
> > My 0.02 btc,
> >
> > All the best
> > Ross
> >
> >
> >> On Oct 25, 2017, at 18:50, Jason Swails <jason.swails.gmail.com> wrote:
> >>
> >> On Wed, Oct 25, 2017 at 6:48 PM, Hai Nguyen <nhai.qn.gmail.com> wrote:
> >>>
> >>>
> >>> Just a side question: if 99% of the time will be SPFP, why do we
> >>> make DPFP the default for testing?
> >>>
> >>
> >> Different kind of testing. DPFP can be compared directly to CPU
> >> results; no other precision really can. But that should certainly not
> >> be a replacement for SPFP testing, which I think was your point.
> >>
> >> --
> >> Jason M. Swails
>
>
>
> --
> -------------------------
> Daniel R. Roe
> Laboratory of Computational Biology
> National Institutes of Health, NHLBI
> 5635 Fishers Ln, Rm T900
> Rockville MD, 20852
> https://www.lobos.nih.gov/lcb
>
>
_______________________________________________
AMBER-Developers mailing list
AMBER-Developers.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber-developers
Received on Thu Oct 26 2017 - 12:30:02 PDT