Re: [AMBER-Developers] Oops, I introduced a bug into pmemd.cuda_SPFP (master branch), but how shall I resolve it? from Jason Swails on 2017-10-25 (Amber Developers Archive Oct 2017)

From: Jason Swails <jason.swails.gmail.com>
Date: Wed, 25 Oct 2017 19:00:34 -0400

On Wed, Oct 25, 2017 at 4:16 PM, David Cerutti <dscerutti.gmail.com> wrote:

> It's possible the non-equilibrium MD would also be affected--I've fixed the
> problem in my branch and am preparing the hotfix in master. It's kind of
> remarkable that the damage from this was only one or two features of the
> code, and features that I had not directly dealt with at all. In any case,
> I'm also looking into what we can do to remedy the problem of so many false
> positives for failures in the SPFP test suite. I ran both test suites
> before committing--what happened is that there is a baseline of 40+ test
> cases that "fail" in the SPFP mode, and in skimming those diffs I didn't
> see the problem at first. It was only after doing about a half hour of
> parsing through all diffs one by one, when looking for changes to dihedral
> energies for another edit I've been doing, that I noticed the error in
> constant pH simulations. It's reasonable to ask for that level of review
> before commits to master, but on the other hand shouldn't we be able to
> clamp down on the number of false positives?
>

The sheer number of false positives is a substantial hindrance to
effective quality control and stands in the way of the tests actually being
useful. If you find yourself searching for a twisted needle in a
needlestack when looking through test failures, there's something seriously
wrong -- it's only a little better than having no tests at all.

There have been a number of ideas in the past with regard to "fixing" the
test cases: use the deterministic CPU PRNG, which doesn't give different
results for each card, for stochastic tests, and tweak either the length
of the test or the test system so that results don't diverge over the
time span of the test. The importance of this has also been stressed
repeatedly in the past, but nobody with access to the "gold-standard" machine
(where all the tests nominally *do* pass, if such a thing even exists
anymore) prioritized it highly enough to actually get it done.
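
To make the first idea concrete, here is a minimal Python sketch of a
stochastic test driven by a seeded, deterministic CPU generator. Everything
in it is hypothetical: the toy 1-D Langevin integrator, its parameters, and
the function name are invented for illustration and are not taken from pmemd
or the Amber test suite. The only point is that the random stream depends on
the seed alone, so a rerun with the same seed reproduces the reference exactly.

```python
import math
import random


def langevin_trajectory(seed, n_steps=50, dt=0.01, gamma=1.0, kT=1.0):
    """Toy 1-D Langevin dynamics in a harmonic well (hypothetical test system)."""
    # Seeded CPU generator: the random stream depends only on the seed,
    # not on which device (if any) does the rest of the arithmetic.
    rng = random.Random(seed)
    x, v = 1.0, 0.0
    sigma = math.sqrt(2.0 * gamma * kT * dt)  # noise amplitude per step
    positions = []
    for _ in range(n_steps):
        force = -x  # harmonic restoring force with unit spring constant
        v += (force - gamma * v) * dt + sigma * rng.gauss(0.0, 1.0)
        x += v * dt
        positions.append(x)
    return positions


reference = langevin_trajectory(seed=2017)
rerun = langevin_trajectory(seed=2017)
other = langevin_trajectory(seed=2018)

same_seed_matches = reference == rerun        # identical stream, identical result
different_seed_matches = reference == other   # different stream, different result
```

Keeping n_steps small is the second idea in miniature: the shorter the run,
the less room rounding differences have to grow into a visible divergence.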

All the best,
Jason

-- 
Jason M. Swails
_______________________________________________
AMBER-Developers mailing list
AMBER-Developers.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber-developers

Received on Wed Oct 25 2017 - 16:30:03 PDT