Re: [AMBER-Developers] Proposal for a new git branch

From: Jason Swails <jason.swails.gmail.com>
Date: Sat, 12 Nov 2011 17:49:43 -0500

On Sat, Nov 12, 2011 at 4:51 PM, Ross Walker <ross.rosswalker.co.uk> wrote:

> Look at the ConstantPH tests for example. These are broken now and we have
> no idea when they started breaking and why. Instead we just get them
> disabled... Fail!
>

To clarify, only 1 of the 3 tests fails, and in a subtle way (the diffs make
the failure seem far worse than it really is). The core implementation works;
it's my experimental method that isn't ready yet (which is really why I
disabled the test). Since we're between releases, I don't think this is a
suitable poster child for this issue. (The existing functionality always
passed, and I fixed the error in H-REMD after you caught it on a platform I
hadn't tested on.)


> Same goes for the MMPBSA test results in AMBER 11 + AmberTools 1.5. Why do
> the test cases fail? - Is this rationalized? - simply a different random
> number stream? Or did a bunch of bugs get fixed in the new MMPBSA?


The PBSA solver changed between AmberTools 1.4 and AmberTools 1.5. As I
recall, the executive decision was made to *not* update the PBSA-based tests
in the Amber 11 directory, given the added weight of those changes; that is
why the mm_pbsa.pl tests fail (mm_pbsa.pl is part of Amber 11). MMPBSA.py
should be passing, last I knew, since it's released as part of AmberTools 1.5.

The changes in the MMPBSA.py test cases are rationalized, yes.

> As an end user this is VERY alarming (forget the message at the end of
> AT15_Amber.py - nobody reads that, and it doesn't matter if you have it in
> day glo green neon, it still won't get read).


At some point, failure to read becomes PEBKAC. More to the point, the above
is caused by the staggered release, so it doesn't quite apply to development
versions, IMO. It's also one of the (many) known weaknesses of the staggered
release, and one that is sure to be addressed. All tests passed in the
developer tree prior to either release; it was only the unnatural union of
the two that gave problems.

I wouldn't go so far as to say that AmberTools 1.5 + Amber 11 was a failure,
because we learned a lot about what *not* to do in the future, but it's
close. We all already know that we should *not* use the same approach again
(for many reasons), but I think we can lay the AT15_Amber11 beating to rest
-- the horse has long been dead.

Moreover, I think this issue is actually more easily fixed with a better,
more general mechanism for applying bug fixes (which I now think we have).

(As an interesting observation, these 3 examples do sum up the bulk of my
contributions to Amber over the past couple of years :-P)

> No, the complete opposite, as Andreas realized the other day. You can test
> your code at least works on your system (I encourage people not to get
> totally lazy) and then check it in and within an hour you will have it
> tested on various different variants. Then you can go and fix anything you
> inadvertently broke.
>

If we see CruiseControl as a helpful tool that automates our testing (on
multiple platforms) rather than as a strict babysitter looking for any reason
to shame and punish us, then I'll definitely agree with Ross here.

As an aside, IMO we need some kind of mechanism to guard against a hung
program on CruiseControl. What happens if some race condition leaves the
processes waiting at a barrier that will never be satisfied? I can attest
that this is very easy to accomplish with MPI.
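
To make that concrete, here is the kind of watchdog I have in mind -- just a
sketch in Python, where the timeout value, script name, and example command
are all made up for illustration (this is not anything CruiseControl actually
provides): run each test command under a wall-clock limit, so a deadlocked
barrier shows up as an ordinary failure instead of stalling the nightly run.

    #!/usr/bin/env python3
    # Hypothetical watchdog sketch (not CruiseControl's real mechanism):
    # run a single test command under a wall-clock limit so that a hung
    # MPI job fails the build instead of stalling it forever.
    import subprocess
    import sys

    TIMEOUT_SECONDS = 30 * 60  # assumed per-test limit; tune to the suite

    def run_with_timeout(cmd):
        """Run cmd; kill it and report a failure if it exceeds the limit."""
        try:
            return subprocess.run(cmd, timeout=TIMEOUT_SECONDS).returncode
        except subprocess.TimeoutExpired:
            # A barrier that is never satisfied ends up here instead of
            # hanging the nightly run indefinitely.
            sys.stderr.write("TIMEOUT after %ds: %s\n"
                             % (TIMEOUT_SECONDS, " ".join(cmd)))
            return 1

    if __name__ == "__main__":
        # Usage (illustrative): watchdog.py mpirun -np 2 ./sander.MPI -O -i mdin
        sys.exit(run_with_timeout(sys.argv[1:]))

Killing the launcher (mpirun above) normally tears down all the ranks with
it, so even a collective call that can never complete gets reported as a
plain test failure and the rest of the suite keeps going.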

All the best,
Jason

-- 
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Candidate
352-392-4032
_______________________________________________
AMBER-Developers mailing list
AMBER-Developers.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber-developers
Received on Sat Nov 12 2011 - 15:00:03 PST