Re: [AMBER-Developers] Please read: we need to look at pmemd.cuda pair list fidelity

From: Daniel Roe <>
Date: Wed, 17 Jan 2018 10:42:57 -0500


I'm still not sure I understand what the problem is. Right now to me
it sounds like what we are missing is a bad input trap here (you
should never be setting the NB skin too small etc).

That said, If I take the parameters you gave in your last example
(cutoff 7.0 Ang., nonbond skin 0.5001, truncated octahedron with
lengths = 33.81) and use sander's pairlist setup routine I get 11
cells in each dimension with individual cell separations of 2.51 Ang..
With the hardcoded cell search dimension of 3 (at least in sander, not
sure about pmemd) that means the smallest so-called subcell cutoff is
7.53 Ang, which is larger than cut+skinnb, so no interactions should
be omitted. Maybe I'm missing something though...

> A lot of study, and posts to a subset of the developers, happened before I made this thread on the actual listserv.

You've clearly done a lot of work on this which is great. However,
considering how critical building the pairlist is, we should be
absolutely certain what the issue is before we go changing test cases
etc. Could you send me off-list one or two of the problem systems so I
can run some tests as well? I feel like the more people we throw at
this issue the quicker it can get fixed.



On Wed, Jan 17, 2018 at 8:25 AM, Ross Walker <> wrote:
> Hi Dave,
> This is pretty dense so I didn't follow everything. Do you have a simple list of tests that show issues / a 3 line summary of all the issues and the scope of each?
> With regards to the very small systems - these have always been problematical and would not be recommended.
> That said the code should quit rather than give incorrect answers if one is trying to use a box that is unreasonably small.
> Should the limit not be the minimum box dimension needs to be greater than 2 x (cutoff + skinnb)? I believe the default
> would be 2.0 for skinnb so in this case 20A making your box too small?
> What happens if you run with a 21A box? Do you see those jumps go away? If yes then we may just need a fix here to
> limit such tiny boxes. I am assuming the jumps you are referring to are for small systems only?
> All the best
> Ross
>> On Jan 16, 2018, at 22:45, David Cerutti <> wrote:
>> A lot of study, and posts to a subset of the developers, happened before I
>> made this thread on the actual listserv. I was able to make a case, for
>> example, where the box volume equilibrates under NPT and we have the
>> problem of shrinking hash cells that eventually get too small for the
>> cutoff. The problem was fixed for the case of rectangular unit cells, but
>> the non-orthorhombic case isn't correctly covered. The simulation settles
>> out to a volume where each unit cell is in fact less than the cutoff in
>> thickness, irrespective of any pair list margin, but the code thinks it's
>> much thicker than therefore keeps right on going. I'll make an NVE case to
>> study the energy conservation effects when one of these things happens.
>> In the meantime, I've been looking in to the other issue I raised.
>> Apologies--I had kept the first issue off of the listserv until I had a
>> better idea of what I was seeing, but I fired off the second issue without
>> as much discussion. Nonetheless, it seems that there is something going
>> on, although it may not quite be what I had thought. The attached plot
>> shows the problem. I constructed a cubic box containing 216 TIP3P
>> waters--familiar to anyone who's ever tiled such a thing using TIP3PBOX in
>> tleap. It's about 18.6A on a side--just big enough to handle an 8A cutoff
>> with a 1A margin. Equilibrating and then simulating that in the NVE
>> ensemble, I see the energy increase as shown on the attached image. Red
>> line = old code. Black line = new code. I need to run it further to
>> verify that the new code isn't any more prone to this than the old code,
>> but it doesn't appear that may changes to the precision model are doing
>> anything detrimental. Furthermore, the occurrence and the size of the
>> jumps are random. Something is weird. I've tried various permutations on
>> this--pushing the cutoff up to 8.7A with a 0.6A skin, dialing up the
>> reciprocal space settings and cutting the time step down to 0.5fs (from
>> 1fs) to remove as much noise as possible. I have still seen one of these
>> jumps in that situation, but it's too early as of yet to say whether the
>> trend has been affected. If I switch to double precision, I can only go
>> 1/10th as fast, and I haven't YET seen a jump, but I'm only out to 28ns so
>> it might. Some of you may notice that, in between jumps, there is a slight
>> DOWNWARD trend to the energy. If the energy isn't flat, errors of any sort
>> should drive the energy UP, we thought, but there may be something about
>> the fixed precision rounding that imparts an overall tendency to lose
>> energy at a very slow rate (not sure on that yet, but the effect is real).
>> I tried making a larger box, one which will have three hash cells in any
>> direction, and so far I haven't seen any jumps in the energy but there is
>> still that slight downward pressure on the overall energy. If I then take
>> that larger box and increase the cutoff, so that there are now only 2x2x2
>> hash cells once again (but each o them is highly oversized), I am not (yet)
>> seeing any jumps of this sort, but that kind of jibes with my original
>> hypothesis given what I know about how the pairlist works. You're going to
>> see this problem when the box is really small (< 20 thick) and only two
>> hash cells thick.
>> By tomorrow I'll hopefully have more investigation of the first problem I
>> brought up. For now, it looks like I wasn't crying wolf over the second
>> one.
>> Dave
>> On Tue, Jan 16, 2018 at 10:57 AM, Daniel Roe <> wrote:
>>> Hi,
>>> On Mon, Jan 15, 2018 at 10:11 PM, David Cerutti <>
>>> wrote:
>>>> What's happening is that, for non-orthogonal unit cells (and the
>>>> only case the tleap really knows how to construct is the truncated
>>>> octahedron, translating it into a parallelepiped with all angles at
>>>> 109.47), sander and pmemd do not know how to properly gauge the distance
>>>> between the box faces, and therefore the number of hash cells to create
>>>> when building their pair lists.
>>> I don't think you are right about this. Sander (and cpptraj and I
>>> think pmemd as well) determines the number of grid cells in each
>>> dimension using the length of the *reciprocal* cell vectors, which is
>>> the inverse of the fractional coordinate system. Because this
>>> determination is done in reciprocal (fractional) space, it doesn't
>>> matter if the unit cell is orthorhombic or not. All of the "gridding"
>>> during the pair list build is also done in fractional space, so I
>>> think the pair list builds are fine. Maybe I'm not understanding what
>>> you're saying the problem is though. Do you have a specific test case
>>> where the pair list is not properly set up (i.e. an atom that is
>>> within the cutoff of another atom is skipped)?
>>> -Dan
>>> --
>>> -------------------------
>>> Daniel R. Roe
>>> Laboratory of Computational Biology
>>> National Institutes of Health, NHLBI
>>> 5635 Fishers Ln, Rm T900
>>> Rockville MD, 20852
>>> _______________________________________________
>>> AMBER-Developers mailing list
>> <EnergyGain.tif>_______________________________________________
>> AMBER-Developers mailing list
> _______________________________________________
> AMBER-Developers mailing list

Daniel R. Roe
Laboratory of Computational Biology
National Institutes of Health, NHLBI
5635 Fishers Ln, Rm T900
Rockville MD, 20852
AMBER-Developers mailing list
Received on Wed Jan 17 2018 - 08:00:02 PST
Custom Search