Re: [AMBER-Developers] On the precision of SPFP in pmemd.cuda

From: David Cerutti <dscerutti.gmail.com>
Date: Thu, 8 Jun 2017 16:50:26 -0400

The numbers are for a single frame, a good metric thanks to deterministic
code.

On Jun 8, 2017 4:40 PM, "B. Lachele Foley" <lfoley.ccrc.uga.edu> wrote:

> Thanks!
>
>
> Next question: the original error numbers you posted, were those for a
> single frame? If not, over how many steps did they accumulate?
>
>
> :-) Lachele
>
> Dr. B. Lachele Foley
> Associate Research Scientist
> Complex Carbohydrate Research Center
> The University of Georgia
> Athens, GA USA
> lfoley.uga.edu
> http://glycam.org
>
>
> ________________________________
> From: David Cerutti <dscerutti.gmail.com>
> Sent: Monday, June 5, 2017 1:28 PM
> To: AMBER Developers Mailing List
> Subject: Re: [AMBER-Developers] On the precision of SPFP in pmemd.cuda
>
> It would have been more realistic to represent the coordinates by assigning
> them to mock-up hash cells, casting them to floats and then back to
> doubles, then taking them back to a unified coordinate representation and
> writing back the inpcrd file. But, more than half of the coordinates will
> be perturbed by the 1.0e-6 amount in a typical simulation as the boxes will
> measure at least 9 + 2 = 11A in all directions, and for any coordinates
> bigger than 8A the ULP (unit in the last place) error will be 1 part in
> 2^24.
>
> I may have over-estimated the error due to the coordinate representation if
> the doubles are getting rounded to the nearest ULP of the FP32
> representation--the error would only be 1 part in 2^25. Still, the error
> that I am accruing for combining forces from each synchronous warp is on
> the same order and both are dwarfed by the uncertainties arising from the
> electrostatics mesh (which creates its own error in the conservation of
> forces), the Lennard-Jones cutoff, and the direct space electrostatics.
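
A minimal sketch of the float round trip discussed above, in Python for illustration only (pmemd.cuda itself is CUDA). The helper name and the coordinate values are made up; the point is that round-to-nearest perturbs a value in [8, 16) A by at most half an FP32 ULP, i.e. 2^-21 A.

```python
import struct

def to_float32(x: float) -> float:
    """Round an FP64 value to the nearest FP32 value, then widen it back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# Hypothetical cell-relative coordinates in Angstroms, all inside [8, 16).
coords = [8.123456789, 9.87654321, 10.999999, 11.25]

# For values in [8, 16) the FP32 spacing is 2**-20, so round-to-nearest
# can move a coordinate by at most half of that.
HALF_ULP_8_TO_16 = 2.0 ** -21

errors = [abs(to_float32(x) - x) for x in coords]
max_error = max(errors)
```

Here `max_error` stays at or below `HALF_ULP_8_TO_16`, and a value such as 11.25 that FP32 can already represent exactly is not perturbed at all.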
>
> Dave
>
>
> On Mon, Jun 5, 2017 at 5:39 AM, B. Lachele Foley <lfoley.ccrc.uga.edu>
> wrote:
>
> > Why did you perturb your coords to <=1e-6 A? Have you tried perturbing
> > them by <=( (1e-6) * xi ) where xi is x, y or z?
> >
> >
> > :-) Lachele
> >
> > Dr. B. Lachele Foley
> > Associate Research Scientist
> > Complex Carbohydrate Research Center
> > The University of Georgia
> > Athens, GA USA
> > lfoley.uga.edu
> > http://glycam.org
> > ________________________________
> > From: David Cerutti <dscerutti.gmail.com>
> > Sent: Saturday, June 3, 2017 3:20:34 PM
> > To: AMBER Developers Mailing List
> > Subject: Re: [AMBER-Developers] On the precision of SPFP in pmemd.cuda
> >
> > Where is the wiki, so I can post this (there's some follow-up to write,
> > too)?
> >
> > As far as the extra memory, __shared__ is wide open now that I've
> > converted everything to use __shfl() (that's one thread reading another
> > thread's value, to make the most use of registers). There is about 44 kB
> > out of a maximum 48 kB available. There is no danger of having 32-bit
> > integers overflow anymore--there are no 32-bit integers in the process.
> >
> > I've tried a number of things to improve the way the sparse tiles get
> > handled, but aside from one tweak which I feel makes the code a little
> > easier to parse and has negligible effect on speed, there is really no
> > way to beat what we've got right now. New technology and software (Volta,
> > CUDA 9) will offer us some features to do what I've not been able to with
> > cleverness, but it will also require us to be very careful about the
> > accumulation because it's breaking the warp synchronization that the
> > current code relies on.
> >
> > Dave
> >
> >
> > On Sat, Jun 3, 2017 at 12:45 AM, B. Lachele Foley <lfoley.ccrc.uga.edu>
> > wrote:
> >
> > > Was this ever put on the wiki? I'm not sure how much I can contribute,
> > > but I'm curious about things. For example, what would you do with the
> > > newly freed shared memory? (and if you said that, sorry but I got lost
> > > somewhere) Also, these recommendations - are they something I should
> > > share with labmates as actual recommendations or as possible ones?
> > >
> > >
> > > :-) Lachele
> > >
> > > Dr. B. Lachele Foley
> > > Associate Research Scientist
> > > Complex Carbohydrate Research Center
> > > The University of Georgia
> > > Athens, GA USA
> > > lfoley.uga.edu
> > > http://glycam.org
> > > ________________________________
> > > From: David Cerutti <dscerutti.gmail.com>
> > > Sent: Wednesday, May 17, 2017 1:08:31 AM
> > > To: AMBER Developers Mailing List
> > > Subject: Re: [AMBER-Developers] On the precision of SPFP in pmemd.cuda
> > >
> > > I'd be happy to post this on a wiki. Ideally, we can get a rich wiki
> > > environment incorporated into the new website, which I still need to
> > > put the capstone on.
> > >
> > > I was considering minimization last night--the code *already* clamps
> > > forces at 10000 kcal/mol-A for minimization, no matter the mode (it just
> > > takes place at different stages of the accumulation), so for
> > > minimization modes in SPFP I've lowered that barrier to 1800 kcal/mol-A
> > > (unlikely that two very large forces will align so tightly that they
> > > could combine to break 2048 kcal/mol-A). Also there are notes on the
> > > current GPU pages warning people not to minimize with SPFP.
> > >
> > > The "official" internal coordinate representation of pmemd.cuda is
> > > double precision, and iwrap is inconsequential. What happens, rather, is
> > > that when the pair list gets built, the atoms get re-imaged and placed
> > > into the hash table as floating point numbers. Long story short, each
> > > cell in the hash table has an origin, and the atom locations are stored,
> > > it seems, by their Cartesian displacements from that origin. Each cell
> > > in the hash table seems to be at least the cutoff plus non-bonded buffer
> > > zone, so think 9A + 2A or 10A + 2A in the JAC case above, and the cells
> > > have to span the unit cell. What that means is that the floating point
> > > numbers have to cover a range of [0, cutoff) initially, and
> > > (0-buffer, cutoff+buffer) as the pairlist gets older, and that means
> > > numbers of up to 10-12. Plug those into this handy website:
> > >
> > > https://www.h-schmidt.net/FloatConverter/IEEE754.html
> > >
> > > and one can see that, for numbers between 8 and 16, the mantissa's
> > > least significant bit corresponds to a displacement of 1/1048576, or
> > > about a millionth of an Angstrom. We may be able to rework the cell
> > > grid and store atom locations relative to the CENTER of the cell, and
> > > pick up another bit worth of precision in most cases (have the numbers
> > > span a range from -7.5 to +7.5, for example--the least significant bit
> > > will correspond to a displacement of 1/2097152 Angstroms). Also there
> > > may be cases, perhaps in octahedral unit cells, where a few of the
> > > atoms' cell coordinates have to get bigger than 16, and would thus lose
> > > a bit of precision (literally).
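
The spacings quoted above can be checked without the website. The following is a small Python sketch, not pmemd.cuda code; `float32_ulp` is a hypothetical helper that assumes a normal, nonzero IEEE-754 single-precision value with its 24-bit significand.

```python
import math

def float32_ulp(x: float) -> float:
    """Gap between adjacent FP32 values at magnitude |x| (normal, nonzero x)."""
    _, exponent = math.frexp(abs(x))   # |x| = m * 2**exponent with 0.5 <= m < 1
    return 2.0 ** (exponent - 24)      # 24 significand bits incl. the implicit one

# Origin-based cell coordinates reach 10-12 A; center-based ones stay
# within +/-7.5 A; 20 A stands in for an oversized octahedral-cell case.
for x in (7.5, 10.0, 12.0, 20.0):
    print(f"|x| = {x:4.1f} A  ->  ULP = 1/{round(1 / float32_ulp(x))} A")
```

Each doubling of the magnitude costs one bit: values in [4, 8) resolve twice as finely as values in [8, 16), which is the gain expected from storing displacements relative to the cell center.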
> > >
> > > For those applications with aggressive swapping, it might be feasible
> > > to keep the original form of the kernel in the code and apply that for
> > > the first few steps after a swap move, to give extremely large forces
> > > time to dissipate before switching back to the streamlined version.
> > >
> > > One way or another, in 2015 D. E. Shaw made their code run 30% faster
> > > than ours (after accounting for their multiple timesteps and so forth),
> > > breaking away in what was once a tight race. I think I know how to get
> > > that performance in pmemd.cuda, and more __shared__ memory is one thing
> > > that will help.
> > >
> > > Dave
> > >
> > >
> > > On Tue, May 16, 2017 at 11:11 PM, Jason Swails <jason.swails.gmail.com>
> > > wrote:
> > >
> > > > Hi Dave,
> > > >
> > > > This is a lot of cool information and a nicely complete investigation
> > > > of this -- it was an enjoyable and informative read. Maybe I could
> > > > encourage you to start up a Wiki page with some of your investigations
> > > > so your investigative work survives a bit longer past the end of this
> > > > discussion.
> > > >
> > > > On Tue, May 16, 2017 at 10:10 PM, David Cerutti <dscerutti.gmail.com>
> > > > wrote:
> > > >
> > > > > (In SP modes pmemd.cuda can represent positions to a precision of
> > > > > 1/1048576 A, a number which won't change unless the cutoff gets
> > > > > needlessly large or unadvisably small.)
> > > >
> > > >
> > > > What does the internal representation of the coordinates look like
> > > > for a PME simulation in pmemd.cuda when iwrap is set to 0 (i.e., no
> > > > wrapping is done)? As coordinates grow, this precision you pointed
> > > > out degrades (the difference between adjacent real numbers that can be
> > > > represented with fixed-size floating point numbers grows as the
> > > > absolute values of the numbers themselves grow). Obviously with a
> > > > nicely packed box (like how Amber simulations usually start or
> > > > restarting from a simulation using iwrap=1), the precision won't vary
> > > > much across 64 A. But if coordinates are allowed to grow without
> > > > bound, I suspect this precision could quickly become the largest
> > > > source of error.
> > > >
> > > > Probably the safest way to run an iwrap=0 simulation is to maintain
> > > > the "compact" representation of coordinates internally (so as to avoid
> > > > losing precision when computing energies and/or forces) as well as a
> > > > set of translations for each atom so that the "unwrapped"
> > > > representation can be returned for printing in the corresponding
> > > > output files. I'm pretty sure this is what OpenMM does, but I have no
> > > > clue about pmemd.cuda.
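
The bookkeeping suggested above might look roughly like this. It is a one-dimensional Python sketch under stated assumptions (an orthorhombic box, one integer image counter per atom and axis); the function names are hypothetical and this is not how pmemd.cuda or OpenMM is known to implement it.

```python
import math

def wrap_with_image(x: float, box: float, image: int = 0):
    """Fold x into [0, box) and update the count of box lengths removed."""
    shift = math.floor(x / box)
    return x - shift * box, image + shift

def unwrap(x_wrapped: float, box: float, image: int) -> float:
    """Rebuild the unwrapped coordinate, e.g. for writing a trajectory frame."""
    return x_wrapped + image * box

box = 64.0            # box edge in Angstroms (illustrative)
x, image = 63.5, 0    # compact coordinate and its image counter
for _ in range(1000):
    # Drift the atom by 0.75 A per step; keep only the compact value.
    x, image = wrap_with_image(x + 0.75, box, image)

unwrapped = unwrap(x, box, image)
```

The compact value never leaves [0, box), so its floating-point spacing stays fixed no matter how far the atom drifts; the large offset lives in the integer counter and is only applied at output time.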
> > > >
> > > >
> > > > > representation. In this format I can handle forces up to 2000
> > > > > kcal/mol-A.
> > > > >
> > > > > I'd conclude this by saying "I don't think that any simulation
> > > > > that's stable at all is going to break that," but it sounds too
> > > > > ominous and will no doubt invite comments of "famous last words..."
> > > > > If really necessary, I'm pretty confident that I can push to 19 bits
> > > > > of precision past the decimal (force truncation is then getting us
> > > > > slightly more error than the coordinate rounding, but still way way
> > > > > below any of the other sources), and let the accumulators take up to
> > > > > 4000 kcal/mol-A forces. About one in a million individual forces
> > > > > gets above 64 kcal/mol-A, and it becomes exponentially less likely
> > > > > to get forces of larger and larger sizes the higher up you go.
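
The 2000 and 4000 kcal/mol-A figures are what one would expect from a signed 32-bit fixed-point accumulator with 20 and 19 fractional bits respectively (2^31 / 2^20 = 2048, 2^31 / 2^19 = 4096). The quoted text does not spell out the accumulator layout, so the Python sketch below treats that as an assumption; the helper names and the two force values are made up.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def to_fixed(force: float, frac_bits: int) -> int:
    """Scale a force by 2**frac_bits and truncate toward zero."""
    return int(force * (1 << frac_bits))

def fits_int32(value: int) -> bool:
    """True if value is representable in a signed 32-bit integer."""
    return INT32_MIN <= value <= INT32_MAX

def force_limit(frac_bits: int) -> float:
    """Magnitude (kcal/mol-A) at which the assumed accumulator overflows."""
    return 2.0 ** (31 - frac_bits)

# Two hypothetical contributions that happen to align, in kcal/mol-A.
forces = [1800.0, 300.0]
sum20 = sum(to_fixed(f, 20) for f in forces)   # 20 fractional bits
sum19 = sum(to_fixed(f, 19) for f in forces)   # 19 fractional bits
```

With 20 fractional bits the 2100 kcal/mol-A total no longer fits in 32 bits, while with 19 bits it does, at the price of a resolution of 2^-19 rather than 2^-20 kcal/mol-A.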
> > > > >
> > > >
> > > > An obvious counterexample is minimization, although it's not all that
> > > > bad to make people minimize on the CPU. Another perhaps less obvious
> > > > application that may result in overflowing forces/energies comes from
> > > > hybrid MD/MC methodologies (e.g., H-REMD) with aggressive move sets
> > > > that can result in a few forces and/or energies becoming extremely
> > > > large. The obvious thing to do here is to just reject that move and
> > > > soldier on, but if the simulation crashes or becomes corrupted, that
> > > > may prove limiting to some of these kinds of applications.
> > > >
> > > > Just some thoughts.
> > > >
> > > > All the best,
> > > > Jason
> > > >
> > > > --
> > > > Jason M. Swails
> > > > _______________________________________________
> > > > AMBER-Developers mailing list
> > > > AMBER-Developers.ambermd.org
> > > > http://lists.ambermd.org/mailman/listinfo/amber-developers
Received on Thu Jun 08 2017 - 14:00:03 PDT