Hey David,
My issue is this: from the outside, it looked like the cn1 and cn2
tables were generated from a pre-defined model rather than being an
arbitrary set of editable values. The code doesn't strongly specify it
one way or the other.
Given the existence of a pre-defined model (the AMBER force field), I
assumed it was OK to extract sigma and epsilon for each of the m atom
types and store them per atom, for each of the n atoms, instead of
keeping the cn1 and cn2 tables indexed by atom-type pair. This consumes
O(n) global memory instead of O(m^2), which is a net win for the cn1 and
cn2 tables up to this point.
However, pmemd.cuda attains its performance boost by running entirely out
of cache. The old approach consumes 8 * m^2 bytes (the tables) plus
4 bytes per atom (the type index) in the cache, as opposed to 8 bytes per
atom for per-atom storage. Significant spillage out of cache is a
performance disaster on GPUs, and IMO their biggest Achilles' heel.
This of course breaks off-diagonal hacking: SM 1.3 GPUs don't have enough
cache left to support it. SM 2.x and higher can probably handle 50-100
atom types before spilling. I have to assume someone will eventually go
past that limit, because that's what always happens, IMO.
Scott
On Wed, May 30, 2012 at 10:52 AM, David A Case <case.biomaps.rutgers.edu> wrote:
> On Wed, May 30, 2012, Scott Le Grand wrote:
> >
> > I'm in the ugly middle of adding NMR support to pmemd.cuda. Adrian and I
> > have observed the pastalicious mess that the FORTRAN code for this is
> > right now.
>
> Agreed. You and Adrian can figure which parts you want to support,
> and ignore the rest. Much of the "varying conditions" stuff is hardly
> used. For the restraints themselves, the code gets a lot of use, so is
> probably worth putting in, but you could consider not supporting (or only
> supporting) the interface from Matt Seetin.
>
> >
> > The prmtop is indeed flexible enough to support what you wish to do, but
> > it also locks in the method of implementing the VDW parameters. This is
> > a bad thing.
>
> I guess I'm lost here. We do need an input routine to check whether the
> off-diagonal terms match the model that is being used, and to exit with an
> error message if they do not. Longer term, we need to avoid dependence on
> mixing rules, which leave too little flexibility in the physics of things.
> We had not hit the memory problem you are experiencing on GPUs before.
> [Are you storing A and C coefficients by atom type (as in the prmtop
> file), or in some other way that grows with the size of the system?]
>
> >
> > In fact it's a very bad thing. GPU and CPU architectures change over
> > time and with over 20 years of commercial software development under my
> > belt (and lots of battle scars from dysfunctional practices therein) I
> > really think you don't want to do this.
>
> What is "it" and "this" here (that we don't want to do)? Are you talking
> about mixing rules, about "weird FORTRAN IV legacy code", or about
> something else? To me, these seem like very different things.
>
> ....dac
>
>
_______________________________________________
AMBER-Developers mailing list
AMBER-Developers.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber-developers
Received on Wed May 30 2012 - 11:30:03 PDT