Re: [AMBER-Developers] CMake in Amber

From: Scott Le Grand <varelse2005.gmail.com>
Date: Sun, 4 Apr 2021 12:46:50 -0700

1) Because that's the benchmark we've used since day one. Apples to apples
and all that. It's a relatively small system for a single GPU, which makes
it the perfect stand-in for large-system multi-GPU efficiency. My goal is 4x
scaling on 8 GPUs, with positive scaling beyond that, relaxing system size
limits up to 1B atoms in the process. If JAC gets faster, we can scale
farther.
2) Because the path to AMBER 20 broke multiple implicit assumptions in my
design* for AMBER so I went back in time to change the future. All relevant
functionality will be restored over time, but I spent 6 months of 2020
trying to do exactly that before throwing my arms up in utter frustration.
The alternative is walking away from all the code and starting a new
framework.
3) RTX3090**
4) Remember this is PMEMD 2.0 we're building here. It's been almost 12
years; it's time for a rewrite.

But... Your local force code still shows an acceleration over the original
local force code even at full 64-bit accumulation. So that's getting
refactored along the way. Everything else so far is a perf regression
without the precision model changes, alas. But... you and I have
accidentally created a working variant of SPXP. Your stuff will live again in
its revival and you get first authorship IMO because while it's great work,
it's *not* SPFP with those precision changes in place (18-bit mantissa?
C'mon man(tm)...)

*Should have spelled them out, but even I couldn't predict a priori that the
end of the CUDA Fellow program would end any support for further work. Now
my bosses have let me work on it again as my day job.
** https://www.exxactcorp.com/blog/Molecular-Dynamics/rtx3090-benchmarks-for-hpc-amber-a100-vs-rtx3080-vs-2080ti-vs-rtx6000

On Sun, Apr 4, 2021 at 12:03 PM David Cerutti <dscerutti.gmail.com> wrote:

> "Meanwhile, AMBER16 refactored to SM 7 and beyond is already hitting 730
> ns/day on JAC NVE 2 fs. AMBER20 with the grid interpolation and local force
> precision sub FP32 force hacks removed hits 572 ns/day (down from 632 if
> left in as we shipped it). That puts me nearly 1/3 to my goal of doubling
> overall AMBER performance which is what is important to me and where I'm
> going to focus my efforts..."
>
> Please explain here.
> 1.) Why are we back to using the old JAC NVE 2fs benchmark? The new
> benchmarks were redesigned several years ago to make more uniform tests and
> take settings that standard practitioners are now using.
> 2.) Why is Amber16 being refactored rather than Amber20?
> 3.) What does it mean to be hitting 730 ns/day? What card is being
> compared here--the Amber20 benchmarks look like they could be a V100,
> Titan-V, or perhaps an RTX-2080Ti.
>
>
> On Sun, Apr 4, 2021 at 12:11 PM Scott Le Grand <varelse2005.gmail.com>
> wrote:
>
> > But getting back on topic, CUDA 7.5 is a 2015 toolkit and SM 5.x and below
> > are deprecated now. SM 6 is a huge jump over SM 5 enabling true virtual
> > memory and I suggest deprecating support for SM 5 across the board. SM 7
> > and beyond alas mostly complicated warp programming and introduced tensor
> > cores which currently seem useless for straight MD, but perfect for running
> > AI models inline with MD.
> >
> > CUDA 8 is a 2017 toolkit. That's way too soon to deprecate IMO and if cmake
> > has ish with it, that's a reason not to use cmake, not a reason to
> > deprecate CUDA 8.
> >
> >
> > On Sun, Apr 4, 2021 at 8:55 AM Scott Le Grand <varelse2005.gmail.com>
> > wrote:
> >
> > > Ross sent me two screenshots of cmake losing its mind with an 11.x
> > > toolkit. I'll file an issue, but no, I'm not going to fix cmake issues
> > > myself at all. I'm open to someone convincing me cmake is better than the
> > > configure script, but no one has made that argument yet beyond "because
> > > cmake" and until that happens, that just doesn't work for me. Happy to
> > > continue helping with the build script that worked until convinced
> > > otherwise. Related: I still use nvprof, fight me.
> > >
> > > Meanwhile, AMBER16 refactored to SM 7 and beyond is already hitting 730
> > > ns/day on JAC NVE 2 fs. AMBER20 with the grid interpolation and local force
> > > precision sub FP32 force hacks removed hits 572 ns/day (down from 632 if
> > > left in as we shipped it). That puts me nearly 1/3 to my goal of doubling
> > > overall AMBER performance which is what is important to me and where I'm
> > > going to focus my efforts as opposed to the new shiny build system that is
> > > getting better (and I *hate* cmake for cmake's sake), but we rushed it to
> > > production IMO like America reopened before the end of the pandemic.
> > >
> > >
> > >
> > >
> > >
> > > On Sun, Apr 4, 2021 at 5:51 AM David A Case <david.case.rutgers.edu>
> > > wrote:
> > >
> > >> On Sat, Apr 03, 2021, Scott Le Grand wrote:
> > >>
> > >> >cmake is still not quite ready for prime time disruption of configure.
> > >> >It's getting there though.
> > >>
> > >> If there are problems with cmake, please create an issue on gitlab, and
> > >> mention .multiplemonomials to get Jamie's attention. Please try to avoid
> > >> the syndrome of saying "I can get this to work with configure, and I'm
> > >> too busy right now to do anything else."
> > >>
> > >> I have removed the documentation for the configure process in the Amber21
> > >> Reference Manual, although the files are still present. We can't continue
> > >> to support and test two separate build systems, each with their own bugs.
> > >>
> > >> ...thx...dac
> > >>
> > >>
> > >> _______________________________________________
> > >> AMBER-Developers mailing list
> > >> AMBER-Developers.ambermd.org
> > >> http://lists.ambermd.org/mailman/listinfo/amber-developers
> > >>
> > >
Received on Sun Apr 04 2021 - 13:00:02 PDT