Re: [AMBER-Developers] CMake in Amber

From: Scott Le Grand <varelse2005.gmail.com>
Date: Sun, 4 Apr 2021 13:02:15 -0700

PPPS There should be no explicit limit on TI atoms whatsoever. AMBER should
handle that under the hood to let the scientists science the &h!+ out of
things, fight me on that.

On Sun, Apr 4, 2021 at 12:58 PM Scott Le Grand <varelse2005.gmail.com>
wrote:

> PPS the future* would appear to be more cores of approximately the same
> computational power as Ampere, not the same number of cores but beefier. As
> such, we need to figure out how to distribute the same basis set of
> calculations across more cores going forward. Doubly so now that we have
> NVLink, an interconnect that makes multi-GPU not suck.
>
> *A prediction pulled entirely from my Easter Bonnet(tm) based on the
> progression from SM 5 to SM 8 and which should not at all be construed as
> insider information because it's not.
>
> On Sun, Apr 4, 2021 at 12:50 PM Scott Le Grand <varelse2005.gmail.com>
> wrote:
>
>> PS I'm killing off both TI paths and writing the path I wanted to write
>> in the first place that both exploits the original Uber-kernels and
>> Taisung's multi-streaming variant incorporating Darren and Taisung's
>> improvements in the science whilst doing so. After those six impossible
>> things or so, breakfast...
>>
>> On Sun, Apr 4, 2021 at 12:46 PM Scott Le Grand <varelse2005.gmail.com>
>> wrote:
>>
>>> 1) Because that's the benchmark we've used since day one. Apples to
>>> apples and all that. It's a relatively small system for single GPU, which
>>> is the perfect stand-in for large system multi-GPU efficiency. My goal is
>>> 4x scaling on 8 GPUs, with a positive scaling experience beyond that,
>>> relaxing system size limits up to 1B atoms in the process. If JAC gets
>>> faster, we can scale farther.
>>> 2) Because the path to AMBER 20 broke multiple implicit assumptions in
>>> my design* for AMBER so I went back in time to change the future. All
>>> relevant functionality will be restored over time, but I spent 6 months of
>>> 2020 trying to do exactly that before throwing my arms up in utter
>>> frustration. The alternative is walking away from all the code and starting
>>> a new framework.
>>> 3) RTX3090**
>>> 4) Remember this is PMEMD 2.0 we're building here. It's been almost 12
>>> years, it's time to rewrite.
>>>
>>> But... Your local force code still shows an acceleration over the
>>> original local force code even at full 64-bit accumulation. So that's
>>> getting refactored along the way. Everything else so far is a perf
>>> regression without the precision model changes alas. But... you and I have
>>> accidentally created a working variant of SPXP. Your stuff will live again
>>> in its revival and you get first authorship IMO because while it's great
>>> work, it's *not* SPFP with those precision changes in place (18-bit
>>> mantissa? C'mon man(tm)...)
>>>
>>> *Should have spelled them out, but even I couldn't predict a priori the
>>> end of the CUDA Fellow program, which ended any support for further work,
>>> but now my bosses have let me work on it again as my day job.
>>> **
>>> https://www.exxactcorp.com/blog/Molecular-Dynamics/rtx3090-benchmarks-for-hpc-amber-a100-vs-rtx3080-vs-2080ti-vs-rtx6000
>>>
>>> On Sun, Apr 4, 2021 at 12:03 PM David Cerutti <dscerutti.gmail.com>
>>> wrote:
>>>
>>>> "Meanwhile, AMBER16 refactored to SM 7 and beyond is already hitting 730
>>>> ns/day on JAC NVE 2 fs. AMBER20 with the grid interpolation and local
>>>> force precision sub FP32 force hacks removed hits 572 ns/day (down from
>>>> 632 if left in as we shipped it). That puts me nearly 1/3 to my goal of
>>>> doubling overall AMBER performance which is what is important to me and
>>>> where I'm going to focus my efforts..."
>>>>
>>>> Please explain here.
>>>> 1.) Why are we back to using the old JAC NVE 2fs benchmark? The new
>>>> benchmarks were redesigned several years ago to make more uniform tests
>>>> and take settings that standard practitioners are now using.
>>>> 2.) Why is Amber16 being refactored rather than Amber20?
>>>> 3.) What does it mean to be hitting 730 ns/day? What card is being
>>>> compared here--the Amber20 benchmarks look like they could be a V100,
>>>> Titan-V, or perhaps an RTX-2080Ti.
>>>>
>>>>
>>>> On Sun, Apr 4, 2021 at 12:11 PM Scott Le Grand <varelse2005.gmail.com>
>>>> wrote:
>>>>
>>>> > But getting back on topic, CUDA 7.5 is a 2015 toolkit and SM 5.x and
>>>> > below are deprecated now. SM 6 is a huge jump over SM 5 enabling true
>>>> > virtual memory and I suggest deprecating support for SM 5 across the
>>>> > board. SM 7 and beyond alas mostly complicated warp programming and
>>>> > introduced tensor cores which currently seem useless for straight MD,
>>>> > but perfect for running AI models inline with MD.
>>>> >
>>>> > CUDA 8 is a 2017 toolkit. That's way too soon to deprecate IMO and if
>>>> > cmake has ish with it, that's a reason not to use cmake, not a reason
>>>> > to deprecate CUDA 8.
>>>> >
>>>> >
>>>> > On Sun, Apr 4, 2021 at 8:55 AM Scott Le Grand <varelse2005.gmail.com>
>>>> > wrote:
>>>> >
>>>> > > Ross sent me two screenshots of cmake losing its mind with an 11.x
>>>> > > toolkit. I'll file an issue, but no, I'm not going to fix cmake
>>>> > > issues myself at all. I'm open to someone convincing me cmake is
>>>> > > better than the configure script, but no one has made that argument
>>>> > > yet beyond "because cmake" and until that happens, that just doesn't
>>>> > > work for me. Happy to continue helping with the build script that
>>>> > > worked until convinced otherwise. Related: I still use nvprof, fight
>>>> > > me.
>>>> > >
>>>> > > Meanwhile, AMBER16 refactored to SM 7 and beyond is already hitting
>>>> > > 730 ns/day on JAC NVE 2 fs. AMBER20 with the grid interpolation and
>>>> > > local force precision sub FP32 force hacks removed hits 572 ns/day
>>>> > > (down from 632 if left in as we shipped it). That puts me nearly 1/3
>>>> > > to my goal of doubling overall AMBER performance which is what is
>>>> > > important to me and where I'm going to focus my efforts as opposed to
>>>> > > the new shiny build system that is getting better (and I *hate* cmake
>>>> > > for cmake's sake), but we rushed it to production IMO like America
>>>> > > reopened before the end of the pandemic.
>>>> > >
>>>> > > On Sun, Apr 4, 2021 at 5:51 AM David A Case <david.case.rutgers.edu>
>>>> > > wrote:
>>>> > >
>>>> > >> On Sat, Apr 03, 2021, Scott Le Grand wrote:
>>>> > >>
>>>> > >> >cmake is still not quite ready for prime time disruption of
>>>> > >> >configure. It's getting there though.
>>>> > >>
>>>> > >> If there are problems with cmake, please create an issue on gitlab,
>>>> > >> and mention @multiplemonomials to get Jamie's attention. Please try
>>>> > >> to avoid the syndrome of saying "I can get this to work with
>>>> > >> configure, and I'm too busy right now to do anything else."
>>>> > >>
>>>> > >> I have removed the documentation for the configure process in the
>>>> > >> Amber21 Reference Manual, although the files are still present. We
>>>> > >> can't continue to support and test two separate build systems, each
>>>> > >> with their own bugs.
>>>> > >>
>>>> > >> ...thx...dac
>>>> > >>
>>>> > >>
>>>> > >> _______________________________________________
>>>> > >> AMBER-Developers mailing list
>>>> > >> AMBER-Developers.ambermd.org
>>>> > >> http://lists.ambermd.org/mailman/listinfo/amber-developers
Received on Sun Apr 04 2021 - 13:30:02 PDT