Re: [AMBER-Developers] multi gpu gromacs benchmarks

From: Scott Brozell <sbrozell.comcast.net>
Date: Mon, 7 Feb 2022 16:52:02 -0500

Hi,

Thanks for the links, Scott.

Note that these are MVAPICH2 developers. The benchmarks are used to
produce typical communication patterns. Real-world, great-science
benchmarks are best, but since what counts as great science is somewhat
in the eye of the beholder, the benchmarks that are actually available
are useful.

If anyone has comments, criticisms, or suggestions beyond this thread,
then I can provide contact details.

thanks,
scott


On Thu, Feb 03, 2022 at 09:14:50AM -0800, Scott Le Grand wrote:
> After CUDA and GPUs tore down the walls of the walled garden of HPC, it's
> astounding how fast everyone scrambled to rebuild those walls. And here we
> are, right back where we started in 2009.
>
> That said, I wish AI types were half as willing as GROMACS users to tune
> their machines to train models and run inference. But failing that, I'd be
> happy if crypto people stopped powering their GPUs with coal. None of
> that's going to come to pass, but I do believe the situation will keep
> getting worse.
>
> On Thu, Feb 3, 2022, 07:57 Ross Walker <ross.rosswalker.co.uk> wrote:
>
> > Hi Jason,
> >
> > My point is that benchmark numbers obtained by hand-tuning the options
> > for a given machine, such that only an expert user willing to spend days
> > on end tweaking things could reproduce them, are not typically very
> > useful.
> >
> > My personal opinion is that codes should either be auto-tuning or should
> > be designed to give close to optimum performance without needing one to
> > tweak a whole bunch of options. This is where codes radically differ.
> > AMBER GPU, for example, was designed to give close to optimum performance
> > without needing any special settings. Gromacs, on the other hand, has
> > literally hundreds of little parameters one can tune and rarely gives
> > good performance out of the box.
> >
> > It really comes down to what one is trying to do here. Is one trying to
> > generate some benchmarks that will be of general use to 95+% of users,
> > which for most users is 'what do I get out of the box', or is one trying
> > to do some kind of hero run and demonstrate, with an epic amount of
> > tuning, that 'my code is faster than yours(tm)'? Personally I find the
> > latter to be of limited use in helping users do better / more effective
> > science.
> >
> > My 0.02 btc
> >
> > All the best
> > Ross
> >
> > > On Feb 3, 2022, at 10:43, Jason Swails <jason.swails.gmail.com> wrote:
> > >
> > > On Thu, Feb 3, 2022 at 8:44 AM Ross Walker <ross.rosswalker.co.uk> wrote:
> > >
> > >> And therein lies the real problem. IMHO benchmark numbers should be
> > >> what the average user will get out of the box on reasonably priced
> > >> hardware without tinkering. If you want numbers that are actually
> > >> useful in the real world, I'd suggest benchmarking by finding a grad
> > >> student or postdoc who is using MD in their work but is not a
> > >> developer. Give them a typical workstation that can be purchased for
> > >> <$6K and a list of PDB IDs to run. Ask them to report back the
> > >> performance they get with various codes.
> > >>
> > >> That will give you a real world benchmark that is actually useful.
> > >>
> > >
> > > ... for running MD codes on commodity workstations purchased for <$6K.
> > > Otherwise you'd need to run the benchmark on whatever hardware you
> > > actually plan on using for them to be actually useful. [1]
> > >
> > > You'll obviously get some performance variability when performance
> > > depends on so many things, but to declare that the only useful
> > > benchmarks are those that run under conditions known to be optimal for
> > > one code at the expense of another is disingenuous. [2] All benchmarks
> > > are useful [3] if you know how to read them and understand the caveats
> > > (I'd wager most people do to a reasonable extent). Performance
> > > differences that don't come close to an order of magnitude won't do
> > > all that much to move the needle. You aren't going to go from the
> > > µs -> ms regime by running 50% faster.
> > >
> > > Dhruv, I'd be interested in seeing those benchmarks.
> > >
> > > Thanks,
> > > Jason
> > >
> > > [1] For narrow definitions of "useful"
> > > [2] Recall the only benchmarks we published pre-2012 were performed on
> > > supercomputers
> > > [3] In the sense that they allow you to make more informed choices

On Wed, Feb 02, 2022 at 03:26:29PM -0800, Scott Le Grand wrote:
> PS: Because gromacs tries to load-balance the CPU and the GPU, you're
> going to have to put more time into coming up with the optimal system for
> measuring the performance of gromacs than you would for benchmarking
> OpenMM or PMEMD.
>
> This is where you will run into the no true benchmark dilemma. But since
> the Amber PIs and the gromacs PIs are playing matchmaker, is there a chance
> you can get the gromacs guys to benchmark their own code by their own
> standards?
>
> I was never able to reproduce any of their numbers back when I worked for
> AWS and that was part of my day job, but I don't doubt at all that they
> can get those numbers. We just didn't have the right set of CPUs and GPUs
> in the same box.
>
> On Wed, Feb 2, 2022, 14:40 Scott Le Grand <varelse2005.gmail.com> wrote:
> > Here's gromacs DHFR... But also, not the same benchmark... Kind of MCU vs
> > DCEU but who am I kidding? Who cares?
> >
> > https://www.gromacs.org/Documentation_of_outdated_versions/Installation_Instructions_4.5/GROMACS-OpenMM#GPU_Benchmarks
> >
> > STMV is probably in this container somewhere.
> > https://www.amd.com/en/technologies/infinity-hub/gromacs

_______________________________________________
AMBER-Developers mailing list
AMBER-Developers.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber-developers
Received on Mon Feb 07 2022 - 14:00:02 PST