RE: amber-developers: Fw: How many atoms?

From: Ross Walker <ross.rosswalker.co.uk>
Date: Tue, 4 Dec 2007 19:14:30 -0800

My understanding from Bob's email, and Bob can correct me if I am wrong
here, is that it is a memory consideration. That is, large systems could
use significant amounts of memory, and it is the work of keeping the memory
footprint small that is complicated and time consuming.
 
However, from what I can glean, Bob may have expectations for memory that
are somewhat lower than what will actually be deployed, based on his
experience with Blue Gene. My assertion would be that we try to support
> 999,999 atoms but, in the short term, not worry about the memory
requirements of such calculations. In this way the limiting factor becomes
the available memory per node and not the underlying file formats. Since
Blue Gene is the exception rather than the rule among HPC systems, I think
the problem will be much smaller than Bob is anticipating. It seems crazy
to focus effort on optimizing for the lowest common denominator, especially
when 99% of available SUs on NSF-allocated resources will shortly be on
non-Blue Gene architectures.
 
I am of course neglecting the myriad complexities involved in performance
as a function of memory usage, etc., but at least for Amber 10 it would
seem to make sense to aim at the types of machines that will be generally
available to NSF researchers over the next two years. All of these will
have between 1 and 2 GB per core (4 GB+ per core if you leave cores idle
on various nodes) and enough processors to make even Bob run away screaming
that the apocalypse is coming.
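
For a sense of scale, here is a back-of-envelope sketch in Python; the
count of natom-dimensioned arrays is an assumed figure for illustration,
not a measurement of pmemd's actual footprint.

```python
# Back-of-envelope memory cost of arrays dimensioned by natom.
# n_arrays is an illustrative guess, not a measurement of pmemd.

natom = 2_000_000        # a hypothetical multi-million-atom system
bytes_per_real = 8       # double precision
n_arrays = 12            # assumed number of natom-sized 3-vector arrays

per_array = natom * 3 * bytes_per_real   # e.g. coordinates, forces
total = n_arrays * per_array

print(f"one 3-vector array: {per_array / 1e6:.0f} MB")    # 48 MB
print(f"{n_arrays} such arrays: {total / 1e9:.2f} GB")    # 0.58 GB
```

At 1 to 2 GB per core that headroom is comfortable; on a node with only a
few hundred MB shared between cores, as on BG/L, it is not.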
 
Just my 2c.
 
All the best
Ross

/\
\/
|\oss Walker

| HPC Consultant and Staff Scientist |
| San Diego Supercomputer Center |
| Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
| http://www.rosswalker.co.uk | PGP Key available on request |

Note: Electronic Mail is not secure, has no guarantee of delivery, may not
be read every day, and should not be used for urgent or sensitive issues.

  _____

From: owner-amber-developers.scripps.edu On Behalf Of Carlos Simmerling
Sent: Tuesday, December 04, 2007 18:10
To: amber-developers.scripps.edu
Subject: Re: amber-developers: Fw: How many atoms?


It sounded like Bob thinks that there IS a cost to doing this.
My feeling is that if there were no cost, go for it; but if it takes
away Bob's precious time that he could be using to get this
stuff up and working for smaller systems, then we should let
him focus on the sizes that people actually run, rather than having
delays or overall slower code just to support things that none of us
actually simulate. Sure, it could be great PR, and yes, maybe
focusing on smaller systems isn't visionary enough, but I think
there is a lot to be gained by getting better code for more modest
systems that still have biological relevance, rather than wasting
Bob's time on code that none of us need (yet).
carlos



On Dec 4, 2007 8:46 PM, Ken Merz <merz.qtp.ufl.edu> wrote:


Hi,
If it costs us nothing, then why not scale PMEMD beyond 999,999 atoms?
Someone out there might want to do a 1,000,000+ atom simulation with the
AMBER program suite! Kennie

On 4 Dec 2007, at 2:14 PM, Robert Duke wrote:


Hello folks!
I am working hard on high-scaling pmemd code, and in the course of the work
it became clear to me, due to large async I/O buffers and other issues, that
going to very high atom counts may require a bunch of extra work, especially
on certain platforms (BG/L in particular...). I posed the question below
to Dave Case; he suggested I bounce it off the list, so here it is. The
crux of the matter is how people feel about having an MD capability in pmemd
for systems bigger than 999,999 atoms in the next release. Please respond
to the dev list if you have strong feelings in either direction.
Thanks much! - Bob

----- Original Message ----- From: "Robert Duke" <rduke.email.unc.edu>
To: "David A. Case" < <mailto:case.scripps.edu> case.scripps.edu>
Sent: Tuesday, December 04, 2007 8:45 AM
Subject: How many atoms?



Hi Dave,
Just thought I would pulse you about how strong the desire is to go above
1,000,000-atom systems in the next release. I personally see this as more
an advertising issue than real science; it's hard to get good
statistics/good science on 100,000 atoms, let alone 10,000,000 atoms.
However, we do have competition.

So the prmtop is not an issue, but the inpcrd format is, and one thing that
could be done is to move to supporting the same type of flexible format in
the inpcrd as we do in the new-style prmtop. Tom D. has an inpcrd format in
amoeba that would probably do the trick; I can easily read this in pmemd
but not yet write it (I actually have pulled the code out - left it in the
amoeba version, of course, but can put it back in as needed).

I ask the question now because I am hitting size issues already on BG/L on
something like cellulose. Some of this I can fix; some of it really is more
appropriately fixed by running on 64-bit memory systems where there
actually is a multi-GB physical memory. The problem is particularly bad
with some new code I am developing, due to extensive async I/O and
requirements for buffers that at least theoretically could be pretty big
(up to natom entries possible; by spending a couple of days writing really
complicated code I can actually handle this in small amounts of space with
effectively no performance impact - but it is the sort of thing that will
be touchy and require additional testing).

Anyway, I do want to gauge the desire to move up past 999,999 atoms, and
make the point that on something like BG/L, it would actually require a lot
more work to be able to run multi-million atom problems (basically got to
go back and look at all the allocations, make them dense rather than sparse
by doing all indexing through lists, allow for adaptive minimal I/O
buffers, etc. - messy stuff, some of it stemming from having to allocate
lots of arrays dimensioned by natom).
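
The 999,999 ceiling discussed here is characteristic of a fixed-width
integer field in the coordinate file header; a self-describing format
record, as in the new-style prmtop, removes the hard-coded width. Below is
a minimal Python sketch of the failure mode, assuming a six-character
atom-count field (a width chosen to match the limit quoted in this thread,
not taken from the pmemd source).

```python
# Toy reader/writer for an inpcrd-style header field. The 6-character
# width is an assumption matching the 999,999 limit discussed above,
# not the actual pmemd source.

FIELD_WIDTH = 6  # a Fortran I6 edit descriptor tops out at 999,999

def write_header(natom: int) -> str:
    if natom >= 10 ** FIELD_WIDTH:
        # Fortran would emit '******' here; either way the file breaks.
        raise ValueError(f"natom={natom} does not fit in I{FIELD_WIDTH}")
    return f"{natom:>{FIELD_WIDTH}d}"

def read_header(line: str) -> int:
    # Fixed columns: the reader takes exactly six characters, so the
    # field cannot simply grow the way a free format would allow.
    return int(line[:FIELD_WIDTH])

print(read_header(write_header(999_999)))  # 999999: fine
write_header(1_000_000)                    # raises ValueError
```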
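
One reading of the "dense rather than sparse" allocation described above,
sketched in Python with hypothetical names (not pmemd internals): rather
than every task allocating arrays of length natom and touching only its own
slice, each task keeps the list of global atom indices it owns, allocates
arrays of only that local length, and translates global indices through the
list.

```python
# Sparse vs. dense per-task storage; all names here are hypothetical
# illustrations, not pmemd internals.

natom = 1_000_000

class DenseTaskStorage:
    """Per-task arrays dimensioned by the local atom count, not natom."""

    def __init__(self, owned_global_ids: list[int]):
        self.owned = owned_global_ids
        # Global id -> local slot, built once when atoms are assigned.
        self.local_of = {g: i for i, g in enumerate(owned_global_ids)}
        # One local slot per owned atom (e.g., an x-force accumulator).
        self.fx = [0.0] * len(owned_global_ids)

    def add_force_x(self, global_id: int, fx: float) -> None:
        self.fx[self.local_of[global_id]] += fx

# A task owning every 1024th atom allocates ~1000 slots, not 10**6.
task = DenseTaskStorage(list(range(0, natom, 1024)))
task.add_force_x(2048, 0.5)
print(len(task.fx))  # 977, versus 1,000,000 for a sparse allocation
```

The indirection costs a lookup per access, which is presumably part of the
"messy stuff" mentioned above.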
Best Regards - Bob




Professor Kenneth M. Merz, Jr.
Department of Chemistry
Quantum Theory Project
2328 New Physics Building
PO Box 118435
University of Florida
Gainesville, Florida 32611-8435

e-mail: merz.qtp.ufl.edu
http://www.qtp.ufl.edu/~merz

Phone: 352-392-6973
FAX: 352-392-8722
Cell: 814-360-0376

-- 
===================================================================
Carlos L. Simmerling, Ph.D.
Associate Professor                 Phone: (631) 632-1336 
Center for Structural Biology       Fax:   (631) 632-1555
CMM Bldg, Room G80
Stony Brook University              E-mail: carlos.simmerling.gmail.com
Stony Brook, NY 11794-5115          Web: http://comp.chem.sunysb.edu
=================================================================== 
Received on Wed Dec 05 2007 - 06:07:37 PST