Hi Tom,
I don't at all disagree with your scientific assessment of mega-scale system
simulations here; heck, I am not even on board with REMD yet... So from my
perspective this is purely "me too" politics: keeping NAMD from being able to
strut around in front of potentially naive funding review boards, pointing to
something the size of a slab of beef wiggling for 10 psec, crowing about being
the only ones who can do that, and then getting funded on that account. And I
think that sort of thing is very definitely happening; Ross probably has a
better idea than I do, and I expect you have a few unique insights too. So I
am not personally interested in this for the results; I am only concerned with
supporting things that get us funded, and in a slight compromise of my
principles (I am heavy on the "perish" side of the equation when it comes to
what I choose to do on principle), I am willing to do some of this earlier
than it otherwise makes sense.
Okay, time for the heavies in the group to put their heads together and let
us know what you ultimately want this cycle. It is not a lot of work for
me, given the caveat that I am not worrying about what happens when you run
out of memory (i.e., dying with a message is an acceptable end to a
ridiculously big job on a ridiculously small machine). The problem, though,
is that it makes no sense to do any of this at all unless it happens in pmemd /
sander / the leaps / ptraj. Dave, if you would drive us to consensus I
would appreciate it (sorry, I know you don't need me bugging you, but
whatever leadership we have, I think you are it ;-)).
Best Regards - Bob
----- Original Message -----
From: "Thomas Cheatham" <tec3.utah.edu>
To: <amber-developers.scripps.edu>
Sent: Wednesday, December 05, 2007 11:20 AM
Subject: RE: amber-developers: Fw: How many atoms?
>
> Well, since Bob asked if I'm on board... I wrote this late last night, and I
> do not really know who is on this list, so I should be careful in what I
> say, but...
>
>> That's not really true. Simulations with 0.5M atoms are about as routine
>> today as 20,000 atoms were a few years ago. We are talking about protein
>> complexes in
>
> I did not want to jump into this fray, but as I busily review proposals
> for computer time at the NSF centers tonight (and I'm a bit late, as
> usual)-- proposals that mostly involve running simulations of less than
> 100K atoms-- I would contend that we do not need to reach that megascale
> yet, and I do say "yet". NAMD works, and at present there are very few
> large-scale simulations beyond those of a few groups, like Schulten
> (viruses, BAR domains, ...), Voth (BAR domains), Sanbonmatsu
> (ribosome), ...
>
> I would contend that 500K atoms is not as routine as 20K was a decade
> ago. Moreover, we have many, many more people running simulations in the
> 10-100K range now than we ever did in *any* range a decade ago (including
> in vacuo). The field has exploded.
>
> My personal take on the large hero simulations is that they are easy.
> With so many atoms, there is a much smaller chance of a small instability
> or local error occurring that propagates to kill the system. Moreover,
> with so many atoms, how can you possibly detect a funny alpha/gamma
> artifact in DNA or an alpha-helix bias in the protein?
>
> I have run a large set of proteasome simulations of 200K-500K atoms and
> also DNA mini-circles at 200K atoms using sander. You can hot-start these
> at 300K from a crude protein model, solvated and loosely minimized, straight
> from LEaP, and everything is peachy, i.e., it runs fine. Not only are these
> runs stable, you cannot really run them long enough to find artifacts or see
> problems. I think the model of these hero calculations is: stable one-off
> simulation + cursory analysis + "benchmark" as first of its kind ==
> publication.
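>
> By "hot start" I mean something like the sander input below -- no staged
> heating, no restraints, just assign 300K velocities and go (the numbers
> here are only illustrative, not a recommended protocol):
>
>    &cntrl
>      imin=0, irest=0, ntx=1,           ! fresh MD, read coordinates only
>      tempi=300.0, temp0=300.0,         ! assign and hold 300K immediately
>      ntt=3, gamma_ln=1.0,              ! Langevin thermostat
>      ntb=2, ntp=1, taup=2.0,           ! constant pressure, isotropic scaling
>      ntc=2, ntf=2, dt=0.002, cut=8.0,  ! SHAKE on H, 2 fs step, 8 A cutoff
>      nstlim=500000, ntpr=1000, ntwx=1000, ntwr=10000,
>    /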
>
> This is what Mike was stating, and it is not what I think AMBER is about.
>
> I think we need to think in terms of what AMBER's strengths are: validated
> force fields, critical assessment of simulation methods and results, and
> sampling/exploration that is as exhaustive as possible. Changing PMEMD or
> sander to work on "one billion" atoms is not hard; it is only hard if you
> want good performance and you are constrained by the machine (as Ross and
> Bob mentioned). If you are doing a one-off simulation, performance does not
> matter.
>
> What I-- and I think "we"-- want is the fastest set of methods to
> exhaustively explore systems in the range of 10-100K atoms today and
> likely up to 1M or more atoms within 2 or so years. Given the machines
> coming online, ideally what we want *now* is fast methods that work in
> ensembles so I can do replica-exchange or path sampling or dG runs
> optimally on the emerging machines, i.e., chained or loosely coupled
> parallel jobs (multisander) that scale internally to up to 1K cores or
> beyond. With multiple replicas, each at 1K cores, we will be able to use
> the emerging machines very effectively.
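>
> To be concrete about what I mean by loosely coupled: plain multisander gets
> us most of the way there already. A rough sketch (flags from memory and
> file names made up, so check the manual before trusting my syntax) -- a
> groupfile with one line per copy:
>
>    -O -i mdin.001 -o mdout.001 -p prmtop -c inpcrd.001 -r restrt.001 -x mdcrd.001
>    -O -i mdin.002 -o mdout.002 -p prmtop -c inpcrd.002 -r restrt.002 -x mdcrd.002
>    ...
>
> and then one launch over the whole allocation:
>
>    mpirun -np 1024 sander.MPI -ng 8 -groupfile ensemble.groupfile
>
> i.e., 8 copies, each scaling internally over 128 cores, with the copies
> left fully independent for plain ensemble or path-sampling runs, or coupled
> through the REMD machinery when we want exchanges.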
>
> In some sense, we are in a paradigm shift in the way we need to think
> about the simulations (and their analysis). Running a billion atom
> simulation on the whole machine is not as useful as running thousands of
> smaller systems that can be fully explored. But the real paradigm shift
> will be in how to handle all the data. If these new machines do offer
> 100x the power we've seen previously, how do we handle 100x the
> results (aka trajectory size)?
>
> --tom
>
> p.s. regarding formats, my opinion in the short run (i.e., for amber10) is
> to use what we have...
>