amber-developers: Fw: PMEMD 9 beta checkin

From: Robert Duke <rduke.email.unc.edu>
Date: Thu, 19 Jan 2006 10:09:23 -0700

Folks -
My mail system said this message was dropped on the floor, so I am
retrying; apologies if this constitutes a double-send. - Bob

----- Original Message -----
From: "Robert Duke" <rduke.email.unc.edu>
To: <amber-developers.scripps.edu>
Sent: Thursday, January 19, 2006 11:57 AM
Subject: PMEMD 9 beta checkin


> Folks -
> As Dave Case posted yesterday, I finally did a code drop of the pmemd 9
> beta source. I of course immediately found a couple of problems in
> testing, so you need to resync to the source code tree as of late this
> morning (11:30 EST) (chagall currently is very, very slow). The fixes
> are 1) a fix in runmd.fpp to get proper velocities in mdvel/restrt under
> pme when printing of these files coincides with a COM motion removal
> step (this was broken while hacking GB into pmemd - it is a relatively
> minor error you are unlikely to see, or to recognize as a problem if you
> do, but pick up the fix anyway :-)), 2) rbornstat support was not
> actually complete, but now is - fixes in runmd.fpp and gb_ene.fpp, and
> 3) a trivial printout fix to list alpb in the mdout (alpb use is not yet
> actually supported).
>
> Okay, this pmemd drop mostly has:
>
> 1) Lots of speedup/scaling enhancements for pme.
> 2) Various compatibility fixes to match sander 9 behaviour.
> 3) Support for Generalized Born, igb versions 1, 2, 5.
> 4) Dual cutoffs support - instead of "cut" in the mdin ctrl section, you
> can specify separate vdw_cutoff and es_cutoff, with it understood (and
> enforced) that vdw_cutoff >= es_cutoff. This is a small performance
> enhancement, mostly intended to eventually support amoeba (see the mdin
> sketch after this list).
> 5) fftw 3 support. Use double precision fftw, of course. This supports
> prime factors of 2, 3, 5, and 7 (the 7 is the difference) and should
> allow better grid fitting. The auto grid algorithm is a bit more
> aggressive about targeting a grid density of >= 1.0 grids/angstrom by
> default (so you may get finer grids and lower pme recip space error);
> you can also now directly specify your fft grid density (as
> fft_grids_per_ang) in the mdin &ewald namelist (also shown in the sketch
> below).
> 6) There is a -l command line option to name the logfile, and better
> time reporting overall. The -l option is mostly useful if you are doing
> a bunch of concurrent runs in one dir (say you are benchmarking...).
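>
> To illustrate items 4 and 5, here is a minimal mdin sketch. The
> vdw_cutoff, es_cutoff, and fft_grids_per_ang keywords are the ones
> described above; the particular values, and the rest of the &cntrl
> settings, are just illustration, not recommendations:
>
>   short NPT run, dual cutoffs, explicit fft grid density
>   &cntrl
>     imin = 0, nstlim = 1000, dt = 0.002,
>     ntb = 2, ntp = 1, ntt = 1, temp0 = 300.0,
>     ntc = 2, ntf = 2,
>     vdw_cutoff = 9.0, es_cutoff = 8.0,
>   /
>   &ewald
>     fft_grids_per_ang = 1.0,
>   /
>
> Remember that vdw_cutoff >= es_cutoff is enforced, and that 1.0
> grids/angstrom is already the default target, so you would bump
> fft_grids_per_ang up only if you want lower recip space error. For item
> 6, something like "pmemd -O -i mdin1 -o mdout1 -l logfile1" (and mdin2,
> mdout2, logfile2 for the next run, and so on) keeps concurrent runs in
> one dir from stomping on each other's logfiles; the flags other than -l
> are just the usual sander-style ones.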
>
> Building -
>
> In dir pmemd, do a "configure -help".
> Do what it says. If you have one of the new 64 bit intel machines
> (em64t) and ifort, tell the configure script that you are a
> linux64_opteron with ifort. The linux_p4 config files apply to 32 bit
> machines with a 32 bit OS. I'll fix this before long... (I am waiting to
> get around to actually using my em64t machine.) If you want to use
> gfortran or g95, I don't yet support them; I am not opposed to adding
> them, but it is a low priority for me, because these compilers don't
> produce optimal performance. I am also a little nervous about how big a
> support hassle they may turn into (in other words, I am open to adding
> config files for this stuff, but I don't yet have my machines config'd
> to support them, so support is "limited").
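>
> As a rough build sketch (the exact argument form and the "make install"
> step below are guesses at the usual procedure; "configure -help" is the
> authority on the machine/compiler/parallel keywords for your setup):
>
>   cd pmemd
>   ./configure -help
>   ./configure linux64_opteron ifort ...
>   make install
>
> where "..." stands for whatever parallel/interconnect choice the help
> output offers for your machine, and linux64_opteron/ifort is the em64t
> case noted above.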
>
> Performance -
>
> Is pretty good. Attached are benchmarks for hfva (193K atoms), fix, and
> jac. The scaling is better, and per-processor performance is better than
> pmemd 8, especially for the NPT ensemble. These benchmarks were run on a
> Cray XT3 at PSC, and on an IBM SP5 and an Opteron/InfiniBand cluster at
> NERSC. There is a set of benchmarks (comp) comparing machines; the other
> 3 files are all the same benchmarks on the XT3, for pmemd 9 beta, pmemd
> 8, and sander 8, at 3 different (lo, med, hi) processor ranges, so you
> can see low end details or the grand sweep of things.
>
> Performance improvements on small hardware are less apparent in the
> above benchmarks. For comparison, though, we are doing significantly
> better than pmemd 8 on my 3.2 GHz 32 bit P4 setup (GB ethernet):
>
> Factor IX, constant pressure
>
> #proc    pmemd 8      pmemd 9
>          psec/day     psec/day
>
>   1         86          116
>   2        138          182
>   4        238          293
>
> (Note that the pmemd 8 times are better than in the original release;
> this is probably mostly a compiler/OS issue; we reported 75 psec/day and
> 120 psec/day for 1 and 2 procs respectively at the time of the amber 8
> release - same machine.)
>
> The fftw libraries were used here, in double precision mode, with -sse2
> instructions enabled.
>
> The Generalized Born implementation is not yet optimized, but on the P4
> it is about 10% faster than sander 9. I will work on improving things a
> bit more as I get time. I need to add support for MKL, check out other
> machines (especially itanium), etc.
>
> Any problems, let me know!
>
> The beta without GB has been made available to select users at PSC,
> NERSC, and NCSA. Let me know if you have access to machines at those
> facilities and want the beta (or build it yourself, of course).
>
> Porting to new machines -
> Optimization is a little complex now. It is probably best to let me do
> it if possible, as I currently do a bake-off between a bunch of
> different possible code paths, depending on the machine. Otherwise,
> guess which supported cpu your new cpu is most similar to, and use the
> defines for that one as a starting point.
>
> Best Regards - Bob
>
>

Received on Wed Apr 05 2006 - 23:49:44 PDT