RE: [AMBER-Developers] PMEMD now built by default

From: Ross Walker <ross.rosswalker.co.uk>
Date: Tue, 2 Mar 2010 14:02:30 -0800

Hi Bob,

> A few quick notes -
> The DIRFRC_* defines are important; you need to at least default to what is
> being used for em64t on these. A good example is DIRFRC_EFS which enables
> direct force calc splining optimizations that give you single processor
> speedups in the range of 30%. The other stuff is in the 5-10% range
> typically, and if you get the em64t defaults, you probably won't be far off
> target.

Indeed. Right now, when you issue ./configure intel you get:

PMEMD_FPP=cpp -traditional -P -DBINTRAJ -DDIRFRC_EFS -DDIRFRC_COMTRANS -DDIRFRC_NOVEC -DFFTLOADBAL_2PROC -DPUBFFT
PMEMD_F90=ifort
PMEMD_FOPTFLAGS=-fast

This uses the correct ifdefs, I think. I know there are options like FFTW, but
I never saw much speedup from it, so I figured it wasn't worth the hassle.
The one that seems strange is FFTLOADBAL_2PROC: it only does anything if you
are running mpirun -np 2, yes? Is there any harm in leaving it there, or
should we consider stripping it out?
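
For what it's worth, the quickest way to settle that might be an A/B build.
Something along these lines should do it, assuming the flags above end up in
the generated config.h the way configure normally leaves them (the file name
and paths here are my guess, so adjust as needed):

  # Rebuild parallel pmemd without FFTLOADBAL_2PROC and compare a 2-way run
  sed -i 's/ -DFFTLOADBAL_2PROC//' config.h
  cd pmemd/src && make clean && make install
  mpirun -np 2 $AMBERHOME/exe/pmemd -O -i mdin -o mdout.no2proc

If the -np 2 timings come out within noise of the current build, we can strip
it with a clear conscience.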

I assume the vec version of DIRFRC_ is mainly aimed at RISC systems, correct?
Have you tried it lately on any EM64T systems, the ones with SSE4.2 and so on?
The -fast flag does vectorization and full interprocedural optimization at the
link stage, so the difference may not be so great now, but I was wondering
whether you had tried it on Nehalem systems.
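
One way to tease that apart without hunting down RISC hardware would be to
swap -fast for its rough component flags and toggle them one at a time. The
list below is my guess at what -fast amounts to on a recent ifort, so check it
against the compiler docs before trusting any numbers:

  PMEMD_FOPTFLAGS=-O3 -ipo -no-prec-div -xHost
  # Drop -xHost (or -ipo) and rebuild to see how much the compiler's own
  # SSE vectorization buys relative to the DIRFRC_NOVEC path.

That would at least tell us whether the vec/NOVEC distinction still matters
once the compiler is vectorizing the plain code path.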

> But you do need to pick up this level of optimization that is basically
> implemented through alternative code paths. The slow mpi stuff is pretty
> unimportant now, and more likely to cause problems because folks include it
> for non-ethernet systems, so seeing that go away is good (there is a
> potential for small systems to run out of mpi buffer space however, which
> can cause fairly annoying system hangs; something to watch out for).

Yeah, I saw that a lot because people (me included) would use MPICH just for
running on our desktops. When these were dual-core it made no difference, but
now, with 8, 12, or even 16 cores in a desktop, it starts to hurt.
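
A quick way to show where it starts to hurt on one of these boxes is a small
scaling sweep, something like the loop below (the grep just pulls the timing
summary; the exact label in the mdout may differ between versions):

  for n in 2 4 8 12; do
    mpirun -np $n $AMBERHOME/exe/pmemd -O -i mdin -o mdout.np$n
    grep -i 'wall' mdout.np$n
  done

That should make the scaling hit on an 8- or 12-core desktop obvious at a
glance.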

> It is true that a lot of the alternative codepath complexity has to do with
> supporting various risc architectures and itanium, both of which have lost
> the wars. On netcdf, well, in reality, at least in my experience, I

Yeah, a real shame, but cheapness is everything these days... :-( In the minds
of the politicians, the American ideal extends to supercomputers as well, so
it is stated that '...all flops are created equal'.

> install would typically take. Gosh, I hope ibm power systems become
> available again somewhere; they are the best machines available in my
> opinion; I guess they are expensive too though.

Yeah, there are a couple of bright sparks with POWER7 systems coming online,
but getting access to them will be hard, and I doubt anyone can convince a
university to buy one in place of a cheap, FLOP-rich cluster these days. :-(

All the best
Ross

/\
\/
|\oss Walker

| Assistant Research Professor |
| San Diego Supercomputer Center |
| Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
| http://www.rosswalker.co.uk | http://www.wmd-lab.org/ |

Note: Electronic Mail is not secure, has no guarantee of delivery, may not
be read every day, and should not be used for urgent or sensitive issues.






_______________________________________________
AMBER-Developers mailing list
AMBER-Developers.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber-developers