Re: [AMBER-Developers] fftw questions

From: Duke, Robert E Jr <>
Date: Sun, 12 Feb 2012 04:03:09 +0000

Hi Dave,
I know I left pmemd capable of handling both fftw2 and fftw3 in case different users had different versions. By now, it would probably be fine to move forward to fftw3. With pmemd, it used to be true that fftw could give you up to 10% higher performance, but only in single-processor mode. This was accomplished by extreme processor optimization in fftw, with actual testing at fftw initialization. The cost of this was a little additional setup time (this all happens "automagically" on startup) and, more significantly, somewhat confusing variation in the last digit of the output fairly early on. That variation comes from actual differences in fft output, caused by variation in the algorithms in use - timing factors in the rest of the system could cause fftw to choose different algorithms at different times on the same machine.

While fftw is some cool stuff, pmemd can live totally without it. In parallel, the optimizations I did to the original sander fft code by Tom (originally from NetLib, if memory serves) were sufficiently good that by the time you were using several processors, any fftw advantage was pretty much lost. So you don't really need fftw2 or 3 for pmemd, to be sure; pmemd was redesigned with a facility to easily plug in different fft implementations, but the need for this, too, is minimal.
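
To illustrate the last-digit variation described above, here is a minimal Python sketch. It is not pmemd or fftw code; the function names `dft_naive` and `fft_radix2` and the test signal are made up for the example. It computes the same transform with two mathematically equivalent algorithms, which perform the floating-point additions in a different order and so need not agree to the last bit.

```python
import cmath

def dft_naive(x):
    """Direct O(n^2) evaluation of the discrete Fourier transform."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def fft_radix2(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    half = n // 2
    # Twiddle factors applied to the odd-indexed half.
    tw = [cmath.exp(-2j * cmath.pi * k / n) * odd[k] for k in range(half)]
    return ([even[k] + tw[k] for k in range(half)] +
            [even[k] - tw[k] for k in range(half)])

# Hypothetical input, chosen only for illustration.
signal = [complex(0.1 * (i + 1), 0.0) for i in range(8)]
a = dft_naive(signal)
b = fft_radix2(signal)

# Largest disagreement between the two algorithms: tiny, at rounding level.
max_diff = max(abs(p - q) for p, q in zip(a, b))
print("max difference between algorithms:", max_diff)

# The root cause: floating-point addition is not associative.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)  # False
```

In a long MD run, differences of this size get amplified step by step, which is why a different algorithm choice at startup shows up as diverging output fairly early on.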

I have not kept up with the SSE stuff - I expect that not having flags could bite you mostly if you have old hardware and old compilers. One of the pmemd complexities in all this was that on really big installations folks might build pmemd on a login node with a different architecture than the backend compute nodes.

Probably all stuff you already mostly knew, but that is my review of history on these issues; I wish Intel would do something simple with their compiler switches and stick with it; perhaps they already have (the PathScale compilers were such a pleasure to use in this regard).

Regards - Bob

From: case []
Sent: Saturday, February 11, 2012 7:40 PM
Subject: [AMBER-Developers] fftw questions

Some questions about fftw:

1. Do we still need fftw2? The configure2 script configures it if both
rism and mdgx are set to "no", but I don't understand why. What codes are
using fftw2?

Note that there is a buglet in the Makefile: the clean and uninstall targets
try to run "make" inside fftw-2.1.5, but if those are not configured, then
there is no Makefile to be run inside those directories.

2. I'm still having problems with fftw3, on an i386 Intel mac (OS 10.5.8).
I find that I have to add -nosse to the configure2 script: this wasn't caught
before because of a bug (now fixed) in configure2 that never passed sse flags
to fftw3 even if sse was turned on. I'd be interested to hear if anyone else
can confirm this; and if sse problems occur on other Macs, or other platforms.
(I'm using gcc 4.5.2).

3. In general, all this stuff about SSE seems fragile and possibly
unnecessary. Can't compilers optimize code with minimal help
from the user? That is, do we know that just using -mtune=native (or -fast
for Intel) doesn't do a good job? Should I be expected to know the
differences between SSE2, SSE3 and SSE4.1 or 4.2??? It would sure be nice
to simplify this.

4. Do we really need the -nomdgx or -nocpptraj flags? I'd be inclined to
have fewer flags, and ask users to just comment out the relevant lines in the
Makefile if they happen to be on a machine where these programs fail to compile.
Probably the same answer for -nomtkpp. I'm inclined to keep
-noX11, since that is a relatively common option.
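
For the Makefile buglet in item 1, the fix amounts to skipping any subdirectory that was never configured. The real change would live in the Makefile itself; the Python sketch below only shows the guard logic. The helper name `clean_subdir` is hypothetical, and `fftw-2.1.5` is the directory named above.

```python
import os
import subprocess

def clean_subdir(path, target="clean"):
    """Run `make <target>` in `path` only if a Makefile exists there.

    Returns True if make was invoked, False if the directory was skipped.
    """
    if not os.path.isfile(os.path.join(path, "Makefile")):
        # Not configured (or not present): nothing to clean, and no error.
        return False
    subprocess.run(["make", target], cwd=path, check=False)
    return True

if __name__ == "__main__":
    # Skipped silently when fftw-2.1.5 has no Makefile.
    clean_subdir("fftw-2.1.5")
```

With this guard, clean and uninstall would succeed whether or not fftw2 had been configured.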


AMBER-Developers mailing list
Received on Sat Feb 11 2012 - 20:30:02 PST