Hi All,
After some extensive Portland group compiler testing I have found a set of
options that appear to work well.
With the default configure options plus -p4 (which enables the SSE
options) on a 64 bit Pentium machine there are a number of test failures
(big failures) with QMMM/PIMD and Amoeba.
I have tried a number of different options; here is a summary.
-----------------
On 32 bit P4 machines the following seem to work well:
FFLAGS= -O1 $(LOCALFLAGS) $(AMBERBUILDFLAGS)
FOPTFLAGS= -fast $(LOCALFLAGS) $(AMBERBUILDFLAGS)
(Gives only 3 very minor differences)
FFLAGS= -tp p7 -O1 $(LOCALFLAGS) $(AMBERBUILDFLAGS)
FOPTFLAGS= -tp p7 -Mscalarsse -Mvect=sse -Mflushz -fast -O3 $(LOCALFLAGS) $(AMBERBUILDFLAGS)
(Gives only 5 very minor differences)
So we can choose either of these for regular (old) 32 bit P4 machines
and, I assume (although I have not tested it), old 32 bit AMD chips as
well.
-----------------
On 64 bit machines things seem to be a lot more complicated.
Using just -O1 and -fast still gives a lot of major differences in PIMD
tests involving QMMM and in Amoeba tests.
The following options DO NOT WORK:
FFLAGS= -O1 $(LOCALFLAGS) $(AMBERBUILDFLAGS)
FOPTFLAGS= -fast $(LOCALFLAGS) $(AMBERBUILDFLAGS) !FAILS
FFLAGS= -O1 -Kieee -Mnofprelaxed $(LOCALFLAGS) $(AMBERBUILDFLAGS)
FOPTFLAGS= -fast -Kieee -Mnofprelaxed $(LOCALFLAGS) $(AMBERBUILDFLAGS) !FAILS
However, turning off optimization completely works:
FFLAGS= -g $(LOCALFLAGS) $(AMBERBUILDFLAGS)
FOPTFLAGS= -g $(LOCALFLAGS) $(AMBERBUILDFLAGS)
So it seems that the problems in 64 bit mode are a function of
optimisation, possibly in a single file, but that may take quite a long
time to find.
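One way to hunt for such a file can be sketched as a bisection, under
assumptions that are not in the message above: build everything without
optimisation, switch optimisation back on for half of the source files,
rerun the failing tests, and halve the suspect set each time. The
`tests_pass` callback is a hypothetical stand-in for "rebuild with only
these files optimised and run the test suite", and the file names are
placeholders. It only works if exactly one file is responsible.

```python
from typing import Callable, List, Sequence


def find_bad_file(files: Sequence[str],
                  tests_pass: Callable[[List[str]], bool]) -> str:
    """Bisect for the single file whose optimised build breaks the tests.

    tests_pass(optimised) must return True when a build with only the
    files in `optimised` compiled with optimisation (everything else
    unoptimised) still passes. Assumes exactly one file is at fault.
    """
    if not files:
        raise ValueError("need at least one source file")
    suspects = list(files)
    while len(suspects) > 1:
        half = suspects[: len(suspects) // 2]
        if tests_pass(half):
            # First half is clean, so the culprit is in the second half.
            suspects = suspects[len(half):]
        else:
            # Failure reproduced with only the first half optimised.
            suspects = half
    return suspects[0]


if __name__ == "__main__":
    # Placeholder names; pretend "c.f" is the file that miscompiles.
    sources = ["a.f", "b.f", "c.f", "d.f", "e.f"]
    print(find_bad_file(sources, lambda opt: "c.f" not in opt))
```

Each step costs one rebuild plus one test run, so N files need roughly
log2(N) rounds rather than N.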
The only option that I can get to work properly on a 64 bit Pentium is to
compile it 32 bit. The following option works (one has to explicitly
specify -tp p7 and also specify -m32 on the gcc line):
CFLAGS= -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -O2 -m32
FFLAGS= -tp p7 -O1 $(LOCALFLAGS) $(AMBERBUILDFLAGS)
FOPTFLAGS= -tp p7 -Mscalarsse -Mvect=sse -Mflushz -fast -O3 $(LOCALFLAGS) $(AMBERBUILDFLAGS)
So, for the moment it looks like Portland group's 64 bit compilation code
is broken. Kim, can you try compiling 32 bit using the attached config
file on your Opteron machine with the Portland group compiler?
I have also attached the TEST_FAILURES.diff file for an EM64T machine
running the latest Amber 9 CVS tree compiled with PGF90 6.1-3 using the 32
bit compilation target given above.
Unfortunately I can't test many more options this weekend as the wife is
threatening me with divorce if I so much as think about switching my
computer on tomorrow :-(...
I'll try and narrow things down some more on Monday. For the release,
though, I think we should use one of the following options for ALL
machines, both x86 and x86-64, with the Portland group compilers (which
one depends on how the Opteron tests pan out and on how safe we feel with
the SSE options):
CFLAGS= -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -O2 -m32
FFLAGS= -tp p7 -O1 $(LOCALFLAGS) $(AMBERBUILDFLAGS)
FOPTFLAGS= -tp p7 -fast $(LOCALFLAGS) $(AMBERBUILDFLAGS)
or
CFLAGS= -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -O2 -m32
FFLAGS= -tp p7 -O1 $(LOCALFLAGS) $(AMBERBUILDFLAGS)
FOPTFLAGS= -tp p7 -Mscalarsse -Mvect=sse -Mflushz -fast -O3 $(LOCALFLAGS) $(AMBERBUILDFLAGS)
The -tp p7 (and -m32) is what forces 32 bit compilation.
Note, to make sure that ptraj is also built 32 bit we need to add -m32 to
the LOADCC line as well (otherwise it defaults to linking 64 bit with 32
bit object files and, at least in my case, the gcc compiler segfaults at
link time...):
LOADCC= gcc -m32 $(LOCALFLAGS) $(AMBERBUILDFLAGS)
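Mixed 32 and 64 bit objects can be spotted before the link step, because
every ELF file records its word size in the fifth byte of its header (1
for 32 bit, 2 for 64 bit). The following is a minimal sketch and not part
of the Amber build; the function name `elf_class` is made up for
illustration, and it assumes the object files are ELF, as on Linux.

```python
import sys


def elf_class(header: bytes) -> str:
    """Classify a file from its first five bytes.

    Returns "32-bit", "64-bit", "unknown class" or "not ELF".
    """
    # ELF files start with the magic 0x7f 'E' 'L' 'F'; byte 4 is EI_CLASS.
    if len(header) < 5 or header[:4] != b"\x7fELF":
        return "not ELF"
    return {1: "32-bit", 2: "64-bit"}.get(header[4], "unknown class")


if __name__ == "__main__":
    # Usage: pass the object files to check as command-line arguments.
    for path in sys.argv[1:]:
        with open(path, "rb") as fh:
            print(path, elf_class(fh.read(5)))
```

If any object in the ptraj directory reports a different class from the
rest, the corresponding compile or link line is probably missing -m32.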
All the best
Ross
/\
\/
|\oss Walker
| HPC Consultant and Staff Scientist |
| San Diego Supercomputer Center |
| Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
| http://www.rosswalker.co.uk | PGP Key available on request |
Note: Electronic Mail is not secure, has no guarantee of delivery, may not
be read every day, and should not be used for urgent or sensitive issues.
Received on Wed Apr 05 2006 - 23:49:31 PDT