RE: [AMBER-Developers] pmemd.cuda errors

From: Ross Walker
Date: Sat, 6 Mar 2010 09:53:44 -0800

> A lot of the tests passed with the 2.3 toolkit (some had the NaN error
> everywhere and were just awful). With the 3.0 toolkit, the only
> failures were precision differences. They were slightly more
> pronounced than the errors in regular sander, but I'll chalk that up
> to the single precision being used for the operations. They were
> still only differing in the hundredths/thousandths place for values
> that were as large as 30,000.

Yes. We have 2 debug modes right now as well. The first is -cuda_SPSP, which
runs everything stonkingly fast in all single precision, a la ACEMD and other
GPU implementations. This, however, has issues with large molecules blowing
up after a few hundred steps. Interesting that things like the 1 million atom
test case used in the ACEMD paper were only run for 50 steps. Hmmmmmmm...

Then there is -cuda_DPDP which runs everything stonkingly SLOWLY in double
precision. This is the gold standard though by which I am testing things and
it matches the CPU output almost perfectly (to the same last dp differences
as we see in CPU vs CPU runs). The default -cuda (which is really cuda_SPDP)
is halfway between on the precision and gets differences beyond the 7th or
8th sig fig, especially in the PME calculations. I am trying to validate
this properly right now (order parameters, radial distribution functions,
etc.) and this generally looks good. Lots of other things keep getting in
the way though, which is slowing this progress drastically. It seems that
EVERYTHING has a deadline in March... Sigh...
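
To make the three modes concrete, here is a minimal Python sketch (not the
pmemd.cuda code). It assumes that the two letters in each mode name stand for
the precision of the per-term arithmetic and the precision of the accumulator
respectively, which this message does not spell out. The helper names f32 and
accumulate and the synthetic list of terms are hypothetical, chosen only for
illustration.

```python
import struct


def f32(x):
    """Round a Python float (a double) to the nearest IEEE-754 single."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(terms, term_single, sum_single):
    """Sum terms, optionally rounding each term and/or the running sum
    to single precision after every operation."""
    total = 0.0
    for t in terms:
        if term_single:
            t = f32(t)          # per-term value held in single precision
        total += t
        if sum_single:
            total = f32(total)  # accumulator held in single precision
    return total


# Synthetic stand-in for many small contributions to one large quantity.
terms = [0.1 + 1e-7 * i for i in range(100000)]

spsp = accumulate(terms, term_single=True, sum_single=True)
spdp = accumulate(terms, term_single=True, sum_single=False)
dpdp = accumulate(terms, term_single=False, sum_single=False)

# Relative error of each reduced-precision sum against the all-double one.
err_spsp = abs(spsp - dpdp) / abs(dpdp)
err_spdp = abs(spdp - dpdp) / abs(dpdp)

print(f"SPSP {spsp:.6f}  relative error {err_spsp:.2e}")
print(f"SPDP {spdp:.6f}  relative error {err_spdp:.2e}")
print(f"DPDP {dpdp:.6f}  (reference)")
```

Running it should show the all-single sum drifting further from the
all-double reference than the mixed sum does, which is the qualitative
behaviour described above. The exact figures depend on the made-up input.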

Anyway, the short answer is that recent changes to the way PME is done will
have changed the PME answers beyond the 7th or 8th sig fig or so (which is
roughly the limit of single precision) for the default precision case.
Before I update the test cases to match this, I need to run through the test
cases with the DPDP precision and make sure they match properly. Then I can
update the SPDP
save files. However, there is an issue with DPDP not working due to
exhausting memory on the card which needs to be fixed before I do this.
Hence for the moment I have left the save files at the previous version.
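
Comparing a run against a saved reference at a chosen number of significant
figures can be sketched as below. This is a hypothetical helper, not the
comparison script the AMBER test suite uses. The function names, the regular
expression, and the two sample output lines are invented for illustration.

```python
import math
import re

# Matches plain decimal numbers with an optional sign and exponent.
_NUMBER = re.compile(r"[-+]?(?:\d+\.\d*|\.\d+|\d+)(?:[eE][-+]?\d+)?")


def numbers_in(text):
    """Return every number found in text, in order, as floats."""
    return [float(tok) for tok in _NUMBER.findall(text)]


def mismatches(reference, candidate, rel_tol):
    """List (index, reference_value, candidate_value) for every pair of
    numbers that differs by more than the relative tolerance."""
    ref = numbers_in(reference)
    new = numbers_in(candidate)
    if len(ref) != len(new):
        raise ValueError("outputs contain different counts of numbers")
    return [
        (i, r, n)
        for i, (r, n) in enumerate(zip(ref, new))
        if not math.isclose(r, n, rel_tol=rel_tol, abs_tol=1e-12)
    ]


# Made-up sample lines: values near 30,000 that differ in the
# hundredths/thousandths place, as in the report quoted above.
saved = "Etot = -30000.1234  EKtot = 5000.5000"
gpu = "Etot = -30000.1301  EKtot = 5000.5004"

loose = mismatches(saved, gpu, rel_tol=1e-6)  # about 6 significant figures
tight = mismatches(saved, gpu, rel_tol=1e-7)  # about 7 significant figures

print("differences at 1e-6:", loose)
print("differences at 1e-7:", tight)
```

A relative tolerance is used here rather than an absolute one, because a
difference in the thousandths place means something very different for a
value of 30,000 than for a value of 0.5.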

> Perhaps I didn't allow the tests to go all the way through the PME
> ones... I'll do a more thorough test on Tuesday. A note on compiling,
> however. ./configure linux_em64t_cuda gfortran nopar did not work.
> It failed at the link step, making some complaint about a reference to
> MAIN_ or something. I don't quite recall... It was resolved when I
> used the linux_em64t_cuda_SPDP.gfortran config data. If you want/need
> more information, I'll be happy to try and re-create the error.

Stop using these configure files and move over to the main AMBER build.

cd $AMBERHOME/src/
./configure -cuda gnu
make -j8
cd ../test/
make test.serial.cuda

P.S. Intel compilers and any non-x86_64 machine will still be broken unless
you compile up your own version of cudpp. I have the static library in there
right now which needs to go before release. To do this though I need to cut
down the cudpp library to extract out JUST the radix sort, otherwise it is
one massive pile of BOOST. Right now though the 1.1.1 cudpp we need to use
(for Fermi) just segfaults on my machine deep inside the driver which has
put the brakes on this effort...

Note, all of this, plus paper deadlines, end of year report deadlines and
weeklong all hands meetings for my DOE grant ALL in Feb and Mar is why there
is NO documentation for any of this yet.

All the best

|\oss Walker

| Assistant Research Professor |
| San Diego Supercomputer Center |
| Tel: +1 858 822 0854 | EMail:- |

Note: Electronic Mail is not secure, has no guarantee of delivery, may not
be read every day, and should not be used for urgent or sensitive issues.

AMBER-Developers mailing list
Received on Sat Mar 06 2010 - 10:00:05 PST