Robert Duke wrote:
> I would try Alexey Onofriev's monster GB benchmark if you really want to 
> see what happens.  This thing has lots of atoms so the arrays are big.  
> I expect Ross has it lying around; if not I can forward you a copy but 
> will have to dig out the mdin from another machine (it is a big 
> prmtop/inpcrd - at least 2 MB zipped).
OK, I've got some results. First off, I applied this modification to 
PMEMD 10, since I had a copy handy on the Triton cluster with a build 
procedure already in place. For the test case, I used 
amber11/test/cuda/nucleosome, taking the mdin from Run_md.1 and 
changing it to run for 1000 steps. I compiled two PMEMD 10 binaries, 
one with and one without the patch mentioned in this thread, and ran 
each ten times over 256 processors. The full results can be found at:
http://www.wmd-lab.org/mjw/for_amber_dev/mpich2_1.2.0_pmemd_testing/results/
Looking at the "GB NonBond Parallel Profiling - NonSetup CPU Seconds:" 
section of the logfiles for the two binaries, I'm using the avg row 
(marked with an arrow) as the metric. For example:
                                  O         D
                                  f         i
              R                   f         s         T
    T         a         D         D         t         o
    a         d         i         i         r         t
    s         i         a         a         i         a
    k         i         g         g         b         l
------------------------------------------------------
    0       8.8      13.8     131.4      20.0     174.0
<..snip..>
  255       8.6      13.5     129.4      22.2     173.6
------------------------------------------------------
  avg       8.6      13.5     129.2      22.3     173.6  <-----
------------------------------------------------------
NORMAL
======
  grep avg *logfile* | grep -v 0.0  | awk '{print $2,$3,$4,$5,$6,$7}'
avg 8.6 13.5 128.8 25.3 176.3
avg 8.6 13.5 128.9 24.4 175.4
avg 8.6 13.5 128.7 22.5 173.3
avg 8.6 13.5 128.8 106.8 257.7
avg 8.6 13.5 128.7 122.5 273.3
avg 8.6 13.5 128.6 21.8 172.4
avg 8.6 13.5 128.8 24.7 175.6
avg 8.6 13.5 128.6 35.7 186.4
avg 8.6 13.5 128.8 25.8 176.8
avg 8.6 13.5 128.8 23.8 174.8
MODDED
======
avg 8.6 13.5 129.1 113.5 264.7
avg 8.6 13.5 129.2 22.3 173.6
avg 8.6 13.5 129.2 22.1 173.4
avg 8.6 13.5 129.3 19.8 171.2
avg 8.6 13.5 129.2 23.9 175.3
avg 8.6 13.5 129.2 22.0 173.3
avg 8.6 13.5 129.3 22.0 173.4
avg 8.6 13.5 129.1 21.5 172.7
avg 8.6 13.5 129.2 21.1 172.5
avg 8.6 13.5 129.2 125.8 277.0
I'm putting the random jumps in the "Distrib" values, which show up in 
both sets of runs, down to transient cluster traffic issues. The Modded 
"OffDiag" value is, on average, about 0.5 seconds (under 0.5%) greater 
than the Normal one, so I don't think my change has had a significantly 
detrimental effect on performance. Would you agree with this?
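
For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes the summary from the avg rows listed above. The numbers are copied by hand from the NORMAL and MODDED lists; the variable and function names are my own and not part of PMEMD. I use the median for "Distrib" so that the occasional 100+ second outliers do not dominate the comparison.

```python
from statistics import mean, median

# (OffDiag, Distrib) avg values, copied from the ten runs of each binary.
normal = [(128.8, 25.3), (128.9, 24.4), (128.7, 22.5), (128.8, 106.8),
          (128.7, 122.5), (128.6, 21.8), (128.8, 24.7), (128.6, 35.7),
          (128.8, 25.8), (128.8, 23.8)]
modded = [(129.1, 113.5), (129.2, 22.3), (129.2, 22.1), (129.3, 19.8),
          (129.2, 23.9), (129.2, 22.0), (129.3, 22.0), (129.1, 21.5),
          (129.2, 21.1), (129.2, 125.8)]

def summarize(runs):
    """Return (mean OffDiag, median Distrib) for a list of runs."""
    offdiag = [o for o, _ in runs]
    distrib = [d for _, d in runs]
    return mean(offdiag), median(distrib)

normal_offdiag, normal_distrib = summarize(normal)
modded_offdiag, modded_distrib = summarize(modded)

# Difference in mean OffDiag time, absolute and relative to the Normal runs.
offdiag_delta = modded_offdiag - normal_offdiag
offdiag_pct = 100.0 * offdiag_delta / normal_offdiag

print(f"OffDiag mean: normal {normal_offdiag:.2f}  modded {modded_offdiag:.2f}")
print(f"OffDiag delta: {offdiag_delta:.2f} s ({offdiag_pct:.2f}%)")
print(f"Distrib median: normal {normal_distrib:.2f}  modded {modded_distrib:.2f}")
```

With these inputs the OffDiag difference works out to about 0.45 seconds, which is where the "about 0.5 seconds" figure comes from.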
regards,
Mark
Received on Mon Apr 19 2010 - 11:30:02 PDT