amber-developers: Verlet update time and ntt=3 parallel scaling

From: Carlos Simmerling <carlos.simmerling.gmail.com>
Date: Tue, 6 May 2008 17:27:22 -0400

Hi all,
I've noticed that when running explicit-water sander jobs on the Blue
Gene with ntt=3, the run time is dominated by the Verlet update time,
which tends to be >50% of the total time and, most importantly, is
almost completely independent of the number of CPUs.
In contrast, the ntt=1 Verlet time is small (1%), meaning that my
simulations scale much, much better with ntt=1. Has anyone else noticed this?

From looking at the code, my guess is that the cause is the gauss calls
in runmd.f, which are done for all atoms regardless of parallelism (to keep
the random number generators in sync). Preliminary testing on 16 nodes on my
cluster seems to confirm this, though Verlet takes a much smaller fraction
there: with far fewer CPUs, the nonbonds dominate.
When I disable the extra gauss calls, the Verlet time drops back to
what it is for ntt=1.
I can't easily test that on the Blue Gene, since wait times are days
and there is no debug queue.
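
To illustrate the suspected mechanism, here is a minimal Python sketch. It is not the Fortran code in runmd.f; the function names, the contiguous atom split, and the seeding scheme are all hypothetical. It assumes every rank seeds an identical generator and draws three Gaussians per atom for all atoms, keeping only the values for the atoms it owns, and compares that with a variant that draws only for owned atoms from a per-rank stream.

```python
import random


def owned_range(n_atoms, n_ranks, rank):
    """Contiguous block of atom indices owned by one rank (hypothetical split)."""
    return rank * n_atoms // n_ranks, (rank + 1) * n_atoms // n_ranks


def draws_synced(n_atoms, n_ranks, rank, seed=0):
    """Every rank walks the full atom list so all generators stay in sync.

    Returns the random vectors kept by this rank and the number of
    Gaussian draws it performed.
    """
    rng = random.Random(seed)  # identical seed on every rank (assumption)
    lo, hi = owned_range(n_atoms, n_ranks, rank)
    kept, calls = {}, 0
    for i in range(n_atoms):
        g = tuple(rng.gauss(0.0, 1.0) for _ in range(3))
        calls += 3
        if lo <= i < hi:  # draws for atoms owned elsewhere are discarded
            kept[i] = g
    return kept, calls


def draws_owned_only(n_atoms, n_ranks, rank, seed=0):
    """Each rank draws only for its own atoms from a per-rank stream."""
    rng = random.Random(seed + rank)  # independent stream per rank (assumption)
    lo, hi = owned_range(n_atoms, n_ranks, rank)
    kept = {i: tuple(rng.gauss(0.0, 1.0) for _ in range(3)) for i in range(lo, hi)}
    return kept, 3 * (hi - lo)


if __name__ == "__main__":
    n_atoms = 4096  # made-up system size, for illustration only
    for n_ranks in (16, 256):
        _, synced = draws_synced(n_atoms, n_ranks, rank=0)
        _, owned = draws_owned_only(n_atoms, n_ranks, rank=0)
        print(n_ranks, "ranks:", synced, "draws/rank synced vs", owned, "owned-only")
```

In the synced variant the per-rank draw count does not depend on the number of ranks, which would be consistent with a Verlet time that does not shrink as CPUs are added. The owned-only variant scales, but a run on N ranks no longer reproduces the random sequence of a serial run.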

Has anyone else noticed similar problems at high processor counts for
sander and ntt=3,
or is this another nice feature of the BG?
Carlos

Verlet is the dominating factor in overall sander scaling for ntt=3 at 256 CPUs:


| Other                    0.73 ( 8.40% of Recip)
| Recip Ewald time         8.68 (49.97% of Ewald)
| Force Adjust             1.13 ( 6.49% of Ewald)
| Virial junk              1.34 ( 7.71% of Ewald)
| Start sycnronization     0.00 ( 0.01% of Ewald)
| Other                    0.02 ( 0.09% of Ewald)
| Ewald time              17.36 (90.19% of Nonbo)
| IPS excludes             0.00 ( 0.01% of Nonbo)
| Other                    0.01 ( 0.05% of Nonbo)
| Nonbond force           19.25 (76.12% of Force)
| Bond/Angle/Dihedral      0.13 ( 0.52% of Force)
| FRC Collect time         4.92 (19.45% of Force)
| Other                    0.99 ( 3.92% of Force)
| Force time              25.29 (40.39% of Runmd)
| Shake time               0.33 ( 0.53% of Runmd)
| Verlet update time      33.53 (53.55% of Runmd)
| CRD distribute time      3.45 ( 5.50% of Runmd)
| Other                    0.02 ( 0.02% of Runmd)
| Runmd Time              62.61 (95.78% of Total)
| Other                    2.76 ( 4.22% of Total)
| Total time              65.37 (100.0% of ALL )
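
As a rough illustration of what these numbers imply for scaling, the sketch below applies an Amdahl-style estimate to the two timings above. It assumes, optimistically, that the Verlet update time stays fixed as CPUs are added while everything else scales ideally; the function names are hypothetical and the projection is not a measurement.

```python
# Timings (seconds) copied from the 256-CPU profile above.
VERLET = 33.53
TOTAL = 65.37


def projected_total(cpu_factor, fixed=VERLET, total=TOTAL):
    """Estimated total time if the CPU count is multiplied by cpu_factor.

    Assumes the `fixed` part does not speed up at all and the rest
    scales perfectly (an idealization).
    """
    return fixed + (total - fixed) / cpu_factor


def projected_speedup(cpu_factor):
    """Speedup relative to the 256-CPU run under the same assumptions."""
    return TOTAL / projected_total(cpu_factor)


if __name__ == "__main__":
    for factor in (2, 4, 8):
        print(f"{factor}x CPUs: {projected_total(factor):.2f} s, "
              f"speedup {projected_speedup(factor):.2f}")
    print(f"upper bound on speedup: {TOTAL / VERLET:.2f}")
```

Under these assumptions, no number of additional CPUs could speed this run up by more than 65.37 / 33.53, or about 1.95 times.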
Received on Wed May 07 2008 - 06:07:49 PDT