Re: [AMBER-Developers] Suggestions for dealing with mpich2-1.2.1p1

From: Robert Duke <>
Date: Fri, 16 Apr 2010 12:25:07 -0400

Hi Mark,
I had the same basic idea, but hesitate because it may have an impact as the
processor count goes north of 64 - hard to say; I would test it at something
like 256 processors on a high atom count system and see what happens there.
The reason for the concern, off the top of my head - the extra copy stalls
the master a bit everywhere this gets called, and that has the effect of
stalling everybody else. I do realize this is a heck of a
mess. I am not wildly passionate about GB performance (nobody has ever made
it a real priority; if the amber community really wanted it superfast and I
had funding to do it, I could make it a heck of a lot faster), so aside from
a bit of pride-of-ownership and cringing over anything that slows pmemd
down, I could probably let it go.

I was going to mention earlier that if a scheme were used where the user has
made a choice that can be clearly identified as mpich 1 vs 2, then you may
be able to simplify the rest of this. I just don't have the matrix of all
the things that work and don't work and are and are not available for the
various mpi subversions, so I can't really sort it all out without lots of
work. Another case of being subverted by subversion code inherited from
others; time to write our own mpi ;-)
Regards - Bob

----- Original Message -----
From: "Mark Williamson" <>
To: "AMBER Developers Mailing List" <>
Sent: Friday, April 16, 2010 12:13 PM
Subject: Re: [AMBER-Developers] Suggestions for dealing with mpich2-1.2.1p1

> Ross Walker wrote:
>> Hi All,
>> I am trying to address the mpich2-1.2.1p1 issue regarding pmemd (and also
> With reference to the PMEMD test failures with recent mpich2 versions,
> I find the following patch works well for me:
> --- a/src/pmemd/src/gb_parallel.fpp
> +++ b/src/pmemd/src/gb_parallel.fpp
> @@ -305,14 +305,20 @@ subroutine gb_mpi_gathervec(atm_cnt, vec)
>    integer :: atm_cnt
>    double precision :: vec(3, atm_cnt)
> +  double precision :: recv_buf(3, atm_cnt)
>  ! Local variables:
>    call mpi_gatherv(vec(1, atm_offsets(mytaskid) + 1), &
>                     vec_rcvcnts(mytaskid), mpi_double_precision, &
> -                   vec, vec_rcvcnts, vec_offsets, &
> +                   recv_buf, vec_rcvcnts, vec_offsets, &
>                     mpi_double_precision, 0, mpi_comm_world, err_code_mpi)
> +  if (mytaskid == 0) then
> +    vec = recv_buf
> +  end if
> +
> +
>    return
> Highlights:
> * Compiles and runs fine with mpich2-1.2.1p1
> * All test.parallel.pmemd tests pass
> * Does not require changes to Makefiles/config or passing of extra flags
> * Is MPI-1 compliant ;)
> * Seems not to affect performance, viz:
> Benchmarks
> ==========
> Two pmemd.MPI binaries, differing only in the above patch, were prepared in
> an identical fashion. Three repeat runs of a modified gb benchmark were
> carried out for each binary.
> Tested with mpich2-1.0.7, since the old code will fail with the newer
> mpich2.
> Method
> ------
> cd $AMBERHOME/benchmarks/gb_mb
> # Modify bench.gb_mb so that nstlim=10000 and
> # so that it does not delete bench.gb_mb.out at the end
> export TESTsander=../../exe/pmemd.MPI
> export DO_PARALLEL="mpirun -np 8"
> # Test loop
> for i in `seq 1 3`;
> do
> ./bench.gb_mb && grep "Master Total wall time" bench.gb_mb.out
> done
> Results
> -------
> old
> ---
> | Master Total wall time: 346 seconds 0.10 hours
> | Master Total wall time: 347 seconds 0.10 hours
> | Master Total wall time: 347 seconds 0.10 hours
> new
> ---
> | Master Total wall time: 345 seconds 0.10 hours
> | Master Total wall time: 347 seconds 0.10 hours
> | Master Total wall time: 346 seconds 0.10 hours
> regards,
> --
> Mark Williamson, Post Doc
> Walker Molecular Dynamics Group
> Room 395E
> San Diego Supercomputer Center
> 9500 Gilman Drive
> La Jolla, CA 92093-0505
> Email: mjw at
> Office: 858-246-0827
> _______________________________________________
> AMBER-Developers mailing list

Received on Fri Apr 16 2010 - 09:30:06 PDT
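
As a quick check on the benchmark figures quoted above, the sketch below averages the three wall times reported for each binary and prints the relative difference. The numbers are copied from the "Results" section of the quoted message; the variable names are illustrative and not from the original post, and Python is used only for the arithmetic.

```python
from statistics import mean

# Wall times in seconds, copied from the "Results" section of the quoted message.
old_times = [346, 347, 347]  # unpatched binary
new_times = [345, 347, 346]  # binary built with the recv_buf patch

old_mean = mean(old_times)
new_mean = mean(new_times)

# Relative change of the patched binary with respect to the unpatched one.
diff_pct = 100.0 * (new_mean - old_mean) / old_mean

print(f"old mean: {old_mean:.2f} s")
print(f"new mean: {new_mean:.2f} s")
print(f"difference: {diff_pct:+.2f} %")
```

The difference comes out at roughly 0.2 %, which is inside the 1 to 2 second run-to-run spread of the three repeats. These runs used 8 MPI processes only, so they say nothing about the higher processor counts raised in the reply.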