Re: [AMBER-Developers] Suggestions for dealing with mpich2-1.2.1p1

From: Robert Duke <rduke.email.unc.edu>
Date: Fri, 16 Apr 2010 12:31:05 -0400

So one other comment - when I worry about breaking pmemd performance, I am
NOT typically concerned that I, or somebody else, will wedge performance on <
16 CPUs; I am concerned about what happens at >= 128 CPUs, and the issues
become really sticky when trying to protect performance once you are over a
few tens of processors. It is unbelievably easy to add a line or two that
just wreaks havoc (I don't think this one is THAT bad, but I am just
trying to justify/explain my general paranoia :-)).
Regards - Bob
----- Original Message -----
From: "Robert Duke" <rduke.email.unc.edu>
To: "AMBER Developers Mailing List" <amber-developers.ambermd.org>
Sent: Friday, April 16, 2010 12:25 PM
Subject: Re: [AMBER-Developers] Suggestions for dealing with mpich2-1.2.1p1


> Hi Mark,
> I had the same basic idea, but hesitate because it may have an impact as
> the processor count goes somewhere north of 64 - hard to say; I would
> test it at, say, 256 on a high-atom-count system and see what happens
> there. The reason for the concern, off the top of my head: the extra
> copy in the master stalls the master a bit at all places this gets called,
> and that has the impact of stalling everybody else. I do realize this is
> a heck of a mess. I am not wildly passionate about GB performance (nobody
> has ever made it a real priority; if the amber community really wanted it
> superfast and I had funding to do it, I could make it a heck of a lot
> faster), so aside from a bit of pride-of-ownership and cringing over
> anything that slows pmemd down, I could probably let it go.
>
> I was going to mention earlier that if a scheme were used where the user
> has made a choice that can be clearly identified as mpich 1 vs 2, then you
> may be able to simplify the rest of this. I just don't have the matrix of
> all the things that work and don't work and are and are not available for
> the various mpi subversions, so I can't really sort it all out without
> lots of work. Another case of being subverted by subversion code
> inherited from others; time to write our own mpi ;-)
> Regards - Bob
>
> ----- Original Message -----
> From: "Mark Williamson" <mjw.sdsc.edu>
> To: "AMBER Developers Mailing List" <amber-developers.ambermd.org>
> Sent: Friday, April 16, 2010 12:13 PM
> Subject: Re: [AMBER-Developers] Suggestions for dealing with
> mpich2-1.2.1p1
>
>
>> Ross Walker wrote:
>>> Hi All,
>>>
>>> I am trying to address the mpich2-1.2.1p1 issue regarding pmemd (and
>>> also
>>
>> With reference to the PMEMD test failures with recent mpich2 versions
>> (the old code passes vec as both the send and receive buffer in
>> mpi_gatherv, which newer mpich2 rejects as buffer aliasing), I find
>> the following patch works well for me:
>>
>> --- a/src/pmemd/src/gb_parallel.fpp
>> +++ b/src/pmemd/src/gb_parallel.fpp
>> @@ -305,14 +305,20 @@ subroutine gb_mpi_gathervec(atm_cnt, vec)
>>
>> integer :: atm_cnt
>> double precision :: vec(3, atm_cnt)
>> + double precision :: recv_buf(3, atm_cnt)
>>
>> ! Local variables:
>>
>> call mpi_gatherv(vec(1, atm_offsets(mytaskid) + 1), &
>> vec_rcvcnts(mytaskid), mpi_double_precision, &
>> - vec, vec_rcvcnts, vec_offsets, &
>> + recv_buf, vec_rcvcnts, vec_offsets, &
>> mpi_double_precision, 0, mpi_comm_world,
>> err_code_mpi)
>>
>> + if (mytaskid == 0) then
>> + vec = recv_buf
>> + end if
>> +
>> +
>> return
>>
>>
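>> For reference, an MPI-2 alternative (not part of this patch, and it
>> would give up the MPI-1 compliance noted below) is MPI_IN_PLACE, which
>> avoids both the aliasing error and the extra copy on the master; a
>> sketch only, assuming the same pmemd variables as above:
>>
>> if (mytaskid == 0) then
>> ! Root: MPI_IN_PLACE leaves the root's own contribution where it
>> ! already sits in vec, so no recv_buf or post-gather copy is needed.
>> call mpi_gatherv(MPI_IN_PLACE, 0, mpi_double_precision, &
>> vec, vec_rcvcnts, vec_offsets, &
>> mpi_double_precision, 0, mpi_comm_world, err_code_mpi)
>> else
>> ! Non-root ranks: the receive arguments are ignored by the standard,
>> ! so passing vec here is harmless.
>> call mpi_gatherv(vec(1, atm_offsets(mytaskid) + 1), &
>> vec_rcvcnts(mytaskid), mpi_double_precision, &
>> vec, vec_rcvcnts, vec_offsets, &
>> mpi_double_precision, 0, mpi_comm_world, err_code_mpi)
>> end if
>>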
>> Highlights:
>>
>> * Compiles and runs fine with mpich2-1.2.1p1
>> * All test.parallel.pmemd pass
>> * Does not require changes to Makefiles/config or passing of extra flags.
>> * Is MPI-1 Compliant ;)
>> * Seems not to affect performance, viz:
>>
>> Benchmarks
>> ==========
>>
>> Two pmemd.MPI binaries differing only in the above patch were prepared in
>> an identical fashion. Three repeat runs of a modified GB benchmark were
>> carried out for each binary.
>>
>> Tested with mpich2-1.0.7 since the old code will fail with the newer
>> mpich2.
>>
>> Method
>> ------
>>
>> cd $AMBERHOME/benchmarks/gb_mb
>>
>> # Modify bench.gb_mb so that nstlim=10000 and
>> # so that it does not delete bench.gb_mb.out at the end
>> export TESTsander=../../exe/pmemd.MPI
>> export DO_PARALLEL="mpirun -np 8"
>>
>> # Test loop
>> for i in `seq 1 3`;
>> do
>> ./bench.gb_mb && grep "Master Total wall time" bench.gb_mb.out
>> done
>>
>>
>>
>> Results
>> -------
>>
>> old
>> ---
>> | Master Total wall time: 346 seconds 0.10 hours
>> | Master Total wall time: 347 seconds 0.10 hours
>> | Master Total wall time: 347 seconds 0.10 hours
>>
>>
>> new
>> ---
>> | Master Total wall time: 345 seconds 0.10 hours
>> | Master Total wall time: 347 seconds 0.10 hours
>> | Master Total wall time: 346 seconds 0.10 hours
>>
>>
>> regards,
>>
>> --
>> Mark Williamson, Post Doc
>> Walker Molecular Dynamics Group
>> Room 395E
>> San Diego Supercomputer Center
>> 9500 Gilman Drive
>> La Jolla, CA 92093-0505
>> Email: mjw at sdsc.edu
>> Office: 858-246-0827
>>
>> _______________________________________________
>> AMBER-Developers mailing list
>> AMBER-Developers.ambermd.org
>> http://lists.ambermd.org/mailman/listinfo/amber-developers
>>
>>
>
>
>



Received on Fri Apr 16 2010 - 10:00:04 PDT