I like the "don't support MPICH-1.2.1p1" option myself as a first guess. I
have not looked at the code to see whether there is potential trouble there,
but there are a lot of performance implications in parts of the mpi code...
I will try to find some time to look at this further...
(the principle I am applying here is that you don't dink up your own code to
support stuff that is broken unless you absolutely have to...)
Regards - Bob
----- Original Message -----
From: "Ross Walker" <ross.rosswalker.co.uk>
To: "'AMBER Developers Mailing List'" <amber-developers.ambermd.org>
Sent: Tuesday, April 13, 2010 7:12 PM
Subject: [AMBER-Developers] Issues with MPICH-1.2.1p1
> Hi All,
>
> Mark and I have been looking at the problems occurring with MPICH-1.2.1p1
> that give the following error with PMEMD:
>
> Assertion failed in file helper_fns.c at line 335: 0
> memcpy argument memory ranges overlap, dst_=0x6e51a4 src_=0x6e51a0 len_=100
>
> I believe a similar issue may be occurring with MPI RISM. In my opinion
> this looks like an overzealous interpretation of the MPI-1 standard by the
> mpich2 authors. Note that earlier versions, such as mpich2-1.0.7, work
> fine.
>
> The problem occurs in the use of mpi_gatherv in PMEMD's gb_parallel.fpp,
> which uses the same send and receive buffer for the gatherv call. Looking
> at the standard this should be perfectly reasonable, and indeed it works
> with EVERY other MPI I have ever tried. However, we probably want to
> address this ASAP before users start complaining. Attached is an example
> program that reproduces this problem.
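>
> In outline, the failing pattern is a gatherv whose send buffer aliases the
> receive buffer. Something like the following reproduces it (a minimal
> sketch, not the actual attachment; the chunk size n is arbitrary):
>
>   program gatherv_overlap
>     implicit none
>     include 'mpif.h'
>     integer, parameter :: n = 4          ! elements per task (illustrative)
>     integer :: ierr, mytaskid, numtasks, i
>     integer, allocatable :: buf(:), rec_counts(:), rec_offsets(:)
>
>     call mpi_init(ierr)
>     call mpi_comm_rank(mpi_comm_world, mytaskid, ierr)
>     call mpi_comm_size(mpi_comm_world, numtasks, ierr)
>
>     allocate(buf(n*numtasks), rec_counts(numtasks), rec_offsets(numtasks))
>     buf = 0
>     buf(mytaskid*n+1 : mytaskid*n+n) = mytaskid + 1 ! fill this task's chunk
>     rec_counts = n
>     rec_offsets = (/ (i*n, i = 0, numtasks-1) /)
>
>     ! Same array used as send and receive buffer, as in gb_parallel.fpp;
>     ! mpich2 1.2.1p1 aborts here with the memcpy-overlap assertion.
>     call mpi_gatherv(buf(mytaskid*n+1), n, MPI_INTEGER, &
>                      buf, rec_counts, rec_offsets, MPI_INTEGER, &
>                      0, mpi_comm_world, ierr)
>
>     if (mytaskid == 0) write(*,*) buf
>     call mpi_finalize(ierr)
>   end program gatherv_overlap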
>
> Specifically, one has the following three options:
>
> 1) !Send each task's chunk of send_array to rec_array on the master
>    call mpi_gatherv(send_array(my_array_offset), my_array_count, &
>                     MPI_INTEGER, rec_array, rec_counts, rec_offsets, &
>                     MPI_INTEGER, 0, mpi_comm_world, ierr)
>
> This works for everything.
>
> 2) !What this 'should' be according to the MPICH2 people:
>    if (mytaskid == 0) then
>      call mpi_gatherv(MPI_IN_PLACE, my_array_count, MPI_INTEGER, &
>                       send_array, rec_counts, rec_offsets, MPI_INTEGER, &
>                       0, mpi_comm_world, ierr)
>    else
>      call mpi_gatherv(send_array(my_array_offset), my_array_count, &
>                       MPI_INTEGER, send_array, rec_counts, rec_offsets, &
>                       MPI_INTEGER, 0, mpi_comm_world, ierr)
>    end if
>
> Note this works ONLY with MPI-2, since MPI_IN_PLACE does not exist in MPI-1.
>
> 3) !Send each task's chunk of send_array into send_array itself on the master
>    call mpi_gatherv(send_array(my_array_offset), my_array_count, &
>                     MPI_INTEGER, send_array, rec_counts, rec_offsets, &
>                     MPI_INTEGER, 0, mpi_comm_world, ierr)
>
> This is what we use right now; it works with all previous MPICH2 releases
> plus every other MPI implementation I have tried.
>
> Suggestions for how we want to address this? We could add a -DMPI2 define,
> which means we would have to update all the configure rules to either
> detect MPI-2 support or rely on the user to specify it.
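>
> For reference, the guarded code in gb_parallel.fpp would look roughly like
> this (just a sketch; the MPI2 symbol and the variable names follow the
> examples above):
>
> #ifdef MPI2
>     ! MPI-2 path: the master gathers in place, so no buffers overlap.
>     if (mytaskid == 0) then
>       call mpi_gatherv(MPI_IN_PLACE, my_array_count, MPI_INTEGER, &
>                        send_array, rec_counts, rec_offsets, MPI_INTEGER, &
>                        0, mpi_comm_world, ierr)
>     else
>       call mpi_gatherv(send_array(my_array_offset), my_array_count, &
>                        MPI_INTEGER, send_array, rec_counts, rec_offsets, &
>                        MPI_INTEGER, 0, mpi_comm_world, ierr)
>     end if
> #else
>     ! MPI-1 path: what we do today; buffers overlap on the master, which
>     ! mpich2 1.2.1p1 now rejects.
>     call mpi_gatherv(send_array(my_array_offset), my_array_count, &
>                      MPI_INTEGER, send_array, rec_counts, rec_offsets, &
>                      MPI_INTEGER, 0, mpi_comm_world, ierr)
> #endif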
>
> Or we rewrite this section of PMEMD (plus other places that may cause
> issues, such as the non-power-of-2 cpu code in sander and RISM?) to use
> different buffers and do a copy afterwards.
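>
> The copy version would be something along these lines (sketch only;
> tmp_array and total_count are hypothetical, a scratch buffer and the total
> gathered element count):
>
>     call mpi_gatherv(send_array(my_array_offset), my_array_count, &
>                      MPI_INTEGER, tmp_array, rec_counts, rec_offsets, &
>                      MPI_INTEGER, 0, mpi_comm_world, ierr)
>     ! Only the master has meaningful gathered data; copy it back.
>     if (mytaskid == 0) send_array(1:total_count) = tmp_array(1:total_count)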
>
> Or we just tell people not to use mpich2 v1.2.1p1 (and probably later
> versions).
>
> All the best
> Ross
>
> /\
> \/
> |\oss Walker
>
> | Assistant Research Professor |
> | San Diego Supercomputer Center |
> | Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
> | http://www.rosswalker.co.uk | http://www.wmd-lab.org/ |
>
> Note: Electronic Mail is not secure, has no guarantee of delivery, may not
> be read every day, and should not be used for urgent or sensitive issues.
>
>
_______________________________________________
AMBER-Developers mailing list
AMBER-Developers.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber-developers
Received on Tue Apr 13 2010 - 17:00:03 PDT