Re: [AMBER-Developers] near-final testing

From: Jason Swails <jason.swails.gmail.com>
Date: Tue, 13 Apr 2010 11:51:42 -0400

On Tue, Apr 13, 2010 at 11:44 AM, Mark Williamson <mjw.sdsc.edu> wrote:
> case wrote:
>>
>> On Tue, Apr 13, 2010, Jason Swails wrote:
>>>>
>>>> 2. Jason: For Mac OS X 10.6, gnu 4.4.3:  what is the nature of the failure
>>>> in parallel sander.RISM.MPI?  What is your stack size (that causes the
>>>> parallel pmemd.MPI jobs to fail)?  What do you mean by "memory overlap"?
>>>
>>> The errors I was getting are the same as the ones I reported for the
>>> Ubuntu build on the wiki (I posted sample error messages there for
>>> both the RISM and pmemd errors).  I'm beginning to think it may be a
>>> gnu 4.4-related problem (since that is the only commonality that my
>>> systems share, even though they are different 4.4's).
>>
>> Is it possibly an MPI problem?  What version of MPI are you using on the
>> Mac (I see mpich2 on Ubuntu)?  Have you ever tried openmpi using the
>> configure script we provide?
>>
>
> Ok, Ross and I have noticed this one too:
>
> ./Run.gbrna
> if ( ! 1 ) set TESTsander = ../../exe/sander
> if ( ! 1 ) then
> set numprocs=`echo $DO_PARALLEL | awk -f ../numprocs.awk `
> echo mpirun -np 2
> awk -f ../numprocs.awk
> if ( 2 > 19 ) then
> if ( 0 ) then
> endif
> endif
> cat
> set output = mdout.gbrna
> mpirun -np 2 ../../exe/pmemd.MPI -O -i gbin -c md4.x -o mdout.gbrna
> Assertion failed in file helper_fns.c at line 335: 0
> memcpy argument memory ranges overlap, dst_=0xe812e0 src_=0xe812e0 len_=7680
>
> internal ABORT - process 0
> rank 0 in job 387  caffeine.sdsc.edu_43871   caused collective abort of all
> ranks
>  exit status of rank 0: killed by signal 9
> goto error
> echo   ./Run.gbrna:  Program error
>  ./Run.gbrna:  Program error
> exit ( 1 )
>
> /server-home/netbin/mpi/mpich2-1.2.1p1-ifort-10.1.018/
>
>
> We think it is something to do with the latest version of mpich2; we are
> actively investigating. Jason, what version of mpich2 are you using?

1.2.1 -- I hadn't thought to check the MPI implementation, since I've always
used OpenMPI on my Mac; I only switched to MPICH2 for convenience when I
recently reconfigured my system. I'm verifying right now that everything
works with OpenMPI on Ubuntu, and if it does I'll move that setup over to my
Mac.
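For reference, the assertion in the transcript above ("memcpy argument memory
ranges overlap", with dst_ == src_ and len_ = 7680, i.e. 960 doubles) looks
like the check that recent MPICH2 releases run inside helper_fns.c when a
collective is handed the same array as both its send and receive buffer; the
MPI standard only permits that via MPI_IN_PLACE. A minimal C sketch of the two
call forms follows -- a hypothetical example, not the actual AMBER call site:

  /* Sketch of the aliased-buffer pattern that trips MPICH2's memcpy-overlap
   * assertion, plus the MPI_IN_PLACE form that avoids it (hypothetical
   * example; the buffer size is chosen only to match the 7680-byte len_
   * reported above). */
  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      double buf[960];                /* 960 * 8 bytes = 7680 bytes */
      int i, rank;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      for (i = 0; i < 960; i++)
          buf[i] = (double) rank;

      /* WRONG: send and receive buffers alias (dst_ == src_ inside MPICH2),
       * which newer MPICH2 versions abort on:
       *
       *   MPI_Allreduce(buf, buf, 960, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
       */

      /* Correct in-place reduction: */
      MPI_Allreduce(MPI_IN_PLACE, buf, 960, MPI_DOUBLE, MPI_SUM,
                    MPI_COMM_WORLD);

      if (rank == 0)
          printf("buf[0] after allreduce = %f\n", buf[0]);

      MPI_Finalize();
      return 0;
  }

OpenMPI and older MPICH2 builds silently tolerate the aliased form, which
would be consistent with the failure only showing up under the newer mpich2.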

>
>
>> If it is a compiler problem, this is a major stumbling block, since 4.4.x
>> is the current released compiler, and is found all over the place.  Users
>> can mostly live without running parallel RISM, but the pmemd errors are in
>> bog-simple calculations that everyone will want to run.  I'll see if I can
>> try
>> some tests myself.
>



-- 
---------------------------------------
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Graduate Student
352-392-4032
_______________________________________________
AMBER-Developers mailing list
AMBER-Developers.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber-developers
Received on Tue Apr 13 2010 - 09:00:06 PDT