Re: [AMBER-Developers] near-final testing

From: Mark Williamson <mjw.sdsc.edu>
Date: Tue, 13 Apr 2010 08:44:39 -0700

case wrote:
> On Tue, Apr 13, 2010, Jason Swails wrote:
>>> 2. Jason: For Mac OSX10.6, gnu 4.43: what is the nature of the failure
>>> in parallel sander.RISM.MPI? What is your stack size (that causes the
>>> parallel pmemd.MPI jobs to fail)? What do you mean by "memory overlap"?
>> The errors I was getting are the same as the ones I reported for the
>> Ubuntu build on the wiki (I posted sample error messages there for
>> both the RISM and pmemd errors). I'm beginning to think it may be a
>> gnu 4.4-related problem (since that is the only commonality that my
>> systems share, even though they are different 4.4's).
>
> Is it possibly an MPI problem? What version of MPI are you using on the MAC
> (I see mpich2 on ubuntu). Have you ever tried openmpi using the configure
> script we provide?
>

Ok, Ross and I have noticed this one too:

./Run.gbrna
if ( ! 1 ) set TESTsander = ../../exe/sander
if ( ! 1 ) then
set numprocs=`echo $DO_PARALLEL | awk -f ../numprocs.awk `
echo mpirun -np 2
awk -f ../numprocs.awk
if ( 2 > 19 ) then
if ( 0 ) then
endif
endif
cat
set output = mdout.gbrna
mpirun -np 2 ../../exe/pmemd.MPI -O -i gbin -c md4.x -o mdout.gbrna
Assertion failed in file helper_fns.c at line 335: 0
memcpy argument memory ranges overlap, dst_=0xe812e0 src_=0xe812e0 len_=7680

internal ABORT - process 0
rank 0 in job 387 caffeine.sdsc.edu_43871 caused collective abort of
all ranks
   exit status of rank 0: killed by signal 9
goto error
echo ./Run.gbrna: Program error
   ./Run.gbrna: Program error
exit ( 1 )

The MPI installation used for this run is:

/server-home/netbin/mpi/mpich2-1.2.1p1-ifort-10.1.018/


We think it has something to do with the latest version of mpich2; we
are actively investigating. Jason, what version of mpich2 are you using?
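
For reference, the assertion in the trace reports that the destination and source ranges passed to memcpy overlap. Here dst_ and src_ are the same address, so the two 7680-byte ranges coincide completely. Below is a minimal sketch of that kind of range check, written in Python purely for illustration. The function name ranges_overlap is hypothetical, and this is not the code from helper_fns.c; only the three numbers are taken from the error message.

```python
def ranges_overlap(dst: int, src: int, length: int) -> bool:
    """Return True if [dst, dst+length) and [src, src+length) share a byte.

    Hypothetical helper for illustration only; not taken from MPICH2.
    """
    if length <= 0:
        # An empty copy touches no bytes, so nothing can overlap.
        return False
    # Two half-open intervals intersect when each starts before the other ends.
    return dst < src + length and src < dst + length


# Values copied from the assertion message in the trace.
dst = 0xE812E0
src = 0xE812E0
length = 7680

print(ranges_overlap(dst, src, length))
```

With identical addresses and a non-zero length the check reports an overlap, which matches the abort we see. It does not tell us which call handed the same buffer in as both source and destination.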


> If it is a compiler problem, this is a major stumbling block, since 4.4.x
> is the current released compiler, and is found all over the place. Users
> can mostly live without running parallel RISM, but the pmemd errors are in
> bog-simple calculations that everyone will want to run. I'll see if I can try
> some tests myself.

_______________________________________________
AMBER-Developers mailing list
AMBER-Developers.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber-developers
Received on Tue Apr 13 2010 - 09:00:06 PDT