Re: [AMBER-Developers] sander.MPI tests randomly stall in Ubuntu 9.10

From: Lachele Foley <lfoley.ccrc.uga.edu>
Date: Mon, 10 May 2010 16:23:58 -0400

Sorry I've been quiet lately... very overwhelmed.

A student did some testing, and she got a similar hang. For us, the hangs weren't random; the run stopped at the same test each time:

==============================================================
make[2]: Leaving directory `/usr/local/amber11_20100507/test'
cd neb-testcases/neb_gb_partial && ./Run.neb_gb_partial

 Running multisander version of sander Amber11
    Total processors = 4
    Number of groups = 4
.......

She said she tried it twice, and I confirmed it a third time just for fun. The serial tests went mostly well, with 13 failures (all ncsu stuff) and one error (DFTB, I think).
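
A minimal sketch of one way to tell a stalled test from a slow one: wrap the test script in the coreutils `timeout` command, which exits with status 124 when the time limit is reached. The helper name `run_with_limit` and the time limit are assumptions for illustration, not part of the Amber test suite.

```shell
# Hypothetical helper (not part of Amber): run one command under a time limit.
# Usage: run_with_limit SECONDS COMMAND [ARGS...]
run_with_limit() {
    limit="$1"; shift
    # GNU timeout stops the command after $limit seconds and returns 124.
    timeout "$limit" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "STALLED after ${limit}s: $*"
    elif [ "$status" -ne 0 ]; then
        echo "FAILED (exit $status): $*"
    else
        echo "OK: $*"
    fi
    return "$status"
}
```

For the hang above, this could be called as `(cd neb-testcases/neb_gb_partial && run_with_limit 600 ./Run.neb_gb_partial)`, where 600 seconds is an arbitrary guess at a generous limit. It is still worth checking with "top" afterwards for leftover sander.MPI processes.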

Info of relevance (64-bit something?):

Linux aarya.woods.ccrc 2.6.18-164.11.1.el5 #1 SMP Wed Jan 6 13:26:04 EST 2010 x86_64 x86_64 x86_64 GNU/Linux

GNU Fortran (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46)

gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46)

mpirun (Open MPI) 1.4.1

mpicc -showme
gcc -I/usr/local/include -pthread -L/usr/local/lib -lmpi -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl

mpif90 -showme
gfortran -I/usr/local/include -pthread -I/usr/local/lib -L/usr/local/lib -lmpi_f90 -lmpi_f77 -lmpi -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl

mpif77 -showme
gfortran -I/usr/local/include -pthread -L/usr/local/lib -lmpi_f77 -lmpi -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl


She says she'll try tonight on the machine in Ireland (64-bit, with the wiggy file system).

:-) Lachele
--
B. Lachele Foley, PhD '92,'02
Assistant Research Scientist
Complex Carbohydrate Research Center, UGA
706-542-0263
lfoley.ccrc.uga.edu
----- Original Message -----
From: Gustavo Seabra [mailto:gustavo.seabra.gmail.com]
To: AMBER Developers Mailing List [mailto:amber-developers.ambermd.org]
Sent: Mon, 10 May 2010 14:40:54 -0400
Subject: Re: [AMBER-Developers] sander.MPI tests randomly stall in Ubuntu 9.10
> Hi,
> 
> Here's what I have:
> $ uname -a
> Linux TIE1 2.6.31-14-generic #48-Ubuntu SMP Fri Oct 16 14:05:01 UTC 2009 x86_64 GNU/Linux
> 
> $ gfortran --version
> GNU Fortran (Ubuntu 4.4.1-4ubuntu9) 4.4.1
> 
> $ gcc --version
> gcc (Ubuntu 4.4.1-4ubuntu9) 4.4.1
> 
> $ mpirun --version
> mpirun (Open MPI) 1.3.2
> 
> The serial version works just fine. I only see this with the parallel
> version. The machine is a dual-core AMD Athlon(tm) II X2 B22
> Processor 800MHz, and I was running the tests with 4 MPI processes.
> 
> Thanks,
> Gustavo.
> 
> On Fri, May 7, 2010 at 10:46 PM, Daniel Roe wrote:
> > I personally haven't seen this before. What compiler(s) are you using, and
> > what MPI implementation?
> >
> > -Dan
> >
> > On Fri, May 7, 2010 at 2:14 PM, Gustavo Seabra <gustavo.seabra.gmail.com> wrote:
> >
> >> Hi All,
> >>
> >> I have compiled and tested Amber11 (release version) serial with no
> >> problems on Ubuntu 9.10. Now, when I try the parallel version, it
> >> compiles fine but some tests just stall. The processes keep running
> >> (from "top" I can still see the sander.MPI processes using all the
> >> processor they need), but the calculation never advances.
> >>
> >> Which test stalls is random: I had it with PIMD, then when I
> >> tried again it happened with QM tests, then something else, etc.
> >>
> >> I have found the following message in the archives:
> >> http://structbio.vanderbilt.edu/archives/amber-archive/2009/5928.php
> >>
> >> But with no replies.
> >>
> >> Does anyone else experience such problems?
> >>
> >> Thanks,
> >>
> >> --
> >> Gustavo Seabra
> 
> _______________________________________________
> AMBER-Developers mailing list
> AMBER-Developers.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber-developers
> 
Received on Mon May 10 2010 - 13:30:03 PDT