[AMBER-Developers] Fwd: [AMBER] make test trouble with Amber11

From: Daniel Roe <daniel.r.roe.gmail.com>
Date: Tue, 27 Jul 2010 09:04:40 -0400

I figured I would ask about this on the Developers list first. I've
also seen this behavior (see the user's forwarded email below) on Linux
x86_64 with both the Intel 11.1 and GNU (4.4.3) compilers (using mpich2
1.2.1). The test.sander.PIMD.MPI.full tests (8 threads) never finish
for me. They seem OK with 4 threads. Has anyone else seen this?
-Dan

---------- Forwarded message ----------
From: 池田輝彦 <ikedaike.gmail.com>
Date: 2010/7/27
Subject: [AMBER] make test trouble with Amber11
To: amber.ambermd.org


Dear Sir,

I encountered a problem when running "make test" with Amber11.
I built Amber11 with the following environment.

* CPU: Intel Xeon X5570 (2.93 GHz) x 2 CPUs (total: 8 cores)
* OS: Red Hat Enterprise Linux 5.3 (x86_64)
* Compiler: Intel Compiler 11.1.046 (MKL not used) or GCC 4.1.2
* MPI: OpenMPI 1.4.2 or OpenMPI 1.3.3

I tried to run "make test" with 8 processors, but the "Centroid MD" test
never finished. (With 4 processors, all tests ended normally.)
I checked the script "Run.full_cmd" in
$AMBERHOME/test/PIMD/full_cmd_water/equilib,
but it seems that sander.MPI works normally with 8 processors.

[guest.localhost ~]$ cd $AMBERHOME/test/PIMD/full_cmd_water/equilib/
[guest.localhost equilib]$ export DO_PARALLEL="/usr/local/openmpi-1.4.2/intel-11.1.046/bin/mpirun -np 4"
[guest.localhost equilib]$ time ./Run.full_cmd
Testing Centroid MD
diffing cmd.out.save with cmd.out
PASSED
==============================================================
diffing cmd_bead1.out.save with cmd_bead1.out
PASSED
==============================================================
diffing cmd_bead2.out.save with cmd_bead2.out
PASSED
==============================================================
diffing cmd_bead3.out.save with cmd_bead3.out
PASSED
==============================================================
diffing cmd_bead4.out.save with cmd_bead4.out
PASSED
==============================================================

real 0m1.546s
user 0m0.647s
sys 0m0.360s
[guest.localhost ~]$ export DO_PARALLEL="/usr/local/openmpi-1.4.2/intel-11.1.046/bin/mpirun -np 8"
[guest.localhost ~]$ time ./Run.full_cmd
Testing Centroid MD
mpirun: killing job... <--- Ctrl+C

--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 6746 on node localhost
exited on signal 0 (Unknown signal 0).
--------------------------------------------------------------------------
mpirun: clean termination accomplished



real 2m18.742s
user 0m0.013s
sys 0m0.010s
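
For anyone reproducing this, the manual Ctrl+C above could be replaced by a
time limit, so that a hang is reported instead of waited out. The following
is only a rough sketch: it assumes GNU coreutils timeout(1) is installed
(it exits with status 124 when the limit expires), and the helper name
run_bounded, the 60-second limit, and the bare "mpirun" in the commented
usage are hypothetical and not part of the Amber test suite.

```shell
#!/bin/sh
# Hypothetical helper (not part of Amber): run a command under a time limit
# and print a one-line verdict. Assumes GNU coreutils `timeout`, which
# returns exit status 124 when the limit is reached.
run_bounded() {
    limit="$1"
    shift
    timeout "$limit" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "HUNG after ${limit}s: $*"
    elif [ "$status" -ne 0 ]; then
        echo "FAILED (exit $status): $*"
    else
        echo "OK: $*"
    fi
    return "$status"
}

# Possible use from the equilib directory (placeholder mpirun and limit):
# for np in 4 8; do
#     export DO_PARALLEL="mpirun -np $np"
#     run_bounded 60 ./Run.full_cmd
# done
```

This only detects the hang and frees the terminal; it does not address
whatever causes the 8-process run to stall.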

Using top(1), I confirmed that 8 processes were spawned and that they kept
consuming CPU. Is there a way to avoid this problem?

Thanks for your advice.

Regards,
T.Ikeda

_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber

_______________________________________________
AMBER-Developers mailing list
AMBER-Developers.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber-developers
Received on Tue Jul 27 2010 - 06:30:17 PDT