The test script is as follows:
--------- runsim -------------
#!/bin/csh -f
set sander = $AMBERHOME/exe/sander.MPI
/bin/rm -f min
cat > min <<EOF
Initial Minimization of solvent + ions
&cntrl
imin = 1,
maxcyc = 10,
ncyc = 5,
ntb = 1,
ntr = 0,
cut = 8.0,
ntpr = 1, ntx = 1,
icfe=1,klambda=6,clambda=0.5
/
EOF
cat > groups_md <<EOF
-O -i min -p prmtop1 -c inpcrd -o min.out.p1 -r min.rst.p1 -x mdcrd_02.traj.p1
-O -i min -p prmtop2 -c inpcrd -o min.out.p2 -r min.rst.p2 -x mdcrd_02.traj.p2
EOF
mpiexec -np 4 $sander -ng 2 -groupfile groups_md < /dev/null || goto error
------------------------------------
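For comparison, the same groupfile runs fine with 2 CPUs (one MPI process per
group), as noted in the quoted message below. A minimal sketch of that working
invocation, assuming the same files and paths as in the script above:
--------- runsim, 2-CPU sketch -------------
#!/bin/csh -f
set sander = $AMBERHOME/exe/sander.MPI
# one MPI process per group; this case completes without the MPI_Bcast error
mpiexec -np 2 $sander -ng 2 -groupfile groups_md < /dev/null
------------------------------------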
The error message is as follows:
----
[yildirim.malatya02 ~/l_0.5]# runsim
Running multisander version of sander amber9
Total processors = 4
Number of groups = 2
Looping over processors:
WorldRank is the global PE rank
NodeID is the local PE rank in current group
Group = 0
WorldRank = 0
NodeID = 0
Group = 1
WorldRank = 2
NodeID = 0
WorldRank = 1
NodeID = 1
WorldRank = 3
NodeID = 1
[cli_1]: [cli_3]: aborting job:
Fatal error in MPI_Bcast: Message truncated, error stack:
MPI_Bcast(784).........................: MPI_Bcast(buf=0xe63888, count=1, MPI_INTEGER, root=0, comm=0x84000000) failed
MPIR_Bcast(198)........................:
MPIDI_CH3U_Post_data_receive_found(163): Message from rank 0 and tag 2 truncated; 144408 bytes received but buffer size is 4
aborting job:
Fatal error in MPI_Bcast: Message truncated, error stack:
MPI_Bcast(784).........................: MPI_Bcast(buf=0xe63888, count=1, MPI_INTEGER, root=0, comm=0x84000000) failed
MPIR_Bcast(198)........................:
MPIDI_CH3U_Post_data_receive_found(163): Message from rank 0 and tag 2 truncated; 144408 bytes received but buffer size is 4
rank 3 in job 1 malatya02_48011 caused collective abort of all ranks
exit status of rank 3: return code 1
rank 1 in job 1 malatya02_48011 caused collective abort of all ranks
exit status of rank 1: return code 1
error: label not found.
--------------
The mpich2version output is as follows:
[yildirim.malatya02 ~/l_0.5]# mpich2version
Version: 1.0.5
Device: ch3:sock
Configure Options: '--prefix=/programs/mpich2' 'CC=/opt/intel/cce/10.0.026/bin/icc' 'CXX=/opt/intel/cce/10.0.026/bin/icpc'
CC: /opt/intel/cce/10.0.026/bin/icc
CXX: /opt/intel/cce/10.0.026/bin/icpc
F77: /opt/intel/fce/10.0.026/bin/ifort
F90: ifort
---------------
Thanks.
On Tue, 11 Nov 2008, Volodymyr Babin wrote:
> On Tue, November 11, 2008 14:18, Ilyas Yildirim wrote:
> > While the subject has minimization in it, I have a quick question: Is it
> > possible to use more than 2 CPUs in Thermodynamic Integration
> > minimization? I could never use 4 CPUs when I wanted to do minimization
> > with icfe set to either 1 or 2. Everything works fine with 2 CPUs, but not
> > with 4 CPUs in the minimization process. I use mpich2 and amber9.
>
> Could you provide more details on what happens with > 2 cpus?
> An easily runnable test-case that shows the problem would also be
> very helpful.
>
> Have a great day,
> Volodymyr
>
>
--
Ilyas Yildirim, Ph.D.
---------------------------------------------------------------
= Hutchison Hall B#10 - Department of Chemistry =
= - University of Rochester =
= 585-275-6766 (office) - =
= http://www.pas.rochester.edu/~yildirim/ =
---------------------------------------------------------------