Re: amber-developers: 'MPI_BCAST : Message truncated' error

From: Ilyas Yildirim <yildirim.pas.rochester.edu>
Date: Tue, 6 Mar 2007 17:32:03 -0500 (EST)

I am running thermodynamic integration on this system, so I have to use
multisander. With 2 CPUs, everything is fine, though.

On Tue, 6 Mar 2007, Carlos Simmerling wrote:

> does it work when you are not using multisander?
>
> On 3/6/07, Ilyas Yildirim <yildirim.pas.rochester.edu> wrote:
> > Dear All,
> >
> > Using sander.MPI in a minimization with 2 CPUs works fine, but if I try to
> > use 4/8/... CPUs, it gives me the following error:
> >
> > ---------------------------------------------------------------------
> > arde00:/home/yildirim/test/l_0.2>runmin &
> > [1] 2428
> > arde00:/home/yildirim/test/l_0.2>/bin/rm: No match.
> > mpirun -stdin /dev/null -np 4 -nolocal -machinefile /tmp/tmp.mpi.2434
> > /home/yildirim/amber9/exe/sander.MPI -ng 2 -groupfile
> > /home/yildirim/test/l_0.2/groups_min1; rm -f /tmp/tmp.mpi.2434
> > running on arde11:1 arde12:1 arde13:2
> >
> > Running multisander version of sander amber9
> > Total processors = 4
> > Number of groups = 2
> >
> > Looping over processors:
> > WorldRank is the global PE rank
> > NodeID is the local PE rank in current group
> >
> > Group = 0
> > WorldRank = 0
> > NodeID = 0
> >
> > WorldRank = 1
> > NodeID = 1
> >
> > Group = 1
> > WorldRank = 2
> > NodeID = 0
> >
> > WorldRank = 3
> > NodeID = 1
> >
> > p3_19669: p4_error: : 14
> > 3 - MPI_BCAST : Message truncated
> > [3] Aborting program !
> > [3] Aborting program!
> > p1_7187: p4_error: : 14
> > 1 - MPI_BCAST : Message truncated
> > [1] Aborting program !
> > [1] Aborting program!
> > rm_l_3_19670: (2.024163) net_send: could not write to fd=5, errno = 32
> > rm_l_1_7188: (2.869610) net_send: could not write to fd=5, errno = 32
> > p2_12533: p4_error: net_recv read: probable EOF on socket: 1
> > rm_l_2_12534: (2.259215) net_send: could not write to fd=5, errno = 32
> > p1_7187: (2.871182) net_send: could not write to fd=5, errno = 32
> > p2_12533: (6.264409) net_send: could not write to fd=5, errno = 32
> > mpirun -stdin /dev/null -np 4 -nolocal -machinefile /tmp/tmp.mpi.2595
> > /home/yildirim/amber9/exe/sander.MPI -ng 2 -groupfile
> > /home/yildirim/test/l_0.2/groups_min2; rm -f /tmp/tmp.mpi.2595
> > ---------------------------------------------------------------------
> >
> > For the MD runs, I don't see any problems (I can run with 4/8/... CPUs). The
> > system is an 8-mer solvated in water. I was wondering if this is normal
> > for AMBER9, or if I am missing something. Thanks.
> >
> > --
> > Ilyas Yildirim
> > ---------------------------------------------------------------
> > - Department of Chemistry - -
> > - University of Rochester - -
> > - Hutchison Hall, # B10 - -
> > - Rochester, NY 14627-0216 - Ph.:(585) 275 67 66 (Office) -
> > - http://www.pas.rochester.edu/~yildirim/ -
> > ---------------------------------------------------------------

-- 
  Ilyas Yildirim
  ---------------------------------------------------------------
  - Department of Chemistry      -				-
  - University of Rochester      -				-
  - Hutchison Hall, # B10        -				-
  - Rochester, NY 14627-0216     - Ph.:(585) 275 67 66 (Office)	-
  - http://www.pas.rochester.edu/~yildirim/			-
  ---------------------------------------------------------------
Received on Wed Mar 07 2007 - 06:07:46 PST