Re: amber-developers: Extra Points Calculation

From: Volodymyr Babin <vbabin.ncsu.edu>
Date: Tue, 11 Nov 2008 19:43:38 -0500 (EST)

Ilyas,

This has been fixed in newer versions by the following commits:

revision 9.3
date: 2006/08/17 23:10:58; author: case; state: Exp; lines: +1 -0
initialize gmin to 1 to avoid divide by zero

----------------------------
revision 9.8
date: 2007/10/24 22:46:00; author: steinbrt; state: Exp; lines: +19 -33
TBS: Extended and cleaned up Softcore potentials for TI
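
Schematically, the first of these just gives the minimum a safe default
before it is ever used as a divisor. A minimal Fortran sketch of the
pattern (the surrounding program is illustrative only, not the actual
sander source or the attached diff):

  program gmin_guard
    implicit none
    real :: gmin, ratio
    ! the one added line: a safe default so the division below
    ! cannot hit zero when nothing updates gmin first
    gmin = 1.0
    ! ... code that may or may not update gmin would run here ...
    ratio = 2.0 / gmin
    print *, ratio
  end program gmin_guard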

Please find the .diff attached (I did not test it thoroughly, though).

Have a great night,
  Volodymyr

On Tue, November 11, 2008 14:38, Ilyas Yildirim wrote:
> The test script is as follows:
>
> --------- runsim -------------
> #!/bin/csh -f
>
> set sander = $AMBERHOME/exe/sander.MPI
>
> /bin/rm -f min
>
> cat > min <<EOF
> Initial Minimization of solvent + ions
> &cntrl
> imin = 1,
> maxcyc = 10,
> ncyc = 5,
> ntb = 1,
> ntr = 0,
> cut = 8.0,
> ntpr = 1, ntx = 1,
> icfe=1,klambda=6,clambda=0.5
> /
> EOF
>
> cat > groups_md <<EOF
> -O -i min -p prmtop1 -c inpcrd -o min.out.p1 -r min.rst.p1 -x mdcrd_02.traj.p1
> -O -i min -p prmtop2 -c inpcrd -o min.out.p2 -r min.rst.p2 -x mdcrd_02.traj.p2
> EOF
>
> mpiexec -np 4 $sander -ng 2 -groupfile groups_md < /dev/null || goto error
> ------------------------------------
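
With -np 4 and -ng 2, multisander splits MPI_COMM_WORLD into two groups
of two ranks each, which is what the "Looping over processors" listing
in the output below reports. A minimal sketch of that kind of split
(names here are illustrative, not sander's actual variables):

  program group_split
    implicit none
    include 'mpif.h'
    integer, parameter :: ngroups = 2
    integer :: ierr, worldrank, worldsize, color, group_comm, noderank
    call MPI_Init(ierr)
    call MPI_Comm_rank(MPI_COMM_WORLD, worldrank, ierr)
    call MPI_Comm_size(MPI_COMM_WORLD, worldsize, ierr)
    ! with 4 ranks and 2 groups: ranks 0-1 get color 0, ranks 2-3 color 1
    color = worldrank / (worldsize / ngroups)
    call MPI_Comm_split(MPI_COMM_WORLD, color, worldrank, group_comm, ierr)
    call MPI_Comm_rank(group_comm, noderank, ierr)
    print *, 'Group =', color, ' WorldRank =', worldrank, &
             ' NodeID =', noderank
    call MPI_Finalize(ierr)
  end program group_split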
>
> The error message is as follows:
>
> ----
> [yildirim.malatya02 ~/l_0.5]# runsim
>
> Running multisander version of sander amber9
> Total processors = 4
> Number of groups = 2
>
> Looping over processors:
> WorldRank is the global PE rank
> NodeID is the local PE rank in current group
>
> Group = 0
> WorldRank = 0
> NodeID = 0
>
> Group = 1
> WorldRank = 2
> NodeID = 0
>
> WorldRank = 1
> NodeID = 1
>
> WorldRank = 3
> NodeID = 1
>
> [cli_1]: [cli_3]: aborting job:
> Fatal error in MPI_Bcast: Message truncated, error stack:
> MPI_Bcast(784).........................: MPI_Bcast(buf=0xe63888, count=1,
> MPI_INTEGER, root=0, comm=0x84000000) failed
> MPIR_Bcast(198)........................:
> MPIDI_CH3U_Post_data_receive_found(163): Message from rank 0 and tag 2
> truncated; 144408 bytes received but buffer size is 4
> aborting job:
> Fatal error in MPI_Bcast: Message truncated, error stack:
> MPI_Bcast(784).........................: MPI_Bcast(buf=0xe63888, count=1,
> MPI_INTEGER, root=0, comm=0x84000000) failed
> MPIR_Bcast(198)........................:
> MPIDI_CH3U_Post_data_receive_found(163): Message from rank 0 and tag 2
> truncated; 144408 bytes received but buffer size is 4
> rank 3 in job 1 malatya02_48011 caused collective abort of all ranks
> exit status of rank 3: return code 1
> rank 1 in job 1 malatya02_48011 caused collective abort of all ranks
> exit status of rank 1: return code 1
> error: label not found.
>
> --------------
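
The trace itself pins the failure down: inside the group communicator
(comm=0x84000000), rank 0 broadcasts 144408 bytes (36102 4-byte
integers) while another rank posts a 1-integer receive buffer, so the
ranks disagree on the MPI_Bcast count. A standalone reproducer of that
mismatch (illustrative only, not sander code):

  program bcast_mismatch
    implicit none
    include 'mpif.h'
    integer :: ierr, rank
    integer :: buf(36102)
    call MPI_Init(ierr)
    call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
    if (rank == 0) then
      ! root sends 36102 integers = 144408 bytes
      call MPI_Bcast(buf, 36102, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr)
    else
      ! receivers post room for a single integer (4 bytes), giving
      ! "Message truncated; ... received but buffer size is 4"
      call MPI_Bcast(buf, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr)
    end if
    call MPI_Finalize(ierr)
  end program bcast_mismatch

Running it with "mpiexec -np 2 ./bcast_mismatch" triggers the same
MPICH2 abort; presumably the commits above remove this kind of count
mismatch from the icfe code path.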
>
> mpich2version is
>
> [yildirim.malatya02 ~/l_0.5]# mpich2version
> Version: 1.0.5
> Device: ch3:sock
> Configure Options: '--prefix=/programs/mpich2'
> 'CC=/opt/intel/cce/10.0.026/bin/icc'
> 'CXX=/opt/intel/cce/10.0.026/bin/icpc'
> CC: /opt/intel/cce/10.0.026/bin/icc
> CXX: /opt/intel/cce/10.0.026/bin/icpc
> F77: /opt/intel/fce/10.0.026/bin/ifort
> F90: ifort
>
> ---------------
>
> Thanks.
>
> On Tue, 11 Nov 2008, Volodymyr Babin wrote:
>
>> On Tue, November 11, 2008 14:18, Ilyas Yildirim wrote:
>> > While the subject has minimization in it, I have a quick question: Is it
>> > possible to use more than 2 CPUs in Thermodynamic Integration
>> > minimization? I could never use 4 CPUs when I wanted to do minimization
>> > with icfe set to either 1 or 2. Everything works fine with 2 CPUs, but not
>> > with 4 CPUs in the minimization process. I use mpich2 and amber9.
>>
>> Could you provide more details on what happens with > 2 cpus?
>> An easily runnable test-case that shows the problem would also be
>> very helpful.
>>
>> Have a great day,
>> Volodymyr
>>
>>
>
> --
> Ilyas Yildirim, Ph.D.
> Hutchison Hall B#10 - Department of Chemistry
> University of Rochester
> 585-275-6766 (office)
> http://www.pas.rochester.edu/~yildirim/
>
>
