Re: [AMBER-Developers] [AMBER] Setting the cut-off and size of box

From: Timothy J Giese <timothyjgiese.gmail.com>
Date: Mon, 02 Dec 2013 21:16:17 -0500

From: Pengzhi <zhangpengzhi1988.gmail.com>
Date: Mon, 2 Apr 2012 21:59:58 -0500

On Mon, Apr 02, 2012, Pengzhi wrote:
>
> I have a system with PBC. The dimension of the cubic box I use is 900 Å *
> 900 Å * 900 Å. When I set the cut-off to be 450 Å or slightly less than that,
> my system ends up with explosion. I have the error message:
>
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
>
> I can avoid this by setting a smaller cutoff (anything smaller than 360 Å
> works). However, if I want to keep as much interactions as possible,
> technically what is up limit of the cut-off in amber10 given size of the
> box?


I've run into the same problem with an ordinary-sized system:
110 TIP3P waters in a cubic box 25 A on a side.
Serial sander compiled with GNU 4.6.3 on Fedora 15.
All serial tests pass.

The following runs without a segfault:

title
&cntrl
  imin=0
  nstlim=10000 ! 10 ps at the default dt=0.001 ps
  ntt=0 ! 0=constE 3=langevin constT
  tempi=300. ! initial temp
  temp0=300 ! target temp
! shake
  ntc=2 ! =2 water shake
  ntf=2 ! =2 don't eval water H-O bonds
! periodic boundaries
  ntb=1 ! vacuum=0 constant volume=1 constant pressure=2
! nonbonded cutoff (A)
  cut=8.
! output
! iwrap=0
  ntpr=1 ! write mdout every step
  ntwx=1 ! write mdcrd every step
/
&ewald
nfft1=30
nfft2=30
nfft3=30
order=6
/

HOWEVER, if I change cut=8 to cut=9, then I get a segfault.
I recompiled with -g and ran it through valgrind.
The problem is in src/sander/nonbond_list.F90,
in subroutine get_nb_list,
at around line 860:

[snip]
do j = nstart,nstop
   jtran = tranyz+xtran(j-nstart+1,xtindex)
   jj = index1+j-xtran(j-nstart+1,xtindex)*nucgrd1
   m1 = nlogrid(jj)
   m2 = nhigrid(jj)
   if ( m2 >= m1 )then
      do m = m1,m2
         numlist = numlist+1
!!!! HERE
         atmlist(numlist) = m
!!!! AND HERE
         itran(numlist)=jtran
      end do
   end if
end do
[/snip]

atmlist and itran are length natom, but numlist grows beyond this,
causing bad memory writes and a segfault.

The problem appears to be the value of jj.
I created a boolean mask to test whether any jj is visited more than once, and it is: the same cell index comes up repeatedly, so its atoms are appended to the list more than once.
It is unclear to me whether the fault lies in the formula used to compute jj or in the nstart,nstop pair.
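
To make the failure mode concrete, here is a small Python sketch of the mask test and of the resulting list growth. It is a toy model, not sander code: the three cells, their atom ranges, and the visit sequences are made-up values, and the helper names find_repeated_cells and count_list_entries are hypothetical. Only nlogrid, nhigrid, numlist and natom are borrowed from the excerpt above.

```python
# Toy model, not sander code: all values below are invented for illustration.

def find_repeated_cells(visited_cells, ncells):
    """Return the sorted cell indices that occur more than once."""
    seen = [False] * ncells          # boolean mask, one flag per cell
    repeated = set()
    for jj in visited_cells:
        if seen[jj]:
            repeated.add(jj)         # second (or later) visit to this cell
        seen[jj] = True
    return sorted(repeated)


def count_list_entries(visited_cells, nlogrid, nhigrid):
    """Count the entries the inner 'do m = m1,m2' loop would append."""
    numlist = 0
    for jj in visited_cells:
        m1, m2 = nlogrid[jj], nhigrid[jj]
        if m2 >= m1:
            numlist += m2 - m1 + 1
    return numlist


# Hypothetical system: 3 cells holding atoms 1-2, 3-4 and 5-6, so natom = 6.
nlogrid = [1, 3, 5]
nhigrid = [2, 4, 6]
natom = 6

visits_ok = [0, 1, 2]            # every cell visited exactly once
visits_bad = [0, 1, 2, 0, 1]     # cells 0 and 1 visited twice

repeated_ok = find_repeated_cells(visits_ok, len(nlogrid))
repeated_bad = find_repeated_cells(visits_bad, len(nlogrid))
numlist_ok = count_list_entries(visits_ok, nlogrid, nhigrid)
numlist_bad = count_list_entries(visits_bad, nlogrid, nhigrid)

print(repeated_ok, numlist_ok, numlist_ok <= natom)
print(repeated_bad, numlist_bad, numlist_bad <= natom)
```

In the toy model, any repeated cell index pushes the entry count past natom, which is the size of atmlist and itran in the real code.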

I also note that the 8 A cut is less than 1/3 the size of the box
and the 9 A cut is more than 1/3 the size of the box.
Both values are less than 1/2 the size of the box.
9 A + 2 A (skinnb) is still less than 1/2 the size of the box.
The reason why this catches my attention is that, from what I can guess, the code is looping over the faces of a 3x3 grid (?)
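
The comparisons above can be checked with a few lines of Python. This sketch assumes that the box quoted earlier has an edge length of 25 A and takes skinnb = 2 A from the text; the function name classify is hypothetical.

```python
# Arithmetic check only; assumes a cubic box with a 25 A edge.
box = 25.0      # box edge in Angstrom (assumption, see lead-in)
skinnb = 2.0    # skin width in Angstrom, as quoted in the message


def classify(cut, box, skin):
    """Compare cut and cut + skin against box/3 and box/2."""
    return {
        "cut_below_third": cut < box / 3.0,
        "cut_below_half": cut < box / 2.0,
        "cut_plus_skin_below_half": cut + skin < box / 2.0,
    }


result_8 = classify(8.0, box, skinnb)
result_9 = classify(9.0, box, skinnb)

print("box/3 =", box / 3.0, " box/2 =", box / 2.0)
print("cut=8:", result_8)
print("cut=9:", result_9)
```

With these numbers, box/3 is about 8.33 A and box/2 is 12.5 A, so the only threshold crossed between cut=8 and cut=9 is box/3.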

I can get sander to run without segfaults if I change
"""
if ( numimg(index) > 0 )then
   ncell_lo = nlogrid(index)
   ncell_hi = nhigrid(index)
   numlist = 0
"""
to read
"""
if ( numimg(index) > 0 )then
   ncell_lo = nlogrid(index)
   ncell_hi = nhigrid(index)
   numlist = 0
   seen = .FALSE. !! HERE
"""
where seen is an allocatable LOGICAL array of SIZE(nlogrid),
and I change
"""
do j = nstart,nstop
   jtran = tranyz+xtran(j-nstart+1,xtindex)
   jj = index1+j-xtran(j-nstart+1,xtindex)*nucgrd1
"""
to read
"""
do j = nstart,nstop
   jtran = tranyz+xtran(j-nstart+1,xtindex)
   jj = index1+j-xtran(j-nstart+1,xtindex)*nucgrd1
   IF ( seen(jj) .AND. m2 >= m1 ) THEN ! HERE
       CYCLE ! HERE
   ELSE ! HERE
       seen(jj) = .TRUE. ! HERE
   END IF ! HERE
"""
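
The effect of the seen mask can be sketched in Python on a toy model. This is not sander code: the cells, atom ranges, and visit sequence are invented, build_list is a hypothetical helper, and the skip test is simplified to "skip any cell already seen" (the Fortran change above also tests m2 >= m1).

```python
# Toy model, not sander code: all values below are invented for illustration.

def build_list(visited_cells, nlogrid, nhigrid, skip_repeats):
    """Collect atom indices cell by cell, optionally skipping repeated cells."""
    seen = [False] * len(nlogrid)    # reset once per list build
    atmlist = []
    for jj in visited_cells:
        if skip_repeats and seen[jj]:
            continue                 # plays the role of CYCLE
        seen[jj] = True
        m1, m2 = nlogrid[jj], nhigrid[jj]
        if m2 >= m1:
            atmlist.extend(range(m1, m2 + 1))
    return atmlist


# Hypothetical system: 3 cells holding atoms 1-2, 3-4 and 5-6, so natom = 6.
nlogrid = [1, 3, 5]
nhigrid = [2, 4, 6]
natom = 6
visits = [0, 1, 2, 0, 1]             # cells 0 and 1 come up twice

list_without_mask = build_list(visits, nlogrid, nhigrid, skip_repeats=False)
list_with_mask = build_list(visits, nlogrid, nhigrid, skip_repeats=True)

print(len(list_without_mask), len(list_with_mask), natom)
```

The sketch only shows that the mask keeps the list within natom entries. It says nothing about whether discarding the repeated visits, together with their jtran values, gives the right pair list.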

Although it runs, I can't say with certainty that it is correct.
I seriously doubt that the above "fix" would be correct with sander.MPI
because each process would "see" things differently.
Nor can I say whether or not a similar issue is present elsewhere in the code.
Or maybe I'm running sander incorrectly, which is certainly possible because I've never really used it before.

-Tim

_______________________________________________
AMBER-Developers mailing list
AMBER-Developers.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber-developers
Received on Mon Dec 02 2013 - 18:30:03 PST