Here is a patch that will roll back the fharm() stuff in evb_umb.f.
Kim, can you confirm that the correct behavior in evb_umb is to set fharm()
to zero inside the loops and not outside? Thanks!
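
To make the question concrete, here is a toy sketch of why the placement of the reset matters. It is not the evb_umb.f code: the names (contributions, acc, the two functions) are hypothetical, and it only assumes that the array is added to on each pass through the loop.

```shell
#!/bin/sh
# Toy illustration only; all names are hypothetical and this is not evb_umb.f.
contributions="3 5 7"

# Reset outside the loop: the accumulator keeps every iteration's contribution.
sum_reset_outside() {
    acc=0
    for c in $contributions; do
        acc=$((acc + c))
    done
    echo "$acc"
}

# Reset inside the loop: earlier contributions are discarded on each pass,
# so only the last iteration's value survives.
sum_reset_inside() {
    for c in $contributions; do
        acc=0
        acc=$((acc + c))
    done
    echo "$acc"
}

echo "outside: $(sum_reset_outside)"   # prints "outside: 15"
echo "inside:  $(sum_reset_inside)"    # prints "inside:  7"
```

Which of the two behaviors fharm() is supposed to have is exactly what needs confirming.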
-Dan
On Fri, Apr 2, 2010 at 9:14 AM, Daniel Roe <daniel.r.roe.gmail.com> wrote:
> Hi Kim,
>
> The stack size was indeed the problem. Default stack size in my setup was
> 10M:
>
> [droe.case1 sander]$ ulimit -a
> ...
> stack size (kbytes, -s) 10240
> ...
>
> Doubling the stack size to 20M allowed the tests to complete successfully.
> So we could potentially add a script called checkStackSize that any tests
> that require a large stack could call. I am attaching one that uses the
> "ulimit" command (note: not unlimit - passing a value of unlimited to the
> stack on my system isn't allowed) to check the stack size and increase it if
> it is below 20480 kbytes; 1 is returned if this could not be done, 0 if the
> stack is ok. It can be called from the necessary EVB test run scripts like
> this:
>
> set stack=`../../checkStackSize.sh`
> if ($stack == 1) then
> echo "This test requires a larger stack."
> exit(0)
> endif
>
> It works just fine on my Linux machine and my Cygwin rig (where it reports
> failure, as expected, because Cygwin's stack limit is hard-coded), but I'm
> not sure how portable it is - anyone want to test it out?
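
For readers without the attachment, a minimal sketch of what such a check could look like follows. It is not the attached script: the function names stack_status and check_stack are hypothetical, bash is assumed, and only the 20480-kbyte threshold and the 0/1 convention are taken from the description above.

```shell
#!/bin/bash
# Hypothetical sketch of a checkStackSize.sh-style helper; not the attached script.
REQUIRED_KB=20480   # threshold mentioned in the message

# Print 0 if a stack limit (in kbytes, or "unlimited") meets the requirement,
# 1 otherwise.
stack_status() {
    current=$1
    required=$2
    if [ "$current" = "unlimited" ] || [ "$current" -ge "$required" ]; then
        echo 0
    else
        echo 1
    fi
}

# If the soft limit is too low, try to raise it; a failed attempt is detected
# by re-reading the limit afterwards rather than by checking the exit code.
check_stack() {
    if [ "$(stack_status "$(ulimit -S -s)" "$REQUIRED_KB")" = "1" ]; then
        ulimit -S -s "$REQUIRED_KB" 2>/dev/null
    fi
    stack_status "$(ulimit -S -s)" "$REQUIRED_KB"
}

# Print the result so a caller can capture it with backticks.
check_stack
```

One caveat with this sketch: a limit raised with ulimit applies to the script's own process and anything it launches, not to the shell that called it, so the printed value says whether the limit could be raised rather than what the caller's limit now is.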
>
> Dave, Kim's patch seems to work fine. However I think it will be incompatible
> with the fharm() stuff I changed in evb_umb.f, so that part of my patch
> should be rolled back (I can come up with a patch to do this if you want).
> It looks like Kim changed the output of the test cases, so I guess the
> output of the test cases before wasn't correct.
>
> -Dan
>
>
> On Fri, Apr 2, 2010 at 8:31 AM, Kim F. Wong <kimberlyyellow.gmail.com> wrote:
>
>> Dan,
>>
>> It may have to do with the default stacksize. On my laptop, I was seeing
>> this problem and it goes away if I do "unlimit" before these tests. These
>> tests read in ~3X more ab initio data than the other DG-EVB tests. Perhaps
>> we can place an "unlimit" within the Run.evb in each of these tests. What do
>> you suggest (both for the short-term & long-term)?
>>
>> -Kim
>>
>>
>> On 4/2/2010 8:22 AM, Daniel Roe wrote:
>>
>>> The EVB patch seems to work well, but I am still having problems with a
>>> few of the tests:
>>>
>>> cd evb/poh_dbonds_umb_dg_UFF_9DG && ./Run.evb
>>> cd evb/poh_dbonds_umb_dg_UFF_9DG_pimd_ld_full && ./Run.evb
>>> cd evb/poh_dbonds_umb_dg_UFF_9DG_pimd_nhc_full && ./Run.evb
>>> cd evb/poh_dbonds_umb_dg_UFF_9DG_nmpimd_full && ./Run.evb
>>> cd evb/poh_dbonds_umb_dg_UFF_9DG_nmpimd_full_TST-freqf && ./Run.evb
>>>
>>> Previously however I was having issues with these tests that Mark was not
>>> seeing. Does anybody else have these tests fail with an MPI_abort? I've had
>>> it happen to me with both gnu and intel compilers (2 versions, 10 and 11)
>>> as well as 2 different MPICH2 versions.
>>>
>>> -Dan
>>>
>>> On Thu, Apr 1, 2010 at 10:04 PM, Kim F. Wong <kimberlyyellow.gmail.com> wrote:
>>>
>>>> Dan,
>>>>
>>>> Thanks for your help. I made a patch (see attached) earlier today & was
>>>> running the tests. Although I've verified that the patch works, I would
>>>> appreciate it if you can test it at your end before committing to the RC.
>>>>
>>>> -Kim
>>>>
>>>>
>>>> On 4/1/2010 6:18 PM, Daniel Roe wrote:
>>>>
>>>>
>>>>
>>>>> Hi All,
>>>>>
>>>>> This is regarding the previously discussed EVB test cases that segfault:
>>>>>
>>>>> cd evb/malon_dbonds_umb_dg_UFF_3DG_qi_full_2D-PMF && ./Run.evb
>>>>> cd evb/malon_dbonds_umb_dg_UFF_3DG_qi_full_corrF && ./Run.evb
>>>>>
>>>>> I have made some modifications to the code that constitute a partial fix,
>>>>> but I can't proceed further without input from EVB people.
>>>>>
>>>>> These tests can both be protected from segfaults by making the loop at
>>>>> line 148 in pimd_force.f that references dmdlm dependent on the value of
>>>>> itimass (which is what triggers the init of dmdlm), e.g.
>>>>>
>>>>> pimd_force.f
>>>>> 148c148
>>>>> < if( i_qi > 0 ) then
>>>>> ---
>>>>> > if( i_qi > 0 .and. itimass > 0) then
>>>>>
>>>>> At this point the tests will run but the output energies don't match at
>>>>> all. I was able to find a version of amber10 (from June 2008) that passed
>>>>> both of these test cases.
>>>>>
>>>>> I was able to recover the test results for the 2D-PMF test by modifying
>>>>> evb_umb.f, setting the array fharm(:) to zero outside of loops it is
>>>>> involved in (the way it was done previously) instead of inside (the way
>>>>> it is currently done).
>>>>>
>>>>> evb_umb.f
>>>>> 102c102
>>>>> < ! fharm(:) = 0.0d0
>>>>> ---
>>>>> > fharm(:) = 0.0d0
>>>>> 106c106
>>>>> < fharm(:) = 0.0d0
>>>>> ---
>>>>> > ! fharm(:) = 0.0d0
>>>>> 191c191
>>>>> < ! fharm(:) = 0.0d0
>>>>> ---
>>>>> > fharm(:) = 0.0d0
>>>>> 195c195
>>>>> < fharm(:) = 0.0d0
>>>>> ---
>>>>> > ! fharm(:) = 0.0d0
>>>>>
>>>>> You can even see that the fharm(:) statements were only commented out and
>>>>> not removed - does anyone familiar with the code know why it was changed?
>>>>> According to comments in the code it seems to have been changed around
>>>>> Dec. 2008. Anyway, when I reverse these changes the 2D-PMF test results
>>>>> match (aside from a few diffs that are output format-related). Of course,
>>>>> the test case itself could be wrong, but I have no easy way of knowing
>>>>> that.
>>>>>
>>>>> However, the corrF test still fails by a mile - as far as I can tell the
>>>>> likely culprit is with the qi_corrf_les() subroutine in pimd_force.f -
>>>>> much of it was changed around March 2009. These changes are far more
>>>>> extensive (> 100 lines at least) so I don't feel comfortable rolling them
>>>>> back.
>>>>>
>>>>> Anyway, I am attaching a patch that makes the changes that I discussed.
>>>>> If nothing else it prevents the ugly segfaults.
>>>>>
>>>>> Someone more familiar with what EVB *should* be doing should definitely
>>>>> have a close look at all of these changes.
>>>>>
>>>>> -Dan
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> AMBER-Developers mailing list
>>>>> AMBER-Developers.ambermd.org
>>>>> http://lists.ambermd.org/mailman/listinfo/amber-developers
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>
>>
>
>
>
> --
> -------------------------
> Daniel R. Roe
> Postdoctoral Associate
> SAS - Chemistry & Chemical Biology
> 610 Taylor Road
> Piscataway, NJ 08854
>
>
>
--
-------------------------
Daniel R. Roe
Postdoctoral Associate
SAS - Chemistry & Chemical Biology
610 Taylor Road
Piscataway, NJ 08854
Received on Fri Apr 02 2010 - 06:30:05 PDT