Re: [AMBER-Developers] pbsa crashes

From: Scott Brozell <sbrozell.rci.rutgers.edu>
Date: Fri, 16 Oct 2009 12:33:40 -0400

Hi,

> > case wrote:
> > >On Mon, Oct 12, 2009, Scott Brozell wrote:
> > >
> > >>Several of the pbsa tests, such as pbsa_dmp, are showing memory bug
> > >>related failures: seg faults, failed allocate assertions, inconsistent
> > >>runs.
> > >>These problems occur with
> > >>gcc version 4.1.2 20071124 (Red Hat 4.1.2-42)
> > >>
> > >>I tried to use valgrind and gdb, but didn't learn much; the failures occur
> > >>in pbsa.f:
> > >> allocate( x(lastr), ix(lasti), ipairs(lastpr), ih(lasth), stat = ier )
> > >> REQUIRE( ier == 0 )
> > >>
> > >>I wonder whether these statements (based on sander.f) need updating?
> > >>It's been a while since I've looked at this code; so maybe there's someone
> > >>with a better suggestion.
> > >
> > >All tests pass for me with gcc 4.4. In the past, there were lots of
> > >problems with gfortran 4.1.2

On Fri, Oct 16, 2009 at 02:04:57PM +0200, Andreas Svrcek-Seiler wrote:
>
> >Testing with
> >gcc version 4.1.2 20071124 (Red Hat 4.1.2-42)
> >has all the pbsa tests passing.
> ...I still (after a cvs update) get this weird behavior:
> Sometimes e.g. pbsa_trx runs through
> and then echoes PASSED, most times it crashes. I even tested other
> machines to exclude faulty hardware from the possible sources of error.
> Valgrind also still reports numerous instances of corrupted memory,
> no matter which compiler (also tried icc/ifort in ia32 mode) I use.
> Usually one gets such on/off errors with competing threads and
> faulty openmp programming, but there's no openmp and only one thread,
> so I'm stumped.
>
> The last time I observed such puzzling behavior this was caused
> by the use of an uninitialized variable (which was multiplied by zero,
> so it never caused numerical trouble). But this was in C (inside NAB),
> where it's much easier for me to follow the code.

The behavior is symptomatic of memory-related bugs: invalid memory
accesses via pointers or out-of-range array indices, use of
uninitialized memory, etc.
Note also that valgrind may not directly detect problems
due to invalid use of stack or static memory.
Aside from my earlier suggestion to start reading the code at
the allocate statements, Mengjuei could start debugging by
turning on the compilers' run-time memory checking options, such as
gnu's -fbounds-check or
intel's -check bounds -check uninit, among others.
See old posts to this list (or maybe the wiki)
for additional popular compiler options and tips.
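
To make the suggestion concrete, here is a minimal sketch of the kind of
bug these options catch. It is not taken from pbsa: the program name
bounds_demo, the file name bounds_demo.f90, and the array size are made up
for illustration. Only the allocate-then-check pattern mirrors the
statements quoted above, and an out-of-range index is just one assumed
candidate for the real failure.

```fortran
! bounds_demo.f90: hypothetical stand-alone example, not AMBER code.
!
! Build with run-time checking turned on, for example:
!   gfortran -g -fbounds-check bounds_demo.f90 -o bounds_demo
!   ifort -g -check bounds -check uninit bounds_demo.f90 -o bounds_demo
program bounds_demo
  implicit none
  integer, parameter :: n = 10
  integer, allocatable :: ix(:)
  integer :: i, ier

  ! Allocate and test the status, as pbsa.f does with REQUIRE( ier == 0 ).
  allocate( ix(n), stat = ier )
  if ( ier /= 0 ) stop 'allocate failed'

  ! Deliberate bug: the loop writes one element past the end of ix.
  do i = 1, n + 1
     ix(i) = i
  end do

  print *, 'sum = ', sum(ix)
  deallocate( ix )
end program bounds_demo
```

Built without checking, a program like this may run to completion, crash,
or behave differently from run to run, which is consistent with the on/off
failures reported. Built with -fbounds-check (or intel's -check bounds), it
should instead stop at the offending statement with a run-time error that
identifies the array and the index.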

scott


_______________________________________________
AMBER-Developers mailing list
AMBER-Developers.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber-developers
Received on Fri Oct 16 2009 - 10:00:02 PDT