Re: [AMBER-Developers] Sander parallel build broken by PBSA updates

From: Hai Nguyen <nhai.qn.gmail.com>
Date: Sat, 5 Mar 2016 15:38:43 -0500

But I want to stress that the error seems to happen randomly. In most trials,
the builds finished without errors.

> Did this happen in the middle of making sander or something else?

@Ray: Let me reproduce the error and report later.
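
A minimal sketch of how the intermittent failure could be measured, since a single successful build says little about a random error. The three build commands are the ones quoted further down in this thread; the function names count_failures and clean_build and the JOBS variable are made up for illustration. It assumes ./configure runs without prompting, and note that "git clean -fdx ." deletes every untracked file in the tree.

```shell
#!/bin/sh
# Run a command repeatedly and print how many of the runs failed.
# usage: count_failures TRIALS COMMAND [ARGS...]
count_failures() {
    trials=$1
    shift
    failures=0
    i=1
    while [ "$i" -le "$trials" ]; do
        # Discard the build output; only the exit status matters here.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}

# One clean build from scratch (hypothetical helper).
# JOBS is the make job count and defaults to 24.
clean_build() {
    git clean -fdx . &&
    ./configure -noX11 gnu &&
    make install -j"${JOBS:-24}"
}

# Example, not run automatically because each trial is a full rebuild:
#   JOBS=24 count_failures 10 clean_build
#   JOBS=8  count_failures 10 clean_build
```

Comparing the failure counts at different JOBS values would also address the question of whether the problem shows up at lower job counts.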

Hai

On Sat, Mar 5, 2016 at 3:36 PM, Hai Nguyen <nhai.qn.gmail.com> wrote:

> -j8 works fine for serial build (I am testing openmp build now). I used
> -j24 since my node has 24 cores.
>
> Hai
>
>
> On Sat, Mar 5, 2016 at 3:34 PM, Daniel Roe <daniel.r.roe.gmail.com> wrote:
>
>> On Sat, Mar 5, 2016 at 12:36 AM, Hai Nguyen <nhai.qn.gmail.com> wrote:
>> > FYI: I've just tried "make install -j24" on another machine and got an error:
>>
>> I wonder if "make" with 24 threads is simply testing the limits of the
>> already complex and fragile network of dependencies that sander
>> has...does the problem happen at lower "make" thread counts?
>>
>> -Dan
>>
>> >
>> > "Fatal Error: Can't rename module file 'pbtimer_module.mod0' to
>> > 'pbtimer_module.mod': No such file or directory
>> > make[2]: *** [timer.o] Error 1
>> > make[2]: *** Waiting for unfinished jobs....
>> > make[2]: Leaving directory
>> > `/gpfs/gpfs/project1/dacase-001/haichit/amber/amber/AmberTools/src/pbsa'
>> > make[1]: *** [serial] Error 2
>> > make[1]: Leaving directory
>> > `/gpfs/gpfs/project1/dacase-001/haichit/amber/amber/AmberTools/src'
>> > make: *** [install] Error 2
>> > "
>> >
>> > info: gcc 4.4.7, Linux, 24 cores/node; use "git clean -fdx ." then
>> > "./configure -noX11 gnu" then "make install -j24"; git
>> > commit: 570e427eaf02a99d81e6bf1c36d7d3a51a27c529 (today)
>> >
>> > However, this error seems to happen randomly; sometimes the install
>> > finished without any errors.
>> >
>> > Hai
>> >
>> > On Thu, Feb 18, 2016 at 3:54 PM, Hai Nguyen <nhai.qn.gmail.com> wrote:
>> >
>> >> I just tried "make -j 8 install" for the serial build and it is OK on my
>> >> machine too.
>> >>
>> >> Hai
>> >>
>> >> On Thu, Feb 18, 2016 at 3:36 PM, Ray Luo <rluo.uci.edu> wrote:
>> >>
>> >>> Gerald,
>> >>>
>> >>> Your fix also worked on my side! Thanks a lot!
>> >>>
>> >>> I used "make -j 8 install" though I skipped python install in both the
>> >>> serial and parallel tests.
>> >>>
>> >>> All the best,
>> >>> Ray
>> >>> --
>> >>> Ray Luo, Ph.D.
>> >>> Professor
>> >>> Biochemistry, Molecular Biophysics, Chemical Physics,
>> >>> Chemical and Biomedical Engineering
>> >>> University of California, Irvine, CA 92697-3900
>> >>>
>> >>>
>> >>> On Thu, Feb 18, 2016 at 11:23 AM, Ray Luo <rluo.uci.edu> wrote:
>> >>> > Thanks, I'm downloading your fix ...
>> >>> >
>> >>> > Also, the python-related installation seems to mess up the
>> >>> > dependencies whether I use single- or multi-threaded make ... I don't
>> >>> > know why my changes in pbsa and sander have caused this to break.
>> >>> >
>> >>> > All the best,
>> >>> > Ray
>> >>> > --
>> >>> > Ray Luo, Ph.D.
>> >>> > Professor
>> >>> > Biochemistry, Molecular Biophysics, Chemical Physics,
>> >>> > Chemical and Biomedical Engineering
>> >>> > University of California, Irvine, CA 92697-3900
>> >>> >
>> >>> >
>> >>> > On Thu, Feb 18, 2016 at 11:08 AM, Gerald Monard
>> >>> > <Gerald.Monard.univ-lorraine.fr> wrote:
>> >>> >> Hi,
>> >>> >>
>> >>> >> I think that I found a fix. The problem was in the makedepend program in
>> >>> >> pbsa/ that had not been updated. That should work now. Can anybody test?
>> >>> >>
>> >>> >> Best,
>> >>> >>
>> >>> >> Gerald.
>> >>> >>
>> >>> >> On 02/18/2016 07:01 PM, Ross Walker wrote:
>> >>> >>> I second this - I tried to disable PBSA building for now but of course it is linked into Sander etc so we'd be disabling Sander as well.
>> >>> >>>
>> >>> >>> I'd suggest spending a few hours trying to fix it and if you can't the checkin should be reversed until this is fixed.
>> >>> >>>
>> >>> >>> Serial builds really aren't a viable option these days - try doing a serial build on a Xeon Phi and you'll understand. :-(
>> >>> >>>
>> >>> >>> All the best
>> >>> >>> Ross
>> >>> >>>
>> >>> >>>> On Feb 18, 2016, at 9:13 AM, Daniel Roe <daniel.r.roe.gmail.com> wrote:
>> >>> >>>>
>> >>> >>>> On Wed, Feb 17, 2016 at 6:04 PM, Ray Luo, Ph.D. <ray.luo.uci.edu> wrote:
>> >>> >>>>> Ross,
>> >>> >>>>>
>> >>> >>>>> Please pull again to see whether it fixed the build problem on your
>> >>> >>>>> side. Looks like multi-thread make does not work ... Too many
>> >>> >>>>> dependence issues ...
>> >>> >>>>
>> >>> >>>> The multi-threaded build definitely doesn't work for me when I hit pbsa stuff:
>> >>> >>>>
>> >>> >>>> $ make -j6 install
>> >>> >>>> ...
>> >>> >>>> make[2]: Entering directory '/home/droe/Amber/GIT/amber/AmberTools/src/pbsa'
>> >>> >>>> gcc -c -O3 -mtune=native -fPIC ...
>> >>> >>>> ...
>> >>> >>>> Makefile:238: recipe for target 'pb_p3m.LIBPBSA.o' failed
>> >>> >>>> Makefile:238: recipe for target 'pb_fdfrc.LIBPBSA.o' failed
>> >>> >>>>
>> >>> >>>> Amber is pretty large nowadays, and not being able to build in
>> >>> >>>> parallel is a real drawback. We can currently get around this by
>> >>> >>>> adding a '.NOTPARALLEL' target to the pbsa Makefile (or forcing -j1),
>> >>> >>>> but ideally I think the multi-threaded build should be fixed.
>> >>> >>>>
>> >>> >>>> -Dan
>> >>> >>>>
>> >>> >>>>>
>> >>> >>>>> All the best,
>> >>> >>>>> Ray
>> >>> >>>>> --
>> >>> >>>>> Ray Luo, Ph.D.
>> >>> >>>>> Professor
>> >>> >>>>> Biochemistry, Molecular Biophysics, Chemical Physics,
>> >>> >>>>> Chemical and Biomedical Engineering
>> >>> >>>>> University of California, Irvine, CA 92697-3900
>> >>> >>>>>
>> >>> >>>>>
>> >>> >>>>> On Wed, Feb 17, 2016 at 12:59 PM, Ray Luo, Ph.D. <ray.luo.uci.edu> wrote:
>> >>> >>>>>> Okay, I think I've fixed it. Don't know why the same makefile works in
>> >>> >>>>>> amber15 but doesn't work in amber16. I'm testing all the
>> >>> >>>>>> sander/pbsa/nab combinations to make sure all can build without
>> >>> >>>>>> interruption and also pass the tests.
>> >>> >>>>>>
>> >>> >>>>>> Ray
>> >>> >>>>>> --
>> >>> >>>>>> Ray Luo, Ph.D.
>> >>> >>>>>> Professor
>> >>> >>>>>> Biochemistry, Molecular Biophysics, Chemical Physics,
>> >>> >>>>>> Chemical and Biomedical Engineering
>> >>> >>>>>> University of California, Irvine, CA 92697-3900
>> >>> >>>>>>
>> >>> >>>>>>
>> >>> >>>>>> On Tue, Feb 16, 2016 at 10:34 PM, Ray Luo, Ph.D. <ray.luo.uci.edu> wrote:
>> >>> >>>>>>> Finally I can reproduce your problem in building the MPI sander ...
>> >>> >>>>>>>
>> >>> >>>>>>> Will get it fixed tomorrow morning ...
>> >>> >>>>>>>
>> >>> >>>>>>> Ray
>> >>> >>>>>>> --
>> >>> >>>>>>> Ray Luo, Ph.D.
>> >>> >>>>>>> Professor
>> >>> >>>>>>> Biochemistry, Molecular Biophysics, Chemical Physics,
>> >>> >>>>>>> Chemical and Biomedical Engineering
>> >>> >>>>>>> University of California, Irvine, CA 92697-3900
>> >>> >>>>>>>
>> >>> >>>>>>>
>> >>> >>>>>>> On Tue, Feb 16, 2016 at 10:17 PM, Ray Luo, Ph.D. <ray.luo.uci.edu> wrote:
>> >>> >>>>>>>> Hi Ross,
>> >>> >>>>>>>>
>> >>> >>>>>>>> Looks like there is no extra untracked file, at least in all the
>> >>> >>>>>>>> src folders, though there were indeed two .swp files in the test
>> >>> >>>>>>>> cases. So I think lack of "git clean -fdx" is not the problem. I've
>> >>> >>>>>>>> finished the serial build and am running "test.sander.BASIC". Looking
>> >>> >>>>>>>> good, so far.
>> >>> >>>>>>>>
>> >>> >>>>>>>> I'll do the MPI build next after another "git clean -fdx" ...
>> >>> >>>>>>>>
>> >>> >>>>>>>> However, I think one possible reason could be the use of multi-thread
>> >>> >>>>>>>> in make. There may be inter-dependency issues involved.
>> >>> >>>>>>>>
>> >>> >>>>>>>> I'll do some experiments with the MPI build to see whether this is the cause.
>> >>> >>>>>>>>
>> >>> >>>>>>>> Ray
>> >>> >>>>>>>> --
>> >>> >>>>>>>> Ray Luo, Ph.D.
>> >>> >>>>>>>> Professor
>> >>> >>>>>>>> Biochemistry, Molecular Biophysics, Chemical Physics,
>> >>> >>>>>>>> Chemical and Biomedical Engineering
>> >>> >>>>>>>> University of California, Irvine, CA 92697-3900
>> >>> >>>>>>>>
>> >>> >>>>>>>>
>> >>> >>>>>>>> On Tue, Feb 16, 2016 at 8:24 PM, Ross Walker <ross.rosswalker.co.uk> wrote:
>> >>> >>>>>>>>> Hi Ray,
>> >>> >>>>>>>>>
>> >>> >>>>>>>>> It's not about whether you compiled in that folder - it is whether you maybe edited files in that folder that you never committed / pushed. Try doing:
>> >>> >>>>>>>>>
>> >>> >>>>>>>>> git clean -f -d -x
>> >>> >>>>>>>>> git status
>> >>> >>>>>>>>>
>> >>> >>>>>>>>> And see what it reports.
>> >>> >>>>>>>>>
>> >>> >>>>>>>>> All the best
>> >>> >>>>>>>>> Ross
>> >>> >>>>>>>>>
>> >>> >>>>>>>>>> On Feb 16, 2016, at 23:04, Ray Luo <rluo.uci.edu> wrote:
>> >>> >>>>>>>>>>
>> >>> >>>>>>>>>> Hi Ross,
>> >>> >>>>>>>>>>
>> >>> >>>>>>>>>> Yes, I think my folder is clean since I have never compiled anything
>> >>> >>>>>>>>>> in my amber git folder. Every time after syncing with the master
>> >>> >>>>>>>>>> branch, I copy the whole master branch to a new folder for compiling
>> >>> >>>>>>>>>> and testing.
>> >>> >>>>>>>>>>
>> >>> >>>>>>>>>> I'm about to test this evening's synced version, but will do "git
>> >>> >>>>>>>>>> clean -fxd" before configuring/compiling. Maybe this is why.
>> >>> >>>>>>>>>>
>> >>> >>>>>>>>>> By the way, could you add my uci email to the mailing list as well,
>> >>> >>>>>>>>>> i.e. "ray.luo.uci.edu"?
>> >>> >>>>>>>>>>
>> >>> >>>>>>>>>> All the best,
>> >>> >>>>>>>>>> Ray
>> >>> >>>>>>>>>> --
>> >>> >>>>>>>>>> Ray Luo, Ph.D.
>> >>> >>>>>>>>>> Professor
>> >>> >>>>>>>>>> Biochemistry, Molecular Biophysics, Chemical Physics,
>> >>> >>>>>>>>>> Chemical and Biomedical Engineering
>> >>> >>>>>>>>>> University of California, Irvine, CA 92697-3900
>> >>> >>>>>>>>>>
>> >>> >>>>>>>>>>
>> >>> >>>>>>>>>> On Tue, Feb 16, 2016 at 7:51 PM, Ross Walker <ross.rosswalker.co.uk> wrote:
>> >>> >>>>>>>>>>> Hi Ray,
>> >>> >>>>>>>>>>>
>> >>> >>>>>>>>>>> If I do this - on my OSX laptop (GCC 4.9.3), MPICH version 3.1.4
>> >>> >>>>>>>>>>>
>> >>> >>>>>>>>>>> git pull
>> >>> >>>>>>>>>>> git clean -f -d -x
>> >>> >>>>>>>>>>> git status
>> >>> >>>>>>>>>>> On branch master
>> >>> >>>>>>>>>>> Your branch is up-to-date with 'origin/master'.
>> >>> >>>>>>>>>>> nothing to commit, working directory clean
>> >>> >>>>>>>>>>>
>> >>> >>>>>>>>>>> ./configure -mpi gnu
>> >>> >>>>>>>>>>> make install
>> >>> >>>>>>>>>>>
>> >>> >>>>>>>>>>> ...
>> >>> >>>>>>>>>>> ...
>> >>> >>>>>>>>>>> dnrm2.F90:113.24:
>> >>> >>>>>>>>>>>
>> >>> >>>>>>>>>>> ASSIGN 110 TO NEXT
>> >>> >>>>>>>>>>> 1
>> >>> >>>>>>>>>>> Warning: Deleted feature: ASSIGN statement at (1)
>> >>> >>>>>>>>>>> mpif90 -DBINTRAJ -DEMIL -DMPI -c -O3 -mtune=native -fPIC -ffree-form -I/Users/rcw/amber/amber/include -I/Users/rcw/amber/amber/include -o dcopy.o dcopy.F90
>> >>> >>>>>>>>>>> mpif90 -DBINTRAJ -DEMIL -DMPI -c -O3 -mtune=native -fPIC -ffree-form -I/Users/rcw/amber/amber/include -I/Users/rcw/amber/amber/include -o pb_force.o pb_force.F90
>> >>> >>>>>>>>>>> pb_force.F90:292.77:
>> >>> >>>>>>>>>>>
>> >>> >>>>>>>>>>> sa_init, sa_driver, sa_free, sa_free_mb, &
>> >>> >>>>>>>>>>> 1
>> >>> >>>>>>>>>>> Error: Symbol 'saslave_init' referenced at (1) not found in module 'solvent_accessibility'
>> >>> >>>>>>>>>>> make[3]: *** [pb_force.o] Error 1
>> >>> >>>>>>>>>>> make[2]: *** [libpbsa] Error 2
>> >>> >>>>>>>>>>> make[1]: *** [parallel] Error 2
>> >>> >>>>>>>>>>> make: *** [install] Error 2
>> >>> >>>>>>>>>>>
>> >>> >>>>>>>>>>> Same problem. Are you sure you have the latest tree and it is clean and doesn't have any modified files locally or files you forgot to add?
>> >>> >>>>>>>>>>>
>> >>> >>>>>>>>>>> All the best
>> >>> >>>>>>>>>>> Ross
>> >>> >>>>>>>>>>>
>> >>> >>>>>>>>>>>
>> >>> >>>>>>>>>>>> On Feb 16, 2016, at 15:56, Ray Luo, Ph.D. <ray.luo.uci.edu> wrote:
>> >>> >>>>>>>>>>>>
>> >>> >>>>>>>>>>>> Hi Ross,
>> >>> >>>>>>>>>>>>
>> >>> >>>>>>>>>>>> So far I can "configure -mpi gnu", "make install", and can also pass
>> >>> >>>>>>>>>>>> "make sander.BASIC.MPI". I'm running "make test.parallel.MM". I'm
>> >>> >>>>>>>>>>>> using "mpirun -np 4" for the mpi jobs. This is on Rocks 6.1/Centos
>> >>> >>>>>>>>>>>> 6.1. Which step was the problem?
>> >>> >>>>>>>>>>>>
>> >>> >>>>>>>>>>>> However I disabled "pbsa_mbfocus" in AmberTools/test/ since I'm in the
>> >>> >>>>>>>>>>>> process of removing this feature to accommodate incoming CUDA. I also
>> >>> >>>>>>>>>>>> need to update the test cases for sander/nab/mmpbsa ... which will be
>> >>> >>>>>>>>>>>> partially in today for sander/nab. I'm trying to get mmpbsa.py to work
>> >>> >>>>>>>>>>>> with the new python environment.
>> >>> >>>>>>>>>>>>
>> >>> >>>>>>>>>>>> All the best,
>> >>> >>>>>>>>>>>> Ray
>> >>> >>>>>>>>>>>> --
>> >>> >>>>>>>>>>>> Ray Luo, Ph.D.
>> >>> >>>>>>>>>>>> Professor
>> >>> >>>>>>>>>>>> Biochemistry, Molecular Biophysics, Chemical Physics,
>> >>> >>>>>>>>>>>> Chemical and Biomedical Engineering
>> >>> >>>>>>>>>>>> University of California, Irvine, CA 92697-3900
>> >>> >>>>>>>>>>>>
>> >>> >>>>>>>>>>>>
>> >>> >>>>>>>>>>>> On Tue, Feb 16, 2016 at 10:21 AM, Ray Luo, Ph.D. <ray.luo.uci.edu> wrote:
>> >>> >>>>>>>>>>>>> Hi Ross,
>> >>> >>>>>>>>>>>>>
>> >>> >>>>>>>>>>>>> I'm looking into this ... There was a major overhaul of the code in
>> >>> >>>>>>>>>>>>> addition to new features in the last check in.
>> >>> >>>>>>>>>>>>>
>> >>> >>>>>>>>>>>>> All the best,
>> >>> >>>>>>>>>>>>>
>> >>> >>>>>>>>>>>>> Ray
>> >>> >>>>>>>>>>>>>
>> >>> >>>>>>>>>>>>> --
>> >>> >>>>>>>>>>>>> Ray Luo, Ph.D.
>> >>> >>>>>>>>>>>>> Professor
>> >>> >>>>>>>>>>>>> Biochemistry, Molecular Biophysics, Chemical Physics,
>> >>> >>>>>>>>>>>>> Chemical and Biomedical Engineering
>> >>> >>>>>>>>>>>>> University of California, Irvine, CA 92697-3900
>> >>> >>>>>>>>>>>
>> >>> >>>>>>>>>
>> >>> >>>>>
>> >>> >>>>> _______________________________________________
>> >>> >>>>> AMBER-Developers mailing list
>> >>> >>>>> AMBER-Developers.ambermd.org
>> >>> >>>>> http://lists.ambermd.org/mailman/listinfo/amber-developers
>> >>> >>>>
>> >>> >>>>
>> >>> >>>>
>> >>> >>>> --
>> >>> >>>> -------------------------
>> >>> >>>> Daniel R. Roe, PhD
>> >>> >>>> Department of Medicinal Chemistry
>> >>> >>>> University of Utah
>> >>> >>>> 30 South 2000 East, Room 307
>> >>> >>>> Salt Lake City, UT 84112-5820
>> >>> >>>> http://home.chpc.utah.edu/~cheatham/
>> >>> >>>> (801) 587-9652
>> >>> >>>> (801) 585-6208 (Fax)
>> >>> >>>>
>> >>> >>>
>> >>> >>>
>> >>> >>>
>> >>> >>
>> >>> >> --
>> >>> >>
>> >>> >> ____________________________________________________________________________
>> >>> >>
>> >>> >> Prof. Gerald MONARD
>> >>> >> SRSMC, Université de Lorraine, CNRS
>> >>> >> Boulevard des Aiguillettes B.P. 70239
>> >>> >> F-54506 Vandoeuvre-les-Nancy, FRANCE
>> >>> >>
>> >>> >> e-mail : Gerald.Monard.univ-lorraine.fr
>> >>> >> tel. : +33 (0)383.684.381
>> >>> >> fax : +33 (0)383.684.371
>> >>> >> web : http://www.monard.info
>> >>> >>
>> >>> >>
>> >>> >> ____________________________________________________________________________
>> >>> >>
>> >>> >>
>> >>>
>> >>>
>> >>
>> >>
>>
>>
>>
>> --
>> -------------------------
>> Daniel R. Roe, PhD
>> Department of Medicinal Chemistry
>> University of Utah
>> 30 South 2000 East, Room 307
>> Salt Lake City, UT 84112-5820
>> http://home.chpc.utah.edu/~cheatham/
>> (801) 587-9652
>> (801) 585-6208 (Fax)
>>
>>
>
>
_______________________________________________
AMBER-Developers mailing list
AMBER-Developers.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber-developers
Received on Sat Mar 05 2016 - 13:00:06 PST