Re: [AMBER-Developers] Sander parallel build broken by PBSA updates

From: Daniel Roe <daniel.r.roe.gmail.com>
Date: Sat, 5 Mar 2016 13:34:09 -0700

On Sat, Mar 5, 2016 at 12:36 AM, Hai Nguyen <nhai.qn.gmail.com> wrote:
> FYI: I've just tried "make install -j24" on another machine and got an error

I wonder if "make" with 24 threads is simply testing the limits of the
already complex and fragile network of dependencies that sander
has...does the problem happen at lower "make" thread counts?
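
As a rough sketch of that check (not something from the Amber tree; the
function name try_job_counts and the LOG_DIR variable are made up for
illustration), something like this could rerun a build command at several
job counts and report which ones fail:

```shell
# Hypothetical helper: run a build command once per job count and report
# whether each run succeeded. Logs go to $LOG_DIR (default /tmp) so that a
# "git clean -fdx" in the source tree does not remove them.
try_job_counts() {
    build_cmd=$1
    shift
    for jobs in "$@"; do
        log="${LOG_DIR:-/tmp}/build-j$jobs.log"
        # $build_cmd is left unquoted on purpose so that "make install"
        # splits into a command and its argument.
        if $build_cmd "-j$jobs" >"$log" 2>&1; then
            printf '%s\n' "-j$jobs: ok"
        else
            printf '%s\n' "-j$jobs: FAILED (see $log)"
        fi
    done
}
```

It could be called as: try_job_counts "make install" 1 2 4 8 24. As written
it does not clean the tree between runs, so for a fair comparison each count
would need a fresh "git clean -fdx" and configure first. Since the failure
is reported to be intermittent, a single passing run at a given count
probably doesn't prove much; repeating each count a few times would be more
telling.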

-Dan

>
> "Fatal Error: Can't rename module file 'pbtimer_module.mod0' to 'pbtimer_module.mod': No such file or directory
> make[2]: *** [timer.o] Error 1
> make[2]: *** Waiting for unfinished jobs....
> make[2]: Leaving directory `/gpfs/gpfs/project1/dacase-001/haichit/amber/amber/AmberTools/src/pbsa'
> make[1]: *** [serial] Error 2
> make[1]: Leaving directory `/gpfs/gpfs/project1/dacase-001/haichit/amber/amber/AmberTools/src'
> make: *** [install] Error 2
> "
>
> Info: gcc 4.4.7, Linux, 24 cores/node. I used "git clean -fdx ." then
> "./configure -noX11 gnu" then "make install -j24"; git commit:
> 570e427eaf02a99d81e6bf1c36d7d3a51a27c529 (today).
>
> However, this error seems to happen randomly; sometimes the install
> finishes without any errors.
>
> Hai
>
> On Thu, Feb 18, 2016 at 3:54 PM, Hai Nguyen <nhai.qn.gmail.com> wrote:
>
>> I just tried "make -j 8 install" for the serial build and it was OK on my
>> machine too.
>>
>> Hai
>>
>> On Thu, Feb 18, 2016 at 3:36 PM, Ray Luo <rluo.uci.edu> wrote:
>>
>>> Gerald,
>>>
>>> Your fix also worked on my side! Thanks a lot!
>>>
>>> I used "make -j 8 install", though I skipped the Python install in both
>>> the serial and parallel tests.
>>>
>>> All the best,
>>> Ray
>>> --
>>> Ray Luo, Ph.D.
>>> Professor
>>> Biochemistry, Molecular Biophysics, Chemical Physics,
>>> Chemical and Biomedical Engineering
>>> University of California, Irvine, CA 92697-3900
>>>
>>>
>>> On Thu, Feb 18, 2016 at 11:23 AM, Ray Luo <rluo.uci.edu> wrote:
>>> > Thanks, I'm downloading your fix ...
>>> >
>>> > Also, the python-related installation seems to break the dependencies
>>> > whether I use single- or multi-threaded make ... I don't know why
>>> > my changes in pbsa and sander have caused this to break.
>>> >
>>> > All the best,
>>> > Ray
>>> >
>>> >
>>> > On Thu, Feb 18, 2016 at 11:08 AM, Gerald Monard
>>> > <Gerald.Monard.univ-lorraine.fr> wrote:
>>> >> Hi,
>>> >>
>>> >> I think that I found a fix. The problem was in the makedepend program
>>> >> in pbsa/ that had not been updated. That should work now. Can anybody
>>> >> test?
>>> >>
>>> >> Best,
>>> >>
>>> >> Gerald.
>>> >>
>>> >> On 02/18/2016 07:01 PM, Ross Walker wrote:
>>> >>> I second this - I tried to disable PBSA building for now, but of
>>> >>> course it is linked into Sander etc., so we'd be disabling Sander as well.
>>> >>>
>>> >>> I'd suggest spending a few hours trying to fix it, and if you can't,
>>> >>> the check-in should be reverted until this is fixed.
>>> >>>
>>> >>> Serial builds really aren't a viable option these days - try doing a
>>> >>> serial build on a Xeon Phi and you'll understand. :-(
>>> >>>
>>> >>> All the best
>>> >>> Ross
>>> >>>
>>> >>>> On Feb 18, 2016, at 9:13 AM, Daniel Roe <daniel.r.roe.gmail.com> wrote:
>>> >>>>
>>> >>>> On Wed, Feb 17, 2016 at 6:04 PM, Ray Luo, Ph.D. <ray.luo.uci.edu> wrote:
>>> >>>>> Ross,
>>> >>>>>
>>> >>>>> Please pull again to see whether it fixed the build problem on your
>>> >>>>> side. Looks like multi-threaded make does not work ... Too many
>>> >>>>> dependency issues ...
>>> >>>>
>>> >>>> The multi-threaded build definitely doesn't work for me when I hit
>>> >>>> pbsa stuff:
>>> >>>>
>>> >>>> $ make -j6 install
>>> >>>> ...
>>> >>>> make[2]: Entering directory '/home/droe/Amber/GIT/amber/AmberTools/src/pbsa'
>>> >>>> gcc -c -O3 -mtune=native -fPIC ...
>>> >>>> ...
>>> >>>> Makefile:238: recipe for target 'pb_p3m.LIBPBSA.o' failed
>>> >>>> Makefile:238: recipe for target 'pb_fdfrc.LIBPBSA.o' failed
>>> >>>>
>>> >>>> Amber is pretty large nowadays, and not being able to build in
>>> >>>> parallel is a real drawback. We can currently get around this by
>>> >>>> adding a '.NOTPARALLEL' target to the pbsa Makefile (or forcing -j1),
>>> >>>> but ideally I think the multi-threaded build should be fixed.
>>> >>>>
>>> >>>> -Dan
>>> >>>>
>>> >>>>>
>>> >>>>> All the best,
>>> >>>>> Ray
>>> >>>>>
>>> >>>>>
>>> >>>>> On Wed, Feb 17, 2016 at 12:59 PM, Ray Luo, Ph.D. <ray.luo.uci.edu> wrote:
>>> >>>>>> Okay, I think I've fixed it. Don't know why the same makefile works in
>>> >>>>>> amber15 but doesn't work in amber16. I'm testing all the
>>> >>>>>> sander/pbsa/nab combinations to make sure all can build without
>>> >>>>>> interruption and also pass the tests.
>>> >>>>>>
>>> >>>>>> Ray
>>> >>>>>>
>>> >>>>>>
>>> >>>>>> On Tue, Feb 16, 2016 at 10:34 PM, Ray Luo, Ph.D. <ray.luo.uci.edu> wrote:
>>> >>>>>>> Finally I can reproduce your problem in building the MPI sander ...
>>> >>>>>>>
>>> >>>>>>> Will get it fixed tomorrow morning ...
>>> >>>>>>>
>>> >>>>>>> Ray
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>> On Tue, Feb 16, 2016 at 10:17 PM, Ray Luo, Ph.D. <ray.luo.uci.edu> wrote:
>>> >>>>>>>> Hi Ross,
>>> >>>>>>>>
>>> >>>>>>>> It looks like there are no extra untracked files, at least in all the
>>> >>>>>>>> src folders, though there were indeed two .swp files in the test
>>> >>>>>>>> cases. So I think lack of "git clean -fdx" is not the problem. I've
>>> >>>>>>>> finished the serial build and am running "test.sander.BASIC". Looking
>>> >>>>>>>> good, so far.
>>> >>>>>>>>
>>> >>>>>>>> I'll do the MPI build next after another "git clean -fdx" ...
>>> >>>>>>>>
>>> >>>>>>>> However, I think one possible reason could be the use of
>>> >>>>>>>> multi-threaded make. There may be dependency issues involved.
>>> >>>>>>>>
>>> >>>>>>>> I'll do some experiments with the MPI build to see whether this
>>> >>>>>>>> is the cause.
>>> >>>>>>>>
>>> >>>>>>>> Ray
>>> >>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>> On Tue, Feb 16, 2016 at 8:24 PM, Ross Walker <ross.rosswalker.co.uk> wrote:
>>> >>>>>>>>> Hi Ray,
>>> >>>>>>>>>
>>> >>>>>>>>> It's not about whether you compiled in that folder - it is whether
>>> >>>>>>>>> you maybe edited files in that folder that you never committed /
>>> >>>>>>>>> pushed. Try doing:
>>> >>>>>>>>>
>>> >>>>>>>>> git clean -f -d -x
>>> >>>>>>>>> git status
>>> >>>>>>>>>
>>> >>>>>>>>> And see what it reports.
>>> >>>>>>>>>
>>> >>>>>>>>> All the best
>>> >>>>>>>>> Ross
>>> >>>>>>>>>
>>> >>>>>>>>>> On Feb 16, 2016, at 23:04, Ray Luo <rluo.uci.edu> wrote:
>>> >>>>>>>>>>
>>> >>>>>>>>>> Hi Ross,
>>> >>>>>>>>>>
>>> >>>>>>>>>> Yes, I think my folder is clean since I have never compiled anything
>>> >>>>>>>>>> in my amber git folder. Every time after syncing with the master
>>> >>>>>>>>>> branch, I copy the whole master branch to a new folder for compiling
>>> >>>>>>>>>> and testing.
>>> >>>>>>>>>>
>>> >>>>>>>>>> I'm about to test this evening's synced version, but will do "git
>>> >>>>>>>>>> clean -fxd" before configuring/compiling. Maybe this is why.
>>> >>>>>>>>>>
>>> >>>>>>>>>> By the way, could you add my uci email to the mailing list as well,
>>> >>>>>>>>>> i.e. "ray.luo.uci.edu"?
>>> >>>>>>>>>>
>>> >>>>>>>>>> All the best,
>>> >>>>>>>>>> Ray
>>> >>>>>>>>>>
>>> >>>>>>>>>>
>>> >>>>>>>>>> On Tue, Feb 16, 2016 at 7:51 PM, Ross Walker <ross.rosswalker.co.uk> wrote:
>>> >>>>>>>>>>> Hi Ray,
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> If I do this - on my OSX laptop (GCC 4.9.3), MPICH version 3.1.4
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> git pull
>>> >>>>>>>>>>> git clean -f -d -x
>>> >>>>>>>>>>> git status
>>> >>>>>>>>>>> On branch master
>>> >>>>>>>>>>> Your branch is up-to-date with 'origin/master'.
>>> >>>>>>>>>>> nothing to commit, working directory clean
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> ./configure -mpi gnu
>>> >>>>>>>>>>> make install
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> ...
>>> >>>>>>>>>>> ...
>>> >>>>>>>>>>> dnrm2.F90:113.24:
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> ASSIGN 110 TO NEXT
>>> >>>>>>>>>>> 1
>>> >>>>>>>>>>> Warning: Deleted feature: ASSIGN statement at (1)
>>> >>>>>>>>>>> mpif90 -DBINTRAJ -DEMIL -DMPI -c -O3 -mtune=native -fPIC -ffree-form -I/Users/rcw/amber/amber/include -I/Users/rcw/amber/amber/include -o dcopy.o dcopy.F90
>>> >>>>>>>>>>> mpif90 -DBINTRAJ -DEMIL -DMPI -c -O3 -mtune=native -fPIC -ffree-form -I/Users/rcw/amber/amber/include -I/Users/rcw/amber/amber/include -o pb_force.o pb_force.F90
>>> >>>>>>>>>>> pb_force.F90:292.77:
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> sa_init, sa_driver, sa_free, sa_free_mb, &
>>> >>>>>>>>>>> 1
>>> >>>>>>>>>>> Error: Symbol 'saslave_init' referenced at (1) not found in module 'solvent_accessibility'
>>> >>>>>>>>>>> make[3]: *** [pb_force.o] Error 1
>>> >>>>>>>>>>> make[2]: *** [libpbsa] Error 2
>>> >>>>>>>>>>> make[1]: *** [parallel] Error 2
>>> >>>>>>>>>>> make: *** [install] Error 2
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> Same problem. Are you sure you have the latest tree and it is
>>> >>>>>>>>>>> clean and doesn't have any modified files locally or files you
>>> >>>>>>>>>>> forgot to add?
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> All the best
>>> >>>>>>>>>>> Ross
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>> On Feb 16, 2016, at 15:56, Ray Luo, Ph.D. <ray.luo.uci.edu> wrote:
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> Hi Ross,
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> So far I can "configure -mpi gnu", "make install", and can also pass
>>> >>>>>>>>>>>> "make sander.BASIC.MPI". I'm running "make test.parallel.MM". I'm
>>> >>>>>>>>>>>> using "mpirun -np 4" for the mpi jobs. This is on Rocks 6.1/Centos
>>> >>>>>>>>>>>> 6.1. Which step was the problem?
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> However I disabled "pbsa_mbfocus" in AmberTools/test/ since I'm in the
>>> >>>>>>>>>>>> process of removing this feature to accommodate incoming CUDA. I also
>>> >>>>>>>>>>>> need to update the test cases for sander/nab/mmpbsa ... which will be
>>> >>>>>>>>>>>> partially in today for sander/nab. I'm trying to get mmpbsa.py to work
>>> >>>>>>>>>>>> with the new python environment.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> All the best,
>>> >>>>>>>>>>>> Ray
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> On Tue, Feb 16, 2016 at 10:21 AM, Ray Luo, Ph.D. <ray.luo.uci.edu> wrote:
>>> >>>>>>>>>>>>> Hi Ross,
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>> I'm looking into this ... There was a major overhaul of the code in
>>> >>>>>>>>>>>>> addition to new features in the last check in.
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>> All the best,
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>> Ray
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>
>>> >>>>> _______________________________________________
>>> >>>>> AMBER-Developers mailing list
>>> >>>>> AMBER-Developers.ambermd.org
>>> >>>>> http://lists.ambermd.org/mailman/listinfo/amber-developers
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>> --
>>> >>>> -------------------------
>>> >>>> Daniel R. Roe, PhD
>>> >>>> Department of Medicinal Chemistry
>>> >>>> University of Utah
>>> >>>> 30 South 2000 East, Room 307
>>> >>>> Salt Lake City, UT 84112-5820
>>> >>>> http://home.chpc.utah.edu/~cheatham/
>>> >>>> (801) 587-9652
>>> >>>> (801) 585-6208 (Fax)
>>> >>>>
>>> >>>
>>> >>>
>>> >>>
>>> >>
>>> >> --
>>> >>
>>> ____________________________________________________________________________
>>> >>
>>> >> Prof. Gerald MONARD
>>> >> SRSMC, Université de Lorraine, CNRS
>>> >> Boulevard des Aiguillettes B.P. 70239
>>> >> F-54506 Vandoeuvre-les-Nancy, FRANCE
>>> >>
>>> >> e-mail : Gerald.Monard.univ-lorraine.fr
>>> >> tel. : +33 (0)383.684.381
>>> >> fax : +33 (0)383.684.371
>>> >> web : http://www.monard.info
>>> >>
>>> >>
>>> ____________________________________________________________________________
>>> >>
>>> >>
>>>
>>>
>>
>>



-- 
-------------------------
Daniel R. Roe, PhD
Department of Medicinal Chemistry
University of Utah
30 South 2000 East, Room 307
Salt Lake City, UT 84112-5820
http://home.chpc.utah.edu/~cheatham/
(801) 587-9652
(801) 585-6208 (Fax)
Received on Sat Mar 05 2016 - 13:00:04 PST