Re: amber-developers: latest tarballs for AmberTools and Amber10

From: Lachele Foley <lfoley.ccrc.uga.edu>
Date: Sun, 30 Mar 2008 17:47:41 -0400

> Sorry for the long email,

Heh... you think -that- was long...


> This means that the "save" file had the Amber 10 header, but that the file
> you ran had an Amber 9 header. This in turn means that somehow you
> are not really running the current Amber10 codes. Can you double check your
> environment variables and paths, etc.? I don't quite know how to parse
> this, but it doesn't really make sense.

Dang it... I called myself trying to make sure that I hadn't messed up any of the test files, so I had re-run all the tests, but... I forgot to "source .amber_sh" before re-running them. I saw that, too, but the relevance didn't sink in.

Anyhow, I just re-ran them again, with proper environments set. I even ran the parallel tests as "lachele", who runs jobs regularly, as opposed to "installer" in order to minimize scheduler issues -- and I set AMBERHOME in lachele's .cshrc file in case there was some login thing with pbs and the compute nodes. Now I've got 6.4 MB worth of diff output from test.parallel (6.6 on the i686 system) rather than a mere 5.9 MB. I didn't look through those files -- sorry. I'll happily send them to you as well as the files containing stdout redirects for the parallel tests (ia64 and i686, gfortran) if you wish. From a brief scan it looks like the diff comparisons actually happened.
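For what it's worth, the mistake above (running the tests without the Amber environment set) could be caught before a test run with a small pre-flight check. This is only a sketch: the function name `check_amber_env` is hypothetical and not part of Amber, and it assumes that executables live under `$AMBERHOME/exe`, as the paths in the logs below suggest.

```shell
# Hypothetical pre-flight check; not part of the Amber distribution.
check_amber_env() {
    # Fail early if AMBERHOME is unset or empty (e.g. the setup file was not sourced).
    if [ -z "${AMBERHOME:-}" ]; then
        echo "AMBERHOME is not set" >&2
        return 1
    fi
    # Fail if the install tree has no exe directory (assumed layout).
    if [ ! -d "$AMBERHOME/exe" ]; then
        echo "no exe directory under $AMBERHOME" >&2
        return 1
    fi
    echo "using AMBERHOME=$AMBERHOME"
}
```

Calling `check_amber_env || exit 1` at the top of a wrapper script would stop the run before any test output gets compared against the wrong executables.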

The parallel_QMMM tests that ran seem to have passed. Otherwise, the only things that jump out at me as possible problems are these from test.serial:

---------------------------------------
possible FAILURE: check 2temp.out.dif
/usr/local/programs/amber10/test/LES_TEMP
75c75
<  Etot   =        49.9288  EKtot   =        15.3650  EPtot      =        34.5638
---
>  Etot   =         0.  EKtot   =        15.3650  EPtot      =        34.5638
---------------------------------------
---------------------------------------
possible FAILURE:  check amoeba_wat2.out.dif
/usr/local/programs/amber10/test/amoeba_wat2
206a207
>  EKCMT  =         0.  VIRIAL  =         0.  VOLUME     =         0.0002
208a210
>                                                     Density    =         0.
---------------------------------------
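Since flagged tests like the two above are easy to miss in a few hundred lines of stdout, one way to pull them out of a saved log is sketched here. The helper name `list_failures` and the log file name in the usage note are assumptions; the only thing taken from the actual output is the "possible FAILURE" wording.

```shell
# Hypothetical helper: print the .dif file named on each flagged line of a saved test log.
list_failures() {
    # Match the "possible FAILURE" lines and print their last field (the .dif file name).
    awk '/possible FAILURE/ { print $NF }' "$1"
}
```

For example, after something like `make test.serial > test.serial.log 2>&1`, running `list_failures test.serial.log` would list just the flagged .dif files.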
I also noticed these things:
1] The test/README implies 4 processors, so that's what I used in "DO_PARALLEL," but a number of tests complain "too many processors for this test, exiting (Max = 2)".  Perhaps a statement about that should be in the README because most people have more sense than me and aren't going to watch the stdout statements or scroll back a few hundred lines to catch this sort of thing (see also later).  And, while on this point, why max=2?  I like 4 because I can check communication between two processors on one node and across two different nodes.  Is it naive to think that a reasonable test?
2] Do you mean Amber 8 here?  It passed, whatever you meant.
==============================================================
cd LES && ./Run.PME_LES
  Amber 8 ADDLES and SANDER.LES test:
addles:
diffing output_addles.save with output_addles
PASSED
==============================================================
3] Please pardon any naivety on my part regarding the relationship between number of threads and processors, but assuming that the test was set up properly and that the problem is something I have control over, I think the following complaints refer to my 4 processors not being enough (?).  If I interpreted that correctly, then could there be different parallel tests that do not have conflicting processor requirements (above, 4 was too many...)?  I don't think it ran because the scheduler didn't return a job number.  There are a number of complaints after this that I think relate to this one not working.  I can send that output if you want.
==============================================================
make[1]: Leaving directory `/usr/local/programs/amber10/test'
cd neb/neb_gb && ./Run.neb_classical
 This test case requires a least 8 mpi threads.
 The number of mpi threads must also be a multiple of 8 and not more than 24.
 Not running test, exiting.....
cd neb/neb_gb_large_system && ./Run.neb_ls_classical
 This test case requires a least 32 mpi threads.
 The number of mpi threads must also be a multiple of 32 and not more than 128.
 Not running test, exiting.....
cd ncsu && ./run-parallel.sh
>>>>>>> doing 'abmd_ANALYSIS'
diffing save/mdout with mdout
possible FAILURE:  check mdout.dif
==============================================================
4] In the parallel QMMM tests, everything that ran seemed to pass.  What didn't run says this:
==============================================================
make[1]: Leaving directory `/usr/local/programs/amber10/test'
export TESTsander=/usr/local/programs/amber10/exe/sander.MPI; make test.sander.DFTB
make[1]: Entering directory `/usr/local/programs/amber10/test'
cd qmmm_DFTB/crambin_DFTB && ./Run.crambin
DFTB SLKO files not found - Skipping Test...
cd qmmm_DFTB/crambin_DFTB && ./Run.crambin_md_hot_start
DFTB SLKO files not found - Skipping Test...
...(leaving out lots of similar complaints)
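
The processor-count limits quoted in items 1] and 3] above can be checked against a given count with a few lines of shell. This is a rough sketch: `nproc_ok` is a hypothetical helper, and the minimum and step of 1 for the "Max = 2" tests are assumptions, since that message states only the maximum.

```shell
# Hypothetical helper (not part of the Amber test suite).
# usage: nproc_ok N MIN STEP MAX
# Succeeds when N lies within [MIN, MAX] and is a multiple of STEP.
nproc_ok() {
    [ "$1" -ge "$2" ] && [ "$1" -le "$4" ] && [ $(( $1 % $3 )) -eq 0 ]
}

# Limits taken from the messages above; MIN=1 and STEP=1 for the
# "Max = 2" tests are assumed.
for n in 2 4 8 32; do
    if nproc_ok "$n" 1 1 2;     then echo "$n: fits the Max = 2 tests"; fi
    if nproc_ok "$n" 8 8 24;    then echo "$n: fits neb_gb"; fi
    if nproc_ok "$n" 32 32 128; then echo "$n: fits neb_gb_large_system"; fi
done
```

With these limits, 4 satisfies none of the three groups, and no single count satisfies all of them, so covering every test would take more than one pass with different settings.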

Ok... apologies for whatever I got wrong this time... (must be something... always is... but, I guess it's worth knowing what your users are likely to get wrong  :-)

> the things you report don't show up anywhere else

I have a knack for that...  I might have gotten some part of this wrong.  But I tried!  The only significant issue seems to be in the "make test.parallel".  It might be about gfortran, but that seems unlikely if all the other tests look mostly ok.  It might also be about me, but maybe if I got the other three more or less right, I might have gotten that one right, too.  You want one of the big parallel test diff files?  My cursory scan says they look similar.

I can try g95, but not before Monday, and maybe not for a week or two.

:-) Lachele

--
B. Lachele Foley, PhD '92,'02
Assistant Research Scientist
Complex Carbohydrate Research Center, UGA
706-542-0263
lfoley.ccrc.uga.edu
Received on Fri Apr 18 2008 - 21:15:25 PDT