Re: [AMBER-Developers] Issues with Amber test framework

From: Daniel Roe via AMBER-Developers <amber-developers.ambermd.org>
Date: Wed, 27 Mar 2024 13:16:14 -0400

Adding this after the "source $AMBERHOME/test/program_error.sh" line seems
to fix (protect) the GTI test
($AMBERHOME/test/cuda/gti/lambda_remd/multi-window/Run):

```
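# Ask the MPI launcher how many processes it actually starts.
numprocs=`$DO_PARALLEL $AMBERHOME/test/numprocs`

# Skip (rather than fail) unless there is exactly one process per replica.
if [ "$numprocs" -ne "$NREP" ] ; then
  echo "$NREP-window lambda replica PME NPT test requires $NREP procs, have $numprocs"
  echo "Skipping."
  exit 0
fi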
```
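
The missing-'bc' problem described in the original message below could be guarded in a similar way. What follows is only a minimal sketch: have_prog and require_prog are hypothetical helper names, not existing parts of the Amber test framework, and the messages are made up to match the style above.

```shell
# Hypothetical helper (not part of the Amber test framework): succeed if
# the named external program (for example 'bc') can be found on PATH.
have_prog() {
  command -v "$1" >/dev/null 2>&1
}

# Sketch of a guard for a Run script: print a message and return non-zero
# so the caller can skip the test instead of building a broken groupfile.
require_prog() {
  if ! have_prog "$1" ; then
    echo "Test requires '$1', which was not found in PATH."
    echo "Skipping."
    return 1
  fi
  return 0
}
```

In a Run script this would be used as "require_prog bc || exit 0", placed before the first call to bc.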

-Dan

On Wed, Mar 27, 2024 at 12:27 PM Daniel Roe <daniel.r.roe.gmail.com> wrote:
>
> Hi all,
>
> In the course of testing the latest RC I hit a few issues with the
> current test framework. The first is that several tests now depend on
> the 'bc' binary, but there is no check for it, so you end up with a
> lot of:
>
> ==============================================================
> >>>>>>> doing 'bbmd'
> ./run-pmemd.sh: 1: bc: not found
> ./run-pmemd.sh: 1: bc: not found
> ./run-pmemd.sh: 1: bc: not found
>
> This leads to further issues when running a test depends on the
> output from 'bc'; for example, the gti/lambda_remd/multi-window/ test
> tries to use bc to calculate lambda values, which fails (the resulting
> groupfile doesn't have enough entries). This test also runs in
> parallel no matter what DO_PARALLEL is set to, so unless you have np
> set to the target number of replicas it fails. I'm not sure if there
> are other tests that aren't properly checking for how many MPI
> processes there actually are.
>
> This also led to a strange situation where because of the program
> error a control character got written to the output log:
>
> --------------------------------------------------------------------------
> ^../Run: Program error
> make[1]: [Makefile:394: test.pmemd.cuda.gti.remd] Error 1 (ignored)
>
> This made 'grep' think the file was binary, so grepping for FAIL
> actually doesn't work as expected (it incorrectly looks like 0
> failures):
>
> $ grep FAIL /u/droe/amber/RC/2024/amber24/logs/test_amber_cuda_parallel/2024-03-27_10-31-59.log
> grep: /u/droe/amber/RC/2024/amber24/logs/test_amber_cuda_parallel/2024-03-27_10-31-59.log: binary file matches
> $ grep FAIL /u/droe/amber/RC/2024/amber24/logs/test_amber_cuda_parallel/2024-03-27_10-31-59.log | wc -l
> grep: /u/droe/amber/RC/2024/amber24/logs/test_amber_cuda_parallel/2024-03-27_10-31-59.log: binary file matches
> 0
>
> So TL;DR we should be checking that 'bc' is present before tests are
> run, and maybe also somehow check that the log is greppable so the
> final error count can be trusted (maybe by making sure 'file
> <logfile>' doesn't return "data"?).
>
> -Dan
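
One way the "is the log greppable" check could look is sketched below. Note that this swaps in a different technique from the 'file <logfile>' idea above: it uses tr to drop every byte that is not a tab, newline, carriage return, or printable ASCII character, so grep never sees the stray control character. The function names log_has_nontext and count_matches are hypothetical, not part of the Amber test framework.

```shell
# Bytes kept by the filter: tab, newline, carriage return, printable ASCII.
TEXT_BYTES='\11\12\15\40-\176'

# Hypothetical check: succeed if the log contains bytes outside TEXT_BYTES
# (for example a stray control character written after a program error).
log_has_nontext() {
  n_all=$(wc -c < "$1")
  n_txt=$(tr -cd "$TEXT_BYTES" < "$1" | wc -c)
  [ "$((n_all))" -ne "$((n_txt))" ]
}

# Count lines matching a pattern after filtering out the non-text bytes.
# grep -c exits non-zero when the count is 0, so its status is ignored.
count_matches() {
  tr -cd "$TEXT_BYTES" < "$2" | grep -c -- "$1" || true
}
```

For example, "count_matches FAIL logfile" prints the number of lines containing FAIL, and "log_has_nontext logfile && echo 'log contains non-text bytes'" could be used to warn that a plain grep may report "binary file matches".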

_______________________________________________
AMBER-Developers mailing list
AMBER-Developers.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber-developers
Received on Wed Mar 27 2024 - 10:30:02 PDT