RE: amber-developers: Testing Parallel

From: Ross Walker <ross.rosswalker.co.uk>
Date: Mon, 14 Apr 2008 14:09:57 -0700

Hi Lachele,

Most PBS installations let you request interactive use; this is how I
test things. You do something like:

qsub -I -V -l walltime=02:00:00 -l nodes=2:ppn=2

Then you end up, after a wait, with a shell on one of the nodes. Then you
can do:

setenv DO_PARALLEL "mpirun -np 4 -machinefile $PBS_NODEFILE"
setenv AMBERHOME '/foo/bar/amber10'
cd $AMBERHOME/test
make test.parallel

When you are done, type 'exit' to quit the interactive session. If
your cluster doesn't currently have this enabled, I'd ask the person in
charge whether they could enable it for you.
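As a minimal sketch of how the process count relates to the request above (nodes=2:ppn=2 gives 4 slots), the -np value can be derived from the node file instead of being hard-coded. This assumes the node file that PBS exposes as PBS_NODEFILE holds one line per MPI slot, which may not be true on every installation. build_do_parallel is a hypothetical helper, and the snippet is POSIX sh rather than the csh used above.

```shell
#!/bin/sh
# Sketch only: build_do_parallel is a hypothetical helper, not part of AMBER.
# Assumption: the node file has one line per MPI slot.
build_do_parallel() {
    nodefile=$1
    # wc -l may pad its output with spaces on some systems; strip them.
    np=$(wc -l < "$nodefile" | tr -d ' ')
    echo "mpirun -np $np -machinefile $nodefile"
}

# Inside a PBS session the scheduler sets PBS_NODEFILE; elsewhere this is skipped.
if [ -n "${PBS_NODEFILE:-}" ]; then
    DO_PARALLEL=$(build_do_parallel "$PBS_NODEFILE")
    export DO_PARALLEL
    echo "DO_PARALLEL=$DO_PARALLEL"
fi
```

Run from within the interactive session, this would print the value to use for DO_PARALLEL before starting 'make test.parallel'.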

Good luck,

Ross

/\
\/
|\oss Walker

| Assistant Research Professor |
| San Diego Supercomputer Center |
| Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
| http://www.rosswalker.co.uk | PGP Key available on request |

Note: Electronic Mail is not secure, has no guarantee of delivery, may not
be read every day, and should not be used for urgent or sensitive issues.

> -----Original Message-----
> From: owner-amber-developers.scripps.edu
> [mailto:owner-amber-developers.scripps.edu] On Behalf Of Lachele Foley
> Sent: Monday, April 14, 2008 13:12
> To: amber-developers.scripps.edu
> Subject: Re: amber-developers: Testing Parallel
>
> > Your machine's queue situation seems odd.
>
> Do lots of people use pbs? I've been reading the fine
> manual, and I can't see how
>
> $DO_PARALLEL $TESTsander anything
>
> is going to work. To use qsub without a job script involves
> starting up a little mini-qsub shell that requires a cntrl-D
> at the end. Quoting the manual from the "Jobs Without a Job
> Script" section ("<ret>" means "hit return"):
>
> qsub <ret>
> [directives]
> [tasks]
> ctrl-D
>
> The amber parallel test suite does not easily lend itself to
> generating a job script or to doing this command-line thing.
> How do folks who use pbs use the tests? Until now, I just
> looked at the output, shuddered, ran an old job of mine,
> checked the output then went on to the next thing on my list.
> I could certainly whip up some rebel engineering (see below)
> to get around it, but does everyone do that? I admit to not
> being the world's best reader, so I might have missed
> something. This is PBS Pro 8.0. Maybe later versions are different.
>
> > Anyway, try
> > set echo
> > set verbose
> > then your two command invocations to see the differences.
> > My guess is that "-W block=true" -> '-W block=true'
> > might work and won't hurt.
>
> I tried that. And similar variations. And bash, csh, ash,
> zsh, ksh... None worked. Thanks, though. Good idea.
>
> Unwilling to be done in by a quote mark:
>
> ======= begin excessive palliative engineering =================
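> ]$ more amber_parallel_test.c
> #include <stdio.h>
> #include <stdlib.h>
> #include <stddef.h>
> #include <string.h>
>
> /* Prefix the arguments with a blocking scasub invocation and run it. */
> int main(int argc, char *argv[]){
> int a=0;
> char command[4001];
>
> strcpy(command,"/opt/scali/bin/scasub -qsparams \"-W block=true\""
>                " -mpimon -network gm0,smp -np 2 -npn 2 ");
> for(a=1;a<argc;a++){
>   /* leave room for the separating space, the final newline and the NUL */
>   if(strlen(command)+strlen(argv[a])+3 > sizeof(command)){
>     fprintf(stderr,"command too long\n");
>     return 1;
>   }
>   strcat(command," ");
>   strcat(command,argv[a]);
> }
> strcat(command,"\n");
> system(command);
>
> return 0;
> }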
> $ gcc amber_parallel_test.c
> $ setenv DO_PARALLEL ./a.out
> $ $DO_PARALLEL $sander -O -i mdin -c 01.ann10.xyz -o noesy.out
>
> ======= end excessive (and successful) palliative engineering =================
>
> I suspect my string handling could be swifter, but it works.
>
> Sigh.
>
> :-) Lachele
> --
> B. Lachele Foley, PhD '92,'02
> Assistant Research Scientist
> Complex Carbohydrate Research Center, UGA
> 706-542-0263
> lfoley.ccrc.uga.edu
>
>
> ----- Original Message -----
> From: Scott Brozell [mailto:sbrozell.scripps.edu]
> To: amber-developers.scripps.edu
> Sent: Mon, 14 Apr 2008 16:04:41 -0400
> Subject: Re: amber-developers: Testing Parallel
>
>
> > Hi,
> >
> > On Mon, 14 Apr 2008, Lachele Foley wrote:
> >
> > > More fun with parallel tests: mostly FYI, but with questions. And
> > > apologies in advance for whatever I forgot to notice or check...
> > >
> > > Another issue for parallel testing with us is that the tests are run
> > > like so:
> > > =====start abbreviated pseudo-script=======
> > > set up some environment stuff and check sanity
> > > run the parallel job
> > > check the output for differences
> > > =====end abbreviated pseudo-script=======
> > >
> > > On our system, jobs go away to the scheduler by default (like
> > > running in the background). So, the tests frequently check for
> > > output that hasn't happened yet -- or, in the noesy case, delete the
> > > input file before the job runs. I can get the scheduler to run "in
> > > the foreground" (or "wait" or "block" -- choose your favorite
> > > terminology). But, I seem to be having a quote-character issue. Can
> > > someone who knows the shell better than me suggest a fix (SHELL is
> > > tcsh)?
> > >
> > > This works:
> > > [lachele.keats noesy]$ /opt/scali/bin/scasub -qsparams "-W block=true" -mpimon -network gm0,smp -np 2 -npn 2 $sander -O -i mdin -c 01.ann10.xyz -o noesy.out
> > > 44368.keats
> > >
> > > This doesn't:
> > > [lachele.keats noesy]$ echo $DO_PARALLEL
> > > /opt/scali/bin/scasub -qsparams "-W block=true" -mpimon -network gm0,smp -np 2 -npn 2
> > > [lachele.keats noesy]$ $DO_PARALLEL $sander -O -i mdin -c 01.ann10.xyz -o noesy.out
> > > qsub: script file:: No such file or directory
> > >
> > > I don't think using plain qsub is a good idea, either, but if
> > > someone wants to explore that option, say so. Brief reasons: qsub
> > > seems to want a separate script file (=rewrite tests) and scaMPI
> > > complains if not using scasub. Possibly can work around both, and
> > > happy to accept suggestions.
> > >
> >
> > Your machine's queue situation seems odd.
> > Anyway, try
> > set echo
> > set verbose
> > then your two command invocations to see the differences.
> > > My guess is that "-W block=true" -> '-W block=true'
> > > might work and won't hurt.
> >
> > Scott
> >
> >
> >
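
The quote-character issue described in the quoted messages above can be reproduced in a small sketch. It shows that quote marks stored inside a variable are kept as ordinary characters when the variable is expanded, so "-W block=true" arrives as two words instead of one. This is POSIX sh, not the tcsh used in the thread, although tcsh appears to split variable values in a comparable way. count_args is a hypothetical helper, and the shortened scasub line is a stand-in that is never executed.

```shell
#!/bin/sh
# count_args is a hypothetical helper: it prints how many words it received.
count_args() { echo "$#"; }

# Stand-in for the DO_PARALLEL value in the thread; the double quotes are
# stored as ordinary characters inside the variable.
DO_PARALLEL='scasub -qsparams "-W block=true" -np 2'

# Typed directly: the shell parses the quotes, so the option is one word.
typed=$(count_args scasub -qsparams "-W block=true" -np 2)

# Expanded from the variable: the value is split on whitespace only.
expanded=$(count_args $DO_PARALLEL)

# eval parses the expanded text again, so the quotes take effect.
reparsed=$(eval "count_args $DO_PARALLEL")

echo "typed directly: $typed"
echo "via variable:   $expanded"
echo "via eval:       $reparsed"
```

The extra word in the variable case is consistent with, though not proven to be the cause of, the "qsub: script file" error quoted above. Whether an eval-based approach would work with the scasub setup in the thread is untested.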
Received on Fri Apr 18 2008 - 21:19:21 PDT