Re: amber-developers: Verlet update time and ntt=3 parallel scaling from Robert Duke on 2008-05-07 (Amber Developers Archive May 2008)

From: Robert Duke <rduke.email.unc.edu>
Date: Wed, 7 May 2008 15:31:02 -0400

Well, that's the basic idea, but it is not just some simple function where
if you want to go forward 100 values, you multiply by 100, and finding the
shortcut requires a bit of reading/deciphering, and it can turn out to be
the longcut rather than the shortcut. So I did look at the code we have
once with this in mind, and decided pretty quickly that it would be really
really helpful to have an in-depth understanding of the underlying algorithm
rather than just trying to wade into the code. The sequence in any of these
things is of course deterministic, but it is not clear to me that with any
given generator x, there is a shortcut to leapfrog ahead in the sequence
without actually stepping the state of the generator through each step. If
you think about it, the less correlated successive numbers are, the more
difficult it is to leapfrog ahead (that's the whole bloody idea). And to
make life even more interesting, some of the better rng's in use now
actually put one rng on top of another...
Regards - Bob
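
For reference, one family of generators where the skip-ahead Carlos asks
about is known to be cheap is the plain linear congruential generator (LCG),
x -> (a*x + c) mod m. Each step is an affine map, so k steps compose into a
single affine map that can be built in O(log k) by repeated squaring. The
Python sketch below illustrates only that idea. It is not the Marsaglia
generator mentioned in this thread, the function names are made up for the
example, and the constants are common textbook LCG parameters chosen purely
for illustration. Whether a comparable shortcut exists for a combined or
lagged generator depends on that generator's algorithm, which is the caveat
raised above.

```python
# Illustrative LCG constants (assumption: NOT the generator used in Amber).
LCG_A = 1664525
LCG_C = 1013904223
LCG_M = 2 ** 32


def lcg_next(x, a=LCG_A, c=LCG_C, m=LCG_M):
    """Advance the LCG state by exactly one step."""
    return (a * x + c) % m


def lcg_skip(x, k, a=LCG_A, c=LCG_C, m=LCG_M):
    """Advance the LCG state by k steps in O(log k) operations.

    One step is the affine map x -> a*x + c (mod m). Composing affine maps
    gives another affine map, so the k-step map x -> A*x + C is assembled
    from power-of-two strides selected by the bits of k.
    """
    acc_mul, acc_add = 1, 0      # accumulated map, starts as the identity
    h_mul, h_add = a, c          # map for the current power-of-two stride
    while k > 0:
        if k & 1:
            # Apply the current stride after the accumulated map.
            acc_mul = (acc_mul * h_mul) % m
            acc_add = (acc_add * h_mul + h_add) % m
        # Double the stride: h(h(x)) = h_mul^2 * x + h_add * (h_mul + 1).
        h_add = (h_add * (h_mul + 1)) % m
        h_mul = (h_mul * h_mul) % m
        k >>= 1
    return (acc_mul * x + acc_add) % m


if __name__ == "__main__":
    seed = 12345
    x = seed
    for _ in range(1000):        # the slow way: step through every value
        x = lcg_next(x)
    assert lcg_skip(seed, 1000) == x   # the shortcut lands on the same state
```

In a parallel run, each task could in principle call lcg_skip with the
number of values consumed by the other tasks instead of drawing and
discarding them, but only for a generator with this affine structure.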

----- Original Message -----
From: "Carlos Simmerling" <carlos.simmerling.gmail.com>
To: <amber-developers.scripps.edu>
Sent: Wednesday, May 07, 2008 3:08 PM
Subject: Re: amber-developers: Verlet update time and ntt=3 parallel scaling

> I'm very naive about rng details, but if we want to simply keep the
> generators in sync, isn't there a way to step forward in the pseudo random
> sequence directly rather than repeatedly calling it? we know how many we
> need to skip, can we use that to reindex the stream? again I don't know how
> these work so maybe the only way is to call it over and over for the ones
> we want to skip.
> carlos
>
> On Wed, May 7, 2008 at 2:21 PM, Robert Duke <rduke.email.unc.edu> wrote:
>> Well, at least in my reading so far on the subject, this is not the case.
>> There ARE correlations in any pseudo random number generator, and you are
>> not guaranteed of getting a proper distribution unless you have deeper
>> knowledge of the rng algorithm itself, and essentially start with one seed
>> and from there select "leapfrog" points in the deterministic sequence of
>> the rng. That is at least what I have read so far, but I have not read
>> widely; this stuff is in the physics literature, and I am preoccupied at
>> the moment trying to move other mountains, so I am not all over it just
>> yet. So given recent surprises with rng issues, I feel fairly vindicated
>> in not jumping on the "simple" solution here and just generating a bunch
>> of random seeds in parallel. As I read more, I will know more, but I think
>> what you propose here is potentially a really really bad idea, and I would
>> recommend against just trying it because a lot of bad work can be done
>> before the sensitive test case turns up that starts making you wonder. The
>> rng we use is supposed to be pretty good, but I need to read more about it
>> (Marsaglia's), and put it in a proper context relative to current work on
>> parallel rng's. I am at the point where I will sort of promise to come up
>> with something, given that I am still employed, but I won't put anything
>> out unless I am absolutely certain that it generates a good sequence of
>> random numbers (and I won't go any further than saying I will get rng
>> enhancements into pmemd for 11; hopefully I will actually have some usable
>> patches before that, but right now I am preoccupied).
>> Regards - Bob
>>
>> ----- Original Message -----
>> From: "Ross Walker" <ross.rosswalker.co.uk>
>> To: <amber-developers.scripps.edu>
>> Sent: Wednesday, May 07, 2008 1:53 PM
>> Subject: RE: amber-developers: Verlet update time and ntt=3 parallel scaling
>>
>> > As far as I can tell if our random number generator is any good - which I
>> > don't know if we have properly checked or not - two sets of random numbers
>> > from different seeds should not have any correlation. Thus it should be
>> > equally correct (statistically) to do a Langevin run with each processor
>> > having its own random number stream - with simply different seeds for each
>> > mpi thread. This should be equivalent to having a single random number
>> > stream shared between all processors where each processor makes sure it
>> > doesn't use the same portion of the stream as other processors.
>> >
>> > Of course the first option makes testing in parallel difficult but then we
>> > only get about 300 steps or so matching anyway.
>> >
>> > So perhaps we should have two modes of operation (controlled in &cntrl
>> > maybe).
>> >
>> > A testing mode in which it does exactly what we have now and a production
>> > mode in which each thread uses its own random number stream. The question is
>> > how to set ig on each processor. One option would be for the master to use
>> > IG from &cntrl and then each mpi task add a successively bigger prime number
>> > to IG and use that (ig+3,+5,+7,+11 etc...). Another option would be for each
>> > processor to just add its task ID to ig but this may not be safe since it is
>> > possible that two random number streams for IG and IG+1 have some
>> > correlation - although I think this is purely hearsay and again I don't
>> > think it has been checked.
>> >
>> > These approaches would at least be reproducible on a given number of
>> > processors - for sander at least, perhaps not for PMEMD.
>> >
>> > Comments?
>> >
>> >
>> > /\
>> > \/
>> > |\oss Walker
>> >
>> > | Assistant Research Professor |
>> > | San Diego Supercomputer Center |
>> > | Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
>> > | http://www.rosswalker.co.uk | PGP Key available on request |
>> >
>> > Note: Electronic Mail is not secure, has no guarantee of delivery, may not
>> > be read every day, and should not be used for urgent or sensitive issues.
>> >
>
> --
> ===================================================================
> Carlos L. Simmerling, Ph.D.
> Associate Professor Phone: (631) 632-1336
> Center for Structural Biology Fax: (631) 632-1555
> CMM Bldg, Room G80
> Stony Brook University E-mail: carlos.simmerling.gmail.com
> Stony Brook, NY 11794-5115 Web: http://comp.chem.sunysb.edu
> ===================================================================
>
Received on Sun May 11 2008 - 06:07:14 PDT