
From: Scott Brozell <sbrozell.scripps.edu>

Date: Wed, 7 May 2008 14:51:07 -0700 (PDT)

Hi,

> >> As far as I can tell if our random number generator is any good - which I
> >> don't know if we have properly checked or not - two sets of random numbers
> >> from different seeds should not have any correlation. Thus it should be

In the sense that Ross probably meant there may be no correlations,

but in another sense two such sequences are 100% correlated:

that sense is that the next element in the sequence after any particular

entry is identical for all seeds.

A real world example is a linear congruential rng with specific good

m, a, and c. A toy example is the rng whose sequence is 4 0 1 3 2

where the seed is the cardinal index; the next entry after 0 is 1

for all seeds.
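
The point about successors can be checked with a short sketch. It uses the toy sequence above and a small linear congruential generator; the LCG constants m=16, a=5, c=3 are chosen here only because they give a full period, and all function names are hypothetical. Neither generator is the one used in Amber.

```python
TOY_CYCLE = [4, 0, 1, 3, 2]  # the toy sequence from the message


def toy_stream(seed, n):
    """Return n values of the toy rng; the seed is the starting index."""
    return [TOY_CYCLE[(seed + i) % len(TOY_CYCLE)] for i in range(n)]


def lcg_stream(seed, n, m=16, a=5, c=3):
    """Small full-period LCG; m, a, c are illustrative assumptions."""
    out, x = [], seed % m
    for _ in range(n):
        x = (a * x + c) % m
        out.append(x)
    return out


def merged_successors(make_stream, seeds, n):
    """For every value, collect what follows it, pooled over all seeds."""
    merged = {}
    for seed in seeds:
        stream = make_stream(seed, n)
        for current, following in zip(stream, stream[1:]):
            merged.setdefault(current, set()).add(following)
    return merged


toy = merged_successors(toy_stream, range(5), 11)
lcg = merged_successors(lcg_stream, range(16), 33)

print(toy[0])                                   # what follows 0, over all seeds
print(all(len(v) == 1 for v in toy.values()))   # one successor per value?
print(all(len(v) == 1 for v in lcg.values()))
```

Every value has exactly one successor no matter which seed was used, because the seed only selects where in the single cycle the stream starts.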

The classic treatise on rng is Knuth vol 2 (3rd ed is 1997)

where the sad (ie, promulgation of non-random generators) history

can be read. I have forgotten most of what I knew from the 2nd ed,

but I strongly advise following a careful course, as Bob had advocated.

Scott

On Wed, 7 May 2008, Andreas Svrcek-Seiler wrote:

> In an ideal rng there are no correlations, but such a thing does not
> exist. The statistical properties
> of any subsequences should be indiscernible from physical white noise.
> Currently the best rng (as far as I know) is the Mersenne twister (MT)
>
> P.S.:It seems hard to imagine (for me) that there's a significant
> difference between a "good" and a "bad" rng when it comes to MD (e.g. driving a
> langevin-heatbath). Are you all sure you could tell the difference
> between a "real" rng and -say- the first 1000000 digits of pi
> repeated over and over driving a MD run when looking at the results?
> (nonetheless - in dubio mersenne twisto :-)
>

> On Wed, 7 May 2008, Robert Duke wrote:
>
> > Well, at least in my reading so far on the subject, this is not the case.
> > There ARE correlations in any pseudo random number generator, and you are not
> > guaranteed of getting a proper distribution unless you have deeper knowledge
> > of the rng algorithm itself, and essentially start with one seed and from
> > there select "leapfrog" points in the deterministic sequence of the rng.
> > That is at least what I have read so far, but I have not read widely; this
> > stuff is in the physics literature, and I am preoccupied at the moment trying
> > to move other mountains, so I am not all over it just yet. So given recent
> > surprises with rng issues, I feel fairly vindicated in not jumping on the
> > "simple" solution here and just generating a bunch of random seeds in
> > parallel. As I read more, I will know more, but I think what you propose
> > here is potentially a really really bad idea, and I would recommend against
> > just trying it because a lot of bad work can be done before the sensitive
> > test case turns up that starts making you wonder. The rng we use is supposed
> > to be pretty good, but I need to read more about it (Marsaglia's), and put it
> > in a proper context relative to current work on parallel rng's. I am at the
> > point where I will sort of promise to come up with something, given that I am
> > still employed, but I won't put anything out unless I am absolutely certain
> > that it generates a good sequence of random numbers (and I won't go any
> > further than saying I will get rng enhancements into pmemd for 11; hopefully
> > I will actually have some usable patches before that, but right now I am
> > preoccupied).
> > Regards - Bob
> >

> > ----- Original Message ----- From: "Ross Walker" <ross.rosswalker.co.uk>
> > To: <amber-developers.scripps.edu>
> > Sent: Wednesday, May 07, 2008 1:53 PM
> > Subject: RE: amber-developers: Verlet update time and ntt=3 parallel scaling
> >
> >
> >> As far as I can tell if our random number generator is any good - which I
> >> don't know if we have properly checked or not - two sets of random numbers
> >> from different seeds should not have any correlation. Thus it should be
> >> equally correct (statistically) to do a Langevin run with each processor
> >> having its own random number stream - with simply different seeds for each
> >> mpi thread. This should be equivalent to having a single random number
> >> stream shared between all processors where each processor makes sure it
> >> doesn't use the same portion of the stream as other processors.
> >>
> >> Of course the first option makes testing in parallel difficult but then we
> >> only get about 300 steps or so matching anyway.
> >>
> >> So perhaps we should have two modes of operation (controlled in $cntrl
> >> maybe).
> >>
> >> A testing mode in which it does exactly what we have now and a production
> >> mode in which each thread uses its own random number stream. The question is
> >> how to set ig on each processor. One option would be for the master to use
> >> IG from &cntrl and then each mpi task add a successively bigger prime number
> >> to IG and use that (ig+3,+5,+7,+11 etc...). Another option would be for each
> >> processor to just add its task ID to ig but this may not be safe since it is
> >> possible that two random number streams for IG and IG+1 have some
> >> correlation - although I think this is purely hearsay and again I don't
> >> think it has been checked.
> >>
> >> These approaches would at least be reproducible on a given number of
> >> processors - for sander at least, perhaps not for PMEMD.
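
The two schemes discussed in the quoted messages, one generator per task seeded with ig plus an offset versus "leapfrog" points taken from a single stream, can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's `random.Random` (a Mersenne Twister) rather than the Marsaglia generator in Amber, it uses no MPI, the helper names are hypothetical, and the value of `ig` is arbitrary. It shows the bookkeeping only and says nothing about whether differently seeded streams are statistically uncorrelated.

```python
import random


def leapfrog_stream(seed, rank, nprocs, n):
    """Values rank, rank+nprocs, ... of the single stream started from seed.

    Naive version: every task generates the whole stream and discards
    the values that belong to the other tasks.
    """
    rng = random.Random(seed)
    out = []
    for i in range(n * nprocs):
        value = rng.random()
        if i % nprocs == rank:
            out.append(value)
    return out


def offset_seed_stream(ig, rank, n, offsets=(0, 3, 5, 7, 11, 13, 17, 19)):
    """One independent generator per task, seeded with ig plus an offset.

    The master (rank 0) uses ig itself; the other ranks add successive
    primes, as in the quoted proposal. Supports at most len(offsets) ranks.
    """
    rng = random.Random(ig + offsets[rank])
    return [rng.random() for _ in range(n)]


ig, nprocs, n = 12345, 4, 5  # arbitrary illustrative values

# Reference: the stream a single-process run would consume.
serial_rng = random.Random(ig)
serial = [serial_rng.random() for _ in range(n * nprocs)]

# Leapfrog substreams, re-interleaved in rank order.
leap = [leapfrog_stream(ig, r, nprocs, n) for r in range(nprocs)]
interleaved = [leap[i % nprocs][i // nprocs] for i in range(n * nprocs)]
print(interleaved == serial)

# Per-task seeds: rank 0 matches the start of the serial stream,
# the other ranks draw from unrelated starting states.
per_task = [offset_seed_stream(ig, r, n) for r in range(nprocs)]
print(per_task[0] == serial[:n])
```

With leapfrogging the union of the per-task values is exactly the serial stream, so the numbers consumed do not depend on how the generator was seeded per task; with per-task seeds the result is reproducible only for a fixed number of tasks.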

Received on Sun May 11 2008 - 06:07:15 PDT
