Re: amber-developers: Verlet update time and ntt=3 parallel scaling

From: Scott Brozell <>
Date: Wed, 7 May 2008 20:44:54 -0700 (PDT)


On Wed, 7 May 2008, Robert Duke wrote:

> I think one problem that comes in is that when you randomly seed multiple
> processors, you actually don't know when two sequences on two different
> processes will overlap, but at some point they will. If there is a lot of

Overlap of sequences is interesting in a petascale context.
For example, even a good linear congruential RNG with a period of 2^64
could overlap in a matter of hours if every op were committed to
random number generation (though this ignores the parallel nature of
actually reaching a petaflop). The ratio of non-RNG ops to RNG ops
is very favorable for non-overlap in MD :)
Some MC applications have a much less favorable ratio.
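A quick back-of-envelope check of the "matter of hours" figure above, assuming a period of 2^64 and a sustained rate of 10^15 draws per second (i.e. every op on a petaflop machine spent on random number generation — an upper bound, not a realistic workload):

```python
# Time to exhaust the full period of a 64-bit generator if every
# petaflop op were a random-number draw (assumed rate: 1e15 draws/s).
period = 2 ** 64          # cycle length of a 64-bit linear congruential RNG
draws_per_sec = 1.0e15    # hypothetical: one draw per flop at 1 petaflop/s

seconds_to_exhaust = period / draws_per_sec
hours_to_exhaust = seconds_to_exhaust / 3600.0

print(f"{hours_to_exhaust:.1f} hours")
```

This comes out to roughly five hours, consistent with the estimate above; the huge ratio of non-RNG work per draw in MD stretches that time by many orders of magnitude.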

Note that p. 32 of Knuth, vol. 2, 2nd ed., indicates that there are
"surprisingly better" algorithms than Marsaglia's.


> state to the generator, presumably any two random seed starting points won't
> overlap "soon", but apparently this can be a big enough problem that folks
> that do MC can spot interprocessor correlations. So we really have to be
> sure of what we are doing. It is interesting to me, and I'll pursue it as I
> get time, as long as Tom doesn't shoot me. Then we will be out of the space
> where we are worrying about provability.
> Regards - Bob
> ----- Original Message -----
> From: "David A. Case" <>
> To: <>
> Sent: Wednesday, May 07, 2008 9:22 PM
> Subject: Re: amber-developers: Verlet update time and ntt=3 parallel scaling
> > On Wed, May 07, 2008, Ross Walker wrote:
> >
> >> As far as I can tell if our random number generator is any good - which I
> >> don't know if we have properly checked or not - two sets of random
> >> numbers
> >> from different seeds should not have any correlation. Thus it should be
> >> equally correct (statistically) to do a Langevin run with each processor
> >> having its own random number stream - with simply different seeds for
> >> each
> >> mpi thread. This should be equivalent to having a single random number
> >> stream shared between all processors where each processor makes sure it
> >> doesn't use the same portion of the stream as other processors.
> >
> > I agree with this, but (as Bob points out) it's not clear how you prove
> > it.
> >
> > With the current method, one *assumes* that the single stream of numbers
> > (that
> > you would get with a serial code) is correct, then arranges to get the
> > same
> > results in parallel.
> >
> > The only artifacts I know of have to do with reusing a particular part of
> > the
> > big stream of numbers. Since the period is very long, presumably Ross'
> > scheme
> > would have low probability of having this happen, but without a detailed
> > understanding of the scheme works, you might get fooled. But I think it
> > would
> > be worth the risk.
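The two schemes Ross contrasts can be sketched with a toy generator (this is not Amber's actual RNG; the LCG constants, seeds, and rank count below are illustrative assumptions only):

```python
# Toy 64-bit linear congruential generator, used only to illustrate the
# two parallel-stream schemes; the constants are Knuth's MMIX parameters,
# not whatever generator sander/pmemd actually uses.
MASK = (1 << 64) - 1
A, C = 6364136223846793005, 1442695040888963407

def lcg(seed):
    """Infinite stream of raw 64-bit LCG states starting from `seed`."""
    x = seed & MASK
    while True:
        x = (A * x + C) & MASK
        yield x

# Scheme 1 (Ross's proposal): each MPI rank runs its own stream from a
# different seed.  Correct only if differently-seeded streams really are
# uncorrelated and never overlap "soon".
independent = [lcg(seed) for seed in (11, 22, 33, 44)]

# Scheme 2 (the current method): one shared stream, partitioned so that
# rank r consumes numbers r, r + nranks, r + 2*nranks, ... of the single
# sequence, guaranteeing no rank reuses another rank's portion.
def leapfrog(seed, rank, nranks):
    g = lcg(seed)
    for i, x in enumerate(g):
        if i % nranks == rank:
            yield x

# Rank 2's first three draws under the shared-stream scheme:
shared_rank2 = leapfrog(seed=7, rank=2, nranks=4)
draws = [next(shared_rank2) for _ in range(3)]
```

Scheme 2 reproduces the serial results by construction, which is why it is easier to trust; scheme 1 trades that provability for simplicity, resting on the overlap and correlation assumptions debated above.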
Received on Sun May 11 2008 - 06:07:20 PDT