Re: amber-developers: Random seeds from Robert Duke on 2008-12-17 (Amber Developers Archive Dec 2008)

From: Robert Duke <rduke.email.unc.edu>
Date: Wed, 17 Dec 2008 12:48:54 -0500

I would probably like to do two things:
1) Institute a mechanism to maintain PRNG state between runs - in the
restart file, I would guess. This solves the reseed problem as cleanly as
possible.
2) Actually implement a theoretically clean, but also efficient, parallel
pseudo-random number generator. There is lots of work on this in the
physics community for MC simulations; I just have not yet managed to dig
into it. There are libraries available (I don't remember the name of the
one I have in mind, but it may actually be UCSD-based), though I fear
licensing issues would get in the way of a simple solution. I absolutely
don't want to do something to improve efficiency without being certain that
there are no theoretical issues, and I trust that the physics community is
more-or-less on top of this.
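As a rough illustration of point 1, the sketch below saves a generator's full state at the end of a run and restores it at the start of the next, so the continued run carries on the same stream instead of reseeding. It uses Python's standard random module rather than the sander/pmemd generator; the function names save_prng_state and load_prng_state and the JSON file layout are hypothetical, not an existing Amber restart format.

```python
import json
import random


def save_prng_state(rng, path):
    """Write the generator's full internal state to a JSON file."""
    version, internal, gauss_next = rng.getstate()
    with open(path, "w") as fh:
        json.dump(
            {
                "version": version,
                "internal": list(internal),  # Mersenne Twister words + index
                "gauss_next": gauss_next,    # cached Gaussian deviate, or None
            },
            fh,
        )


def load_prng_state(rng, path):
    """Restore a state previously written by save_prng_state."""
    with open(path) as fh:
        saved = json.load(fh)
    rng.setstate(
        (saved["version"], tuple(saved["internal"]), saved["gauss_next"])
    )
```

A run restored this way draws the same numbers an uninterrupted run would have drawn next, which is the property that reseeding at every restart cannot give.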
I see all this great stat stuff coming more and more to the fore, but that
is just my opinion.
Regards - Bob

----- Original Message -----
From: "Ross Walker" <ross.rosswalker.co.uk>
To: "'Robert Duke'" <rduke.email.unc.edu>
Cc: <amber-developers.scripps.edu>
Sent: Wednesday, December 17, 2008 12:37 PM
Subject: amber-developers: Random seeds

> Hi Bob and others,
>
>> Regarding changing the seed in Langevin dynamics runs - this is very
>> important. One should absolutely be careful to do this to get valid
>> results, and this is not yet automated in pmemd (next release). Please see
>> Cerutti, Duke, et al, J Chem Theor Comp 4, 1669-80 (2008).
>
> This reminds me of something we should probably discuss at the AMBER
> developers meeting. I propose that we change the way ig works, and we
> should try to make sure this gets put into both sander and pmemd. My
> proposal is that +ve values of ig should behave as they do now. If one
> sets ig=0, then the time of day in microseconds is used to seed the random
> number stream. If, however, one sets a -ve value for ig, then the absolute
> value of that is used as the seed BUT no attempt is made to synchronize
> the random numbers in any way between processor threads - removing the
> parallel bottleneck problems with Langevin, for example. Thus in this case
> (excluding load rebalancing etc) one should get the same results (for a
> few steps) providing the same number of processors are used. Obviously it
> is more complicated than this but I don't think we should worry about that.
>
> Clearly on 1 cpu, -ig behaves the same as ig.
>
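A minimal sketch of the ig convention proposed above, in Python rather than the Fortran of sander/pmemd. The function name make_rng, the rank argument, and in particular the seed + rank offset used to give each process its own stream are assumptions for illustration; the message does not say how unsynchronized streams would be derived, and a simple offset does not by itself guarantee statistically independent streams. Treating ig=0 as unsynchronized follows the preference voiced in the message, not a settled decision.

```python
import random
import time


def make_rng(ig, rank=0, now_us=None):
    """Return (rng, synchronized) for one process under the proposed rules.

    ig > 0  : fixed seed, every rank gets the same (synchronized) stream
    ig == 0 : seed from the time of day in microseconds, unsynchronized
    ig < 0  : seed is abs(ig), unsynchronized
    """
    if ig > 0:
        # Current behaviour: identical stream on every rank.
        return random.Random(ig), True

    if ig == 0:
        # now_us is a hypothetical hook so the clock can be fixed in tests.
        base = now_us if now_us is not None else time.time_ns() // 1000
    else:
        base = -ig

    # Assumption: offset the seed by rank so each process has its own stream.
    # Rank 0 uses the bare seed, so a 1-cpu run with -ig matches +ig.
    return random.Random(base + rank), False
```

With this layout, make_rng(-42, rank=0) and make_rng(42) start from the same seed, matching the 1-cpu remark above, while ranks 1 and up draw from their own streams without any communication.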
> In the case of ig=0 we would need to decide whether or not random numbers
> are synchronized between threads, but I vote not to do this since, for
> Langevin, as far as I can tell the current approach of enforcing the
> synchronization of random numbers is crazy and doesn't help with the
> actual accuracy at all - in fact it allows people to introduce very weird
> correlations.
>
> Thus +ve ig values would essentially be reserved for testing and all
> actual production runs would / should be run with ig=0 (which is actually
> -1 right now in Amber 10 - hence why there would be changes). The key then
> becomes whether we change the default value or not - which I also think we
> should do. And then we explicitly set it in all the test cases so they are
> still usable...
>
> What do you think?
>
> All the best
> Ross
>
>
> /\
> \/
> |\oss Walker
>
> | Assistant Research Professor |
> | San Diego Supercomputer Center |
> | Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
> | http://www.rosswalker.co.uk | PGP Key available on request |
>
> Note: Electronic Mail is not secure, has no guarantee of delivery, may not
> be read every day, and should not be used for urgent or sensitive issues.
>
Received on Fri Dec 19 2008 - 01:10:39 PST