Hi Ross -
I basically agree with your proposed mechanism, and with the reasoning
behind it. There really should be no way to lose things with a rename old
copy / write new copy mechanism. Any file system you want to rely on for
anything has to be able to keep a rename synchronized, so at least one intact
copy should survive a crash, regardless of when it occurs. That leaves you
susceptible only to hardware failures on the disk (in a previous life I
worried about such things a lot, in terms of filesystem and recovery code).
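
For concreteness, here is a minimal sketch of the rename old copy / write new
copy scheme. It is in Python rather than the Fortran that sander / pmemd use,
and the function name write_restart_with_backup and the example file name are
hypothetical, not existing AMBER code. It assumes the file system provides an
atomic rename, as POSIX file systems do.

```python
import os


def write_restart_with_backup(path, data):
    """Rename any existing restart file to <path>.bak, then write the new one.

    Hypothetical helper for illustration only; not part of AMBER.
    """
    backup = path + ".bak"
    if os.path.exists(path):
        # os.replace overwrites an older .bak if one is present.
        # On POSIX file systems the rename itself is atomic.
        os.replace(path, backup)
    with open(path, "w") as f:
        f.write(data)
        f.flush()
        # Ask the OS to push the new contents to disk before returning.
        os.fsync(f.fileno())
    return backup
```

If a crash interrupts the write, the previous restart file is still available
under the .bak name, so at most one restart interval is lost.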
- Bob
----- Original Message -----
From: "Ross Walker" <ross.rosswalker.co.uk>
To: "'AMBER Developers Mailing List'" <amber-developers.ambermd.org>
Sent: Friday, March 06, 2009 11:02 AM
Subject: RE: [AMBER-Developers] Backup file renaming?
> Hi Joe,
>
>> The only OS-level routine needed is RENAME(), assuming that intrinsic
>> INQUIRE commands work as they should. I only suggest it because I
>> already have code to do it. The usefulness of renaming depends on how
>> often regular AMBER users accidentally overwrite files.
>>
>> I think the automatic LUN allocation is always useful for large
>> projects, and is easy to implement with INQUIRE opened=, which is
>> already used by 'sander/ncsu-utils.f'. It could be built in to the
>> existing amopen(), where lun is assigned a value if called with lun=-1.
>
> I'm not sure we need to protect users from accidentally overwriting their
> files. That can be a good life lesson occasionally even for long time AMBER
> users. However, one thing that Tom C, Carlos and I were discussing where
> this might be useful was the saving of two restart files. Sometimes,
> especially when running on the supercomputers a node can crash while sander
> / pmemd is writing a restart file. This results in the restart file getting
> corrupted and you need to restart this entire run rather than being able to
> continue from where it left off.
>
> Hence we think it would be a good idea if sander / pmemd kept at least the
> last restart file each time. At present you can set ntwr to a negative
> number and it will keep all restart files, but this is not really practical
> when running 100+ ns runs on big machines. Thus it would be useful to have
> the behavior changed (in case of a positive ntwr value) such that when
> sander / pmemd writes a restart file, it takes the existing restart file (if
> it exists) and renames it with a .bak extension. It then writes the new
> restart file to the filename specified by -r on the command line. The
> process then repeats at the next restart write, overwriting the old .bak
> file and replacing it with the current one.
>
> What you suggest might be a good way to address this, assuming it is
> portable across all machine types - although it might be slightly
> overcomplicated for what we need here. I think just a single restart + .bak
> is suitable to fix the majority of crash issues. The key is to do it in a
> way that doesn't hurt performance - i.e. copying the current restart file to
> .bak and then overwriting the .rst file is probably safest in terms of
> recovering from a crash with an intact restart file, but is not good for
> performance. Renaming should be very fast, down in the noise, and is
> probably good enough to avoid corrupting the restart file if a crash occurs
> while the new file is being written.
>
> Just my 3c...
>
> Comments?
>
> All the best
> Ross
>
> /\
> \/
> |\oss Walker
>
> | Assistant Research Professor |
> | San Diego Supercomputer Center |
> | Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
> | http://www.rosswalker.co.uk | PGP Key available on request |
>
> Note: Electronic Mail is not secure, has no guarantee of delivery, may not
> be read every day, and should not be used for urgent or sensitive issues.
>
> _______________________________________________
> AMBER-Developers mailing list
> AMBER-Developers.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber-developers
>
Received on Sun Mar 08 2009 - 01:10:02 PST