Re: amber-developers: ene and ener array mess in sander from Robert Duke on 2008-11-23 (Amber Developers Archive Nov 2008)

From: Robert Duke <rduke.email.unc.edu>
Date: Mon, 24 Nov 2008 00:44:50 -0500

Hi Scott -
I don't want to rely on inlining that is not specified as part of the
language. Too many rude surprises. And the only advantage I see to get/set
here is you allow including debugging functionality or whatever else you
want to glom onto the access or setting. At least to my mind, the real
advantage of get/set routines in c++ is controlling the part of the code
that has the capability to set something, typically a data member of an
object; I spent about 7 years doing mostly c++ dev at microsoft, wrote
close to half a million lines of code, and the last position I had before
moving to a telecommuting position was in Cairo - that was all the
object-oriented stuff that was being layered on top of NT (I was last
working on the object-oriented filesystem that included full support for
OLE - all that crazy object linking/embedding stuff that allows the various
ms apps to invoke each other - this file system has yet to ship, and before
I left, I wrote a whitepaper about how it was getting way too complex). So
I do know from object-oriented... I like Knuth, I only have 3 of his books,
but I have a reasonable collection of stuff by the other guys like Dijkstra,
Wirth, Brooks, Yourdon. They make good points. But what do you REALLY gain
here? Not much, really, against the backdrop of all the other software
engineering sins currently in the code. And as to performance being oft
considered too early - that's true - get a functioning clean algorithm
first. But come on. All I am in the business to do is make this stuff fast
enough and reliable enough that people will choose it over the alternatives.
You can develop the most beautiful abstractions in the world, and it will
not matter if it is slug slow compared to your competitor, or if speed is a
primary objective. The software engineering practices should not be about
religion and what somebody said. They should be about achieving
reliability, readability, maintainability. But depending on what you are
writing, you may very well need to think about performance a lot. My
objection in this case has more to do with the kludge solutions required to
achieve the functionality needed here - the computations on the collection
of energy terms, and the parallel communications. The get/set abstraction
really buys you nothing of note at a cost of making things more complicated.
Needless abstraction and complexity is, to my mind, a sin (getting religious
here...). Of course, it all then boils down to what one considers to be
"needless". There are major changes needed to make something like sander
reliable, readable, and maintainable, and you are not going to get much
traction out of some get/set routines. The problems, the complexities of
developing this s/w in the manner it is developed are much more serious than
that. I have, by the way, been considering dragging pmemd into c++ to get a
better engineering paradigm and to get better access to system-level
functionality and threads. But I am not sure I am going to live that long,
or that it will be a high enough priority ever.
Regards - Bob
----- Original Message -----
From: "Scott Brozell" <sbrozell.scripps.edu>
To: <amber-developers.scripps.edu>
Sent: Sunday, November 23, 2008 11:24 PM
Subject: RE: amber-developers: ene and ener array mess in sander

> Hi,
>
> This thread is spiraling and missing one of my main points:
> the advantages of get/set for the initial conversion
> plus bypassing the need for compiler inlining are easily obtained:
>> > However, consistently chosen get/set names
>> > could be easily globally replaced into array references.
>> > (And globally re-replaced if the debugging need arose.)
>
> I.e., replace all those ene(your_favorite_integer)
> with call xet_energy_readable_name( an_argument ),
> where x is an element of {g,s}, then run the special test cases,
> debug as necessary, and finally execute e.g.,
> sed 's/call get_energy_readable_name( an_arg )/an_arg=ene( readable_name )/'
> similarly for set.
>
> On Sun, 23 Nov 2008, Ross Walker wrote:
>
>> > though :-) I do think that get()/set() routines here would be overkill,
>> > though, and I am not aware of an inlining standard for f90/95 (maybe I
>> > missed something, but I did just drag down the iso/ansi reference). For
>> > c++, this is a no-brainer, given that you have a situation where you want
>> > to limit the modification scope.
>>
>> Yes, I was thinking the same thing. Given that there is no prototyping in
>> Fortran and all optimization is generally done at the object build stage,
>> the only way this would work is if the get/set routines were in the same
>> file as where they are used - which obviously wouldn't work. I don't know
>> if the situation with modules is different but I suspect these, in reality,
>> just allow for namespacing and not for actual inline optimizations.
>
> No, there are many many compilers that have an IPA that
> works across files; some store details in object files and have the
> linker "paste" in the code. See the sgi man page for ipa:
>
> By contrast, IPA algorithms analyze more than a single procedure
> (preferably the entire program) at once. The optimizations performed
> by the MIPSpro compilers' IPA utility include:
>
> * Inlining: Calls to a procedure are replaced by a suitably modified
> copy of the called procedure's body inline, even if the callee is in
> a different source file.
> ...
> IPA works by postponing much of the compilation process until the link
> step, when all of the program components can be analyzed together.
> Specifically, the following occurs:
>
> * The compile step does initial processing of the source file, placing
> an intermediate representation of the procedures it contains into
> the output .o file instead of normal relocatable code. Such object
> files are called WHIRL objects to distinguish them from normal
> relocatable object files.
>
>
>> Compilers such as ifort support Interprocedural optimizations with the
>> -ipo flag which I think will do the sort of inlining you are referring to
>> but at present we only use -ip with sander (Bob uses ipo for pmemd I
>> believe). I tried turning on -ipo for sander and it ran fine until the link
>> stage where the memory requirements exploded and my machine ran out of swap
>> space. Thus I'm not sure we can rely on the compiler to inline things in
>> order to not impact performance.
>
> Yes, ifort does inlining; it used to put intermediate code in files during
> compilation and use that during linking. And yes sometimes this leads to
> slow builds, but by restricting the size of inlinable routines one can
> work around this issue.
>
>> Note I believe there are a number of other C++ like approaches to coding
>> which make sense from a code standpoint but are not dealt with efficiently
>> in Fortran so we have to be careful getting too c++ like in the code.
>
> "...premature optimization is the root of all evil." Knuth, Donald.
>
> Read this
> @article{knuth71,
> author = "{Donald E. Knuth}",
> title = "Empirical Study of {Fortran} Programs",
> journal = spe,
> volume = 1,
> pages = {105--133},
> year = 1971 }
> @string{spe = "Software --- Practice and Experience"}
>
>
> Scott
>
>
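The global replacement Scott describes in the quoted message (accessor calls that are later turned back into plain array references) could be sketched roughly as below. This is only an illustration under assumptions: the accessor names get_energy_vdw and set_energy_elec, the index names vdw and elec, and the variables evdw and eel are hypothetical and are not taken from sander. Only the ene array name and the sed approach come from the thread.

```shell
# Hypothetical Fortran fragment that uses accessor calls (illustrative names only).
src='      call get_energy_vdw( evdw )
      call set_energy_elec( eel )'

# Rewrite getters into array reads and setters into array writes.
# \1 captures the readable term name, \2 captures the argument.
converted=$(printf '%s\n' "$src" | sed \
  -e 's/call get_energy_\([a-z_]*\)( *\([a-z_0-9]*\) *)/\2 = ene(\1)/' \
  -e 's/call set_energy_\([a-z_]*\)( *\([a-z_0-9]*\) *)/ene(\1) = \2/')

printf '%s\n' "$converted"
```

The sketch assumes that each readable name (vdw, elec) would be a named integer constant indexing ene, and that every accessor call sits on a single line with a simple variable as its argument. Calls that span continuation lines or pass expressions would need a more careful pattern.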
Received on Fri Dec 05 2008 - 16:31:23 PST