Re: [AMBER-Developers] New features in pmemd

From: Josh Berryman <the.real.josh.berryman.gmail.com>
Date: Thu, 13 Feb 2014 09:25:44 +0100

>>Sure, CPUs are really good at this sort of thing, and outside the inner
>>loops I don't care about them. But here, I do.

Absolutely: every time there is an if or a divide inside a loop over N,
a baby seal dies.

I prevailed on Joe to add the emil_sc feature for me after someone
mentioned that he already had his fingers in the ti code. I find it
immensely useful and like it a lot, but yes, ideally there would have
been a longer "soak" period in which I could review the changes (made
in parts of the code where I would not really have been confident to
start the ball rolling myself), explore performance, and look for
optimisations.

I know that it is getting late in the day, but it seems to me that
there is a really obvious fix for this particular extra cost, so there
is no need to mess around with compile flags in the name of performance.

If, instead of 0 or 1, the ti_sc_lst flags are 0 or 2 (or 1 for
emil_sc), then the test goes from:

if ((ti_sc_lst(real_i)+ti_sc_lst(real_j) .eq. 1) .or. &
    (emil_sc_lcl .eq. 1 .and. &
     ti_sc_lst(real_i)+ti_sc_lst(real_j) .ge. 1))

to:

if (ti_sc_lst(real_i)+ti_sc_lst(real_j) .eq. 2)

which gets us back to the efficiency we had before the change. I am
happy to implement this and commit; after all, the code freeze is
tomorrow, not today, right?
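
To make the encoding concrete, here is a minimal sketch (the is_ti_sc
and is_emil_sc arrays, the atom count, and the setup loop are
illustrative stand-ins, not pmemd's actual data structures): each
atom's role is baked into the flag value once at setup time, so the
pair loop needs only one add and one integer compare.

program flag_encoding_sketch
  implicit none
  integer, parameter :: natom = 4
  integer :: ti_sc_lst(natom), emil_sc_lcl, i, j
  logical :: is_ti_sc(natom), is_emil_sc(natom)

  ! Hypothetical inputs: atom 1 is TI-softcore, atoms 2 and 3 are
  ! EMIL-softcore, atom 4 is ordinary.
  is_ti_sc    = (/ .true.,  .false., .false., .false. /)
  is_emil_sc  = (/ .false., .true.,  .true.,  .false. /)
  emil_sc_lcl = 1

  ! Setup, done once: encode each atom's role in the flag value itself.
  do i = 1, natom
    if (is_ti_sc(i)) then
      ti_sc_lst(i) = 2                     ! TI softcore atom
    else if (emil_sc_lcl .eq. 1 .and. is_emil_sc(i)) then
      ti_sc_lst(i) = 1                     ! EMIL softcore atom
    else
      ti_sc_lst(i) = 0                     ! ordinary atom
    end if
  end do

  ! Pair loop: one add and one compare per pair, no emil_sc_lcl test.
  do i = 1, natom - 1
    do j = i + 1, natom
      if (ti_sc_lst(i) + ti_sc_lst(j) .eq. 2) then
        write(*,*) 'softcore pair:', i, j
      end if
    end do
  end do
end program flag_encoding_sketch

Note that with emil off the flags are just 0 or 2, so the sum is 2
exactly when one atom of the pair is softcore, matching the old
.eq. 1 test.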

Josh


On 13/02/2014, Scott Le Grand <varelse2005.gmail.com> wrote:
> David,
> When I start seeing global switch tests in inner loops:
>
> else if ((ti_sc_lst(real_i)+ti_sc_lst(real_j) .eq. 1) .or. &
>          (emil_sc_lcl .eq. 1 .and. &
>           ti_sc_lst(real_i)+ti_sc_lst(real_j) .ge. 1)) then
>   if (ti_lst(1,real_i)+ti_lst(1,real_j) .ge. 1) then !TI region 1
>
> this has always been a reliable warning sign for me in the past that the
> overall structure of the code needs reorganization.
>
> The GPU code does this sort of thing at compile time and that's where
> decisions like this belong IMO. And it's not any one if/then that is the
> problem but rather that these guys tend to reproduce like bunnies once they
> appear.
>
> Sure, CPUs are really good at this sort of thing, and outside the inner
> loops I don't care about them. But here, I do.
>
> OTOH the more there are, the more I can refactor them out of GPU
> incarnations and that's free performance beer.
>
> That and I'm on the verge of rethinking the GPU code itself because so many
> assumptions that were valid in 2010 no longer hold.
>
> Scott
>
>
>
>
>
>
> On Wed, Feb 12, 2014 at 6:46 PM, David A Case
> <case.biomaps.rutgers.edu>wrote:
>
>> On Wed, Feb 12, 2014, Jason Swails wrote:
>> > >
>> > > What's left in Sander that people crave for in PMEMD?
>> > >
>> >
>> > QM/MM.
>>
>> Certainly, QM/MM is in sander but not pmemd. But do people "crave for it"
>> to be added to pmemd? The usual argument is that the QM part quickly
>> becomes the time bottleneck, so the fact that MM is faster in pmemd
>> becomes irrelevant -- there is little need (yet) to think about porting
>> QM/MM to pmemd. (I could be wrong here.)
>>
>> Similar arguments apply for PBSA and RISM: these are both in sander, but
>> are slow, and limit performance. Again, any speed advantage PMEMD has
>> for the other parts of the calculation is minimal. So there is no hurry
>> to put PBSA or RISM into pmemd.
>>
>> The emap functionality is similar, but maybe not identical, since it is
>> not as big a time-hog as QM, PBSA or RISM. So far, I don't think most
>> emap calculations (and there aren't very many of them, since only a few
>> people are using this capability) need highly efficient MM. But there is
>> more of an argument here that emap capability in pmemd might be
>> desirable.
>>
>> On the other hand, I don't really see Scott's argument that emap (et al.)
>> somehow makes it hard to keep going on pmemd.cuda development. It's
>> quite modular, and there is a simple if-statement in mdin to disable this
>> for GPUs. We've always had a list of things that aren't supported for
>> GPUs; having a few more entries in that list doesn't seem that bad (to
>> me). We're learning how to document this better, both in the code and in
>> the Manual.
>>
>> Scott says he wants "more control ... on feature additions that will run
>> on the GPU." We *should* establish a better mechanism for suggesting and
>> vetting requests for new GPU features. But I don't (yet) see that
>> another code fork is needed to accomplish this.
>>
>> ...dac

_______________________________________________
AMBER-Developers mailing list
AMBER-Developers.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber-developers
Received on Thu Feb 13 2014 - 00:30:02 PST