Re: [AMBER-Developers] Problems with pmemd.cuda.MPI

From: Daniel Roe <daniel.r.roe.gmail.com>
Date: Thu, 13 Mar 2014 13:57:58 -0600

Here are the test output files corresponding to the test .dif files. Let me
know if you need more data. I'll test on our local cluster in the meantime.

-Dan


On Thu, Mar 13, 2014 at 1:12 PM, Daniel Roe <daniel.r.roe.gmail.com> wrote:

> OK, I'll get them to you ASAP. Unfortunately stampede appears to have gone
> (or is going) offline. I'll try and reproduce the test failures on one of
> our local clusters.
>
> -Dan
>
>
> On Thu, Mar 13, 2014 at 11:57 AM, Scott Le Grand <varelse2005.gmail.com> wrote:
>
>> I need your raw mdout files for the failed tests. The comparisons on their
>> own are mostly useless.
>>
>>
>>
>> On Thu, Mar 13, 2014 at 10:52 AM, Daniel Roe <daniel.r.roe.gmail.com>
>> wrote:
>>
>> > The reason I initially used intel compilers is they are the default on
>> > stampede (and other HPC centers as well). However, the issue exists with
>> > GNU compilers as well, version 4.4.6, mvapich 1.9, cuda 5.0 (diffs and log
>> > attached). The differences are similar (though not 100% exactly) to those
>> > seen with the intel compilers.
>> >
>> > Let me know if you need any more information.
>> >
>> > -Dan
>> >
>> >
>> > On Thu, Mar 13, 2014 at 10:39 AM, Scott Le Grand <varelse2005.gmail.com> wrote:
>> >
>> > > Why do you guys bother with Intel's compilers for the CUDA edition? I
>> > > can't even get my hands on them without paying the big bucks so there's
>> > > zero incentive for me to debug Intel compiler issues other than saying
>> > > don't use them. That said, if it's broken with gcc, then it's
>> > > interesting.
>> > >
>> > >
>> > >
>> > >
>> > >
>> > > On Thu, Mar 13, 2014 at 9:24 AM, Daniel Roe <daniel.r.roe.gmail.com>
>> > > wrote:
>> > >
>> > > > Hi All,
>> > > >
>> > > > First, this time I promise I am using an up-to-date GIT tree:
>> > > >
>> > > > commit 8f81fff98e4d3095d0c66070eb3375f6a72708c0
>> > > > Merge: 4de7ced 72c975d
>> > > > Author: Pawel Janowski <pjanowsk.eden.rutgers.edu>
>> > > > Date: Thu Mar 13 09:55:44 2014 -0400
>> > > >
>> > > > I am running on stampede (Tesla K20m) using intel compilers 13.1.0,
>> > > > cuda 5.0, and mvapich 1.9. For pmemd.cuda.MPI the PME tests go haywire
>> > > > (absolute error as high as 1.31e+06!!). No segfaults though. Diffs and
>> > > > log attached. Going back to commit
>> > > > d8024087a4d8c4c1e801192839df57d760bcadd2 (Wed Feb 5 22:17:48 2014 -0500)
>> > > > fixes the problems, although I have not systematically gone back to see
>> > > > at what commit the code breaks. The problems happen even if I run on
>> > > > just 1 thread. GB seems OK, and pmemd.cuda seems OK as well.
>> > > >
>> > > > Has anyone else seen issues like these? I will try using GNU compilers
>> > > > next to see if the issue happens with them as well.
>> > > >
>> > > > -Dan
>> > > >
>> > > > --
>> > > > -------------------------
>> > > > Daniel R. Roe, PhD
>> > > > Department of Medicinal Chemistry
>> > > > University of Utah
>> > > > 30 South 2000 East, Room 201
>> > > > Salt Lake City, UT 84112-5820
>> > > > http://home.chpc.utah.edu/~cheatham/
>> > > > (801) 587-9652
>> > > > (801) 585-6208 (Fax)
>> > > >
>> > > > _______________________________________________
>> > > > AMBER-Developers mailing list
>> > > > AMBER-Developers.ambermd.org
>> > > > http://lists.ambermd.org/mailman/listinfo/amber-developers
>> > > >
>> > > >
>> > >
>> >
>> >
>> >
>> >
>> >
>>
>
>
>
>




Received on Thu Mar 13 2014 - 13:00:02 PDT