This smells like a random numbers thing. I may have some time in the
coming week to look into it, but I sure don't have an A40 in my hands yet.
Are the issues spread throughout NVE, NPT, NTT tests, GB as well as PME
setups? From your mail it looks like some (but not all) of the non-REMD
PME tests are failing, and the non-REMD GB tests are failing in the kinetic
energies from step 1 onward. Do any PME non-REMD tests pass? Are you
running the tests in DPFP or SPFP mode?
Dave
On Fri, Jul 9, 2021 at 11:17 AM Charles Lin <charles.lin.roivant.com> wrote:
> Hi all,
>
> I was wondering if anyone has tried running CUDA MPI on the NVIDIA A40
> cards. I’m currently using CUDA 11.0 with AMD CPUs. I’ve gotten the
> following to pass:
> pmemd
> pmemd.MPI
> pmemd.cuda
>
> It seems all REMD tests pass for pmemd.cuda.MPI, but for non-REMD jobs the
> tests fail. The issue seems to stem from the kinetic energies for some
> tests and the EGB+Kinetic Energies for GB tests (all other energy terms
> including potential energy look fine in step 1). The velocities are coming
> out different, so I’m wondering if it’s an MPI issue in the CUDA code (?),
> but I’m not well-versed in that part of the code, so I was wondering if
> someone could investigate that.
>
> Thanks!
> Charlie
> _______________________________________________
> AMBER-Developers mailing list
> AMBER-Developers.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber-developers
>
Received on Fri Jul 09 2021 - 09:00:02 PDT