Re: [AMBER-Developers] Problem with Fortran-CUDA interface

From: David Cerutti <dscerutti.gmail.com>
Date: Fri, 3 Nov 2017 14:53:56 -0400

Can you provide the system on which this occurs? I am doing a major sweep
of the code and am nearly finished rebuilding the way bonded interactions
are computed. As such, the GaMD routines are on my list of things to
incorporate. We spotted a bug earlier, which has since been patched, that
involved sending values to extern "C" functions as literals, not pointers,
but that doesn't seem to be the case here. However, that problem was only
detected after I tried to pass additional arguments to the function, which
then came through as gobbledegook. Are you using code from Amber16, the master
branch, or your own mods?
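
For reference, the convention those extern "C" entry points rely on is that
Fortran passes its actual arguments by reference, so the C++ side has to take
pointers (and carry the trailing underscore that most Fortran compilers append
to the symbol). Here is a minimal sketch of the correct form next to the broken
one; the names are made up for illustration and this is not the actual pmemd
routine:

  #include <cstdio>

  /* Correct form: pointer parameters match Fortran's pass-by-reference
     convention, and the trailing underscore matches the symbol most Fortran
     compilers emit for "call gpu_example_weights(...)". */
  extern "C" void gpu_example_weights_(double* pot_ene_tot, double* dih_ene_tot)
  {
    /* Dereference to recover the values stored by the Fortran caller. */
    std::printf("example: total = %.12f  dihedral = %.12f\n",
                *pot_ene_tot, *dih_ene_tot);
  }

  /* Broken variant (the "values as literals, not pointers" class of bug):
     taking the arguments by value still compiles and links, but it no longer
     matches what the Fortran caller actually passes, so the numbers arrive
     scrambled.

     extern "C" void gpu_example_weights_(double pot_ene_tot, double dih_ene_tot);
  */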

Dave


On Fri, Nov 3, 2017 at 10:35 AM, Yinglong Miao <yinglong.miao.gmail.com>
wrote:

> Hi All,
>
> I have run into a problem with the Fortran-CUDA interface. In
> pme_force.F90 I call a function in cuda/gpu.cpp, as shown below. The
> calculations inside the GPU function run fine, but the potential energy
> values appear to change *strangely* as soon as they are passed into the
> function:
>
> pme_force.F90:
>   write(*,'(a,2f22.12)') "Debug-p1) (pot_ene%total, pot_ene%dihedral) = ", &
>       pot_ene%total, pot_ene%dihedral
>   call gpu_calculate_and_apply_gamd_weights(pot_ene%total, pot_ene%dihedral, &
>       pot_ene%gamd_boost, num_gamd_lag)
>   ...
>
> cuda/gpu.cpp:
>   extern "C" void gpu_calculate_and_apply_gamd_weights_(double* pot_ene_tot,
>                                                         double* dih_ene_tot,
>                                                         double* gamd_ene_tot,
>                                                         double* num_gamd_lag)
>   {
>     PRINTMETHOD("gpu_calculate_and_apply_gamd_weights");
>     double tboost = 0.0;
>     double fwgtd = 1.0;
>     double fwgt = 1.0;
>     double tboostall = 0.0;
>     double temp0 = gpu->sim.gamd_temp0;
>     double ONE_KB = 1.0 / (temp0 * KB);
>     printf("Debug-GPU-p0) (pot_ene_tot, dih_ene_tot) = (%12.5f, %12.5f)\n",
>            *pot_ene_tot, *dih_ene_tot);
>     ...
>
> Output:
>   Debug-p1) (pot_ene%total, pot_ene%dihedral) =    -5991.862400868107        9.501277353615
>   Debug-GPU-p0) (pot_ene_tot, dih_ene_tot) = ( -5988.16828,      9.89661)
> =========
>
> As you can see, the energy values before and after the call are different.
> The problem also appears to depend on the simulation length: the energy
> differences are negligible when I test the code over a few thousand steps,
> but they grow to the size shown above after hundreds of thousands of steps
> or more. Has anyone come across similar issues before? My workstation has a
> new NVIDIA Quadro P5000 GPU card. Could this be related to the hardware? If
> not, how can I fix it?
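
One quick check before suspecting the hardware: dump the incoming doubles
bit-exactly at the top of the C++ routine and print the Fortran-side values
with matching precision just before the call. If the bit patterns already
differ at function entry, the interface itself is mangling the arguments; if
they agree and the two sides only drift apart over long runs, the cause lies
elsewhere. A minimal sketch of the C++ side (illustrative only, with a made-up
entry-point name, not the pmemd source):

  #include <cstdio>
  #include <cstring>
  #include <cstdint>

  /* Print a double in full decimal precision and as its raw 64-bit pattern so
     it can be compared bit-for-bit with what the Fortran caller printed. */
  static void dump_double(const char* label, double x)
  {
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof(bits));
    std::printf("%s = %.17g  (%a, bits 0x%016llx)\n",
                label, x, x, (unsigned long long)bits);
  }

  extern "C" void gpu_check_gamd_args_(double* pot_ene_tot, double* dih_ene_tot)
  {
    dump_double("pot_ene_tot", *pot_ene_tot);
    dump_double("dih_ene_tot", *dih_ene_tot);
  }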
>
> Any suggestions will be much appreciated,
> Yinglong
>
> Yinglong Miao, Ph.D.
> Assistant Professor
> Center for Computational Biology and
> Department of Molecular Biosciences
> University of Kansas
> http://miao.compbio.ku.edu
>
_______________________________________________
AMBER-Developers mailing list
AMBER-Developers.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber-developers
Received on Fri Nov 03 2017 - 12:00:02 PDT