Re: [AMBER-Developers] cudaMalloc GpuBuffer::Allocate failed out of memory

From: Stephan Schott <schottve.hhu.de>
Date: Fri, 25 Jun 2021 17:13:53 +0200

Oki, thanks for the help Scott 👍, I will keep this in mind when playing
around with bigger systems.

On Fri, 25 Jun 2021 at 17:09, Scott Le Grand (<varelse2005.gmail.com>)
wrote:

> Take the win. You can fiddle in gpu.cpp to play with those limits. I
> wouldn't recommend doing that a priori though.
>
> On Fri, Jun 25, 2021 at 8:07 AM Stephan Schott <schottve.hhu.de> wrote:
>
> > So, you were right. It runs on the 3090, but the memory usage is 21438MiB
> > with skinnb=2, vs 9272MiB with skinnb=3. Is that how it is supposed to be?
> >
> > On Fri, 25 Jun 2021 at 16:16, Scott Le Grand (<varelse2005.gmail.com>)
> > wrote:
> >
> > > Cool, measure how much memory it's using mid-run at each skinnb setting.
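One way to take that mid-run measurement is to poll nvidia-smi while the job
is running and keep the per-GPU peak. The following is only a minimal sketch:
the helper names (parse_memory_used, query_memory_used, peak_memory, poll) are
made up for illustration, and the sampling interval is an assumption. As noted
further down the thread, polling like this can miss a very short allocation
spike.

```python
import subprocess
import time


def parse_memory_used(csv_text):
    """Turn nvidia-smi CSV output (one MiB value per line) into a list of ints."""
    return [int(line.strip()) for line in csv_text.splitlines() if line.strip()]


def query_memory_used():
    """Ask nvidia-smi for the memory currently in use on each GPU, in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_memory_used(result.stdout)


def peak_memory(samples):
    """Per-GPU maximum over a sequence of samples (each sample is a list of MiB)."""
    return [max(per_gpu) for per_gpu in zip(*samples)]


def poll(n_samples=30, interval_s=2.0):
    """Sample GPU memory n_samples times and return the per-GPU peak in MiB."""
    samples = []
    for _ in range(n_samples):
        samples.append(query_memory_used())
        time.sleep(interval_s)
    return peak_memory(samples)
```

Calling poll() once per skinnb setting, while the corresponding run is in
progress, gives numbers that can be compared directly between settings.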
> > >
> > > On Fri, Jun 25, 2021 at 7:15 AM Stephan Schott <schottve.hhu.de> wrote:
> > >
> > > > I have access to a machine with RTX 3090s. Those are 24GB though. I will
> > > > give it a spin and let you know once I have tried.
> > > >
> > > > On Fri, 25 Jun 2021 at 15:59, Scott Le Grand (<varelse2005.gmail.com>)
> > > > wrote:
> > > >
> > > > > Is there somewhere else you can try to run it with a 16 GB GPU? I've
> > > > > seen some issues with containers not playing nice with AMBER recently,
> > > > > but not this particular issue.
> > > > >
> > > > > On Fri, Jun 25, 2021 at 6:42 AM Stephan Schott <schottve.hhu.de> wrote:
> > > > >
> > > > > > I cannot do that, it is on an HPC cluster 😅. But I have tried on
> > > > > > multiple nodes, if that helps somehow.
> > > > > >
> > > > > > On Fri, 25 Jun 2021 at 15:41, Scott Le Grand
> > > > > > (<varelse2005.gmail.com>) wrote:
> > > > > >
> > > > > > > You've tried rebooting the system I assume?
> > > > > > >
> > > > > > > On Fri, Jun 25, 2021, 06:39 Stephan Schott <schottve.hhu.de> wrote:
> > > > > > >
> > > > > > > > Ok, interesting. Could that account for almost twice as much as
> > > > > > > > needed with skinnb=3 though? I can tell you that when the
> > > > > > > > processes are spawning, they seem to break at around 4 GB. Of
> > > > > > > > course, the resolution of nvidia-smi might just not be enough to
> > > > > > > > resolve a very rapid spike.
> > > > > > > >
> > > > > > > > The GPUs are allocated solely for this; no other process is
> > > > > > > > reported on nvidia-smi at least.
> > > > > > > >
> > > > > > > > On Fri, 25 Jun 2021 at 15:35, Scott Le Grand
> > > > > > > > (<varelse2005.gmail.com>) wrote:
> > > > > > > >
> > > > > > > > > With a smaller skin value you end up with more non-bond cells,
> > > > > > > > > and each of those non-bond cells has memory allocated to it
> > > > > > > > > that is in excess of what it would normally need, so you might
> > > > > > > > > be getting some sort of weird trade-off between more smaller
> > > > > > > > > cells and fewer larger cells. The other thing to ask is whether
> > > > > > > > > there are any other processes running on the GPU that might be
> > > > > > > > > eating memory.
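To illustrate the cell-count side of that trade-off, here is a rough sketch
in Python. It is not pmemd.cuda's actual allocation logic: it assumes a
generic cell list in which each cell edge must be at least cutoff + skin, and
the box size (300 A cubic) and cutoff (12 A) are hypothetical values chosen
only for the example, not taken from the system discussed here.

```python
import math


def cell_count(box_lengths, cutoff, skin):
    """Number of cells when each cell edge must be at least cutoff + skin.

    Generic cell-list convention, assumed for illustration only.
    """
    min_edge = cutoff + skin
    # Along each axis, fit as many whole cells of at least min_edge as possible.
    return math.prod(max(1, int(length // min_edge)) for length in box_lengths)


# Hypothetical box and cutoff, in Angstrom.
box = (300.0, 300.0, 300.0)
cutoff = 12.0

for skin in (2.0, 3.0):
    print(f"skin={skin}: {cell_count(box, cutoff, skin)} cells")
```

If every cell carries a fixed amount of padded storage, the total reserved
memory grows with the cell count, so a smaller skin can cost more memory even
though each cell holds fewer particles.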
> > > > > > > > >
> > > > > > > > > On Fri, Jun 25, 2021, 06:33 Stephan Schott <schottve.hhu.de>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > It's 743538 SIRAH particles. The memory reported is 16160 MB,
> > > > > > > > > > non-virtualized. With skinnb = 3 it is using 9340 MB right
> > > > > > > > > > now.
> > > > > > > > > >
> > > > > > > > > > On Fri, 25 Jun 2021 at 15:22, Scott Le Grand
> > > > > > > > > > (<varelse2005.gmail.com>) wrote:
> > > > > > > > > >
> > > > > > > > > > > How many atoms? How much memory on the V100? Is it
> > > > > > > > > > > virtualized?
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Jun 25, 2021 at 1:53 AM Stephan Schott
> > > > > > > > > > > <schottve.hhu.de> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > I am playing around with a SIRAH system and I hit this
> > > > > > > > > > > > error message consistently whenever I use the default
> > > > > > > > > > > > skinnb value of 2, but it runs without issue when it is
> > > > > > > > > > > > increased to 3, for whatever reason. Just for some info,
> > > > > > > > > > > > this is on a Tesla V100, driver 460.32.03, using Amber 21
> > > > > > > > > > > > compiled with CUDA 11.0 and gcc 9.3.
> > > > > > > > > > > > My GPU architecture knowledge is very barebones, but it
> > > > > > > > > > > > seems like there might be something odd going on, as I
> > > > > > > > > > > > would expect more memory usage the more particles are
> > > > > > > > > > > > included in skinnb, right? Does this have something to do
> > > > > > > > > > > > with the partitioning within the GPU memory? I would be
> > > > > > > > > > > > grateful if someone could tell me if I am just missing
> > > > > > > > > > > > something here.
> > > > > > > > > > > > Best regards,
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > > Stephan Schott Verdugo
> > > > > > > > > > > > Biochemist
> > > > > > > > > > > >
> > > > > > > > > > > > Heinrich-Heine-Universitaet Duesseldorf
> > > > > > > > > > > > Institut fuer Pharm. und Med. Chemie
> > > > > > > > > > > > Universitaetsstr. 1
> > > > > > > > > > > > 40225 Duesseldorf
> > > > > > > > > > > > Germany
> > > > > > > > > > > > _______________________________________________
> > > > > > > > > > > > AMBER-Developers mailing list
> > > > > > > > > > > > AMBER-Developers.ambermd.org
> > > > > > > > > > > > http://lists.ambermd.org/mailman/listinfo/amber-developers
> > > > > > > > > > > >


-- 
Stephan Schott Verdugo
Biochemist
Heinrich-Heine-Universitaet Duesseldorf
Institut fuer Pharm. und Med. Chemie
Universitaetsstr. 1
40225 Duesseldorf
Germany
_______________________________________________
AMBER-Developers mailing list
AMBER-Developers.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber-developers
Received on Fri Jun 25 2021 - 08:30:03 PDT