Re: [AMBER-Developers] CPPTRAJ: dpeaks clustering

From: Daniel Roe <daniel.r.roe.gmail.com>
Date: Wed, 1 Jun 2016 08:38:29 -0600

Remember, by default COORDS data sets are saved in memory so that will
reduce the total memory available as well. You can load the
coordinates as a TRAJ data set instead so that the data stays on the
disk. The only issue there is that TRAJ data sets cannot be modified,
so you will have to save the trajectory separately. So just have input
for the first run go something like:

parm protein.prmtop
trajin md-reimaged.nc
## RMSD
reference pro.pdb
rms reference out rmsd-clustering.dat \
    :5-27,42-62,80-101,122-144,167-189,210-230,245-264@CA
rmsd :278@*&!@H= reference nofit out rmsd-lig-clustering.dat
strip :1-277 outprefix strip
trajout forcluster.nc

Then input for the second run would be something like this:

parm strip.protein.prmtop
loadtraj forcluster.nc name MyTraj
cluster crdset MyTraj ...

Now the only large memory requirement will be for the pairwise matrix.
In the meantime I'm working on some code that will eliminate the
pairwise matrix memory requirement (at the cost of being slower). Hope
to get that done soon.
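
For reference, the 16.587 GB estimate quoted below for 91068 frames is
consistent with storing only the upper triangle of the pairwise matrix
in single precision. Here is a rough Python sketch of that arithmetic.
The 4 bytes per element and the decimal GB are assumptions inferred
from the reported numbers, not taken from the cpptraj source, and the
function name is made up for illustration:

```python
def pairwise_matrix_bytes(n_frames, bytes_per_element=4):
    """Bytes for the upper triangle (diagonal excluded) of an
    n_frames x n_frames distance matrix.

    bytes_per_element=4 assumes single-precision storage (an assumption).
    """
    n_elements = n_frames * (n_frames - 1) // 2
    return n_elements * bytes_per_element


if __name__ == "__main__":
    n = 91068  # frame count reported in the cpptraj output
    print(f"{pairwise_matrix_bytes(n) / 1e9:.3f} GB")  # 16.587 GB
    # Memory grows roughly with the square of the frame count, so
    # clustering on half the frames needs about a quarter of the memory.
    print(f"{pairwise_matrix_bytes(n // 2) / 1e9:.3f} GB")
```

Since the cost is quadratic in the number of frames, reducing the frame
count is by far the most effective way to shrink this matrix.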

-Dan


On Tue, May 31, 2016 at 5:14 PM, Yinglong Miao <yimiao.ucsd.edu> wrote:
> Just a little more information: I previously ran cpptraj within a batch
> script; now that I run it on its own, it outputs the following error:
>
> 0% 10% 20% 30% Segmentation fault (core dumped)
>
> Thanks,
> Yinglong
>
> On Tue, May 31, 2016 at 12:44 PM, Yinglong Miao <yimiao.ucsd.edu> wrote:
>
>> Hi Dan,
>>
>> Thanks for your reply. I also tried DBSCAN, and it seems dpeaks works better.
>> I really hope dpeaks with full functionality will be available soon.
>>
>> And here is my input file:
>>
>> ----
>> parm protein.prmtop
>> trajin md-reimaged.nc
>> ## RMSD
>> reference pro.pdb
>> rms reference out rmsd-clustering.dat \
>>     :5-27,42-62,80-101,122-144,167-189,210-230,245-264@CA
>> rmsd :278@*&!@H= reference nofit out rmsd-lig-clustering.dat
>> strip :1-277
>> cluster C0 \
>> dpeaks epsilon 1 dvdfile density_vs_dist.dat choosepoints auto \
>> distancecut 1.0 runavg runavg.dat deltafile delta.dat \
>> rms :1@*&!@H= nofit \
>> sieve 1 \
>> out cnumvtime.dat \
>> summary summary.dat \
>> info info.dat \
>> cpopvtime cpopvtime.agr normframe \
>> repout rep repfmt pdb \
>> singlerepout singlerep.nc singlerepfmt netcdf \
>> avgout Avg avgfmt restart
>> go
>> ----
>>
>> Yinglong
>>
>>
>> On Tue, May 31, 2016 at 12:03 PM, Daniel Roe <daniel.r.roe.gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> On Tue, May 31, 2016 at 11:36 AM, Yinglong Miao <yinglong.miao.gmail.com>
>>> wrote:
>>> > This may also be a question for the Users mailing list, but as I try the
>>> > cpptraj density peaks (dpeaks) clustering (which is great!), I notice the
>>> > following warning regarding sieving frames:
>>> > ...
>>> > Restoring sieved frames.
>>> > FIXME: Adding sieved frames not yet supported.
>>> >
>>> > I'm wondering whether this will be coded in the near future.
>>>
>>> Eventually. 'dpeaks' clustering is still under development and I don't
>>> have a time frame for when it will be complete. As is stated in the
>>> output, 'dpeaks' as currently implemented should be used with caution.
>>> You may want to try the DBSCAN method, which is also density based and is
>>> a little better tested.
>>>
>>> > Also is there
>>> > a limit on the data size for the clustering other than memory usage? The
>>> > workstation I'm running on has ~64 GB of memory, but the program exited in
>>> > the middle of the matrix calculation (crashed?) even with apparently enough
>>> > memory:
>>> >
>>> > Estimated pair-wise matrix memory usage: > 16.587 GB
>>> > Pair-wise matrix set up, 91068 frames
>>> > 0% 10% 20% 30% mkdir: cannot create directory
>>> > `cluster-dpeaks-e1-ncstep1-sieve1': File exists
>>>
>>> That doesn't look like a memory error to me. Can you provide your
>>> exact cpptraj input?
>>>
>>> -Dan
>>>
>>> >
>>> > Any suggestions will be appreciated,
>>> >
>>> > Yinglong
>>> >
>>> > Yinglong Miao, Ph.D.
>>> > Research Specialist I, Howard Hughes Medical Institute
>>> > Assistant Project Scientist, Department of Pharmacology
>>> > University of California, San Diego
>>> > Tel: 858-822-0255; Email: yimiao.ucsd.edu
>>> > http://mccammon.ucsd.edu/~ymiao/
>>> > http://gamd.ucsd.edu
>>> > _______________________________________________
>>> > AMBER-Developers mailing list
>>> > AMBER-Developers.ambermd.org
>>> > http://lists.ambermd.org/mailman/listinfo/amber-developers
>>>
>>>
>>>
>>> --
>>> -------------------------
>>> Daniel R. Roe, PhD
>>> Department of Medicinal Chemistry
>>> University of Utah
>>> 30 South 2000 East, Room 307
>>> Salt Lake City, UT 84112-5820
>>> http://home.chpc.utah.edu/~cheatham/
>>> (801) 587-9652
>>> (801) 585-6208 (Fax)
>>>
>>>
>>
>>



Received on Wed Jun 01 2016 - 08:00:02 PDT