Re: [AMBER-Developers] [AMBER] ATTN GLYCAM users: CY atom type work around

From: Jodi Ann Hadden <jodih.uga.edu>
Date: Sun, 11 Dec 2011 21:38:16 +0000

This solution also sounds very reasonable.

On Dec 11, 2011, at 4:16 PM, B. Lachele Foley wrote:

> I wasn't thinking of a wiki. I was thinking of a dedicated table in the /doc section that is referenced from the AmberTools manual (or even an appendix to the manual). I think all users would appreciate being able to look up atom types easily. If we do that, the table is easy for everyone to find and use, and because developers have to update the table in the documentation whenever they add a type, they have to see what is already taken.
>
> :-) Lachele
>
> Dr. B. Lachele Foley
> Complex Carbohydrate Research Center
> The University of Georgia
> Athens, GA USA
> lfoley.uga.edu
> http://glycam.ccrc.uga.edu
>
> ________________________________________
> From: Jodi Ann Hadden [jodih.uga.edu]
> Sent: Sunday, December 11, 2011 3:58 PM
> To: AMBER Developers Mailing List
> Cc: Xiaocong Wang; Matthew Tessier
> Subject: Re: [AMBER-Developers] [AMBER] ATTN GLYCAM users: CY atom type work around
>
> So this problem is relatively easy to fix: one of us just has to change our CY atom type to a different 2-letter code. We hesitate to be the ones who change because GLYCAM-06 was published with atom type CY; that is, the 2-letter code "CY" and all its associated parameters are literally laid out in tables in the GLYCAM-06 paper. CY is not listed anywhere in the parm99 paper, even though parm99 is apparently where it was first actually used on the protein side.
>
> This brings us to the bigger problem: As our existing FFs expand and new ones are developed, we need to be extra careful to avoid atom type overlaps so that FFs that are intended to work together to parameterize mixed systems can actually be used this way. We now have AMBER FFs for proteins, carbohydrates, lipids, DNA, small/drug molecules, all of which could theoretically be used in combination, yes?
>
> The only atom type standard I'm aware of right now is that GAFF atom types are all lower case so that they never overlap with protein, etc. atom types, which are all upper case. Going forward, it may be useful to extend such standards to other FFs based on the class of molecules they parameterize. So proteins get all upper case (XX), GAFF gets all lower case (xx), then maybe, as per Yong's suggestion, carbohydrates get upper first, lower second (Xx), and then maybe lipids get lower first, upper second (xX), etc. Something like that. Perhaps even numbers in one column, or in some cases symbols? Or perhaps extend atom types to 3-letter codes instead of 2? Just sort of scheming out loud here...
>
> Anyway, molecule-class-specific atom type standards would make everything straightforward and convenient for everyone because, when you develop your FF, you can choose whatever atom type codes you want as long as they match the appropriate molecule class format (XX, xx, Xx, xX, etc.). Mixed systems that combine different FFs for different classes of molecules no longer have to worry about overlaps in atom type because each molecule class has its own atom type format standard. It doesn't matter if two FFs for the same molecule class overlap because you wouldn't mix those. Does that make sense?
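
As a rough sketch of the format idea above, the Python below sorts a type name into a molecule class purely by its letter-case pattern. The pattern-to-class table is only the set of assignments floated in this thread (XX, xx, Xx, xX), not an agreed AMBER standard, and the function names are made up for illustration.

```python
# Hypothetical pattern-to-class table, taken from the proposal in this thread.
# It is NOT an existing AMBER convention.
CLASS_BY_PATTERN = {
    "XX": "protein",
    "xx": "GAFF (small molecule)",
    "Xx": "carbohydrate",
    "xX": "lipid",
}


def case_pattern(atom_type):
    """Map each character to X (upper case), x (lower case) or # (anything else)."""
    return "".join(
        "X" if ch.isupper() else "x" if ch.islower() else "#"
        for ch in atom_type
    )


def molecule_class(atom_type):
    """Return the molecule class for a type name, or 'unassigned' if no rule fits."""
    return CLASS_BY_PATTERN.get(case_pattern(atom_type), "unassigned")


for name in ("CY", "Cy", "cY", "ca", "C*"):
    print(name, case_pattern(name), molecule_class(name))
```

Types containing digits or symbols (for example C*) come out as "unassigned" here, which is where the open questions about numbers, symbols, or 3-letter codes would have to be settled.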
>
> The best part is that it avoids the communication lapses that lead to issues like the current one with CY, where we would have to organize a wiki or something similar and call dibs on 2-letter codes, which are already very limiting even with case sensitivity. Not to mention that the problem crops up again every time you want to add a new atom type to your FF: you have to go to this theoretical wiki and see whether the atom type is taken (and hope that the wiki's info is accurate and up to date), or you have to go fishing around in .dat files or mail the list, etc. Because everyone is so busy, communication lapses continue to happen, and eventually we end up with another CY fiasco.
>
> Ok, so there is my idea. I'm now going to throw Xiaocong under the bus, because as our resident FF dev, he is supposed to be organizing resolution of this CY issue. I don't think he's actually on the dev list, so I'm CCing him to get him in on the conversation. With the developers meeting coming up, this seems like something worth getting the discussion started on -- What to do about CY and how to avoid this problem in the future.
>
> Jodi
>
>
> On Dec 11, 2011, at 12:52 PM, B. Lachele Foley wrote:
>
>> Parm91 had *something* assigned to almost every C[A-Z] plus some others. But, they weren't all used. For example, there are no actual parameters assigned to CY.
>>
>> $ grep -w CY parm91.dat
>> CY 13.02 ?
>> C CX CY
>>
>> So, it was, from an outsider's point of view, rather gratuitously reserved. It was also not used in parm94:
>>
>> $ grep -w CY parm94.dat
>> C C* CA CB CC CN CM CK CQ CW CV CR CA CX CY CD
>>
>> In both these cases, CY was just assigned as an equivalent to C for vdW info, which seems a pretty dangerous thing to do in general. But, more importantly, CY wasn't *actually* being used. So, it seemed a fair target.
>>
>> Glycam was first developed to work with Parm94. Around that time, the type CG (also previously used) was reassigned to mean "glycan carbon". I can't possibly comment on the relative timings, but at some point, both Glycam and parm99 started actually using CY. I do not know if either announced it to the other. I'm pretty sure this all happened before I got here (and certainly before I knew enough to know what was happening).
>>
>> Making use of case-sensitivity is a reasonable way to go. But, it would be nice not to have to worry about conflicting atom types. And, perhaps we need only have a list of "check these other force fields before choosing". But, it might be nice, for this and other reasons, to have a central list of type names, the places where they are used and what they are used for.
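
To make the "central list" idea concrete, here is a minimal Python sketch that inverts per-force-field type tables into one registry keyed by type name, then reports names claimed by more than one force field. The data layout, the function names, and the toy entries are assumptions for illustration; only the "glycan carbon" meaning of CG comes from this thread, and the other descriptions are placeholders.

```python
from collections import defaultdict


def build_registry(force_fields):
    """Invert {force_field: {type: description}} into {type: [(force_field, description)]}."""
    registry = defaultdict(list)
    for ff_name in sorted(force_fields):
        for atom_type, description in force_fields[ff_name].items():
            registry[atom_type].append((ff_name, description))
    return dict(registry)


def shared_types(registry):
    """Return the type names that appear in more than one force field."""
    return sorted(t for t, uses in registry.items() if len(uses) > 1)


# Toy input; descriptions other than CG are placeholders, not real definitions.
TOY_FORCE_FIELDS = {
    "GLYCAM_06": {"CG": "glycan carbon", "CY": "placeholder description"},
    "ff99SB": {"CT": "placeholder description", "CY": "placeholder description"},
}

registry = build_registry(TOY_FORCE_FIELDS)
for atom_type in shared_types(registry):
    users = ", ".join(ff for ff, _ in registry[atom_type])
    print(f"{atom_type}: used by {users}")
```

A table like this could be generated into the documentation rather than maintained by hand, but that depends on having a reliable way to extract the types from each parameter file.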
>>
>> :-) Lachele
>>
>> Dr. B. Lachele Foley
>> Complex Carbohydrate Research Center
>> The University of Georgia
>> Athens, GA USA
>> lfoley.uga.edu
>> http://glycam.ccrc.uga.edu
>>
>> ________________________________________
>> From: Yong Duan [duan.ucdavis.edu]
>> Sent: Sunday, December 11, 2011 11:50 AM
>> To: AMBER Developers Mailing List
>> Subject: Re: [AMBER-Developers] [AMBER] ATTN GLYCAM users: CY atom type work around
>>
>> Hi Lachele,
>>
>> I thought CY is a rather ancient atom type and was in parm91.dat (and is
>> still there).
>>
>> With several groups working on various flavors of force fields and the
>> need for an ever-increasing level of sophistication, some collisions of
>> atom-type definitions are almost inevitable. It's probably a sign of
>> convergence of ideas (or collisions of ideas, depending on how you look at
>> it). It is great that this type of problem is known before we send it to
>> the users (or, was it the case??) and a work-around is available.
>>
>> Testing by tleap/gleap/leap is one way to find things like this. But I
>> don't think it is realistic to have a bunch of tleap files that contain
>> all available information in parmxx.dat (and various versions). So,
>> something like what Scott suggested will probably work better.
>>
>> One potential solution is to use lower case as in GAFF, which uses lower
>> case 2-letter strings for atom types. You may imagine mixed lower-upper
>> cases for GLYCAM. So, if there is a need to re-parameterize or rename CY
>> in GLYCAM, you'd call it Cy or cY.
>>
>> --
>> Yong Duan, Ph.D, Professor
>> UC Davis Genome Center and
>> Department of Biomedical Engineering
>> University of California at Davis
>> Davis, CA 95616
>> 530-754-7632
>>
>> On 12/10/11 10:26 AM, "B. Lachele Foley" <lfoley.uga.edu> wrote:
>>
>>> We've been trying to figure out a way to bring this type naming thing up.
>>> We would like to implement some sort of new-type-generation procedure
>>> that will help keep force fields from introducing duplicate types. This
>>> doesn't matter, of course, with two similar force fields, say for
>>> proteins. But, for force fields that might get mixed, it totally
>>> matters. When we chose CY, it almost certainly wasn't being used by
>>> anyone, and that's why we chose it. But, there isn't a good way to
>>> declare that, so other folks don't know to avoid it. We were planning to
>>> suggest a procedure rather than just complain, but... we are all so very
>>> overwhelmed.
>>>
>>> We did check ours against the other force fields. In fact, we have a
>>> program that will check for overlap between any two force fields, so we
>>> can do it automatically. I don't recall the most recent results, but
>>> there were more overlaps. I'll check into how hard it will be to share
>>> the program.
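
The program mentioned above is not shown in this thread, so the following is only a guess at what a pairwise overlap check might look like, not that program. It assumes parm-style text in which line 1 is a title and the atom type/mass section runs until the first blank line; the two inputs are toy strings, not real parameter files.

```python
def atom_types(parm_text):
    """Collect type names from the mass section of parm-style text.

    Assumes line 1 is a title and the mass section ends at the first blank line.
    """
    types = set()
    for line in parm_text.splitlines()[1:]:
        if not line.strip():
            break  # end of the mass section
        types.add(line.split()[0])
    return types


def overlapping_types(parm_a, parm_b):
    """Return the sorted type names that both parameter texts define."""
    return sorted(atom_types(parm_a) & atom_types(parm_b))


# Toy inputs for illustration only; not copied from any real .dat file.
PROTEIN_TOY = """toy protein parameters
C  12.01
CT 12.01
CY 12.01

"""

GLYCAN_TOY = """toy glycan parameters
CG 12.01
CY 12.01
OS 16.00

"""

print(overlapping_types(PROTEIN_TOY, GLYCAN_TOY))
```

Matching on whole whitespace-separated tokens plays the same role as the -w flag in the grep commands quoted elsewhere in this thread: CY will not match CYX or similar.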
>>>
>>> Yeah, will fix the repo. Actually, we have a version h poised to pounce,
>>> too. We still haven't made tleap tests, either. Sigh.
>>>
>>> Oh.... last I recall... it works because... and I might have some of this
>>> backwards, but essentially: Leap takes the first atom type assignment
>>> but the last force field parameter assignment. So, we load leaprc for
>>> our stuff to set the atom types. Duplicate types in 99SB get ignored,
>>> but the params from SB overwrite ours. So, we reload our leaprc to set
>>> the params right, without affecting the types, which are all ignored.
>>>
>>> Got that info from observed behavior, not from inspecting code.
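
The load-order behavior described above can be mimicked with a small Python model. This is a toy based only on the observed behavior reported here (first atom type assignment kept, last parameter assignment kept), not on the LEaP source; the type lists are illustrative subsets and the function names are invented.

```python
def source(state, name, leaprc):
    """Apply one leaprc-like dict to the state, following the described rules."""
    for atom_type in leaprc["types"]:
        state["types"].setdefault(atom_type, name)  # first type assignment is kept
    for atom_type in leaprc["params"]:
        state["params"][atom_type] = name           # last parameter assignment wins
    return state


def build(order):
    """Source a sequence of (name, leaprc) pairs into a fresh state."""
    state = {"types": {}, "params": {}}
    for name, leaprc in order:
        source(state, name, leaprc)
    return state


# Illustrative subsets only; the real files define many more types.
GLYCAM = {"types": ["CG", "CY"], "params": ["CG", "CY"]}
FF99SB = {"types": ["CT", "CY"], "params": ["CT", "CY"]}

two_step = build([("GLYCAM", GLYCAM), ("ff99SB", FF99SB)])
workaround = build([("GLYCAM", GLYCAM), ("ff99SB", FF99SB), ("GLYCAM", GLYCAM)])

print(two_step["types"]["CY"], two_step["params"]["CY"])      # GLYCAM ff99SB
print(workaround["types"]["CY"], workaround["params"]["CY"])  # GLYCAM GLYCAM
```

In this model, loading GLYCAM and then ff99SB leaves CY with the GLYCAM type definition but the ff99SB parameters; sourcing GLYCAM a second time restores the GLYCAM parameters for CY without disturbing the ff99SB-only types.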
>>>
>>> :-) Lachele
>>>
>>> Dr. B. Lachele Foley
>>> Complex Carbohydrate Research Center
>>> The University of Georgia
>>> Athens, GA USA
>>> lfoley.uga.edu
>>> http://glycam.ccrc.uga.edu
>>>
>>> ________________________________________
>>> From: case [case.biomaps.rutgers.edu]
>>> Sent: Saturday, December 10, 2011 9:24 AM
>>> To: amber-developers.ambermd.org
>>> Subject: Re: [AMBER-Developers] [AMBER] ATTN GLYCAM users: CY atom type
>>> work around
>>>
>>> On Sat, Dec 10, 2011, Jodi Ann Hadden wrote:
>>>
>>>> Currently there is an issue with the CY atom type which overlaps in
>>>> ff99SB and GLYCAM.
>>>
>>> Thanks for looking into this. Note that ff10 and ff11r4 also have some
>>> new atom types for proteins. Can you check for additional name overlaps?
>>> (You need to switch to the "ff11" branch to get ff11 parameters.)
>>>
>>>> - Go to www.glycam.org/params and download the most up-to-date version
>>>> of the parameters.
>>>
>>> It looks like Glycam_06g-1.dat is slightly different from the corresponding
>>> file (GLYCAM-06g.dat) in the git repo. Can you bring the latter up to
>>> date, probably putting the full id in the header line, removing references
>>> to "Amber 8", etc.?
>>>
>>>> - If you are building a system that contains a carbohydrate, load
>>>> GLYCAM, ff99SB, then GLYCAM again.
>>>>
>>>> source leaprc.GLYCAM_06
>>>> source leaprc.ff99SB
>>>> source leaprc.GLYCAM_06
>>>
>>> Can you explain why this works? Is there a fix going forward that
>>> eliminates the need for this workaround?
>>>
>>> ...thanks!....dave
>>>
>>>
>>> _______________________________________________
>>> AMBER-Developers mailing list
>>> AMBER-Developers.ambermd.org
>>> http://lists.ambermd.org/mailman/listinfo/amber-developers



Received on Sun Dec 11 2011 - 14:00:02 PST