Re: [AMBER-Developers] Latest configure script seems to want to install tons of third party dependencies?

From: Jason Swails <jason.swails.gmail.com>
Date: Thu, 28 Jan 2016 11:24:33 -0500

On Thu, Jan 28, 2016 at 10:32 AM, Ross Walker <ross.rosswalker.co.uk> wrote:

> >
> > This workflow is currently utilized by many automated services and is
> > fairly robust. It is also secure -- Continuum's business depends on it.
> >
>
> There is no guarantee this is secure and certainly without some kind of
> contract in place (which could be as simple as someone going and
> registering with them) it definitely does not count as secure.
>

Well I hate to break it to you, but there are a lot of potential security
holes in Amber a heck of a lot bigger than downloading miniconda.


> >> It also pollutes the git tree and a make clean / make distclean does not
> >> clear this up.
> >>
> >
> > Everything it does is put inside $AMBERHOME/miniconda. This is hardly
> > polluting. We can have make clean/make distclean clear this stuff out
> > quite easily ("make clean" should definitely not touch this, but there's
> > an argument to be made that distclean should). I can expand on why I
> > *didn't* have distclean remove this if you want my explanation.
> >
>
> make clean probably should not remove this but make distclean should in my
> opinion. Ultimately I'd expect make distclean to take me back to
> effectively immediately after I untarred things.
>

If that's a popular sentiment, we can make that change. I viewed the
point of "distclean" as returning you to a state where you can use another
compiler, which necessitates removing everything *except* miniconda. Since
miniconda (and everything inside it) is a sizeable download, I chose not to
have distclean remove it to save bandwidth. I see arguments both ways,
and would probably lean slightly toward having distclean remove miniconda.
I went ahead and made the change, which can be easily undone if enough
people have strong opinions about it (but I suspect that won't be the case).

> >> In my opinion we really really really should not be depending on tons of
> >> external libraries and the configure script in AMBER should definitely
> >> NOT be connecting to any external sites EXCEPT ambermd.org.
> >>
> >
> > You're certainly entitled to your opinion, but mine is that this stance
> > is rather arbitrarily purist. The reason I say it's arbitrary is because
> > it would be easy to simply host those files on ambermd.org -- but why not
> > take advantage of their (i.e., Continuum's) better web infrastructure
> > (more reliable web service, more bandwidth, faster connection to more
> > areas, ... etc.)?
> >
>
> It is definitely not arbitrarily purist. It is a perfectly reasonable and
> widely used business practice. This is connecting to something and
> downloading without the user's permission.


It's *not* doing this without users' permission (anymore). Also, many
large businesses and institutions rely heavily on third-party
infrastructure to power their own services. In fact, if you look at the
tech community these days, there is a growing push to outsource
some services to dedicated providers (Python moving to GitHub, many
universities powering email through Google, Apple/Google/Microsoft all
moving significant projects to GitHub, etc.). AMBER is hardly a large
business with lots of resources. Also, the conda license is quite
permissive and allows source-code and binary redistribution:
https://github.com/conda/conda/blob/master/LICENSE.txt.



> This is very bad practice (and possibly illegal in a number of countries).
> We should be very careful here. We can't assume that an internet connection
> is available for installing AMBER and even if it is we can't be downloading
> something that requires us to rely on a third party. Yum install blah etc
> is fine because it uses repositories that users explicitly agreed to when
> installing the operating system. Note most major companies host their own
> repositories internally and require their installations to have passed
> internal security audits. This circumvents that and would raise a lot of
> red flags at any pharma company for example that uses AMBER.
>

Amber relies heavily on system calls, and a large number of users install
Amber using root privileges. The updating mechanism also provides a hook
to change filesystem contents. Even I, with no experience compromising
people's computers (or desire to), could pretty easily penetrate a fair
number of users' computers through standard channels. The (optional)
miniconda install is simply not a sensible vector for attack (and Continuum
products are used extensively in industry -- from big Pharma to Wall Street).

> I haven't updated in a few days because I am traveling with extremely
> limited internet connectivity - the other problem I ran across here is
> that, when tethered to my phone, it tried to download a ton of stuff just
> from me running ./configure. If the new version asks for permission this is
> better - although I'd hope that the description of what it is going to do
> and what is involved (and what it involves the user explicitly consenting
> to) should be very clear.
>

My hope is that it is clear. Any feedback indicating that it's not (and
how it's unclear) will help me improve clarity.

> > It's also circumvented entirely by specifying a python to use with Amber.
>
> Can it not be the reverse? This would seem much better to me. E.g. test
> for all different combinations etc and only if it finds something that
> doesn't work does it then ask one to specify the python to use or offers to
> download miniconda? E.g. I have no idea which pythons are on my machine,
> where they are etc. I expect it to just find python in the path and work.
> This is how it should be by default unless I explicitly specify otherwise.
>

I agree, and this is how things behave now. configure will analyze your
current Python for version and prerequisite compatibility with the needs of
our Python code base. If it is insufficient to satisfy basic needs, then
configure will politely ask if you would like it to fill this prerequisite
for you.
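
To make that check concrete, here is a minimal sketch in Python. It is not
the actual configure logic: the function names are hypothetical, and the
thresholds (Python 2.7 or newer, numpy importable) are assumptions taken
from the ParmEd requirements discussed in this message. The sketch itself
needs Python 3.4 or newer to run.

```python
import importlib.util
import sys

# Assumption: minimum version taken from the ParmEd requirement in this thread.
MIN_VERSION = (2, 7)


def python_meets_requirements(version, has_numpy):
    """Return True if an interpreter with this (major, minor, ...) version
    tuple and numpy availability would satisfy the assumed requirements."""
    return tuple(version[:2]) >= MIN_VERSION and bool(has_numpy)


def probe_current_interpreter():
    """Report (version, has_numpy) for the interpreter running this script."""
    has_numpy = importlib.util.find_spec("numpy") is not None
    return tuple(sys.version_info[:3]), has_numpy


if __name__ == "__main__":
    version, has_numpy = probe_current_interpreter()
    ok = python_meets_requirements(version, has_numpy)
    print("Python %d.%d.%d, numpy=%s, sufficient=%s"
          % (version + (has_numpy, ok)))
```

Keeping the requirement test separate from the probing step means the same
rule could be applied to any candidate interpreter, not only the one
running the script.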

Up until now, our philosophy has been "write your Python code to be
compatible with any system Python a user might have". This was OK even 3
or 4 years ago, but it's no longer a sustainable path (which is clear to
anybody doing sustained Python development). ParmEd dropped support for
anything older than Python 2.7, which by extension removed Python
2.4/2.5/2.6 support for programs that depend on it, like MMPBSA.py,
cpinutil.py, and parmed (required for 12-6-4, pmemd-TI, and implementing
some force fields like the Garcia-Chen RNA FF and CHARMM and GROMACS
conversions to Amber). It also dropped support for Python installations
without numpy. Since many users want to do their canned MD-MMPBSA.py
calculations, I want a solution that will prevent these users from suddenly
having to be "experts" to install Amber in a way that will work. And I
want a solution that will not cripple my ability to drive ParmEd
development forward (so please no comments like "just restore Python 2.4
compatibility").

> > For instance, if you do `./configure --with-python /usr/bin/python gnu`,
> > it will simply use /usr/bin/python and never attempt to build its own.
>
> Why can't we just do 'which python' and have that be the default and then
> you can override it if you are an expert user?
>

As long as people follow the instructions detailed in
http://ambermd.org/ubuntu.html, installing all packages listed there
(including the Python ones), this is what configure will do.

> > I will also fix this to not be a fatal issue when connectivity is a
> > problem.
> >
>
> That's a definite requirement since there are many systems where people
> install AMBER that are not freely connected to the outside world. Granted
> rarely in academia, where things tend to be loose security wise, but in a
> commercial environment this is the norm.
>

I think this is actually a bit of a moot point now. Since configure asks
"Do you want me to go online and get a working Python for your OS that will
work with all of Amber", people without connectivity should obviously say
"no". And this question is only asked if the default Python doesn't
satisfy basic requirements.
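
The order of decisions described above can be sketched roughly as follows.
This is an illustration only, not the configure source: the function name
and return values are hypothetical, and what happens after a "no" answer
(carrying on without downloading anything) is my assumption for the sketch.

```python
def decide_python_action(explicit_python, default_ok, ask_user):
    """Pick what to do about Python, in the order described in this message.

    explicit_python: path given via --with-python, or None
    default_ok:      True if the default Python satisfies basic requirements
    ask_user:        callable returning True if the user agrees to a download
    """
    if explicit_python is not None:
        # An explicitly specified Python bypasses everything else.
        return ("use", explicit_python)
    if default_ok:
        # The default Python is good enough, so nobody is asked anything.
        return ("use", "default")
    if ask_user():
        # Only reached when the default Python is insufficient.
        return ("download", "miniconda")
    # Assumption: a "no" answer is not fatal and nothing is downloaded.
    return ("skip", None)
```

Passing the question in as a callable keeps it from being asked at all
unless the first two cases fail, which is the point for people installing
without connectivity.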

I really do hope I've eased your biggest concerns with the latest version
in the git repo, so please check that out when you have a chance and see if
there are lingering issues. IMO, this kind of feedback is helping me make
this whole thing a lot better.

Thanks,
Jason

-- 
Jason M. Swails
BioMaPS,
Rutgers University
Postdoctoral Researcher
_______________________________________________
AMBER-Developers mailing list
AMBER-Developers.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber-developers
Received on Thu Jan 28 2016 - 08:30:05 PST