Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document AUTOBAHN_STRIP_XBR install-time option #1371

Open
jameshilliard opened this issue Apr 24, 2020 · 24 comments
Open

Document AUTOBAHN_STRIP_XBR install-time option #1371

jameshilliard opened this issue Apr 24, 2020 · 24 comments

Comments

@jameshilliard
Copy link
Contributor

jameshilliard commented Apr 24, 2020

The current behavior for autobahn is to install a bunch of xbr modules even when the user does not request them to be installed.

There are a number of reason this is a bad idea:

  1. Packages should not install modules that are not relevant to the typical use case for the package without strong technical justification as this can significantly increase attack surface, especially for a package like autobahn which already has significant network attack surface(in one of my projects it's effectively the only package doing any remote network communications over public networks other than the usual DNS/DHCP/VPN client daemons).

  2. Over time the demarcation point between the xbr modules and autobahn may become less clear which increases the security audit complexity, already this seems to be starting to happen as there are xbr modules in multiple different autobahn paths. By splitting the modules into a separate package it becomes very obvious that they are not relevant to the typical use case and thus can be ignored when auditing security.

  3. In regards to the security of xbr specifically, it is well known in the cryptocurrency community that the Ethereum project has a very poor track record in regards to security in general, in fact I have personally discovered security vulnerabilities in the design of some ETH network protocols in the past. Being able to fully remove potentially risky Ethereum project related code makes it easier to audit the security of autobahn.

  4. The maintenance of the conditional loading of imports is more complex than using conditional installation of the xbr modules.

The approach I took in #1369 to splitting out the xbr package from autobahn has a number of advantages:

  1. It cleanly separates out the xbr feature from the rest of autobahn so that one doesn't need to audit the security of the xbr side to ensure it can't accidentally cause runtime side effects when using base autobahn functionality.

  2. It still allows for xbr and autobahn to be developed in a tightly coupled way, for example by always having both packages require exact versions of each other in the setup.py(you can just script the package version bump to always bump and upload autobahn and xbr at the same time with matching versions). They will still share and install to the same autobahn site-packages namespace as before and can be installed in the same way by the end user, the main difference is that the development tree and distribution packages have cleanly separated functionality and that autobahn can be installed without xbr.

  3. It allows for simplification of the xbr parts of the codebase as one can entirely remove all the conditional dependency based import logic currently used throughout the xbr components as xbr would only ever be installed when all necessary dependencies are also installed.

@jameshilliard
Copy link
Contributor Author

jameshilliard commented Apr 24, 2020

I just realized the install/distribution size is a major issue, without this change the tar.gz compressed autobahn distribution package is roughly 1MB, uncompressed it is a massive 11.2MB(which is very large for resource constrained embedded systems like the one I use autobahn with). It seems the bundled ETH contracts are the main cause of this huge size increase as they are 8.8MB alone.

With #1369 autobahn is about 300KB when tar.gz compressed and 2.2MB uncompressed.

IMO this alone is a good enough reason to split the xbr modules out.

@oberstet
Copy link
Contributor

Packages should not install modules that are not relevant to the typical use case for the package without strong technical justification as this can significantly increase attack surface,

there are many aspects here .. one would be: in your app, you only need a strict subset of the code in autobahn. why not remove all code in autobahn that you don't need in your app?

you could even do that in the build step for your app, and it the app will have the minimal (internal) attack surface, as it only code that is strictly needed for the app is there anyways

Over time the demarcation point between the xbr modules and autobahn may become less clear which increases the security audit complexity

yes, we will evolve this together, as quite simply: XBR builds on top of WAMP which (can) build on top of WebSocket. All 3 are working as an integrated stack. hence we want close integration by design.

for example, we deliberately did not split into 2 packages for WebSocket and WAMP

if you don't need WAMP, but only WebSocket, you can have a custom build step downstream that removes all files you don't need

it is well known in the cryptocurrency community that the Ethereum project has a very poor track record in regards to security in general

is it? pls provide support for this claim (or its 2 parts). further: if you think so, don't use autobahn or any of my code then;) I am a big Ethereum fan, and it is a perfect complement to WAMP, as it takes the decentralization/decoupling to the multi-party level .. just my view. Security wise, there surely had been issues, and there is learning required how to write secure contracts etc. expected, as this is breaking new land ...

anyways: I agree to your main point: any code that is not required and not there cannot originate a security issue. so the goal is to minimize the code that does "the job" (=app). why not then go full circle and remove all code that your app doesn't need?

@oberstet
Copy link
Contributor

oberstet commented Apr 25, 2020

It seems the bundled ETH contracts are the main cause of this huge size increase as they are 8.8MB alone.

ok, yes, I can see how this sucks (if you don't need it).

mmh. thinking ... this is sth that we might be able to fix for itself without doing the big namespace thing:

we could split out the contracts bundle (the whole 8.8MB) into a separate (python) package (basically only containing the JSON files), but keep everything else untouched (besides then adding the new dependency on the abi package)

would that work for you? this would be much less intrusive and fits the current approach in autobahn (as with websocket and wamp, not split up. that is xbr py code stays in the one package, only the JSON files are in a 2nd package)

@oberstet
Copy link
Contributor

so an alternative to the full PEP420 approach would split out only the JSON files into a new xbr package. this would solve the size issue.

security wise, my argument would be like this:

  • a pip install autobahn without the xbr flavor will not install any of the "bad ethereum dependencies" (other packages)
  • it will still install our (autobahn internal) python code specific to xbr
  • all that code is in modules below autobahn.xbr and confined to this subdir https://github.com/crossbario/autobahn-python/tree/master/autobahn/xbr
  • any calling into that code specifically from the app side requires a respective import on the app side - which is easy to audit for (grep .. mostly)

@oberstet
Copy link
Contributor

btw, rgd the argument: for security, installed code should be absolutely minimized.

this will collide for example with users that required us to follow the (granted, best) practice of also including the tests in the package, eg https://github.com/crossbario/autobahn-python/tree/master/autobahn/twisted/test => is packaged up!

I think it was Debian which have a policy for tests to be included in a package.

now what? include? not include? =)

I am just trying to make a point here: if you wanna optimize for max security, then why not remove all code an app doesn't need. in fact, why not remove all drivers and OS parts not needed as well: http://rumpkernel.org/

@jameshilliard
Copy link
Contributor Author

jameshilliard commented Apr 25, 2020

there are many aspects here .. one would be: in your app, you only need a strict subset of the code in autobahn. why not remove all code in autobahn that you don't need in your app?

To a large degree I actually do this, I use buildroot which generates a heavily stripped down operating system image with only minimal dependencies, by doing that I can get my entire production OS image(bootloader+kernel+rootfs) including embedded web browser down to ~100MB compressed.

yes, we will evolve this together, as quite simply: XBR builds on top of WAMP which (can) build on top of WebSocket. All 3 are working as an integrated stack. hence we want close integration by design.

Yeah, that's why I took this approach since it should still allow for that development model, ie by cross pinning xbr and autobahn versions in setup.py so that you then don't have to worry about internal API breakage between the two.

if you don't need WAMP, but only WebSocket, you can have a custom build step downstream that removes all files you don't need

I use the WAMP functionality extensively.

further: if you think so, don't use autobahn or any of my code then;)

Heh, I haven't found any security flaws so far in autobahn, seeing a bunch of ethereum libraries kinda scares me though as I've seen the type of code written by Vitalik, I actually rewrote the transaction parsing code for a wallet project(joinmarket) that had unfortunately used vitalik buterin's pybitcointools which had regexes in the transaction parser, that was not an easy rewrite.

Security wise, there surely had been issues, and there is learning required how to write secure contracts etc. expected, as this is breaking new land ...

I'm quite skeptical writing a secure contract in with eth is possible for anything other than the most trivial one, especially when the company founded by the former ETH foundation CTO couldn't even write a multisig wallet contract that didn't get hacked, not once but twice.

@jameshilliard
Copy link
Contributor Author

jameshilliard commented Apr 25, 2020

so an alternative to the full PEP420 approach would split out only the JSON files into a new xbr package. this would solve the size issue.

At the start it would, but I assume the xbr module will grow significantly over time. If you're already splitting out the JSON files I don't really see why it would be much more difficult to also split the python parts as both packages would still need to be installed for xbr to function.

now what? include? not include? =)

Personally I would be on the not include side, but I mostly work with buildroot for OS development which makes an effort to remove tests from application builds.

I am just trying to make a point here: if you wanna optimize for max security, then why not remove all code an app doesn't need. in fact, why not remove all drivers and OS parts not needed as well: http://rumpkernel.org/

Heh, of course one must weigh the maintenance burden as well, I find buildroot generally does a pretty good job in regards to making good tradeoffs when it comes to that sort of thing. Part of the reason size is so critical for me is that to update my app/apps I essentially re-image the entire system every time, this way updates are effectively atomic and I don't have to worry about applications on the device getting incompatible versions, this sort of update method is only feasible if you strip everything down, I can even send the ~100mb compressed update images over 3g connections since they are small enough(which many of the devices I work with use for internet connectivity).

@oberstet
Copy link
Contributor

thanks a lot for your comments! interesting. also important for me to understand where you come from / how you look at this. because we need to strike a balance .. as the "include vs not include" example demonstrates, with different user groups come different priorities.

we have to strike a balance between the different priorities (our own as well .. and the resources we have).

of course one must weigh the maintenance burden as well,

yes, absolutely. applies to us as well;)


rgd the "size issue" specifically, pls have a look at this:

(cpy382_1) oberstet@intel-nuci7:~/scm/crossbario/autobahn-python$ du -hs autobahn
11M	autobahn
(cpy382_1) oberstet@intel-nuci7:~/scm/crossbario/autobahn-python$ rm -rf autobahn/xbr/contracts
(cpy382_1) oberstet@intel-nuci7:~/scm/crossbario/autobahn-python$ du -hs autobahn
2,4M	autobahn
(cpy382_1) oberstet@intel-nuci7:~/scm/crossbario/autobahn-python$ rm -rf autobahn/wamp/gen
(cpy382_1) oberstet@intel-nuci7:~/scm/crossbario/autobahn-python$ du -hs autobahn
2,0M	autobahn
(cpy382_1) oberstet@intel-nuci7:~/scm/crossbario/autobahn-python$ rm -rf autobahn/nvx
(cpy382_1) oberstet@intel-nuci7:~/scm/crossbario/autobahn-python$ du -hs autobahn
1,9M	autobahn
(cpy382_1) oberstet@intel-nuci7:~/scm/crossbario/autobahn-python$ find autobahn -type d -name test*
autobahn/test
autobahn/twisted/test
autobahn/twisted/testing
autobahn/asyncio/test
autobahn/websocket/test
autobahn/rawsocket/test
autobahn/wamp/test
autobahn/xbr/test
(cpy382_1) oberstet@intel-nuci7:~/scm/crossbario/autobahn-python$ find autobahn -type d -name test* -exec rm -rf {} \;
(cpy382_1) oberstet@intel-nuci7:~/scm/crossbario/autobahn-python$ find autobahn -type d -name test*
(cpy382_1) oberstet@intel-nuci7:~/scm/crossbario/autobahn-python$ du -hs autobahn
1,5M	autobahn
(cpy382_1) oberstet@intel-nuci7:~/scm/crossbario/autobahn-python$ rm -rf autobahn/xbr/
(cpy382_1) oberstet@intel-nuci7:~/scm/crossbario/autobahn-python$ du -hs autobahn
1,3M	autobahn
(cpy382_1) oberstet@intel-nuci7:~/scm/crossbario/autobahn-python$ 

so:

  • splitting out the ABI files alone would bring it down from 11M to 2,4M
  • splitting out the whole XBR thing would additionally save only 200K
  • however, if you would remove everything that you don't need, you could bring it down to 1,3M
  • there is no sane way for us to publish a package exactly with ^ (1,3M) - this should be done IMO downstream while baking your buildroot image ..

summary:

at this point of the discussion, I would be +1 on splitting out the ABI files into a separate package only. lemme know if you are cool with that.

if so, next question is: should we have to do the baking and publishing of that new package in this repo anyways?

maybe just publish the ABI file (only) to a py package directly from https://github.com/crossbario/xbr-protocol

after that, it'll just be a matter of changing a few import lines here to point to the new package ..

@jameshilliard
Copy link
Contributor Author

splitting out the whole XBR thing would additionally save only 200K

Well this is only at the moment, my assumption is that you will be significantly expanding the features for this over time resulting it causing significant size increases down the line.

at this point of the discussion, I would be +1 on splitting out the ABI files into a separate package only. lemme know if you are cool with that.

if so, next question is: should we have to do the baking and publishing of that new package in this repo anyways?

Well I'd prefer if the python files were also split out, maybe I'm missing something but I don't really see how splitting out the python files would be more difficult than just splitting out the json files, you effectively end up with having to maintain 2 separate pypi packages either way. If both projects are in the same development tree as in #1369 then you avoid the maintainence burden of having to keep 2 separate repos in sync for development.

The idea is that this way the development process for autobahn/xbr doesn't have to really change at all, just the pypi packaging process is changed effectively.

@oberstet
Copy link
Contributor

maybe I'm missing something but I don't really see how splitting out the python files would be more difficult than just splitting out the json files

we have a quite complex CI/CD pipeline (also include private repos), and I just think it will have less fallout onto this. it will require us to touch half a dozen repos in any case.

anways: we won't do the namespace thing. we would do the abi file thing - if you want. otherwise I would close this, as we don't care about size.

@jameshilliard
Copy link
Contributor Author

There's a setuptools example on how to exclude tests by the way. From my understanding it's best practices for python packages to have tests be part of the git tree but not the pypi packages.

@jameshilliard
Copy link
Contributor Author

we have a quite complex CI/CD pipeline (also include private repos), and I just think it will have less fallout onto this. it will require us to touch half a dozen repos in any case.

Ok, so if I'm understanding this correctly the issue is that they are building directly from master on this repository using pip and it would introduce extra complexity there during development due to having to install 2 packages?

If that's the case I think I have another option which would work better, it would be to just create 2 autobahn pypi packages, the origional autobahn without xbr and a separate autobahn-xbr or something like that which includes all the xbr stuff, with the xbr variant being the default one that gets built when doing pip install from the git repo, that way there should be no changes to the development process. This would be just be a change in the build process essentially.

@oberstet
Copy link
Contributor

From my understanding it's best practices for python packages to have tests be part of the git tree but not the pypi packages.

maybe? I wouldn't care that much. I also actually don't care about Debian policies - which definitely state the opposite!

my prios are simple: make it practically work great and secure for users - while making progress on lots of other things on my plate;)


OT: I had a look at

https://github.com/vbuterin/pybitcointools/blob/87806f3c984e258a5f30814a089b5c29cbcf0952/bitcoin/transaction.py#L46

mmmmh, granted, this looks a bit scary and obscure (non-pythonic) ..


If that's the case I think I have another option which would work better, it would be to just create 2 autobahn pypi packages ...

no, sorry, that's not what we want. you can fork the whole repo and remove anything you don't need

anyways, sorry, I have to work on other stuff now;)

@jameshilliard
Copy link
Contributor Author

no, sorry, that's not what we want. you can fork the whole repo and remove anything you don't need

I wasn't suggesting to create 2 repos, just to tweak the build process so that we generate 2 different pypi packages from the same codebase.

@oberstet
Copy link
Contributor

I wasn't suggesting to create 2 repos, just to tweak the build process so that we generate 2 different pypi packages from the same codebase.

the only change I would support is:

  • 1 repo and 1 package that includes the xbr code (as today, no change), but
  • split out the JSON files into a separate package created/published from xbr-protocol

this solves the size issue practically and addresses a real and valid user concern (that you raised) in my eyes.

it will result in some (limited) work on our side, but that's the balance with user concerns we're hoping to strike;)

maybe the actual issue you are trying to address (just a question) is a security hardening requirement:

"Our shipped product must only contain the minimum, stripped, strictly-required code"?

if so, then IMO this needs to be solved systematically during product image baking in a respective "remove/strip files" step.

trying to solve that on the library level is the wrong place ..

@jameshilliard
Copy link
Contributor Author

split out the JSON files into a separate package created/published from xbr-protocol

It doesn't really help a whole lot by itself as it wouldn't really simplify the buildroot packaging process for autobahn at all. I'm mostly looking for an upstream solution to avoid having to patch autobahn so heavily in buildroot as stripping lines like that is fairly brittle.

  • 1 repo and 1 package that includes the xbr code (as today, no change), but
  • split out the JSON files into a separate package created/published from xbr-protocol

One thing I can think of that may simplify the packaging while still staying within those parameters would be to add an environmental variable or something like that which can be passed to setup.py to tell it to exclude xbr modules from installation, that way the pypi repo would still contain xbr modules but there would be an option to strip the modules at install time essentially.

if so, then IMO this needs to be solved systematically during product image baking in a respective "remove/strip files" step.

The typical approach for having buildroot remove/strip files is to try and upstream flags/options to the respective upstream project that allow for excluding features/files, these flags would then be passed at build time to the package to tell it to not install the features/files, the reason this is the preferred approach is because removing/stripping files from applications is usually a process that is very specific to the application being stripped and is difficult to generalize, see my buildroot patch as en example for what it looks like when you don't have these flags/options.

@oberstet
Copy link
Contributor

It doesn't really help a whole lot by itself as it wouldn't really simplify the buildroot packaging process for autobahn at all.

I don't understand how that is, because if we split out the JSON files, adding only autobahn (and not the then new xbr ) would reduce the size by 8MB.

size problem solved.

what other problem are you refering to?

@jameshilliard
Copy link
Contributor Author

I think I found a non-intrusive buildroot compatible way to strip xbr modules in #1374 using an environmental variable. This technique could also be trivially modified to generate multiple flavors of autobahn pypi packages if desired.

what other problem are you refering to?

Currently to strip the xbr modules entirely one needs to hack up the setup.py file and purge directories as I did in my buildroot patch, this is a bit of a maintenance problem on the buildroot side.

@oberstet
Copy link
Contributor

I think I found a non-intrusive buildroot compatible way ..

yeah, awesome! I merged that. it indeed addresses all concerns=) happy we finally found one (well, you;) and thanks for your time+work!

@oberstet
Copy link
Contributor

fixed via #1374

@oberstet oberstet changed the title Autobahn should not install xbr modules unless requested Document AUTOBAHN_STRIP_XBR install-time option Apr 25, 2020
@oberstet
Copy link
Contributor

we should also document what AUTOBAHN_STRIP_XBR does (and is promising to continue to do):

  • not install autobahn.xbr modules

eg setting the env var while doing pip install autobahn[xbr] will install the xbr dependencies ...

@jameshilliard
Copy link
Contributor Author

eg setting the env var while doing pip install autobahn[xbr] will install the xbr dependencies ...

Think it's a good idea to have AUTOBAHN_STRIP_XBR disable xbr dependencies entirely when set since it's not really a sensible configuration to install autobahn[xbr] with a stripped autobahn?

I wish there was a way to programmatically detect which flavor of autobahn was requested, I poked through setuptools/pip code a bit looking for one but couldn't figure out a way.

@jameshilliard
Copy link
Contributor Author

@oberstet OT: If you're interested in chatting more about crypto use cases for autobahn I have a number of ideas/possible synergies with what I'm working on, I'm usually around in the #autobahn irc channel under the username Lightsword.

@oberstet
Copy link
Contributor

I wish there was a way to programmatically detect which flavor of autobahn was requested ..

yeah, if that would be available, we should just raise an exception at install time when a user had set AUTOBAHN_STRIP_XBR but also provided the xbr or all install flavors. that would be best.

if it is not possible to get the install flavor at run-time (reflect the value in use) - which I'm also not aware of - then the deps will be there, but not the modules. so xbr won't be effectively usable, but still added attack surface due to installed deps. well, that's life;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants