
Fork multyvac #17

Open · rgbkrk opened this issue Dec 31, 2014 · 10 comments

@rgbkrk (Member) commented Dec 31, 2014

As we've gone along implementing what we could glean of the spec, we've hit some known limitations of multyvac:

  • Doesn't support Python 3, which is too bad because pickling has gotten more advanced
  • Fails at pickling Pandas DataFrames (a quick round-trip sketch follows this list)
  • Built in a different time, with different assumptions (now we have Docker!)
  • The source is not open
  • There are no signs that point to multyvac ever coming online
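
A quick round-trip check shows what "fails at pickling" means in practice. This is only a sketch: the stdlib pickle below is a stand-in for whatever dumps/loads pair multyvac's own pickler actually provides.

import pickle
import pandas as pd

def roundtrips(obj, dumps=pickle.dumps, loads=pickle.loads):
    """Return True if obj survives a serialize/deserialize cycle."""
    try:
        restored = loads(dumps(obj))
    except Exception as exc:
        print("serialization failed: %r" % exc)
        return False
    return type(restored) is type(obj)

df = pd.DataFrame({"x": [1, 2, 3], "y": ["a", "b", "c"]})
print(roundtrips(df))  # stdlib pickle is fine here; multyvac's pickler reportedly is not
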
@rgbkrk (Member, Author) commented Dec 31, 2014

Instead of continually editing the comment above, here are the more salient points.

If we go our own direction, we can have independent clients that do proper serialization for other languages like node, Ruby, etc. This is sounding awfully similar to the Jupyter kernel spec, as well as sense engines...

@smashwilson (Member) commented:

👍 I'm definitely interested in doing this. I think the protocol is solid enough as a starting point (and we can always extend it with additional calls if we wanted to support more advanced use cases, like streaming results). I do want to keep going and support as much of multyvac as we can, because then existing code that's written against multyvac will work against cloudpipe without changes, which is pretty cool.

It looks like "cloudpipe" is available on PyPI, RubyGems, and npm (npm has a "cloud-pipe" package, though). How does pip install cloudpipe sound for talking to cloudpipe?

> If we go our own direction, we can have independent clients that do proper serialization for other languages like node, Ruby, etc. This is sounding awfully similar to the Jupyter kernel spec, as well as sense engines...

It would be awesome to provide a bridge to Jupyter kernels. We'd get instant server-side support for a ton of languages. We could add a "result_source" of "zeromq:" and expose that in the container... hmm. 💡

@rgbkrk (Member, Author) commented Dec 31, 2014

As I finally re-noticed, multyvac doesn't actually expose the same interface as PiCloud's cloud library for Python (Docs for PiCloud). Since so little was written about multyvac, we have no idea why they changed how jobs are created (or where cloud.map went).
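
Roughly, the calling conventions compare like this (from memory of the PiCloud docs, so treat the exact signatures as approximate rather than authoritative):

# Rough comparison only; signatures are from memory, not authoritative.

def square(x):
    return x * x

# PiCloud's cloud library:
#   import cloud
#   jid = cloud.call(square, 4)            # submit one job, get a jid back
#   print(cloud.result(jid))               # block until the result is ready
#   jids = cloud.map(square, range(10))    # fan a function out over an iterable
#
# multyvac:
#   import multyvac
#   jid = multyvac.submit(square, 4)       # submit one job, get a jid back
#   job = multyvac.get(jid)                # fetch a Job object for that jid
#   print(job.get_result())                # block until the result is ready
#   # ...with no obvious equivalent of cloud.map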

@rgbkrk (Member, Author) commented Dec 31, 2014

As I continue to dig here, it looks like PiCloud had a few configurable settings for pointing at alternate PiCloud clusters. This is from src/transport/network.py in the current release on PyPI:

__api_default_url = 'http://api.picloud.com/servers/list/'
server_list_url = cc.account_configurable('server_list_url',
                                          default=__api_default_url,
                                          comment="url to list of PiCloud servers",
                                          hidden=False)
# hack for users utilizing old api
if server_list_url == 'http://www.picloud.com/pyapi/servers/list/':
    server_list_url = __api_default_url

@rgbkrk (Member, Author) commented Dec 31, 2014

Makes me think they started with the "let's pickle Python code and run it remotely" idea, then pivoted to "let's run whatever with a more generic API".

@rgbkrk (Member, Author) commented Dec 31, 2014

From that server_list_url, they find an access point:

    def resolve_by_serverlist(self):
        self.serverlist_resolved_url = True

        try:
            # Must not call send_request as that can re-enter this function on failure
            resp = self.send_request_helper(self.server_list_url, {})
        except CloudException:
            self.diagnose_network() # raises diagnostic exception if diagnosis fails
            raise # otherwise something wrong with PiCloud

        for accesspoint in resp['servers']:
            try:
                cloudLog.debug('Trying %s' % accesspoint)
                # see if we can connect to the server
                req = urllib2.Request(accesspoint)
                resp = urllib2_file.urlopen(req, timeout = 30.0)
                resp.read()
            except Exception:
                cloudLog.info('Could not connect to %s', exc_info = True)
                pass
            else:
                self.url = str(accesspoint)
...
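
In other words, to keep an unmodified client happy, whatever replaces PiCloud just needs something answering at server_list_url with a JSON body whose 'servers' entry lists access points the client can reach with a plain GET. A minimal stand-in (hypothetical, stdlib-only, Python 2 to match the client code above; the real endpoint may also expect POSTs and auth that aren't visible in this snippet) could look like:

import json
from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer

# Hypothetical endpoints that actually serve the job API.
ACCESS_POINTS = ['http://127.0.0.1:8000/']

class ServerListHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps({'servers': ACCESS_POINTS})
        self.send_response(200)
        self.send_header('Content-Type', 'application/json')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    # The client may POST to server_list_url; answer it the same way.
    do_POST = do_GET

if __name__ == '__main__':
    HTTPServer(('0.0.0.0', 8000), ServerListHandler).serve_forever()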

@rgbkrk (Member, Author) commented Jan 22, 2015

Welp, I got multyvac to crap out on a simple job that submitted too quickly:

import multyvac
import os

api_url = os.environ['CLOUDPIPE_URL']
multyvac.config.set_key(api_key=os.environ['OS_USERNAME'],
                        api_secret_key=os.environ['OS_PASSWORD'],
                        api_url=api_url)

print(multyvac.get(multyvac.shell_submit('ls -la')).get_result())

Traceback (most recent call last):
  File "pi.py", line 12, in <module>
    print(multyvac.get(multyvac.shell_submit('ls -la')).get_result())
  File "/usr/local/lib/python2.7/site-packages/multyvac/job.py", line 342, in shell_submit
    return r['jids'][0]
KeyError: 'jids'

I'll look into this later, but I think there needs to be better handling on multyvac's side.
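
For example (just a sketch of the kind of handling I mean; a real fix belongs inside multyvac's job.py, where the raw API response is in scope), the client could at least turn the bare KeyError into something descriptive:

import multyvac

def shell_submit_checked(cmd, **kwargs):
    # Hypothetical wrapper: multyvac.shell_submit raises KeyError: 'jids'
    # when the API response carries no job ids.
    try:
        return multyvac.shell_submit(cmd, **kwargs)
    except KeyError as exc:
        raise RuntimeError("shell_submit(%r): API response had no %s key -- "
                           "the job was probably rejected" % (cmd, exc))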

@rgbkrk (Member, Author) commented Jan 22, 2015

Weird. Sometimes multyvac.shell_submit('ls -la') gets back jids and sometimes it doesn't. Guess I'll need to dig into that.
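
Until the root cause turns up, a blunt workaround (a sketch only, not a fix) is to retry the submit a few times:

import time
import multyvac

def shell_submit_retry(cmd, attempts=3, delay=1.0, **kwargs):
    # Hypothetical retry loop for the intermittent missing-'jids' response.
    for attempt in range(attempts):
        try:
            return multyvac.shell_submit(cmd, **kwargs)
        except KeyError:
            if attempt == attempts - 1:
                raise
            time.sleep(delay)

jid = shell_submit_retry('ls -la')
print(jid)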

@rgbkrk (Member, Author) commented Jan 22, 2015

In [1]: %run pi.py
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/Users/kyle6475/go/src/github.com/rgbkrk/cloudpipe-rackspace-demo/pi.py in <module>()
     10                         api_url=api_url)
     11
---> 12 jid = multyvac.shell_submit('ls -la')
     13 print(jid)
     14

/usr/local/lib/python2.7/site-packages/multyvac/job.pyc in shell_submit(self, cmd, _name, _core, _multicore, _layer, _vol, _env, _result_source, _result_type, _max_runtime, _profile, _restartable, _tags, _depends_on, _stdin)
    340                                data=payload,
    341                                content_type_json=True)
--> 342         return r['jids'][0]
    343
    344     def _get_auto_module_volume_name(self):

KeyError: 'jids'

In [2]: !cat pi.py
#!/usr/bin/env python
# -*- coding: utf-8 -*-

import multyvac
import os

api_url = os.environ.get('CLOUDPIPE_URL', 'http://api.cloudpi.pe/v1/')
multyvac.config.set_key(api_key=os.environ['OS_USERNAME'],
                        api_secret_key=os.environ['OS_PASSWORD'],
                        api_url=api_url)

jid = multyvac.shell_submit('ls -la')
print(jid)



In [3]: %paste
import multyvac
import os

api_url = os.environ.get('CLOUDPIPE_URL', 'http://api.cloudpi.pe/v1/')
multyvac.config.set_key(api_key=os.environ['OS_USERNAME'],
                        api_secret_key=os.environ['OS_PASSWORD'],
                        api_url=api_url)

jid = multyvac.shell_submit('ls -la')
print(jid)

## -- End pasted text --
12

@rgbkrk (Member, Author) commented Jan 22, 2015

On a separate note, make sure to use fig rm -v instead of fig rm so you don't leave loose volumes lying around.

@smashwilson added this to the v0.0.3 milestone on Feb 2, 2015