I've been trying to test dvc with ssh remotes, but have yet to get it to work. I keep getting paramiko exceptions that seem to indicate that it (or dvc's usage of it) is unable to handle a lot of common ssh use cases (proxy commands, cert-authorities, etc.). I think paramiko's ssh model is fundamentally flawed, since it seems to require the user to re-implement a lot of the connection/authentication logic that the ssh CLI handles. That puts a huge burden on its users, given how many use cases they'll have to figure out and support, and it will lead to a support nightmare for products that use it, since you'll never be able to rely on the ssh CLI as a means of debugging.
I strongly suggest you try to abstract the remote backend to allow for swapping out the use of paramiko with direct calls to the ssh CLI. I think any backend abstraction will be time well spent since it will probably also allow you to support other remotes more easily.
I suggest looking at what git-annex is doing, since they support SSH remotes out-of-the-box with no issues whatsoever (even with my peculiar ssh config). They appear to just be exec'ing the ssh client directly (you can see the ssh commands in their verbose logging output). They also support a wide variety of special remotes.
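To illustrate what "exec'ing the ssh client directly" could look like from Python, here is a minimal sketch. The function names (`build_ssh_command`, `run_over_ssh`) and the host name in the usage note are made up for this example; this is not how dvc or git-annex actually do it. The point is that the user's own ssh configuration (proxy commands, certificates, agents) is honored because the real `ssh` binary does the connecting.

```python
import shlex
import subprocess


def build_ssh_command(host, remote_argv, ssh_binary="ssh"):
    """Build the local argv that runs remote_argv on host via the ssh client.

    Hypothetical helper for illustration only.
    """
    # BatchMode=yes makes ssh fail instead of prompting for a password,
    # which suits non-interactive use from a program.
    # The remote side hands the command to a shell, so quote each argument.
    return [ssh_binary, "-o", "BatchMode=yes", host, shlex.join(remote_argv)]


def run_over_ssh(host, remote_argv):
    """Run a command on host and return its stdout as text."""
    result = subprocess.run(
        build_ssh_command(host, remote_argv),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

For example, `run_over_ssh("example-host", ["ls", "-la", "/data"])` would behave the same as typing `ssh -o BatchMode=yes example-host 'ls -la /data'` in a terminal, so any failure can be reproduced and debugged with the plain ssh CLI.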
For the record: we've discussed before that we might support different backends, as we do with git: https://github.com/iterative/dvc/tree/master/dvc/scm/git/backend
It will likely be a similar situation here: CLI ssh, pure-Python paramiko, and libssh. Or maybe we could use CLI ssh for auth and then reuse the channel with another library, since that's where the majority of the problems usually occur.
A very long time ago we used the ssh CLI in dvc, but switched to paramiko because it was much easier to use programmatically and it works just fine in the majority of (simple) cases.
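A rough sketch of what swappable ssh backends might look like, assuming a simple name-to-class registry. All names here (`SSHBackend`, `CLIBackend`, `BACKENDS`, `get_backend`) are hypothetical and are not taken from dvc's code base. Only the CLI backend is filled in; paramiko or libssh backends would be registered the same way.

```python
import subprocess
from abc import ABC, abstractmethod


class SSHBackend(ABC):
    """Hypothetical common interface that every ssh backend implements."""

    @abstractmethod
    def run(self, host: str, command: str) -> str:
        """Run command on host and return its stdout."""


class CLIBackend(SSHBackend):
    """Backend that delegates connection and auth to the ssh binary."""

    def argv(self, host: str, command: str) -> list:
        # The user's ~/.ssh/config is picked up by ssh itself.
        return ["ssh", host, command]

    def run(self, host: str, command: str) -> str:
        result = subprocess.run(
            self.argv(host, command),
            capture_output=True,
            text=True,
            check=True,  # raise on a non-zero exit status
        )
        return result.stdout


# Registry of available backends; "paramiko" or "libssh" entries would be
# added here once such classes exist.
BACKENDS = {"cli": CLIBackend}


def get_backend(name: str) -> SSHBackend:
    """Instantiate the backend registered under name."""
    try:
        return BACKENDS[name]()
    except KeyError:
        known = ", ".join(sorted(BACKENDS))
        raise ValueError(f"unknown ssh backend {name!r} (known: {known})") from None
```

With something like this, the backend could be chosen per remote from configuration, e.g. `get_backend("cli").run("example-host", "ls")`, while the rest of the remote code only depends on `SSHBackend`.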
efiop changed the title from "many problems with ssh remotes, suggest replacing paramiko with direct usage of ssh" to "ssh: support different backends (CLI ssh, paramiko, libssh, etc)" on Feb 3, 2021
Good luck!
Discord context: https://discord.com/channels/485586884165107732/485596304961962003/806310490589757450