Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider making the sdk package a separate module #657

Closed
sagikazarmark opened this issue Jan 2, 2022 · 29 comments
Closed

Consider making the sdk package a separate module #657

sagikazarmark opened this issue Jan 2, 2022 · 29 comments

Comments

@sagikazarmark
Copy link
Contributor

It's a common practice to make packages that are supposed to be consumed by third party code (eg. plugins) separate modules. (Example: api packages in most Hashicorp software).

This would make the SDK package lighter in terms of dependencies (it doesn't need all deps of SFTPGo).

At the same time, until plugins are experimental, it might be easier to keep it in the same module (although making an existing package a module later has it's own quirks).

@drakkan
Copy link
Owner

drakkan commented Jan 2, 2022

Yes, this is the intended behaviour, I have to find the time to move some code around to make it happen.

@drakkan
Copy link
Owner

drakkan commented Jan 3, 2022

I made a quick attempt:

0001-WIP-sdk-separate-package.zip

The sdk does not import the kms package anymore but json deserialization fails because kms Secrets are now an interface and not a struct. After removing kms imports we should remove logger imports (this should be easier) to have a separate package.

For now I give up, I have other priorities and the Go compiler is smart enough to only include the real imported packages in the plugin binaries and not all the sftpgo deps.

@drakkan
Copy link
Owner

drakkan commented Jan 4, 2022

This is basically done with the latest commit. As expected there is no change in the plugins binary size (go.sum is lighter).

My first idea was to move the sdk package to github.com/sftpgo/sdk but if you look at vault they use a different approach

https://github.com/hashicorp/vault

for example sdk is a separate module but it lives in the same repo

https://github.com/hashicorp/vault/tree/main/sdk

They set tags such as sdk/v0.3.0. google-cloud-go does the same. So probably I have to leave the sdk package in this repo. Do you have experience with this? Thanks

@sagikazarmark
Copy link
Contributor Author

Using the sdk/v0.3.0 style (which is basically a "submodule" of the main module) is not really encouraged by Go unless absolutely necessary. Still, it works quite well in situations when you want to provide a client/SDK for your software and some of the code is shared between the two.

It makes evolution of the two modules easier (eg. breaking changes are easier this way).

This is what I meant when I originally suggested to make it a separate module.

If necessary/feasible, it can always be moved out to a separate repo as well later.

@drakkan
Copy link
Owner

drakkan commented Jan 4, 2022

Thanks, I have come to the same conclusion, there is some shared code and it will remain so in the future, eg. the user structure is defined in the sdk package and then used in sftpgo as well. I'll be using the same repo for now

@drakkan drakkan closed this as completed in 72fadd9 Jan 4, 2022
@drakkan
Copy link
Owner

drakkan commented Jan 4, 2022

uhm ...there are issues importing the module from external projects and the Docker build fails. I'll revert for now. Maybe I have to add a tag, no more time for today

@drakkan drakkan reopened this Jan 4, 2022
@sagikazarmark
Copy link
Contributor Author

Maybe I have to add a tag, no more time for today

You either have to tag or ensure to use the latest commit. Making an existing package a submodule is technically not backwards compatible, because go sometimes works with several versions of the same package at the same time (eg. pulled in as transitive dependency) and gets confused.

The solution is to tag a version and make sure that nothing imports the old version of the package.

@sagikazarmark
Copy link
Contributor Author

eg. the user structure is defined in the sdk package and then used in sftpgo as well

On a side note: unless it's necessary to use the same type from the same package (eg. in the plugin interface that plugins implement and the server uses), it might be better to duplicate structs and keep them in sync. Typically, structs that get encoded and transmitted that way between callers and the service can be duplicated.

Bottom line is: if you don't need to import a specific type from a specific package, it doesn't necessarily have to be shared code.

@drakkan
Copy link
Owner

drakkan commented Jan 5, 2022

The sdk package contains now only structs definition and the kms base implementation. This is required because the current kms plugins work this way:

If a master key is provided we first encrypt the plaintext data using the SFTPGo local provider and then we encrypt the resulting payload using the Cloud provider and store this ciphertext.

Maybe I can remove something from the sdk/kms package too.

Regarding the structs duplication I'm trying to minimize them, see for example here: VirtualFolders and FsConfig are duplicated to avoid to import the vfs package within the sdk.

I want to think a bit more about this but I'll probably make a separate package, for example github.com/sftpgo/sdk this seems cleaner and I want to use a different license for the sdk (LGPL or Apache) to allow users to develop proprietary plugins. Having multiple licenses in the same repo could be confusing

@drakkan drakkan closed this as completed in 1f619d5 Jan 6, 2022
@sagikazarmark
Copy link
Contributor Author

For what it's worth I tested the new SDK with the latest build of SFTPGo and a rewritten version of a hook into a plugin: works like a charm.

@drakkan
Copy link
Owner

drakkan commented Jan 7, 2022

Thanks for the confirmation, the SDK is now what it was originally intended. Sometime I need an open issue to find enough motivations to complete a task :). I'm a little lazy I suppose ...

As for plugins there are two more things in my TODO for a long time:

  1. understand if it is possible to have a single executable that implement multiple plugin interfaces. For example a single executable that implement both the eventsearcher and the notifiers plugins. I've done some quick tests in the past and this doesn't seem possible, but my tests haven't been accurate enough to be absolutely sure
  2. explore the best way to support io.Reader/Writer in plugins (GRPCBroker usage I think). This is required to expose vfs implementations as plugins. However browsing the go-plugins issues it seems there is a size limit for the trasmitted data that could be an issue for this usage

If you have time/interest it would be really helpful if you could help with these tasks, thanks

@sagikazarmark
Copy link
Contributor Author

Sometime I need an open issue to find enough motivations to complete a task :). I'm a little lazy I suppose ...

There is nothing wrong with that. :)

understand if it is possible to have a single executable that implement multiple plugin interfaces.

Good question. Not sure if hashicorp ever does that. It sounds like a nice to have.

This is required to expose vfs implementations as plugins

I'm not sure I'd do that or at least not for the majority of implementations, purely for performance reasons. Hooks that are async are not that important to be "fast", even pre-hooks can have a little bit of penalty, because they are called once.

I can see why extensibility could be important though. One thing that comes to mind is that the S3 API is kinda standard these days. A bunch of different storage solutions implement it to make integrations easier, so if someone needs a really custom storage solution integrated, chances are the S3 API will work for them.

Honestly, until there isn't any use case for supporting more file systems, I'm not sure I'd put efforts into generalizing the file system support.

If you want to open separate issues for these topics, we can probably discuss them more in-depth there and it'd be easier to track.

I'll definitely play a little with plugins, so chances are I will have some time looking into the first one.

@drakkan
Copy link
Owner

drakkan commented Jan 7, 2022

Sometime I need an open issue to find enough motivations to complete a task :). I'm a little lazy I suppose ...

There is nothing wrong with that. :)

understand if it is possible to have a single executable that implement multiple plugin interfaces.

Good question. Not sure if hashicorp ever does that. It sounds like a nice to have.

This is required to expose vfs implementations as plugins

I'm not sure I'd do that or at least not for the majority of implementations, purely for performance reasons. Hooks that are async are not that important to be "fast", even pre-hooks can have a little bit of penalty, because they are called once.

I can see why extensibility could be important though. One thing that comes to mind is that the S3 API is kinda standard these days. A bunch of different storage solutions implement it to make integrations easier, so if someone needs a really custom storage solution integrated, chances are the S3 API will work for them.

Honestly, until there isn't any use case for supporting more file systems, I'm not sure I'd put efforts into generalizing the file system support.

If you look at the SFTPGo forks you can see things like these

https://github.com/mever/sftpgo/blob/cli/vfs/clifs.go
https://github.com/Smithx10/sftpgo/blob/add_manta/vfs/manta.go

some users asked for OneDrive support etc.. I don't want to add all the possible storage providers in the SFTPGo core, I think the current vfs are enough.

This is a very low priority task for me and if I ever add a vfs plugin I have to make sure it works for all use cases and with acceptable performance

If you want to open separate issues for these topics, we can probably discuss them more in-depth there and it'd be easier to track.

I'll definitely play a little with plugins, so chances are I will have some time looking into the first one.

great, thanks

@sagikazarmark
Copy link
Contributor Author

Yeah, I'm not saying it shouldn't be done one way or another, I'm saying the current providers should not be converted into plugins.

One obvious solution for people is to keep forking and adding their own solutions. That will always be the most performant, so if a plugin based solution ever gets implemented, forking should probably be kept around and documented as the best solution from a performance perspective.

As for adding plugin support for other vfs solutions: maybe it doesn't have to be the same plugin based solution, but a simplified, custom protocol that's easy enough to implement.

An HTTP based solution is probably a good example, although it might still not be the most performant one. But users could easily just implement an interface, run their own "proxy" service in front of the storage that implements sftpgo's internal protocol.

Might be a huge venture though and I'm not sure it's worth the effort.

@drakkan
Copy link
Owner

drakkan commented Jan 7, 2022

Yes, the current providers will remain builtin, plugins will be eventually used for providers that I don't want to add directly to SFTPGo.

An HTTP/WebSocket based solution is exactly what I had in my mind, but not sure if I'll ever implement it.

It mostly depends on whether someone wants to sponsor such a development.

@sagikazarmark
Copy link
Contributor Author

It mostly depends on whether someone wants to sponsor such a development.

Yeah, that sounds like a good approach. Maybe you could set up a project board for ideas like this: features that are not likely to land, unless someone sponsors it. Enterprise oriented features, like custom VFS, auditing, etc.

@sagikazarmark
Copy link
Contributor Author

@drakkan any plans for tagging a new release? Anything I can do to help?

@drakkan
Copy link
Owner

drakkan commented Jan 13, 2022

I want to ship the UI improvement before v2.3 so I'll backport them and the sdk patch in v2.2.x branch and these changes will be included in v2.2.2. However making a new release still requires several time consuming manual steps (aws marketplace package, ubuntu ppa packages etc.).

I'm looking to finalize a major sponsorship these days and I'm focused on that. I think it will take a few weeks before the next update, sorry

@sagikazarmark
Copy link
Contributor Author

That's absolutely fine. Having an estimated timeline is more than enough for me to plan my next steps. Thanks!

@sagikazarmark
Copy link
Contributor Author

Also, congrats on the sponsorship! Good to hear news like this in times like this.

drakkan added a commit that referenced this issue Jan 13, 2022
The SFTPGo SDK now is at the following URL

https://github.com/sftpgo/sdk

Fixes #657

Signed-off-by: Nicola Murino <nicola.murino@gmail.com>
@drakkan
Copy link
Owner

drakkan commented Jan 13, 2022

Also, congrats on the sponsorship! Good to hear news like this in times like this.

Thanks, I just backported the related changes to 2.2.x branch, let's see if some bug will be reported in the next weeks

@sagikazarmark
Copy link
Contributor Author

Hm, that branch doesn't publish a Docker image, does it?

@drakkan
Copy link
Owner

drakkan commented Jan 13, 2022

Hm, that branch doesn't publish a Docker image, does it?

no, it only runs CI

@drakkan
Copy link
Owner

drakkan commented Jan 13, 2022

I enabled docker on that branch 8cd9e88

if this does not work can you please provide a patch? We should have the branch name in the resulting image, thank you

@sagikazarmark
Copy link
Contributor Author

Looks great, thanks!

@sagikazarmark
Copy link
Contributor Author

Friendly ping: is there a release coming our way? I've been running the current version in a dev environment and it works great.

@drakkan
Copy link
Owner

drakkan commented Feb 3, 2022

Maybe the next weekend, ideally I would like to see this merged, otherwise I have to use a replace. Currently SFTPGo always open a socket connection even if logging to journald is not enabled (it can be enabled only for the subsystem mode)

@sagikazarmark
Copy link
Contributor Author

Thanks for the update!

@drakkan
Copy link
Owner

drakkan commented Feb 3, 2022

Hi,

please give a try to the latest update (once CI ends), there are a lot of dependency updates. A notable update for your use case is the go storage library to v1.19. The current version will be likely tagged as v2.2.0 this weekend. Thank you

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants