Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SNOW-884632: Move default config files to a "standard" location #1678

Open
sfc-gh-jhollan opened this issue Aug 2, 2023 · 13 comments
Open
Labels
feature status-triage_done Initial triage done, will be further handled by the driver team

Comments

@sfc-gh-jhollan
Copy link

What is the current behavior?

Today we write config files to
~/Library/Application Support/snowflake/filename
and the AppData directory in Windows

What is the desired behavior?

This is a normal place for GUI app config, but seems very strange for SDK place. Usually I'm used to:

~/.snowflake/filename
or %HOME_DIRECTORY%/.snowflake/filename

Similar to where like AWS and Azure store config files

How would this improve snowflake-connector-python?

When I go poking around to edit my config (e.g. update password) I'd know where to look :D

References and other background

https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html#Where-are-configuration-settings-stored

@github-actions github-actions bot changed the title Move default config files to a "standard" location SNOW-884632: Move default config files to a "standard" location Aug 2, 2023
@sfc-gh-sfan
Copy link
Contributor

This feels like a big behavior change so we need to handle it carefully, I'm a bit worried that the benefit might not worth the effort 🤔 cc @sfc-gh-mkeller who worked on config file improvements.

@sfc-gh-jhollan
Copy link
Author

Do we know how long the file has been here? If it's a breaking change I am less passionate. I thought this was a newer functionality as part of our unified config story

@sfc-gh-bwarsaw
Copy link
Contributor

@sfc-gh-mkeller can confirm, but I think we only use platform dirs if ~/.snowflake/ doesn't exist. If it does, then we use that. I think also we don't create ~/.snowflake/ if it doesn't exist yet.

@sfc-gh-jhollan
Copy link
Author

that's very possible. Any reason we don't just create ~/.snowflake/ if it doesn't exist and we'd default to the app location?

@sfc-gh-sfan
Copy link
Contributor

Oh if we already prefer ~/.snowflake but just fallback to app location then the scope of the change would be much smaller than I thought :)

@sfc-gh-mkeller
Copy link
Collaborator

sfc-gh-mkeller commented Aug 3, 2023

Any reason we don't just create ~/.snowflake/ if it doesn't exist and we'd default to the app location?

@sfc-gh-jhollan this is a library, we leave doing that to actual applications. It'd be a really weird experience if importing a Python module would create directories and config files. Also because if that folder doesn't exist we fall back to the other places, so this doesn't really work well with what we already have.

@sfc-gh-mkeller
Copy link
Collaborator

However; note that this folder is configurable with the SNOWFLAKE_HOME environment variable. So if you set that variable and that folder exists then we use that for everything! Cache files, config files and so on

@sfc-gh-jhollan
Copy link
Author

It'd be a really weird experience if importing a Python module would create directories and config files

In general agree. I could see us wanting to push this to the apps themselves. But I thought part of the goal here was that we wanted all apps (Snowpark, CLI, Snowpark ML) to be able to use a unified file. I could pivot this issue and say "hey SnowCLI, when you create a config file (which I do think makes sense, especially when running something like snow connection add), you should set SNOWFLAKE_HOME to ~/.snowflake and create the folder if it doesn't exist. My only concern is then the other apps that build on this may have to implement the same logic, or else we should just unify on them doing this Library directory.

At the end of the day a folder and file will need to be created (ideally not by just import, but a gesture or import of this CONFIG_FILE will be used to help in doing that), so just trying to think through how we can make that consistent across all apps and this feels like the one common denominator.

@sfc-gh-jhollan
Copy link
Author

sfc-gh-jhollan commented Aug 3, 2023

ah wait I see code here - maybe this is standard python behavior to use PlatformDirs instead of just another spot for config. dunno. Will yield on you all / @sfc-gh-bwarsaw for that. If this is the case and the code stays as follows than having the CLI create the folder if it doesn’t exist does seem like it should work / be picked up by other clients using the shared connector (minus the fact that the default if SNOWFLAKE_HOME not defined I think would fail on Windows)

snowflake_home = os.path.expanduser(
os.environ.get("SNOWFLAKE_HOME", "~/.snowflake/"),
)
if os.path.exists(snowflake_home):
return SFPlatformDirs(
snowflake_home,
**platformdir_kwargs,
)
else:
# In case SNOWFLAKE_HOME does not exist we fall back to using
# platformdirs to determine where system files should be placed. Please
# see docs for all the directories defined in the module at
# https://platformdirs.readthedocs.io/
return PlatformDirs(**platformdir_kwargs)

Though still a little stuck on the fact that either way a folder gets created. It's really just "how obscure / out of the way is the folder that's being created" - which isn't necessarily ideal when the folder / file that gets created we do want people to be able to edit / update / find.

@sfc-gh-bwarsaw
Copy link
Contributor

To follow up here from Slack: I think it's fine (maybe even desired) for the config library to provide a function which creates the directory if it doesn't exist, perhaps even to populate it, but we definitely do not want this to happen implicitly at import time. In general, you want to keep module globals as lean as possible to imports have the least effect on start up times. It's usually bad practice to do a lot of work at import time (e.g. stat'ing a directory, creating it if missing, populating it, etc.) because that effects every downstream dependent library and application. It's better for consuming applications to decide when to do that check (e.g. snow could defer that so it doesn't happen at --help time).

@sfc-gh-jhollan
Copy link
Author

sfc-gh-jhollan commented Aug 3, 2023

ok maybe we're all aligned roughly - I think my proposal would then: could we just remove this if/else statement always resolve snowflake dirs to snowflake home? just removing the if os.path.exists?

(this is partially based on the assumption that even today doing an import connector doesn't go and create a bunch of folders / files, it looks like there's just a method to provide a hint to clients where config files could be read from -- this CONFIG_FILE that snowcli is using)

@sfc-gh-bwarsaw
Copy link
Contributor

@sfc-gh-jhollan

I think my proposal would then: could we just remove this if/else statement always resolve snowflake dirs to snowflake home?

Are you suggesting we do not support platform dirs, including $XDG_CONFIG_HOME etc. on Linux? That might break other assumptions users expect. OTOH, $SNOWFLAKE_HOME might serve our needs although that kind of mixes files like config and cache files.

It's just that if we aren't going to support platform dirs, I want that to be a deliberate decision.

@sfc-gh-jhollan
Copy link
Author

I don't know enough about what XDG_CONFIG_HOME is.. but maybe stepping back, here's my hope:

  1. A new user who installs the Snowflake CLI and does snow connection add will have a config file at ./snowflake/config.toml that has the info. If they need to edit / update it, they can edit / update it there
  2. The same ideally would be true for the VS code extension and other client side tools that require a connection to Snowflake
  3. If a user wants to use any of our connectors (Python / Snowpark for now), they could follow 1/2 above, or they could just manually create ./snowflake/config.toml - either way, when I do something like snowflake.connect(connection='my_connection') and it will look at that file and connect.

I think in general all of these potentially could work today (I don't know what #3 looks like yet, but I assume we're headed that way). The main issue is that for #1, the CLI has logic that says "import CONFIG_FILE from this library, and that's where to create the config.toml". If ~/.snowflake doesn't exist, CONFIG_FILE returns this ~/Library... directory. I would love to standardize this config.toml in the ~/.snowflake location.

So I don't know the best way to do that. Maybe we need a new global var for client tools to know where to create / look for config? Maybe we remove the else statement? I don't know. I don't really want to mess up any other parts of the library or any other config / files we may or may not write, but for this config data it feels weird to me to have in such a nested / undiscoverable location

@sfc-gh-dszmolka sfc-gh-dszmolka added status-triage_done Initial triage done, will be further handled by the driver team and removed needs triage labels Mar 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature status-triage_done Initial triage done, will be further handled by the driver team
Projects
None yet
Development

No branches or pull requests

5 participants