Airflow doesn't re-use a secrets backend instance when loading configuration values #25555
Fixes apache#25555. This uses `functools.lru_cache` to share one secrets backend instance between the `conf` global, when it loads configuration values from secrets, and callers outside the `configuration` module, such as variable and connection lookups. Previously, each fetch of a configuration value from secrets constructed its own secrets backend instance.
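The mechanism here is plain `functools.lru_cache` memoization of the backend factory. A minimal standalone sketch of the idea (using a hypothetical `DemoBackend` stand-in, not the actual Airflow classes):

```python
import functools


class DemoBackend:
    """Stand-in for a secrets backend; hypothetical, not an Airflow class."""

    instances = 0

    def __init__(self, url):
        DemoBackend.instances += 1  # stands in for an expensive login
        self.url = url


@functools.lru_cache(maxsize=2)
def get_backend(backend_cls, url):
    # Same (class, arguments) -> same cached instance, so any expensive
    # login in __init__ happens once per process instead of once per lookup.
    return backend_cls(url)


a = get_backend(DemoBackend, "http://127.0.0.1:8200")
b = get_backend(DemoBackend, "http://127.0.0.1:8200")
assert a is b
assert DemoBackend.instances == 1
```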
This is not possible due to the distributed nature of Airflow. The secrets client can potentially be used by the scheduler and by multiple workers. Even if you just use the LocalExecutor, you have potentially many independent processes, and you cannot share a client between processes (even if they are forked it is problematic, and when they are spawned it is impossible). Solving the problem would basically require implementing some kind of agent that shields the connection from Airflow. Luckily, Vault has a ready-to-use solution that solves your problem; you just need to deploy it: https://www.vaultproject.io/docs/agent/caching
I understand that the instance will not be used across processes, but aren't there cases where the same process needs to look up multiple secret configuration values (for example, the webserver secret key and the SQLAlchemy conn), or needs to look up both a secret configuration value and a variable or connection? My understanding is that this is sometimes the case, but I could be wrong. It's also possible that, even though it might help in a few cases, it isn't worth the code change and the potential for bugs if a bad value gets cached for some reason, which I totally understand.
This already happens. The secrets backend client is reused within a single process.
I don't believe it does when looking up secrets for configuration values. If you look at it, I assume (perhaps wrongly) that this is either an oversight or due to some circular issue where
Have you seen the
That is the change from my PR at #25556. Looks like I accidentally linked to the git SHA from my commit where I added caching there (sorry). If you look at main, it looks like this:

```python
def get_custom_secret_backend() -> Optional[BaseSecretsBackend]:
    """Get Secret Backend if defined in airflow.cfg"""
    secrets_backend_cls = conf.getimport(section='secrets', key='backend')
    if secrets_backend_cls:
        try:
            backends: Any = conf.get(section='secrets', key='backend_kwargs', fallback='{}')
            alternative_secrets_config_dict = json.loads(backends)
        except JSONDecodeError:
            alternative_secrets_config_dict = {}
        return secrets_backend_cls(**alternative_secrets_config_dict)
    return None
```
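To illustrate the problem with the main version above (using a hypothetical stand-in class, not Airflow code): every call re-runs the constructor, so any login cost in `__init__` is paid once per configuration lookup.

```python
class CountingBackend:
    """Hypothetical stand-in for a secrets backend."""

    logins = 0

    def __init__(self, **kwargs):
        CountingBackend.logins += 1  # stands in for an expensive Vault login


def get_custom_secret_backend():
    # Mirrors the uncached version on main: a fresh instance per call.
    return CountingBackend(url="http://127.0.0.1:8200")


get_custom_secret_backend()
get_custom_secret_backend()
assert CountingBackend.logins == 2  # one login per lookup
```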
Yeah, this is messy. But do you know what happens if you put that cache in when you have recursive retrieval? How will it behave? It seems that extracting this did not change the recursion. Do you know how `lru_cache()` behaves when it is hit several times during the same method call stack?
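For what it's worth, a minimal sketch of how CPython's `lru_cache` behaves under reentrant calls (a generic example, not Airflow code): as I understand it, a result is stored only when the call returns, so nested calls with a different key each execute once, while a reentrant call with the *same* key would re-run the body (and recurse).

```python
import functools

calls = []


@functools.lru_cache(maxsize=None)
def fetch(key):
    calls.append(key)
    if key == "outer":
        # Re-entering with a *different* key while "outer" is still
        # executing: each distinct key is computed once and cached.
        return fetch("inner") + "-outer"
    return "inner-value"


fetch("outer")
fetch("outer")   # served from cache, body not re-run
fetch("inner")   # already cached by the nested call above
print(calls)     # ['outer', 'inner'] - each key computed exactly once
```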
Yeah, the recursion between conf and secrets isn't eliminated by this change. I thought it seemed relatively safe to cache based on the key of

```python
def get_custom_secret_backend() -> Optional[BaseSecretsBackend]:
    """Get Secret Backend if defined in airflow.cfg"""
    secrets_backend_cls = conf.getimport(section='secrets', key='backend')
    if secrets_backend_cls:
        try:
            backends: Any = conf.get(section='secrets', key='backend_kwargs', fallback='{}')
            alternative_secrets_config_dict = json.loads(backends)
        except JSONDecodeError:
            alternative_secrets_config_dict = {}
        return _custom_secrets_backend(secrets_backend_cls, **alternative_secrets_config_dict)
    return None


@functools.lru_cache
def _custom_secrets_backend(secrets_backend_cls, **alternative_secrets_config_dict):
    """Separate function to create secrets backend instance to allow caching"""
    # Note: the kwargs are unpacked so the cache key stays hashable
    # (passing the dict itself positionally would raise TypeError in lru_cache).
    return secrets_backend_cls(**alternative_secrets_config_dict)
```

That way the cache only cares about the values from conf, without using An

If you are asking about thread safety and
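Since the question above trails off at thread safety: a generic sketch (not Airflow code) of `functools.lru_cache` behavior in CPython as I understand it. The cache itself is thread-safe, but the wrapped function is not locked, so two threads that miss at the same time can both construct an instance; afterwards a single cached instance is served to everyone.

```python
import functools
import threading

call_count = 0
barrier = threading.Barrier(2)


@functools.lru_cache(maxsize=None)
def make_backend(url):
    global call_count
    # lru_cache does not lock around the wrapped function: two threads
    # that miss the cache at the same time may both execute the body.
    barrier.wait(timeout=5)
    call_count += 1
    return object()


threads = [threading.Thread(target=make_backend, args=("vault://",)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Both threads raced past the cache miss, so the factory ran twice...
assert call_count == 2
# ...but subsequent calls all see a single cached instance.
assert make_backend("vault://") is make_backend("vault://")
```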
Did you test whether it is really needed (i.e. is the Vault client not caching the connection internally)?
```python
@mock.patch("airflow.providers.hashicorp._internal_client.vault_client.hvac")
@conf_vars(
    {
        ("secrets", "backend"): "airflow.providers.hashicorp.secrets.vault.VaultBackend",
        ("secrets", "backend_kwargs"): '{"url": "http://127.0.0.1:8200", "token": "token"}',
    }
)
def test_config_from_secret_backend_caches_instance(self, mock_hvac):
    """Get Config Value from a Secret Backend caches the instance"""
    test_config = '''[test]
sql_alchemy_conn_secret = sql_alchemy_conn
secret_key_secret = secret_key
'''
    test_config_default = '''[test]
sql_alchemy_conn = airflow
secret_key = airflow
'''
    mock_client = mock.MagicMock()
    mock_hvac.Client.return_value = mock_client

    def fake_read_secret(path, mount_point, version):
        if path.endswith('sql_alchemy_conn'):
            return {
                'request_id': '2d48a2ad-6bcb-e5b6-429d-da35fdf31f56',
                'lease_id': '',
                'renewable': False,
                'lease_duration': 0,
                'data': {
                    'data': {'value': 'fake_conn'},
                    'metadata': {
                        'created_time': '2020-03-28T02:10:54.301784Z',
                        'deletion_time': '',
                        'destroyed': False,
                        'version': 1,
                    },
                },
                'wrap_info': None,
                'warnings': None,
                'auth': None,
            }
        if path.endswith('secret_key'):
            return {
                'request_id': '2d48a2ad-6bcb-e5b6-429d-da35fdf31f56',
                'lease_id': '',
                'renewable': False,
                'lease_duration': 0,
                'data': {
                    'data': {'value': 'fake_key'},
                    'metadata': {
                        'created_time': '2020-03-28T02:10:54.301784Z',
                        'deletion_time': '',
                        'destroyed': False,
                        'version': 1,
                    },
                },
                'wrap_info': None,
                'warnings': None,
                'auth': None,
            }

    mock_client.secrets.kv.v2.read_secret_version.side_effect = fake_read_secret

    test_conf = AirflowConfigParser(default_config=parameterized_config(test_config_default))
    test_conf.read_string(test_config)
    test_conf.sensitive_config_values = test_conf.sensitive_config_values | {
        ('test', 'sql_alchemy_conn'),
        ('test', 'secret_key'),
    }

    assert 'fake_conn' == test_conf.get('test', 'sql_alchemy_conn')
    mock_hvac.Client.assert_called_once()
    assert 'fake_key' == test_conf.get('test', 'secret_key')
    mock_hvac.Client.assert_called_once()
```
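One practical consequence for tests like the one above: an `lru_cache`-cached factory keeps its instance for the life of the process, so tests that change `backend_kwargs` between runs need to reset the cache. Any function wrapped by `functools.lru_cache` exposes `cache_clear()` for this; a generic sketch (not the Airflow test code):

```python
import functools


@functools.lru_cache
def get_backend(url):
    # Stand-in factory; in real code this would construct a backend.
    return object()


first = get_backend("http://127.0.0.1:8200")
assert get_backend("http://127.0.0.1:8200") is first  # cached instance reused

get_backend.cache_clear()  # e.g. in a test's setup/teardown
assert get_backend("http://127.0.0.1:8200") is not first  # fresh instance
```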
…25556) * Cache the custom secrets backend so the same instance gets re-used. Fixes #25555. This uses `functools.lru_cache` to re-use the same secrets backend instance between the `conf` global, when it loads configuration from secrets, and uses outside the `configuration` module, like variables and connections. Previously, each fetch of a configuration value from secrets would use its own secrets backend instance. Also adds a unit test to confirm that only one secrets backend instance gets created.
…25556) (cherry picked from commit 5863c42)
Apache Airflow version

main (development)

What happened

When Airflow is loading its configuration, it creates a new secrets backend instance for each configuration value it loads from secrets, and then additionally creates a global secrets backend instance that is used in `ensure_secrets_loaded`, which code outside of the `configuration` module uses. This can cause issues with the Vault backend (and possibly others, not sure), since logging in to Vault can be an expensive operation server-side, and each instance of the Vault secrets backend needs to re-login to use its internal client.

What you think should happen instead

Ideally, Airflow would attempt to create a single secrets backend instance and re-use it. This could possibly be patched in the Vault secrets backend, but I think updating the `configuration` module to cache the secrets backend would be preferable, since it would then apply to any secrets backend.

How to reproduce

Use the HashiCorp Vault secrets backend and store some configuration in `X_secret` values. See that it logs in more than you'd expect.

Operating System

Ubuntu 18.04

Versions of Apache Airflow Providers

Deployment

Official Apache Airflow Helm Chart

Deployment details

No response

Anything else

No response

Are you willing to submit PR?

Code of Conduct