New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feature request:load the config should log the load source by debug level #63
Comments
pydantic-settings loads settings from all data sources defined in settings_customise_sources. the value for a field can be loaded in multiple sources but one of them will be selected based on priority defined in What is your request:
|
logging the final value for a field has been selected from which source I tried to submit the code to pydantic v1, and later found that the development team separated the setting from pydantic into a sub-project. My implementation method needs to modify the deep_update code, and according to the source code, this function is referenced from the _utils file of v2. Since I found out that this function was only used in BaseSettings during the initial pr submission process, I modified it directly, but the v2 project keeps the function in the pydantic project. I don’t know if there are some problems in this way, because other functions The deep_update function is not used |
In order to reduce code changes as much as possible and introduce functions, I only think of this idea of changing deep_update. But since the code is not in a project, I can't modify and submit pr |
Yeah. Yeah I think even by changing the a: int = Field(validation_alias='a_alias') and two sources with these values: s1 = {'a': 1}
s2 = {'a_alias': 2}
At the end |
My understanding is that based on the configuration of custome_source, deep_update will return a unified list. This list will contain two types of keys, one is member variables, the other is alias, each member variable must have its own variable key, there may be alias, you can use debug when you get a specific key for the first time Form for log processing. Of course, this form is only convenient for the debugging process of the setting. It is also possible to talk about the source of alias and the source of its own variable key for log output. Maybe I can try to migrate the deep_update in v2's _utuls.py file to the utils of this project and make a brief pull request? |
Perhaps we can dive deeper into this feature and determine how this idea should be implemented |
I've checked your old PR in pydantic and specially the change that you made in
At some point, we might need to move this function here in |
Actually, initially I wanted to add a sentinel class to specifically detect sources, but with my current knowledge, it seems difficult to add this feature with as little code change as possible, and I am afraid that other bugs will be introduced due to the addition of classes. So I choose to modify deep_update, because in theory this function is to aggregate all keys. The loop used in deep_update currently only uses two generators, consider copying the list of keys fields of the setting instance, and removing the key from the list if the relevant value is obtained. As for aliased setting members, this situation was not considered before, and I also need to think about how to solve the ambiguity problem based on the specific code. Other ideas such as using the chain_map in python to replace the original {} may also be considered, but the impact of the change will be relatively large |
Because that source is actually multiple configuration sources, if the previous configuration source has obtained the property, you can directly delete the key that has obtained the value, so that the list can be narrowed, or deleted after sorting, maybe you can get a part of the performance improvement? As for how to map in the case of alias, this needs to be determined according to the specific logic of pydantic. In order to increase performance, consider not performing these field judgment operations if the incoming setting is None, which can also optimize the complexity of some of the runs |
We may can log values loaded from sources and at the end log the final values that has to be passed to pydantic. something like:
we can easily have it by diff --git a/pydantic_settings/main.py b/pydantic_settings/main.py
index 88478f1..05a7f5a 100644
--- a/pydantic_settings/main.py
+++ b/pydantic_settings/main.py
@@ -1,5 +1,6 @@
from __future__ import annotations as _annotations
+import logging
from pathlib import Path
from typing import Any
@@ -18,6 +19,8 @@ from .sources import (
env_file_sentinel: DotenvType = Path('')
+log = logging.getLogger(__name__)
+
class SettingsConfigDict(ConfigDict):
case_sensitive: bool
@@ -111,7 +114,14 @@ class BaseSettings(BaseModel):
file_secret_settings=file_secret_settings,
)
if sources:
- return deep_update(*reversed([source() for source in sources]))
+ sources_values = []
+ for source in reversed(sources):
+ source_values = source()
+ log.debug(f'{source_values} loaded from {source.__class__.__name__}')
+ sources_values.append(source_values)
+ values = deep_update(*sources_values)
+ log.debug(f'{values} is the final settings values')
+ return values
else:
# no one should mean to do this, but I think returning an empty dict is marginally preferable
# to an informative error and much better than a confusing error But let's wait for @samuelcolvin opinion here |
Thank you for your ideas, but I think it should be specific to which data source each value is obtained from. For example, foo may exist in different data sources. But according to the priority, the final value of foo comes from a single data source, which is not easy to locate. This is often better positioned in the case of configuration loading errors. If you simply output different source results, this is actually easier to achieve. What do you think? |
It seems that this can't solve the alias situation. For a configuration with an alias, where is the configuration loaded from? |
In fact, I think it is only necessary to use the log module to record the specific data sources of the member variables of the configuration class. It is not necessary to log all data sources. For most users, at least for me, this does not improve the debugging experience of the configuration much. Users can actively call these configuration sources in other places. provided value. However, the specific configuration is obtained from the first few data sources, but it seems that it has not been well helped |
like what I did by my pr |
alias may also need to be considered |
Considering for example if we have a settings model like: class Settings(BaseSettings):
x: int = Field(validation_alias='alias_x') and env variables:
and call the settings like: Settings(**{'x': 3, 'alias_x': 4}) Then, BTW, I still think the logging shouldn't be too complicated and we only need to log values loaded from sources based on priority. |
When it comes to aliases, we only need to briefly test them. If x has an alias_x, consider the following situations: 1. Whether x and alias_x appear at the same time, and can only be assigned when they are equal at the same time; 2. and 1 is similar, but if it is not equal, is a certain value preferred? 3. Can x and alias_x be assigned as long as they appear in one data source? Haha, about whether the log is too complicated, in fact, my pr is not much different from your code above, just take the intersection of the two lists, and you can know what appears in those data sources without excessively introducing complicated methods The corresponding key, but maybe some case processing is required? |
#68 is my pr |
After talking with our team(Pydantic team), we decided to go for a simple logging approach. The one that I suggested here. Please open a PR with the proposed diff and provide some test for it if you want to work on it. |
Ok It depends on what your team thinks, and you can simply log the data source. |
When pydantic loads the configuration, it records what data source the configuration is read from, and provides debug-level log output.i think if the feature is added, this library will work better.
By reading the source code, I found that it is currently unable to support recording the configuration source. This can be achieved by encapsulating the chainmap, or by adding an additional dict to record. Due to my limited level, I am afraid that direct modification of the source code will not completely cover it. testing, thus introducing more bugs, so I hope the development team can add this feature.
The text was updated successfully, but these errors were encountered: