Engineering/scientific notation is incorrectly deserialized as string #91

Closed
bjtho08 opened this issue May 16, 2020 · 2 comments

bjtho08 commented May 16, 2020

I want to use confuse for configuring applications that use TensorFlow/Keras for deep learning. Part of that is reading various numerical values from the config file. For most types, this is painless, but I encountered the following issue.

Consider a config.yaml that looks like this:

appName: MyMLApp

...

arch:
  lr: 1e-4

In Python, the following happens:

import confuse

config = confuse.Configuration('MyMLApp', __name__)
model_params = config['arch'].get()
assert isinstance(model_params['lr'], float)

----
Traceback (most recent call last):
  File "<pyshell#66>", line 5, in <module>
    assert  isinstance(model_params['lr'], float)
AssertionError

The expected behavior is of course that this assertion does not fail.

OS and info:
Ubuntu 16.04
Python 3.8.2
confuse 1.1.0
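
As an application-side stopgap, string values that look like scientific notation could be coerced to floats after loading. The sketch below is an assumption made for illustration, not part of Confuse or PyYAML: `coerce_sci_floats` and `_SCI_FLOAT` are hypothetical names, and the dictionary stands in for whatever `config['arch'].get()` returned.

```python
import re

# Hypothetical helper, not part of Confuse or PyYAML.
# Matches strings such as "1e-4", "1E4", "-2.5e+3" or ".5e3".
_SCI_FLOAT = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)[eE][-+]?\d+")


def coerce_sci_floats(value):
    """Recursively convert scientific-notation strings to float."""
    if isinstance(value, dict):
        return {k: coerce_sci_floats(v) for k, v in value.items()}
    if isinstance(value, list):
        return [coerce_sci_floats(v) for v in value]
    if isinstance(value, str) and _SCI_FLOAT.fullmatch(value):
        return float(value)
    return value  # everything else is passed through untouched


# Stand-in for the loaded config section from the report above.
model_params = coerce_sci_floats({"lr": "1e-4", "appName": "MyMLApp"})
assert isinstance(model_params["lr"], float)
```

One caveat of this approach: a value that was deliberately quoted in the YAML file (for example `"1e-4"` meant as text) would be converted as well, because the quoting is no longer visible after loading.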

sampsyo (Member) commented May 16, 2020

Hello! Sounds interesting—but this is a YAML specification or PyYAML issue, rather than a Confuse issue.

To start investigating, let's use PyYAML directly:

>>> import yaml
>>> type(yaml.safe_load("123"))
<class 'int'>
>>> type(yaml.safe_load("1e4"))
<class 'str'>

So, yep, there's the problem. To dig deeper, maybe let's look at the YAML spec for floating point literals:
https://yaml.org/spec/1.2/spec.html#id2804092

According to that regex, it seems like the + or - is required. So maybe this works?

>>> type(yaml.safe_load("1e+4"))
<class 'str'>

Sadly, no. A little googling leads to yaml/pyyaml#173 indicating that this is a bug. PyYAML needs a dot in the base part:

>>> type(yaml.safe_load("1.0e+4"))
<class 'float'>

That works! So as a workaround, I suggest using that literal format. According to the aforelinked issue thread, however, ruamel.yaml fixes this bug. So Confuse could "fix" it too by doing #52, for which I would be eternally grateful for a pull request! 😇
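
To see why only the last literal resolves to a float, it can help to test the candidates against the two grammar shapes involved. Both patterns below are simplified reconstructions written for this note and are not copied from PyYAML's source: `YAML11_STYLE` approximates the decimal branch of the YAML 1.1 float rule that PyYAML follows (dot and exponent sign both mandatory), and `YAML12_CORE` follows the float pattern used for tag resolution in the YAML 1.2 core schema, which is looser than the canonical form.

```python
import re

# Simplified YAML 1.1-style decimal float (assumed approximation):
# requires a "." in the base and an explicit sign in the exponent.
YAML11_STYLE = re.compile(r"[-+]?[0-9][0-9_]*\.[0-9_]*(?:[eE][-+][0-9]+)?")

# YAML 1.2 core schema float: dot and exponent sign are both optional.
# It also matches plain integers; a real resolver would try int first.
YAML12_CORE = re.compile(
    r"[-+]?(?:\.[0-9]+|[0-9]+(?:\.[0-9]*)?)(?:[eE][-+]?[0-9]+)?"
)

for literal in ["1e4", "1e+4", "1e-4", "1.0e+4"]:
    old = bool(YAML11_STYLE.fullmatch(literal))
    new = bool(YAML12_CORE.fullmatch(literal))
    print(f"{literal:8} 1.1-style={old!s:5} 1.2-core={new}")
```

Under these patterns only `1.0e+4` matches the 1.1-style rule, while all four literals match the 1.2 core rule, which is consistent with the PyYAML results shown above.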

bjtho08 (Author) commented May 20, 2020

Thanks for the reply! I never considered that it needed to be that explicit, but in any case this is much less of a hassle than writing out the numbers as decimals, so I will accept it as closed for now 👍 If I get the chance later, I will look into the fix that ruamel.yaml uses and see if I can implement something similar in confuse :)
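
For anyone exploring a loader-level fix in the meantime, one possible direction is to register an extra implicit resolver on a PyYAML loader subclass. This is a minimal sketch under stated assumptions: it is not the approach ruamel.yaml takes, `SciFloatLoader` is a hypothetical name, the regex is written for this note, and how such a loader would be wired into Confuse is left open here. It requires PyYAML to be installed.

```python
import re

import yaml  # PyYAML


class SciFloatLoader(yaml.SafeLoader):
    """SafeLoader subclass (hypothetical name) with a looser float rule."""


# Extra implicit resolver: digits, optional fraction, then a mandatory
# exponent whose sign is optional. The stock resolvers are still tried
# first, so values they already recognize keep their types.
SciFloatLoader.add_implicit_resolver(
    "tag:yaml.org,2002:float",
    re.compile(r"^[-+]?[0-9]+(?:\.[0-9]*)?[eE][-+]?[0-9]+$"),
    list("-+0123456789"),  # possible first characters of a match
)

doc = """
arch:
  lr: 1e-4
"""
params = yaml.load(doc, Loader=SciFloatLoader)["arch"]
assert isinstance(params["lr"], float)
```

Because the new rule only adds forms with an exponent, plain integers and ordinary strings are resolved by the stock rules as before.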

bjtho08 closed this as completed May 20, 2020