Introduce parameters configuration #1650

markusdregi · 2021-05-09T15:06:33Z

Issue
Since the early days of ert3 it has remained the engine's responsibility to parse and load the available parameters and their distributions. All other elements of the configuration have been loaded further up in the call stack using Pydantic. This PR changes this so that the parameters are also loaded early using Pydantic and then passed to the engine.

This has multiple benefits! The first is that ert3.engine no longer has any knowledge of the layout of a workspace. Because of this, we are also able to remove quite a few file operations in the corresponding tests.
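To illustrate the separation described above, here is a minimal sketch in plain Python (dataclasses stand in for the Pydantic models, and every name in it is hypothetical rather than taken from ert3). The configuration is parsed from already-deserialised data, and the engine stand-in only ever receives the parsed object, so it never needs to know where a file lives.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass(frozen=True)
class ParameterGroup:
    """Hypothetical in-memory form of one configured parameter group."""

    name: str
    distribution: str
    variables: Tuple[str, ...]


def load_parameters_config(raw: List[Dict[str, Any]]) -> Tuple[ParameterGroup, ...]:
    """Parse already-deserialised config data; no file access happens here."""
    return tuple(
        ParameterGroup(
            name=entry["name"],
            distribution=entry["distribution"],
            variables=tuple(entry["variables"]),
        )
        for entry in raw
    )


def run_engine(parameters: Tuple[ParameterGroup, ...]) -> Dict[str, List[str]]:
    """Stand-in for an engine entry point: it only sees the parsed config."""
    return {group.name: list(group.variables) for group in parameters}


# A test can build the config in memory instead of writing files to a workspace.
raw_config = [
    {"name": "coefficients", "distribution": "gaussian", "variables": ["a", "b"]}
]
parameters = load_parameters_config(raw_config)
seen = run_engine(parameters)
```

With this shape, the tests for the engine stand-in need no temporary directories or file fixtures, which is the kind of simplification the description refers to.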

Furthermore, this PR is necessary preparation for #1340.

Consider this PR payback ™️

Approach
The first two commits are prep work in the stats module. Next, the Pydantic schema and the corresponding tests for the parameter configuration are introduced. After that, a debatable decision is made: a configured parameter group gets the capability to spawn the corresponding distribution. I don't see a better place to put this responsibility as of now, but it might very well be moved in the future ™️

The remaining commits, besides the last one, propagate the config down into the engine, all the way to where the parameters are currently loaded. They then swap the implementation so that the parameters are loaded from both disk and config and compared, remove the dependency on the utils, and finally remove the utils completely. The very last commit removes the file toggling that is currently needed in the engine tests, so that the config is only present in memory.
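The decision to let a configured parameter group spawn its distribution could look roughly like the sketch below. It is an assumption-laden illustration, not the ert3 implementation: the class names, the `as_distribution` method and the `sample` signature are all hypothetical, and dataclasses are used in place of Pydantic models.

```python
import random
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Gaussian:
    mean: float
    std: float

    def sample(self, rng: random.Random) -> float:
        return rng.gauss(self.mean, self.std)


@dataclass(frozen=True)
class Uniform:
    lower_bound: float
    upper_bound: float

    def sample(self, rng: random.Random) -> float:
        return rng.uniform(self.lower_bound, self.upper_bound)


# Maps the configured "type" to the class that implements it.
_DISTRIBUTIONS = {"gaussian": Gaussian, "uniform": Uniform}


@dataclass(frozen=True)
class ParameterGroupConfig:
    """Hypothetical config entry that knows how to build its distribution."""

    name: str
    type: str
    input: Dict[str, float]

    def as_distribution(self):
        try:
            factory = _DISTRIBUTIONS[self.type]
        except KeyError:
            raise ValueError(f"Unknown distribution: {self.type}") from None
        return factory(**self.input)


group = ParameterGroupConfig(
    name="coefficients",
    type="uniform",
    input={"lower_bound": 0.0, "upper_bound": 1.0},
)
distribution = group.as_distribution()
value = distribution.sample(random.Random(0))
```

The trade-off is that the config layer now depends on the stats layer. Moving the lookup table into a separate factory function would remove that coupling if the responsibility is relocated later.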

This is not how I typically structure my commits, but in this particular case I believe it is relevant to display "how it was done" as a step-by-step transformation, and not only the "polished" history, for the future reader. This applies in particular to the iteration between changing interface and tests and swapping the underlying implementation. As a consequence, some of these commits should be squashed before merging...

New issues to be made

Make record index public and enforce uniform datatype within an index (currently the index type of a record is present in the parameter configuration, the distribution implementation and the actual record implementation; these should be unified).
Introduce constant distribution (the SPE1 setup leverages that a uniform distribution can have lower_bound == upper_bound). I think this is a bad way of communicating the intent of constant data and hence think (as the original implementation in this PR did) that we should require lower_bound < upper_bound. However, that has consequences beyond the scope of this PR, so I think it should be turned into an issue for later.
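As a rough sketch of what the second follow-up issue proposes, the snippet below pairs a uniform distribution that insists on lower_bound < upper_bound with a separate constant distribution. Both class names are hypothetical and neither is taken from ert3. The aim is only to show how the intent of constant data could be stated explicitly instead of through a degenerate uniform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StrictUniform:
    """Uniform distribution that rejects the degenerate lower == upper case."""

    lower_bound: float
    upper_bound: float

    def __post_init__(self) -> None:
        if not self.lower_bound < self.upper_bound:
            raise ValueError(
                "Expected lower_bound < upper_bound, was: "
                f"{self.lower_bound} and {self.upper_bound}"
            )


@dataclass(frozen=True)
class Constant:
    """Explicit way of saying that a parameter always takes one value."""

    value: float

    def sample(self) -> float:
        return self.value


# The degenerate uniform is rejected, while the constant states its intent.
try:
    StrictUniform(lower_bound=1.0, upper_bound=1.0)
    degenerate_rejected = False
except ValueError:
    degenerate_rejected = True

fixed = Constant(value=1.0).sample()
```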

ert3/config/_parameters_config.py

jondequinor · 2021-05-11T11:59:51Z

ert3/config/_parameters_config.py

+    if not all(c.isalpha() or c == "_" for c in name):
+        raise ValueError(
+            f"Names are expected to only contain characters and `_`, was: {name}"
+        )
+
+    return name


I think this is a bit too early to be concerned with what characters are allowed, but I'm fine with it.

Normally I would expect the name to be whatever, but if the name was to be used in URLs or file names, then it would be slugified (https://docs.djangoproject.com/en/dev/glossary/#term-slug) by the application as an implementation detail.

I agree! However, there is already a concept of referencing variables from either storage or the workspace in ensemble.yml that uses . as the delimiter. I think the format should be up for discussion, but by not allowing dots in variable names we avoid some odd pitfalls in the near future. Slugifying is not really an option when the user references these variables later in the configuration file (storage.my_record).
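A small sketch of the pitfall discussed in this thread: if names may not contain dots, a reference such as storage.my_record can be split unambiguously. The `validate_name` check mirrors the one in the diff above, while `split_reference` is a hypothetical helper that is not part of ert3.

```python
from typing import Tuple


def validate_name(name: str) -> str:
    """Same character check as in the diff: letters and underscores only."""
    if not all(c.isalpha() or c == "_" for c in name):
        raise ValueError(
            f"Names are expected to only contain characters and `_`, was: {name}"
        )
    return name


def split_reference(reference: str) -> Tuple[str, str]:
    """Split a hypothetical 'source.name' reference on its single dot."""
    source, separator, record = reference.partition(".")
    if not separator or not source or not record:
        raise ValueError(f"Expected a reference of the form source.name, was: {reference}")
    # A dot inside the record name would make the split ambiguous,
    # which is exactly what the name validation rules out.
    return source, validate_name(record)


source, record = split_reference("storage.my_record")

try:
    split_reference("storage.my.record")
    ambiguous_rejected = False
except ValueError:
    ambiguous_rejected = True
```

Note that `str.isalpha` returns False for digits, so a name such as `record_1` is rejected by this check as well.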

jondequinor

Code looks good, it's sane to move this stuff out of engine (if it is to live up to its name, then the fuel needs to be provided, not created inside it)—good tests! 👏

sondreso · 2021-05-11T13:01:02Z

ert3/config/_parameters_config.py

+    type: Literal["gaussian", "uniform"]
+    input: _DistributionInput
+
+    @root_validator(pre=True)
+    def _ensure_gaussian_required_fields(cls, values):  # type: ignore
+        dist_type = values["type"]
+        input_fields = values["input"].keys()
+        if dist_type == "uniform":
+            expected = {"lower_bound", "upper_bound"}
+            actual = set(input_fields)
+            if expected != actual:
+                raise AssertionError(
+                    f"Expected distribution inputs: {expected}, was: {actual}"
+                )
+        elif dist_type == "gaussian":
+            expected = {"mean", "std"}
+            actual = set(input_fields)
+            if expected != actual:
+                raise AssertionError(
+                    f"Expected distribution inputs: {expected}, was: {actual}"
+                )
+        else:
+            raise ValueError(f"Validating unknown distribution: {dist_type}")
+
+        return values


Would it be possible to do this the same way as suggested in this issue? 🤔 #1571

Probably yes! I'll give it a try...

I did the changes now and my conclusion is that:

It is definitely neater within the Pydantic framework. It is probably usage as intended, and that is always good 👍🏻

The error messages become completely useless when one relies on Union, as Pydantic tries to match against the different options. But this is something we can investigate later to see whether there are fixes...
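To make the error-message concern concrete, here is a plain-Python imitation of the two strategies. It is not Pydantic code and does not reproduce Pydantic's actual messages, and all function names are hypothetical. `validate_try_each` mimics matching the input against every member of a union and reporting all failures, while `validate_by_type` lets the `type` field select the single candidate to check.

```python
from typing import Any, Dict, List, Set

_EXPECTED_INPUTS: Dict[str, Set[str]] = {
    "gaussian": {"mean", "std"},
    "uniform": {"lower_bound", "upper_bound"},
}


def _errors_for(candidate: str, data: Dict[str, Any]) -> List[str]:
    """Return the problems found when validating data as one candidate."""
    errors = []
    if data.get("type") != candidate:
        errors.append(f"{candidate}: type must be '{candidate}'")
    expected = _EXPECTED_INPUTS[candidate]
    actual = set(data.get("input", {}))
    for missing in sorted(expected - actual):
        errors.append(f"{candidate}: input.{missing} is required")
    for extra in sorted(actual - expected):
        errors.append(f"{candidate}: input.{extra} is not allowed")
    return errors


def validate_try_each(data: Dict[str, Any]) -> List[str]:
    """Union-style: try every candidate, report everything if none match."""
    collected: List[str] = []
    for candidate in _EXPECTED_INPUTS:
        errors = _errors_for(candidate, data)
        if not errors:
            return []
        collected.extend(errors)
    return collected


def validate_by_type(data: Dict[str, Any]) -> List[str]:
    """Tag-style: let the 'type' field pick the single candidate to check."""
    candidate = data.get("type")
    if candidate not in _EXPECTED_INPUTS:
        return [f"unknown distribution: {candidate}"]
    return _errors_for(candidate, data)


# A gaussian entry that is only missing 'std'.
bad = {"type": "gaussian", "input": {"mean": 0.0}}
noisy = validate_try_each(bad)
focused = validate_by_type(bad)
```

For the same faulty input, the try-each strategy also reports every way in which the entry fails to be a uniform distribution, which buries the one message the user needs.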

I was afraid that would be the case for the last part yes 😕 But as you say, that is something we can investigate later!

Also, if this issue is resolved, this would be a lot simpler (and faster parsing): pydantic/pydantic#619

Maybe this is also of interest: pydantic/pydantic#503 (comment)

markusdregi self-assigned this May 9, 2021

markusdregi added payback ert3 labels May 9, 2021

markusdregi force-pushed the parameters-config branch 2 times, most recently from 8398c6b to b146236 Compare May 9, 2021 15:15

markusdregi marked this pull request as draft May 9, 2021 15:17

markusdregi force-pushed the parameters-config branch from 75556dd to 3f7cb3c Compare May 11, 2021 08:28

markusdregi added 2 commits May 11, 2021 10:35

Make distributions expose their arguments

0c56478

Fix wrongly typed distribution index

d671800

markusdregi force-pushed the parameters-config branch 4 times, most recently from a06f53d to 3f4ee15 Compare May 11, 2021 09:07

markusdregi marked this pull request as ready for review May 11, 2021 09:45

jondequinor reviewed May 11, 2021

jondequinor reviewed May 11, 2021

jondequinor reviewed May 11, 2021

jondequinor approved these changes May 11, 2021

sondreso reviewed May 11, 2021

markusdregi force-pushed the parameters-config branch from 390a66a to 4e70ec3 Compare May 12, 2021 05:50

sondreso approved these changes May 12, 2021

Introduce parameters config

8142133

markusdregi force-pushed the parameters-config branch from 4e70ec3 to 8142133 Compare May 13, 2021 06:07

markusdregi merged commit 0e41a8a into equinor:master May 13, 2021

markusdregi deleted the parameters-config branch May 13, 2021 08:41

markusdregi mentioned this pull request May 21, 2021

Constant uniform distribution in SPE1 example #1698

Closed
