Metrics dashboard api #6

Open · wants to merge 10 commits into base: master

Conversation

MiroDudik (Member)

No description provided.

@riedgar-ms (Member)

I've been thinking a bit about this over the weekend, and there are two slightly different sets of changes which are required.

The first is on the level of the entire dashboard JSON. I think that we should merge the modelNames and predictedY portions into a single list of dictionaries, similar to the way the sensitive features are handled. I'm guessing that the current separation is just a historical accident. At this time, we can also decide on the keys we want - basically, camel case or snake case. These are fairly mechanical changes, but they would impose a non-trivial rollout cost.
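
To make that concrete, here is a rough sketch of what the merged structure could look like; the key and field names (snake_case here) are placeholders, not a decided convention:

```python
# Illustration only: one possible shape after merging the modelNames and
# predictedY portions into a single list of dictionaries, mirroring the way
# sensitive features are handled. Key and field names are placeholders.
dashboard_dict = {
    "predictions": [
        {"name": "model_1", "values": [0, 1, 1, 0, 1]},
        {"name": "model_2", "values": [0, 0, 1, 0, 1]},
    ],
    "sensitive_features": [
        {"name": "gender", "values": [0, 1, 0, 0, 2]},
    ],
}
```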

The second is the trickier one, and involves the contents of those dictionaries in the precomputedMetrics (2d) array. I haven't got very far with my thoughts yet, but essentially: if we want to make things less brittle (and also enable third parties to contribute their own views of the statistics), we need to devise something queryable. Basically, a 'view' would list its required statistics, and the 'engine' inside the dashboard would be responsible for matching the available views up with the available statistics. How all that works, I'm not sure yet.
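
As a very rough, non-binding sketch of that matching idea (all names below are invented for illustration; this is not an existing fairlearn API):

```python
# Sketch of the "queryable" idea: each view declares the statistics it needs,
# and a small engine offers only the views whose requirements are satisfied by
# the precomputed cache.
def available_views(views, cache):
    """Return the views whose required statistics are all present in the cache."""
    cached_functions = {entry["function"] for entry in cache}
    return [v for v in views if set(v["required_statistics"]) <= cached_functions]

views = [
    {"name": "accuracy_view",
     "required_statistics": ["sklearn.metrics.accuracy_score"]},
    {"name": "selection_rate_view",
     "required_statistics": ["fairlearn.metrics.selection_rate"]},
]
cache = [{"function": "sklearn.metrics.accuracy_score",
          "return_value": {"overall": 0.9}}]

print([v["name"] for v in available_views(views, cache)])  # ['accuracy_view']
```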

```python
{
"prediction_type": "binary_classification" or "probabilistic_binary_classification" or "regression",
```

Member:
Any reason why we're omitting multiclass classification?

Member Author:
Because we don't have any support for it yet, but we can definitely add other prediction types in future.

"name": "y_true",
"values": [0, 1, 1, 1, 0],
},
"sample_weight": {

Member:
This is user-provided, not the one set within ExponentiatedGradient, right?

Member:
If so, perhaps it's worth documenting this with a short comment

Member Author:
Will do. This is just an example of an array that we may want to pass to the metrics, since many metrics work with this kind of argument.
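
As an illustration (using a standard scikit-learn metric as a stand-in; the exact metrics the dashboard invokes are not specified here):

```python
# Many metric functions accept a per-sample weight array, e.g. scikit-learn's
# accuracy_score; the dashboard dictionary would simply carry such an array so
# it can be forwarded to the metric.
from sklearn.metrics import accuracy_score

y_true = [0, 1, 1, 1, 0]
y_pred = [0, 1, 0, 1, 0]
sample_weight = [1.0, 2.0, 1.0, 1.0, 0.5]

print(accuracy_score(y_true, y_pred, sample_weight=sample_weight))
```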

"sensitive_feature gender" : { # an example feature
"name": "gender",
"values": [0, 1, 0, 0, 2],
"value_names": ["female", "male", "non-binary"],

Member:
Should there be a 'type' field in here, so things like 'prediction' and 'sensitive_feature' don't have to go into the key?
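
A possible shape, purely for illustration (field names are not agreed):

```python
# Illustration only: an explicit "type" field, so the dictionary key no longer
# has to encode whether an array is a prediction, a sensitive feature, etc.
arrays = {
    "gender": {
        "type": "sensitive_feature",
        "name": "gender",
        "values": [0, 1, 0, 0, 2],
        "value_names": ["female", "male", "non-binary"],
    },
    "model_1": {
        "type": "prediction",
        "name": "model_1",
        "values": [0, 1, 1, 1, 0],
    },
}
```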

```python
},
"cache" : [
{
"function": string, # python function name; we could either limit to fairlearn.metrics
```

Member:
Use fully qualified names for sure.
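
For example, a consumer of the cache could resolve a fully qualified name back to a callable along these lines (the resolver below is a sketch, not existing fairlearn code; the qualified name shown, sklearn.metrics.accuracy_score, is real):

```python
# Sketch only: resolving a fully qualified function name from the cache back
# to a callable.
import importlib

def resolve(qualified_name):
    """Resolve 'package.module.function' to the callable it names."""
    module_name, _, attr = qualified_name.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)

metric = resolve("sklearn.metrics.accuracy_score")
print(metric([0, 1, 1], [0, 1, 0]))  # 0.666...
```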

"return_value": {
"overall": 0.11,
"by_group": {
"keys": [0, 1, 2],

Member:
Are the 'keys' necessary, if we required all categoricals to be integer-encoded?
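
To spell out the trade-off (illustrative values only):

```python
# Two possible encodings of per-group results.

# With explicit keys, groups can be arbitrary values and the ordering is explicit:
by_group_with_keys = {"keys": [0, 1, 2], "values": [0.10, 0.12, 0.09]}

# Without keys, the position identifies the group, which only works if all
# categoricals are integer-encoded as 0..n-1:
by_group_positional = [0.10, 0.12, 0.09]
```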

"<array_key>" : { # the keys can be arbitrary strings; not sure we need to force any convention, but see examples below
"name": string, # the name of a feature would be the feature name, of a prediction vector would be the model name
"values": number[],
"value_names": string[], # an optional field to encode categorical data

Member:
Presumably we also specify that extra keys (e.g. inserted by AzureML) are to be preserved.

@riedgar-ms (Member)

This is something I've started cogitating on again, in the context of the AzureML (now MLFlow) integration. We do want to enable composability, but also avoid saving lots of copies of y_true etc. in the composable pieces.

Then again, we also need an API which allows users to 'mess around in a notebook' without having to set up a bunch of prerequisites. A small example of this is how the dashboard doesn't require model and sensitive feature names, but will generate them itself if invoked without them. In contrast, _create_group_metric_set() absolutely requires these names, to minimise the chances of users shooting themselves in the foot.
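
The "generate names if the caller didn't supply any" pattern might look roughly like this; the helper and its arguments are invented for this sketch and are not the actual dashboard or _create_group_metric_set() signatures:

```python
# Sketch only: pair each prediction vector with a name, generating default
# names when the caller doesn't supply any.
def with_default_names(predictions, model_names=None):
    """Return (name, prediction) pairs, inventing names if none were given."""
    if model_names is None:
        model_names = [f"model_{i}" for i in range(len(predictions))]
    if len(model_names) != len(predictions):
        raise ValueError("Need exactly one name per prediction vector")
    return list(zip(model_names, predictions))

print(with_default_names([[0, 1, 1], [1, 1, 0]]))
# [('model_0', [0, 1, 1]), ('model_1', [1, 1, 0])]
```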
