Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sanitize numpy generics in keys #4146

Merged
merged 7 commits into from Sep 1, 2022
Merged

Sanitize numpy generics in keys #4146

merged 7 commits into from Sep 1, 2022

Conversation

raubitsj
Copy link
Member

@raubitsj raubitsj commented Aug 19, 2022

Fixes WB-9464

Description

We allow users to log integers as non top level keys in nested dictionaries, but we didn't allow those integers to be numpy types.

Changes:

  • Detect np.generic keys in nested dictionaries
  • Coerce them into python types when found

Some interesting things to note:

  • Handles dictionaries with cycles
  • Our output converts all numeric keys to strings because of json / javascript
  • Passing dictionaries with cycles break other parts of the code
  • I tried to optimize for detecting np keys fast

Testing

Unittests and yea tests

@raubitsj raubitsj marked this pull request as ready for review August 19, 2022 23:58
@raubitsj raubitsj requested a review from a team August 20, 2022 00:14
@kptkin
Copy link
Contributor

kptkin commented Aug 20, 2022

I was thinking about this and I think we can use functools.singledispatch and register all the transformations.
The only ugly part is that some of the dispatches will be conditional on import... but the code will be much more readable:

from functools import singledispatch
import numpy as np

@singledispatch
def json_serializable(object):
      raise NotImplementedError

@json_serializable.register(dict)
def _(d):
      return {json_serializable(k): json_serializable(v) for k, v in d.items()}
 
@json_serializable.register(np.generic)
def _(value):
     return value.item()

@codecov
Copy link

codecov bot commented Aug 20, 2022

Codecov Report

Merging #4146 (c6aeda6) into master (07de99d) will increase coverage by 0.04%.
The diff coverage is 96.77%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4146      +/-   ##
==========================================
+ Coverage   82.62%   82.66%   +0.04%     
==========================================
  Files         256      256              
  Lines       32574    32600      +26     
==========================================
+ Hits        26913    26949      +36     
+ Misses       5661     5651      -10     
Flag Coverage Δ
functest 55.98% <90.32%> (+0.06%) ⬆️
unittest 73.42% <96.77%> (+0.08%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
wandb/util.py 87.24% <96.77%> (+0.35%) ⬆️
wandb/integration/tensorboard/monkeypatch.py 87.65% <0.00%> (-2.47%) ⬇️
wandb/sdk/lib/git.py 79.01% <0.00%> (ø)
wandb/sdk/wandb_setup.py 88.94% <0.00%> (+0.50%) ⬆️
wandb/sdk/internal/internal_api.py 86.98% <0.00%> (+0.51%) ⬆️
wandb/sdk/lib/sock_client.py 94.17% <0.00%> (+1.05%) ⬆️
wandb/sdk/internal/meta.py 90.79% <0.00%> (+3.06%) ⬆️

@raubitsj raubitsj added this to the sdk-2022-09.1 milestone Aug 26, 2022
Copy link
Contributor

@kptkin kptkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@raubitsj raubitsj merged commit facbc17 into master Sep 1, 2022
@raubitsj raubitsj deleted the sanitize-numpy-keys branch September 1, 2022 20:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants