allow specifying a fill value per variable #4165

Closed
keewis opened this issue Jun 19, 2020 · 1 comment · Fixed by #4237

Comments

Collaborator

keewis commented Jun 19, 2020

While working on #4163 I noticed that the fill_value parameter of align (and maybe also of reindex, concat, merge and combine_*?) is applied to all variables (except dimension coordinates), which is not ideal when working with quantities. Would it make sense to optionally allow fill_value to be a dict that maps variable names to fill values?

Consider this:

In [1]: import xarray as xr

In [2]: a = xr.Dataset(
   ...:     data_vars={"a": ("x", [12, 14, 13, 10, 8])},
   ...:     coords={"x": [-2, -1, 0, 1, 2], "u": ("x", [-20, -10, 0, 10, 20])},
   ...: )
   ...: b = xr.Dataset(
   ...:     data_vars={"b": ("x", [7, 9, 3])},
   ...:     coords={"x": [0, 3, 4], "u": ("x", [0, 30, 40])},
   ...: )
   ...:
   ...: xr.align(a, b, join="outer", fill_value=-50)
Out[2]: 
(<xarray.Dataset>
 Dimensions:  (x: 7)
 Coordinates:
   * x        (x) int64 -2 -1 0 1 2 3 4
     u        (x) int64 -20 -10 0 10 20 -50 -50
 Data variables:
     a        (x) int64 12 14 13 10 8 -50 -50,
 <xarray.Dataset>
 Dimensions:  (x: 7)
 Coordinates:
   * x        (x) int64 -2 -1 0 1 2 3 4
     u        (x) int64 -50 -50 0 -50 -50 30 40
 Data variables:
     b        (x) int64 -50 -50 7 -50 -50 9 3)

I'd like to be able to do something like this instead:

In [3]: xr.align(a, b, join="outer", fill_value={"a": -30, "b": -40, "u": -50})
Out[3]: 
(<xarray.Dataset>
 Dimensions:  (x: 7)
 Coordinates:
   * x        (x) int64 -2 -1 0 1 2 3 4
     u        (x) int64 -20 -10 0 10 20 -50 -50
 Data variables:
     a        (x) int64 12 14 13 10 8 -30 -30,
 <xarray.Dataset>
 Dimensions:  (x: 7)
 Coordinates:
   * x        (x) int64 -2 -1 0 1 2 3 4
     u        (x) int64 -50 -50 0 -50 -50 30 40
 Data variables:
     b        (x) int64 -40 -40 7 -40 -40 9 3)

I could get there by passing the default (dtypes.NA) and then calling fillna, but that only seems to work for data variables, so coordinates would have to go through a reset_coords / set_coords cycle. This approach also changes the dtype to float.
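
For illustration, the workaround described above might look roughly like the following. This is only a sketch: the helper name `fill_per_variable` and the `fill_values` dict are made up for this example and are not part of xarray. It aligns with the default fill value, temporarily demotes the non-index coordinates to data variables so that `fillna` can reach them, and then restores them as coordinates.

```python
import xarray as xr

a = xr.Dataset(
    data_vars={"a": ("x", [12, 14, 13, 10, 8])},
    coords={"x": [-2, -1, 0, 1, 2], "u": ("x", [-20, -10, 0, 10, 20])},
)
b = xr.Dataset(
    data_vars={"b": ("x", [7, 9, 3])},
    coords={"x": [0, 3, 4], "u": ("x", [0, 30, 40])},
)

# Hypothetical per-variable fill values, matching the example in the issue.
fill_values = {"a": -30, "b": -40, "u": -50}


def fill_per_variable(ds, fill_values):
    """Fill missing values per variable (hypothetical helper, not xarray API)."""
    # Index (dimension) coordinates cannot be reset, so leave them alone.
    coord_names = [name for name in ds.coords if name not in ds.indexes]
    # Demote the remaining coordinates so that fillna can see them.
    demoted = ds.reset_coords(coord_names)
    # fillna rejects keys that are not data variables, so filter the dict.
    applicable = {k: v for k, v in fill_values.items() if k in demoted.data_vars}
    # Promote the former coordinates back after filling.
    return demoted.fillna(applicable).set_coords(coord_names)


# Align with the default fill value (NaN), which promotes the integer data to float.
aligned_a, aligned_b = xr.align(a, b, join="outer")
filled_a = fill_per_variable(aligned_a, fill_values)
filled_b = fill_per_variable(aligned_b, fill_values)

print(filled_a)
print(filled_b)
```

The values end up where they should, but the variables stay float after the round trip through NaN, which is the dtype change mentioned above.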

keewis changed the title from "allow specifying a fill value per" to "allow specifying a fill value per variable" on Jun 19, 2020
Member

shoyer commented Jun 20, 2020

Would it make sense to optionally allow fill_value to be a dict which maps a fill value to a variable name?

Yes, this sounds like a welcome improvement!
