
First Pass Zonal Average #82

Open · wants to merge 30 commits into main

Conversation

@mgrover1 commented Mar 18, 2021

@dcherian (Contributor) left a comment:

Thanks @mgrover1

Mostly looks good.

One issue is how to handle TLONG/TLAT vs ULONG/ULAT. It would be nice if there were standard attributes that told us that one of them provides the corner points for the other. I don't know that this exists, but it could be worth digging around the sgrid conventions to see if there's something we can use (https://sgrid.github.io/sgrid/).

Second, with this approach, the lowest-order operation is the regrid to a uniformly spaced grid (dsregridded). Operations after that are really easy (regridded.mean("latitude")) and also let you do things other than the average (e.g. regridded.std("latitude")).

Should we instead have users do

uniform = pop_tools.to_uniform_grid(ds, dx=0.1, dy=0.1)
zonal_average = uniform.mean("latitude")

self.method_gen_grid = method_gen_grid

# If the user does not input a mask, use default mask
if not mask:
Contributor:

applying a mask when mask is None is counter-intuitive =)

Should the default be mask=True?

).rename({'TLAT': 'lat', 'TLONG': 'lon'})

# Include only points that will have surrounding corners
data_ds = data_ds.isel({'nlon': data_ds.nlon[1:], 'nlat': data_ds.nlat[1:]})
@dcherian (Contributor), Mar 18, 2021:

This won't work for datasets returned by to_xgcm_grid_dataset, which renames to nlon_t, nlon_u, nlat_t, nlat_u.

I think the solution here is to use cf_xarray

data_ds = data_ds.cf.isel(X=slice(1,None), Y=slice(1, None))

Author:

Do you have a suggestion for adding a new X/Y coordinate? I noticed that the only cf axis within the POP grids is time

Coordinates:
- CF Axes: * T: ['time']
             X, Y, Z: n/a

- CF Coordinates:   longitude: ['TLONG', 'ULONG']
                    latitude: ['TLAT', 'ULAT', 'lat_aux_grid']
                  * vertical: ['z_t']
                  * time: ['time']

- Cell Measures:   area, volume: n/a

- Standard Names:   n/a

Data Variables:
- Cell Measures:   area, volume: n/a

- Standard Names:   n/a

Contributor:

ugh yeah I forgot. We need to make nlat, nlon dimension coordinates for this to work. That allows us to assign attributes like axis: "X"

ds["nlon"] = ("nlon", np.arange(ds.sizes["nlon"]), {"axis": "X"})
ds["nlat"] = ("nlon", np.arange(ds.sizes["nlon"]), {"axis": "Y"})

and similarly for nlat_*, nlon_* in to_xgcm_grid_dataset. This too seems like a followup PR.
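
A hedged sketch of that follow-up, assuming the renamed dimensions all start with nlon/nlat (the loop itself is illustrative, not the PR's code):

import numpy as np

for dim in list(ds.dims):
    if str(dim).startswith('nlon'):
        ds[dim] = (dim, np.arange(ds.sizes[dim]), {'axis': 'X'})
    elif str(dim).startswith('nlat'):
        ds[dim] = (dim, np.arange(ds.sizes[dim]), {'axis': 'Y'})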

data_ds = data_ds.isel({'nlon': data_ds.nlon[1:], 'nlat': data_ds.nlat[1:]})

# Use ulat and ulong values as grid corners, rename variables to match xESMF syntax
grid_corners = grid_ds[['ULAT', 'ULONG']].rename(
@dcherian (Contributor), Mar 18, 2021:

This is wrong for UVEL, for which the centers are ULAT, ULONG and the corners are TLAT, TLONG.

EDIT: I see you've fixed this :) but a good solution would be to use UVEL.cf["longitude"] to get ULONG, and in theory there should be some standard attribute to lead you to TLONG from there. I do not know what that attribute is. Might have to look into the sgrid conventions (https://sgrid.github.io/sgrid/).

Contributor:

Alternately, this kind of inference requires knowledge of the underlying grid, so an xgcm.Grid object would be useful input.

We can hardcode this for POP now and move on, but solving UVEL regridding vs TEMP regridding by interpreting some standard attribute or using the xgcm object allows us to robustly solve it for the POP & MOM & (insert-NCAR-model-here) regridding. So some brainstorming would be useful...

else:
self.mask_labels = self.grid['region_name']

else:
Contributor:

Do we ever need no mask? I guess a global average across all basins is somewhat useless, but it would be required for the atmosphere.

and ('velocity' not in obj[var].long_name.lower())
):
ds_list.append(obj[var])
return xr.merge(ds_list).map(self._regrid_dataarray, keep_attrs=True, **kwargs)
Contributor:

Instead, build a list of variable names and then ds[list_of_varnames].map(...)

for region in tqdm(np.unique(mask)):

if region != 0:
ds_list.append(data.where(mask == region).groupby('lat').mean())
Contributor:

Why groupby? Just a simple .mean("lat").

Instead of this loop, I would recommend taking a 2D mask variable, expanding it out to 3D with a new region dimension (so that each slice along region has 1s only for the appropriate region), and then multiplying. That way you're doing where on a small array rather than a giant one. It also avoids the concat down below. A sketch follows.
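
A minimal sketch of this suggestion, where mask and data stand in for the 2D region mask and the regridded data:

import numpy as np
import xarray as xr

regions = [r for r in np.unique(mask.values) if r != 0]

# (region, nlat, nlon): each slice is 1 inside that region, NaN elsewhere
mask3d = xr.concat([xr.where(mask == r, 1.0, np.nan) for r in regions], dim='region')
mask3d = mask3d.assign_coords(region=regions)

# one broadcasted multiply replaces the loop and the concat;
# reduce along the zonal dimension afterwards as before
out = data * mask3d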

@mgrover1 (Author), Mar 19, 2021:

Matt and I were discussing this; making use of the pop_tools.region_mask_3d tool would be helpful here, replacing "0" with NaN values...

# Read in the 3D mask
mask3d = pop_tools.region_mask_3d(grid_name, mask_name='default')

# Apply the 3D mask
data_masked = mask3d.where(mask3d > 0) * data[list_relevant_variables]

where the dimensions of data_masked would be ('region', 'nlat', 'nlon', 'time', 'z_t')

if vertical_average:

# Run the vertical, weighted average
out = out.weighted(out['z_t'].fillna(0)).mean(dim=['z_t'])
Contributor:

why weight by z_t and not DZT?

Author:

Could this be calculated by using the following?

DZT = out.z_t.diff(dim='z_t')

Or am I misunderstanding what DZT is?

Collaborator:

out.z_t.diff(dim='z_t') will return the distance between layer centers, and for 60 levels it will only return an array of length 59:

>>> import pop_tools
>>> gx1v7_grid = pop_tools.get_grid("POP_gx1v7")
>>> gx1v7_grid.z_t.diff(dim='z_t')
<xarray.DataArray 'z_t' (z_t: 59)>
array([ 1000.     ,  1000.     ,  1000.     ,  1000.     ,  1000.     ,
        1000.     ,  1000.     ,  1000.     ,  1000.     ,  1000.     ,
        1000.     ,  1000.     ,  1000.     ,  1000.     ,  1000.     ,
        1009.8404 ,  1038.0646 ,  1081.22175,  1136.90105,  1205.11015,
        1286.69055,  1383.0544 ,  1496.13345,  1628.40275,  1782.946  ,
        1963.55735,  2174.8772 ,  2422.5496 ,  2713.41105,  3055.7061 ,
        3459.30485,  3935.90165,  4499.1272 ,  5164.489  ,  5948.97315,
        6870.06835,  7943.9187 ,  9182.2754 , 10588.062  , 12149.8325 ,
       13836.2715 , 15593.0264 , 17344.9219 , 19005.6846 , 20494.0957 ,
       21751.2334 , 22751.50195, 23502.97165, 24038.333  , 24401.99805,
       24638.8965 , 24787.6709 , 24878.15135, 24931.6328 , 24962.4424 ,
       24979.77735, 24989.31735, 24994.45895, 24997.17675])
>>>

You want the cell thickness - the difference between layer interfaces, which is z_w_bot - z_w. Those arrays should be the same length although the dimension names are different, but here's a kludgy example of at least returning a DataArray (the coordinates need to be cleaned up):

>>> gx1v7_grid.z_w_bot.swap_dims(z_w_bot="z_t") - gx1v7_grid.z_w.swap_dims(z_w="z_t")
<xarray.DataArray (z_t: 60)>
array([ 1000.    ,  1000.    ,  1000.    ,  1000.    ,  1000.    ,
        1000.    ,  1000.    ,  1000.    ,  1000.    ,  1000.    ,
        1000.    ,  1000.    ,  1000.    ,  1000.    ,  1000.    ,
        1000.    ,  1019.6808,  1056.4484,  1105.9951,  1167.807 ,
        1242.4133,  1330.9678,  1435.141 ,  1557.1259,  1699.6796,
        1866.2124,  2060.9023,  2288.8521,  2556.2471,  2870.575 ,
        3240.8372,  3677.7725,  4194.0308,  4804.2236,  5524.7544,
        6373.1919,  7366.9448,  8520.8926,  9843.6582, 11332.4658,
       12967.1992, 14705.3438, 16480.709 , 18209.1348, 19802.2344,
       21185.957 , 22316.5098, 23186.4941, 23819.4492, 24257.2168,
       24546.7793, 24731.0137, 24844.3281, 24911.9746, 24951.291 ,
       24973.5938, 24985.9609, 24992.6738, 24996.2441, 24998.1094])
Coordinates:
    z_w_bot  (z_t) float64 1e+03 2e+03 3e+03 4e+03 ... 5e+05 5.25e+05 5.5e+05
    z_w      (z_t) float64 0.0 1e+03 2e+03 3e+03 ... 4.75e+05 5e+05 5.25e+05
Dimensions without coordinates: z_t
>>>

I think out might already have a dz data array, though, which would be the best thing to use:

>>> gx1v7_grid["dz"]
<xarray.DataArray 'dz' (z_t: 60)>
array([ 1000.    ,  1000.    ,  1000.    ,  1000.    ,  1000.    ,
        1000.    ,  1000.    ,  1000.    ,  1000.    ,  1000.    ,
        1000.    ,  1000.    ,  1000.    ,  1000.    ,  1000.    ,
        1000.    ,  1019.6808,  1056.4484,  1105.9951,  1167.807 ,
        1242.4133,  1330.9678,  1435.141 ,  1557.1259,  1699.6796,
        1866.2124,  2060.9023,  2288.8521,  2556.2471,  2870.575 ,
        3240.8372,  3677.7725,  4194.0308,  4804.2236,  5524.7544,
        6373.1919,  7366.9448,  8520.8926,  9843.6582, 11332.4658,
       12967.1992, 14705.3438, 16480.709 , 18209.1348, 19802.2344,
       21185.957 , 22316.5098, 23186.4941, 23819.4492, 24257.2168,
       24546.7793, 24731.0137, 24844.3281, 24911.9746, 24951.291 ,
       24973.5938, 24985.9609, 24992.6738, 24996.2441, 24998.1094])
Coordinates:
  * z_t      (z_t) float64 500.0 1.5e+03 2.5e+03 ... 5.125e+05 5.375e+05
Attributes:
    units:      cm
    long_name:  thickness of layer k
>>>

Author:

I adjusted this to utilize the dz dataarray within the grid

out = out.weighted(self.grid.dz).mean(dim='z_t')

Which seems to work well.

@mgrover1 (Author):

> One issue is how to handle TLONG/TLAT vs ULONG/ULAT. It would be nice if there were standard attributes that told us that one of them provides the corner points for the other. I don't know that this exists, but it could be worth digging around the sgrid conventions to see if there's something we can use (https://sgrid.github.io/sgrid/).

I will look into that... for now I was just using TLONG/TLAT and ULONG/ULAT since we read the grid directly from pop-tools, but moving toward this would definitely make this more extensible.

> Second, with this approach, the lowest-order operation is the regrid to a uniformly spaced grid (dsregridded). Operations after that are really easy (regridded.mean("latitude")) and also let you do things other than the average (e.g. regridded.std("latitude")).

uniform = pop_tools.to_uniform_grid(ds, dx=0.1, dy=0.1)
zonal_average = uniform.mean("latitude")

I like that suggested approach; the reason for the current implementation is to reproduce the previously existing ZA code, but following this would make more sense.

mgrover1 and others added 4 commits March 19, 2021 15:32
Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>
Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>
Co-authored-by: Anderson Banihirwe <axbanihirwe@ualr.edu>
Co-authored-by: Anderson Banihirwe <axbanihirwe@ualr.edu>
@dcherian (Contributor) commented Mar 20, 2021:

> ...for now I was just using TLONG/TLAT and ULONG/ULAT since we read the grid directly from pop-tools, but moving toward this would definitely make this more extensible.

👍 We could hardcode it for now and choose the bounds based on dataarray.attrs["coordinates"], to move quickly and adapt later.
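
A hedged sketch of that hardcoded mapping (the helper name is illustrative):

def infer_corner_coords(da):
    # POP-specific: T-point variables have U points as cell corners, and vice versa
    coords = da.attrs.get('coordinates', '')
    if 'TLAT' in coords:
        return 'ULAT', 'ULONG'
    if 'ULAT' in coords:
        return 'TLAT', 'TLONG'
    raise ValueError(f'cannot infer corner coordinates for {da.name}')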


import numpy as np
import xarray as xr
import xesmf as xe
Contributor:

Should we make xesmf a soft dependency i.e.

try:
    import xesmf as xe
except ImportError:
    message = (
        "Zonal averaging requires the xesmf package.\n\n"
        "Please install it via conda as follows:\n\n"
        "    conda install -c conda-forge 'xesmf>=0.4.0'"
    )
    raise ImportError(message)

??

@mgrover1 (Author):

So... I went back and took a look at Xoak, which is another way to approach this problem: it lets the user index based on coordinates that are not default dimensions. Below is the function I used to accomplish a zonal average. The nice thing about this approach is that specifying the output latitudes is easier than with the previous xESMF approach; in this example, I use the lat_aux_grid.

import numpy as np
import xarray as xr
import xoak  # noqa: F401 - registers the .xoak accessor

def zonal_average(ds, out_lats):

    # Set the index to be TLAT and TLONG
    ds.xoak.set_index(['TLAT', 'TLONG'], 'sklearn_geo_balltree')

    # Output lons
    lons = np.arange(-180, 180, 1)

    # Build a meshgrid with the lats and lons
    nlons, nlats = np.meshgrid(lons, out_lats)

    # Build an output dataset
    ds_data = xr.Dataset({
        'lat': (('y', 'x'), nlats),
        'lon': (('y', 'x'), nlons),
    })

    # Use xoak to select lats/lons
    out = ds.xoak.sel(TLAT=ds_data.lat, TLONG=ds_data.lon)

    # Reset the coordinates
    out = out.reset_coords()

    # Add one-dimensional lats and lons
    out['lon'] = ('x', lons)
    out['lat'] = ('y', out_lats)

    # Swap the dimensions
    out = out.swap_dims({'x': 'lon', 'y': 'lat'})

    return out

za_temp = zonal_average(ds[['TEMP']], ds.lat_aux_grid.values).mean('lon')

where ds is the input dataset.

@mgrover1 (Author) commented Apr 1, 2021:

> Second, with this approach, the lowest-order operation is the regrid to a uniformly spaced grid (dsregridded). Operations after that are really easy (regridded.mean("latitude")) and also let you do things other than the average (e.g. regridded.std("latitude")).

> Should we instead have users do

uniform = pop_tools.to_uniform_grid(ds, dx=0.1, dy=0.1)
zonal_average = uniform.mean("latitude")

Should we allow users to specify which pop-grid to use (e.g. 'POP_gx1v7')? Also, with this approach, would we automatically apply the region masking so the zonal average would include it? These are the reasons why I am currently using an approach similar to xESMF, where one sets up the regridder (using the associated pop-grid) and then applies it to some data.

@mgrover1 (Author) commented Apr 1, 2021:

Any suggestions on how to speed up this calculation? Currently, the calculation takes a few minutes on the test files in pop-tools when regridding to a 0.25 degree grid... this is the current zonal average section:

def zonal_average(self, obj, vertical_average=False, **kwargs):

    data = self.regrid(obj, **kwargs)
    mask = self.regrid(self.mask, regrid_method='nearest_s2d', **kwargs)

    # Attach a name to the mask
    mask.name = self.mask.name

    # Replace zeros with nans and group into regions
    out = mask.where(mask > 0) * data

    # Check to see if a weighted vertical average is needed
    if vertical_average:

        # Calculate the vertical weighted average based on the depth of the layer
        out = out.weighted(self.grid.dz).mean(dim='z_t')

    return out

I think the main slow-down may be that it is separating into 13 regions?

@dcherian (Contributor) commented Apr 1, 2021:

> Should we allow users to specify which pop-grid to use (e.g. 'POP_gx1v7')?

It looks like you only need *LAT, *LONG, dz. Seems like we should just expect the input dataset to have all these variables.

Re: speed. What does snakeviz say? It looks like you're regenerating the destination grid for every variable. That could be avoided.
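
A minimal sketch of that fix, assuming src_grid, dst_grid, and varnames stand in for the actual source/destination grid datasets (with the corner information conservative regridding needs) and the variable list: build the xESMF regridder once, since constructing it is the expensive step, then reuse it for every variable.

import xesmf as xe

# build once...
regridder = xe.Regridder(src_grid, dst_grid, method='conservative')

# ...apply many times; applying an existing regridder is cheap
regridded = {name: regridder(ds[name]) for name in varnames}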

@mgrover1 (Author) commented Apr 8, 2021:

After some discussion this week with @matt-long and @klindsay28, I think it would be best to implement a destination grid that matches the input grid... here is a function that is hardcoded for now in terms of dealing with POP. I am not sure how to generalize this, since the grid corner information is specific to POP.

Also, @klindsay28, how did you deal with the cyclic point in the previous implementation? Using conservative regridding, there are missing values near the origin of the grid. Using another interpolation method (e.g. bilinear) would fix this, but in order to conserve areas/values, it would be good to stick with conservative regridding.

The resultant dataset matches the input format for xESMF, which makes it easy to integrate into the current development.

def flip_coords(ds):
    """
    Given a set of longitudes and latitudes, flips s-hemisphere into northern hemisphere
    """
    
    tlat = ds.TLAT.values
    ulat = ds.ULAT.values
    nlat = ds.nlat.values
    
    nlat_shem_t = nlat[np.where(tlat[:, 0] < 0)]
    nlat_shem_u = nlat[np.where(ulat[:, 0] < 0)]
    
    new_ds_t = ds.sel(nlat=nlat_shem_t)
    new_ds_u = ds.sel(nlat=nlat_shem_u)
    
    # Pull out tlats/tlons
    tlat = new_ds_t.TLAT.values[1:]
    tlon = new_ds_t.TLONG.values
    
    # Pull out ulats/ulons
    ulat = new_ds_u.ULAT.values
    ulon = new_ds_u.ULONG.values
    
    # Use the grid spacing from the farthest south
    grid_spacing = np.diff(tlat[:, 0])[0]
    
    out_tlats = np.append(np.append(tlat[:, 0], -tlat[:, 0][::-1]), np.arange(np.max(-tlat[:, 0])+grid_spacing, 90, grid_spacing))
    out_tlons = sorted(tlon[0, :])
    
    tlons, tlats = np.meshgrid(out_tlons, out_tlats)
    
    out_ulats = np.append(np.append(np.append(ulat[:, 0], 0), -ulat[:, 0][::-1]), np.arange(np.max(-ulat[:, 0])+grid_spacing, 90, grid_spacing))
    out_ulons = sorted(ulon[0, :])
    
    ulons, ulats = np.meshgrid(out_ulons, out_ulats)

    ds_out = xr.Dataset(coords=dict(
        lon=(["x"], out_tlons[1:]),
        lat=(["y"], out_tlats[1:]),
        lon_b=(["x_b"], out_ulons),
        lat_b=(["y_b"], out_ulats)))

    return ds_out

@mgrover1 (Author):

I am struggling with finding the correct bounds for the pop-grid... here is a function I have that uses the grid corner function in pop-tools... would this be the right way to go about this?

def gen_corner_calc(ULAT, ULONG):
    """
    Generates corner information and creates single dataset with output
    """
    
    # Use the function in pop-tools to get the grid corner information
    corn_lat, corn_lon = pop_tools.grid._compute_corners(ULAT, ULONG)

    lon_shape, lat_shape = corn_lon[:, :, 0].shape
    out_shape = (lon_shape + 1, lat_shape + 1)
    
    # Generate numpy arrays to store destination lats/lons
    out_lons = np.zeros((out_shape))
    out_lats = np.zeros((out_shape))
    
    # Assign the northeast corner information
    out_lons[1:, 1:] = corn_lon[:, :, 0]
    out_lats[1:, 1:] = corn_lat[:, :, 0]
    
    # Assign the northwest corner information
    out_lons[:-1, 1:] = corn_lon[:, :, 1]
    out_lats[:-1, 1:] = corn_lat[:, :, 1]
    
    # Assign the southwest corner information
    out_lons[:-1, :-1] = corn_lon[:, :, 2]
    out_lats[:-1, :-1] = corn_lat[:, :, 2]
    
    # Assign the southeast corner information
    out_lons[1:, :-1] = corn_lon[:, :, 3]
    out_lats[1:, :-1] = corn_lat[:, :, 3]
    
    ds_out = xr.Dataset(coords=dict(lon_b=(["nlat_b", "nlon_b"], out_lons),
                                    lat_b=(["nlat_b", "nlon_b"], out_lats)))
    
    return ds_out

print(data, dst_grid)

# Convert the mask to a uniform grid, using nearest neighbor interpolation
mask_regrid = to_uniform_grid(
Contributor:

can we have this support DataArrays directly? Call .to_dataset() on the dataarray in to_uniform_grid

else:
raise TypeError('Missing defined longitude axis bounds')

lons, lats = np.meshgrid(sorted(lon), lat)
Contributor:

is the sorting really required?

Author:

Not necessarily - I can go ahead and remove it

for var in obj:
if 'nlat' in obj[var].dims and 'nlon' in obj[var].dims:

if obj[var].cf['latitude'].name == 'TLAT':
Contributor:

merge this with the above?

Author:

So adjust that to this?

if 'nlat' in obj[var].dims and 'nlon' in obj[var].dims and obj[var].cf['latitude'].name == 'TLAT':

Contributor:

yes

print(np.unique(mask_regrid))

# Add a mask to the regridding
dst_grid['mask'] = (('y', 'x'), (mask_regrid.where(mask_regrid == 0, 1, 0)))
Contributor:

does np.logical_not do what you want here?

Author:

What do you mean?

Contributor:

You're inverting the mask here:

(mask_regrid.where(mask_regrid == 0, 1, 0))

I think np.logical_not(mask_regrid) could be equivalent and possibly faster?
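
A hedged check of the claimed equivalence, assuming mask_regrid is integer-valued with 0 meaning no region (the diff's three-argument .where is written below in its xr.where function form):

import numpy as np
import xarray as xr

inverted_explicit = xr.where(mask_regrid == 0, 1, 0)
inverted_ufunc = np.logical_not(mask_regrid).astype(int)  # True where mask == 0

assert (inverted_explicit == inverted_ufunc).all()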

Comment on lines 418 to 424
ds_list = [
data_regrid.where(mask_regrid == region)
.weighted(weights_regrid.where(mask_regrid == region).fillna(0))
.mean('x')
for region in tqdm(np.unique(mask_regrid))
if region != 0
]
Contributor:

A few comments here:

  1. This function should have one job: compute a zonal average. Why should it bring in these regions?
  2. Do you really need the where on the weights? Also .where(..., 0) will save an extra copy.
  3. This could be a lot faster if you made mask_regrid a 3D variable, (lat, lon, region), and then passed it in as a weight; see the sketch after this list.
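
A minimal sketch of point 3, where mask3d is an assumed (region, y, x) 0/1 DataArray aligned with the regridded data and weights:

# weights broadcast against the region dimension,
# so one weighted mean covers all regions at once
weights3d = (weights_regrid * mask3d).fillna(0)
zonal = data_regrid.weighted(weights3d).mean('x')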

Author:

Yeah...

  1. From the user API perspective, I thought it would be helpful to be able to specify the grid name and handle the logic in here... where would that be better placed? Or what is the higher-level API through which we would want the user to specify the region information?
  2. I can remove the where on the weights.
  3. That is true; this might also be a better way of dealing with the fractional areas of the masked grid cells...

Contributor:

One way would be to expect users to pass in ds * ds.region_mask if region_mask was a 3D variable (region, nlat, nlon). Would this be equivalent?

Does the order of operations here: regrid, then multiply by region mask and mean, follow what's in Keith's FORTRAN program?

mgrover1 and others added 2 commits August 13, 2021 14:31
Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>
elif method == 'lat_axis':
out_ds = xr.Dataset()

if lat_axis_bnds is not None:
Contributor:

It appears that lat_axis_bnds is required for this function to properly work. Instead of having it as a keyword argument, could we just make it a required argument and remove this check?

regrid_method : str
Regridding method for xESMF, default is conservative

**kwargs
Contributor:

If I understand this correctly, these kwargs are meant to be passed to

def _regrid_dataset(da_in, dst_grid, regrid_method=None)

This function only has one keyword argument regrid_method. Could we just expose this argument in to_uniform_grid signature instead of using the catch-all kwargs?
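
A sketch of the suggested signature change (the body and the _gen_dest_grid helper are illustrative, not the PR's actual implementation):

def to_uniform_grid(ds, dx, dy, regrid_method='conservative'):
    dst_grid = _gen_dest_grid(dx, dy)  # hypothetical destination-grid helper
    # Dataset.map forwards keyword arguments to the per-variable function
    return ds.map(_regrid_dataset, dst_grid=dst_grid, regrid_method=regrid_method)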

Comment on lines +127 to +135
if isinstance(ds, xr.DataArray):
grid_center_lat = ds.cf['latitude'].name
grid_center_lon = ds.cf['longitude'].name
ds = ds.to_dataset()

elif isinstance(ds, xr.Dataset):
var = list(ds.variables)[0]
grid_center_lat = ds[var].cf['latitude'].name
grid_center_lon = ds[var].cf['longitude'].name
Contributor:

We are using the cf_xarray accessor here, but I don't see cf_xarray in the imports. Is this working as expected without the cf_xarray import?

Contributor:

also list(ds.variables)[0]?

I think the only sane way to do this is to loop over DataArrays and check "TLAT" in DataArray.attrs["coordinates"]. So take an input dataset, then split it into two: those with coordinates TLAT, TLONG, and those with ULAT, ULONG; process the two datasets separately, then combine them into a single dataset and return that. A sketch follows.
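
A hedged sketch of that split-process-combine flow, where process_centered is a hypothetical stand-in for the per-grid regridding step:

import xarray as xr

t_vars = [v for v in ds.data_vars if 'TLAT' in ds[v].attrs.get('coordinates', '')]
u_vars = [v for v in ds.data_vars if 'ULAT' in ds[v].attrs.get('coordinates', '')]

out = xr.merge([
    process_centered(ds[t_vars], center='T'),
    process_centered(ds[u_vars], center='U'),
])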

Author:

@dcherian This logic is handled within the to_uniform_grid function...

Contributor:

@mgrover1, as Deepak pointed out, this line is going to result in errors when the "first" grabbed variable doesn't have spatial dimensions...

var = list(ds.variables)[0]

Could we make this more robust?

Author:

I can go ahead and remove the check to see if it is a dataset, since it should be a data array and this is a function that is meant to be used internally... in to_uniform_grid, we check to see if nlat, nlon, and TLAT are in the dimensions, creating a list of variables. These variables are then passed into the prep_for_xesmf function, which iterates over them.


@clyne commented Aug 13, 2021:

@mgrover1 is this function something that would be generally useful for geoscience applications, or is it specific to POP? Just curious. Thanks!

mgrover1 and others added 7 commits August 14, 2021 11:25
Co-authored-by: Anderson Banihirwe <axbanihirwe@ualr.edu>
Co-authored-by: Anderson Banihirwe <axbanihirwe@ualr.edu>
Co-authored-by: Anderson Banihirwe <axbanihirwe@ualr.edu>
@mgrover1 (Author):

> @mgrover1 is this function something that would be generally useful for geoscience applications, or is it specific to POP? Just curious. Thanks!

In its current state, it is fairly POP-specific... but it should be relatively straightforward to make it more generic, to the point where it could be applied to other use cases.

mgrover1 and others added 2 commits August 16, 2021 14:50
Co-authored-by: Anderson Banihirwe <axbanihirwe@ualr.edu>
Comment on lines +16 to +19
assert isinstance(out, xr.Dataset)
assert 'region' in out.dims
assert 'TEMP' in out.variables
assert 14 == len(out.region)
Contributor:

Could we produce some output with Keith's za functionality, and then do some numerical comparison as part of the test?

Author:

I am planning on holding off on this until we can figure out why it's not lining up exactly... It has something to do with the region masks, but I can't nail it down.

Parameters
----------
method: str
Method to use for generating the destination grid, options include 'regular_grid' or 'define_lat_aux'
Contributor:

It's lat_aux_grid down below.

@dcherian (Contributor):

An ASP grad student is asking about regridding POP to a uniform grid. Can we merge the to_uniform_grid bit and wait on the zonal_average bit?

@mgrover1 (Author):

@dcherian that makes sense to me!

Development

Successfully merging this pull request may close these issues:

Add function to compute zonal mean