Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TabulatedPhase: better eval method #226

Merged
merged 1 commit into from May 21, 2022
Merged

Conversation

nollety
Copy link
Contributor

@nollety nollety commented May 18, 2022

Description

Resolves eradiate/eradiate-issues#165

I improved how TabulatedPhase.eval maps the input phase function data to a regular grid of scattering angle cosine (mu) when the input data does not already map to a regular mu grid.

Changes

I removed the hidden _n_mu class attribute and updated eval_mono in the following way:

  • if the input phase function data maps to a mu regular grid, interpolate in wavelength and return the values
  • else, compute regular mu grid based on the smallest mu step found in the input phase function data, and interpolate in wavelength (using xarray.DataArray.interp) and on that regular mu grid (using numpy.interp for performance) and return the values. For further details, refer to a discussion about the choice of this regular mu grid in the comments down below.

I added 2 unit tests for this new implementation (hence the data submodule update).

If the user wants to control the number of points used along the mu axis to discretize a tabulated phase function, they would simply have to provide phase function data mapping directly to a regular mu grid, since in that case the input DataArray is left unchanged.

I added a converter for the data attribute, that ensures the mu coordinate is monotonically increasing.

I added a validator for the data attribute to check that the mu coordinate in the input data set goes from -1.0 to 1.0.

I added TabulatedPhaseFunction to the API Reference.

Checklist

  • The code follows the relevant coding guidelines
  • The code generates no new warnings
  • The code is appropriately documented
  • The code is tested to prove its function
  • The feature branch is rebased on the current state of the main branch
  • I updated the change log if relevant
  • I give permission that the Eradiate project may redistribute my contributions under the terms of its license

@nollety nollety added the enhancement 🦾 New feature or request label May 18, 2022
@nollety
Copy link
Contributor Author

nollety commented May 19, 2022

I found no need to update the regression tests with these changes ; they still pass.

# compute the smallest regular grid that does not loose information
# in the sense of Shannon theorem
dmu = np.abs(self.data.mu.diff(dim="mu")).values.min()
nmu = 2 * int(np.ceil(2.0 / dmu))
Copy link
Contributor Author

@nollety nollety May 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After discussion with @schunkes and @lucio-f, we could consider removing the outside factor 2 (from Shannon's theorem) which would result in twice lower memory footprint and lower interpolation times

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any thoughts about that? @leroyvn

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a signal processing expert so I don't have strong arguments to oppose: if you guys have proper arguments in favour of tuning down the number of grid points, go for it. Just keep that in mind when analysing results ;)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well Lucio made the point that the data is already sampled at a maximum frequency of dmu, so we don't really gain anything from resampling it at 2*dmu.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the record, my reasoning is that the smallest mu step is likely to be found at the extreme right of the [-1, 1] mu interval since this is where the forward scattering peak is located. Therefore, if we buid the regular mu grid using:

dmu_min = np.abs(self.data.mu.diff(dim="mu")).values.min()
nmu = int(np.ceil(2.0 / dmu_min)) + 1
mu = np.linspace(-1, 1, nmu)

the two far-right mu points in [-1, 1] will exactly match the mu points in the input data. From these points on, the data to the left will be interpolated, therefore we incur a loss accuracy. Arguably, the loss of accuracy is minimised where it is important to be accuracy, namely close to mu=1.0.

Copy link
Contributor Author

@nollety nollety May 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@schunkes I am not convinced that:

dmu is the minimal mu step

implies that:

1 / dmu is the maximal frequency in the phase function signal

If you take the Fourrier transform of a realistic particle phase function in the mu-space, you will find an infinite number of harmonics in my opinion

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After more discussion with the whiteboard, we found that assuming the following:

  • the input data is propertly discretised, namely the smallest mu-step is found where the phase function has the sharpest variations
  • the sharpest feature occurs either at mu=-1 or at mu=1.0, or if it is located somewhere in the middle of the interval, the input data is well enough sampled in the mu space
    the smallest delta mu regularisation method is the best.

As illustrated by the figure below, in the presumably most common case where the phase function has sharpest variations around mu=1.0, interpolating on the regular grid prescribed by the Shannon's theorem or on the regular grid using the smallest mu-step lead to the same result:

phase_mu_interp

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a rule of thumb, the smallest the mu step in the input data, the lowest the interpolation errors.

We note that this regular grid tabulated phase function approach is not best suited for phase function with very sharp features, generally speaking, because of the overhead in memory and in interpolation time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, this approach is the only available to us at the moment...

@nollety
Copy link
Contributor Author

nollety commented May 19, 2022

I'll add a couple of tests

@nollety nollety marked this pull request as ready for review May 19, 2022 13:14
@nollety nollety force-pushed the 165-tabphase branch 2 times, most recently from 5b3d62f to 4264958 Compare May 19, 2022 13:53
Copy link
Member

@leroyvn leroyvn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @nollety, this is a nice PR. I made a couple of comments which I'd like to see addressed.

src/eradiate/scenes/phase/_tabulated.py Outdated Show resolved Hide resolved
src/eradiate/scenes/phase/_tabulated.py Outdated Show resolved Hide resolved
src/eradiate/scenes/phase/_tabulated.py Outdated Show resolved Hide resolved
src/eradiate/scenes/phase/_tabulated.py Outdated Show resolved Hide resolved
src/eradiate/scenes/phase/_tabulated.py Show resolved Hide resolved
src/eradiate/scenes/phase/_tabulated.py Outdated Show resolved Hide resolved
src/eradiate/scenes/phase/_tabulated.py Outdated Show resolved Hide resolved
src/eradiate/scenes/phase/_tabulated.py Show resolved Hide resolved
tests/02_eradiate/01_unit/scenes/phase/test_tabulated.py Outdated Show resolved Hide resolved
@nollety
Copy link
Contributor Author

nollety commented May 19, 2022

Thanks for the feedback! @leroyvn

@nollety
Copy link
Contributor Author

nollety commented May 20, 2022

Alright, we are ready for another review round I think :)

@nollety nollety requested review from leroyvn and schunkes May 20, 2022 13:23
@leroyvn
Copy link
Member

leroyvn commented May 21, 2022

Thanks @nollety, we're good to go: I'll merge this.

@leroyvn leroyvn merged commit 4b759c5 into eradiate:main May 21, 2022
@nollety nollety deleted the 165-tabphase branch October 7, 2022 10:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement 🦾 New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants