
Inconsistencies between nn.functional.interpolate and tf.image.resize #18811

Closed

sayakpaul opened this issue Aug 30, 2022 · 15 comments
@sayakpaul
Member

System Info

Upsampling of intermediate feature maps is common for computer vision models. In PyTorch, it's usually implemented using nn.functional.interpolate. For TensorFlow, it's usually tf.image.resize.

But there is an inconsistency between what these two methods yield (Colab Notebook). Sometimes the differences in their outputs are small enough to ignore (ViT, for example). However, when interpolation is repeated several times in a model (MobileViT, for example), these small differences can add up and noticeably shift the final outputs of a model. More details here.
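
A minimal sketch of the kind of comparison involved (not the notebook's exact code; the shapes and layout handling here are illustrative assumptions):

import numpy as np
import tensorflow as tf
import torch

x = np.random.rand(1, 8, 8, 3).astype(np.float32)  # NHWC

# PyTorch works on NCHW, so permute before and after
pt_out = torch.nn.functional.interpolate(
    torch.from_numpy(x).permute(0, 3, 1, 2), size=(16, 16), mode="bilinear"
)
pt_out = pt_out.permute(0, 2, 3, 1).numpy()

# TensorFlow resizes NHWC directly
tf_out = tf.image.resize(x, (16, 16), method="bilinear").numpy()

print(np.abs(pt_out - tf_out).max())  # small but non-zero in float32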

@hollance wrote an amazing blog post discussing this issue.

Who can help?

@amyeroberts @gante @Rocketknight1

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

The Colab Notebook mentioned in the description.

Expected behavior

We should work on a TF utility that yields the same output as nn.functional.interpolate.

@sayakpaul sayakpaul added the bug label Aug 30, 2022
@hollance
Contributor

Note that align_corners=None does give the same result as tf.image.resize, to an absolute tolerance of 1e-6 or so:

import torch

# align_corners=None falls back to the default (align_corners=False) sampling grid
upsampled_logits_pt = torch.nn.functional.interpolate(
    dummy_logits_pt, size=interp_shape, mode="bilinear", align_corners=None
)
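
For reference, the TF call this is compared against would look roughly like the following (the tensor name dummy_logits_tf and the NHWC layout are assumptions mirroring the snippet above):

import tensorflow as tf

upsampled_logits_tf = tf.image.resize(
    dummy_logits_tf, size=interp_shape, method="bilinear"
)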

@ariG23498
Contributor

I faced a similar problem.

With torch.nn.functional.interpolate (a rough sketch follows at the end of this comment):

  • With align_corners=True, the best option is tf.compat.v1.image.resize.
  • With align_corners=False, tf.image.resize does the trick.

Here is a colab notebook that details the solution that I proposed.

@amyeroberts and @gante were against using the tf.compat.v1.image.resize for obvious reasons. @amyeroberts did come up with a solution which is documented in this comment. I hope this provides some value to this thread.
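As a rough illustration of the mapping above (the function name is made up, bilinear mode and NHWC inputs are assumed, and this is not @amyeroberts's solution):

import tensorflow as tf

def resize_like_torch(images, size, align_corners=False):
    # images: NHWC; bilinear resizing assumed throughout
    if align_corners:
        # v1 resize with align_corners=True mirrors
        # torch interpolate(..., align_corners=True); bilinear is the default method here
        return tf.compat.v1.image.resize(images, size, align_corners=True)
    # TF2 resize (half-pixel centers) mirrors align_corners=False
    return tf.image.resize(images, size, method="bilinear")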

@sayakpaul
Member Author

Thanks @hollance and @ariG23498 for chiming in.

When there are only one or two interpolation ops, the small differences are fine, but when they are applied several times (as mentioned earlier), these differences compound and create mismatches in the final outputs.

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@Rocketknight1
Member

Pinging this to keep it open - we've implemented TF versions of Torch ops like AdaptivePool, so this might be something to explore as well. I hope a performant solution can be implemented with native ops (with or without XLA compilation) without needing to write our own CUDA, though!
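For reference, a naive TF counterpart of torch.nn.AdaptiveAvgPool1d might look like the sketch below; this is illustrative only and not the implementation referenced in this thread:

import tensorflow as tf

def naive_adaptive_avg_pool1d(x, output_size):
    # x: (batch, length, channels) with a statically known length.
    # Mirrors PyTorch's AdaptiveAvgPool1d bin boundaries:
    # start = floor(i * L / out), end = ceil((i + 1) * L / out).
    length = x.shape[1]
    pooled = []
    for i in range(output_size):
        start = (i * length) // output_size
        end = -((-(i + 1) * length) // output_size)  # ceil division
        pooled.append(tf.reduce_mean(x[:, start:end, :], axis=1))
    return tf.stack(pooled, axis=1)  # (batch, output_size, channels)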

@sayakpaul
Member Author

Pinging this to keep it open - we've implemented TF versions of Torch ops like AdaptivePool, so this might be something to explore as well.

Could you point me to something relevant? Would love to see the new implementations.

@Rocketknight1
Member

Hi @sayakpaul, I'm extremely sorry! I thought we'd implemented it in data2vec2 already, but checking the code it seems like we still have the old version there. Here's a much, much more performant version.

@Rocketknight1
Member

The new version will also allow XLA compilation, unlike the original sparse implementation.

@sayakpaul
Member Author

Oh yeah, I remember this one. This reminded me that I need to update the data2vec code with your latest implementation.

Thanks, Matt!

@Rocketknight1
Member

Although, checking out that notebook, it seems like PyTorch's interpolate and TF's resize are using the same algorithm, and the differences are mostly numerical; when I switch the dtype to float64, the max absolute difference is 1e-7.

I think this will make it extremely hard to bring the accuracies any closer - we would have to rewrite the internals of the algorithm so that they're not just mathematically equivalent, but they lose precision in the same places as well. I think we're stuck with the issue, unfortunately!
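One way to run the dtype check described above (illustrative only; note that tf.image.resize may compute and return float32 internally regardless of the input dtype, so the exact numbers can vary):

import numpy as np
import tensorflow as tf
import torch

x = np.random.rand(1, 8, 8, 3)  # float64 by default

pt_out = torch.nn.functional.interpolate(
    torch.from_numpy(x).permute(0, 3, 1, 2), size=(16, 16), mode="bilinear"
).permute(0, 2, 3, 1).numpy()
tf_out = tf.image.resize(x, (16, 16), method="bilinear").numpy()

# Reported above to be around 1e-7 when working in float64
print(np.abs(pt_out - tf_out).max())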

@sayakpaul
Member Author

Fair enough. The tiny differences just add up when there are multiple such interpolations. Otherwise, it's not a big deal.

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@ariG23498
Contributor

Ping.

I don't think it is resolved yet.

@sayakpaul
Member Author

I doubt it will ever be resolved in an apples-to-apples manner, considering #18811 (comment).

We need to be careful when comparing predictions of models (PT vs. TF) that stack interpolations, especially about the tolerances we're using.
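In practice that might look like an explicit tolerance check in cross-framework tests, along these lines (the helper name and tolerance value are illustrative assumptions, not an agreed-upon standard):

import numpy as np

def assert_close(pt_outputs: np.ndarray, tf_outputs: np.ndarray, atol: float = 1e-4):
    # Hypothetical helper for PT vs. TF comparisons with a relaxed tolerance
    max_diff = np.abs(pt_outputs - tf_outputs).max()
    assert max_diff < atol, f"max abs diff {max_diff} exceeds tolerance {atol}"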

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.
