
make the tensor contiguous when passing numpy object to tensor #2483

Merged · 3 commits · Sep 18, 2020

Conversation

@dddzg commented Jul 17, 2020

to_tensor will return a noncontiguous tensor if we pass a NumPy image (e.g. one read by cv.imread), which is a very common usage. If we pass a PIL object, however, the result is contiguous (https://github.com/pytorch/vision/blob/master/torchvision/transforms/functional.py#L93).
I think it is necessary to make them consistent.

Check the code:

torch.from_numpy(np.random.rand(299, 299, 3).transpose(2, 0, 1)).is_contiguous()

It returns False.
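The behaviour is easy to reproduce with NumPy alone; a minimal sketch of why the transpose yields a noncontiguous buffer (torch.from_numpy shares the array's strides, so the resulting tensor inherits this):

```python
import numpy as np

# HWC image, as cv.imread would return (random values stand in for pixels).
img = np.random.rand(299, 299, 3)

# Transposing to CHW only permutes strides, so the view is not C-contiguous.
# torch.from_numpy shares these strides, which is why the resulting tensor
# reports is_contiguous() == False.
chw = img.transpose(2, 0, 1)
print(chw.flags['C_CONTIGUOUS'])    # False

# An explicit copy restores a dense layout (the NumPy analogue of
# tensor.contiguous()).
dense = np.ascontiguousarray(chw)
print(dense.flags['C_CONTIGUOUS'])  # True
```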

@vfdev-5 (Collaborator) left a comment

Thanks for the PR @dddzg !
This makes sense.

PS: sorry for the delay.

@vfdev-5 (Collaborator) commented Sep 17, 2020

@dddzg could you please update your branch to the current master? Thanks!

@dddzg (Author) commented Sep 17, 2020

@vfdev-5 Hi, thanks for your reply. I have updated the branch.

@vfdev-5 (Collaborator) commented Sep 17, 2020

Hi @dddzg, could you please update your branch so that we see only your commits? Probably merge instead of rebase.

@dddzg (Author) commented Sep 17, 2020

Oops. Is there any chance I can make it right?

@vfdev-5 (Collaborator) commented Sep 17, 2020

@dddzg thanks, it is OK now !

@dddzg (Author) commented Sep 17, 2020

@vfdev-5 It seems to be OK now with force-push?

@vfdev-5 (Collaborator) commented Sep 17, 2020

@dddzg looks good !

codecov bot commented Sep 17, 2020

Codecov Report

Merging #2483 into master will not change coverage.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #2483   +/-   ##
=======================================
  Coverage   72.57%   72.57%           
=======================================
  Files          95       95           
  Lines        8249     8249           
  Branches     1309     1309           
=======================================
  Hits         5987     5987           
  Misses       1855     1855           
  Partials      407      407           
Impacted Files Coverage Δ
torchvision/transforms/functional.py 82.13% <100.00%> (ø)

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5e4a9f6...bf0ad73. Read the comment docs.

@dddzg (Author) commented Sep 17, 2020

Anything else I should do?

@vfdev-5 (Collaborator) commented Sep 17, 2020

@dddzg it is OK for now. I'll talk about this PR tomorrow with @fmassa

@vfdev-5 vfdev-5 merged commit 7f5b2c4 into pytorch:master Sep 18, 2020
@vfdev-5 (Collaborator) commented Sep 18, 2020

@dddzg thanks again for your help !

bryant1410 pushed a commit to bryant1410/vision-1 that referenced this pull request Nov 22, 2020
vfdev-5 added a commit to Quansight/vision that referenced this pull request Dec 4, 2020
@moi90 commented Jul 6, 2021

I'm surprised that you decided to make contiguous the default. I would argue it would be better to leave tensors noncontiguous in the general case, because .contiguous() incurs a mandatory copy, which is a dramatic performance penalty (and superfluous in most cases, since the data is copied a second time anyway when multiple images are assembled into a batch).

@fmassa (Member) commented Aug 13, 2021

I agree with @moi90's comment above.

@vfdev-5 do you remember the reasons why we made the tensor contiguous in the first place?

@fmassa (Member) commented Aug 13, 2021

FYI, our read_image functions don't return contiguous tensors, specifically for the reasons mentioned by @moi90.

@vfdev-5 (Collaborator) commented Aug 13, 2021

I think it was to keep it consistent with PIL input: https://github.com/dddzg/vision/blob/bf0ad73b9610fac8f117810a3e30c7389c727e4f/torchvision/transforms/functional.py#L100

@fmassa (Member) commented Aug 13, 2021

Oh, I see. Sounds good then. ToTensor does too many things internally anyway, and in previous versions of PyTorch the / 255 normalization would make the tensor contiguous as well, so let's keep things as is.

@moi90 commented Aug 19, 2021

I think it was to keep it consistent with PIL input: https://github.com/dddzg/vision/blob/bf0ad73b9610fac8f117810a3e30c7389c727e4f/torchvision/transforms/functional.py#L100

If consistency is important, I would rather remove the .contiguous() call in both places.
This way, the conversion can happen implicitly when needed, and no memory-copy overhead is incurred otherwise.

But if you think this is the correct behavior, I don't feel too strongly about this.

@datumbox (Contributor)

Ok, taking a step back...

@fmassa @vfdev-5 pil_to_tensor() does not call .contiguous(), while to_tensor() does...

A few thoughts:

  1. Shall we align the two?
  2. Can we make pil_to_tensor() call to_tensor() and cast to the right type, instead of having two separate similar implementations?

@fmassa (Member) commented Aug 26, 2021

@datumbox

Can we make pil_to_tensor() call to_tensor() and cast to the right type, instead of having two separate similar implementations?

I would actually do it the other way around, and let to_tensor() call pil_to_tensor(). to_tensor is too overloaded IMO, and has brought us many problems when passing numpy arrays internally (i.e., should it normalize or not).

Shall we align the two?

The .contiguous() call has been in to_tensor() since the beginning, but I think there is no strong reason why we couldn't remove it nowadays.

@dddzg (Author) commented Aug 26, 2021

@moi90 I agree with you, but I think it is still very necessary to call .contiguous() in to_tensor().

For example, suppose you are running model inference on a single-image batch (a very common usage).
The code looks like:

img = read(path)
tensor = normalize(to_tensor(img)).unsqueeze(0)
result = model(tensor)

If the tensor is not contiguous here, the result will be unexpected, and the problem is very difficult to track down (I have hit it myself T.T).
During training, multiple images are assembled into a contiguous batch tensor, so in most cases we do not check contiguity during inference preprocessing.
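The single-image path really is different from the batching path: a NumPy sketch (torch's unsqueeze behaves the same way) showing that adding the batch dimension does not make the data contiguous:

```python
import numpy as np

# Noncontiguous CHW view of an HWC image, as to_tensor would produce
# without a .contiguous() call.
chw = np.random.rand(8, 8, 3).transpose(2, 0, 1)

# Adding a leading batch axis (the unsqueeze(0) step) only inserts a
# stride; no copy happens, so the single-image batch stays noncontiguous
# all the way into the model.
batch = chw[None]
print(batch.shape, batch.flags['C_CONTIGUOUS'])  # (1, 3, 8, 8) False
```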

@fmassa (Member) commented Aug 26, 2021

@dddzg when you say

If the tensor is not contiguous here, the result will be unexpected

Do you mean that the code will run much slower than if your input tensor was contiguous?

@moi90 commented Aug 26, 2021

If the tensor is not contiguous here, the result will be unexpected.

If the result is different than with a contiguous tensor, this is very likely a bug.

Because during training, multiple images are assembled into a contiguous batch tensor.

This is exactly my point: data is copied anyway when it is assembled into a (contiguous) batch. No need for an additional copy before that just to make the data contiguous.

@dddzg (Author) commented Aug 26, 2021

Hi @moi90, @fmassa. In my experience, the model is not a standard PyTorch model; it is a TRTModule, a PyTorch-style wrapper around TensorRT. So the result is totally different from the contiguous case, and the problem is very difficult to notice when the tensor is not contiguous.

This is exactly my point: Data is copied anyways when the data is assembled into a (contiguous) batch. No need for an additional copy before that just to make the data contiguous.

I agree with you for batch training.

@datumbox (Contributor)

I would actually do it the other way around, and let to_tensor() call pil_to_tensor(). to_tensor is too overloaded IMO, and has brought us many problems when passing numpy arrays internally (i.e., should it normalize or not).

@fmassa As you know, I'm all for simplifying pil_to_tensor(), but this would break BC. Given that the functionality of to_tensor() is a superset of pil_to_tensor()'s, can you clarify your proposal?

@fmassa (Member) commented Aug 27, 2021

@datumbox to_tensor casts the input to float in the 0-1 range and handles PIL / numpy data types, and we've had many issues in the past with users calling to_tensor on numpy arrays and getting unexpected results. I would like to deprecate it at some point.

We decided to split to_tensor in two functions with well-defined scope: pil_to_tensor, which does only PIL -> Tensor and preserves the original memory layout and datatypes (so it's super efficient), and convert_image_dtype which does the casting + normalization following well-documented rules.
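For the integer-to-float direction, the rule convert_image_dtype applies can be sketched as a scale by the input dtype's maximum value (shown here in plain NumPy for illustration):

```python
import numpy as np

# Sketch of the int -> float conversion rule: divide by the input integer
# type's maximum value so the result lands in [0.0, 1.0].
img_u8 = np.array([[0, 128, 255]], dtype=np.uint8)
img_f = img_u8.astype(np.float32) / np.iinfo(np.uint8).max
print(img_f.min(), img_f.max())  # 0.0 1.0
```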

My proposal above was to use pil_to_tensor inside to_tensor when the input is a PIL image, and apply the normalization outside as it's currently done. We do save some lines of code, although not much.

@dddzg this seems to indicate that TRT can't handle noncontiguous inputs. In general, I would open an issue with TensorRT to let them know, so that they can add a .contiguous() call inside their functions and make their codebase more robust.

@datumbox (Contributor)

I had an offline discussion with @fmassa and also did a small prototype to check whether calling pil_to_tensor() from to_tensor() is even possible. It turns out it's not; see #4326. to_tensor() handles specific corner cases that pil_to_tensor() does not, so replacing one with the other leads to some tests failing with:

TypeError: can't convert np.ndarray of type numpy.uint16. The only supported types are: float64, float32, float16, complex64, complex128, int64, int32, int16, int8, uint8, and bool.

If you want to read more about some of the inconsistencies of to_tensor(), check the notes I left on the above PR. For me, to_tensor() was "hate at first sight", so I might be biased, but I think we need to deprecate it as soon as possible in favour of pil_to_tensor() and convert_image_dtype(). Before doing this, we also need to look for cases in our transforms package where we make assumptions about the maximum value of the tensor (usually either 1.0 or 255.0) and ensure those still work.
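One of those corner cases is 16-bit imagery: torch has no uint16 dtype, which is exactly the TypeError quoted above. A hedged sketch of the kind of widening workaround needed (int32 preserves the full uint16 range):

```python
import numpy as np

# torch.from_numpy rejects np.uint16 (no matching torch dtype).
# Widening to int32 keeps every value intact before handing the
# array to torch.
arr = np.array([[0, 1000, 65535]], dtype=np.uint16)
widened = arr.astype(np.int32)
print(widened.dtype, int(widened.max()))  # int32 65535
```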

@theahura commented Oct 3, 2021

Not sure if this falls into the realm of 'unexpected results', but I wanted to flag #4529, which runs into process hangs specifically on the contiguous() call.
