VideoClips.subset() does not reflect output_format of the original VideoClips #6696

Closed
Simcs opened this issue Oct 4, 2022 · 3 comments

Simcs commented Oct 4, 2022

🐛 Describe the bug

When a new VideoClips is created with VideoClips.subset(), the output_format of the original VideoClips is not carried over to the new instance.

Because of this, passing output_format="TCHW" to torchvision.datasets.hmdb51.HMDB51 has no effect.
It seems that HMDB51 creates full_video_clips and then derives video_clips for the requested fold of the dataset using subset().
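
The failure mode can be sketched without torchvision. This is only an illustration: ToyClips is a hypothetical stand-in, not torchvision's VideoClips, and its fields and its "THWC" default are assumptions. It shows how a subset() that builds a new instance without forwarding output_format silently falls back to the constructor default.

```python
class ToyClips:
    """Hypothetical stand-in for a clip container; not torchvision code."""

    def __init__(self, paths, output_format="THWC"):
        self.paths = list(paths)
        self.output_format = output_format

    def subset(self, indices):
        # Bug pattern: output_format is not forwarded, so the new
        # instance is built with the constructor default ("THWC").
        return type(self)([self.paths[i] for i in indices])


clips = ToyClips(["a.avi", "b.avi", "c.avi"], output_format="TCHW")
fold = clips.subset([0, 2])
print(clips.output_format)  # TCHW
print(fold.output_format)   # THWC
```

The subset keeps the right files but loses the requested layout, which matches the identical shapes reported in this issue.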

Code snippet

The following snippet reproduces the bug:

import os
from torchvision.datasets import HMDB51

dataset_root = '~/workspace/dataset/hmdb51'
dataset_root = os.path.expanduser(dataset_root)
annotation_path = '~/workspace/dataset/testTrainMulti_7030_splits'
annotation_path = os.path.expanduser(annotation_path)

hmdb51_thwc = HMDB51(
    root=dataset_root,
    annotation_path=annotation_path,
    frames_per_clip=16,
    step_between_clips=1,
    frame_rate=2,
    fold=1,
    train=False,
    num_workers=8,
    output_format="THWC",
)
video, audio, class_index = hmdb51_thwc[0]
print(video.shape)

hmdb51_tchw = HMDB51(
    root=dataset_root,
    annotation_path=annotation_path,
    frames_per_clip=16,
    step_between_clips=1,
    frame_rate=2,
    fold=1,
    train=False,
    num_workers=8,
    output_format="TCHW",
)
video, audio, class_index = hmdb51_tchw[0]
print(video.shape)

Output:

torch.Size([16, 240, 320, 3])
torch.Size([16, 240, 320, 3])

Possible solution:

Pass the output_format of the original VideoClips when instantiating the new VideoClips in VideoClips.subset().
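
A minimal sketch of that idea, using a hypothetical toy class rather than torchvision's actual source (ToyClipsFixed, its fields, and its "THWC" default are assumptions made for illustration): subset() forwards the parent's output_format to the new instance.

```python
class ToyClipsFixed:
    """Hypothetical stand-in for a clip container; not torchvision code."""

    def __init__(self, paths, output_format="THWC"):
        self.paths = list(paths)
        self.output_format = output_format

    def subset(self, indices):
        # Forward the parent's output_format so the subset keeps
        # the layout that was requested for the original instance.
        return type(self)(
            [self.paths[i] for i in indices],
            output_format=self.output_format,
        )


fixed_clips = ToyClipsFixed(["a.avi", "b.avi", "c.avi"], output_format="TCHW")
fixed_fold = fixed_clips.subset([0, 2])
print(fixed_fold.output_format)  # TCHW
```

Any other constructor argument that affects decoding would need to be forwarded the same way.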

Versions

Collecting environment information...
PyTorch version: 1.12.1+cu113
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.4 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.31

Python version: 3.10.6 (main, Aug 30 2022, 03:24:51) [GCC 9.4.0] (64-bit runtime)
Python platform: Linux-5.4.0-126-generic-x86_64-with-glibc2.31
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: 
GPU 0: NVIDIA TITAN RTX
GPU 1: NVIDIA TITAN RTX

Nvidia driver version: 515.65.01
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.2.0
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.2.0
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.2.0
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.2.0
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.2.0
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.2.0
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.2.0
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] numpy==1.23.2
[pip3] pytorch-lightning==1.3.8
[pip3] pytorchvideo==0.1.5
[pip3] torch==1.12.1+cu113
[pip3] torchaudio==0.12.1+cu113
[pip3] torchmetrics==0.5.1
[pip3] torchvision==0.13.1+cu113
[conda] Could not collect

Simcs commented Oct 4, 2022

For now, I am working around this issue by setting video_clips.output_format = "TCHW" on the dataset after it is constructed.

For example,

hmdb51_tchw = HMDB51(
    root=dataset_root,
    annotation_path=annotation_path,
    frames_per_clip=16,
    step_between_clips=1,
    frame_rate=2,
    fold=1,
    train=False,
    num_workers=8,
    output_format="TCHW",
    # hmdb51_test_metadata is assumed to have been computed beforehand (not shown here)
    _precomputed_metadata=hmdb51_test_metadata,
)
hmdb51_tchw.video_clips.output_format = "TCHW"
video, audio, class_index = hmdb51_tchw[0]
print(video.shape)

Output:

torch.Size([16, 3, 240, 320])
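
The same workaround can be wrapped in a small helper. This is a sketch under stated assumptions: force_output_format is a hypothetical name, not a torchvision function, and the accepted values are limited to the two formats mentioned in this issue. It relies only on the dataset exposing a video_clips attribute with an output_format field, as in the snippet above, and is demonstrated here on a stand-in object instead of a real HMDB51 dataset.

```python
from types import SimpleNamespace

# Assumption: only the two layouts discussed in this issue are accepted.
VALID_FORMATS = ("THWC", "TCHW")


def force_output_format(dataset, output_format):
    """Hypothetical helper: set output_format on dataset.video_clips."""
    if output_format not in VALID_FORMATS:
        raise ValueError(f"output_format must be one of {VALID_FORMATS}")
    dataset.video_clips.output_format = output_format
    return dataset


# Stand-in object with the same attribute layout as the dataset above.
fake_dataset = SimpleNamespace(video_clips=SimpleNamespace(output_format="THWC"))
force_output_format(fake_dataset, "TCHW")
print(fake_dataset.video_clips.output_format)  # TCHW
```

On a torchvision version that already contains the fix, the helper should be unnecessary.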

@YosuaMichael YosuaMichael self-assigned this Oct 4, 2022

YosuaMichael (Contributor) commented

Thanks for the report, @Simcs. I confirm that this is indeed a bug, and I will create a PR to fix it.

YosuaMichael (Contributor) commented

I believe I have fixed this issue with #6700 (the fix will be in torchvision v0.14), so I will close the issue for now.
