Vision Problem Solving

Solutions for Computer Vision problems.

1. ViViT

1.1. Video classification with ViViT

This notebook tries out the ViViT (Video Vision Transformer) model.

The dataset is SynapseMNIST3D from MedMNIST3D, in which each sample is a sequence of synapse images that together represent a 3D volume. The samples are displayed inside the notebook with a Jupyter widget.
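Because each sample is a stack of images, a ViViT-style model can cut it into non-overlapping 3D blocks (tubelets) and turn each block into one token. The sketch below only counts how many tokens such a tiling would produce. The function name and the shapes are assumptions chosen for illustration, not values taken from the notebook.

```python
def count_tubelet_tokens(volume_shape, tubelet_shape):
    """Count the non-overlapping tubelets that tile a (depth, height, width) volume."""
    count = 1
    for size, step in zip(volume_shape, tubelet_shape):
        if size % step != 0:
            raise ValueError(f"dimension {size} is not divisible by tubelet size {step}")
        count *= size // step
    return count


# Hypothetical volume shape, for illustration only.
volume_shape = (32, 32, 32)

tokens_small = count_tubelet_tokens(volume_shape, (8, 8, 8))
tokens_large = count_tubelet_tokens(volume_shape, (16, 16, 16))
print(tokens_small)  # 64
print(tokens_large)  # 8
```

Smaller tubelets give more tokens per sample, which raises the cost of self-attention, since attention compares every token with every other token.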

Inference

These are some of the samples from inference.

1.2. ViViT with/without Token Learner

This notebook explores the effect of adding a Token Learner to ViViT.
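As a rough illustration of what a Token Learner does, the sketch below reduces N input tokens to K output tokens, where each output token is a softmax-weighted average of the inputs. This is a simplified, pure-Python stand-in with a single linear scoring step. It is not the notebook's layer, and the names `token_learner` and `score_weights` are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def token_learner(tokens, score_weights):
    """Reduce N tokens of dimension D to K tokens of dimension D.

    tokens: list of N vectors of length D.
    score_weights: list of K vectors of length D. These stand in for
        learned parameters; here they are random for illustration.
    """
    dim = len(tokens[0])
    learned = []
    for weights in score_weights:
        # One scalar score per input token for this output token.
        scores = [sum(w * x for w, x in zip(weights, tok)) for tok in tokens]
        attention = softmax(scores)  # weights over the N inputs, summing to 1
        # Each output token is a weighted average of all input tokens.
        learned.append(
            [sum(a * tok[d] for a, tok in zip(attention, tokens)) for d in range(dim)]
        )
    return learned


random.seed(0)
N, D, K = 64, 16, 8  # illustrative sizes, not the notebook's settings
tokens = [[random.gauss(0.0, 1.0) for _ in range(D)] for _ in range(N)]
score_weights = [[random.gauss(0.0, 1.0) for _ in range(D)] for _ in range(K)]

reduced = token_learner(tokens, score_weights)
print(len(reduced), len(reduced[0]))  # 8 16
```

Every transformer block after this step attends over K tokens instead of N, which is where the reduction in compute would come from.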

The training datasets are from MedMNIST3D, a collection of 3D medical images covering several sets of classes. The model was tested with patch sizes 8 and 16, and the Token Learner was placed at the midpoint of the transformer blocks. The AdamW optimizer was used, with its weight decay serving as regularization, and the learning rate was reduced on plateau.
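Reducing the learning rate on plateau means lowering it whenever the monitored metric stops improving for a few epochs. The class below is a minimal sketch of that rule, not the callback used in the notebook. The class name, the default values, and the loss values are all made up for illustration; deep learning frameworks ship their own versions of this scheduler.

```python
class ReduceOnPlateau:
    """Multiply the learning rate by `factor` after `patience` epochs without improvement."""

    def __init__(self, lr, factor=0.5, patience=2, min_lr=1e-6):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.min_lr = min_lr
        self.best = float("inf")
        self.wait = 0  # epochs since the last improvement

    def step(self, val_loss):
        """Update state with the latest validation loss and return the current learning rate."""
        if val_loss < self.best:
            self.best = val_loss
            self.wait = 0
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.wait = 0
        return self.lr


# Made-up validation losses, for illustration only.
val_losses = [1.00, 0.80, 0.81, 0.82, 0.79, 0.80, 0.80]

scheduler = ReduceOnPlateau(lr=1e-3)
lr_history = [scheduler.step(loss) for loss in val_losses]
print(lr_history)
```

In this run the rate is halved twice: once after the losses 0.81 and 0.82 fail to beat 0.80, and again after the two 0.80 values fail to beat 0.79.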

The Result

Overall, the model with the Token Learner performed better than the naive model in validation accuracy and loss over epochs. There were also no signs of overfitting with the Token Learner, even though the training time was shorter. The results suggest that models with a Token Learner learn faster, without a significant risk of overfitting.

All of the result graphs are displayed on TensorBoard.
