Video Understanding and Q&A Tool

This project takes a YouTube video link and builds a comprehensive understanding of the video's content through audio transcription and image captioning. An LLM combines the audio and visual context. You can also ask questions, and the tool answers them based on the video's content 🚀
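As a rough illustration of how audio and visual context might be merged before being sent to an LLM, here is a minimal sketch. The function name `build_context_prompt` and the prompt wording are assumptions made for this example; they are not taken from the project's code.

```python
def build_context_prompt(transcript, captions):
    """Merge an audio transcript and per-frame descriptions into one prompt.

    Hypothetical helper: the real project may structure its prompt differently.
    """
    caption_lines = "\n".join(f"- {c}" for c in captions)
    return (
        "Summarize what this video is about.\n\n"
        f"Audio transcript:\n{transcript.strip()}\n\n"
        f"Visual descriptions:\n{caption_lines}"
    )


# Made-up inputs, purely for illustration.
prompt = build_context_prompt(
    "Today we are baking sourdough bread.",
    ["a bowl of dough on a counter", "a loaf in an oven"],
)
print(prompt)
```

The resulting string could then be passed to whichever LLM client the app uses.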

Features ✨

👉 Video Understanding: The tool uses a Transformer model (OpenAI Whisper) for audio transcription, converting speech into text. It also applies image captioning techniques to extract information from images within the video. Image embeddings are compared so that only unique images are used for extracting information. Video and audio are processed in parallel.
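Two ideas from this feature can be sketched in a few lines: dropping near-duplicate frames by comparing embeddings, and running the audio and visual branches concurrently. This is a minimal sketch under stated assumptions, not the project's implementation. The cosine-similarity measure, the `0.9` threshold, the toy two-dimensional embeddings, and the placeholder functions `transcribe_audio` and `describe_frames` are all hypothetical.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_unique_frames(embeddings, threshold=0.9):
    """Return indices of frames that are not too similar to any kept frame.

    The threshold is an assumed value chosen for this example.
    """
    kept = []
    for i, emb in enumerate(embeddings):
        if all(cosine_similarity(emb, embeddings[j]) < threshold for j in kept):
            kept.append(i)
    return kept


def transcribe_audio(path):
    # Placeholder standing in for the real transcription call.
    return f"transcript of {path}"


def describe_frames(path):
    # Placeholder standing in for frame extraction plus image lookup.
    return [f"caption for a frame of {path}"]


def process_video(path):
    """Run the audio and visual branches concurrently and return both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        audio_future = pool.submit(transcribe_audio, path)
        frames_future = pool.submit(describe_frames, path)
        return audio_future.result(), frames_future.result()


# Toy embeddings: frames 0 and 1 are near-duplicates, frame 2 is different.
frame_embeddings = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0]]
unique_indices = select_unique_frames(frame_embeddings)
transcript, captions = process_video("video.mp4")
```

Threads suit this sketch because both branches would mostly wait on I/O or external services; CPU-bound local inference might call for processes instead.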

👉 Question & Answer: Users can ask questions about the video's content. The tool uses Chromadb as a vector database to retrieve the most relevant context and give contextually relevant answers.
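To show what a vector database contributes to Q&A, here is a toy, standard-library stand-in. It is not Chromadb and does not use its API. The `ToyVectorStore` class, the bag-of-words `embed` function, and the sample sentences are all assumptions made for illustration; a real setup would use learned embeddings.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words 'embedding': counts of lowercase words."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """In-memory store that ranks stored text chunks by similarity to a query."""

    def __init__(self):
        self._items = []

    def add(self, text):
        self._items.append((text, embed(text)))

    def query(self, question, n_results=1):
        q = embed(question)
        ranked = sorted(
            self._items,
            key=lambda item: cosine_similarity(q, item[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:n_results]]


store = ToyVectorStore()
store.add("The speaker explains how to install Python on Windows.")
store.add("A chart compares battery life across three phones.")
store.add("The host thanks the sponsors at the end of the video.")

# The best-matching chunk would be handed to the LLM as context for the answer.
top_chunks = store.query("Which phones have better battery life?")
```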

How to Use ⚙️

• Clone this repository: git clone https://github.com/Dev-Khant/tell-what-a-video-does.git

• Install the required dependencies: pip install -r requirements.txt

• Run the streamlit app: streamlit run app.py

• Provide a YouTube video link along with your OpenAI, Hugging Face, and SerpApi tokens

Technical 🖥️

Hugging Face: Utilized to access the OpenAI Whisper model for audio transcription.

SerpApi: Used to access the Google Lens API for image information.

Streamlit: Used to create the interactive web interface for the project.

Chromadb: The vector database used for storing and retrieving Q&A information.

Work in Progress 🚧

  1. Add Weaviate and let the user select their VectorDB.
  2. Give the chatbot internet access.
  3. Add an option to upload a video.
  4. Store video explanations so they can be used later.

About

Know what a YouTube video is about with the help of LLMs.
