This is my personal movie prediction repository.
First install PyTorch, for example:
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
Then install this repository:
pip install git+https://github.com/jhe921/movie_prediction.git
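A quick way to confirm that both install steps worked is to check that the packages can be found by Python. The sketch below is only illustrative: it assumes the repository installs under the import name movie_prediction, which is a guess based on the repository name, and it reports CUDA availability only if torch is present.

```python
import importlib.util


def check_install(packages=("torch", "movie_prediction")):
    """Return a dict mapping each top-level package name to True if importable.

    The name "movie_prediction" is an assumption based on the repository name.
    """
    return {name: importlib.util.find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    status = check_install()
    for name, found in status.items():
        print(f"{name}: {'found' if found else 'MISSING'}")
    if status.get("torch"):
        import torch

        # False here usually means a CPU-only build or a driver mismatch.
        print("CUDA available:", torch.cuda.is_available())
```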
Training a principal prediction model requires running the principal_prediction
notebook. To run the notebook you will need:
- Jupyter installed
- The Cornell Movie-Dialogs Corpus downloaded to
data/movie_prediction
- The IMDb movies extensive dataset downloaded to
data/imdb_movie_meta
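Before opening the notebook, it can help to confirm that both datasets are where the list above expects them. This is a minimal sketch, assuming the paths are relative to the directory you launch Jupyter from; the helper name missing_data_dirs is hypothetical and not part of this repository.

```python
from pathlib import Path

# Directory names are taken from the list above; the check itself is illustrative.
REQUIRED_DIRS = ("data/movie_prediction", "data/imdb_movie_meta")


def missing_data_dirs(root=".", required=REQUIRED_DIRS):
    """Return the required directories that are absent or empty under root."""
    missing = []
    for rel in required:
        path = Path(root) / rel
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    missing = missing_data_dirs()
    if missing:
        print("Missing or empty:", ", ".join(missing))
    else:
        print("Both dataset directories are present and non-empty.")
```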
To run the language model training without the notebook, export the Utterances
column of your actor_lines.tsv file to a .txt file and run:
python run_mlm.py --model_name_or_path distilbert-base-uncased \
--train_file <PATH-TO-YOUR-.txt-FILE> --do_train \
--output_dir movie_prediction/models/distilbert-base-uncased-movie-tuned \
--line_by_line --max_seq_len 75
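The export step that feeds the command above could be sketched as follows. It assumes actor_lines.tsv is tab-separated with a header row that contains an Utterances column, and that it uses standard CSV quoting; the function name export_utterances and the output file name are hypothetical. Because the training command uses --line_by_line, the sketch collapses whitespace so every utterance stays on a single line, and it skips empty rows.

```python
import csv


def export_utterances(tsv_path, txt_path, column="Utterances"):
    """Write one utterance per line to txt_path; return the number written."""
    written = 0
    with open(tsv_path, newline="", encoding="utf-8") as src, \
         open(txt_path, "w", encoding="utf-8") as dst:
        reader = csv.DictReader(src, delimiter="\t")
        if column not in (reader.fieldnames or []):
            raise KeyError(f"column {column!r} not found in {tsv_path}")
        for row in reader:
            # Collapse tabs and newlines so each utterance occupies one line.
            text = " ".join((row.get(column) or "").split())
            if text:
                dst.write(text + "\n")
                written += 1
    return written


# Example call (both paths are placeholders):
# export_utterances("actor_lines.tsv", "utterances.txt")
```

The resulting .txt file is what you would pass to --train_file.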
To run the interactive Streamlit app:
- Copy a principal prediction model to
movie_prediction/models/principal-prediction-tuned-inference
- Type
streamlit run app.py
into your shell
To serve a model with FastAPI:
- Copy a principal prediction model to
movie_prediction/models/principal-prediction-tuned-inference
- Type
uvicorn main:app
into your shell