Topic-Modeling-Book-Descriptions

This is an LDA topic model trained on a dataset of book descriptions. The frontend lets you enter a book description and predicts its topics.

We used Latent Dirichlet Allocation (LDA), a popular topic modeling technique. To learn more about it, visit https://en.wikipedia.org/wiki/Latent_Dirichlet_allocation.

The following graph shows the log-likelihood score for models trained with different parameters. We chose 15 topics with a learning decay of 0.6 to keep a healthy number of topics without sacrificing topic coherence.
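The parameter comparison described above can be sketched as a grid search. This is a minimal sketch, assuming scikit-learn's LatentDirichletAllocation (whose default score is an approximate log-likelihood); the function name `search_lda_params` and the candidate values are hypothetical and are not taken from trainModel.py.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import GridSearchCV


def search_lda_params(descriptions, topic_counts=(10, 15, 20),
                      decays=(0.6, 0.7, 0.9), cv=3):
    """Compare LDA settings by cross-validated approximate log-likelihood.

    `descriptions` is a list of raw description strings. The candidate
    values are illustrative assumptions, not the project's actual grid.
    """
    # Turn each description into a bag-of-words count vector.
    vectorizer = CountVectorizer(stop_words="english")
    doc_term = vectorizer.fit_transform(descriptions)

    # learning_decay only affects the online variational Bayes method.
    lda = LatentDirichletAllocation(learning_method="online", random_state=0)
    param_grid = {
        "n_components": list(topic_counts),
        "learning_decay": list(decays),
    }

    # With no explicit scoring, GridSearchCV calls lda.score(), which
    # returns the approximate log-likelihood (higher is better).
    search = GridSearchCV(lda, param_grid=param_grid, cv=cv)
    search.fit(doc_term)
    return search.best_params_, search.cv_results_
```

The highest log-likelihood does not always give the most readable topics, so the winning settings are still worth checking against the top words of each topic.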

The dataset used in this project can be acquired from https://www.kaggle.com/datasets/dylanjcastillo/7k-books-with-metadata

Installation (may vary based on OS)

Clone this repository
Create a Python virtual environment and activate it

python3 -m venv newvenv
// Linux or Unix:
source newvenv/bin/activate
// Windows:
newvenv\Scripts\activate

Change into the backend/services/model directory
Install Python dependencies: pip3 install -r requirements.txt
Train the model by running trainModel.py: python trainModel.py
Set the BENTOML_CONFIG environment variable

// Windows:
set BENTOML_CONFIG=./config.yaml
// Linux or Unix:
export BENTOML_CONFIG=./config.yaml

Start the BentoML dev server to test POST requests: bentoml serve service.py
Open http://localhost:3001 to see the BentoML Swagger UI.
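Besides the Swagger UI, the running service can be exercised from a script. The sketch below uses only the Python standard library. The endpoint name `predict` and the JSON payload shape `{"description": ...}` are assumptions made for illustration; check service.py or the Swagger UI for the actual route and input format.

```python
import json
import urllib.request


def build_topic_request(description, base_url="http://localhost:3001",
                        endpoint="predict"):
    """Build a POST request carrying a book description.

    `endpoint` and the payload key "description" are hypothetical
    placeholders, not confirmed by the project's service.py.
    """
    body = json.dumps({"description": description}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{endpoint}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def get_topics(description, **kwargs):
    """Send the request to the running service and decode the JSON reply."""
    request = build_topic_request(description, **kwargs)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the service running, `get_topics("A story about two friends who build video games.")` would return whatever the service responds with, provided the assumed route and payload match.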

Frontend

In a separate terminal, change into the frontend folder and install the Node dependencies: npm install
Start the frontend: npm start
Open http://localhost:3000 to see the simple frontend web app

Demo

We need to provide the model with a book description. Let's choose the description of a book that came out recently. For this demo, we use Tomorrow, and Tomorrow, and Tomorrow by Gabrielle Zevin. The description is copied from: https://www.goodreads.com/book/show/58784475-tomorrow-and-tomorrow-and-tomorrow

Clicking Get Topics results in the following output:

The book used in the demo is labeled with these genres: Fiction, Contemporary, Romance, Audiobook, Literary Fiction, Historical Fiction, Adult. Based on the output, topics 2, 10, 13 and 4 have the highest frequency. Looking at the words for these high-frequency topics, we can infer that the model accurately predicts the topic of the book from its description.
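Reading off the highest-frequency topics, as done above, can also be automated. This is a minimal sketch, assuming the model's output is available as a list of per-topic weights. The helper name `top_topics` is hypothetical, and the indices it returns are 0-based, which may differ from the numbering shown in the frontend.

```python
def top_topics(weights, k=4):
    """Return the indices of the k largest topic weights, highest first.

    `weights` is assumed to be a list with one number per topic.
    """
    # Pair each weight with its topic index, then sort by weight.
    ranked = sorted(enumerate(weights), key=lambda pair: pair[1], reverse=True)
    return [index for index, _ in ranked[:k]]
```

For example, with the made-up weights `[0.1, 0.5, 0.05, 0.35]` (placeholders, not output from the model), `top_topics(weights, k=2)` picks topics 1 and 3.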
