Use GPT to classify a bunch of your browser tabs!
Parse a bookmarks file into a BookmarkFolder object, serialize it to JSON, and print the result to stdout.
-
Export your browser's bookmarks to an HTML file.
-
Run bookmark_utils
python bookmark_utils.py path/to/bookmarks.html
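As a rough illustration of the parsing step, the sketch below reads a Netscape-style bookmarks export (the HTML format browsers typically produce) into a nested folder structure and dumps it as JSON. It is not the code in bookmark_utils.py: the fields on BookmarkFolder, the Bookmark class, and the parse_bookmarks helper are all assumptions made for the example.

```python
import json
from dataclasses import asdict, dataclass, field
from html.parser import HTMLParser


@dataclass
class Bookmark:  # hypothetical leaf type, not taken from the project
    title: str
    url: str


@dataclass
class BookmarkFolder:  # field names are assumptions for this sketch
    name: str
    bookmarks: list = field(default_factory=list)
    subfolders: list = field(default_factory=list)


class BookmarkHTMLParser(HTMLParser):
    """Builds a folder tree from <H3> (folder name), <DL> (folder body), <A> (link)."""

    def __init__(self):
        super().__init__()
        self.root = BookmarkFolder(name="root")
        self.stack = [self.root]
        self._pending = None  # folder named by the last <H3>, waiting for its <DL>
        self._tag = None      # tag whose text is currently being collected
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in ("h3", "a"):
            self._tag = tag
            self._text = []
            self._href = dict(attrs).get("href")
        elif tag == "dl":
            # Enter the pending folder, or stay in the current one for the top-level list.
            target = self._pending if self._pending is not None else self.stack[-1]
            self.stack.append(target)
            self._pending = None

    def handle_data(self, data):
        if self._tag is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == self._tag:
            text = "".join(self._text).strip()
            if tag == "h3":
                self._pending = BookmarkFolder(name=text)
                self.stack[-1].subfolders.append(self._pending)
            elif self._href:
                self.stack[-1].bookmarks.append(Bookmark(text, self._href))
            self._tag = None
        elif tag == "dl" and len(self.stack) > 1:
            self.stack.pop()


def parse_bookmarks(html: str) -> BookmarkFolder:
    parser = BookmarkHTMLParser()
    parser.feed(html)
    parser.close()
    return parser.root


SAMPLE = """<!DOCTYPE NETSCAPE-Bookmark-file-1>
<DL><p>
    <DT><H3>News</H3>
    <DL><p>
        <DT><A HREF="https://example.com/">Example</A>
    </DL><p>
    <DT><A HREF="https://example.org/">Loose</A>
</DL><p>
"""

root = parse_bookmarks(SAMPLE)
print(json.dumps(asdict(root), indent=2))
```

Keeping a stack of open folders mirrors the nesting of the DL lists, so a link is always attached to the innermost folder that is still open.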
Process a file of URLs and print each page's metadata to stdout. The input file should contain one URL per line.
-
Export tabs to a file
- Firefox: use the Export Tab URLs extension
-
Run preprocess_urls
python preprocess_urls.py path/to/tabs_file.txt --output_format=[json|yaml|yml]
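To give a feel for what "metadata" might mean here, the following sketch pulls the title, description, and keywords out of a page's HTML and emits JSON. Which fields preprocess_urls.py actually collects is not stated above, so the field set and the helper names (MetadataParser, extract_metadata, read_urls) are assumptions. The HTML is passed in as a string so the example runs without network access.

```python
import json
from html.parser import HTMLParser


class MetadataParser(HTMLParser):
    """Collects <title> text and the description/keywords <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {"title": "", "description": "", "keywords": []}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            content = attrs.get("content") or ""
            if name == "description":
                self.meta["description"] = content.strip()
            elif name == "keywords":
                self.meta["keywords"] = [k.strip() for k in content.split(",") if k.strip()]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.meta["title"] += data


def read_urls(text: str) -> list:
    """One URL per line; blank lines are skipped."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def extract_metadata(url: str, html: str) -> dict:
    parser = MetadataParser()
    parser.feed(html)
    parser.close()
    meta = dict(parser.meta)
    meta["title"] = meta["title"].strip()
    return {"url": url, **meta}


PAGE = """<html><head><title> Example Domain </title>
<meta name="description" content="An illustrative page.">
<meta name="keywords" content="example, demo"></head><body></body></html>"""

record = extract_metadata("https://example.com/", PAGE)
print(json.dumps([record], indent=2))
```

This sketch only emits JSON. Producing the yaml/yml output formats would need a YAML library, which is left out here to keep the example to the standard library.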
Currently classify_tabs.py does not classify tabs; it only generates the next tokens in a sequence.
-
Generate a continuation from the command line:
python classify_tabs.py gen "lorem ipsum"
-
Generate a continuation from the file prompt.txt:
python classify_tabs.py gen_file
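The two commands above could be wired up roughly as follows. This is a sketch only: argparse is used here for illustration and may not be what classify_tabs.py uses, the --path option is hypothetical, and placeholder_generate stands in for the real model call, which is not shown in this README.

```python
import argparse
from pathlib import Path


def placeholder_generate(prompt: str) -> str:
    """Stand-in for the model call; a real version would return the model's continuation."""
    return prompt + " ..."


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="classify_tabs.py")
    sub = parser.add_subparsers(dest="command", required=True)

    gen = sub.add_parser("gen", help="continue a prompt given on the command line")
    gen.add_argument("prompt")

    gen_file = sub.add_parser("gen_file", help="continue the prompt stored in a file")
    # Hypothetical option; the documented command takes no arguments and reads prompt.txt.
    gen_file.add_argument("--path", default="prompt.txt")
    return parser


def run(argv, generate=placeholder_generate) -> str:
    args = build_parser().parse_args(argv)
    if args.command == "gen":
        prompt = args.prompt
    else:
        prompt = Path(args.path).read_text(encoding="utf-8")
    return generate(prompt)


print(run(["gen", "lorem ipsum"]))
```

Passing the generator in as an argument keeps the command handling separate from the model, so the same entry points can later call a classifier instead of a plain continuation.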
- basic prompted prototype
- basic extraction of page metadata given a URL
- parsing of bookmark files
- small set of manually classified data
- generation of prompts from manual data + tag list
- extraction of tags from a generated response
- finetuned model
  - larger dataset -- augment via manual review of automated classification?
  - finetune GPT-2
- exporting of classified tabs -- input needed from users, below are simply a few ideas
  - export to Dendron vault
  - export to Notion
  - re-add bookmarks to browser
- long-term
  - better metadata extraction (keywords via nltk_rake, etc)
  - compatibility with OpenAI API -- finetune GPT-3?
  - browser extension (probably will use openAPI, might be a separate project)
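Two of the items in the list above, generating prompts from manually classified data plus a tag list and extracting tags from a generated response, can be sketched as below. The prompt layout, the record fields, and the function names build_prompt and extract_tags are assumptions for illustration, not the project's actual format.

```python
def build_prompt(examples, tags, url, title):
    """Few-shot prompt: the allowed tags, the labelled examples, then the item to classify."""
    lines = ["Tags: " + ", ".join(tags), ""]
    for ex in examples:
        lines += [
            f"URL: {ex['url']}",
            f"Title: {ex['title']}",
            "Labels: " + ", ".join(ex["tags"]),
            "",
        ]
    # The prompt ends on an open "Labels:" line for the model to complete.
    lines += [f"URL: {url}", f"Title: {title}", "Labels:"]
    return "\n".join(lines)


def extract_tags(response, allowed):
    """Keep only known tags from the first line of the model's continuation."""
    stripped = response.strip()
    first_line = stripped.splitlines()[0] if stripped else ""
    by_lower = {t.lower(): t for t in allowed}
    found = []
    for piece in first_line.split(","):
        tag = by_lower.get(piece.strip().lower())
        if tag and tag not in found:
            found.append(tag)
    return found


TAGS = ["python", "news"]
EXAMPLES = [{"url": "https://example.com/", "title": "Example", "tags": ["news"]}]

prompt = build_prompt(EXAMPLES, TAGS, "https://example.org/", "Another page")
print(prompt)
print(extract_tags(" Python, cooking, python\nURL: https://example.net/", TAGS))
```

Filtering against the allowed tag list discards anything the model invents, and reading only the first line ignores any further examples the model goes on to generate.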
-
Install PyTorch into your global Python environment
-
Create a venv and allow it to access global packages
python -m venv venv_path --system-site-packages
-
Activate the venv
source venv_path/bin/activate
-
Install dependencies
pip install -r requirements.txt
- formatter (black and isort) via make format
- mypy via make mypy
- all of the above via make check