svlandeg/prodigy-openai-recipes

Prodigy OpenAI recipes

This repository contains example code on how to combine zero- and few-shot learning with a small annotation effort to obtain a high-quality dataset with maximum efficiency. Specifically, we use large language models available from OpenAI to provide us with an initial set of predictions, then spin up a Prodigy instance on our local machine to go through these predictions and curate them. This allows us to obtain a gold-standard dataset pretty quickly, and train a smaller, supervised model that fits our exact needs and use-case.

Setup and Install

Make sure to install Prodigy as well as a few additional Python dependencies:

python -m pip install prodigy -f https://XXXX-XXXX-XXXX-XXXX@download.prodi.gy
python -m pip install -r requirements.txt

With XXXX-XXXX-XXXX-XXXX being your personal Prodigy license key.

Then, create a new API key from openai.com or fetch an existing one. Record the secret key as well as the organization key and make sure both are available as environment variables. For instance, set them in a .env file in the root directory:

OPENAI_ORG = "org-..."
OPENAI_KEY = "sk-..."
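
Before starting a recipe, it can help to confirm that both variables are actually visible to Python. The following is a minimal sketch of such a check; the helper name `missing_openai_vars` is hypothetical and not part of this repository. It inspects the process environment only, so values stored in a .env file need to be loaded first.

```python
import os
from typing import List, Mapping, Optional

# The two variable names used in the .env example above.
REQUIRED_VARS = ("OPENAI_ORG", "OPENAI_KEY")


def missing_openai_vars(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_openai_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("OpenAI credentials found.")
```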

ner.openai.correct: NER annotation with zero- or few-shot learning

This recipe marks entity predictions obtained from a large language model and allows you to flag them as correct, or to manually curate them. This allows you to quickly gather a gold-standard dataset through zero-shot or few-shot learning. It's very much like using the standard ner.correct recipe in Prodi.gy, but we're using GPT-3 as a backend model to make predictions.

python -m prodigy ner.openai.correct dataset file_path labels [--options] -F ./recipes/openai_ner.py
| Argument | Type | Description | Default |
| --- | --- | --- | --- |
| dataset | str | Prodigy dataset to save annotations to. | |
| file_path | Path | Path to .jsonl data to annotate. The data should at least contain a "text" field. | |
| labels | str | Comma-separated list defining the NER labels the model should predict. | |
| --lang, -l | str | Language of the input data - will be used to obtain a relevant tokenizer. | "en" |
| --segment, -S | bool | Flag to set when examples should be split into sentences. By default, the full input article is shown. | False |
| --model, -m | str | GPT-3 model to use for initial predictions. | "text-davinci-003" |
| --prompt-path, -p | Path | Path to the .jinja2 prompt template. | ./templates/ner_prompt.jinja2 |
| --examples-path, -e | Path | Path to examples to help define the task. The file can be a .yml, .yaml or .json. If set to None, zero-shot learning is applied. | None |
| --max-examples, -n | int | Max number of examples to include in the prompt to OpenAI. If set to 0, zero-shot learning is always applied, even when examples are available. | 2 |
| --batch-size, -b | int | Batch size of queries to send to the OpenAI API. | 10 |
| --verbose, -v | bool | Flag to print extra information to the terminal. | False |
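
The file_path argument expects one JSON object per line, each with at least a "text" field. As a rough illustration, the sketch below writes and re-reads such a file. The function names and sample sentences are made up for this example and are not part of the recipes.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, Iterable, List


def write_jsonl(path: Path, records: Iterable[Dict]) -> None:
    """Write one JSON object per line."""
    with path.open("w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_texts(path: Path) -> List[str]:
    """Read a .jsonl file and return the "text" field of every record."""
    texts = []
    with path.open(encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)
            if "text" not in record:
                raise ValueError(f"line {line_no}: missing 'text' field")
            texts.append(record["text"])
    return texts


if __name__ == "__main__":
    # Hypothetical sample data, written to a temporary directory.
    sample = Path(tempfile.mkdtemp()) / "sample.jsonl"
    write_jsonl(sample, [{"text": "Sear the steak in a cast iron skillet."}])
    print(read_texts(sample))
```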

Example usage

Let's say we want to recognize dishes, ingredients and cooking equipment from some text we obtained from a cooking subreddit. We'll send the text to GPT-3, hosted by OpenAI, and provide an annotation prompt to explain to the language model the type of predictions we want. Something like:

From the text below, extract the following entities in the following format:
dish: <comma delimited list of strings>
ingredient: <comma delimited list of strings>
equipment: <comma delimited list of strings>

Text:
...
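
To make the structure of this prompt concrete, here is a small sketch that assembles the same text from a list of labels using plain string formatting. It is only an illustration: the recipe itself renders a Jinja2 template, and the function name `build_zero_shot_prompt` is hypothetical.

```python
from typing import List


def build_zero_shot_prompt(labels: List[str], text: str) -> str:
    """Assemble a prompt with one output line per label, followed by the text."""
    lines = [
        "From the text below, extract the following entities in the following format:"
    ]
    lines += [f"{label}: <comma delimited list of strings>" for label in labels]
    lines += ["", "Text:", text]
    return "\n".join(lines)


print(build_zero_shot_prompt(["dish", "ingredient", "equipment"], "..."))
```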

We define this prompt in a .jinja2 file, which also describes how to append examples for few-shot learning. You can create your own template and provide it to the recipe with the --prompt-path or -p option. Additionally, with --examples-path or -e you can set the file path of a .y(a)ml or .json file that contains additional examples:

python -m prodigy ner.openai.correct my_ner_data ./data/reddit_r_cooking_sample.jsonl "dish,ingredient,equipment" \
    -p ./templates/ner_prompt.jinja2 -e ./examples/input.yaml -n 2 -F ./recipes/openai_ner.py

After receiving the results from the OpenAI API, the Prodigy recipe converts the predictions into an annotation task that can be rendered with Prodigy. The task even shows the original prompt as well as the raw answer we obtained from the language model.
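
The conversion step can be sketched roughly as follows, assuming the model answers in the "label: comma-separated phrases" format requested by the prompt. This is not the recipe's actual implementation: the function names are hypothetical, matching is done by simple case-insensitive substring search, and token alignment and overlapping matches are not handled.

```python
from typing import Dict, List


def parse_response(response: str, labels: List[str]) -> Dict[str, List[str]]:
    """Parse lines like 'dish: ramen, pizza' into {label: [phrases]}."""
    allowed = {label.lower() for label in labels}
    result: Dict[str, List[str]] = {}
    for line in response.splitlines():
        if ":" not in line:
            continue
        label, _, phrases = line.partition(":")
        label = label.strip().lower()
        if label not in allowed:
            continue  # ignore labels we did not ask for
        found = [p.strip() for p in phrases.split(",") if p.strip()]
        result.setdefault(label, []).extend(found)
    return result


def find_spans(text: str, entities: Dict[str, List[str]]) -> List[Dict]:
    """Locate every occurrence of each phrase as character offsets.

    Assumes lowercasing does not change the string length, which holds
    for ASCII text but not for every Unicode character.
    """
    spans = []
    lowered = text.lower()
    for label, phrases in entities.items():
        for phrase in phrases:
            needle = phrase.lower()
            start = lowered.find(needle)
            while start != -1:
                end = start + len(needle)
                spans.append({"start": start, "end": end, "label": label})
                start = lowered.find(needle, end)
    return sorted(spans, key=lambda s: (s["start"], s["end"]))
```

The resulting dictionaries use the "start", "end" and "label" keys that Prodigy expects for span annotations.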

Here, we see that the model is able to correctly recognize dishes, ingredients and cooking equipment right from the start!

The recipe also offers a --verbose or -v option that prints the exact prompt and response to the terminal as traffic is received. Note that because the requests to the API are batched, you might have to scroll back a bit to find the current prompt.

Interactively tune the prompt examples

At some point, you might notice a mistake in the predictions of the OpenAI language model. For instance, we noticed an error in the recognition of cooking equipment in this example:

If you see this kind of systematic error, you can steer the predictions in the right direction by correcting the example and then selecting the small "flag" icon in the top right of the Prodigy UI:

Once you hit accept on the Prodigy interface, the flagged example will be automatically picked up and added to the examples that are sent to the OpenAI API as part of the prompt. Note that because Prodigy batches these requests, the prompt will be updated with a slight delay, after the next batch of prompts is sent to OpenAI. You can experiment with making the batch size (--batch-size or -b) smaller to have the change come into effect sooner, but this might negatively impact the speed of the annotation workflow.
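
The selection of flagged examples could look something like the sketch below. It assumes that annotated tasks carry an "answer" key and a "flagged" key, as in Prodigy's task format, and it keeps the most recent matches up to the --max-examples limit. Both the function name and the "most recent" policy are assumptions made for illustration; the recipe may choose examples differently.

```python
from typing import Dict, List


def select_prompt_examples(annotated: List[Dict], max_examples: int) -> List[Dict]:
    """Pick the most recent accepted-and-flagged tasks, up to max_examples."""
    if max_examples <= 0:
        return []  # zero-shot: no examples are added to the prompt
    flagged = [
        task
        for task in annotated
        if task.get("answer") == "accept" and task.get("flagged")
    ]
    return flagged[-max_examples:]
```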

ner.openai.fetch: Fetch examples up-front

The ner.openai.correct recipe fetches examples from OpenAI while annotating, but we've also included a recipe that can fetch a large batch of examples upfront.

python -m prodigy ner.openai.fetch input_data.jsonl predictions.jsonl "dish,ingredient,equipment" -F ./recipes/openai_ner.py

This will create a predictions.jsonl file that can be loaded with the ner.manual recipe.

Note that the OpenAI API might return "429 Too Many Requests" errors when requesting too much data at once - in this case it's best to ensure you only request 100 or so examples at a time.
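
One way to stay under the rate limit is to split the input into chunks of about 100 examples and retry with a growing delay when a 429 comes back. The sketch below shows the general pattern only. `RateLimitError` is a stand-in exception defined here, not the OpenAI client's own class, and the helper names and default delays are assumptions.

```python
import time
from typing import Callable, Iterator, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


class RateLimitError(Exception):
    """Stand-in for whatever error the API client raises on HTTP 429."""


def chunked(items: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def call_with_backoff(
    fn: Callable[[], R],
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> R:
    """Call fn, retrying on RateLimitError with exponentially growing delays."""
    if max_retries < 0:
        raise ValueError("max_retries must not be negative")
    attempt = 0
    while True:
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # give up after the last allowed retry
            sleep(base_delay * 2 ** attempt)
            attempt += 1
```

Each chunk produced by `chunked(examples, 100)` can then be sent through `call_with_backoff`, with `fn` wrapping the actual API request.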

Exporting the annotations and training an NER model

After you've curated a set of predictions, you can export the results with db-out:

python -m prodigy db-out my_ner_data > ner_data.jsonl

The format of the exported annotations contains all the data you need to train a smaller model downstream. Each example in the dataset contains the original text, the tokens, span annotations denoting the entities, etc.
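
As a rough sketch of how the exported file can be consumed, the function below reads the JSONL lines and extracts the entity text and label for each accepted example. It assumes the "text", "spans" (with "start", "end", "label") and "answer" keys of Prodigy's task format; the function name is hypothetical.

```python
import json
from typing import Dict, Iterable, Iterator, List, Tuple


def iter_accepted_entities(
    lines: Iterable[str],
) -> Iterator[Tuple[str, List[Tuple[str, str]]]]:
    """Yield (text, [(entity_text, label), ...]) for each accepted example."""
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines
        example: Dict = json.loads(line)
        if example.get("answer") != "accept":
            continue  # skip rejected or ignored examples
        text = example["text"]
        entities = [
            (text[span["start"]:span["end"]], span["label"])
            for span in example.get("spans", [])
        ]
        yield text, entities
```

Passing an open file handle for ner_data.jsonl as `lines` iterates over the export one example at a time.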

You can also export the data to spaCy's binary format, using data-to-spacy. This format lets you load in the annotations as spaCy Doc objects, which can be convenient for further conversion. The data-to-spacy command also makes it easy to train an NER model with spaCy. First you export the data, holding out 20% of the total as evaluation data (the remaining 80% is used for training):

python -m prodigy data-to-spacy ./data/annotations/ --ner my_ner_data -es 0.2

Then you can train a model with spaCy:

python -m spacy train ./data/annotations/config.cfg --paths.train ./data/annotations/train.spacy --paths.dev ./data/annotations/dev.spacy -o ner-model

This will save a model to the ner-model/ directory.

We've also included an experimental script to load in the .spacy binary format and train a model with the HuggingFace transformers library. You can use the same data you just exported and run the script like this:

# First you need to install the HuggingFace library and requirements
pip install -r requirements_train.txt
python scripts/train_hf_ner.py ./data/annotations/train.spacy ./data/annotations/dev.spacy hf-ner-model

The resulting model will be saved to the hf-ner-model/ directory.

What's next?

There are lots of interesting follow-up experiments to this, and lots of ways to adapt the basic idea to different tasks or data sets. We’ll definitely follow up with a similar recipe for text categorization, but you can also adapt the recipe yourself in the meantime. We’re also interested in trying out different prompts. It’s unclear how much the format in which the annotations are requested might change the model’s predictions, or whether there’s a shorter prompt that might perform just as well. We also want to run some end-to-end experiments.
