
FileNotFoundError: [Errno 2] No such file or directory: 'local-ner-cache/9963044417883968883.spacy' #414

Open
nikolaysm opened this issue Jan 18, 2024 · 3 comments
Labels
bug Something isn't working feat/cache Feature: caching

Comments

nikolaysm commented Jan 18, 2024

I changed spacy.NER.v2 to spacy.NER.v3 and got the following error:

ValueError: Prompt template in cache directory (local-ner-cache/prompt_template.txt) is not equal with current prompt template. Reset your cache if you are using a new prompt template.

After deleting the folder local-ner-cache, I encountered the following error:

FileNotFoundError: [Errno 2] No such file or directory: 'local-ner-cache/9963044417883968883.spacy'

What is the right way to "Reset your cache if you are using a new prompt template."?

After deleting the local-ner-cache folder, I'm no longer able to annotate the same dataset:

dotenv run -- prodigy ner.llm.correct

There are still around 1k samples to annotate.
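For what it's worth, a full cache reset can be scripted. A minimal sketch (the helper name is made up, not part of spacy-llm) that removes the whole cache directory so spacy-llm rebuilds it from scratch on the next run:

```python
import shutil
from pathlib import Path

def reset_llm_cache(cache_dir: str) -> None:
    """Remove the spacy-llm cache directory (prompt_template.txt, the
    index, and the *.spacy batch files) so it is rebuilt from scratch
    on the next run. Deleting only some files can leave the index and
    the batch files out of sync."""
    path = Path(cache_dir)
    if path.exists():
        shutil.rmtree(path)

reset_llm_cache("local-ner-cache")
```

The point of removing the directory wholesale is that the cache's index and its batch files must stay consistent with each other; a partial delete is one plausible way to end up with a lookup for a .spacy file that no longer exists.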

@rmitsch rmitsch added bug Something isn't working feat/cache Feature: caching labels Jan 19, 2024

rmitsch commented Jan 19, 2024

Hi @nikolaysm! Thanks for reporting this. We can't identify right away why this would happen, but will look into it.


rmitsch commented Jan 19, 2024

Can you provide your spacy-llm config?


nikolaysm commented Jan 19, 2024

Hi @rmitsch,

spacy-llm-config.cfg:

[paths]
examples = "./assets/examples.json"
template = "./assets/prompt_template.txt/"

[nlp]
lang = "en"
pipeline = ["llm"]

[components]

[components.llm]
factory = "llm"
save_io = true

[components.llm.task]
@llm_tasks = "spacy.NER.v3"
labels = ["ORG", "PERSON"]
description = "Entities are the names of company,
    associated brands, people.
    Adjectives, verbs, adverbs are not entities.
    Pronouns are not entities."

[components.llm.task.template]
@misc = "spacy.FileReader.v1"
path = "${paths.template}"

[components.llm.task.label_definitions]
ORG = "Extract the names of the companies and associated brands, e.g. ...."
PERSON = "Extract the people's names, e.g. ...."

[components.llm.task.examples]
@misc = "spacy.FewShotReader.v1"
path = "${paths.examples}"

[components.llm.model]
# Also tried to use "spacy.GPT-3-5.v3", spacy.GPT-3-5.v1,  "spacy.GPT-4.v1", "spacy.GPT-4.v3"
@llm_models = "spacy.GPT-4.v1"
# Also tried to use "gpt-3.5-turbo", "gpt-4"
name = "gpt-4"
config = {"temperature": 0.3}

[components.llm.cache]
@llm_misc = "spacy.BatchCache.v1"
path = "local-ner-cache"
batch_size = 4
max_batches_in_mem = 10

Next, I see that the number of API requests (567) is much higher than the total number of samples I annotated (370):


I'm not sure, but the issue may be related to a timeout during the API calls to OpenAI.
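If a timeout turns out to be the cause, spacy-llm's REST-backed models expose retry/timeout settings on the model block. A hedged fragment in the same .cfg style as above (parameter names and the model version should be checked against the installed spacy-llm's documentation):

```ini
[components.llm.model]
@llm_models = "spacy.GPT-4.v2"
name = "gpt-4"
config = {"temperature": 0.3}
# Allow more time per request and more retries before giving up
max_request_time = 60
max_tries = 5
interval = 1.0
```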
