There is a significant difference between few-shot and zero-shot #371

Open
tianchiguaixia opened this issue Nov 16, 2023 · 6 comments
Labels
feat/task Feature: tasks

Comments

@tianchiguaixia

When I use few-shot examples, the extracted content is much worse than what zero-shot extracts.
code.zip
My code is in the attachment.

@rmitsch
Collaborator

rmitsch commented Nov 16, 2023

Please copy-paste your code and config (formatted with ``` ```) into this thread.

@rmitsch rmitsch added the feat/task Feature: tasks label Nov 16, 2023
@tianchiguaixia
Author

examples.yml:

- text: 前白蛋白(PA) 302.65 mg/L 180-400
  entities:
    实验室检查的指标:
      - 前白蛋白(PA)
    实验室检查的单位:
      - mg/L
    实验室检查的结果数值:
      - 302.65
    实验室检查的范围值:
      - 180-400

- text: 谷氨酰转肽酶(GGT) 17 IU/L 10-60
  entities:
    实验室检查的指标:
      - 谷氨酰转肽酶(GGT)
    实验室检查的单位:
      - IU/L
    实验室检查的结果数值:
      - 17
    实验室检查的范围值:
      - 10-60
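
A quick sanity check on the few-shot file can rule out data problems before blaming the model. The sketch below is a helper built on assumptions, not part of spacy-llm: `find_problems`, `EXAMPLES` and `LABELS` are hypothetical names, and the data mirrors examples.yml as a YAML parser would load it (unquoted `302.65` and `17` become numbers, not strings). It checks that every label is one of the task labels, that every entity value is a string, and that it occurs verbatim in the example text.

```python
from typing import Any, Dict, List

# Task labels, copied from the `labels` setting in the configs.
LABELS = ["实验室检查的指标", "实验室检查的单位", "实验室检查的结果数值", "实验室检查的范围值"]

# examples.yml as a YAML parser would load it: unquoted 302.65 and 17
# come back as float and int, everything else as str.
EXAMPLES: List[Dict[str, Any]] = [
    {
        "text": "前白蛋白(PA) 302.65 mg/L 180-400",
        "entities": {
            "实验室检查的指标": ["前白蛋白(PA)"],
            "实验室检查的单位": ["mg/L"],
            "实验室检查的结果数值": [302.65],
            "实验室检查的范围值": ["180-400"],
        },
    },
    {
        "text": "谷氨酰转肽酶(GGT) 17 IU/L 10-60",
        "entities": {
            "实验室检查的指标": ["谷氨酰转肽酶(GGT)"],
            "实验室检查的单位": ["IU/L"],
            "实验室检查的结果数值": [17],
            "实验室检查的范围值": ["10-60"],
        },
    },
]


def find_problems(examples: List[Dict[str, Any]], labels: List[str]) -> List[str]:
    """Return human-readable descriptions of suspicious few-shot entries."""
    problems = []
    for i, example in enumerate(examples):
        text = example["text"]
        for label, spans in example["entities"].items():
            if label not in labels:
                problems.append(f"example {i}: label {label!r} is not a task label")
            for span in spans:
                if not isinstance(span, str):
                    kind = type(span).__name__
                    problems.append(f"example {i}: {span!r} is a {kind}, not a string")
                if str(span) not in text:
                    problems.append(f"example {i}: {span!r} does not occur in the text")
    return problems


if __name__ == "__main__":
    for problem in find_problems(EXAMPLES, LABELS):
        print(problem)
```

If the number-typed values turn out to matter, quoting them in the YAML would make them strings. Whether that is related to the behaviour reported here is untested.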

fewshot.cfg:

[paths]
examples = null

[nlp]
lang = "zh"
pipeline = ["llm_ner"]

[components]

[components.llm_ner]
factory = "llm"

[components.llm_ner.task]
@llm_tasks = "spacy.NER.v2"
labels = 实验室检查的指标,实验室检查的单位,实验室检查的结果数值,实验室检查的范围值

[components.llm_ner.task.examples]
@misc = "spacy.FewShotReader.v1"
path = ${paths.examples}

[components.llm_ner.model]
@llm_models = "spacy.GPT-3-5.v2"
name = "gpt-3.5-turbo"
config = {"temperature": 0.0}

zeroshot.cfg:

[nlp]
lang = "zh"
pipeline = ["llm_ner"]

[components]

[components.llm_ner]
factory = "llm"

[components.llm_ner.task]
@llm_tasks = "spacy.NER.v2"
labels = 实验室检查的指标,实验室检查的单位,实验室检查的结果数值,实验室检查的范围值

[components.llm_ner.model]
@llm_models = "spacy.GPT-3-5.v2"
name = "gpt-3.5-turbo"
config = {"temperature": 0.0}
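
To confirm that the two configs differ only in the few-shot wiring, they can be compared key by key. This is a rough sketch using Python's standard `configparser` rather than spaCy's own config loader, so it treats every value as a raw string. `flatten` and `diff_configs` are hypothetical helper names, and the config strings are abbreviated copies of the files above.

```python
import configparser
from typing import Dict, Set, Tuple

Key = Tuple[str, str]  # (section, option)

FEWSHOT_CFG = """
[paths]
examples = null

[components.llm_ner.task]
@llm_tasks = "spacy.NER.v2"

[components.llm_ner.task.examples]
@misc = "spacy.FewShotReader.v1"
path = ${paths.examples}

[components.llm_ner.model]
@llm_models = "spacy.GPT-3-5.v2"
name = "gpt-3.5-turbo"
config = {"temperature": 0.0}
"""

ZEROSHOT_CFG = """
[components.llm_ner.task]
@llm_tasks = "spacy.NER.v2"

[components.llm_ner.model]
@llm_models = "spacy.GPT-3-5.v2"
name = "gpt-3.5-turbo"
config = {"temperature": 0.0}
"""


def flatten(cfg_text: str) -> Dict[Key, str]:
    """Map (section, option) pairs to their raw string values."""
    # interpolation=None keeps "${paths.examples}" as literal text.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    return {
        (section, option): value
        for section in parser.sections()
        for option, value in parser.items(section)
    }


def diff_configs(a: str, b: str) -> Tuple[Set[Key], Set[Key], Set[Key]]:
    """Return keys only in a, keys only in b, and keys whose values differ."""
    flat_a, flat_b = flatten(a), flatten(b)
    only_a = set(flat_a) - set(flat_b)
    only_b = set(flat_b) - set(flat_a)
    changed = {k for k in set(flat_a) & set(flat_b) if flat_a[k] != flat_b[k]}
    return only_a, only_b, changed


if __name__ == "__main__":
    only_few, only_zero, changed = diff_configs(FEWSHOT_CFG, ZEROSHOT_CFG)
    print("only in fewshot.cfg:", sorted(only_few))
    print("only in zeroshot.cfg:", sorted(only_zero))
    print("different values:", sorted(changed))
```

If the only differences are the `[paths]` entry and the `examples` block, any change in output quality comes from the examples themselves rather than from the model or task settings.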

run_pipeline.py:

import os
from pathlib import Path
from typing import Optional

import typer
from wasabi import msg

from spacy_llm.util import assemble

Arg = typer.Argument
Opt = typer.Option


def run_pipeline(
    # fmt: off
    text: str = Arg("", help="Text to perform named entity recognition on."),
    config_path: Path = Arg(..., help="Path to the configuration file to use."),
    examples_path: Optional[Path] = Arg(None, help="Path to the examples file to use (few-shot only)."),
    verbose: bool = Opt(False, "--verbose", "-v", help="Show extra information."),
    # fmt: on
):
    # Fail early if the OpenAI API key is missing.
    if not os.getenv("OPENAI_API_KEY", None):
        msg.fail(
            "OPENAI_API_KEY env variable was not found. "
            "Set it by running 'export OPENAI_API_KEY=...' and try again.",
            exits=1,
        )

    msg.text(f"Loading config from {config_path}", show=verbose)
    # Only override paths.examples when an examples file is given (few-shot).
    nlp = assemble(
        config_path,
        overrides={}
        if examples_path is None
        else {"paths.examples": str(examples_path)},
    )

    doc = nlp(text)
    msg.text(f"Entities: {[(ent.text, ent.label_, ent.start, ent.end) for ent in doc.ents]}")


if __name__ == "__main__":
    typer.run(run_pipeline)

!python run_pipeline.py
"中国医学科学院 阜外医院 检验报告单 姓名: 贾全喜 性别:男 年龄: 55岁 门诊:0066000117992 样品号: 科别: 门诊 床号: 诊断: 标本种类:血清 送检项目: 0265 生化全套 项 目 结果 单位 参考值 项 目 结果 单位 1 前白蛋白(PA) 302.65 mg/L 180-400 参考值 2 *总蛋白(TP) 69.9 19*尿酸(URIC) 542.06 umol/L 1 148.8-416.5 g/L 65-85 20 *肌酸激酶(CK) IU/L 0-200 16 3 *白蛋白(溴甲酚绿法)(ALB) 41.6 g/L 40-55 21 肌酸激酶同工酶(CKMB-Mass) 2.06 ng/nL 0-5 4 *丙氨酸氨基转移酶(ALT) 22 IU/L 9-50 22*乳酸脱氢酶(LDH) 149 IU/L 0-250 5 *天门冬氨酸氨基转移酶(AST) 24 IU/L 15-40 23 淀粉酶(AMY) 100 U/L 0-220 6 *碱性磷酸酶(ALP) 85 45-125 24 脂蛋白(a)(Lp(a)) 827.42 ng/L ↑ 10-300 1/0I 7 *谷氨酰转肽酶(GGT) 17 IU/L 10-60 25 超敏C反应蛋白(HSCRP) 1.28 mg/L 0.00-3.00 8 *总胆红素(TBi1) 16.94 umo1/L 5.1-19 26 同型半胱氨酸(HCY) 8.31 umol/L 6-15 9 直接胆红素(DBil) 4.34 μmol/L 0-6.8 27 游离脂肪酸(FFA) 0.65 mmol/L t 0.1-0.6 10*钾(K) 4.41 mmol/L 3.5-5.3 28*甘油三酯(TG) 0.94 mmol/L 0.38-1.76 11*钠(NA) 141.69 mmol/1 137-147 29*总胆固醇(CHOL) 3.38 mmol/L 13.64-5.98 12*氯(CL) 101.89mmol/L 99-110 30*高密度脂蛋白胆固醇(HDL-C) 1.10 mmol/L 0.7-1.59 13二氧化碳(C02) 32.65 mmol/L ↑21.0-31.0 31*低密度脂蛋白胆固醇(LDL-C) 1.86 mmol/L 一般人群<3.37 14*葡萄糖(GLU) 5.01 mmol/L 3.58-6.05 高危人群<2.59 15*磷(P) 0.95 mmol/L ↓0.97-1.50 极高危人群<2.00 16*钙(CA) 2.40 mmol/L 2.2-2.75 32 小密低密度脂蛋白(sdLDL) 0.55 mmol/L 0.23-1.39 17*肌酐(苦味酸法)(CREA) 89.54 umol/L 44-133 33 载脂蛋白A1(apoA1) 1.05 g/L 11.1-1.8 18*尿素氮(BUN) 5.45 mmol/L 2.86-7.90 34 载脂蛋白B(apoB) 0.67 g/L 0.5-1.2 极高危人群:急性冠脉综合征(ACS)或冠心病/缺血性脑卒中/周围动脉硬化合并糖尿病。 申请日期:2021.08.18 采样时间:2021.08.19 09:08 接收时间:2021.08.19 10:08 报告时间:2021.08.19 11:43 申请医师:李子煦 检验者: 邢跃雷 审核者: 苏保满 备 注: 此报告仅对送检样本负责。 *标记项目为北京市三级医院互认项目 实验诊断中心生化 电话: 88398271"
./zeroshot.cfg

@tianchiguaixia
Author

The complete code is included in the attachment.

@tianchiguaixia
Author

Hello, I have discovered a problem. Everything works when I run it as zero-shot, but as soon as I use few-shot it reports an error. Why does providing few-shot examples cause an error?

@rmitsch
Collaborator

rmitsch commented Nov 27, 2023

It's difficult to diagnose why fewshotting yields worse results here. I recommend debugging with one example at a time and looking into the raw output received from the model (see here on how to do that).

The fact that you get no entities at all if you include few-shot examples indicates that the LLM might have issues understanding those examples, or that the output it produces when those examples are included is incoherent and cannot be parsed. Either way, the best way forward is to take a closer look at what both the prompt and the response look like when you add one example at a time.
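
The "one example at a time" suggestion can be scripted. A minimal sketch, assuming `spacy.FewShotReader.v1` accepts JSON files as well as YAML (it is documented as reading .yml, .yaml, .json and .jsonl, but verify against your installed version). `write_single_example_files` is a hypothetical helper, and `EXAMPLES` is a hand-copied version of examples.yml with the numeric values written as strings.

```python
import json
from pathlib import Path
from typing import Any, Dict, List

# Hand-copied from examples.yml; numeric values are written as strings here.
EXAMPLES: List[Dict[str, Any]] = [
    {
        "text": "前白蛋白(PA) 302.65 mg/L 180-400",
        "entities": {
            "实验室检查的指标": ["前白蛋白(PA)"],
            "实验室检查的单位": ["mg/L"],
            "实验室检查的结果数值": ["302.65"],
            "实验室检查的范围值": ["180-400"],
        },
    },
    {
        "text": "谷氨酰转肽酶(GGT) 17 IU/L 10-60",
        "entities": {
            "实验室检查的指标": ["谷氨酰转肽酶(GGT)"],
            "实验室检查的单位": ["IU/L"],
            "实验室检查的结果数值": ["17"],
            "实验室检查的范围值": ["10-60"],
        },
    },
]


def write_single_example_files(
    examples: List[Dict[str, Any]], out_dir: Path
) -> List[Path]:
    """Write one JSON file per example so each can be tested in isolation."""
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, example in enumerate(examples):
        path = out_dir / f"example_{i}.json"
        # Each file holds a one-element list, mirroring the YAML list layout.
        path.write_text(
            json.dumps([example], ensure_ascii=False, indent=2),
            encoding="utf-8",
        )
        paths.append(path)
    return paths


if __name__ == "__main__":
    for path in write_single_example_files(EXAMPLES, Path("single_examples")):
        print(path)
```

Each generated file can then be passed as the `examples_path` argument of run_pipeline.py, comparing the prompt and raw response per file to see which example, if any, makes the output unparseable.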

@tianchiguaixia
Author

Thank you very much for your answers. It would be even better if a Chinese model could be added later on.
