There is a significant difference between feed shot and zero shot #371

tianchiguaixia · 2023-11-16T10:53:32Z

When I use few-shot, the extracted content is much worse than what zero-shot extracts.
code.zip
My code is in the attachment.

rmitsch · 2023-11-16T10:56:26Z

Please copy-paste your code and config (formatted with ``` ```) into this thread.

tianchiguaixia · 2023-11-16T11:04:04Z

examples.yml:

- text: 前白蛋白（PA） 302.65 mg/L 180-400
  entities:
    实验室检查的指标:
      - 前白蛋白（PA）
    实验室检查的单位:
      - mg/L
    实验室检查的结果数值:
      - 302.65
    实验室检查的范围值:
      - 180-400
- text: 谷氨酰转肽酶（GGT） 17 IU/L 10-60
  entities:
    实验室检查的指标:
      - 谷氨酰转肽酶（GGT）
    实验室检查的单位:
      - IU/L
    实验室检查的结果数值:
      - 17
    实验室检查的范围值:
      - 10-60
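
A quick sanity check for an examples file like the one above, as a minimal sketch: it flags entity values that are not strings and entity strings that do not occur verbatim in their example text. The function name `check_examples` is hypothetical, and the check assumes the examples have already been loaded into a list of dicts with `text` and `entities` keys. One thing worth knowing when reading the result: YAML loaders typically read unquoted values such as `302.65` and `17` as numbers rather than strings. Whether that matters to spacy-llm is not established in this thread.

```python
from typing import Any, Dict, List


def check_examples(examples: List[Dict[str, Any]]) -> List[str]:
    """Return human-readable problems found in few-shot NER examples.

    Hypothetical helper, not part of spacy-llm. It reports entity values
    that are not strings and entity strings missing from the example text.
    """
    problems: List[str] = []
    for i, example in enumerate(examples):
        text = example["text"]
        for label, values in example["entities"].items():
            for value in values:
                if not isinstance(value, str):
                    # e.g. an unquoted YAML number loaded as int or float
                    kind = type(value).__name__
                    problems.append(f"example {i}, {label}: {value!r} is {kind}, not str")
                elif value not in text:
                    problems.append(f"example {i}, {label}: {value!r} not found in text")
    return problems


# Second example from the thread, written the way a YAML loader
# would typically return it: the unquoted 17 arrives as an int.
loaded = [
    {
        "text": "谷氨酰转肽酶（GGT） 17 IU/L 10-60",
        "entities": {
            "实验室检查的指标": ["谷氨酰转肽酶（GGT）"],
            "实验室检查的单位": ["IU/L"],
            "实验室检查的结果数值": [17],
            "实验室检查的范围值": ["10-60"],
        },
    }
]
for problem in check_examples(loaded):
    print(problem)
```

If the numeric values turn out to be a problem, quoting them in the YAML file (for example `- "17"`) keeps them as strings.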

fewshot.cfg:
[paths]
examples = null

[nlp]
lang = "zh"
pipeline = ["llm_ner"]

[components]

[components.llm_ner]
factory = "llm"

[components.llm_ner.task]
@llm_tasks = "spacy.NER.v2"
labels = 实验室检查的指标,实验室检查的单位,实验室检查的结果数值,实验室检查的范围值

[components.llm_ner.task.examples]
@misc = "spacy.FewShotReader.v1"
path = ${paths.examples}

[components.llm_ner.model]
@llm_models = "spacy.GPT-3-5.v2"
name = "gpt-3.5-turbo"
config = {"temperature": 0.0}

zeroshot.cfg:

[nlp]
lang = "zh"
pipeline = ["llm_ner"]

[components]

[components.llm_ner]
factory = "llm"

[components.llm_ner.task]
@llm_tasks = "spacy.NER.v2"
labels = 实验室检查的指标,实验室检查的单位,实验室检查的结果数值,实验室检查的范围值

[components.llm_ner.model]
@llm_models = "spacy.GPT-3-5.v2"
name = "gpt-3.5-turbo"
config = {"temperature": 0.0}

run_pipeline.py

import os
from pathlib import Path
from typing import Optional

import typer
from wasabi import msg

from spacy_llm.util import assemble

Arg = typer.Argument
Opt = typer.Option


def run_pipeline(
    # fmt: off
    text: str = Arg("", help="Text to perform text categorization on."),
    config_path: Path = Arg(..., help="Path to the configuration file to use."),
    examples_path: Optional[Path] = Arg(None, help="Path to the examples file to use (few-shot only)."),
    verbose: bool = Opt(False, "--verbose", "-v", help="Show extra information."),
    # fmt: on
):
    # The OpenAI model needs an API key; stop early if it is missing.
    if not os.getenv("OPENAI_API_KEY", None):
        msg.fail(
            "OPENAI_API_KEY env variable was not found. "
            "Set it by running 'export OPENAI_API_KEY=...' and try again.",
            exits=1,
        )

    msg.text(f"Loading config from {config_path}", show=verbose)
    # Only override paths.examples when an examples file was passed (few-shot).
    nlp = assemble(
        config_path,
        overrides={}
        if examples_path is None
        else {"paths.examples": str(examples_path)},
    )

    doc = nlp(text)
    msg.text(f"Entities: {[(ent.text, ent.label_, ent.start, ent.end) for ent in doc.ents]}")


if __name__ == "__main__":
    typer.run(run_pipeline)

!python run_pipeline.py
"中国医学科学院阜外医院检验报告单姓名：贾全喜性别：男年龄： 55岁门诊：0066000117992 样品号：科别：门诊床号：诊断：标本种类：血清送检项目： 0265 生化全套项目结果单位参考值项目结果单位 1 前白蛋白（PA） 302.65 mg/L 180-400 参考值 2 ＊总蛋白（TP） 69.9 19＊尿酸（URIC） 542.06 umol/L 1 148.8-416.5 g/L 65-85 20 ＊肌酸激酶（CK） IU/L 0-200 16 3 ＊白蛋白（溴甲酚绿法）（ALB） 41.6 g/L 40-55 21 肌酸激酶同工酶（CKMB-Mass） 2.06 ng/nL 0-5 4 ＊丙氨酸氨基转移酶（ALT） 22 IU/L 9-50 22＊乳酸脱氢酶（LDH） 149 IU/L 0-250 5 ＊天门冬氨酸氨基转移酶（AST） 24 IU/L 15-40 23 淀粉酶（AMY） 100 U/L 0-220 6 ＊碱性磷酸酶（ALP） 85 45-125 24 脂蛋白（a）（Lp（a）） 827.42 ng/L ↑ 10-300 1/0I 7 ＊谷氨酰转肽酶（GGT） 17 IU/L 10-60 25 超敏C反应蛋白（HSCRP） 1.28 mg/L 0.00-3.00 8 ＊总胆红素（TBi1） 16.94 umo1/L 5.1-19 26 同型半胱氨酸（HCY） 8.31 umol/L 6-15 9 直接胆红素（DBil） 4.34 μmol/L 0-6.8 27 游离脂肪酸（FFA） 0.65 mmol/L t 0.1-0.6 10＊钾（K） 4.41 mmol/L 3.5-5.3 28＊甘油三酯（TG） 0.94 mmol/L 0.38-1.76 11＊钠（NA） 141.69 mmol/1 137-147 29＊总胆固醇（CHOL） 3.38 mmol/L 13.64-5.98 12＊氯（CL） 101.89mmol/L 99-110 30＊高密度脂蛋白胆固醇（HDL-C） 1.10 mmol/L 0.7-1.59 13二氧化碳（C02） 32.65 mmol/L ↑21.0-31.0 31＊低密度脂蛋白胆固醇（LDL-C） 1.86 mmol/L 一般人群＜3.37 14＊葡萄糖（GLU） 5.01 mmol/L 3.58-6.05 高危人群＜2.59 15＊磷（P） 0.95 mmol/L ↓0.97-1.50 极高危人群＜2.00 16＊钙（CA） 2.40 mmol/L 2.2-2.75 32 小密低密度脂蛋白（sdLDL） 0.55 mmol/L 0.23-1.39 17＊肌酐（苦味酸法）（CREA） 89.54 umol/L 44-133 33 载脂蛋白A1（apoA1） 1.05 g/L 11.1-1.8 18＊尿素氮（BUN） 5.45 mmol/L 2.86-7.90 34 载脂蛋白B（apoB） 0.67 g/L 0.5-1.2 极高危人群：急性冠脉综合征（ACS）或冠心病／缺血性脑卒中／周围动脉硬化合并糖尿病。申请日期：2021.08.18 采样时间：2021.08.19 09：08 接收时间：2021.08.19 10：08 报告时间：2021.08.19 11：43 申请医师：李子煦检验者：邢跃雷审核者：苏保满备注：此报告仅对送检样本负责。＊标记项目为北京市三级医院互认项目实验诊断中心生化电话： 88398271"
./zeroshot.cfg

tianchiguaixia · 2023-11-16T11:06:51Z

The complete code is included in the attachment.

tianchiguaixia · 2023-11-24T09:29:41Z

Hello, I have found a problem. It works when I run it as zero-shot, but as soon as I use few-shot it reports an error. Why does providing few-shot examples cause an error?

rmitsch · 2023-11-27T08:26:58Z

It's difficult to diagnose why fewshotting yields worse results here. I recommend debugging with one example at a time and looking into the raw output received from the model (see here on how to do that).
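
One way to look at the raw output, sketched under assumptions: spacy-llm has a `save_io` setting on the `llm` component (here that would be `save_io = true` under `[components.llm_ner]`), and with it enabled the prompt and response should be stored in `doc.user_data["llm_io"]` under the component name. The helper `format_llm_io` below is hypothetical and only formats that dictionary; the exact keys (`prompt`, `response`) are taken from the library's documentation as best recalled, so verify them against your installed version.

```python
from typing import Any, Dict


def format_llm_io(user_data: Dict[str, Any], component: str = "llm_ner") -> str:
    """Return the saved prompt and response of one component as readable text.

    Hypothetical helper. Assumes the pipeline was assembled with
    `save_io = true`, so that the raw I/O is available under
    user_data["llm_io"][<component name>].
    """
    io = user_data.get("llm_io", {}).get(component)
    if io is None:
        return f"No saved I/O for component '{component}' (is save_io enabled?)"
    return (
        "=== PROMPT ===\n"
        f"{io.get('prompt', '')}\n"
        "=== RESPONSE ===\n"
        f"{io.get('response', '')}"
    )
```

In `run_pipeline.py` this would be called as `print(format_llm_io(doc.user_data))` right after `doc = nlp(text)`. Comparing the output for `zeroshot.cfg` and `fewshot.cfg` shows whether the few-shot prompt is malformed or whether the response is something the task cannot parse.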

The fact that you get no entities at all when you include few-shot examples indicates that the LLM might have trouble understanding those examples, or that the output it produces with those examples included is incoherent and cannot be parsed. Either way, the best way forward is to take a closer look at what both the prompt and the response look like when you add one example at a time.
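
Adding one example at a time can be prepared up front, as in the sketch below. It writes a series of files where file k contains the first k examples, so each run adds exactly one example over the previous one. The function name `write_incremental_example_files` and the file naming are hypothetical. The files are written as JSON on the assumption that `spacy.FewShotReader.v1` accepts `.json` files in addition to `.yml`; check this against the spacy-llm documentation for your version.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def write_incremental_example_files(
    examples: List[Dict[str, Any]], out_dir: Path
) -> List[Path]:
    """Write examples_1.json ... examples_N.json; file k holds the first k examples.

    Hypothetical helper for step-by-step few-shot debugging.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    paths: List[Path] = []
    for k in range(1, len(examples) + 1):
        path = out_dir / f"examples_{k}.json"
        # ensure_ascii=False keeps the Chinese text readable in the file.
        path.write_text(
            json.dumps(examples[:k], ensure_ascii=False, indent=2),
            encoding="utf-8",
        )
        paths.append(path)
    return paths
```

Each generated file can then be passed as the third argument to `run_pipeline.py` together with `fewshot.cfg`. The first file at which the entities disappear points to the example that needs a closer look.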

tianchiguaixia · 2023-11-27T08:31:12Z

Thank you very much for your answers. It would be even better if a Chinese model could be added later on.

rmitsch added the feat/task Feature: tasks label Nov 16, 2023
