Raget: Possible miscalculation of all Ragas metrics, in particular Precision and Recall #1924

Closed
Chabert-Liddell opened this issue May 3, 2024 · 2 comments

Comments

@Chabert-Liddell

Issue Type

Bug

Source

source

Giskard Library Version

2.11

Giskard Hub Version

OS Platform and Distribution

No response

Python version

No response

Installed python packages

No response

Current Behaviour?

Giskard RAGet uses the reference context when calling Ragas. 

https://github.com/Giskard-AI/giskard/blob/main/giskard/rag/metrics/ragas_metrics.py

        ragas_sample = {
            "question": question_sample["question"],
            "answer": answer,
            "contexts": question_sample["reference_context"].split("\n\n"),
            "ground_truth": question_sample["reference_answer"],
        }

According to the Ragas documentation, the retrieved context (the one actually used for answer generation) should be passed here instead.

For example, Precision and Recall both use {"question", "contexts", "ground_truth"}. If you pass the reference context, you are evaluating your test set generation pipeline, not your RAG pipeline.
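A minimal sketch of the change this suggests, assuming the contexts retrieved by the RAG pipeline are available next to the generated answer. The helper name `build_ragas_sample` and the `retrieved_contexts` argument are hypothetical and not part of the Giskard or Ragas API; only the dictionary keys are taken from the snippet above.

```python
from typing import Dict, List


def build_ragas_sample(
    question_sample: Dict[str, str],
    answer: str,
    retrieved_contexts: List[str],
) -> Dict[str, object]:
    """Assemble a sample dict with the same keys as the snippet above.

    Hypothetical helper: "contexts" holds the documents the RAG pipeline
    actually retrieved, not question_sample["reference_context"].
    """
    return {
        "question": question_sample["question"],
        "answer": answer,
        # Contexts retrieved at answer time, so the metric scores the retriever.
        "contexts": list(retrieved_contexts),
        "ground_truth": question_sample["reference_answer"],
    }


# Illustrative data only.
question_sample = {
    "question": "What is the capital of France?",
    "reference_context": "Paris is the capital of France.\n\nFrance is in Europe.",
    "reference_answer": "Paris",
}
sample = build_ragas_sample(
    question_sample,
    answer="The capital is Paris.",
    retrieved_contexts=["Lyon is a city in France."],
)
```

With this shape, a retriever that returns an irrelevant document is reflected in the "contexts" value, whereas the reference context would always look ideal.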

Standalone code OR list down the steps to reproduce the issue

.

Relevant log output

No response

@alexcombessie
Member

@pierlj what do you think?

@pierlj
Contributor

pierlj commented May 7, 2024

Hi @Chabert-Liddell, you are right, thanks for pointing this out. A fix will be released soon!
