LLM common metrics for Generative AI #955

Merged: 34 commits into open-telemetry:main on May 28, 2024

@drewby drewby commented Apr 24, 2024

Fixes #811

Changes

This adds initial metric definitions to the current set of gen_ai semantic conventions. The first two metrics (`gen_ai.usage.tokens` and `gen_ai.request.duration`) are a minimal set to get started; more can be added in future PRs.
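
For readers who want a feel for how the two metrics might be recorded, here is a minimal sketch. It uses a stand-in in-memory recorder rather than the OpenTelemetry SDK. Treating both metrics as histograms, the units, and the attribute keys `model` and `token_type` are assumptions made for illustration, not definitions taken from this PR.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class InMemoryHistogram:
    """Stand-in for a metrics histogram; NOT the OpenTelemetry SDK."""
    name: str
    unit: str
    # Maps a sorted tuple of attribute pairs to the list of recorded values.
    points: dict = field(default_factory=lambda: defaultdict(list))

    def record(self, value, attributes):
        key = tuple(sorted(attributes.items()))
        self.points[key].append(value)


# Metric names come from the PR description; units are assumptions.
token_usage = InMemoryHistogram("gen_ai.usage.tokens", unit="{token}")
request_duration = InMemoryHistogram("gen_ai.request.duration", unit="s")


def record_call(model, prompt_tokens, completion_tokens, duration_s):
    """Record one hypothetical model call against both metrics."""
    base = {"model": model}  # hypothetical attribute key
    # One data point per token type, so prompt and completion stay separable.
    token_usage.record(prompt_tokens, {**base, "token_type": "prompt"})
    token_usage.record(completion_tokens, {**base, "token_type": "completion"})
    request_duration.record(duration_s, base)
```

Calling `record_call("example-model", 12, 30, 0.25)` would add two token data points and one duration data point for that hypothetical model.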

@drewby drewby requested review from a team as code owners April 24, 2024 06:41

@cartermp cartermp left a comment

A question/observation:

Should we go through and just use "Gen AI Model" instead of "LLM" throughout the content of this document? Naming attributes `gen_ai.*` but then having the descriptions talk about an LLM feels a little inconsistent to me now.

I think we're already at the point where the same model supports multi-modal inputs and outputs. Consider the following in Claude's API reference:

Starting with Claude 3 models, you can also send image content blocks:

{"role": "user", "content": [
  {
    "type": "image",
    "source": {
      "type": "base64",
      "media_type": "image/jpeg",
      "data": "/9j/4AAQSkZJRg..."
    }
  },
  {"type": "text", "text": "What is in this image?"}
]}
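
As a rough illustration of the request shape quoted above, the sketch below builds the same message structure in Python and base64-encodes the image bytes. The helper name `image_message` and the four placeholder bytes are hypothetical; no API call is made.

```python
import base64
import json


def image_message(image_bytes, media_type, question):
    """Build a user message with one image block followed by one text block."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": media_type,
                    "data": encoded,
                },
            },
            {"type": "text", "text": question},
        ],
    }


# Placeholder bytes stand in for the contents of a real JPEG file.
message = image_message(b"\xff\xd8\xff\xe0", "image/jpeg", "What is in this image?")
print(json.dumps(message, indent=2))
```

The point for the naming discussion is that a single request carries both image and text content, so "LLM" undersells what the model is doing.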

And with OpenAI it's a little less straightforward, but still possible:

An array of content parts with a defined type, each can be of type `text` or `image_url` when passing in images. You can pass multiple images by adding multiple `image_url` content parts. Image input is only supported when using the `gpt-4-vision-preview` model.
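
A sketch of the content-parts array described in that quote, with one `text` part and several `image_url` parts. The nested `{"url": ...}` object reflects my understanding of the Chat Completions message format and is not stated in the quote; the helper name and the example.com URLs are hypothetical.

```python
def openai_user_message(text, image_urls):
    """Build a user message with one text part and one part per image URL."""
    content = [{"type": "text", "text": text}]
    for url in image_urls:
        # Assumed shape: each image part wraps its URL in an "image_url" object.
        content.append({"type": "image_url", "image_url": {"url": url}})
    return {"role": "user", "content": content}


# Hypothetical URLs, used only to show multiple image parts in one message.
example = openai_user_message(
    "What do these images have in common?",
    ["https://example.com/a.jpg", "https://example.com/b.jpg"],
)
```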

It's orthogonal to this PR, but maybe it's good to start here and not "limit" ourselves by using the LLM terminology, since it's usually associated with just text interpretation and generation?

drewby commented Apr 26, 2024

I agree with both addressing this soon and doing it separately from this PR. I've updated everything I could that does not yet affect Spans (keeping this as the metrics PR). Let's create another PR to update the other references to LLM.

@lmolkova lmolkova left a comment

LGTM! Left a few minor-ish comments

drewby commented May 25, 2024

LGTM! Left a few minor-ish comments

Thanks! All resolved.

@lmolkova lmolkova merged commit 805cad3 into open-telemetry:main May 28, 2024
12 checks passed

Successfully merging this pull request may close these issues.

LLM: common and OpenAI metrics
7 participants