Support Usage Tokens Output in Claude API #667

Open
exa256 opened this issue May 14, 2024 · 3 comments

Comments


exa256 commented May 14, 2024

Is your feature request related to a problem? Please describe.
Currently, reporting the usage dictionary from the OpenAI API is supported, as shown in the docs: https://python.useinstructor.com/concepts/usage/?h=token+usage

However, the Claude API patch does not have this functionality, even though usage is available in a successful 200 response from Anthropic's server (see https://docs.anthropic.com/en/api/messages):

{
  "content": [
    {
      "text": "Hi! My name is Claude.",
      "type": "text"
    }
  ],
  "id": "msg_013Zva2CMHLNnXjNJJKqJ2EF",
  "model": "claude-3-opus-20240229",
  "role": "assistant",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "type": "message",
  "usage": {
    "input_tokens": 10,
    "output_tokens": 25
  }
}
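
For reference, this usage object is already readable when calling the Anthropic SDK directly, without instructor; a minimal example (model name and prompt are placeholders):

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello, Claude"}],
)

# usage is a first-class field on the Message object
print(message.usage.input_tokens, message.usage.output_tokens)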

Describe the solution you'd like
Instructor should patch Claude's API and surface the usage dictionary as part of the second element of the returned tuple, like so:

structured_output, completion = client.chat.completions.create_with_completion(...)
completion.usage  # should return usage, consisting of input and output tokens
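
For illustration, a minimal sketch of the requested behavior against the Anthropic client; it assumes instructor.from_anthropic and create_with_completion pass the raw Message through, with field names following Anthropic's response format above:

import instructor
from anthropic import Anthropic
from pydantic import BaseModel


class User(BaseModel):
    name: str
    age: int


client = instructor.from_anthropic(Anthropic())

user, completion = client.chat.completions.create_with_completion(
    model="claude-3-opus-20240229",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Extract Jason is 25 years old."}],
    response_model=User,
)

# With Anthropic, the raw completion would expose input_tokens/output_tokens
print(completion.usage.input_tokens, completion.usage.output_tokens)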

Elijas commented May 30, 2024

If you use Anthropic Claude through LiteLLM, the usage and cost get reported:

import instructor
from litellm import completion, completion_cost, cost_per_token
from pydantic import BaseModel


class User(BaseModel):
    name: str
    age: int


client = instructor.from_litellm(completion)

model = "claude-3-opus-20240229"

resp, raw_response = client.chat.completions.create_with_completion(
    model=model,
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "Extract Jason is 25 years old.",
        }
    ],
    response_model=User,
)

assert isinstance(resp, User)
assert resp.name == "Jason"
assert resp.age == 25

# LiteLLM normalizes usage to OpenAI-style field names
usage = raw_response.usage
input_tokens = usage.prompt_tokens
output_tokens = usage.completion_tokens
total_tokens = usage.total_tokens

input_cost_usd, output_cost_usd = cost_per_token(
    model, prompt_tokens=input_tokens, completion_tokens=output_tokens
)
completion_cost_usd = completion_cost(completion_response=raw_response)
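
Note that LiteLLM returns OpenAI-format responses, so Anthropic's input_tokens and output_tokens surface here under the OpenAI-style names prompt_tokens and completion_tokens.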

ssonal (Contributor) commented Jun 1, 2024

> Describe the solution you'd like
> Instructor should patch Claude's API and surface the usage dictionary as part of the second element of the returned tuple, like so:

structured_output._raw_response.usage works, but it doesn't take retries into account.

@jxnl maybe we attach cumulative usage data here? It's currently getting lost while the response is processed.

model._raw_response = response
return model
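
As a sketch, cumulative tracking could fold each attempt's usage into a running total; accumulate_usage and _total_usage below are hypothetical names, not current instructor API, and Anthropic's Usage model is assumed:

from anthropic.types import Usage


def accumulate_usage(total: Usage, attempt: Usage) -> Usage:
    # Hypothetical helper: fold one retry attempt's usage into a running total
    return Usage(
        input_tokens=total.input_tokens + attempt.input_tokens,
        output_tokens=total.output_tokens + attempt.output_tokens,
    )


total = Usage(input_tokens=0, output_tokens=0)
attempts = [
    Usage(input_tokens=10, output_tokens=25),  # first attempt, failed validation
    Usage(input_tokens=12, output_tokens=30),  # retry that succeeded
]
for attempt_usage in attempts:
    total = accumulate_usage(total, attempt_usage)

# model._raw_response = response   # as today: only the final response
# model._total_usage = total       # hypothetical: cumulative across retries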


Elijas commented Jun 1, 2024

> structured_output._raw_response.usage works, but it doesn't take retries into account.
>
> @jxnl maybe we attach cumulative usage data here? It's currently getting lost while the response is processed.
>
> model._raw_response = response
> return model

related #715
