Iterable not working with larger nested structures #657

Closed
5 of 7 tasks
timlod opened this issue May 10, 2024 · 3 comments
Comments

@timlod

timlod commented May 10, 2024

  • [?] This is actually a bug report.
  • I am not getting good LLM Results
  • I have tried asking for help in the community on discord or discussions and have not received a response.
  • I have tried searching the documentation and have not found an answer.

What Model are you using?

  • gpt-3.5-turbo
  • gpt-4-turbo
  • gpt-4
  • Other (please specify)

Describe the bug
I have a relatively complex BaseModel, with nested BaseModels. When asking for a generation with that model, it works just fine, but as soon as I want to retrieve multiple within a single call it fails to validate.
I can't share the exact model I used, so I created a working example using the example from https://github.com/wandb/edu/blob/main/llm-structured-extraction/2.tips.ipynb.

This seems to become an issue only once we ask for an Iterable of 3 nested BaseModels.
In the example I've created no models can create age_range at all. If we remove the top level (i.e. use Character directly instead of CharacterAndNickname), it will add the age_range just fine.

In my specific case, the model doesn't appear to be followed at all once I ask for an Iterable.

I'm not entirely sure if this is an actual bug, or just a limitation of capabilities. But the fact that GPT4 is not a bit better in this case than GPT3.5 makes me think it might be a bug.

To Reproduce

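from typing import Iterable, List, Optional

import instructor
from openai import OpenAI
from pydantic import BaseModel, Field


class Range(BaseModel):
    minimum: Optional[int] = None
    maximum: Optional[int] = None


# Character must be defined before CharacterAndNickname, which references it.
class Character(BaseModel):
    id: int
    name: str
    friends_array: List[int] = Field(
        description="Relationships to their friends using the id"
    )
    age_range: Range = Field(
        description=(
            "The range of ages the character has over the course of the"
            " series."
        )
    )


class CharacterAndNickname(BaseModel):
    nickname: str
    character: Character


# Assumed client setup (not shown in the original report); needs OPENAI_API_KEY.
client = instructor.from_openai(OpenAI())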


resp = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[
        {
            "role": "user",
            "content": (
                "5 kids from Harry Potter. Make sure to get all numbers"
                " right, especially their age ranges!"
            ),
        }
    ],
    response_model=Iterable[CharacterAndNickname],
)

for character in resp:
    print(character)

In this case we get the following errors:

ValidationError: 5 validation errors for IterableCharacterAndNickname
tasks.0.character.age_range
  Input should be an object [type=model_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.7/v/model_type
tasks.1.character.age_range
  Input should be an object [type=model_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.7/v/model_type
tasks.2.character.age_range
  Input should be an object [type=model_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.7/v/model_type
tasks.3.character.age_range
  Input should be an object [type=model_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.7/v/model_type
tasks.4.character.age_range
  Input should be an object [type=model_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.7/v/model_type
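The errors above say that each item came back with a null age_range, which the required Range field rejects. A minimal offline sketch of that rejection, using only pydantic and no API call, is below. The two JSON payloads and the validation_errors helper are hand-written assumptions for illustration, not actual model output or part of the library.

```python
from typing import List, Optional

from pydantic import BaseModel, Field, ValidationError


class Range(BaseModel):
    minimum: Optional[int] = None
    maximum: Optional[int] = None


class Character(BaseModel):
    id: int
    name: str
    friends_array: List[int] = Field(
        description="Relationships to their friends using the id"
    )
    age_range: Range = Field(
        description="The range of ages the character has over the course of the series."
    )


class CharacterAndNickname(BaseModel):
    nickname: str
    character: Character


def validation_errors(raw_json: str):
    """Hypothetical helper: return (location, error type) pairs, or [] if valid."""
    try:
        CharacterAndNickname.model_validate_json(raw_json)
    except ValidationError as exc:
        return [(err["loc"], err["type"]) for err in exc.errors()]
    return []


# Hand-written payloads (assumptions): one with a null age_range, one with an empty object.
null_range = (
    '{"nickname": "The Boy Who Lived", "character": '
    '{"id": 1, "name": "Harry Potter", "friends_array": [2], "age_range": null}}'
)
empty_range = (
    '{"nickname": "The Boy Who Lived", "character": '
    '{"id": 1, "name": "Harry Potter", "friends_array": [2], "age_range": {}}}'
)

print(validation_errors(null_range))
print(validation_errors(empty_range))
```

Because both fields of Range are optional, an empty object validates while null does not. This suggests the failure is the model emitting null for the nested object rather than getting the numbers wrong.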

I needed to add detail to get to this failure, but it's interesting that all models (GPT-3.5, GPT-4, and the turbo variants) failed to generate multiple characters once I added the third nested level.

Expected behavior
Multiple results generated regardless of complexity.

@jxnl
Owner

jxnl commented May 10, 2024

For something like this, I would rather improve the dog string or include an example.
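One possible reading of this suggestion, sketched under assumptions rather than taken from the thread: add docstrings to the models from the report and attach an example to the age_range field, so the extra guidance ends up in the JSON schema. The docstring wording, the added field descriptions on Range, and the example values are all illustrative, and whether this resolves the validation errors is untested here.

```python
from typing import List, Optional

from pydantic import BaseModel, Field


class Range(BaseModel):
    """An age range in years. Fill in both bounds whenever they are known."""

    minimum: Optional[int] = Field(default=None, description="Youngest age in years")
    maximum: Optional[int] = Field(default=None, description="Oldest age in years")


class Character(BaseModel):
    """A single character. Always return age_range as an object, never null."""

    id: int
    name: str
    friends_array: List[int] = Field(
        description="Relationships to their friends using the id"
    )
    age_range: Range = Field(
        description="The range of ages the character has over the course of the series.",
        examples=[{"minimum": 11, "maximum": 17}],  # illustrative value only
    )


class CharacterAndNickname(BaseModel):
    """A character together with a nickname. Populate every nested field."""

    nickname: str
    character: Character


# Docstrings become "description" entries in the generated JSON schema,
# and the field example is carried along under "examples".
schema = CharacterAndNickname.model_json_schema()
print(schema["description"])
print(schema["$defs"]["Character"]["properties"]["age_range"]["examples"])
```

The sketch only shows that the added text reaches the schema pydantic generates; it assumes the client library forwards that schema to the model.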

@timlod
Author

timlod commented May 10, 2024

You mean docstring? Not sure which one you'd be referring to, or what kind of example to include where.

@jxnl
Owner

jxnl commented May 11, 2024

jxnl closed this as not planned on Jun 9, 2024