New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add service to openai integration to use openai vision #117156
base: dev
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there @balloob, mind taking a look at this pull request as it has been labeled with an integration ( Code owner commandsCode owners of
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems useful to follow the conventions in https://www.home-assistant.io/integrations/google_generative_ai_conversation/#service-google_generative_ai_conversationgenerate_content in the homes that we'll be able to converge ollama and google at some point.
We're planning a generative_ai integration that will standardize how we describe images. It's currently waiting for config subentries to be finalized (development is in progress) |
Would you say its worth waiting with this PR and possibly migrating it to whatever the result of that might be? Or should we rather close it for now, since it'll be vastly different bits of code anyway? In which case i'll maybe just dump my changes into a temporary custom component. |
I would stick to a custom component. I don't think we'll make this release but we've moved mountains before. |
Proposed change
OpenAI provides a lot of interesting features. One feature thats quite interesting for home automation scenarios is the Vision API, which lets clients provide images and ask OpenAI questions about it.
I extended the existing openai_conversation integrations which so far only provides a service for image generation. I added an additional service, which takes a camera entity as an input and uses OpenAI to analyze it.
One usecase for this would using OpenAI to find out if the person at your front door is a delivery person, some children, or a group of dogs.
This is obviously very much inspired by AmbleGPT, which does basically the same thing, except better (using several frames and providing more context). The difference being that using my provided change lets you integrate it into arbitrary automations easily.
Type of change
Additional information
Checklist
ruff format homeassistant tests
)If user exposed functionality or configuration variables are added/changed:
If the code communicates with devices, web services, or third-party tools:
Updated and included derived files by running:
python3 -m script.hassfest
.requirements_all.txt
.Updated by running
python3 -m script.gen_requirements_all
..coveragerc
.To help with the load of incoming pull requests: