Azure OpenAI binding: Enable support for multiple endpoints #3381

Open
stuartleeks opened this issue Mar 18, 2024 · 1 comment
Labels
kind/enhancement New feature or request P1 pinned Issue does not get stale

@stuartleeks
Contributor

Describe the feature

The current Azure OpenAI binding takes in configuration for connecting to a single Azure OpenAI endpoint.

In a number of scenarios it is useful to be able to work with several Azure OpenAI endpoints.

Scenario 1 - fail-over

For high-volume usage, customers may purchase a Provisioned Throughput Unit (PTU). The PTU capacity isn't always sufficient for peak load, so a customer might want to send a request to the PTU endpoint first and then re-send it to a Pay-As-You-Go (PAYG) endpoint if the PTU endpoint returns a 429 response.
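To make the fail-over scenario concrete, here is a minimal Go sketch. It is not the binding's code: `sendFunc`, `sendWithFailover`, and `errAllThrottled` are hypothetical names, and the actual HTTP call is abstracted behind a function so the ordering logic can be seen in isolation. It assumes that only a 429 should trigger a move to the next endpoint.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// sendFunc is a hypothetical stand-in for "send the request to one
// Azure OpenAI endpoint". It returns the HTTP status code and body.
type sendFunc func(endpoint string) (status int, body string, err error)

// errAllThrottled is returned when every configured endpoint answered 429.
var errAllThrottled = errors.New("all endpoints returned 429")

// sendWithFailover tries the endpoints in the order given (for example
// PTU first, then PAYG). It moves on to the next endpoint only when the
// current one returns 429; any other response or error goes straight
// back to the caller. It returns the endpoint that produced the result.
func sendWithFailover(endpoints []string, send sendFunc) (string, string, error) {
	for _, ep := range endpoints {
		status, body, err := send(ep)
		if err != nil {
			return ep, "", err
		}
		if status == http.StatusTooManyRequests {
			continue // throttled: fall over to the next endpoint
		}
		return ep, body, nil
	}
	return "", "", errAllThrottled
}

func main() {
	// Fake sender for illustration: the PTU endpoint is throttled.
	fake := func(ep string) (int, string, error) {
		if ep == "ptu" {
			return http.StatusTooManyRequests, "", nil
		}
		return http.StatusOK, "ok from " + ep, nil
	}
	used, body, err := sendWithFailover([]string{"ptu", "payg"}, fake)
	fmt.Println(used, body, err)
}
```

A real implementation would probably also want to honour any retry hints the service returns and decide whether other status codes should trigger fail-over; the sketch leaves both out.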

Scenario 2 - round-robin

The limits for Azure OpenAI are per-region, so customers may set up multiple PAYG endpoints across regions and want to distribute requests between them.
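The round-robin scenario could look roughly like the following Go sketch. The `roundRobin` type and its `pick` method are hypothetical and not part of the existing binding; the sketch assumes a non-empty endpoint list and uses an atomic counter so that concurrent requests can share one selector.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// roundRobin hands out endpoints in rotation. It is a hypothetical
// helper for illustration; endpoints must be non-empty.
type roundRobin struct {
	endpoints []string
	next      atomic.Uint64 // number of picks made so far
}

// pick returns the next endpoint in the rotation. The atomic increment
// makes it safe to call from multiple goroutines.
func (r *roundRobin) pick() string {
	n := r.next.Add(1) - 1
	return r.endpoints[n%uint64(len(r.endpoints))]
}

func main() {
	// Region names are placeholders, not real endpoint URLs.
	rr := &roundRobin{endpoints: []string{"region-a", "region-b", "region-c"}}
	for i := 0; i < 4; i++ {
		fmt.Println(rr.pick())
	}
}
```

Plain rotation ignores how loaded each region is; weighting endpoints or skipping ones that recently returned 429 would be natural extensions, but they go beyond what this issue describes.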

Proposal

Sometimes customers with either of the above requirements set up a gateway in front of the Azure OpenAI endpoints and have it handle the load distribution, but in other cases they return to the application code to add these capabilities as usage scales up.

The proposal is to update the Azure OpenAI binding to allow multiple endpoints to be configured along with a distribution mode (failover or round-robin).
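As one possible shape for this, the Go sketch below parses a list of endpoints and a distribution mode from a string-keyed metadata map. Everything in it is an assumption made for illustration: the metadata keys `endpoints` and `distributionMode`, the comma-separated format, and the type and function names are hypothetical and are not fields of the current binding.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// distributionMode names the two modes mentioned in the proposal.
type distributionMode string

const (
	modeFailover   distributionMode = "failover"
	modeRoundRobin distributionMode = "round-robin"
)

// multiEndpointConfig is a hypothetical configuration shape.
type multiEndpointConfig struct {
	Endpoints []string
	Mode      distributionMode
}

// parseConfig reads the hypothetical "endpoints" (comma-separated) and
// "distributionMode" keys and validates them.
func parseConfig(md map[string]string) (multiEndpointConfig, error) {
	var cfg multiEndpointConfig
	for _, ep := range strings.Split(md["endpoints"], ",") {
		if ep = strings.TrimSpace(ep); ep != "" {
			cfg.Endpoints = append(cfg.Endpoints, ep)
		}
	}
	if len(cfg.Endpoints) == 0 {
		return cfg, errors.New("at least one endpoint is required")
	}
	switch m := distributionMode(md["distributionMode"]); m {
	case modeFailover, modeRoundRobin:
		cfg.Mode = m
	default:
		return cfg, fmt.Errorf("unknown distribution mode %q", md["distributionMode"])
	}
	return cfg, nil
}

func main() {
	// Placeholder URLs, not real Azure OpenAI endpoints.
	cfg, err := parseConfig(map[string]string{
		"endpoints":        "https://ptu.example.com, https://payg.example.com",
		"distributionMode": "failover",
	})
	fmt.Println(cfg.Endpoints, cfg.Mode, err)
}
```

In failover mode the order of the list would matter (PTU first, PAYG second), whereas in round-robin mode it would only set the starting point of the rotation. Per-endpoint credentials are left out of the sketch, although separate endpoints would likely need their own keys.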

Release Note

RELEASE NOTE: ADD Enable multiple endpoints to be configured in Azure OpenAI binding.

@stuartleeks stuartleeks added the kind/enhancement New feature or request label Mar 18, 2024
@ItalyPaleAle ItalyPaleAle added P1 pinned Issue does not get stale labels Mar 20, 2024
@ItalyPaleAle ItalyPaleAle added this to the v1.13 milestone Mar 20, 2024
@berndverst
Member

@ItalyPaleAle since our OpenAI binding component is not stable, I don't think this is a P1 for the project.

Instead, updating the component to the latest SDK and making the component stable should be P1s first. Then this item could follow.

Did you mean to add it to milestone v1.14 by the way?

@ItalyPaleAle ItalyPaleAle modified the milestones: v1.13, v1.14 Mar 27, 2024