feat: support baidu ernie bot ai model #974

Closed
wants to merge 4 commits into from

hanxiantao
Contributor

Ⅰ. Describe what this PR did

Adds support for the Baidu ERNIE Bot AI model. API documentation: https://console.bce.baidu.com/tools/?_=1692863460488#/api?product=QIANFAN&project=%E5%8D%83%E5%B8%86%E5%A4%A7%E6%A8%A1%E5%9E%8B%E5%B9%B3%E5%8F%B0&parent=ERNIE%204.0&api=rpc%2F2.0%2Fai_custom%2Fv1%2Fwenxinworkshop%2Fchat%2Fcompletions_pro&method=post

Ⅱ. Does this pull request fix one issue?

fixes #941

Ⅲ. Why don't you add test cases (unit test/integration test)?

Ⅳ. Describe how to verify it

docker-compose.yaml

version: '3.7'
services:
  envoy:
    image: higress-registry.cn-hangzhou.cr.aliyuncs.com/higress/gateway:1.3.1
    entrypoint: /usr/local/bin/envoy
    # Note: debug-level logging is enabled for wasm here; production deployments default to info level
    command: -c /etc/envoy/envoy.yaml --component-log-level wasm:debug
    depends_on:
      - httpbin
    networks:
      - wasmtest
    ports:
      - "10000:10000"
    volumes:
      - ./envoy.yaml:/etc/envoy/envoy.yaml
      - ./main.wasm:/etc/envoy/main.wasm

  httpbin:
    image: kennethreitz/httpbin:latest
    networks:
      - wasmtest
    ports:
      - "12345:80"

networks:
  wasmtest: {}

Using the OpenAI protocol

envoy.yaml

admin:
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0
      port_value: 9901
static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          protocol: TCP
          address: 0.0.0.0
          port_value: 10000
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                scheme_header_transformation:
                  scheme_to_overwrite: https
                stat_prefix: ingress_http
                # Output envoy logs to stdout
                access_log:
                  - name: envoy.access_loggers.stdout
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
                # Modify as required
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: local_service
                      domains: [ "*" ]
                      routes:
                        - match:
                            prefix: "/"
                          route:
                            cluster: baidu
                            timeout: 300s
                http_filters:
                  - name: wasmtest
                    typed_config:
                      "@type": type.googleapis.com/udpa.type.v1.TypedStruct
                      type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
                      value:
                        config:
                          name: wasmtest
                          vm_config:
                            runtime: envoy.wasm.runtime.v8
                            code:
                              local:
                                filename: /etc/envoy/main.wasm
                          configuration:
                            "@type": "type.googleapis.com/google.protobuf.StringValue"
                            value: |
                              {
                                "provider": {
                                  "type": "baidu",
                                  "apiTokens": [
                                    "your-api-token"
                                  ],
                                  "baiduRequestPath": "/rpc/2.0/ai_custom/v1/wenxinworkshop/chat/completions_pro"
                                }
                              }
                  - name: envoy.filters.http.router
  clusters:
    - name: httpbin
      connect_timeout: 30s
      type: LOGICAL_DNS
      # Comment out the following line to test on v6 networks
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: httpbin
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: httpbin
                      port_value: 80
    - name: baidu
      connect_timeout: 30s
      type: LOGICAL_DNS
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: baidu
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: aip.baidubce.com
                      port_value: 443
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
          "sni": "aip.baidubce.com"

Non-streaming request
Request:

{
    "model": "gpt-4-turbo",
    "messages": [
        {
            "role": "user",
            "content": "你好,你是谁?"
        }
    ]
}

Response:

{
    "id": "as-zs1d25es9w",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "你好,我是文心一言,英文名是ERNIE Bot。我能够与人对话互动,回答问题,协助创作,高效便捷地帮助人们获取信息、知识和灵感。"
            },
            "finish_reason": "stop"
        }
    ],
    "created": 1716123502,
    "model": "gpt-4-turbo",
    "object": "chat.completion",
    "usage": {
        "prompt_tokens": 4,
        "completion_tokens": 33,
        "total_tokens": 37
    }
}

Streaming request
Request:

{
    "model": "gpt-4-turbo",
    "messages": [
        {
            "role": "user",
            "content": "你好,你是谁?"
        }
    ],
    "stream": true
}

Response:

data:{"id":"as-8tnvsnmncu","choices":[{"index":0,"message":{"role":"assistant","content":"你好,"}}],"created":1716124011,"model":"gpt-4-turbo","object":"chat.completion","usage":{"prompt_tokens":4,"total_tokens":4}}

data:{"id":"as-8tnvsnmncu","choices":[{"index":0,"message":{"role":"assistant","content":"我是文心一言,英文名是ERNIE Bot。"}}],"created":1716124012,"model":"gpt-4-turbo","object":"chat.completion","usage":{"prompt_tokens":4,"completion_tokens":12,"total_tokens":16}}

data:{"id":"as-8tnvsnmncu","choices":[{"index":0,"message":{"role":"assistant","content":"我能够与人对话互动,回答问题,协助创作,高效便捷地帮助人们获取信息、知识和灵感。"}}],"created":1716124014,"model":"gpt-4-turbo","object":"chat.completion","usage":{"prompt_tokens":4,"completion_tokens":12,"total_tokens":16}}

data:{"id":"as-8tnvsnmncu","choices":[{"index":0,"message":{"role":"assistant"},"finish_reason":"stop"}],"created":1716124014,"model":"gpt-4-turbo","object":"chat.completion","usage":{"prompt_tokens":4,"completion_tokens":33,"total_tokens":37}}



Using the ERNIE Bot (Wenxin Yiyan) native protocol

envoy.yaml

admin:
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0
      port_value: 9901
static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          protocol: TCP
          address: 0.0.0.0
          port_value: 10000
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                scheme_header_transformation:
                  scheme_to_overwrite: https
                stat_prefix: ingress_http
                # Output envoy logs to stdout
                access_log:
                  - name: envoy.access_loggers.stdout
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
                # Modify as required
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: local_service
                      domains: [ "*" ]
                      routes:
                        - match:
                            prefix: "/"
                          route:
                            cluster: baidu
                            timeout: 300s
                http_filters:
                  - name: wasmtest
                    typed_config:
                      "@type": type.googleapis.com/udpa.type.v1.TypedStruct
                      type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
                      value:
                        config:
                          name: wasmtest
                          vm_config:
                            runtime: envoy.wasm.runtime.v8
                            code:
                              local:
                                filename: /etc/envoy/main.wasm
                          configuration:
                            "@type": "type.googleapis.com/google.protobuf.StringValue"
                            value: |
                              {
                                "provider": {
                                  "type": "baidu",
                                  "apiTokens": [
                                    "your-api-token"
                                  ],
                                  "baiduRequestPath": "/rpc/2.0/ai_custom/v1/wenxinworkshop/chat/completions_pro",
                                  "protocol": "original"
                                }
                              }
                  - name: envoy.filters.http.router
  clusters:
    - name: httpbin
      connect_timeout: 30s
      type: LOGICAL_DNS
      # Comment out the following line to test on v6 networks
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: httpbin
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: httpbin
                      port_value: 80
    - name: baidu
      connect_timeout: 30s
      type: LOGICAL_DNS
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: baidu
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: aip.baidubce.com
                      port_value: 443
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
          "sni": "aip.baidubce.com"

Non-streaming request
Request:

{
    "messages": [
        {
            "role": "user",
            "content": "你好,你是谁?"
        }
    ]
}

Response:

{
    "id": "as-8jeij68pbi",
    "object": "chat.completion",
    "created": 1716124375,
    "result": "你好,我是文心一言,可以协助你完成范围广泛的任务并提供有关各种主题的信息,比如回答问题,提供定义和解释及建议。如果你有任何问题,请随时向我提问。",
    "is_truncated": false,
    "need_clear_history": false,
    "finish_reason": "normal",
    "usage": {
        "prompt_tokens": 4,
        "completion_tokens": 38,
        "total_tokens": 42
    }
}
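
Comparing this native response with the OpenAI-protocol response shown earlier suggests a field mapping like the one sketched below. This is an illustrative sketch, not the PR's implementation: the type and function names are hypothetical, only fields visible in the sample responses are modeled, and mapping the `finish_reason` value `normal` to `stop` is an assumption inferred from the two samples.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// usage mirrors the token counters present in both response formats.
type usage struct {
	PromptTokens     int `json:"prompt_tokens"`
	CompletionTokens int `json:"completion_tokens"`
	TotalTokens      int `json:"total_tokens"`
}

// baiduResponse models only the fields visible in the sample native response.
type baiduResponse struct {
	ID           string `json:"id"`
	Object       string `json:"object"`
	Created      int64  `json:"created"`
	Result       string `json:"result"`
	FinishReason string `json:"finish_reason"`
	Usage        usage  `json:"usage"`
}

type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type choice struct {
	Index        int     `json:"index"`
	Message      message `json:"message"`
	FinishReason string  `json:"finish_reason"`
}

// openAIResponse models the fields visible in the sample OpenAI-protocol response.
type openAIResponse struct {
	ID      string   `json:"id"`
	Choices []choice `json:"choices"`
	Created int64    `json:"created"`
	Model   string   `json:"model"`
	Object  string   `json:"object"`
	Usage   usage    `json:"usage"`
}

// toOpenAI converts a native response body (hypothetical helper).
// model is the model name the client sent in its request.
func toOpenAI(body []byte, model string) (*openAIResponse, error) {
	var in baiduResponse
	if err := json.Unmarshal(body, &in); err != nil {
		return nil, err
	}
	finish := in.FinishReason
	if finish == "normal" { // assumption: "normal" corresponds to OpenAI's "stop"
		finish = "stop"
	}
	return &openAIResponse{
		ID: in.ID,
		Choices: []choice{{
			Index:        0,
			Message:      message{Role: "assistant", Content: in.Result},
			FinishReason: finish,
		}},
		Created: in.Created,
		Model:   model,
		Object:  in.Object,
		Usage:   in.Usage,
	}, nil
}

func main() {
	// Placeholder body with a shortened "result"; not the captured output above.
	body := []byte(`{"id":"as-8jeij68pbi","object":"chat.completion","created":1716124375,"result":"hello","finish_reason":"normal","usage":{"prompt_tokens":4,"completion_tokens":38,"total_tokens":42}}`)
	out, err := toOpenAI(body, "gpt-4-turbo")
	if err != nil {
		panic(err)
	}
	encoded, _ := json.Marshal(out)
	fmt.Println(string(encoded))
}
```

Keeping the conversion a pure function over bytes makes it easy to unit test outside the gateway.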

Streaming request
Request:

{
    "messages": [
        {
            "role": "user",
            "content": "你好,你是谁?"
        }
    ],
    "stream": true
}

Response:

data: {"id":"as-g91d6f1yyv","object":"chat.completion","created":1716124427,"sentence_id":0,"is_end":false,"is_truncated":false,"result":"你好,","need_clear_history":false,"finish_reason":"normal","usage":{"prompt_tokens":4,"completion_tokens":0,"total_tokens":4}}

data: {"id":"as-g91d6f1yyv","object":"chat.completion","created":1716124429,"sentence_id":1,"is_end":false,"is_truncated":false,"result":"我是文心一言,可以协助你完成范围广泛的任务并提供有关各种主题的信息,比如回答问题,提供定义和解释及建议。","need_clear_history":false,"finish_reason":"normal","usage":{"prompt_tokens":4,"completion_tokens":0,"total_tokens":4}}

data: {"id":"as-g91d6f1yyv","object":"chat.completion","created":1716124430,"sentence_id":2,"is_end":false,"is_truncated":false,"result":"如果你有任何问题,请随时向我提问。","need_clear_history":false,"finish_reason":"normal","usage":{"prompt_tokens":4,"completion_tokens":0,"total_tokens":4}}

data: {"id":"as-g91d6f1yyv","object":"chat.completion","created":1716124430,"sentence_id":3,"is_end":true,"is_truncated":false,"result":"","need_clear_history":false,"finish_reason":"normal","usage":{"prompt_tokens":4,"completion_tokens":38,"total_tokens":42}}
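
A client consuming this stream has to strip the `data:` prefix from each event and concatenate the `result` fragments until `is_end` is true. The following is a minimal sketch of that loop, assuming one JSON object per `data:` line as in the sample above; `collectStream` and `streamChunk` are hypothetical names and are not part of the plugin.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// streamChunk models the two fields of a native streaming event used here.
type streamChunk struct {
	Result string `json:"result"`
	IsEnd  bool   `json:"is_end"`
}

// collectStream reads "data:" lines and joins their result fragments,
// stopping at the first chunk whose is_end flag is true.
func collectStream(stream string) (string, error) {
	var sb strings.Builder
	scanner := bufio.NewScanner(strings.NewReader(stream))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if !strings.HasPrefix(line, "data:") {
			continue // skip blank separator lines and any other fields
		}
		// Accept both "data: {...}" and "data:{...}".
		payload := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
		var chunk streamChunk
		if err := json.Unmarshal([]byte(payload), &chunk); err != nil {
			return "", err
		}
		sb.WriteString(chunk.Result)
		if chunk.IsEnd {
			break
		}
	}
	return sb.String(), scanner.Err()
}

func main() {
	// Shortened placeholder events, not the captured output above.
	stream := "data: {\"result\":\"Hello, \",\"is_end\":false}\n\n" +
		"data: {\"result\":\"world.\",\"is_end\":false}\n\n" +
		"data: {\"result\":\"\",\"is_end\":true}\n\n"
	text, err := collectStream(stream)
	fmt.Println(text, err)
}
```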



Ⅴ. Special notes for reviews

…igress into wasm-baidu-ai-proxy

# Conflicts:
#	plugins/wasm-go/extensions/ai-proxy/provider/provider.go

| Name | Data type | Requirement | Default | Description |
| ------------------ | -------- | -------- | ------ | ------------------------------------------------------------ |
| `baiduRequestPath` | string | Required | - | Different Baidu ERNIE Bot models use different request paths; this setting selects which model is called |

Contributor Author

That was my original plan.
The problem I ran into during development is that model mapping requires generating a different requestPath in OnRequestBody, based on the model field in the request body, and then calling _ = util.OverwriteRequestPath(requestPath) to replace the :path request header.
During verification, however, I found that request headers can no longer be modified inside OnRequestBody (error: error status returned by host: bad argument).
This has been discussed under https://github.com/alibaba/higress/issues/941
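
The model-mapping step described above can be sketched as a pure function. This is an illustration under stated assumptions, not the PR's code: `resolveRequestPath` and `requestPaths` are hypothetical names, the `"*"` catch-all mirrors the `modelMapping` configuration posted later in this thread, and the only path listed is the ERNIE 4.0 path from the PR description; other models would need their own entries.

```go
package main

import (
	"errors"
	"fmt"
)

// requestPaths maps a Baidu model name to its request path (hypothetical table).
// Only the ERNIE-4.0 entry is taken from the PR description.
var requestPaths = map[string]string{
	"ERNIE-4.0": "/rpc/2.0/ai_custom/v1/wenxinworkshop/chat/completions_pro",
}

// resolveRequestPath maps the model named in the request body to a Baidu
// model via modelMapping ("*" acts as a catch-all), then looks up its path.
func resolveRequestPath(model string, modelMapping map[string]string) (string, error) {
	target, ok := modelMapping[model]
	if !ok {
		target, ok = modelMapping["*"]
	}
	if !ok {
		target = model // no mapping configured: use the requested name as-is
	}
	path, found := requestPaths[target]
	if !found {
		return "", errors.New("no request path known for model " + target)
	}
	return path, nil
}

func main() {
	mapping := map[string]string{"gpt-3": "ERNIE-4.0", "*": "ERNIE-4.0"}
	path, err := resolveRequestPath("gpt-4-turbo", mapping)
	fmt.Println(path, err)
}
```

Because the model name is only known once the body has been read, the resolved path can only be written to :path at that point, which is the constraint this thread is about.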

Collaborator

Return HeaderStopIteration in the header phase, and you can then modify headers in the body phase:

return types.HeaderStopIteration

Also, if you need to prevent the route from being recomputed after processing (recommended, in case the user's route is configured with an exact match), you can run:
SetProperty([]string{"clear_route_cache"}, []byte("off"))

Collaborator

> SetProperty([]string{"clear_route_cache"}, []byte("off"))

On second thought, this can be handled once in the ai-proxy plugin framework, so it does not need to be done in each model adapter.

Contributor Author

> Return HeaderStopIteration in the header phase, and you can then modify headers in the body phase:
>
> return types.HeaderStopIteration
>
> Also, if you need to prevent the route from being recomputed after processing (recommended, in case the user's route is configured with an exact match), you can run: SetProperty([]string{"clear_route_cache"}, []byte("off"))

@CH3CHO @johnlanni I have a question: when onHttpRequestHeader returns types.HeaderStopIteration, the request never reaches OnRequestBody. I am using gateway:1.4.0-rc.1.
docker-compose.yaml:

version: '3.7'
services:
  envoy:
    image: higress-registry.cn-hangzhou.cr.aliyuncs.com/higress/gateway:1.4.0-rc.1
    entrypoint: /usr/local/bin/envoy
    # Note: debug-level logging is enabled for wasm here; production deployments default to info level
    command: -c /etc/envoy/envoy.yaml --component-log-level wasm:debug
    depends_on:
      - httpbin
    networks:
      - wasmtest
    ports:
      - "10000:10000"
    volumes:
      - ./envoy.yaml:/etc/envoy/envoy.yaml
      - ./main.wasm:/etc/envoy/main.wasm

  httpbin:
    image: kennethreitz/httpbin:latest
    networks:
      - wasmtest
    ports:
      - "12345:80"

networks:
  wasmtest: {}

envoy.yaml:

# File generated by hgctl. Modify as required.

admin:
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0
      port_value: 9901
static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          protocol: TCP
          address: 0.0.0.0
          port_value: 10000
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                scheme_header_transformation:
                  scheme_to_overwrite: https
                stat_prefix: ingress_http
                # Output envoy logs to stdout
                access_log:
                  - name: envoy.access_loggers.stdout
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
                # Modify as required
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: local_service
                      domains: [ "*" ]
                      routes:
                        - match:
                            prefix: "/"
                          route:
                            cluster: baidu
                            timeout: 300s
                http_filters:
                  - name: wasmtest
                    typed_config:
                      "@type": type.googleapis.com/udpa.type.v1.TypedStruct
                      type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
                      value:
                        config:
                          name: wasmtest
                          vm_config:
                            runtime: envoy.wasm.runtime.v8
                            code:
                              local:
                                filename: /etc/envoy/main.wasm
                          configuration:
                            "@type": "type.googleapis.com/google.protobuf.StringValue"
                            value: |
                              {
                                "provider": {
                                  "type": "baidu",
                                  "apiTokens": [
                                    "your-api-token"
                                  ]
                                },
                                "modelMapping": {
                                  "gpt-3": "ERNIE-4.0",
                                  "*": "ERNIE-4.0"
                                }
                              }
                  - name: envoy.filters.http.router
  clusters:
    - name: httpbin
      connect_timeout: 30s
      type: LOGICAL_DNS
      # Comment out the following line to test on v6 networks
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: httpbin
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: httpbin
                      port_value: 80
    - name: baidu
      connect_timeout: 30s
      type: LOGICAL_DNS
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: baidu
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: aip.baidubce.com
                      port_value: 443
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
          "sni": "aip.baidubce.com"

Collaborator

The build method needs to change. Run this command from the ai-proxy directory:

DOCKER_BUILDKIT=1 docker build --build-arg PLUGIN_NAME=ai-proxy --build-arg EXTRA_TAGS=proxy_wasm_version_0_2_100 --build-arg BUILDER=higress-registry.cn-hangzhou.cr.aliyuncs.com/plugins/wasm-go-builder:go1.19-tinygo0.28.1-oras1.0.0 -t ai-proxy:0.0.1 --output ./out ../..

hanxiantao marked this pull request as draft May 19, 2024 13:50
Comment on lines +110 to +113
// For non-streaming responses, buffer the body so that OnResponseBody() handles the response
if !request.Stream {
	ctx.BufferResponseBody()
}
Contributor Author

During development I found that when both OnStreamingResponseBody and OnResponseBody are implemented, only OnStreamingResponseBody is called, for streaming and non-streaming responses alike, and OnResponseBody is skipped. For now I call ctx.BufferResponseBody() for non-streaming responses, so that streaming responses go to OnStreamingResponseBody and non-streaming responses go to OnResponseBody. This could also be handled once in the ai-proxy plugin framework. Did you run into a similar problem when adapting Tongyi Qianwen? @CH3CHO

Collaborator

Sure. This can be changed to check the content-type in the response headers: if it is text/event-stream, use streaming processing; otherwise, buffer the body and process it as a whole.
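
The content-type check described above could look roughly like the following. This is a minimal standalone sketch, not the plugin's actual code: the function name `isStreamingResponse` is hypothetical, and it uses only Go's standard `mime` package rather than the proxy-wasm SDK.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// isStreamingResponse reports whether a Content-Type header value denotes a
// server-sent event stream. The name is hypothetical (not from the plugin).
func isStreamingResponse(contentType string) bool {
	mediaType, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		// Fall back to a loose prefix match on malformed header values.
		loose := strings.ToLower(strings.TrimSpace(contentType))
		return strings.HasPrefix(loose, "text/event-stream")
	}
	// ParseMediaType returns the media type lowercased, without parameters.
	return mediaType == "text/event-stream"
}

func main() {
	for _, ct := range []string{"text/event-stream; charset=utf-8", "application/json"} {
		fmt.Printf("%q streaming=%v\n", ct, isStreamingResponse(ct))
	}
}
```

Deciding from the response header rather than from the request's stream flag also covers upstream errors that come back as plain JSON even when streaming was requested.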

Contributor Author

> Sure. This can be changed to check the content-type in the response headers: if it is text/event-stream, use streaming processing; otherwise, buffer the body and process it as a whole.

I see that your newly opened PR handles it this way. I will wait until PR #976 is merged and then adjust my code.

Collaborator

OK.

Collaborator

The PR mentioned above has been merged.

Contributor Author

> The PR mentioned above has been merged.

Got it. I will adjust the plugin logic later. Thanks.

hanxiantao and others added 2 commits May 23, 2024 20:17
# Conflicts:
#	plugins/wasm-go/extensions/ai-proxy/README.md
#	plugins/wasm-go/extensions/ai-proxy/provider/provider.go
# Conflicts:
#	plugins/wasm-go/extensions/ai-proxy/README.md
#	plugins/wasm-go/extensions/ai-proxy/provider/provider.go
hanxiantao closed this Jun 1, 2024
hanxiantao deleted the wasm-baidu-ai-proxy branch June 7, 2024 00:54
Development

Successfully merging this pull request may close these issues.

Integrate the AI proxy Wasm plugin with Baidu ERNIE Bot (Wenxin Yiyan)
3 participants