feat: support baidu ernie bot ai model #974

hanxiantao · 2024-05-19T13:18:10Z

Ⅰ. Describe what this PR did

Adds support for the Baidu ERNIE Bot AI model. API documentation: https://console.bce.baidu.com/tools/?_=1692863460488#/api?product=QIANFAN&project=%E5%8D%83%E5%B8%86%E5%A4%A7%E6%A8%A1%E5%9E%8B%E5%B9%B3%E5%8F%B0&parent=ERNIE%204.0&api=rpc%2F2.0%2Fai_custom%2Fv1%2Fwenxinworkshop%2Fchat%2Fcompletions_pro&method=post

Ⅱ. Does this pull request fix one issue?

fixes #941

Ⅲ. Why don't you add test cases (unit test/integration test)?

Ⅳ. Describe how to verify it

docker-compose.yaml

version: '3.7'
services:
  envoy:
    image: higress-registry.cn-hangzhou.cr.aliyuncs.com/higress/gateway:1.3.1
    entrypoint: /usr/local/bin/envoy
    # Note: wasm logging is set to debug level here; a production deployment defaults to info level
    command: -c /etc/envoy/envoy.yaml --component-log-level wasm:debug
    depends_on:
      - httpbin
    networks:
      - wasmtest
    ports:
      - "10000:10000"
    volumes:
      - ./envoy.yaml:/etc/envoy/envoy.yaml
      - ./main.wasm:/etc/envoy/main.wasm

  httpbin:
    image: kennethreitz/httpbin:latest
    networks:
      - wasmtest
    ports:
      - "12345:80"

networks:
  wasmtest: {}

Using the OpenAI protocol

envoy.yaml

admin:
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0
      port_value: 9901
static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          protocol: TCP
          address: 0.0.0.0
          port_value: 10000
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                scheme_header_transformation:
                  scheme_to_overwrite: https
                stat_prefix: ingress_http
                # Output envoy logs to stdout
                access_log:
                  - name: envoy.access_loggers.stdout
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
                # Modify as required
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: local_service
                      domains: [ "*" ]
                      routes:
                        - match:
                            prefix: "/"
                          route:
                            cluster: baidu
                            timeout: 300s
                http_filters:
                  - name: wasmtest
                    typed_config:
                      "@type": type.googleapis.com/udpa.type.v1.TypedStruct
                      type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
                      value:
                        config:
                          name: wasmtest
                          vm_config:
                            runtime: envoy.wasm.runtime.v8
                            code:
                              local:
                                filename: /etc/envoy/main.wasm
                          configuration:
                            "@type": "type.googleapis.com/google.protobuf.StringValue"
                            value: |
                              {
                                "provider": {
                                  "type": "baidu",
                                  "apiTokens": [
                                    "your-api-token"
                                  ],
                                  "baiduRequestPath": "/rpc/2.0/ai_custom/v1/wenxinworkshop/chat/completions_pro"
                                }
                              }
                  - name: envoy.filters.http.router
  clusters:
    - name: httpbin
      connect_timeout: 30s
      type: LOGICAL_DNS
      # Comment out the following line to test on v6 networks
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: httpbin
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: httpbin
                      port_value: 80
    - name: baidu
      connect_timeout: 30s
      type: LOGICAL_DNS
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: baidu
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: aip.baidubce.com
                      port_value: 443
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
          "sni": "aip.baidubce.com"

Non-streaming request
Request:

{
    "model": "gpt-4-turbo",
    "messages": [
        {
            "role": "user",
            "content": "你好，你是谁？"
        }
    ]
}

Response:

{
    "id": "as-zs1d25es9w",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "你好，我是文心一言，英文名是ERNIE Bot。我能够与人对话互动，回答问题，协助创作，高效便捷地帮助人们获取信息、知识和灵感。"
            },
            "finish_reason": "stop"
        }
    ],
    "created": 1716123502,
    "model": "gpt-4-turbo",
    "object": "chat.completion",
    "usage": {
        "prompt_tokens": 4,
        "completion_tokens": 33,
        "total_tokens": 37
    }
}

Streaming request
Request:

{
    "model": "gpt-4-turbo",
    "messages": [
        {
            "role": "user",
            "content": "你好，你是谁？"
        }
    ],
    "stream": true
}

Response:

data:{"id":"as-8tnvsnmncu","choices":[{"index":0,"message":{"role":"assistant","content":"你好，"}}],"created":1716124011,"model":"gpt-4-turbo","object":"chat.completion","usage":{"prompt_tokens":4,"total_tokens":4}}

data:{"id":"as-8tnvsnmncu","choices":[{"index":0,"message":{"role":"assistant","content":"我是文心一言，英文名是ERNIE Bot。"}}],"created":1716124012,"model":"gpt-4-turbo","object":"chat.completion","usage":{"prompt_tokens":4,"completion_tokens":12,"total_tokens":16}}

data:{"id":"as-8tnvsnmncu","choices":[{"index":0,"message":{"role":"assistant","content":"我能够与人对话互动，回答问题，协助创作，高效便捷地帮助人们获取信息、知识和灵感。"}}],"created":1716124014,"model":"gpt-4-turbo","object":"chat.completion","usage":{"prompt_tokens":4,"completion_tokens":12,"total_tokens":16}}

data:{"id":"as-8tnvsnmncu","choices":[{"index":0,"message":{"role":"assistant"},"finish_reason":"stop"}],"created":1716124014,"model":"gpt-4-turbo","object":"chat.completion","usage":{"prompt_tokens":4,"completion_tokens":33,"total_tokens":37}}
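
One way to check a streamed reply like the one above is to join the `content` fragments from each `data:` line and compare the result with the non-streaming answer. The sketch below is a client-side helper only, not part of the plugin; `joinStreamContent` and the `chunk` type are hypothetical names, and the `[DONE]` check is a defensive assumption (no such line appears in the output above).

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// chunk models only the fields of a streamed chunk that this helper reads.
type chunk struct {
	Choices []struct {
		Message struct {
			Content string `json:"content"`
		} `json:"message"`
	} `json:"choices"`
}

// joinStreamContent concatenates the message content of every "data:" line.
// It accepts both "data:{...}" and "data: {...}".
func joinStreamContent(stream string) (string, error) {
	var sb strings.Builder
	scanner := bufio.NewScanner(strings.NewReader(stream))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if !strings.HasPrefix(line, "data:") {
			continue // blank separator lines and anything that is not a data line
		}
		payload := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
		if payload == "" || payload == "[DONE]" {
			continue // assumption: skip an OpenAI-style terminator if one ever appears
		}
		var c chunk
		if err := json.Unmarshal([]byte(payload), &c); err != nil {
			return "", err
		}
		for _, ch := range c.Choices {
			sb.WriteString(ch.Message.Content)
		}
	}
	return sb.String(), scanner.Err()
}

func main() {
	// Shortened, made-up chunks in the same shape as the response above.
	stream := "data:{\"choices\":[{\"message\":{\"content\":\"Hello, \"}}]}\n\n" +
		"data:{\"choices\":[{\"message\":{\"content\":\"world\"}}]}\n\n"
	text, err := joinStreamContent(stream)
	if err != nil {
		panic(err)
	}
	fmt.Println(text)
}
```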

Using the ERNIE Bot (文心一言) native protocol

envoy.yaml

admin:
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0
      port_value: 9901
static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          protocol: TCP
          address: 0.0.0.0
          port_value: 10000
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                scheme_header_transformation:
                  scheme_to_overwrite: https
                stat_prefix: ingress_http
                # Output envoy logs to stdout
                access_log:
                  - name: envoy.access_loggers.stdout
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
                # Modify as required
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: local_service
                      domains: [ "*" ]
                      routes:
                        - match:
                            prefix: "/"
                          route:
                            cluster: baidu
                            timeout: 300s
                http_filters:
                  - name: wasmtest
                    typed_config:
                      "@type": type.googleapis.com/udpa.type.v1.TypedStruct
                      type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
                      value:
                        config:
                          name: wasmtest
                          vm_config:
                            runtime: envoy.wasm.runtime.v8
                            code:
                              local:
                                filename: /etc/envoy/main.wasm
                          configuration:
                            "@type": "type.googleapis.com/google.protobuf.StringValue"
                            value: |
                              {
                                "provider": {
                                  "type": "baidu",
                                  "apiTokens": [
                                    "your-api-token"
                                  ],
                                  "baiduRequestPath": "/rpc/2.0/ai_custom/v1/wenxinworkshop/chat/completions_pro",
                                  "protocol": "original"
                                }
                              }
                  - name: envoy.filters.http.router
  clusters:
    - name: httpbin
      connect_timeout: 30s
      type: LOGICAL_DNS
      # Comment out the following line to test on v6 networks
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: httpbin
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: httpbin
                      port_value: 80
    - name: baidu
      connect_timeout: 30s
      type: LOGICAL_DNS
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: baidu
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: aip.baidubce.com
                      port_value: 443
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
          "sni": "aip.baidubce.com"

Non-streaming request
Request:

{
    "messages": [
        {
            "role": "user",
            "content": "你好，你是谁？"
        }
    ]
}

Response:

{
    "id": "as-8jeij68pbi",
    "object": "chat.completion",
    "created": 1716124375,
    "result": "你好，我是文心一言，可以协助你完成范围广泛的任务并提供有关各种主题的信息，比如回答问题，提供定义和解释及建议。如果你有任何问题，请随时向我提问。",
    "is_truncated": false,
    "need_clear_history": false,
    "finish_reason": "normal",
    "usage": {
        "prompt_tokens": 4,
        "completion_tokens": 38,
        "total_tokens": 42
    }
}

Streaming request
Request:

{
    "messages": [
        {
            "role": "user",
            "content": "你好，你是谁？"
        }
    ],
    "stream": true
}

Response:

data: {"id":"as-g91d6f1yyv","object":"chat.completion","created":1716124427,"sentence_id":0,"is_end":false,"is_truncated":false,"result":"你好，","need_clear_history":false,"finish_reason":"normal","usage":{"prompt_tokens":4,"completion_tokens":0,"total_tokens":4}}

data: {"id":"as-g91d6f1yyv","object":"chat.completion","created":1716124429,"sentence_id":1,"is_end":false,"is_truncated":false,"result":"我是文心一言，可以协助你完成范围广泛的任务并提供有关各种主题的信息，比如回答问题，提供定义和解释及建议。","need_clear_history":false,"finish_reason":"normal","usage":{"prompt_tokens":4,"completion_tokens":0,"total_tokens":4}}

data: {"id":"as-g91d6f1yyv","object":"chat.completion","created":1716124430,"sentence_id":2,"is_end":false,"is_truncated":false,"result":"如果你有任何问题，请随时向我提问。","need_clear_history":false,"finish_reason":"normal","usage":{"prompt_tokens":4,"completion_tokens":0,"total_tokens":4}}

data: {"id":"as-g91d6f1yyv","object":"chat.completion","created":1716124430,"sentence_id":3,"is_end":true,"is_truncated":false,"result":"","need_clear_history":false,"finish_reason":"normal","usage":{"prompt_tokens":4,"completion_tokens":38,"total_tokens":42}}
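
Comparing the native non-streaming response with the OpenAI-protocol one shown earlier, the fields appear to line up as `result` to `choices[0].message.content`, `finish_reason: "normal"` to `"stop"`, and `model` echoing the requested model. The Go sketch below illustrates that mapping only; the type and function names are hypothetical, it is not the plugin's actual code, and the sample `result` text is a shortened placeholder.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type usage struct {
	PromptTokens     int `json:"prompt_tokens"`
	CompletionTokens int `json:"completion_tokens"`
	TotalTokens      int `json:"total_tokens"`
}

// baiduResponse holds the native fields this sketch needs.
type baiduResponse struct {
	ID           string `json:"id"`
	Created      int64  `json:"created"`
	Result       string `json:"result"`
	FinishReason string `json:"finish_reason"`
	Usage        usage  `json:"usage"`
}

type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type choice struct {
	Index        int     `json:"index"`
	Message      message `json:"message"`
	FinishReason string  `json:"finish_reason"`
}

type openAIResponse struct {
	ID      string   `json:"id"`
	Choices []choice `json:"choices"`
	Created int64    `json:"created"`
	Model   string   `json:"model"`
	Object  string   `json:"object"`
	Usage   usage    `json:"usage"`
}

// toOpenAI converts a native response body into the OpenAI-style shape.
// Mapping "normal" to "stop" is inferred from the sample responses only.
func toOpenAI(raw []byte, requestedModel string) (openAIResponse, error) {
	var in baiduResponse
	if err := json.Unmarshal(raw, &in); err != nil {
		return openAIResponse{}, err
	}
	finish := in.FinishReason
	if finish == "normal" {
		finish = "stop"
	}
	return openAIResponse{
		ID: in.ID,
		Choices: []choice{{
			Index:        0,
			Message:      message{Role: "assistant", Content: in.Result},
			FinishReason: finish,
		}},
		Created: in.Created,
		Model:   requestedModel,
		Object:  "chat.completion",
		Usage:   in.Usage,
	}, nil
}

func main() {
	raw := []byte(`{"id":"as-8jeij68pbi","created":1716124375,"result":"hello","finish_reason":"normal","usage":{"prompt_tokens":4,"completion_tokens":38,"total_tokens":42}}`)
	out, err := toOpenAI(raw, "gpt-4-turbo")
	if err != nil {
		panic(err)
	}
	b, _ := json.MarshalIndent(out, "", "  ")
	fmt.Println(string(b))
}
```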

Ⅴ. Special notes for reviews

johnlanni · 2024-05-19T13:34:57Z

plugins/wasm-go/extensions/ai-proxy/README.md

+
+| Name               | Data type | Requirement | Default | Description                                                  |
+| ------------------ | --------- | ----------- | ------- | ------------------------------------------------------------ |
+| `baiduRequestPath` | string    | Required    | -       | Different Baidu ERNIE Bot models use different request paths; this setting selects which model is called |


I suggest taking a look at: https://github.com/songquanpeng/one-api/blob/91b80ae87945ed1a77b3507dd277ee9cdddaa0b4/relay/adaptor/baidu/adaptor.go#L24

Try to keep the configuration style consistent across models and avoid special cases.

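That was my original plan.
The problem I hit during development is that model mapping requires generating a different requestPath in OnRequestBody based on the model in the request body, and then calling _ = util.OverwriteRequestPath(requestPath) to replace the :path request header.
During verification, however, I found that the request headers can no longer be modified inside OnRequestBody (error: error status returned by host: bad argument).
This is being discussed under https://github.com/alibaba/higress/issues/941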
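
The path selection described above can be sketched as two plain lookups: map the requested model to an ERNIE model, then map that model to a request path. This is a minimal Go sketch under stated assumptions, not the plugin's code: `mapModel`, `requestPathForModel`, and `modelToPath` are hypothetical names, the `*` wildcard fallback is an assumed semantic, and only the ERNIE-4.0 path is taken from the API link in the PR description.

```go
package main

import "fmt"

// modelToPath is an illustrative table. Only the ERNIE-4.0 entry comes from
// this PR; a real table would need one entry per supported model.
var modelToPath = map[string]string{
	"ERNIE-4.0": "/rpc/2.0/ai_custom/v1/wenxinworkshop/chat/completions_pro",
}

// mapModel resolves the requested model through a user mapping.
// Assumption: "*" acts as a catch-all when no exact entry exists.
func mapModel(mapping map[string]string, requested string) string {
	if target, ok := mapping[requested]; ok {
		return target
	}
	if target, ok := mapping["*"]; ok {
		return target
	}
	return requested
}

// requestPathForModel returns the upstream path for an ERNIE model name and
// reports whether one is configured.
func requestPathForModel(model string) (string, bool) {
	path, ok := modelToPath[model]
	return path, ok
}

func main() {
	mapping := map[string]string{"gpt-3": "ERNIE-4.0", "*": "ERNIE-4.0"}
	model := mapModel(mapping, "gpt-4-turbo")
	if path, ok := requestPathForModel(model); ok {
		fmt.Println(model, "->", path)
	} else {
		fmt.Println("no request path configured for", model)
	}
}
```

Because the model is only known once the body has been read, the resulting path has to be written to `:path` during the body phase, which is exactly the step that failed here.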

Returning HeaderStopIteration in the header phase lets you modify headers in the body phase:

higress/plugins/wasm-go/extensions/transformer/main.go

Line 345 in 2ff56c8

return types.HeaderStopIteration

Also, if you need to prevent the route from being recomputed after processing (recommended, in case the user has configured an exact-match route), you can call:

SetProperty([]string{"clear_route_cache"}, []byte("off"))

On second thought, this can be handled once in the ai proxy plugin framework instead of in each model adapter.

@CH3CHO @johnlanni I have a question: when onHttpRequestHeader returns types.HeaderStopIteration, the request never reaches OnRequestBody. I am using gateway:1.4.0-rc.1.
docker-compose.yaml:

version: '3.7'
services:
  envoy:
    image: higress-registry.cn-hangzhou.cr.aliyuncs.com/higress/gateway:1.4.0-rc.1
    entrypoint: /usr/local/bin/envoy
    # Note: wasm logging is set to debug level here; a production deployment defaults to info level
    command: -c /etc/envoy/envoy.yaml --component-log-level wasm:debug
    depends_on:
      - httpbin
    networks:
      - wasmtest
    ports:
      - "10000:10000"
    volumes:
      - ./envoy.yaml:/etc/envoy/envoy.yaml
      - ./main.wasm:/etc/envoy/main.wasm

  httpbin:
    image: kennethreitz/httpbin:latest
    networks:
      - wasmtest
    ports:
      - "12345:80"

networks:
  wasmtest: {}

envoy.yaml:

# File generated by hgctl. Modify as required.
admin:
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0
      port_value: 9901
static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          protocol: TCP
          address: 0.0.0.0
          port_value: 10000
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                scheme_header_transformation:
                  scheme_to_overwrite: https
                stat_prefix: ingress_http
                # Output envoy logs to stdout
                access_log:
                  - name: envoy.access_loggers.stdout
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
                # Modify as required
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: local_service
                      domains: [ "*" ]
                      routes:
                        - match:
                            prefix: "/"
                          route:
                            cluster: baidu
                            timeout: 300s
                http_filters:
                  - name: wasmtest
                    typed_config:
                      "@type": type.googleapis.com/udpa.type.v1.TypedStruct
                      type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
                      value:
                        config:
                          name: wasmtest
                          vm_config:
                            runtime: envoy.wasm.runtime.v8
                            code:
                              local:
                                filename: /etc/envoy/main.wasm
                          configuration:
                            "@type": "type.googleapis.com/google.protobuf.StringValue"
                            value: |
                              {
                                "provider": {
                                  "type": "baidu",
                                  "apiTokens": [
                                    "24.723b91d4cfe924bc69534df43e2bf71d.2592000.1718611447.282335-72001699"
                                  ]
                                },
                                "modelMapping": {
                                  "gpt-3": "ERNIE-4.0",
                                  "*": "ERNIE-4.0"
                                }
                              }
                  - name: envoy.filters.http.router
  clusters:
    - name: httpbin
      connect_timeout: 30s
      type: LOGICAL_DNS
      # Comment out the following line to test on v6 networks
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: httpbin
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: httpbin
                      port_value: 80
    - name: baidu
      connect_timeout: 30s
      type: LOGICAL_DNS
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: baidu
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: aip.baidubce.com
                      port_value: 443
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
          "sni": "aip.baidubce.com"

The build command needs to change. Run this command in the ai-proxy directory:

DOCKER_BUILDKIT=1 docker build --build-arg PLUGIN_NAME=ai-proxy --build-arg EXTRA_TAGS=proxy_wasm_version_0_2_100 --build-arg BUILDER=higress-registry.cn-hangzhou.cr.aliyuncs.com/plugins/wasm-go-builder:go1.19-tinygo0.28.1-oras1.0.0 -t ai-proxy:0.0.1 --output ./out ../..

hanxiantao · 2024-05-19T14:03:22Z

plugins/wasm-go/extensions/ai-proxy/provider/baidu.go

+	// Non-streaming responses are buffered so that OnResponseBody() handles them
+	if !request.Stream {
+		ctx.BufferResponseBody()
+	}


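During development I found that when both OnStreamingResponseBody and OnResponseBody are implemented, only OnStreamingResponseBody is called, for streaming and non-streaming responses alike, and OnResponseBody is skipped. For now I call ctx.BufferResponseBody() when the request is non-streaming, so that streaming responses go to OnStreamingResponseBody and non-streaming responses go to OnResponseBody. This could also be handled once in the ai proxy plugin framework. Did you run into a similar problem when adapting Tongyi Qianwen? @CH3CHO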

Sure. This can be changed to check the content-type in the response headers: if it is text/event-stream, use streaming handling; otherwise buffer the body and process it as a whole.
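
The content-type check suggested here can be sketched as a small pure function. This is a sketch only: `isEventStream` is a hypothetical helper name, and how the plugin framework reads the response header and switches to buffering is not shown.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// isEventStream reports whether a Content-Type header value denotes a
// server-sent event stream, ignoring parameters such as charset.
func isEventStream(contentType string) bool {
	mediaType, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		// Malformed or empty value: fall back to a case-insensitive prefix check.
		v := strings.ToLower(strings.TrimSpace(contentType))
		return strings.HasPrefix(v, "text/event-stream")
	}
	return mediaType == "text/event-stream"
}

func main() {
	for _, ct := range []string{
		"text/event-stream",
		"text/event-stream; charset=utf-8",
		"application/json",
	} {
		if isEventStream(ct) {
			fmt.Printf("%q: handle chunk by chunk\n", ct)
		} else {
			fmt.Printf("%q: buffer the whole body first\n", ct)
		}
	}
}
```

Deciding from the response header rather than from the request's `stream` flag also covers the case where the upstream answers a streaming request with a plain JSON error body.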

I see that your newly submitted PR handles it this way. I will wait until #976 is merged and then adjust my code.

The PR above has been merged.

Got it. I will adjust the plugin logic later. Thanks.

hanxiantao added 2 commits May 19, 2024 20:54

feat: support baidu ernie bot ai model

b80f49d

Merge branch 'wasm-baidu-ai-proxy' of https://github.com/hanxiantao/h…

469255c

…igress into wasm-baidu-ai-proxy # Conflicts: # plugins/wasm-go/extensions/ai-proxy/provider/provider.go

hanxiantao requested review from johnlanni and WeixinX as code owners May 19, 2024 13:18

johnlanni requested a review from CH3CHO May 19, 2024 13:32

johnlanni requested changes May 19, 2024

hanxiantao marked this pull request as draft May 19, 2024 13:50

hanxiantao and others added 2 commits May 23, 2024 20:17

Merge remote-tracking branch 'origin/main' into wasm-baidu-ai-proxy

9c0829c

# Conflicts: # plugins/wasm-go/extensions/ai-proxy/README.md # plugins/wasm-go/extensions/ai-proxy/provider/provider.go

Merge remote-tracking branch 'origin/main' into wasm-baidu-ai-proxy

dc8206a

# Conflicts: # plugins/wasm-go/extensions/ai-proxy/README.md # plugins/wasm-go/extensions/ai-proxy/provider/provider.go

hanxiantao closed this Jun 1, 2024

hanxiantao deleted the wasm-baidu-ai-proxy branch June 7, 2024 00:54
