sonic/decoder.StreamDecoder return unexpected SyntaxError #293

Closed
Betelgeuse1 opened this issue Sep 2, 2022 · 10 comments

@Betelgeuse1

Issue

When using a StreamDecoder, I run into an unexpected syntax error that neither go-json nor encoding/json reports on the same input.
I tried to Pretouch the struct (you never know), but it didn't change anything. No option seems to resolve this.

A not-so-great workaround

I managed to make it work by using a sonic.Decoder and a scanner, like so:

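var (
    dec    sonic.Decoder
    input  string
    record MyStruct
)
scanner := bufio.NewScanner(os.Stdin)

for scanner.Scan() {
    // Point the decoder at the current line, then decode it.
    input = scanner.Text()
    dec.Reset(input)
    if err := dec.Decode(&record); err != nil {
        log.Printf("decode failed: %v", err)
        continue
    }
    // ....
}
if err := scanner.Err(); err != nil {
    log.Fatal(err)
}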

But this makes the package much slower than go-json, when it should be a lot faster.

Solutions?

Is there a way to make my StreamDecoder skip the offending value and go on to the next one?

@Betelgeuse1
Author

Betelgeuse1 commented Sep 5, 2022

Here is a sample of the data I'm using:
{
  "id": "65513c4d-419e-4d95-9645-4a10d6ab8ab2",
  "status": "DECLINED",
  "error_message": "",
  "raw_bid_request": {
    "id": "65513c4d-419e-4d95-9645-4a10d6ab8ab2",
    "imp": [
      {
        "id": "display-area-1-0",
        "banner": {
          "w": 1280,
          "h": 720,
          "mimes": [
            "image/jpeg",
            "image/png"
          ],
          "ext": {
            "dooh": {
              "impmultiply": 5
            }
          }
        },
        "video": {
          "mimes": [
            "video/mp4",
            "video/mpeg"
          ],
          "w": 1280,
          "h": 720,
          "protocols": [
            3,
            7
          ],
          "ext": {
            "dooh": {
              "impmultiply": 5
            }
          }
        },
        "bidfloor": 5,
        "bidfloorcur": "USD",
        "pmp": {
          "private_auction": 1,
          "deals": [
            {
              "id": "XXXXXXXXXXXXXX",
              "bidfloor": 5.25,
              "at": 2
            },
            {
              "id": "RASTIVDOOHHHHHOOO",
              "bidfloor": 5,
              "at": 2
            }
          ]
        },
        "exp": 3331
      }
    ],
    "site": {
      "domain": "REDACTEDXXXXXX-YYZYYYZY.com",
      "publisher": {
        "id": "046c62471f3308530741043b60020719371c365c0736",
        "name": "REDACTEDXXXXXX-YYZYYYZY"
      }
    },
    "device": {
      "ua": "RASTIV Media 1.0",
      "geo": {
        "lat": 40,
        "lon": 74,
        "country": "USA",
        "type": 1
      },
      "ext": {
        "dooh": {
          "industryid": "Bus Station 10",
          "venuetypeids": [
            3,
            303
          ],
          "publicid": "SOMEIDS"
        }
      }
    },
    "at": 2,
    "tmax": 1000,
    "cur": [
      "USD"
    ],
    "wlang": [
      "en"
    ]
  },
  "bid_request_date": "2022-09-01T14:19:09.565278041Z",
  "bid_url": "/ssp/RASTIV/bid-request",
  "bid_request_id": "65513c4d-419e-4d95-9645-4a10d6ab8ab2",
  "timezone": "Asia/Shanghai",
  "lat": 40,
  "lon": 74,
  "display_time": "0001-01-01T00:00:00Z",
  "ssp_name": "RASTIV",
  "frame_external_id": "SOMEIDS",
  "device_id": "RASTIV/SOMEIDS",
  "publisher_id": "046c62471f3308530741043b60020719371c365c0736",
  "publisher_name": "REDACTEDXXXXXX-YYZYYYZY",
  "auction_type": 2,
  "targeting_id": "",
  "targeting_deal_id": "",
  "creative_id": "",
  "line_item_id": "",
  "line_item_name": "",
  "campaign_id": "",
  "campaign_name": "",
  "agency_name": "",
  "advertiser_name": "",
  "targeting_cpm": 0,
  "targeting_currency_code": "",
  "targeting_change_rate": 0,
  "targeting_budget": 0,
  "targeting_daily_budget": 0,
  "targeting_hourly_budget": 0,
  "targeting_start_date": "0001-01-01T00:00:00Z",
  "targeting_end_date": "0001-01-01T00:00:00Z",
  "targeting_dsp_fee_rate": 0,
  "targeting_use_geofencing": false,
  "targeting_triggers": null,
  "targeting_adsquare_segments": null,
  "targeting_adsquare_max_score": 0,
  "targeting_adsquare_min_score": 0,
  "targeting_options": null,
  "creative_status": "",
  "creative_width": 0,
  "creative_height": 0,
  "creative_duration": 0,
  "creative_mime": "",
  "creative_type": "",
  "raw_bid_response": "",
  "bid_response_date": "2022-09-01T14:19:09.566063136Z",
  "bid_response_code": 204,
  "no_bid_reason": "NO_COMPLIANT_TARGETING",
  "no_bid_targeting_reasons": null,
  "bid_response_price": 0,
  "bid_response_currency": "",
  "bid_response_imp_id": "",
  "bid_response_deal_id": "",
  "bid_floor": 0,
  "bid_floor_currency": "",
  "bought": false,
  "win_date": "0001-01-01T00:00:00Z",
  "win_url": "",
  "loss_date": "0001-01-01T00:00:00Z",
  "loss_reason": "",
  "loss_url": "",
  "pop_date": "0001-01-01T00:00:00Z",
  "pop_url": "",
  "auction_price": 0,
  "auction_currency": "",
  "imps": 5,
  "bid_cost": 0,
  "cost": 0,
  "bid_net_cost": 0,
  "net_cost": 0,
  "cost_dsp_fee": 0,
  "cost_data": 0,
  "cost_service": 0,
  "cost_data_details": null,
  "cost_service_details": null,
  "imp_multiply": 5,
  "exp": 0,
  "last_budget_spent": 0,
  "last_budget_in_flight": 0,
  "frame_country": "",
  "frame_city": "",
  "frame_address": "",
  "frame_name": "",
  "frame_venue_types": null,
  "bid_request_date_localized": "2022-09-01T22:19:09.565278+08:00",
  "bid_request_date_localized_day": 3,
  "bid_request_date_localized_hour": 22,
  "frame_id": "SOMEIDS",
  "bid_request_device_frame_venue_types_id": [],
  "bid_request_device_lat": 40,
  "bid_request_device_lon": 74,
  "bid_request_device_country": "USA",
  "bid_request_device_type": 1,
  "bid_request_device_h": null,
  "bid_request_device_ifa": null,
  "bid_request_device_ip": null,
  "bid_request_device_ua": "RASTIV Media 1.0",
  "bid_request_device_w": null,
  "bid_request_site_name": null,
  "bid_request_site_publisher_name": "REDACTEDXXXXXX-YYZYYYZY",
  "bid_request_site_publisher_id": "046c62472f3303530841043b60020719371c365c0736",
  "bid_request_site_publisher_domain": null,
  "bid_request_site_id": null,
  "bid_request_site_cat": "null",
  "bid_request_tmax": 1000,
  "bid_request_cur": [
    "USD"
  ],
  "bid_request_at": 2,
  "bid_request_imp_deal_imp_id": "display-area-1-0",
  "bid_request_imp_deal_deal_id": "XXXXXXXXXXXXXX",
  "bid_request_imp_deal_video_width": 1280,
  "bid_request_imp_deal_video_height": 720,
  "bid_request_imp_deal_video_mimes": [
    "video/mp4",
    "video/mpeg"
  ],
  "bid_request_imp_deal_video_min_duration": 0,
  "bid_request_imp_deal_video_max_duration": 0,
  "bid_request_imp_deal_banner_width": 1280,
  "bid_request_imp_deal_banner_height": 720,
  "bid_request_imp_deal_banner_mimes": [
    "image/jpeg",
    "image/png"
  ],
  "bid_request_imp_deal_private_auction": 1,
  "bid_request_imp_deal_auction_type": 2,
  "bid_request_imp_deal_bid_floor": 5.25,
  "bid_request_imp_deal_bid_floor_currency": "USD"
}

It's on a single line when I feed it to sonic's decoder; I only prettified it here for readability.

Usually, sonic fails while reading a floating-point value.
Sonic error:

Syntax error at index 2559: invalid char
        ne":"America/New_York","lat":33.
        ...............................^

What's weird is that it never fails on the first values read, even though those contain floating-point numbers too, since they have coordinates.

@AsterDY
Collaborator

AsterDY commented Sep 7, 2022

It seems the read buffer happens to split a number into two parts: one part has been read into the buffer while the other is still in the stream. Because the native.skip() function returns 'invalid char' in this case, StreamDecoder stops reading and returns an error. Let me think about how to fix it...

@liuq19
Collaborator

liuq19 commented Sep 7, 2022

Can you give us a minimal example that reproduces this problem? It would be very helpful, thanks.

@Betelgeuse1
Author

Yep, let me just quickly anonymize the data. Not sure if it will still trigger the issue, since it seems related to the data length.

@AsterDY
Collaborator

AsterDY commented Sep 7, 2022

Fixed by #295.

@AsterDY AsterDY closed this as completed Sep 7, 2022
@Betelgeuse1
Author

I got another error, which must be related, this time on a - sign.

24,"geo":{"country":"US","lon":-
................................^

Here is my go.mod; I'm using the latest commit.

// ....
require (
	github.com/goccy/go-json v0.9.11
	github.com/bytedance/sonic v1.4.1-0.20220907115413-6e979df0d373
)

require (
	github.com/chenzhuoyu/base64x v0.0.0-20211019084208-fb5309c8db06 // indirect
	github.com/klauspost/cpuid/v2 v2.0.9 // indirect
	github.com/twitchyliquid64/golang-asm v0.15.1 // indirect
	golang.org/x/arch v0.0.0-20210923205945-b76863e36670 // indirect
)

@AsterDY
Collaborator

AsterDY commented Sep 7, 2022

Could you give a more specific case, like issue293_test.go?

@Betelgeuse1
Author

Something like this?

func TestIssue293Sign(t *testing.T) {
    // Pad with spaces so that the '-' sign is the last of the first 4096 bytes.
    left := `{"a":`
    data := left + strings.Repeat(" ", 4096-len(left)-1) + "-33.0}"
    sd := decoder.NewStreamDecoder(strings.NewReader(data))
    var v struct {
        A json.RawMessage
    }
    if err := sd.Decode(&v); err != nil {
        if e, ok := err.(decoder.SyntaxError); ok {
            t.Fatal(e.Description())
        }
        t.Fatal(err)
    }
}
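
The padding arithmetic in the test above can be checked in isolation. The sketch below assumes the stream decoder's first read covers 4096 bytes, which is the value the test hard-codes; `assumedBufSize` and `boundaryInput` are hypothetical names used only here. It confirms that the `-` sign lands on the last byte of the first 4096-byte chunk, with the digits left in the remainder.

```go
package main

import (
	"fmt"
	"strings"
)

// assumedBufSize mirrors the 4096 used in the test; it is an assumption
// about the decoder's read size, not a value taken from sonic's source.
const assumedBufSize = 4096

// boundaryInput pads prefix with spaces so that the first byte of number
// is the last byte of the first bufSize-byte chunk.
func boundaryInput(prefix, number string, bufSize int) string {
	pad := bufSize - len(prefix) - 1
	return prefix + strings.Repeat(" ", pad) + number + "}"
}

func main() {
	data := boundaryInput(`{"a":`, "-33.0", assumedBufSize)
	first := data[:assumedBufSize]
	fmt.Printf("last byte of first chunk: %q\n", first[len(first)-1])
	fmt.Printf("remainder: %q\n", data[assumedBufSize:])
}
```

With a 5-byte prefix, 4090 spaces follow, so the sign sits at index 4095 and the rest of the number starts the second chunk.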

@AsterDY
Collaborator

AsterDY commented Sep 8, 2022

Sorry, I forgot the negative case. I've fixed it in #296.

@Betelgeuse1
Author

Haha, don't apologize. Thanks for the quick fix.
