Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serialization of 64bit values are truncated #3837

Closed
kskem opened this issue Nov 3, 2021 · 22 comments · Fixed by dapr/components-contrib#1597
Closed

Serialization of 64bit values are truncated #3837

kskem opened this issue Nov 3, 2021 · 22 comments · Fixed by dapr/components-contrib#1597
Assignees
Labels
kind/bug Something isn't working
Milestone

Comments

@kskem
Copy link

kskem commented Nov 3, 2021

In what area(s)?

/area runtime

What version of Dapr?

edge: output of git describe --dirty
starting Dapr Runtime -- version edge -- commit b1cad9a

Expected Behavior

Sending 64bit values with pub/sub are expected to arrive unchanged.

Actual Behavior

Receiving 64bit values are truncated in the trailing two decimal digits.
Best guess is that the serialized json has a max string-length for numbers.

Steps to Reproduce the Problem

I have posted a reproducible setup, based on Docker, with instructions:
https://github.com/kskem/dapr-serialization-issue

Recap of the problem:

owner_1            | Publishing record 1/5: {"int32":1778635775,"uint32":1778635775,"int64":637715205845980876,"uint64":637715205845980876,"base64long":"zJZQ2Jme2Qg=","base64int":"/9MDag=="}
replicaone_1       | Got event            : {"int32":1778635775,"uint32":1778635775,"int64":637715205845980900,"uint64":637715205845980900,"base64long":"zJZQ2Jme2Qg=","base64int":"/9MDag=="}

Release Note

RELEASE NOTE: FIX Bug in runtime.

@kskem kskem added the kind/bug Something isn't working label Nov 3, 2021
@dapr-bot
Copy link
Collaborator

dapr-bot commented Dec 3, 2021

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged (pinned, good first issue, help wanted or triaged/resolved) or other activity occurs. Thank you for your contributions.

@dapr-bot dapr-bot added stale Issues and PRs without response and removed stale Issues and PRs without response labels Dec 3, 2021
@real-zony
Copy link

I have this problem too.

@hueifeng
Copy link
Member

Any update?

@SayHelloGsx
Copy link

Yeah I got the same problem too. Could any one tell me how to fix it?

@yaron2 yaron2 added this to the v1.6 milestone Dec 20, 2021
@yaron2
Copy link
Member

yaron2 commented Dec 20, 2021

I triaged this to be fixed in 1.6.

cc @artursouza

@artursouza
Copy link
Member

Is this a runtime or SDK issue?

@yaron2
Copy link
Member

yaron2 commented Dec 20, 2021

Is this a runtime or SDK issue?

Runtime

@yaron2 yaron2 changed the title Serialization of 64bit values are truncated with .net sdk Serialization of 64bit values are truncated Dec 20, 2021
@hueifeng
Copy link
Member

Serialization is used in runtime.go, but when it encounters 64bit it automatically identifies Number as float64.

For example:

package main

import(
	jsoniter "github.com/json-iterator/go"
)

var name string

type Stu struct{
	Name string `json:"name"`
	Id int64
}

func main()  {
	Stu:=Stu{
		name:"张三",
		Id:637715205845980876
	}
	m1 :=make(map[string]interface{})
	var json= jsoniter.ConfigDefault
	var jsonStu,_=json.Marshal(stu)
	println(string(jsonStu))
	json.Unmarshal(jsonStu,&m1)

}

739f6f2a275078a3e0fe9d341219d33

json.Unmarshal will use float64 to handle the problem, and you'll need to change the method to display the declaration.

var val map[string]interface{}
d := json.NewDecoder(bytes.NewBuffer(bdata))
d.UseNumber()
err = d.Decode(&val)

@yaron2
Copy link
Member

yaron2 commented Dec 20, 2021

Related, possible: json-iterator/go#510

@hueifeng
Copy link
Member

@yaron2 You can consider using the following code to solve.

package main

import (
	"strings"

	jsoniter "github.com/json-iterator/go"
)

type Stu struct {
	Name string `json:"name"`
	Id   int64
}

func main() {
	stu := Stu{
		Name: "张三",
		Id:   637715205845980876,
	}
	m1 := make(map[string]interface{})
	var json = jsoniter.ConfigDefault
	var jsonStu, _ = json.Marshal(stu)
	println(string(jsonStu))
	json.Unmarshal(jsonStu, &m1)

	decoder := jsoniter.NewDecoder(strings.NewReader(string(jsonStu)))
	decoder.UseNumber()
	d := make(map[string]interface{})
	decoder.Decode(&d)
	s2, _ := jsoniter.Marshal(d)
	println(string(s2))
}

Use the func (*Decoder) UseNumber method to deserialize the numeral type of json, instead of directly converting to float64, convert to json.Number, json.Number internal:

// A Number represents a JSON number literal.
type Number string

// String returns the literal text of the number.
func (n Number) String() string { return string(n) }

// Float64 returns the number as a float64.
func (n Number) Float64() (float64, error) {
    return strconv.ParseFloat(string(n), 64)
}

// Int64 returns the number as an int64.
func (n Number) Int64() (int64, error) {
    return strconv.ParseInt(string(n), 10, 64)
}

@yaron2
Copy link
Member

yaron2 commented Dec 29, 2021

Thanks @hueifeng.

@artursouza after looking at this, I recommend we move away from jsoniter to the Go stdlib encoding/json library. It looks like since Go 1.14 performance is no longer a critical aspect of this library: json-iterator/go#455.

Moving to encoding/json will allow us to optimize for JSON consistency and stability whilst removing a dependency.

@artursouza
Copy link
Member

Thanks @hueifeng.

@artursouza after looking at this, I recommend we move away from jsoniter to the Go stdlib encoding/json library. It looks like since Go 1.14 performance is no longer a critical aspect of this library: json-iterator/go#455.

Moving to encoding/json will allow us to optimize for JSON consistency and stability whilst removing a dependency.

LGTM

@yaron2
Copy link
Member

yaron2 commented Dec 30, 2021

@hueifeng would you like to contribute this change?

@hueifeng
Copy link
Member

I can do it.

@artursouza
Copy link
Member

E2E tests are not passing in the PR, moving to 1.7

@artursouza artursouza modified the milestones: v1.6, v1.7 Jan 7, 2022
@yaron2 yaron2 modified the milestones: v1.7, v1.6 Jan 7, 2022
@artursouza
Copy link
Member

Moving to 1.7 because we are cutting RC for 1.6.

@artursouza artursouza removed this from the v1.6 milestone Jan 14, 2022
@artursouza artursouza added this to the v1.7 milestone Jan 14, 2022
@shubham1172
Copy link
Member

/assign

@shubham1172
Copy link
Member

shubham1172 commented Mar 17, 2022

Dapr/dapr does not have a problem. I tried putting telemetry around pub (pkg/grpc/api.go) and sub (pkg/runtime/runtime.go) code to check what data is being sent and received.

Example data sent from application:

{"int32":1041701776,"uint32":1041701776,"int64":637831257272814404,"uint64":637831257272814404,"base64long":"RAtjLCYI2gg=","base64int":"kBsXPg=="}

On Dapr side, while sending the data, the raw bytes from Marshaling the struct gets its int64 and uint64 data converted to float, and here it loses precision. Sample data after marshaling on publisher's end:

map[data:map[base64int:kBsXPg== base64long:RAtjLCYI2gg= int32:1.041701776e+09 int64:6.378312572728145e+17 uint32:1.041701776e+09 uint64:6.378312572728145e+17]

On Dapr side, while receiving the data, the raw bytes itself have incorrect data (as in shubham1172/dapr-serialization-issue's readme). Thus Unmarshaling with UseNumber here won't help. Sample data after unmarshalling:

map[data:map[base64int:kBsXPg== base64long:RAtjLCYI2gg= int32:1041701776 int64:637831257272814500 uint32:1041701776 uint64:637831257272814500]

The root cause is the json.Unmarshal in https://github.com/dapr/components-contrib/blob/master/pubsub/envelope.go
On adding the UseNumber way, it works as expected without data loss. I will create a PR and attach to this issue soon.

Should we create a separate item to move away from jsoniter to encoding/json?

@dapr-bot
Copy link
Collaborator

dapr-bot commented Mar 17, 2022 via email

@shubham1172
Copy link
Member

I did some more research, and this will not be a breaking change.

The only change that this PR introduces is in the unmarshalling of the CloudEvent data from the PublishEventRequest in PublishEvent (api.go)

Earlier, this unmarshalling resulted in data loss while converting 64-bit values to 53-bits float (as per JSON). Now, it uses json.Number representation, an alias of string, to unmarshall the same without any data loss.

After unmarshalling the CloudEvent's data, it's packed into the PublishRequest as []byte (both HTTP and GRPC). This is the request that gets published on the actual pubsub component.

The same thing is being read as []byte by our subscribers, in beginPubSub (runtime.go)

Since the public API contracts are not changed, this won't affect anything external.

@shubham1172
Copy link
Member

Adding some more data here for comparison. All these are requests with CloudEvents in JSON format (other CloudEvent formats/rawPayloads are unaffected).

Before the PR:

// data received by PublishRequest
{"int32":1778635775,"uint32":1778635775,"int64":637715205845980876,"uint64":637715205845980876,"base64long":"zJZQ2Jme2Qg=","base64int":"/9MDag=="}

// data marshalled by PublishRequest
{"int32":1778635775,"uint32":1778635775,"int64":637715205845980900,"uint64":637715205845980900,"base64long":"zJZQ2Jme2Qg=","base64int":"/9MDag=="}

// data receievd by Subscriber
{"int32":1778635775,"uint32":1778635775,"int64":637715205845980900,"uint64":637715205845980900,"base64long":"zJZQ2Jme2Qg=","base64int":"/9MDag=="}

After the PR:

// data received by PublishRequest
{"int32":1228499251,"uint32":1228499251,"int64":637834811545047664,"uint64":637834811545047664,"base64long":"cFKFt2EL2gg=","base64int":"M2k5SQ=="}

// data marshalled by PublishRequest
{"int32":1228499251,"uint32":1228499251,"int64":637834811545047664,"uint64":637834811545047664,"base64long":"cFKFt2EL2gg=","base64int":"M2k5SQ=="}

// data received by Subscriber
{"int32":1228499251,"uint32":1228499251,"int64":637834811545047664,"uint64":637834811545047664,"base64long":"cFKFt2EL2gg=","base64int":"M2k5SQ=="}

@shubham1172
Copy link
Member

@mukundansundar can you please un-tag the PR as breaking-change?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment