tftypes: Guarantee Type interface MarshalJSON error safety #371

bflad · 2024-01-24T16:08:01Z

Reference: #238
Closes #365

This change removes the necessity for protocol type conversion to handle JSON encoding errors, therefore greatly simplifying those packages. Type system conversions into the protocol should always be possible, albeit potentially not valid for usage in Terraform. The protocol conversion code is not the place to handle that consideration, rather if developers want to validate their schema/function/type definitions before its sent across the protocol with respect to the validation rules imposed by Terraform, separate validation functionality should be introduced. This is not conceptually new, does not introduce behavior changes, and uses fuzz testing to verify that a panic is not possible today in the one way that developers could potentially introduce a new panic.

The fuzz testing has been run for 60 seconds on a currently powerful workstation without generating any failure cases:

$ go test -fuzz=Fuzz -fuzztime=60s ./tftypes
fuzz: elapsed: 0s, gathering baseline coverage: 0/138 completed
fuzz: elapsed: 0s, gathering baseline coverage: 138/138 completed, now fuzzing with 10 workers
fuzz: elapsed: 3s, execs: 406242 (135409/sec), new interesting: 6 (total: 144)
fuzz: elapsed: 6s, execs: 833268 (142312/sec), new interesting: 7 (total: 145)
fuzz: elapsed: 9s, execs: 1269960 (145562/sec), new interesting: 7 (total: 145)
fuzz: elapsed: 12s, execs: 1684371 (138143/sec), new interesting: 8 (total: 146)
fuzz: elapsed: 15s, execs: 2137768 (151160/sec), new interesting: 9 (total: 147)
fuzz: elapsed: 18s, execs: 2574729 (145641/sec), new interesting: 11 (total: 149)
fuzz: elapsed: 21s, execs: 3011973 (145722/sec), new interesting: 12 (total: 150)
fuzz: elapsed: 24s, execs: 3442147 (143418/sec), new interesting: 12 (total: 150)
fuzz: elapsed: 27s, execs: 3868833 (142210/sec), new interesting: 12 (total: 150)
fuzz: elapsed: 30s, execs: 4313780 (148320/sec), new interesting: 14 (total: 152)
fuzz: elapsed: 33s, execs: 4763813 (150034/sec), new interesting: 14 (total: 152)
fuzz: elapsed: 36s, execs: 5201686 (145936/sec), new interesting: 15 (total: 153)
fuzz: elapsed: 39s, execs: 5613562 (137285/sec), new interesting: 15 (total: 153)
fuzz: elapsed: 42s, execs: 6051164 (145875/sec), new interesting: 15 (total: 153)
fuzz: elapsed: 45s, execs: 6485230 (144677/sec), new interesting: 15 (total: 153)
fuzz: elapsed: 48s, execs: 6912045 (142294/sec), new interesting: 15 (total: 153)
fuzz: elapsed: 51s, execs: 7349416 (145756/sec), new interesting: 16 (total: 154)
fuzz: elapsed: 54s, execs: 7792201 (147626/sec), new interesting: 16 (total: 154)
fuzz: elapsed: 57s, execs: 8225683 (144471/sec), new interesting: 18 (total: 156)
fuzz: elapsed: 1m0s, execs: 8637257 (137227/sec), new interesting: 18 (total: 156)
fuzz: elapsed: 1m0s, execs: 8637257 (0/sec), new interesting: 18 (total: 156)
PASS
ok      github.com/hashicorp/terraform-plugin-go/tftypes        60.883s

Reference: #238 Reference: #365 This change removes the necessity for protocol type conversion to handle JSON encoding errors, therefore greatly simplifying those packages. Type system conversions into the protocol should always be possible, albeit potentially not valid for usage in Terraform. The protocol conversion code is not the place to handle that consideration, rather if developers want to validate their schema/function/type definitions before its sent across the protocol with respect to the validation rules imposed by Terraform, separate validation functionality should be introduced. This is not conceptually new, does not introduce behavior changes, and uses fuzz testing to verify that a panic is not possible today in the one way that developers could potentially introduce a new panic. The fuzz testing has been run for 60 seconds on a currently powerful workstation without generating any failure cases: ```console $ go test -fuzz=Fuzz -fuzztime=60s ./tftypes fuzz: elapsed: 0s, gathering baseline coverage: 0/138 completed fuzz: elapsed: 0s, gathering baseline coverage: 138/138 completed, now fuzzing with 10 workers fuzz: elapsed: 3s, execs: 406242 (135409/sec), new interesting: 6 (total: 144) fuzz: elapsed: 6s, execs: 833268 (142312/sec), new interesting: 7 (total: 145) fuzz: elapsed: 9s, execs: 1269960 (145562/sec), new interesting: 7 (total: 145) fuzz: elapsed: 12s, execs: 1684371 (138143/sec), new interesting: 8 (total: 146) fuzz: elapsed: 15s, execs: 2137768 (151160/sec), new interesting: 9 (total: 147) fuzz: elapsed: 18s, execs: 2574729 (145641/sec), new interesting: 11 (total: 149) fuzz: elapsed: 21s, execs: 3011973 (145722/sec), new interesting: 12 (total: 150) fuzz: elapsed: 24s, execs: 3442147 (143418/sec), new interesting: 12 (total: 150) fuzz: elapsed: 27s, execs: 3868833 (142210/sec), new interesting: 12 (total: 150) fuzz: elapsed: 30s, execs: 4313780 (148320/sec), new interesting: 14 (total: 152) fuzz: elapsed: 33s, execs: 4763813 (150034/sec), new interesting: 14 (total: 152) fuzz: elapsed: 36s, execs: 5201686 (145936/sec), new interesting: 15 (total: 153) fuzz: elapsed: 39s, execs: 5613562 (137285/sec), new interesting: 15 (total: 153) fuzz: elapsed: 42s, execs: 6051164 (145875/sec), new interesting: 15 (total: 153) fuzz: elapsed: 45s, execs: 6485230 (144677/sec), new interesting: 15 (total: 153) fuzz: elapsed: 48s, execs: 6912045 (142294/sec), new interesting: 15 (total: 153) fuzz: elapsed: 51s, execs: 7349416 (145756/sec), new interesting: 16 (total: 154) fuzz: elapsed: 54s, execs: 7792201 (147626/sec), new interesting: 16 (total: 154) fuzz: elapsed: 57s, execs: 8225683 (144471/sec), new interesting: 18 (total: 156) fuzz: elapsed: 1m0s, execs: 8637257 (137227/sec), new interesting: 18 (total: 156) fuzz: elapsed: 1m0s, execs: 8637257 (0/sec), new interesting: 18 (total: 156) PASS ok github.com/hashicorp/terraform-plugin-go/tftypes 60.883s ```

austinvalle

LGTM, do we have a general guideline of when we should run fuzz testing? Assuming that dropping it in the CI seems like overkill, so I guess just ad-hoc?

austinvalle · 2024-01-24T18:43:31Z

tftypes/object_test.go

+	seeds := []string{
+		"ASCII",
+		"こんにちは",
+		// "ﬄ" is a single-character ligature
+		"baﬄe",
+		// These "e" have multiple combining diacritics
+		"wé́́é́́é́́!",
+		// Astral-plane characters
+		"😸😾",
+		// neither byte is valid UTF-8
+		"\xff\xff",
+		// invalid bytes interleaved with single byte characters
+		"t\xffe\xffst",
+		// invalid bytes interleaved with multiple byte sequences
+		"\xffこんにちは\xffこんにちは",
+	}


l̺̺̠o̻̞v̼͔͔e͍̫͓ i̢̼̟t͔͉̙ ❤️

bflad · 2024-01-24T20:28:05Z

do we have a general guideline of when we should run fuzz testing

Great question! I guess I'm okay with ad-hoc for now since we only have one instance of this type of testing, similar to how there is a single benchmark test in the framework code that is not wired up to anything. If more fuzz tests pop up, we likely should consider running them regularly and depending on how they are setup, figuring out how to notify the maintainers (since that setup may not necessarily be PR/status check based).

In this case, we do not know what, if ever any, change could be made to the Go standard library encoding/json package that could cause a regression in this code, but we do know that sort of change would only happen after a Go release. We could opt to setup a CI job that detects go.mod changes and only then adds a new status check, but other fuzzing scenarios might not exactly line up with that change detection. Another option would be running it as a scheduled job, but then the notification problem is difficult.

So yeah, I'm personally okay with a wait-and-see approach to see how prevalent this type of testing becomes and the use cases causing their creation before making any decisions. Happy to hear your thoughts on the matter. What I do prefer, regardless of how they are ran, is keeping the (fuzz/benchmark) test definitions around for future maintainers since they explicitly point to code that at some point was under extra scrutiny for some reason.

bflad · 2024-01-24T20:40:52Z

(I'm going to merge this as its already approved, but very happy to discuss the above more and/or actually implement something to run it.)

austinvalle · 2024-01-24T20:43:55Z

I'm cool with wait-and-see as well, finding more uses for fuzzing should definitely precede implementing too much CI 👍🏻

bflad added the tech-debt label Jan 24, 2024

bflad requested a review from a team as a code owner January 24, 2024 16:08

bflad added this to the v0.21.0 milestone Jan 24, 2024

bflad mentioned this pull request Jan 24, 2024

Introduce Go Fuzz Testing for Critical Code Paths #238

Closed

austinvalle approved these changes Jan 24, 2024

View reviewed changes

Merge branch 'main' into bflad/tftypes-marshaljson-error

149d40c

bflad merged commit c9a4f80 into main Jan 24, 2024
65 checks passed

bflad deleted the bflad/tftypes-marshaljson-error branch January 24, 2024 20:41

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

tftypes: Guarantee Type interface MarshalJSON error safety #371

tftypes: Guarantee Type interface MarshalJSON error safety #371

bflad commented Jan 24, 2024

austinvalle left a comment

austinvalle Jan 24, 2024

bflad commented Jan 24, 2024

bflad commented Jan 24, 2024

austinvalle commented Jan 24, 2024

tftypes: Guarantee Type interface MarshalJSON error safety #371

tftypes: Guarantee Type interface MarshalJSON error safety #371

Conversation

bflad commented Jan 24, 2024

austinvalle left a comment

Choose a reason for hiding this comment

austinvalle Jan 24, 2024

Choose a reason for hiding this comment

bflad commented Jan 24, 2024

bflad commented Jan 24, 2024

austinvalle commented Jan 24, 2024