
Worse performance when response schema specified #3498

Closed
dgg opened this issue Nov 29, 2021 · 19 comments · Fixed by fastify/fast-json-stringify#402
dgg commented Nov 29, 2021

Prerequisites

  • I have written a descriptive issue title
  • I have searched existing issues to ensure the bug has not already been reported

Fastify version

3.20.2

Plugin version

No response

Node.js version

14.17.1

Operating system

macOS

Operating system version (e.g. 20.04, 11.3, 10)

Mojave/Monterey

Description

We have a service written in Fastify that queries a time-series database and generates a sizeable report (around 7 MB of JSON).
The endpoint has a JSON schema set up as the successful (200) response schema.

We saw some performance problems with it (requests taking a long time to execute) and, while digging, we found signs pointing to the serialization of the response.

We then removed the response schema and saw a performance improvement (despite what the documentation suggests: that response schemas improve performance).

Steps to Reproduce

The response schema:

{
	"definitions": {},
	"$schema": "http://json-schema.org/draft-07/schema#",
	"$id": "response",
	"type": "object",
	"additionalProperties": false,
	"required": [
		"series"
	],
	"properties": {
		"series": {
			"$id": "#/properties/series",
			"type": "array",
			"items": {
				"$id": "#/properties/series/items",
				"type": "object",
				"additionalProperties": false,
				"required": [
					"name",
					"columns",
					"points"
				],
				"properties": {
					"name": {
						"$id": "#/properties/series/items/properties/name",
						"type": "string",
						"minLength": 1,
						"examples": [
							"223926"
						]
					},
					"columns": {
						"$id": "#/properties/series/items/properties/columns",
						"type": "array",
						"minItems": 2,
						"items": {
							"$id": "#/properties/series/items/properties/columns/items",
							"type": "string",
							"minLength": 1
						}
					},
					"points": {
						"$id": "#/properties/series/items/properties/points",
						"type": "array",
						"items": {
							"$id": "#/properties/series/items/properties/points/items",
							"type": "array",
							"minItems": 2,
							"items": {
								"$id": "#/properties/series/items/properties/points/items/items",
								"type": [
									"number",
									"boolean",
									"string",
									"null"
								],
								"minLength": 1
							}
						}
					}
				}
			}
		}
	}
}

The data file can be retrieved from this gist: https://gist.github.com/dgg/b1ef44e0ad88097134e716bfdd6155a5

The service with two endpoints (one that uses the schema and one that does not):

const { fastify } = require("fastify")

const data = require("./data.json")
const schema = require("./schema.json")

const options = {
	schema: {
		response: {
			200: schema
		}
	}
}

const instance = fastify({ logger: true })
	.post("/with", options, (request, reply) => {
		reply.status(200).send(data)
	})
	.post("/without", (request, reply) => {
		reply.status(200).send(data)
	})

// Run the server!
const start = async () => {
	try {
		await instance.listen(3000)
	} catch (err) {
		instance.log.error(err)
		process.exit(1)
	}
}
start()

Results when executing a single request against the endpoint that does not use the response schema:

{"level":30,"time":1638179955533,"pid":8471,"hostname":"macbook-pro.lan","reqId":"req-1","req":{"method":"POST","url":"/without","hostname":"localhost:3000","remoteAddress":"127.0.0.1","remotePort":50433},"msg":"incoming request"}
{"level":30,"time":1638179955609,"pid":8471,"hostname":"macbook-pro.lan","reqId":"req-1","res":{"statusCode":200},"responseTime":75.19169599562883,"msg":"request completed"}

Results when executing a single request against the endpoint that does use the response schema:

{"level":30,"time":1638179973829,"pid":8471,"hostname":"macbook-pro.lan","reqId":"req-2","req":{"method":"POST","url":"/with","hostname":"localhost:3000","remoteAddress":"127.0.0.1","remotePort":50434},"msg":"incoming request"}
{"level":30,"time":1638179974192,"pid":8471,"hostname":"macbook-pro.lan","reqId":"req-2","res":{"statusCode":200},"responseTime":362.3208159953356,"msg":"request completed"}

As one can see, the response time is significantly worse.

We have tried with newer versions of Fastify and the problem remains.
We have also tried getting the JSON data from an asynchronous source (reading a file asynchronously and then parsing the string) and it did not make a difference.
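The gap can also be reproduced without Fastify. The following sketch (not from the issue report; payload sizes and the naive serializer are illustrative) builds a payload shaped like the report above and compares JSON.stringify against a hand-rolled per-element serializer, loosely mimicking what a schema-compiled stringifier does for arrays of mixed primitives:

```javascript
// Standalone repro sketch: compare JSON.stringify with a naive
// per-element serializer on a payload shaped like the report above.
// The per-element approach loosely mimics schema-driven stringification.
const series = [];
for (let s = 0; s < 10; s++) {
  const points = [];
  for (let p = 0; p < 20000; p++) {
    points.push([1638179955533 + p, Math.random(), p % 2 === 0]);
  }
  series.push({ name: String(s), columns: ["time", "value", "flag"], points });
}
const payload = { series };

// Naive per-element serialization of the nested arrays.
function serializeNaive(obj) {
  const parts = obj.series.map((s) =>
    `{"name":${JSON.stringify(s.name)},"columns":${JSON.stringify(s.columns)},"points":[${
      s.points.map((pt) => `[${pt.map((v) => JSON.stringify(v)).join(",")}]`).join(",")
    }]}`
  );
  return `{"series":[${parts.join(",")}]}`;
}

let t = process.hrtime.bigint();
const a = JSON.stringify(payload);
const tStringify = Number(process.hrtime.bigint() - t) / 1e6;

t = process.hrtime.bigint();
const b = serializeNaive(payload);
const tNaive = Number(process.hrtime.bigint() - t) / 1e6;

console.log(`JSON.stringify: ${tStringify.toFixed(1)} ms, naive: ${tNaive.toFixed(1)} ms`);
console.log("outputs equal:", a === b);
```

Exact timings vary by machine, but the point is that per-element string building over hundreds of thousands of primitives can easily lose to the native serializer.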

Expected Behavior

Better (or at least not worse) response times when response schemas are used.

@zekth
Member

zekth commented Nov 29, 2021

Interesting. That might be an issue in https://github.com/fastify/fast-json-stringify

Our benchmarks do not include huge JSON payloads like yours. Worth creating more benchmarks.

@Eomm
Member

Eomm commented Nov 29, 2021

From docs:

fast-json-stringify is significantly faster than JSON.stringify() for small payloads. Its performance advantage shrinks as your payload grows.

We should be clearer about what "small" means in KB, but 7 MB is certainly not small.

It should be possible to write circuit-breaker logic where we skip the JSON-schema serialization and run JSON.stringify instead 🤔
I think it is useful to be able to set up a schema (for documentation at least) without using it for serialization
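That circuit-breaker idea could look roughly like the sketch below. This is not Fastify's API: `makeCircuitBreakerSerializer`, the `maxItems` threshold, and the array-length heuristic are all made up here for illustration. (Fastify does expose `reply.serializer(fn)` to swap the serializer per reply, which could serve as the escape hatch in practice.)

```javascript
// Sketch of a circuit breaker around a schema-compiled serializer.
// "compiled" stands in for a fast-json-stringify-style function;
// the heuristic and threshold are illustrative only.
function makeCircuitBreakerSerializer(compiled, { maxItems = 10000 } = {}) {
  return function serialize(payload) {
    // Cheap heuristic: if any top-level array is huge, skip the
    // compiled serializer and fall back to plain JSON.stringify.
    for (const value of Object.values(payload)) {
      if (Array.isArray(value) && value.length > maxItems) {
        return JSON.stringify(payload);
      }
    }
    return compiled(payload);
  };
}

// Usage with a stub "compiled" serializer that marks its path:
const compiled = (p) => "[compiled] " + JSON.stringify(p); // stub for demo
const serialize = makeCircuitBreakerSerializer(compiled, { maxItems: 3 });
console.log(serialize({ series: [1, 2] }));       // small payload: compiled path
console.log(serialize({ series: [1, 2, 3, 4] })); // large payload: JSON.stringify path
```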

@dgg
Author

dgg commented Nov 29, 2021

Well, that depends: if you are generating your docs (say, your OpenAPI spec) from Fastify metadata, you are not going to get the schema information if you don't use it.

I was not aware of that piece of documentation.
As I see it, the advantage does not merely shrink: for larger payloads, using fast-json-stringify turns into no advantage at all.

@mcollina
Member

Good spot. Could you upload a reproducible example so that we can see if there is some unoptimized path causing trouble?

@dgg
Author

dgg commented Nov 29, 2021

@mcollina do you mean a repo with the files I included in the bug description (the schema, the server, and the data that is returned)?
That would be easy to provide.

@mcollina
Member

Something that we can benchmark.

@dgg
Author

dgg commented Nov 29, 2021

Here it goes: https://github.com/dgg/fastify-large-serialization
Let me know if there is anything missing.

@mcollina
Member

mcollina commented Dec 4, 2021

I've done some investigation into this case and come to a few conclusions.

The first (maybe not obvious) one is that these routes are slow in both cases. On my machine the one without the schema can do about 14 req/sec vs 4 req/sec for the one with the schema. From an architecture perspective, these kinds of data dumps are better generated by a queue and downloaded asynchronously.

The second is that we have a bug in fast-json-stringify, and it's going to be very hard to solve. The gist of it is that we spend about 25% of the time in __pthread_cond_wait. In fact, the bug might even be inside Node.js or V8. This will require significantly more time to look into than I have available.

I would conclude that we should develop and document a way to skip fjs when the developer thinks they'll hit this case. Moreover, we should identify the "breaking point" at which fjs starts being counterproductive. @Eomm could you look into it?

@mcollina
Member

mcollina commented Dec 4, 2021

A few more interesting pieces.

Flamegraph without schema:
[screenshot: Screenshot 2021-12-04 at 16 24 28]

Flamegraph with schema:
[screenshot: Screenshot 2021-12-04 at 16 24 37]

@mcollina
Member

mcollina commented Dec 4, 2021

On second thought, it might also be related to: https://github.com/fastify/fast-json-stringify/blob/master/index.js#L1075-L1110 and how we handle huge arrays of many primitives.

@zekth
Member

zekth commented Dec 4, 2021

On second thought, it might also be related to: https://github.com/fastify/fast-json-stringify/blob/master/index.js#L1075-L1110 and how we handle huge arrays of many primitives.

This is worth creating an issue for, and adding huge arrays to the benchmarks too. I wonder if it's the [].map that causes this issue.

@mcollina
Member

mcollina commented Dec 4, 2021

The solution to this case is to have this block: https://github.com/fastify/fast-json-stringify/blob/a46425478630325478646d7e131cd131e016a8fd/index.js#L1055-L1068

replaced with a JSON.stringify(obj) when the types specified are all natives (like in this example). Note that this loses any removal of extra properties or checking that the data is the right type (which fjs partially does), but it might be the right call.
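A rough illustration of that idea follows. This is not the actual fast-json-stringify code generation; `allItemTypesNative` and `serializeArray` are hypothetical names used only to sketch the decision: when every `type` allowed for the array items is a JSON-native scalar, delegate the whole array to JSON.stringify instead of emitting per-element code, accepting the loss of the checks fjs partially performs.

```javascript
// Sketch: decide per schema whether an array can be serialized
// wholesale with JSON.stringify. Illustrative only.
const NATIVE_TYPES = new Set(["number", "integer", "boolean", "string", "null"]);

function allItemTypesNative(itemsSchema) {
  const types = Array.isArray(itemsSchema.type) ? itemsSchema.type : [itemsSchema.type];
  return types.every((t) => NATIVE_TYPES.has(t));
}

function serializeArray(arr, itemsSchema, serializeItem) {
  if (allItemTypesNative(itemsSchema)) {
    // Fast path: hand the whole array to the native serializer.
    return JSON.stringify(arr);
  }
  // Slow path: per-element generated serialization.
  return "[" + arr.map(serializeItem).join(",") + "]";
}

// The "points" items in the schema above allow number|boolean|string|null,
// so a row like this takes the fast path:
const itemsSchema = { type: ["number", "boolean", "string", "null"] };
const row = [1638179955533, 0.5, true, null, "x"];
console.log(serializeArray(row, itemsSchema, JSON.stringify));
```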

@wilkmaia
Contributor

wilkmaia commented Apr 6, 2022

I'm looking into this issue. I'll update with some notes and a PR ASAP.

@wilkmaia
Contributor

wilkmaia commented Apr 6, 2022

I've submitted fastify/fast-json-stringify#402 and added extensive notes on what I found in the PR's description. I'm out of ideas for what else we could do to improve it, especially without compromising on the spec validation.

I've proposed a mixed solution, which would give the user the ability to pick a specific mechanism to handle such edge cases. The idea of allowing the user to submit their own array-handling function has come to mind as well, but I'm not sure whether that would still be in the scope of this issue, or whether it would be a good feature for the lib overall.

Please check the PR and comment on the proposed solution and the investigation. It's possible I've overlooked something.

@simplenotezy

@wilkmaia thanks for your work! When dealing with large payloads in Fastify, is there something I need to do to configure the behaviour from your PR? Or does it handle things out of the box, automatically judging payload size and using the most appropriate method?

@wilkmaia
Contributor

wilkmaia commented Mar 9, 2023

Hi @simplenotezy. If you're experiencing performance issues you might want to tweak the settings outlined in https://github.com/fastify/fast-json-stringify#large-arrays. But bear in mind this only relates to large arrays, not large payloads in general (i.e. a large JSON document with no arrays wouldn't benefit from it).
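For reference, those knobs can be passed through Fastify's `serializerOpts` server option, which is forwarded to fast-json-stringify. A sketch, assuming a recent Fastify and the option names from the README section linked above (check your versions):

```javascript
// Options forwarded by Fastify to fast-json-stringify via serializerOpts.
// Option names assume a recent fast-json-stringify; see its README.
const serializerOpts = {
  largeArrayMechanism: "json-stringify", // serialize big arrays with JSON.stringify
  largeArraySize: 20000,                 // item-count threshold for "large"
};

const app = require("fastify")({ serializerOpts });
```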

@simplenotezy

@wilkmaia thanks for your prompt reply. I have a Nest.js application that uses GraphQL. We have a single (backoffice) endpoint that serves a rather large payload (~3-10 MB), and I'd like to optimise it.

I've been looking around for a fast-json-stringify option in the FastifyAdapter, but couldn't find any:

  const fastify = new FastifyAdapter({
    logger: false,
    bodyLimit: MAX_BODY_SIZE,
  });

  fastify.enableCors({
    origin: '*',
    methods: ['GET', 'POST', 'PUT', 'DELETE', 'OPTIONS'],
  });

  const app = await NestFactory.create<NestFastifyApplication>(AppModule, fastify, {

@wilkmaia
Contributor

wilkmaia commented Mar 9, 2023

@simplenotezy judging by FastifyAdapter's constructor, you can simply pass FastifyServer's options and it will forward them. The param you're looking for is serializerOpts.
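Concretely, the adapter snippet from the earlier comment might become something like the following sketch (whether `serializerOpts` is forwarded depends on your Fastify and @nestjs/platform-fastify versions):

```javascript
// Sketch: forwarding fast-json-stringify's large-array options through
// Nest's FastifyAdapter (its constructor options are passed to Fastify).
const { FastifyAdapter } = require("@nestjs/platform-fastify");

const adapter = new FastifyAdapter({
  logger: false,
  serializerOpts: {
    largeArrayMechanism: "json-stringify",
    largeArraySize: 20000,
  },
});
```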

@simplenotezy

aha, thanks for that pointer @wilkmaia!

@fastify fastify locked as resolved and limited conversation to collaborators Mar 9, 2023