Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance for const schemas #505

Merged
merged 0 commits into from Aug 25, 2022
Merged

Improve performance for const schemas #505

merged 0 commits into from Aug 25, 2022

Conversation

DanieleFedeli
Copy link
Contributor

Checklist

As per issue #502, when using a const schema, the performance degrades.

Benchmark pre fix:
Screenshot 2022-08-16 alle 10 50 37

Benchmark post fix:
Screenshot 2022-08-16 alle 11 02 14

As you can see, object const schemas did not improve. I am working on that.

@Uzlopak Uzlopak linked an issue Aug 16, 2022 that may be closed by this pull request
2 tasks
@mcollina
Copy link
Member

wow, good numbers! Is there anything missing here (the PR is in draft)?

@DanieleFedeli
Copy link
Contributor Author

DanieleFedeli commented Aug 16, 2022

I am trying to figure out if I can improve const object schemas as well

@Uzlopak
Copy link
Contributor

Uzlopak commented Aug 16, 2022

@DanieleFedeli

What about

  if (schema.const) {
    if (typeof schema.const !== 'object') {
        return typeof schema.const
    } else if (schema.const === null) {
        return 'null'
    } else if (Array.isArray(schema.const)) {
        return 'array'
    } else {
        return 'object'
    }
  }

@DanieleFedeli
Copy link
Contributor Author

@Uzlopak I tried but object schemas are not serialized correctly.
test/const.test.js fails
Screenshot 2022-08-16 alle 11 31 11

@DanieleFedeli
Copy link
Contributor Author

Improve const strings and const objects 🚀
Screenshot 2022-08-16 alle 13 46 53

@DanieleFedeli DanieleFedeli marked this pull request as ready for review August 16, 2022 11:50
Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

index.js Outdated
code += `json += ${funcName}(${input})`
break
if ('const' in schema) {
const stringifiedSchema = JSON.stringify(schema.const)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not so convinced by this, as it is not deterministic and would break probably on circular references and bigint.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So probably returns false if you set const: { a: 1, b: 2 } then it would return false on { b: 2, a: 1}

Maybe we need here rfdc instead of stringification for objects

Also what happens when we provide a Date.

E.g.

const: { }

and JSON.stringify on Date is also returning { }

So probably it would pass on Dates.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sry, my mistake JSON.stringify(Date) returns the ISODateTime. But still a potential pitfall.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree on rfdc to remove any __proto__

We know that schemas should be treated as code, but adding this const check gives to the user the possibility to inject bad code

Copy link
Contributor

@Uzlopak Uzlopak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect this to pass.

test('schema with const object', (t) => {
  t.plan(1)

  const schema = {
    type: 'object',
    properties: {
      foo: { const: { a: 1, b: 2 } }
    }
  }

  const stringify = build(schema)
  t.doesNotThrow(() => stringify({
    foo: { b: 2, a: 1 }
  }))
})

@Uzlopak
Copy link
Contributor

Uzlopak commented Aug 16, 2022

tbh. I thought the first commit was good enough as it was improving the performance of primitives.

And I actually dont like the notation of passing an Object simply with const.

Because this works perfect and is the JSONSchema-way

test('schema with const object', (t) => {
  t.plan(2)

  const schema = {
    type: 'object',
    properties: {
      foo: {
        type: 'object',
        additionalProperties: false,
        required: ['a', 'b'],
        properties: {
          a: { const: 1 },
          b: { const: 2 }
        }
      }
    }
  }

  const stringify = build(schema)
  t.doesNotThrow(() => stringify({
    foo: { b: 2, a: 1 }
  }))
  t.doesNotThrow(() => stringify({
    foo: { a: 1, b: 2 }
  }))
})

@DanieleFedeli
Copy link
Contributor Author

Ok then I will revert the last commit

@Uzlopak
Copy link
Contributor

Uzlopak commented Aug 16, 2022

Please wait for the feedback of @ivan-tymoshenko. Maybe he has some idea.

@ivan-tymoshenko
Copy link
Member

ivan-tymoshenko commented Aug 16, 2022

@Uzlopak @climba03003 Is there a reason we match data with a const before inserting the const? Can't we just insert the const? We have a vague explanation of when we validate data and when we don't, but it looks like we can avoid it in this case for performance purposes. WDYT?

Copy link
Member

@Eomm Eomm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to start a discussion here: why not coerce the output value to the const value instead?
A json schema with const is telling me (standard quotes :

An instance validates successfully against this keyword if its value
is equal to the value of the keyword.

So, a serialization perspective should be "no matter what the input is, the output will be the const value."

I think fast-json-stringify is not a validator, it is an information filter (with rare exceptions such as anyOf and allOf)

index.js Outdated
funcName = nullable ? 'serializer.asStringNullable.bind(serializer)' : 'serializer.asString.bind(serializer)'
code += `json += ${funcName}(${input})`
break
if ('const' in schema) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add an option to turn this feature on, or it will be a major version with a harder distribution across userland

index.js Outdated
code += `json += ${funcName}(${input})`
break
if ('const' in schema) {
const stringifiedSchema = JSON.stringify(schema.const)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree on rfdc to remove any __proto__

We know that schemas should be treated as code, but adding this const check gives to the user the possibility to inject bad code

@Uzlopak
Copy link
Contributor

Uzlopak commented Aug 16, 2022

i have no strong opinion on this. I just think that he first commit was a nice patch, which had only the issue, that it maybe could have resulted in some hickups like what if the const is a bigint or function. But it would have improved the performance significantly, if somebody would have provided some json schema without type information.

I assume that the following commits were under the impression that we want to improve the performance under any circumstances. Thats why I wrote, that I liked the first commit, as it was small and very effective. It does not even make fast-json-stringify to a validator as you mentioned @Eomm .

I think the first commit is a good base for a PR. But I would actually also not simply return typeof value.

more like

function getJSONSchemaType(value) {
  const type = typeof value
  if (type === 'string') {
    return type;
  } else if (type === 'boolean') {
    return type
  } else if (type === 'number') {
    // NaN and Infinity and -Infinity are not serializable
    if (isFinite(value)) {
      return type
    }
  }
}

@ivan-tymoshenko
Copy link
Member

ivan-tymoshenko commented Aug 16, 2022

i have no strong opinion on this. I just think that he first commit was a nice patch, which had only the issue, that it maybe could have resulted in some hickups like what if the const is a bigint or function. But it would have improved the performance significantly, if somebody would have provided some json schema without type information.

I assume that the following commits were under the impression that we want to improve the performance under any circumstances. Thats why I wrote, that I liked the first commit, as it was small and very effective. It does not even make fast-json-stringify to a validator as you mentioned @Eomm .

I think the first commit is a good base for a PR. But I would actually also not simply return typeof value.

more like

function getJSONSchemaType(value) {
  const type = typeof value
  if (type === 'string') {
    return type;
  } else if (type === 'boolean') {
    return type
  } else if (type === 'number') {
    // NaN and Infinity and -Infinity are not serializable
    if (isFinite(value)) {
      return type
    }
  }
}

@Uzlopak, I think we've got a slight misunderstanding. What @Eomm and I mean is that FJS already validates the data when inserting a constant, which is unnecessary.

  if (schema.const) {
    return `json += JSON.stringify(schema.const)`
  }

Something like that.

@@ -65,3 +145,53 @@ test('schema with const and invalid object', (t) => {
t.ok(err)
}
})

test('schema with const and invalid number', t => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me make an example @Uzlopak

talking about the behaviour, I think this test should be always successful:

const x = stringify({
      foo: 2
    })
// x should be '{"foo":1}'

No matter what the input object is, the output should be always equals to const

@climba03003
Copy link
Member

climba03003 commented Aug 17, 2022

So, a serialization perspective should be "no matter what the input is, the output will be the const value."

Just one question, how do we handle array of types that include null.

{
  "type": ["string", "null"],
  "const": "foo"
}

It is a special case that user may expect the const to be null-able.

@Uzlopak
Copy link
Contributor

Uzlopak commented Aug 17, 2022

That is kind of contradicting.

@climba03003
Copy link
Member

climba03003 commented Aug 17, 2022

That is kind of contradicting.

Yes, I thought of it. But, I can't really think of a way to optionally parse the const.

const schema = {
  "type": "object",
  "properties": {
    "foo": {
      "const": "bar"
    }
  }
}

stringify({ "foo": "bar" }) // { "foo": "bar" }
stringify({ "foo": "baz" }) // { "foo": "bar" }
stringify({ "foo": 1 }) // { "foo": "bar" }
stringify({}) // { "foo": "bar" } ??? what if I want { } only

I think we should

const schema = {
  "type": "object",
  "properties": {
    "foo": {
      "const": "bar"
    }
  }
}


stringify({ "foo": "bar" }) // { "foo": "bar" }
stringify({ "foo": "baz" }) // { "foo": "bar" }
stringify({ "foo": 1 }) // { "foo": "bar" }
stringify({}) // { }

const schema = {
  "type": "object",
  "properties": {
    "foo": {
      "const": "bar"
    }
  },
  "required": ['foo'] // here is important
}


stringify({ "foo": "bar" }) // { "foo": "bar" }
stringify({ "foo": "baz" }) // { "foo": "bar" }
stringify({ "foo": 1 }) // { "foo": "bar" }
stringify({}) // { "foo": "bar" }

@DanieleFedeli
Copy link
Contributor Author

So what is the plan?
Should I revert to 68ff81c and then open an issue for objects?

@Eomm
Copy link
Member

Eomm commented Aug 18, 2022

So what is the plan? Should I revert to 68ff81c and then open an issue for objects?

No, that commit is not a problem.
The discussion here is that the following line applies a validation processing that is unwanted:

throw new Error(\`Item $\{JSON.stringify(${input})} does not match schema definition.\`)`

I think that we should force the const value into the serialized object whatever the input object is.

The @climba03003 #505 (comment) is the perfect summary of how a serializer should use a const.
The type: null case could be added in a follow PR because it is an edge case

Moreover, this feature should be hidden by an option flag: coerceConst: Bool if you need to ship this feature in the current major version.

If someone does not agree with my opinion, please explain.

@DanieleFedeli
Copy link
Contributor Author

@Eomm In this case I agree with you.
I added validation because I saw the way how const schema was handled:

case undefined:
.
.
.
.
else if ('const' in schema) {
 code += ` if(ajv.validate(${JSON.stringify(schema)}, ${input}))
    json += '${JSON.stringify(schema.const)}'
  else
    throw new Error(\`Item $\{JSON.stringify(${input})} does not match schema definition.\`)`
}

But if it not needed I am happy to remove it

@ivan-tymoshenko
Copy link
Member

@Eomm In this case I agree with you. I added validation because I saw the way how const schema was handled:

case undefined:
.
.
.
.
else if ('const' in schema) {
 code += ` if(ajv.validate(${JSON.stringify(schema)}, ${input}))
    json += '${JSON.stringify(schema.const)}'
  else
    throw new Error(\`Item $\{JSON.stringify(${input})} does not match schema definition.\`)`
}

But if it not needed I am happy to remove it

Please don't forget about null case.

@DanieleFedeli DanieleFedeli merged commit cd586bf into fastify:master Aug 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Worse performance with response schema when 'const' keyword is used
6 participants