Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix matching URL with mix of array-like and string value #2587

Closed
wants to merge 2 commits into from

Conversation

bartoszhernas
Copy link

Fixes issue described in #2541
Also, fixes issue for Nock working with ClickHouse Client JS, as it uses non-standard URL but still URL-like library.

Comment on lines +666 to +672
expect(expand({ 'a': 'info', 'a[include]': 'extra' })).deep.equal({
"a": {
"__string_value": "info",
"include": "extra"
}
})
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may not be the desired behavior. what's this __string_value?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nock tries to normalize the URL into the JSON structure which (in my opinion) is not fully correct, but I didn't want to refactor everything. In my opinion all the query params should be compared by only using key: value, eg. the url bar[0]=1&bar[1]=2 should be compared as {"bar[0]": "1", "bar[1]": "2"} but Nock tries to make it into {"bar": [1,2]}. Same goes for object-like query params, eg. foo[baz]=1&foo[bar]=2 is normalised by Nock into {"foo": {"baz": "1", "bar": "2"}.

Not necessarily wrong or bad, but it won't work for all cases:
It's problematic for weird URLs, eg what do you do with bar[0=1&bar[2]=2&bar8]=3 as URL?

The fix is trying to fix one specific use case seen in production at Apple Music API. Apple makes the URLs like include=record-labels,artists&include[music-videos]=artists

As you can see the property include wants to be a string record-labels,artists AND at the same time an object of {"music-videos": "artists"}

The best fix I could come up with to normalise it to JSON without changing the whole library was to normalise these kind of query params into "__string_value" inside the object the query tries to be. It will happen only in this weird case.

We are using Nock to record requests for tests of MusicAPI.com integrations and that surprised us but well, that's what they use.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gr2m WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion all the query params should be compared by only using key: value, eg. the url bar[0]=1&bar[1]=2 should be compared as {"bar[0]": "1", "bar[1]": "2"} but Nock tries to make it into {"bar": [1,2]}. Same goes for object-like query params, eg. foo[baz]=1&foo[bar]=2 is normalised by Nock into {"foo": {"baz": "1", "bar": "2"}.

I agree. I haven't run into this, but this seems to be inspired by what Rails started doing at the time when nock was created, it's not standard webplatform behavior.

I wouldn't mind to introduce this as a breaking change just do a simple key/value comparison. But if we want to put in the work, we can add a toggle to enable the new behavior, and then add a toggle to revert to the old behavior, before we remove it entirely in the next breaking version. But I also don't know how much of an impact this change will have

Adding __string_value is simpler to add to have a workaround, but it's something that we should remove again in future, it feels like a hack. I'd try and do the proper fix.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried doing this change and I think it is impossible due to how many different ways there are to match different query parameters.

  1. You can pass whole string of query params (which my company uses by using recording)
  2. You can pass Object/Array to compare to
  3. You can pass RegEx to match queries

The problem with this is that you kind of have to operate on objects as both sides needs to be normalised for comparison.

Arrays can be parsed into list=1&list=2 or list[]=1&list[]=2 or list=[1,2] depending on the library etc, so normalising back to string won't alway work. Best option is to normalise to Object as you did and then compare using deepCompare. This allows you to also allow RegEx matching and in my opinion is a clear best way to do so.

Sadly, that means the string query can express something that Object cannot cleanly, and for now we have found only one case like this, a param that tried to be an object and a string.

TLDR:
All in all, I changed my mind, the system is sound and for regex to work, it uses correct way of normalizing the data (you normalise into Objects) but one side of this equasion can express more, hence the need for use of __string_value only in some tighltly specific cases.

So my proposition is to allow this hacky use, or maybe do you have better idea how to express params like obj=value&obj[key1]=val1 as an Object so you can compare these ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe alternatively we can let people pass in their own query parsing method?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything can be made configurable, custom serialiser passable etc, but I do not understand the issue here except the "it smells like hack even though it's not a hack".

Current implementation does not work correct (at all) with edge-case url like key=val&key[subkey]=val2. This URL will be parsed either as {key: "val"} or {key: {subkey: "val2"}} depending on the order of params. This is just plainly wrong and the Nock doesn't match correct with current implementation. Basically current code for this URL does not work at all. The new change fixes this exact issue without even changing anything for any other type of URLs.

Basically all the upside (broken edge-case working now) with zero downside (all past working code is exactly the same).

So my question is: if __string_value seems hacky to you, what would you propose to serialise the key=val&key[subkey]=val2 to as an object? Does { key_value: "val", key: { subkey: "val2" } } look better? The problem is that now it would overshadow key_value property. That's why I used __string_value so it tries to use something that no one would use (hence it looks hacky).

It doesn't really matter to what you parse it into as this is rarely exposed to Nock users, it's mainly used for comparison.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't want to feel that this is hacky solution, you could do Symbol or const QSStringValue = '__string_value' and then use this instead of string literal. 😄

But IMHO, that is not a hacky solution but rather proper serialization of the query string.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm reluctant because nock is 13 years old and that's the first time that I'm aware of that this problem came up.

Adding features is easy, but supporting it forever is not, no matter how small. I don't like the idea of working around the current serializer we have for this specific use case. The serializer is opinionated, so I would rather make it configurable and let folks use their own if they so choose, it would be a clean solution and have added value at the same time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that for such edge cases, we should provide an escape hatch instead of baking it into the codebase.

@mikicho
Copy link
Contributor

mikicho commented Mar 1, 2024

@bartoszhernas hey :) are you willing to add a configurable serializer as suggested here?

@bartoszhernas
Copy link
Author

Currently there is a bug in Nock that handles this edge case URL incorrectly. There is a fix for it already that doesn't change usage/output of any of the non-edge-case things.

You guys mentioned that this issue was never seen previously, for past 14 years, and based on this I see no reason to not merge this if that is the case.

Adding customisable serialiser feels like cannon agains a fly that is already dead. The current Nock doesn't work for one edge-case, there is a fix for this particular edge-case that doesn't affect any other parts of Nock, and I am sorry but I do not see how adding configurable serialiser is a better option.

I would then advise to just merge this, or close the PR and accept that the Nock won't work for perfectly legal, used by Apple Music URLs :). We currently use the PNPM patch to fix this issue and that is fine for us.

@bartoszhernas
Copy link
Author

I like the phrase "better is the enemy of good", but I also understand your point of not wanting to fix the 14yo library as it may introduce weird things happening. Cheers

@mikicho
Copy link
Contributor

mikicho commented Mar 6, 2024

Thanks for the understanding.
Closing this, feel free to reopen with a custom serializer solution.

@mikicho mikicho closed this Mar 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants