Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nested parameters parsing error in rack 3.0.8 #2128

Open
sergey-arkhipov opened this issue Oct 12, 2023 · 22 comments
Open

Nested parameters parsing error in rack 3.0.8 #2128

sergey-arkhipov opened this issue Oct 12, 2023 · 22 comments

Comments

@sergey-arkhipov
Copy link

Problem description

We are planning an upgrade to Rails 7.1.1.
The upgrade involves updating the rack version to the current one (3.0.8)
However, some of the ours forms that use POST request stopped working.
Investigating this issue led to the fact that the parsing of parameters passed by the form to rack is not done correctly.
When using rack version 2.2.8 everything works fine.
The error occurs when using field names of the field_name[index] type in the form, which is used to work with JSON arrays.
A test example to reproduce the error.

Reproduce error

>> Rack::RELEASE
=> "2.2.8"
>> Rack::VERSION
=> [1, 3]
>> qs = 'authenticity_token=cATU9VIIozE3HDwS_kVfbyJpRkDlHUhJH3pfZllA-CSPm6L7fkOF-fQFByLv75mFP1uCgW64JyCi5Z4YTloaIg&commit=%D0%A1%D0%BE%D0%B7%D0%B4%D0%B0%D1%82%D1%8C+%D0%B2%D0%B5%D1%80%D1%81%D0%B8%D1%8E+%D1%81%D1%82%D0%B0%D1%82%D1%8C%D0%B8&article_version%5Btitle%5D=&article_version%5Barticle_code_prefix%5D=ZDR_&article_version%5Bclassifier_ids%5D%5B%5D=&article_version%5Bquestionnaire_id%5D=&article_version%5Brecord_type_id%5D=&article_version%5Btag_plus%5D=0&article_version%5Btag_plus%5D=1&article_version%5Bcomment%5D=&article_version%5Bbody%5Banswer1%5D%5D=&article_version%5Bvirtual_linked_articles_codes%5D%5B%5D=&article_version%5Bdouble_linked_codes%5D%5B%5D=&article_version%5Bquestions%5D%5B%5D='
=> "authenticity_token=cATU9VIIozE3HDwS_kVfbyJpRkDlHUhJH3pfZllA-CSPm6L7fkOF-fQFByLv75mFP1uCgW64JyCi5Z4YTloaIg&commit=%D0%A1%D0%BE%D0%B7%...
>>Test_parser = Class.new(Rack::QueryParser::Params)
=> Test_parser
>> p = Rack::QueryParser.new(Test_parser,10000,10000)
=> #<Rack::QueryParser:0x000000010c2f8818 @key_space_limit=10000, @param_depth_limit=10000, @params_class=Test_parser>
>> p.parse_nested_query(qs)
=>
{"authenticity_token"=>"cATU9VIIozE3HDwS_kVfbyJpRkDlHUhJH3pfZllA-CSPm6L7fkOF-fQFByLv75mFP1uCgW64JyCi5Z4YTloaIg",
 "commit"=>"Создать версию статьи",
 "article_version"=>
  {"title"=>"",
   "article_code_prefix"=>"ZDR_",
   "classifier_ids"=>[""],
   "questionnaire_id"=>"",
   "record_type_id"=>"",
   "tag_plus"=>"1",
   "comment"=>"",
   "body"=>{"answer1"=>""},
   "virtual_linked_articles_codes"=>[""],
   "double_linked_codes"=>[""],
   "questions"=>[""]}}
>>

The same example for rack version 3.0.8

>> Rack::RELEASE
=> "3.0.8"
>> Rack::VERSION
=> [1, 3]
>>  qs = 'authenticity_token=cATU9VIIozE3HDwS_kVfbyJpRkDlHUhJH3pfZllA-CSPm6L7fkOF-fQFByLv75mFP1uCgW64JyCi5Z4YTloaIg&commit=%D0%A1%D0%BE%D0%B7
%D0%B4%D0%B0%D1%82%D1%8C+%D0%B2%D0%B5%D1%80%D1%81%D0%B8%D1%8E+%D1%81%D1%82%D0%B0%D1%82%D1%8C%D0%B8&article_version%5Btitle%5D=&article_versio
n%5Barticle_code_prefix%5D=ZDR_&article_version%5Bclassifier_ids%5D%5B%5D=&article_version%5Bquestionnaire_id%5D=&article_version%5Brecord_ty
pe_id%5D=&article_version%5Btag_plus%5D=0&article_version%5Btag_plus%5D=1&article_version%5Bcomment%5D=&article_version%5Bbody%5Banswer1%5D%5
D=&article_version%5Bvirtual_linked_articles_codes%5D%5B%5D=&article_version%5Bdouble_linked_codes%5D%5B%5D=&article_version%5Bquestions%5D%5
B%5D='
=> "authenticity_token=cATU9VIIozE3HDwS_kVfbyJpRkDlHUhJH3pfZllA-CSPm6L7fkOF-fQFByLv75mFP1uCgW64JyCi5Z4YTloaIg&commit=%D0%A1%D0%BE%D0%B7%...
>> Test_parser = Class.new(Rack::QueryParser::Params)
(irb):15: warning: already initialized constant Test_parser
(irb):4: warning: previous definition of Test_parser was here
=> Test_parser
>>  p = Rack::QueryParser.new(Test_parser,10000,10000)
(irb):16: warning: `second argument `key_space limit` is deprecated and no longer has an effect. Please call with only two arguments, which will be required in a future version of Rack
=> #<Rack::QueryParser:0x00007f52d72ad038 @param_depth_limit=10000, @params_class=Test_parser>
>>  p.parse_nested_query(qs)
=>
{"authenticity_token"=>"cATU9VIIozE3HDwS_kVfbyJpRkDlHUhJH3pfZllA-CSPm6L7fkOF-fQFByLv75mFP1uCgW64JyCi5Z4YTloaIg",
 "commit"=>"Создать версию статьи",
 "article_version"=>
  {"title"=>"",
   "article_code_prefix"=>"ZDR_",
   "classifier_ids"=>[""],
   "questionnaire_id"=>"",
   "record_type_id"=>"",
   "tag_plus"=>"1",
   "comment"=>"",
   "body[answer1"=>{"]"=>""},
   "virtual_linked_articles_codes"=>[""],
   "double_linked_codes"=>[""],
   "questions"=>[""]}}
>>

Note the "body[answer1"=>{"]"=>""}, - this is a field parsed with error.

It would be appreciated if this error could be fixed

@ioquatix
Copy link
Member

URI.decode_uri_component("article_version%5Bbody%5Banswer1%5D%5D=")
=> "article_version[body[answer1]]="

I believe the format we accept is:

"article_version[body][answer1]="

@jeremyevans WDYT?

@sergey-arkhipov
Copy link
Author

sergey-arkhipov commented Oct 12, 2023

@ioquatix

As I mentioned above, this is the standard format for generating forms in Rails, and it works fine with rack version 2.2.8.
The result should be exactly as shown above

"body"=>{"answer1"=>""},

In the html code it looks like this.

<textarea class="tinymce" placeholder="Enter article text" data-add-field-target="answer" data-tinymce-target="input" name="article_version[body[answer1]]" id="article_version_body[answer1]" aria-hidden="true" style="display: none;"></textarea>

name="article_version[body[answer1]]"

At least not like this, anyway. --> "body[answer1"=>{"]"=>""},

@matthewd
Copy link
Contributor

this is the standard format for generating forms in Rails

No, it is not.

I'm very surprised that format has ever parsed predictably. I guess we'll need to dig into history to figure out whether it was totally accidental, or deliberately accommodated as a variant format. The lack of tests suggests the former, though it honestly seems hard to accidentally write code that would handle it.

@sergey-arkhipov
Copy link
Author

sergey-arkhipov commented Oct 12, 2023

Hi, @matthewd

I won't insist, but it is quite logical to use this format for some fields. It is especially useful for various JSON fields, when the value for each required key is output separately.
In any case, parsing should not produce the given result without any errors, because it produce long backtrace stack and give incorrect errors on users side.
I would like to note that sending to the "back" side works fine in both versions.
I can use any other format for this fields you suggest, but the field name must contain the name and the key by which the value of this field is determined.

Some additional.
As you can see, another method in rack v3.0.8 works fine with the same form-data, but rails itself use parse_nested_query
"article_version[body[answer1]]"=>""

>> p.parse_query(qs)
=>
{"authenticity_token"=>"cATU9VIIozE3HDwS_kVfbyJpRkDlHUhJH3pfZllA-CSPm6L7fkOF-fQFByLv75mFP1uCgW64JyCi5Z4YTloaIg",
 "commit"=>"Создать версию статьи",
 "article_version[title]"=>"",
 "article_version[article_code_prefix]"=>"ZDR_",
 "article_version[classifier_ids][]"=>"",
 "article_version[questionnaire_id]"=>"",
 "article_version[record_type_id]"=>"",
 "article_version[tag_plus]"=>["0", "1"],
 "article_version[comment]"=>"",
 "article_version[body[answer1]]"=>"",
 "article_version[virtual_linked_articles_codes][]"=>"",
 "article_version[double_linked_codes][]"=>"",
 "article_version[questions][]"=>""}
>> p.parse_nested_query(qs)
=>
{"authenticity_token"=>"cATU9VIIozE3HDwS_kVfbyJpRkDlHUhJH3pfZllA-CSPm6L7fkOF-fQFByLv75mFP1uCgW64JyCi5Z4YTloaIg",
 "commit"=>"Создать версию статьи",
 "article_version"=>
  {"title"=>"",
   "article_code_prefix"=>"ZDR_",
   "classifier_ids"=>[""],
   "questionnaire_id"=>"",
   "record_type_id"=>"",
   "tag_plus"=>"1",
   "comment"=>"",
   "body[answer1"=>{"]"=>""},
   "virtual_linked_articles_codes"=>[""],
   "double_linked_codes"=>[""],
   "questions"=>[""]}}

@matthewd
Copy link
Contributor

https://guides.rubyonrails.org/form_helpers.html#understanding-parameter-naming-conventions

name="person[address][city]"

@sergey-arkhipov
Copy link
Author

Inside view

        <div class='answer_body' data-controller='tinymce'>
          <%= form.text_area "body[#{k}]", class: 'tinymce', placeholder: t(:body_prompt), value: @article_version.body[k], data: { add_field_target: 'answer', tinymce_target: 'input' } %>

@jeremyevans
Copy link
Contributor

This is a bug in the application that needs to be fixed, not a bug in Rack. That it worked in older versions of Rack is accidental.

@sergey-arkhipov
Copy link
Author

sergey-arkhipov commented Oct 12, 2023

This is a bug in the application that needs to be fixed, not a bug in Rack. That it worked in older versions of Rack is accidental.

@jeremyevans,

I draw my conclusions based on the fact that TWO methods in rack (see above) produce DIFFERENT results for the same input data.
If this is the correct behaviour from your point of view - perhaps it should be reflected in the gem description.
From my point of view, two methods that differ only in output data format should produce identical results. Another result - is a BUG.
Besides, RFC description of form-data does not contain restrictions on variable naming and the task of parsing is just to return variable names as they are in the original query string.
A parser that CHANGES the variable name and passes a PART of this name does not work correctly.

@matthewd
Copy link
Contributor

parse_query gives no meaning to the content of the field names, so it returns a flat hash containing the names as supplied (including [ and ] characters). That's the RFC-matching behaviour.

parse_nested_query, by design and documentation, applies additional parsing to the field names, giving special meaning to [ and ] characters, to construct a nested data structure. For that behaviour to work, the [ and ] characters have to be used in a way that is syntactically valid not just for the RFC, but also for the nested-query behaviour.

The whole point of having the two methods is the fact that they do different things. You can already see that happening with "article_version[title]"=>"" vs "article_version"=>{"title"=>""}, regardless of the broken example.

@sergey-arkhipov
Copy link
Author

@matthewd ,

Thank you for the detailed explanations of the difference in the objectives of the methods.
In version 2.2.8, the field named ""article_version[body[answer1]]"=>"" returned by parse_query was successfully and perfectly correctly transformed into "article_version"=> {"body"=>{"answer1"=>""}} by the parse_nested_query method.
What prevents you from doing it in the current version ?
This looks very logical and allows a shorter version of view generation for array and hash fields.
Especially now that these types of fields are being used more and more in various forms.
In any case, parsing a field name should not change that name, from my point of view.
You can produce a parsing error - that's also an option, but you can't produce a WRONG field name as a result of the method.

@jeremyevans
Copy link
Contributor

We could change article_version[body[answer1]]= to parse as {"article_version"=>{"body[answer1]"=>""}} instead, by keeping count of the number of [ and ]. That would make slightly more sense than the current parsing. But we aren't going to support parsing it as {"article_version"=>{"body"=>{answer1"=>""}}}.

However, I'm not sure trying to support {"article_version"=>{"body[answer1]"=>""}} is worth the complexity it would entail. @ioquatix @matthewd your thoughts?

@sergey-arkhipov
Copy link
Author

@jeremyevans

Another way leads to increased complexity of forms, when there is a layout of nested elements of standard arrays and hash into simple names, which leads to an additional "layer" of processing and errors.
It seems to me a better way to still process the transfer (it already works in all browsers) and parsing of such fields.
I would be grateful for your support of my approach. :-)

@jeremyevans
Copy link
Contributor

To reiterate for clarity, we aren't going to support parsing article_version[body[answer1]]= as {"article_version"=>{"body"=>{answer1"=>""}}}. Trying to support that brings more complexity and results in two separate ways to support the same thing, which we do not want.

If you do not want to change the parameter keys you are using, please switch from using parse_nested_query to parse_query, and then handle the nesting of keys yourself.

@sergey-arkhipov
Copy link
Author

sergey-arkhipov commented Oct 12, 2023

@jeremyevans

As I said, we are using rails 7.0.8, which in turn uses rack 2.2.8, and everything is working fine.
With the release of the new Rails 7.1.1, a scheduled upgrade has resulted in a change of rack version to 3.0.8, which in turn has resulted in a call to you.

The option you suggested is possible, but it is not the right way to customise Rails.

If you solve the problem with at least the correct field name - I will consider how to better solve the overall problem. The last thing I want to do is to make any custom solutions for docking two gems.

The option of changing the design of forms is also possible, as I said, the problem is that we have a lot of these forms, we actively use arrays and hashes.

Leaving rack in version 2.2.8 is also a possible option, but also not very correct from my point of view for further update of the stack, because the next updates may require the current version of rack.

@jeremyevans
Copy link
Contributor

If you use the correct parameter name, article_version[body][answer1]=, then it will work in all versions of Rack. Please understand that the fact that article_version[body[answer1]]= worked in older versions of Rack was not by design. You were relying on undefined behavior.

I'm sorry if you have a lot of forms that are doing things incorrectly. It would have better if older Rack did not unexpectedly handle the format you are using by accident. However, there is nothing we can do about that now. Your best approach forward is to fix the parameter names in all of your forms.

@sergey-arkhipov
Copy link
Author

sergey-arkhipov commented Oct 12, 2023

@jeremyevans

Yes, I understand your position very well.
It's not about how to pass the name in the form, I can make any custom name. The problem is a bit different - if the form name doesn't match the value of the variable, the field won't populate when an existing record is opened.

The field is a hash. If you use a direct normal reference - it will look like the usual body['answer1'] in ruby. Full reference looks like article_version.body['answer1'] or article_version[body['answer1'] ] for active record.
So it is only a matter of "parsing" hash in the form itself and custom solutions ala OpenStruct to support filling key fields with the values of these keys.

In case of at least some parsing of the field name from your side, I will try to implement the logic at least at the controller or model level.

I hope, I have clearly explained the essence of the problems arising from the lack of such parsing.

I was very happy when in version 2.2.8 the construction with the field name as a direct reference to the hash value worked. It really removes a lot of problems when working with arrays and hash in any forms.

And your words that it was nothing more than an accident made me very sad.

@jeremyevans
Copy link
Contributor

And your words that it was nothing more than an accident made me very sad.

I'm sorry to hear that. I realize this is disappointing to you, and could result in a lot of additional work.

@sergey-arkhipov
Copy link
Author

And your words that it was nothing more than an accident made me very sad.

I'm sorry to hear that. I realize this is disappointing to you, and could result in a lot of additional work.

Don't worry, nobody cancelled monkeypath :-)

I actually really like the existing solution in 2.2.8.

Anyway, if you somehow fix the output of the parsing results as the correct field name, there will be an opportunity to think about the problem again from all sides.

Thanks for participating in solving our problems, although I believe we are not the only ones with such problems :-)

@matthewd
Copy link
Contributor

However, I'm not sure trying to support {"article_version"=>{"body[answer1]"=>""}} is worth the complexity it would entail.

Yeah, that would be a fully new feature, and I don't think that's worth the huge slowdown it would impose on all parameter parsing.

If there were evidence that the previous behaviour was undocumented-but-originally-intended, then I might be inclined to re-support it for a while, with a gentler deprecation path. That would at least [presumably] still be possible with simple (regex) splits.

@pgumeson-fabric
Copy link

Seems omniauth may have been relying on this behavior as well. Not totally sure if it's the same, but see referenced issue cookpad/omniauth-rails_csrf_protection#15.

@vrodic
Copy link

vrodic commented Jan 29, 2024

For reference, this change is caused by this PR: #1797

tvdeyen added a commit to tvdeyen/solidus_prototypes that referenced this issue Apr 17, 2024
In Rack 3 the wrongly formatted `option_values_hash` stopped working.
It was an accident that this ever worked in Rack 1 and 2

See rack/rack#2128
@tvdeyen
Copy link

tvdeyen commented Apr 17, 2024

We had a wrongly formatted input field name and fixed it to the correct one

See: https://github.com/solidusio-contrib/solidus_prototypes/pull/65/files

TL;DR

Use some[nested][param] instead of some[nested[param]]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants