Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Style/HashTransformKeys: still too greedy, probably needs config param #8773

Open
zverok opened this issue Sep 23, 2020 · 1 comment
Open

Comments

@zverok
Copy link
Contributor

zverok commented Sep 23, 2020

Was discussed at #8630 and "fixed" at #8648 but it is still not good enough as of 0.91.1 (and the cop itself is quite useful, so disabling it is undesirable). Here is my code:

    Zip::File.to_enum(:foreach, zip_tmp_file.path) 
             .select { |entry| ...skip... }
             .map { |entry| [entry, tmp_files_path.join(File.basename(entry.name))] }
             .each { |entry, path| entry.extract(path) }
             .to_h { |entry, path| [entry.name.force_encoding('UTF-8'), path] }

...and Rubocop says:

C: Style/HashTransformKeys: Prefer transform_keys over to_h {...}.
    Zip::File.to_enum(:foreach, zip_tmp_file.path)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

...which seems "obviously" bad suggestion, considering that the last methods before to_h are each (which generally shouldn't change anything), and then map (which generally produces arrays); but even before that, going through all the chain, we are getting to to_enum -- which couldn't possibly have resulted in a Hash (if it is standard Ruby method -- which it is).

I am afraid that this ad-hoc method listing: https://github.com/rubocop-hq/rubocop/pull/8648/files#diff-dc48171cf9207463288e50a193a75d1fR13 -- is too naive. Probably to suit larger codebases, the cop should, at least:

  • for chain of processing methods like I am showing here, look through the whole chain, not only the last method before .to_h
  • have a configurable "denylist" of previous methods in chain (which most probably turned it into not-a-Hash)
  • and/or, maybe easier, an "allowlist"? (of standard methods, I can think only of a small amount producing hashes, like group_by and tally; and only some of Hash methods that don't change its Hash-ness, like select/reject)

PS: In general, I believe that, as people tend to include Rubocop in CI, and try to fix all of its suggestions one way or another, cops like this should be more conservative... actually, the probability of the given to_h not changing key or value is actually transform_keys or transform_value is not that high. When Rubocop guesses right, it is enlightening, but it is still just a guess, and in situations like "10 guesses, only 2 got right" it becomes maddening.

@marcandre
Copy link
Contributor

I agree that a more conservative approach would be better here, i.e. prefer having false negatives to too many false positives.

A whitelist of methods producing hashes, plus hash literals would probably be best.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants