Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved syntax checking around line continuations #6420

Closed
bquorning opened this issue Oct 29, 2018 · 21 comments
Closed

Improved syntax checking around line continuations #6420

bquorning opened this issue Oct 29, 2018 · 21 comments
Assignees
Labels
feature request stale Issues that haven't been active in a while

Comments

@bquorning
Copy link
Contributor

bquorning commented Oct 29, 2018

'this text is too ' \
'long'

Style/LineEndConcatenation recommends that we should use \ instead of + or << to concatenate strings across multiple lines. But I am still seeing wildly inconsistent formatting of these multiline strings. I would like to suggest RuboCop checking the following four aspects of formatting of multiline strings:

  1. One space between quote and \.
  2. Consistent indentation.
  3. Use same quotes on all lines.
  4. Keep whitespace at the end of lines.

1. One space between quote and \

# Bad
'this text is too '\
'long'

# Bad
'this text is too '  \
'long'

# Good
'this text is too ' \
'long'

2. Consistent indentation

Update: This is now handled by Layout/LineEndStringConcatenationIndentation.

Indent IndentationWidth or match the preceding line’s starting quote.

# IndentationWidth
puts 'this text is too ' \
  'long'

# Match quote
puts 'this text is too ' \
     'long'

3. Use same quotes on all lines

Update: This is now handled by Style/StringLiterals when setting ConsistentQuotesInMultiline: true.

# Bad
"this text is #{foo} " \
'long'

# Good
"this text is #{foo} " \
"long"

# Good
'this text is too ' \
'long'

4. Keep whitespace at the end of lines

In the actual text, disallow spaces at the beginning of line n+1 if line n ends with a non-space.

# Bad
'this text contains a lot of' \
'               spaces'

# Good
'this text contains a lot of               ' \
'spaces'

# Bad
'this text is too' \
' long'

# Good
'this text is too ' \
'long'
@bquorning
Copy link
Contributor Author

/cc @jonas054 (I’m not sure who to mention here, but I know you’ve worked on many syntax cops in the past)

@tejasbubane
Copy link
Contributor

tejasbubane commented Jan 28, 2019

3. Use same quotes on all lines. This might create ambiguity Style/StringLiterals which suggests using single quotes for strings without interpolation.

"this text is #{foo} " \
'long' # StringLiterals prefers this

"this text is #{foo} " \
"long" # But as per your suggestion LineEndConcatenation should prefer this

@bquorning
Copy link
Contributor Author

The idea was that this cop should take precedence over Style/StringLiterals on multi-line strings.

@stale
Copy link

stale bot commented May 8, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contribution and understanding!

@stale stale bot added the stale Issues that haven't been active in a while label May 8, 2019
@bquorning
Copy link
Contributor Author

@bbatsov I would like to hear your opinion. Is strings continued over multiple lines within the responsibility of RuboCop, or should a separate tool (e.g. rubocop-strings) handle these niche cases?

@stale stale bot removed the stale Issues that haven't been active in a while label May 9, 2019
@stale
Copy link

stale bot commented Aug 7, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contribution and understanding!

@stale stale bot added the stale Issues that haven't been active in a while label Aug 7, 2019
@bquorning
Copy link
Contributor Author

I’m still hopeful that someone with knowledge of Parser will comment on this issue.

Maybe @whitequark would be the right person to ask? Or @pocke ?

@stale stale bot removed the stale Issues that haven't been active in a while label Aug 7, 2019
@whitequark
Copy link

Each chunk of the literal is emitted as a separate node so you can just look at the line attribute.
Screenshot_20190807_083230

@buehmann
Copy link
Contributor

buehmann commented Aug 7, 2019

@bquorning The first aspect (one space between quote and \) could be useful for other line continuations as well, not only between string literals.

In fact I was experimenting with extending ExtraSpace the other day to cover the case of too many spaces before the backslash. For your second bad example above:

6420.rb:2:20: C: Layout/ExtraSpacing: Unnecessary spacing detected.
'this text is too '  \
                   ^

Doing that I discovered a more general flaw in the cop that needs to be fixed first.

@bquorning
Copy link
Contributor Author

Thank you for the response, @whitequark. But, your example shows a string containing a newline character. My question is regarding two strings that are “implicitly joined” (as two strings joined by whitespace would be, e.g. "a" "b" == "ab"), only the strings are on different lines and the first line is ended by a backslash (to signal to Ruby that the line continues).

In the following example, I would like to find the position of the \ following "foo " to see e.g. how many spaces are in front of \. Can that be done in Parser, currently?

>> Parser::VERSION
"2.6.3.0"
>> Parser::CurrentRuby
Parser::Ruby26 < Parser::Base
>> puts two_joined_strings = %Q{"foo " \\\n"bar"}
"foo " \
"bar"
nil
>> eval(two_joined_strings)
"foo bar"
>> Parser::CurrentRuby.parse(two_joined_strings).loc
{
           :end => nil,
         :begin => nil,
    :expression => #<Parser::Source::Range (string) 0...14>
}
>> Parser::CurrentRuby.parse(two_joined_strings).children.map(&:loc)
[
    [0] {
               :end => #<Parser::Source::Range (string) 5...6>,
             :begin => #<Parser::Source::Range (string) 0...1>,
        :expression => #<Parser::Source::Range (string) 0...6>
    },
    [1] {
               :end => #<Parser::Source::Range (string) 13...14>,
             :begin => #<Parser::Source::Range (string) 9...10>,
        :expression => #<Parser::Source::Range (string) 9...14>
    }
]

@whitequark
Copy link

In the following example, I would like to find the position of the \ following "foo " to see e.g. how many spaces are in front of \. Can that be done in Parser, currently?

No; the location of \, like with some other tokens (for example ,) is thrown away after lexing.

@buehmann
Copy link
Contributor

In the following example, I would like to find the position of the \ following "foo " to see e.g. how many spaces are in front of \.

@bquorning This is the approach I was playing with, basically analyzing the whitespace around line breaks with regexps: https://github.com/buehmann/rubocop/blob/32ccea042243ee0edf9d9f0dc426bf5baf5d604e/lib/rubocop/cop/layout/space_before_line_continuation.rb#L54-L61

@bquorning
Copy link
Contributor Author

@buehmann Interesting approach with the #investigate implementation. Does it actually work, though? Does the between object contain backslashes, or did Parser throw them away at this point?

I was playing around with e.g.

def on_dstr(node)
  node_line_numbers = node.first_line..node.last_line
  raw_lines = processed_source.raw_source.lines

  node_line_numbers.each do |node_line_number|
    raw_node_line = raw_lines[node_line_number - 1]
    investigate(node, raw_node_line)
  end
end

and matching the line with line.match?(/\S \\\Z/).

processed_source.raw_source definitely contains the backslashes, but isn’t tokenized, and I am only working with a raw string and regular expressions. I’m not sure if RuboCop (or Parser or AST) has some helper classes/methods that could make it cleaner.

@buehmann
Copy link
Contributor

buehmann commented Aug 11, 2019

It works. The source buffer contains the backslashes, they just sit between tokens returned by Parser.

➜  rubocop (line-cont/6420) ✗ rubocop -a --stdin 6420.rb < 6420.rb --only Layout/SpaceBeforeLineContinuation
Inspecting 1 file
C

Offenses:

6420.rb:2:20: C: [Corrected] Layout/SpaceBeforeLineContinuation: Missing space before line continuation
'this text is too '\
                   ^
6420.rb:6:20: C: [Corrected] Layout/SpaceBeforeLineContinuation: Extra space before line continuation
'this text is too '  \
                   ^

1 file inspected, 2 offenses detected, 2 offenses corrected
====================
# Bad
'this text is too ' \
'long'

# Bad
'this text is too ' \
'long'

# Good
'this text is too ' \
'long'

@josb
Copy link

josb commented Dec 18, 2019

Please note that https://bugs.ruby-lang.org/issues/6265#note-8 suggests the Ruby core developers are trying to remove this C-like String concatenation syntax in Ruby 3.0.

@bquorning
Copy link
Contributor Author

Thanks @josb, I wasn’t aware of that ruby-core issue. But, reading the ticket I see that the syntax change was suggested 7 years ago, didn’t become a warning in Ruby 2.0, and 5 years ago it didn’t become a warning in Ruby 2.2. The last update (3 years ago) suggests to me that the syntax is still so heavily used in core Ruby that it won‘t be removed for a while yet.

@josb
Copy link

josb commented Dec 18, 2019

Granted. But it would be good if RuboCop would help people move in the right direction, no? Could even link to that ticket...

@stale
Copy link

stale bot commented Jun 15, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contribution and understanding!

@stale stale bot added the stale Issues that haven't been active in a while label Jun 15, 2020
@stale
Copy link

stale bot commented Sep 13, 2020

This issues been automatically closed due to lack of activity. Feel free to re-open it if you ever come back to it.

@stale stale bot closed this as completed Sep 13, 2020
bquorning added a commit to rubocop/rubocop-rspec that referenced this issue Nov 14, 2020
I have suggested some rules for consistent formatting of these strings
in rubocop/rubocop#6420, but it seems to be
too complicated to have RuboCop enforce them.
bquorning added a commit to rubocop/rubocop-rspec that referenced this issue Nov 14, 2020
I have suggested some rules for consistent formatting of these strings
in rubocop/rubocop#6420, but it seems to be
too complicated to have RuboCop enforce them.
@bquorning
Copy link
Contributor Author

bquorning commented Jun 16, 2022

The first aspect (one space between quote and \) could be useful for other line continuations as well, not only between string literals.

@buehmann I finally got around to playing with checking line continuations again. You can have a look at #10717 if you like (it’s still in draft).

@bquorning
Copy link
Contributor Author

bquorning commented Jun 21, 2022

Fixed in #9883, #10715 and #10717.

pirj pushed a commit to rubocop/rubocop-capybara that referenced this issue Dec 29, 2022
I have suggested some rules for consistent formatting of these strings
in rubocop/rubocop#6420, but it seems to be
too complicated to have RuboCop enforce them.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request stale Issues that haven't been active in a while
Projects
None yet
Development

No branches or pull requests

5 participants