Nokogiri HTML refactor #6
protected

- def lint_lines(lines)
+ def lint_file(file_tree)
Is there no access to the source of the file? You're jumping through a lot of hoops to make this work with the parsed tree.
Can't say I am in love with the hackery, but I understand that this is probably better in the long run. Good job getting this to work 👍
end
html_elements = Parser.filter_erb_nodes(file_tree)
html_elements.each do |html_element|
  html_element.attribute_nodes.select { |attribute| attribute.name.casecmp('class') == 0 }.each do |class_attr|
These lines are a little hard to read.
Could you please break this out into some well-named variables? Something like:
html_element_class_attributes = html_element.attribute_nodes.select { |attribute| attribute.name.casecmp('class') == 0 }
html_element_class_attributes.each do |class_attr|
Looks like progress! Give me a shout tomorrow if you’d like help with the Rubocop warnings.
@justinthec I think you already have a note on this, but I just rediscovered that single quotes in text strings cause
Yup, I'll fix the string validation tonight, thanks for making note here 👍
Thanks again!
16db763 to 04bce1b
  line: lines.length,
  message: 'Remove the trailing newline at the end of the file.'
)
def lint_file(file_tree)
@lemonmade this comment got unintentionally marked as resolved by Github.
Is there no access to the source of the file? You're jumping through a lot of hoops to make this work with the parsed tree.
The idea behind always using a parse tree is that it provides a consistent interface for all linters, and that most future linters (data-attributes, product content style, etc) will be complex enough to benefit from what a tree brings.
The reason that I'm jumping through the hoops here in `FinalNewline` is only because of the Nokogiri line number bug. Once that is fixed, the hacky `end_marker` will be completely removed and this linter will be much simpler, just finding the last child and calling `#line`.
As a compromise, would you suggest a wrapper class around both the parse tree and a copy of the original source?
Explanation makes sense to me, I think what you have is good 👍
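For reference, the wrapper idea floated above could look roughly like the sketch below. `ParsedFile` and its methods are hypothetical names that do not exist in erb-lint, and the tree is treated as an opaque object so the example runs without Nokogiri.

```ruby
# Hypothetical wrapper pairing a parse tree with the untouched source text.
# Neither the class name nor its methods exist in erb-lint; this is a sketch.
class ParsedFile
  attr_reader :tree, :source

  def initialize(tree, source)
    @tree = tree                 # e.g. a Nokogiri fragment; opaque here
    @source = source.dup.freeze  # private, immutable copy of the raw text
  end

  # Number of lines in the original source, counted without the parse tree.
  def line_count
    source.lines.length
  end

  # Number of newline characters at the very end of the source.
  def trailing_newline_count
    source[/\n*\z/].length
  end
end

parsed = ParsedFile.new(nil, "<p>Hello</p>\n\n")
puts parsed.line_count
puts parsed.trailing_newline_count
```

With something like this, a linter such as `FinalNewline` could consult the raw source for end-of-file checks while tree-based linters keep using `tree`.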
The PR description has been updated to reflect the latest version of this refactor. All previous feedback has been addressed except for one line comment by Chris which I've re-added manually. Ready for 👀 round 2; thanks again in advance :)
@justinthec This looks great. I rebased
Hey @justinthec I'm trying to do something that calls this branch of erb-lint and seems like Anyway, if I run erb-lint on <p>Check out this sweet support page.<br>\nIt's really a program.</p>\n It throws an error:
I've gone and done a self review of the code after not looking at most of it for 28 days and I'm impressed that I still understand it and can walk through it clearly (especially The Nokogiri Line number bug that I was originally waiting for has not been addressed for exactly a month now, with no end in sight, so I'm going to go ahead and ship this. I'll still keep an eye on that bug to see when I can remove my workaround, but it is pretty small and well documented in the code.
👍
b7dca9d
to
a456aae
Compare
🎉 thanks for getting this through the last mile Jeremy!
No problem man, thanks for all your work!
Tried to use the gem at this revision, and now it explodes with invalid markup:
Not sure if it's the expected behaviour after this PR, but I think it would have been better if we got a linter message instead of an exception.
Thanks @volmer, I'll look into this ASAP.
Trying to figure out how to best provide a linter error message. This seems pretty hacky but it’s the first thing I’ve been able to puzzle out: Rather than raising a Then I could stick an I'm sure there's a better way to do this, @justinthec . . .
I think throwing a So https://github.com/Shopify/erb-lint/blob/master/lib/erb_lint/runner.rb#L18 would look like:
def run(filename, file_content)
  file_tree = begin
    Parser.parse(file_content)
  rescue Parser::ParseError
    nil
  end
  return unless file_tree
  linters_for_file = @linters.select { |linter| !linter_excludes_file?(linter, filename) }
  linters_for_file.map do |linter|
    {
      linter_name: linter.class.simple_name,
      errors: linter.lint_file(file_tree)
    }
  end
end
This would fail silently; if we instead wanted to generate a violation for Policial.io then we could modify it to return an error from a pseudo-linter:
return {
  linter_name: 'HTMLValidity',
  errors: ['File is not HTML valid and could not be successfully parsed. Ensure all tags are properly closed']
}
@volmer @justinthec Am testing in Policial.io. Ran into an error, am troubleshooting:
@justinthec Yes, that's the case. I need to figure out a way to pass the error directly to Policial as a linter message without erb-lint trying to actually lint the file.
You can use the snippet I provided. Return the HTMLValidity error in the rescue block.
@justinthec Ah, got it! I was just missing some brackets and
return [
  { linter_name: 'HTMLValidity', errors: [
    { line: 1,
      message: 'File is not HTML valid and could not be successfully parsed. ' \
               'Ensure all tags are properly closed.' }
  ] }
]
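Pulling the snippets above together, the rescue-and-report flow might be sketched as follows. This is a self-contained illustration only: `StubParser`, its toy balanced-tag check, and `NoopLinter` are hypothetical stand-ins for `ERBLint::Parser` and a real linter, and `linter.class.name` is used in place of the project's `simple_name` helper.

```ruby
# Stand-in for ERBLint::Parser. The real parser wraps Nokogiri; the
# balanced-tag check below is a toy used only to trigger ParseError.
module StubParser
  class ParseError < StandardError; end

  def self.parse(file_content)
    opens  = file_content.scan(/<([a-z]+)[^>]*>/i).flatten
    closes = file_content.scan(%r{</([a-z]+)>}i).flatten
    raise ParseError, 'unbalanced tags' unless opens.sort == closes.sort

    file_content # a real parser would return a document fragment
  end
end

# Hypothetical linter that never reports anything.
class NoopLinter
  def lint_file(_file_tree)
    []
  end
end

# Parse once, then hand the tree to every linter. If parsing fails,
# report the failure as a single pseudo-linter result instead of raising.
def run(file_content, linters)
  file_tree = StubParser.parse(file_content)
  linters.map do |linter|
    { linter_name: linter.class.name, errors: linter.lint_file(file_tree) }
  end
rescue StubParser::ParseError
  [{
    linter_name: 'HTMLValidity',
    errors: [{
      line: 1,
      message: 'File is not HTML valid and could not be successfully parsed. ' \
               'Ensure all tags are properly closed.'
    }]
  }]
end

p run('<p>ok</p>', [NoopLinter.new])
p run('<p>broken', [NoopLinter.new])
```

Returning the same array-of-hashes shape in both branches means a caller such as Policial would not need a special case for unparseable files.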
This reverts commit 4cf4cb7.
Why?
The current use of the `StartTagHelper` module in `DeprecatedClasses` for analyzing HTML start tags is not robust (as demonstrated by #5), difficult to maintain due to excessive use of regular expressions, and limited in functionality (no multiline start tags, catches tags within quotes, etc).
Refactoring `ERBLint` to use a proper parser, supported by over 100 contributors, will solve all the problems of the current solution and make it easier for other developers to contribute linters to the gem.

What?
This PR changes the interface that linters use to analyze the file from `file_content`/`lines` to a `file_tree`.
This `file_tree` is a `Nokogiri::HTML::Fragment` which can be traversed and examined just like any other `Nokogiri::HTML::Node`.
3 small modifications are being made during the transformation from the bare text `file_content` to the `file_tree` HTML fragment.
1. ERB tags (`<%..%>`, `<%=..%>`, `<%..-%>`, `<%#..%>`, etc) have been ~~replaced by `<erb>...</erb>`~~ removed from the file content and replaced with corresponding whitespace and newlines.
   `<% apples \n %>` --> `__________\n___`
2. Escaped ERB tags (`<%%`, `%%>`) have been escaped via `htmlentities` entity encoding.
   `<%%` --> `&lt;%%`
3. An `erb_lint_end_marker` tag has been appended to the end of the document fragment for calculating end-of-file line numbers. This is a workaround for this Nokogiri line number bug.
~~All content within ERB tags has had all of its contained entities encoded (special character escaping using `htmlentities`)~~
~~All ERB tags (now `<erb>...</erb>`) within strings (`"..."` or `'...'`) have been replaced by `_erb_..._/erb_` to prevent them from being parsed as tags by `Nokogiri::HTML`.~~

How?
- `StartTagHelper` has been removed.
- `ERBLint::Parser` has been added, exposing a `parse` method to generate the valid `Nokogiri::HTML::Fragment` from a `file_content` string.
- `ERBLint::Runner` parses the file using the `parse` method before passing the resulting `file_tree` to each linter's `lint_file` method.
- `ERBLint::Linter#lint_lines` has been removed.

Todo
- `parser_spec.rb`

For review:
@edward @volmer @lemonmade
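
As an illustration of the first modification listed under "What?", here is a minimal sketch of blanking ERB tags while preserving line positions. It is not the code from `ERBLint::Parser`; the constant `ERB_TAG` and the method `blank_erb_tags` are hypothetical names, and the regular expression is a simplification that only skips the escaped opener `<%%`.

```ruby
# Matches ERB tags such as <% %>, <%= %>, <%# %> and <% -%>, but not the
# escaped opener <%% (negative lookahead on the second percent sign).
ERB_TAG = /<%(?!%).*?%>/m

# Replace every character of each ERB tag with a space, keeping newlines,
# so line and column positions of the surrounding HTML stay the same.
def blank_erb_tags(file_content)
  file_content.gsub(ERB_TAG) { |tag| tag.gsub(/[^\n]/, ' ') }
end

source = "<p class=\"a\">\n<% apples \n %>\n</p>\n"
blanked = blank_erb_tags(source)
puts blanked.inspect
puts blanked.lines.length == source.lines.length
```

Because only non-newline characters are replaced, any line number a parser later reports for the surrounding HTML still refers to the same line of the original file.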