Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[alignment] Fix alignment issues so repeated processing works properly #1663

Closed
wants to merge 1 commit into from

Conversation

hazendaz
Copy link
Contributor

@hazendaz hazendaz commented Nov 4, 2021

Context: formatter-maven-plugin formats source code for html using jsoup with mixed results.

During first pass on formatting a file with pretty printing, if tags such as <div> with content on second line show up, jsoup currently forces that +1 in the padding which is incorrect. If the div is inline, jsoup will correctly align it with line breaks properly on that first pass. A secondary pass of now formatted file will result in the same behaviour with +1. This seems to affect many elements based on file we have here

I tried various ways to address and don't know jsoup enough to know the right way. What seemed to work best without messing anything else up was to use the normalize on the string already present in jsoup when doing pretty printing and if that ends up with space before content as it does in this case (one at start and one at end), then do the padding with 1 less character. Under testing this seemed to then work on the first round and every subsequent pretty print of same worked out the same way.

I even tried your website to drop the file I have above. It does behave in same way there so this seems right but may not be the most appropriate way to resolve. Please take a look here and let me know what you think.

One test seemed inherently wrong here too where it had a non breaking space but was being prefixed with an extra space. To me that means it got 2 when it didn't need the extra. While look and feel is better with that space, I wasn't quite sure how to make that look better and technically one space is all that needed so I adapted that test to match the change.

@hazendaz hazendaz marked this pull request as draft November 4, 2021 02:19
@hazendaz hazendaz marked this pull request as ready for review November 4, 2021 02:20
@hazendaz hazendaz marked this pull request as draft November 4, 2021 02:21
@hazendaz
Copy link
Contributor Author

hazendaz commented Nov 4, 2021

Draft status as tests don't work with this. After review, will address as needed.

@hazendaz hazendaz marked this pull request as ready for review November 25, 2021 02:05
@hazendaz
Copy link
Contributor Author

This is ready to review.

@hazendaz
Copy link
Contributor Author

@jhy Could you look at this one?

@jhy
Copy link
Owner

jhy commented Dec 19, 2021

Hi,

I'm not sure I'm following the issue here. Can you add examples of input and output as-is (that you see as an error) and what you want instead?

@hazendaz
Copy link
Contributor Author

@jhy The original request here has the file in question, see the 'here' on that part as that is the file. You can run that through the website you have provided as well which will show that it incorrectly adds 1 extra space to tabbing and it should not. The PR applied takes that space away and resolves the case. See the PR as well. It really doesn't matter the file but I attached one that has the issue. Its simply not being calculated properly.

@hazendaz
Copy link
Contributor Author

@hazendaz
Copy link
Contributor Author

hazendaz commented Dec 19, 2021

So if we ask for 4 character spacing per tab, we expect 4 character tabs per spacing for all indentations. This does so up to the point it hits the <div>, it adds a 5th character whitespace right before. So math is incorrect. The PR sent does fix the issue (maybe not how needed in this framework but does fix it on repeated formattings - over and over its fixed). We have our project running multiple runs on the file (live on github via file provided) to confirm that the formatting does not change on subsequent passes and it currently changes on us and doesn't adhere to the requirements stated in this project for pretty print. However that needs to work, it needs to properly adhere or its not really pretty printed per spec and a simply format in Eclipse or other IDE will fix the problem and its pretty obvious issue. Its important we get this right as we have a huge user base for formatter-maven-plugin https://github.com/revelc/formatter-maven-plugin and this is off by default, where its used it is entirely wrong. This is why I had the other PR that was merged to at least fix the tabbing back to actually pretty print. This is a follow up (1 of 3 outstanding issues with this). I don't know the framework here well enough to know any issues with what I added other than it does work correctly in testing.

@hazendaz
Copy link
Contributor Author

Best option is to take that file and drop in your own website for jsoup and run with pretty print. Then count the spacing up to the <div>. They are off by 1 character.

@hazendaz
Copy link
Contributor Author

ok the online site is hard to see this, there is no option for 4 character spacing that I see there to demo this. It doesn't happen with 1 character as setup. Can you expand that show it has optiont to allow selective spacing so this can be better demonstrated?

@hazendaz
Copy link
Contributor Author

hazendaz commented Jan 3, 2022

Rebased this one and squashed commits (from 3 to 1). Same change as previous but I added () back to siblingIndex. I didn't look to see where exactly it was getting that data without () on it as not defined in that method or class directly. Keep this still fixes issues I'm seeing but as with before, its probably not the right fix but easier to see together in one commit now. The test case that is involved should help.

@hazendaz
Copy link
Contributor Author

hazendaz commented Jan 9, 2022

Jsoup 2nd run Spacing issue

Left side is first time run on my test case, right side is second run. Any repeating to format pretty aftwards always ends in third without this patch.

I see this occur inside of

, , , and others. So basically seems like anytime internal to a tag a secondary run doesn't quite count the spaces correctly. Maybe a better solution is to divide the spaces used on the line and if it is over, fix the issue. For example 21/4 has 1 left over when tab spacing is 4. That is the cruxt of the issue. I'll try looking to see if I can come up with a better pull request.

@jhy
Copy link
Owner

jhy commented Jun 24, 2022

Thanks @hazendaz - the underlying issue is now fixed (with a different approach) in #1798

Tested that now your example file will round trip multiple rounds of pretty print and parsing (with the exception of one comment getting placed on a newline, which at the moment I don't see as a big deal).

@jhy jhy closed this Jun 24, 2022
@hazendaz
Copy link
Contributor Author

hazendaz commented Jul 8, 2022

@jhy Finally got back to properly confirm this. It works great! Thank you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants