Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect width measurement for tabs? #419

Open
itsjunetime opened this issue Dec 22, 2021 · 3 comments
Open

Incorrect width measurement for tabs? #419

itsjunetime opened this issue Dec 22, 2021 · 3 comments

Comments

@itsjunetime
Copy link

I may be speaking out of my depth here, so let me know if what I'm saying doesn't make sense.

It seems that this library always measures tabs to have 0 for their width (when the unicode_width feature is enabled), thus causing some text to not be wrapped correctly when tabs are displayed with a width of one or more spaces. This happens because character widths are calculated in core.rs, with fn ch_width(ch: char) -> usize (here), which calls into unicode_width (here), calling further into unicode_width here, which tells any unicode character less than 0x20 to return None. This includes tabs (as they are 0x09), and that None is then unwrapped to 0.

Because of this, if tabs are being displayed to the user with the width of 2, 4, or 8 spaces (as they are for most people), textwrap will calculate the width of a line to be lower than it actually will be when presented to the user. This will mean that lines will not be wrapped correctly, and will still overflow when they are not supposed to.

Is this intended behavior (e.g. don't want to decide, for all users, what the width of a tab should be, so just don't deal with that)? If it's not, how do we let users configure how wide a tab should be for their uses?

Let me know what your thoughts are, I'd love to get this resolved.

@mgeisler
Copy link
Owner

Hi Ian! Thanks for writing, you raise a good question!

I've basically not though of tabs at all, so the current behavior is result of the default behavior so to speak.

How advanced would you suggest to be here? Would it be enough to simply replace \t with some number of spaces?

Implementing real tab stops would probably be very complicated since the with of a tab would depend on where it ends up in the wrapped line.

Tab expansion could be done upfront based on a setting in Options (simple) or it could possibly be done inside Word (complex). The first approach would result in all text being cloned (when the tab expansion is enabled) whereas the other approach would let wrapped lines without tabs borrow from the input.

Would you be up for trying to implement this?

@itsjunetime
Copy link
Author

I don't think we would have to modify the input string at all for what I'm suggesting, just calculate its width based on the assumption that a tab will be displayed with the width of a certain number of spaces.

I also don't know enough about how tab stops work with aligning text to say how that would work, so I would say that, just for now, it would probably be best to just treat them each as a fixed width.

I would suggest adding it in options, since I feel like that would make it easiest for users to customize (e.g. just adding pub tab_width: u8 as a field of Options).

I would be up for implementing this, assuming you understand my implementation plans and feel good about them.

@mgeisler
Copy link
Owner

mgeisler commented Dec 25, 2021

I don't think we would have to modify the input string at all for what I'm suggesting, just calculate its width based on the assumption that a tab will be displayed with the width of a certain number of spaces.

Yes, agree, it should be possible to implement this by measuring the text better.

I would suggest adding it in options, since I feel like that would make it easiest for users to customize (e.g. just adding pub tab_width: u8 as a field of Options).

Yes, I think that sounds like a good plan! A tab_width of zero would then signify the current behavior. We can debate if that should be the default or not—I don't think I have a strong preference here.

Please give it a try and let me know what you come up with!

LiterallyVoid added a commit to LiterallyVoid/textwrap that referenced this issue May 10, 2024
The `width` in the test depends on how wide `textfill` considers tabs
to be (see mgeisler#419), and this test will fail if and/or when this changes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants