pl.Config.set_tbl_formatting("UTF8_FULL_CONDENSED") as default for printing Dataframes #5942
Comments
Thanks for the try, @dandxy89. I don't know what went wrong with your PR.
This seems like it could be an issue with your specific distribution/terminal/font - I don't have any problems with the UTF8 formats on Linux. Can you provide a screenshot and some distro/terminal/font details for your Linux machine, so we can at least see what's happening for you? (We're not going to want to change the default to ASCII there, given that there are no other reports of problems).
I just tested it on my Linux Mint 21.1 and it's working fine with utf8 formatting; the problem only occurs on WSL (Windows Subsystem for Linux). Still, I believe that the condensed utf8 formatting should be the default.
Personally, I also prefer the look of the condensed formatting. I am not familiar with the intricacies of the various operating systems, but to me it just looks better.
Will update my branch this evening once I'm back from the bouldering gym. Why did I close it? I realised that the Python test suite wasn't running locally.
@stinodego it's not only the looks: it is around 40% fewer lines, and it makes a Polars DataFrame much easier to work with. Pandas, which is the "default" library for DataFrames, is even more minimal.
It will probably be a bit of a chore to update all the docstring examples, although I'm sure that's doable with some regex magic. Should probably get sign-off from @ritchie46 before going ahead with that.
Do you think we can automate such a change? That would probably be beneficial many times down the road. I have no strong opinions on the format, so a tight format is fine by me. I only have opinions on changing the docstrings.
The line-contraction is definitely a font/kerning issue, rather than a polars issue (@arturdaraujo: you could try switching your font under WSL to something like JetBrains Mono, or one of the Menlo Powerline fonts?) As for a change in default format, I'm definitely biased towards the @ritchie46 / @stinodego: if you want to go ahead, I'll volunteer for updating this and the associated mass docstrings update; how about in conjunction with #5513, which would also require a global docstring update? (It's a different issue, but could kill two birds with one stone). |
I doubt it would be worth it. The docstrings right now are all flat text, you'd have to find some way to auto-generate the example output. I think regex magic is probably a way easier method of going about this.
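A minimal sketch of the "regex magic" route, using only the standard library. It assumes that the only difference between the full and condensed UTF8 layouts is the dashed divider line between data rows (lines of the form `├╌╌╌┼╌╌╌┤`); the helper name `condense_tables` and the sample table are hypothetical and not taken from the polars codebase.

```python
import re

# Assumption: a row divider is a whole line made of "├", runs of "╌" joined
# by "┼", and a closing "┤", optionally indented (as inside a docstring).
ROW_DIVIDER = re.compile(r"^[ \t]*├[╌┼]+┤[ \t]*\n", flags=re.MULTILINE)


def condense_tables(text: str) -> str:
    """Drop the dashed dividers between data rows; leave all other lines alone."""
    return ROW_DIVIDER.sub("", text)


# Hypothetical docstring output in the full (verbose) layout.
full = """\
    shape: (2, 2)
    ┌─────┬─────┐
    │ a   ┆ b   │
    │ --- ┆ --- │
    │ i64 ┆ i64 │
    ╞═════╪═════╡
    │ 1   ┆ 3   │
    ├╌╌╌╌╌┼╌╌╌╌╌┤
    │ 2   ┆ 4   │
    └─────┴─────┘
"""

condensed = condense_tables(full)
print(condensed)
```

Because the pattern only matches complete divider lines, the header separator (`╞═╪═╡`) and the table borders are left untouched; anything that deviates from the assumed divider shape would need its own pattern.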
Let's do it! Regarding #5513, I'll leave a comment in that issue. I'm not as certain about that one.
If it's for the python files - the ast module can be used to isolate the docstrings which can be helpful for performing modifications.
The result could then be passed to |
Does unparse do what I think it does?! 👀 That would be amazing.
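As a rough illustration of the `ast` suggestion, the sketch below isolates docstrings from Python source with `ast.parse`, `ast.walk` and `ast.get_docstring`. The sample source and the helper name `collect_docstrings` are made up for the example; this is not the script that was used in the thread.

```python
import ast

# Hypothetical source file contents, used only for illustration.
source = '''
def double(x):
    """Return twice the input.

    Examples
    --------
    >>> double(2)
    4
    """
    return 2 * x


class Frame:
    """A hypothetical stand-in class."""

    def shape(self):
        return (0, 0)
'''


def collect_docstrings(code: str) -> dict[str, str]:
    """Map each function/class name to its docstring, skipping undocumented ones."""
    found = {}
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            doc = ast.get_docstring(node)  # None when there is no docstring
            if doc is not None:
                found[node.name] = doc
    return found


docs = collect_docstrings(source)
for name, doc in docs.items():
    print(name, "->", doc.splitlines()[0])
```

Once the docstrings are isolated like this, any text transformation can be applied to them alone, without risking matches elsewhere in the file.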
Just submitted a corresponding |
FYI, I have looked into alternatives before to auto-update the docstring output, and could only find https://github.com/max-sixty/pytest-accept, but it does not seem to be maintained, and it relies on pytest, which does not pick up our custom output checker (which we need for The suggestion by @cmdlineluser to use the ast module is a good one. Note that Python's built-in doctest does not use the ast module, although a popular alternative, xdoctest (insofar as doctests are popular), does. Not sure if that has any consequences. I have been tempted before to wrap what we have in our script as a separate package; maybe it becomes worth it if we have auto-update as well, although I realize that is far off from the very targeted use case here. Not sure how black should be part of this; you are only changing the outputs, right?
Output:
Also, I just realized it doesn't retain comments either - which makes it a non-solution - d'oh.
Looks like Parso or LibCST are the recommended tools that retain formatting/comments.
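The comment loss is easy to reproduce with the standard library alone. The sketch below round-trips a small, made-up snippet through `ast.parse` and `ast.unparse` (available from Python 3.9); the snippet itself is an assumption for illustration.

```python
import ast

# Hypothetical input containing a standalone comment and a trailing comment.
original = (
    "# module-level comment\n"
    "x = {'a': 1}  # trailing comment\n"
)

# Comments are not part of the AST, so they cannot survive the round trip.
round_tripped = ast.unparse(ast.parse(original))

print(round_tripped)
print("comments kept:", "#" in round_tripped)
```

The assignment itself survives, but both comments are gone, which is why a tool that works on a concrete syntax tree is the better fit when the rest of the file must stay byte-for-byte identical.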
It's a New Year's gift!
Have finished updating everything now except for the rather involved docstring of pl.align_frames. The tea is finished, and this one probably calls for a coffee - but then it's all done ;)
Wow, that's fast!
Done - unit/doctests all passed locally, so let's see how it does through CI ... ;)
Problem description
The dataframe printing format should be condensed. It does not make sense to make the full verbose format the default.
Here is an implementation suggestion using only built-in packages and their respective processing times below:
I also created an issue about differentiating between operating systems: #5941
The current default prints like this:
But the default should be this: