[Request] Basic ANSI/VT100 Control Sequence Colorization/Formatting Support #3160

Closed
RichiCoder1 opened this issue Apr 22, 2021 · 6 comments
Labels: enhancement, parser

Comments

@RichiCoder1

RichiCoder1 commented Apr 22, 2021

Is your request related to a specific problem you're having?
Less of a problem and more of a hope.

The solution you'd prefer / feature you'd like to see added...
In essence, basic support for formatting or stripping out ANSI/VT100 codes in console/shell-session blocks. Not full-on terminal replay support, but basic support for translating color and formatting codes into corresponding markup and CSS, and removing unsupported codes.
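For illustration, the "stripping" half of this request could look roughly like the sketch below. It is written in Python only for brevity and is not part of highlight.js. The name `strip_ansi` is hypothetical, and the pattern covers only CSI sequences (ESC followed by `[`), not every escape form a terminal understands.

```python
import re

# CSI sequence: ESC [ then parameter bytes (0x30-0x3F),
# intermediate bytes (0x20-0x2F), and one final byte (0x40-0x7E).
CSI_RE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def strip_ansi(text: str) -> str:
    """Remove CSI escape sequences and leave the printable text untouched."""
    return CSI_RE.sub("", text)


# Hypothetical sample: one line of colorized CLI output.
sample = "\x1b[1m\x1b[31mError: \x1b[0m\x1b[0m\x1b[1mNo configuration files\x1b[0m"
print(strip_ansi(sample))  # prints "Error: No configuration files"
```

Translating the codes into markup instead of dropping them needs state tracking, which this sketch deliberately leaves out.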

Additional context...
Basically, to enable taking colorized CLI output directly from the console and embedding it in markdown. Even more specifically, to enable colorized console output snippets in comments made by build tools, without having to dive into terminal logs.

RichiCoder1 added the enhancement and parser labels Apr 22, 2021
@joshgoebel
Member

Not sure I follow. Please provide an example of input/output.

@RichiCoder1
Author

It may be non-trivial to parse just because it's not symmetrical like some grammars, but basically

\e[31m
\e[1m\e[31mError: \e[0m\e[0m\e[1mNo configuration files\e[0m

\e[0mPlan requires configuration to be present. Planning without a configuration
would mark everything for destruction, which is normally not what is desired.
If you would like to destroy everything, run plan with the -destroy option.
Otherwise, create a Terraform configuration file (.tf file) and try again.
\e[0m\e[0m

to

image
(span/css)

More context:

@joshgoebel
Member

joshgoebel commented Apr 22, 2021

Sorry to crush your hopes, but this isn't what we do. We highlight code verbatim - we do not modify it or otherwise attempt to "render" it. If we supported ANSI (as a grammar) it would look just like the first capture but perhaps with the escape sequences highlighted so that they "popped", etc...

This is a great idea though. Seems like the kind of thing that'd be pretty easy to do (at least for the FG/BG ANSI sequences, less so for cursor movement, etc.), but it sounds like a separate library.

@RichiCoder1
Author

Fair enough :). Thank you!

@unphased

unphased commented Apr 14, 2024

I would love this for the convenience of being able to stick colorized terminal output into a code block and get it rendered as we would expect (by having something to interpret the escape sequences). It would be nice if highlightjs could do this, then it would work out of the box in something like reveal.js for presentations.

Although it is true that ANSI escape sequences seem at first like a completely different paradigm from what highlightjs does (keeping content verbatim and applying styling)...

It seems possible if you just take the ANSI escape sequences themselves and put them into spans styled as display:none, hiding them instead of making them pop out (they are nonprintable anyhow, and this is precisely what a terminal emulator would do). Then styling the rest of the content in the appropriate spans would yield the correct result.
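A rough sketch of the hiding step, in Python purely for illustration: every escape sequence stays in the output verbatim but is wrapped in a span that a stylesheet can hide. The function name `hide_sequences` and the class name `ansi-esc` are made up for this example, and nothing here is highlight.js API.

```python
import html
import re

# Same CSI shape as a terminal would recognize: ESC [ params, intermediates, final byte.
CSI_RE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def hide_sequences(text: str) -> str:
    """Wrap each escape sequence in a span; the text itself is kept verbatim."""
    out = []
    pos = 0
    for match in CSI_RE.finditer(text):
        # Printable text before the sequence, HTML-escaped.
        out.append(html.escape(text[pos:match.start()]))
        # The sequence is preserved, only tagged so CSS can hide it.
        out.append('<span class="ansi-esc">' + html.escape(match.group(0)) + "</span>")
        pos = match.end()
    out.append(html.escape(text[pos:]))
    return "".join(out)


markup = hide_sequences("\x1b[31mred\x1b[0m plain")
```

With a rule such as `.ansi-esc { display: none; }` the sequences disappear from view while the underlying content is still the original text. Styling the visible segments is a separate step that this sketch does not attempt.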

I have a zero dependency bit of code that handles an ANSI to HTML translation, which may be a helpful reference to anyone looking at this topic. Since I built this recently I made sure to include every rendition feature I care about so it includes even underline styles and colors; the css corresponding to the classes I set from this code should be self-explanatory.

https://gist.github.com/unphased/628a116cdde6f2d3f49bbe04a8e3d555

There is a big wrinkle however.

The main issue is that ANSI styles allow for you to set multiple types of styles (set a foreground color, set bold, set italic) and then clear all those states with a single \e[0m or \e[m, or you can clear just the bold with \e[22m.

Along these lines, you could have something like \e[1mABC\e[3mDEF\e[22mGHI\e[23m, which must draw "ABC" in bold, "DEF" in bold and italic, and "GHI" only in italic. In HTML you'll have to assemble separate spans for these 3 segments, yet in the ANSI escape sequence bold is toggled on and off once, and italic is toggled on and off once. Their ranges overlap, which is not possible with HTML elements, since they must nest.

So full terminal emulation poses parsing challenges. However, even if this aspect is left unhandled and only a straight translation is done (as my example code does), "clean" input (properly paired AND always nested ANSI escape sequences) would still always render properly.
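One way around the overlap problem is to stop pairing "on" and "off" codes altogether: keep a small state record, update it on every SGR sequence, and emit one flat span per run of text. The sketch below is a Python illustration of that idea, not the code from the gist above. The class names (`ansi-bold`, `ansi-italic`, `ansi-fg-N`) are hypothetical, and only a handful of SGR codes are handled (0, 1, 3, 22, 23, 30-37, 39).

```python
import html
import re

SGR_RE = re.compile(r"\x1b\[([0-9;]*)m")


def ansi_to_html(text: str) -> str:
    """Translate a small SGR subset into flat, non-nested spans."""
    state = {"bold": False, "italic": False, "fg": None}
    out = []

    def emit(chunk: str) -> None:
        if not chunk:
            return
        classes = []
        if state["bold"]:
            classes.append("ansi-bold")
        if state["italic"]:
            classes.append("ansi-italic")
        if state["fg"] is not None:
            classes.append("ansi-fg-%d" % state["fg"])
        escaped = html.escape(chunk)
        if classes:
            out.append('<span class="%s">%s</span>' % (" ".join(classes), escaped))
        else:
            out.append(escaped)

    pos = 0
    for match in SGR_RE.finditer(text):
        emit(text[pos:match.start()])  # text styled by the state so far
        pos = match.end()
        # An empty parameter (as in ESC[m) means 0, i.e. reset.
        for code in (int(p) if p else 0 for p in match.group(1).split(";")):
            if code == 0:
                state.update(bold=False, italic=False, fg=None)
            elif code == 1:
                state["bold"] = True
            elif code == 3:
                state["italic"] = True
            elif code == 22:
                state["bold"] = False
            elif code == 23:
                state["italic"] = False
            elif 30 <= code <= 37:
                state["fg"] = code - 30
            elif code == 39:
                state["fg"] = None
            elif code in (38, 48):
                break  # extended colors are not handled; skip their parameters
    emit(text[pos:])
    return "".join(out)


print(ansi_to_html("\x1b[1mABC\x1b[3mDEF\x1b[22mGHI\x1b[23m"))
# <span class="ansi-bold">ABC</span><span class="ansi-bold ansi-italic">DEF</span><span class="ansi-italic">GHI</span>
```

Because each segment carries its full set of classes, overlapping ranges never need nested elements. The trade-off is that the escape sequences are dropped from the output, so the result is no longer the verbatim input.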

@joshgoebel
Member

joshgoebel commented Apr 14, 2024

Anyone wishing to write truly unique parsing engines (for custom grammars) is now free to do so via #3620. The output must still be a valid HLJS token stream, though, i.e. all you can emit are named scopes and text, etc...

https://highlightjs.readthedocs.io/en/latest/mode-reference.html#emittokens
