Pipelining text with ANSI color code to bat consumes huge CPU and memory. #1481

Closed
cedric-sun opened this issue Jan 6, 2021 · 13 comments
Labels
bug Something isn't working performance


@cedric-sun

What version of bat are you using?
bat 0.17.1

Describe the bug you encountered:

I was working on the nginx source tree: http://nginx.org/download/nginx-1.19.6.tar.gz

grep -rniI 'u_char' --color=always | bat

less will open, but scrolling to the end (G) takes 8 seconds for a 2500-line grep result. top reports that less consumed a whole CPU core and 960 MiB of physical memory (RES) during those 8 seconds.

grep -rniI 'u_char' --color=never | bat works fine.
grep -rniI 'u_char' --color=always | less -RN works fine.

What did you expect to happen instead?
Performance should be equivalent to grep -rniI 'u_char' --color=always | less -RN.

How did you install bat?
Archlinux pacman.


Environment

system

$ uname -srm
Linux 5.9.14-arch1-1 x86_64

bat

$ bat --version
bat 0.17.1

$ env

bat_config

bat_wrapper

No wrapper script for 'bat'.

bat_wrapper_function

No wrapper function for 'bat'.

No wrapper function for 'cat'.

tool

$ less --version
less 563 (PCRE regular expressions)

@cedric-sun cedric-sun added the bug Something isn't working label Jan 6, 2021
@cedric-sun cedric-sun changed the title from "Pipelining text with ANSI color code to bat consume huge CPU and memory." to "Pipelining text with ANSI color code to bat consumes huge CPU and memory." Jan 6, 2021
@eth-p
Collaborator

eth-p commented Jan 8, 2021

I might have an idea on what's going on.

The way that ANSI escape sequences are passed through by bat is that they're actually interpreted and re-emitted intermixed with the regular highlighted output. It doesn't emulate a terminal, but it uses heuristics to guess the combination of escape sequences necessary to create an equivalent color to what an xterm/vt100 terminal would have displayed at that point.

It's possible that bat isn't clearing the history of emitted sequences properly, and is just continuously emitting obsolete sequences over and over again. It would look the same, but end up with tons of unnecessary bytes being sent to less.

The theory would need to be tested of course, but it's my best guess as of right now.

@sharkdp sharkdp added performance and removed bug Something isn't working labels Jan 9, 2021
@sharkdp
Owner

sharkdp commented Jan 9, 2021

Related to #1147?

@eth-p
Collaborator

eth-p commented Jan 9, 2021

I did some of my own testing with this on bat's repo using ggrep -rniI 'bat' --color=always | bat.
With less, it's unbearably slow (1-5 lines / sec) and consumes 200 MiB of memory.

Before I get into details, I would just like to say that the following three lines made this very difficult to figure out 😛

bat/src/bin/bat/app.rs

Lines 164 to 166 in 7ada963

// We don't have the tty width when piping to another program.
// There's no point in wrapping when this is the case.
WrappingMode::NoWrapping(false)


Time to dig into why there's such a drastic difference...

$ ggrep -rniI 'bat' --color=always | head -n5 | bat -pA
This is the output of GNU grep. I'll consider this to be the baseline, since it performs well enough.

I'm not entirely sure why it's necessary for it to be emitting ^[K (erase from cursor to end of the line) after every color change, but I'll assume there's some legacy or compatibility reason for it.

␛[35m␛[KCargo.toml␛[m␛[K␛[36m␛[K:␛[m␛[K␛[32m␛[K5␛[m␛[K␛[36m␛[K:␛[m␛[Khomepage·=·"https://github.com/sharkdp/␛[01;31m␛[Kbat␛[m␛[K"␊
␛[35m␛[KCargo.toml␛[m␛[K␛[36m␛[K:␛[m␛[K␛[32m␛[K7␛[m␛[K␛[36m␛[K:␛[m␛[Kname·=·"␛[01;31m␛[Kbat␛[m␛[K"␊
␛[35m␛[KCargo.toml␛[m␛[K␛[36m␛[K:␛[m␛[K␛[32m␛[K8␛[m␛[K␛[36m␛[K:␛[m␛[Krepository·=·"https://github.com/sharkdp/␛[01;31m␛[Kbat␛[m␛[K"␊
␛[35m␛[KCargo.toml␛[m␛[K␛[36m␛[K:␛[m␛[K␛[32m␛[K19␛[m␛[K␛[36m␛[K:␛[m␛[K#·Feature·required·for·␛[01;31m␛[Kbat␛[m␛[K·the·application.·Should·be·disabled·when·depending·on␊
␛[35m␛[KCargo.toml␛[m␛[K␛[36m␛[K:␛[m␛[K␛[32m␛[K20␛[m␛[K␛[36m␛[K:␛[m␛[K#·␛[01;31m␛[Kbat␛[m␛[K·as·a·library.␊

$ ggrep -rniI 'bat' --color=always | head -n5 | bat --color=always --decorations=always --plain --terminal-width=80 | bat -pA
And here's the output with bat --plain.

The diff between these two shows that bat adds a leading line color (^[38;2;248;248;242m) and a style reset code (^[0m) at the end.

␛[38;2;248;248;242m␛[35m␛[KCargo.toml␛[m␛[K␛[36m␛[K:␛[m␛[K␛[32m␛[K5␛[m␛[K␛[36m␛[K:␛[m␛[Khomepage·=·"https://github.com/sharkdp/␛[01;31m␛[Kbat␛[m␛[K"␛[0m␊
␛[38;2;248;248;242m␛[35m␛[KCargo.toml␛[m␛[K␛[36m␛[K:␛[m␛[K␛[32m␛[K7␛[m␛[K␛[36m␛[K:␛[m␛[Kname·=·"␛[01;31m␛[Kbat␛[m␛[K"␛[0m␊
␛[38;2;248;248;242m␛[35m␛[KCargo.toml␛[m␛[K␛[36m␛[K:␛[m␛[K␛[32m␛[K8␛[m␛[K␛[36m␛[K:␛[m␛[Krepository·=·"https://github.com/sharkdp/␛[01;31m␛[Kbat␛[m␛[K"␛[0m␊
␛[38;2;248;248;242m␛[35m␛[KCargo.toml␛[m␛[K␛[36m␛[K:␛[m␛[K␛[32m␛[K19␛[m␛[K␛[36m␛[K:␛[m␛[K#·Feature·required·for·␛[01;31m␛[Kbat␛[m␛[K·the·application.·Should·be·disabled·when·depending·on␛[0m␊
␛[38;2;248;248;242m␛[35m␛[KCargo.toml␛[m␛[K␛[36m␛[K:␛[m␛[K␛[32m␛[K20␛[m␛[K␛[36m␛[K:␛[m␛[K#·␛[01;31m␛[Kbat␛[m␛[K·as·a·library.␛[0m␊
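
A rough way to quantify a diff like this is to count the escape sequences on each line. The following is a minimal sketch, not something from bat itself: the helper name `count_csi_sequences` is hypothetical, and it only recognizes CSI sequences of the form ESC, `[`, optional parameter bytes, then one final byte.

```rust
/// Counts CSI escape sequences (ESC '[' params final-byte) in a line.
/// Hypothetical helper for illustration; not part of bat.
fn count_csi_sequences(line: &str) -> usize {
    let bytes = line.as_bytes();
    let mut count = 0;
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == 0x1b && i + 1 < bytes.len() && bytes[i + 1] == b'[' {
            // Skip parameter and intermediate bytes (0x20..=0x3f).
            let mut j = i + 2;
            while j < bytes.len() && (0x20..=0x3f).contains(&bytes[j]) {
                j += 1;
            }
            // A final byte in 0x40..=0x7e terminates the sequence.
            if j < bytes.len() && (0x40..=0x7e).contains(&bytes[j]) {
                count += 1;
                i = j + 1;
                continue;
            }
        }
        i += 1;
    }
    count
}

fn main() {
    // Start of the first grep line shown above.
    let grep_line = "\x1b[35m\x1b[KCargo.toml\x1b[m\x1b[K\x1b[36m\x1b[K:\x1b[m\x1b[K";
    // The same text with the leading color and trailing reset that bat --plain adds.
    let bat_plain_line = format!("\x1b[38;2;248;248;242m{}\x1b[0m", grep_line);
    println!("grep:        {}", count_csi_sequences(grep_line));
    println!("bat --plain: {}", count_csi_sequences(&bat_plain_line));
}
```

For the plain style the count should differ by exactly two per line, which matches the diff described above.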

So far it looks like my previous theory was wrong, but there's still one other thing to consider... what if we remove the plain style, and bring the decorations back?

$ ggrep -rniI 'bat' --color=always | head -n5 | bat --color=always --decorations=always --terminal-width=80 | bat -pA

␛[38;5;238m\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{252c}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}␛[0m␊
·······␛[38;5;238m\u{2502}·␛[0m␛[1mSTDIN␛[0m␊
␛[38;5;238m\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{253c}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}␛[0m␊
␛[38;5;238m···1␛[0m···␛[38;5;238m\u{2502}␛[0m·␛[38;2;248;248;242m␛[35m␛[35m␛[KCargo.toml␛[0m␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[K␛[36m␛[K:␛[0m␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[K␛[32m␛[K5␛[0m␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[K␛[36m␛[K:␛[0m␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[m␛[Khomepage·=·"https://github.com/sharkdp/␛[0m␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[01;31m␛[Kbat␛[0m␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[m␛[K"␛[0m␊
␛[38;5;238m···2␛[0m···␛[38;5;238m\u{2502}␛[0m·␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[35m␛[KCargo.toml␛[0m␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[K␛[36m␛[K:␛[0m␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[K␛[32m␛[K7␛[0m␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[K␛[36m␛[K:␛[0m␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[m␛[Kname·=·"␛[0m␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[01;31m␛[Kbat␛[0m␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[m␛[K"␛[0m␊
␛[38;5;238m···3␛[0m···␛[38;5;238m\u{2502}␛[0m·␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[35m␛[KCargo.toml␛[0m␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[K␛[36m␛[K:␛[0m␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[K␛[32m␛[K8␛[0m␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[K␛[36m␛[K:␛[0m␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[m␛[Krepository·=·"https://github.com/sharkdp/␛[0m␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[01;31m␛[Kbat␛[0m␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[m␛[K"␛[0m␊
␛[38;5;238m···4␛[0m···␛[38;5;238m\u{2502}␛[0m·␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[35m␛[KCargo.toml␛[0m␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[K␛[36m␛[K:␛[0m␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[K␛[32m␛[K19␛[0m␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[K␛[36m␛[K:␛[0m␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[m␛[K#·Feature·required·for·␛[0m␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[01;31m␛[Kbat␛[0m␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[m␛[K·the·application.·Should·be·dis␛[0m␊
␛[38;5;238m····␛[0m···␛[38;5;238m\u{2502}␛[0m·␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[m␛[Kabled·when·depending·on␛[0m␊
␛[38;5;238m···5␛[0m···␛[38;5;238m\u{2502}␛[0m·␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[35m␛[KCargo.toml␛[0m␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[K␛[36m␛[K:␛[0m␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[K␛[32m␛[K20␛[0m␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[K␛[36m␛[K:␛[0m␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[m␛[K#·␛[0m␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[01;31m␛[Kbat␛[0m␛[38;2;248;248;242m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[35m␛[m␛[36m␛[m␛[32m␛[m␛[36m␛[m␛[01;31m␛[m␛[m␛[K·as·a·library.␛[0m␊
␛[38;5;238m\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2534}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}\u{2500}␛[0m␊

There's a giant, ever-increasing string of redundant color codes... Turns out my theory was right. When we use decorations and wrapping, bat is printing out every color code it sees, even if they're redundant by that point. That's going to be a big issue for long files that contain ANSI colors, and this is going to need a fix.
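
To illustrate why this growth matters, here is a toy model. It is not bat's actual code: the function names are hypothetical, and it assumes the simplest possible behavior, where every sequence seen so far is kept in a history string and replayed in front of each text segment. The second function replays only the current sequence, for comparison.

```rust
/// Toy model of the behavior described above; not bat's actual code.
/// `history` keeps every escape sequence seen so far and is replayed
/// in front of each text segment.
fn emit_with_full_history(segments: &[(&str, &str)]) -> String {
    let mut history = String::new();
    let mut out = String::new();
    for (escape, text) in segments {
        history.push_str(escape); // never pruned, so it only grows
        out.push_str(&history);
        out.push_str(text);
        out.push_str("\x1b[0m");
    }
    out
}

/// Same input, but earlier sequences are forgotten and only the
/// current one is emitted.
fn emit_latest_only(segments: &[(&str, &str)]) -> String {
    let mut out = String::new();
    for (escape, text) in segments {
        out.push_str(escape);
        out.push_str(text);
        out.push_str("\x1b[0m");
    }
    out
}

fn main() {
    // 1000 colored segments, each consisting of ESC[35m followed by "x".
    let segments: Vec<(&str, &str)> = vec![("\x1b[35m", "x"); 1000];
    println!("full history: {} bytes", emit_with_full_history(&segments).len());
    println!("latest only:  {} bytes", emit_latest_only(&segments).len());
}
```

With 1000 segments the first version produces 2,507,500 bytes and the second 10,000, so the output grows quadratically with the number of color changes rather than linearly.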

@eth-p eth-p added this to the v0.18.0 milestone Jan 9, 2021
@eth-p eth-p added the bug Something isn't working label Jan 9, 2021
@eth-p eth-p self-assigned this Jan 9, 2021
@eth-p eth-p mentioned this issue Jan 9, 2021
@sharkdp
Owner

sharkdp commented Jan 10, 2021

Great analysis!

Before I get into details, I would just like to say that the following three lines made this very difficult to figure out 😛

Ouch. Too much magic 😄

@sharkdp
Owner

sharkdp commented Jan 10, 2021

Related to #1147?

I think this is exactly the same. Try

git log --color --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr) %C(bold blue)<%an>%Creset' --abbrev-commit | bat  --color=always --decorations=always --terminal-width=80 --wrap=character | bat -Ap

or see the slow output directly:

git log --color --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr) %C(bold blue)<%an>%Creset' --abbrev-commit | bat -P

It also creates ever-increasing lists of ANSI sequences.

@sharkdp
Owner

sharkdp commented Jan 10, 2021

Also see my perf analysis in #1147. That explains why it's so slow.

@eth-p
Collaborator

eth-p commented Jan 10, 2021

Hmm, this is more of a widespread issue than I thought.

I think it's time for me to pay off the accrued technical debt for my wrapping implementation from way back when.

Here's my plan:

At the moment, I'm working on writing a (hopefully dependency-free) crate for properly handling the ANSI escape sequences. The idea is that we'll feed the extracted sequences into a struct, which will interpret them and build an expected state for the foreground and background color (and other attributes). This state can be used to build ANSI sequences that we can emit, rather than storing a super long string of redundant escape sequences. I'll replace the current implementation of ANSI color passthrough with that.
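
The idea in this plan could look roughly like the following. This is only a sketch under assumptions, not the crate being described: the `SgrState` type, its fields, and its methods are all hypothetical, and it models just bold plus foreground and background colors.

```rust
/// Hypothetical sketch of an SGR ("Select Graphic Rendition") state tracker.
/// All names here are made up for illustration.
#[derive(Debug, Default, Clone, PartialEq)]
struct SgrState {
    foreground: Option<String>, // e.g. "35" or "38;2;248;248;242"
    background: Option<String>, // e.g. "44" or "48;5;238"
    bold: bool,
}

impl SgrState {
    /// Applies the parameters of one `ESC [ ... m` sequence, e.g. "01;31".
    /// An empty parameter is treated as 0 (reset).
    fn apply(&mut self, params: &str) {
        let mut codes = params.split(';').map(|p| p.parse::<u16>().unwrap_or(0));
        while let Some(code) = codes.next() {
            match code {
                0 => *self = SgrState::default(),
                1 => self.bold = true,
                22 => self.bold = false,
                30..=37 | 90..=97 => self.foreground = Some(code.to_string()),
                39 => self.foreground = None,
                40..=47 | 100..=107 => self.background = Some(code.to_string()),
                49 => self.background = None,
                38 | 48 => {
                    // Extended color: "5;n" (palette) or "2;r;g;b" (RGB).
                    let mode = codes.next().unwrap_or(0);
                    let arg_count = match mode {
                        5 => 1,
                        2 => 3,
                        _ => continue, // unknown mode: ignore
                    };
                    let mut color = format!("{};{}", code, mode);
                    for _ in 0..arg_count {
                        color.push_str(&format!(";{}", codes.next().unwrap_or(0)));
                    }
                    if code == 38 {
                        self.foreground = Some(color);
                    } else {
                        self.background = Some(color);
                    }
                }
                _ => {} // attributes not modeled in this sketch
            }
        }
    }

    /// Builds one compact sequence that restores the tracked state.
    fn to_sequence(&self) -> String {
        let mut parts: Vec<String> = Vec::new();
        if self.bold {
            parts.push("1".to_string());
        }
        parts.extend(self.foreground.clone());
        parts.extend(self.background.clone());
        if parts.is_empty() {
            String::new()
        } else {
            format!("\x1b[{}m", parts.join(";"))
        }
    }
}

fn main() {
    let mut state = SgrState::default();
    // Parameters of the SGR sequences in the grep output earlier in this thread.
    for params in ["35", "", "36", "", "32", "", "01;31"] {
        state.apply(params);
    }
    // Only the state that is still in effect is emitted: ESC[1;31m
    println!("{:?}", state.to_sequence());
}
```

However many sequences are fed in, the emitted sequence stays bounded by the number of tracked attributes, which is the property the current passthrough lacks.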

After that (or perhaps simultaneously), we could find a way to reduce the number of write! calls?
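
One common way to cut down on the number of writes that reach the underlying stream is to wrap it in a buffer. The sketch below shows the effect with the standard library's `BufWriter`; it is an assumption about what such a change might look like, not what bat does, and `CountingWriter` and `write_segments` are hypothetical names used only to make the difference visible.

```rust
use std::io::{self, BufWriter, Write};

/// Counts how many times the underlying writer is called.
/// Hypothetical helper used only to make the effect visible.
#[derive(Debug)]
struct CountingWriter {
    calls: usize,
    bytes: Vec<u8>,
}

impl Write for CountingWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.calls += 1;
        self.bytes.extend_from_slice(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Writes each segment with its own `write!` call.
fn write_segments<W: Write>(out: &mut W, segments: &[&str]) -> io::Result<()> {
    for segment in segments {
        write!(out, "{}", segment)?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let segments = ["\x1b[35m", "Cargo.toml", "\x1b[0m", "\n"];

    // Unbuffered: every write! reaches the underlying writer.
    let mut direct = CountingWriter { calls: 0, bytes: Vec::new() };
    write_segments(&mut direct, &segments)?;

    // Buffered: small writes are collected and handed over on flush.
    let mut buffered = BufWriter::new(CountingWriter { calls: 0, bytes: Vec::new() });
    write_segments(&mut buffered, &segments)?;
    buffered.flush()?;

    println!(
        "direct: {} calls, buffered: {} calls",
        direct.calls,
        buffered.get_ref().calls
    );
    Ok(())
}
```

The bytes produced are identical in both cases; only the number of calls into the underlying writer changes.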

What do you think?

@sharkdp
Owner

sharkdp commented Jan 10, 2021

What do you think?

Except for the reservations in #1499 (comment), that sounds great.

@trajano

trajano commented Mar 21, 2021

Well I think what you can do is have a configurable for switch in line of ms per update for bat to output the changes that it builds in its "virtual screen" something like virtual DOM of React.

@eth-p
Collaborator

eth-p commented Mar 23, 2021

Well I think what you can do is have a configurable for switch in line of ms per update for bat to output the changes that it builds in its "virtual screen" something like virtual DOM of React.

bat doesn't actually have a virtual screen. The virtual screen is created by the pager, which is most likely the program less. Either way though, my fix in #1596 takes the performance in your example from unusably slow to decent on my machine :)

@trajano

trajano commented Mar 23, 2021

@eth-p oh, I didn't realize you pipe it to less. Maybe that could be a future update, to not pass it to less. I switched to bat because I didn't like how less had way too many features that I may accidentally trigger (e.g. log), but I wanted to go backwards, so I couldn't use more.

@eth-p
Collaborator

eth-p commented Mar 23, 2021

@eth-p oh, I didn't realize you pipe it to less. Maybe that could be a future update, to not pass it to less. I switched to bat because I didn't like how less had way too many features that I may accidentally trigger (e.g. log), but I wanted to go backwards, so I couldn't use more.

It doesn't even have to be a future update, actually! You will have to rely on your terminal's scrollback buffer if you do it, but you can configure bat to permanently disable the pager.

Either put export BAT_PAGER='' in your .bash_profile, or add --paging=never to the bat config file.

@Enselic
Collaborator

Enselic commented Dec 8, 2021

Closed by #1596

@Enselic Enselic closed this as completed Dec 8, 2021