IIP Cursor Placement #57

Open
AnonymouX47 opened this issue Mar 5, 2024 · 13 comments

Comments

@AnonymouX47

Hello!

I recently realised the same cursor placement policies for sixels are used for IIP, by this implementation. This causes the behaviour for IIP to differ significantly (IMHO) from the reference implementation and every other implementation I'm aware of.

I'm aware cursor positioning is not specified in the document but I believe in such cases, the most sane choice is to comply with the reference implementation (particularly when the behaviour makes sense and is also followed by other implementations).

Deviations such as this cause application developers to introduce workarounds or special cases for such TEs, which isn't the best. See AnonymouX47/termvisage#9.

It'd be really good if cursor placement worked as expected, i.e. immediately to the right of the bottom-right-most cell touched by the image, without advancing the cursor to the next row when the image reaches the rightmost column of the screen (i.e. just as the cursor behaves when text reaches the rightmost column of a row).

In addition, I think the doNotMoveCursor extension (implemented by WezTerm and Konsole, if I recall correctly) would be a worthy addition, if feasible. See wez/wezterm#1433 (by Autumn Lamonte, author of Jexer). The WezTerm docs state:

WezTerm supports an extension to the protocol; passing doNotMoveCursor=1 as an argument to the File escape sequence causes wezterm to not move the cursor position after processing the image.
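
For reference, a minimal sketch of how an application might build the File sequence with this extension. The helper name `iip_sequence` and its parameters are made up for illustration. The `OSC 1337 ; File=<args> : <base64> BEL` framing and the `size`, `width`, `height` and `inline` arguments follow the iTerm2 inline images documentation, and `doNotMoveCursor=1` is the WezTerm extension quoted above. How a terminal without the extension treats the extra argument is not covered here.

```python
import base64

ESC = "\x1b"
BEL = "\x07"


def iip_sequence(data: bytes, *, width: str = "auto", height: str = "auto",
                 do_not_move_cursor: bool = False) -> str:
    """Build an iTerm2-style inline image sequence (hypothetical helper)."""
    args = {
        "size": str(len(data)),   # size of the raw (unencoded) file in bytes
        "width": width,           # e.g. "auto", "10" (cells), "100px", "50%"
        "height": height,
        "inline": "1",            # display the file instead of downloading it
    }
    if do_not_move_cursor:
        # WezTerm extension: leave the cursor where it was before drawing.
        args["doNotMoveCursor"] = "1"
    arg_str = ";".join(f"{key}={value}" for key, value in args.items())
    payload = base64.b64encode(data).decode("ascii")
    return f"{ESC}]1337;File={arg_str}:{payload}{BEL}"
```

Writing the returned string to the terminal (e.g. `sys.stdout.write(iip_sequence(png_bytes, do_not_move_cursor=True))`) is all that is needed on the application side.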

Thank you. 🙏


For better context as to why I said "significantly" above:

In Wezterm:

simplescreenrecorder-2024-03-05_05.13.24.mp4

In xterm.js (via ttyd), where portions of images (and in other cases, entire images) get overwritten by the TUI framework due to the misaligned cursor placement:

simplescreenrecorder-2024-03-05_05.08.47.mp4

For the record, this is the same application using unicode blocks to display images (to show that the issue is not an incompatibility between the framework and xterm.js):

simplescreenrecorder-2024-03-05_05.23.18.mp4

Side question:

Are you aware of any means for a program to detect that it's running within xterm.js (and maybe its version)?

XTVERSION (CSI > q) doesn't seem to be implemented and I don't think env vars are viable as I don't think they can be set by xterm.js.
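
For terminals that do implement XTVERSION, detection could look roughly like the sketch below. The query is `CSI > q` and, per xterm's control sequences documentation, the reply has the form `DCS > | text ST`. The function name is hypothetical, only the 7-bit `ESC P` / `ESC \` forms are handled, and actually reading the reply from the tty (raw mode, timeout) is left out.

```python
import re
from typing import Optional

# XTVERSION query: CSI > q
XTVERSION_QUERY = "\x1b[>q"

# Expected reply: DCS > | text ST, with DCS = ESC P and ST = ESC \
_XTVERSION_REPLY = re.compile(r"\x1bP>\|(.*?)\x1b\\", re.DOTALL)


def parse_xtversion_reply(reply: str) -> Optional[str]:
    """Return the name/version text from an XTVERSION reply, or None."""
    match = _XTVERSION_REPLY.search(reply)
    return match.group(1) if match else None
```

A terminal that doesn't implement the query simply never answers, so a caller would need a timeout and a fallback.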

@jerch
Owner

jerch commented Mar 5, 2024

First regarding your side question:

Are you aware of any means for a program to detect that it's running within xterm.js (and maybe its version)?

Sadly no, I don't. I tried to address the issue several times in the xterm.js repo, but it never really led to a working solution. The main issue around that topic is the fact that xterm.js is a TE lib that ppl build their own TE with. Also, the lib is highly customizable in its exposed features, up to additions from addons, thus xterm.js != xterm.js, plus there are real changes in VT support between different versions.
new attempt: xtermjs/xterm.js#4982

About the cursor positioning with IIP:
I did that to uniformly handle the cursor for different image sequences. In the beginning the addon supported all types of cursor modes (from xterm, mlterm etc.), but after a detailed discussion with @j4james and @hackerb9 it became clear to me how flawed the whole situation is and that the left-bottom corner of a graphics printout is the only determined point across all formats, so I removed the madness altogether.
Or to say it differently - yes I deviate here intentionally from what iTerm implements for its sequence, but since it's not documented behavior, it cannot be expected as such.

First I think a uniform behavior is better than having 10 different cursor modes working only for sequence XY, thus I am reluctant to change to "your expectation" here. Same goes for the wezterm addition, it is just yet another cursor mode working only with IIP.

If we really want to solve the cursor positioning after graphics output once and for all, I'd say we need a sane default (which is VT340 mode - left-bottom edge), and maybe introduce a terminal mode selecting one of the other graphics corners, where applicable, e.g. 0 - bottom-left (default, always for sixel level 1), 1 - top-left (same as wezterm addition), 2 - top-right, 3 - bottom-right (IIP in iTerm).
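
A rough sketch of what such a mode would compute. No such terminal mode exists; the `Corner` enum (numbered as in the list above) and `cursor_after_image` are purely illustrative. The sketch assumes 0-based cell coordinates, that "bottom" means the last row touched by the image, and that the right-hand corners place the cursor one column past the image, as iTerm2 does for IIP. Screen edges and scrolling are ignored.

```python
from enum import IntEnum
from typing import Tuple


class Corner(IntEnum):
    """Hypothetical values for a cursor-corner terminal mode."""
    BOTTOM_LEFT = 0   # VT340-style default
    TOP_LEFT = 1      # same effect as doNotMoveCursor=1
    TOP_RIGHT = 2
    BOTTOM_RIGHT = 3  # iTerm2's IIP behaviour


def cursor_after_image(row: int, col: int, width: int, height: int,
                       corner: Corner = Corner.BOTTOM_LEFT) -> Tuple[int, int]:
    """Cursor cell after drawing a width x height image at (row, col)."""
    at_bottom = corner in (Corner.BOTTOM_LEFT, Corner.BOTTOM_RIGHT)
    at_right = corner in (Corner.TOP_RIGHT, Corner.BOTTOM_RIGHT)
    new_row = row + height - 1 if at_bottom else row
    new_col = col + width if at_right else col
    return new_row, new_col
```

Because the result depends only on the image rectangle, the same function would serve sixel, IIP or any other rectangular graphics sequence.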

@AnonymouX47
Author

First regarding your side question:
...

Hmm... True. I'll probably chime in to the new discussion if I have any ideas to contribute. Thanks

About the cursor positioning with IIP:
...

Hmm... interesting. 🤔

First of all, were you, j4james and hackerb9 discussing sixels specifically or terminal graphics in general?

I sincerely don't think introducing a new sequence is a viable solution, as it'll only end up widening the problem we're trying to solve. Yes, uniformity across all protocols is desirable but we both know it'll never happen for various reasons. I think uniformity across implementations of the same protocol is what matters, and it's really sufficient for any application, i.e. I know I'm emitting sequence X and that Y is the expected cursor position after the graphics is drawn.

One graphics protocol/sequence really doesn't/shouldn't affect the other. On that note, I also wanted to mention earlier the fact that DEC{SET,RST} 80 (sixel scrolling) affects IIP, but decided to focus on the main thing.

@jerch
Owner

jerch commented Mar 5, 2024

Yes, the discussion was about sixel, as that was the only widespread graphics sequence where ppl had already implemented tons of alternative cursor mode ideas, which are all faulty for sixel in particular; only the bottom-left corner works reliably here.

Yes, uniformity across all protocols is desirable but we both know it'll never happen for various reasons.

Well, at least xterm has changed its cursor positioning to VT340 mode after it became clear that the old mode had serious flaws. Graphics output in terminals is still very alpha in many regards, so I think most TE maintainers are willing to change it if there are good reasons to do so (minus kitty).
IIP does not spec its cursor handling, so in the sense of uniformity between graphics sequences the lowest common denominator would be the VT340 mode here as well. I think this would just have to be discussed with @gnachman and @wez (and all other TEs implementing IIP), to see whether they would agree on a more uniform handling.

About a possible corner marking sequence for cursor positioning:
I don't think that such a sequence is really needed; my suggestion is just to make life easier for app devs and not to repeat subspeccing it on sequence XY. The idea is pretty simple - place the cursor after a rectangular graphics sequence at the selected corner, again defaulting to vt340. Implemented as a terminal mode, it is no longer important which transport sequence was used for the image/graphics.

About sixel scrolling:
That's implemented as a terminal mode, thus applies to any graphics sequence.

On a sidenote
Imho IIP is a much better graphics protocol than sixel, I really don't get why ppl insist on using sixel (besides for historical reasons).

@pmp-p

pmp-p commented Mar 5, 2024

Imho IIP is a much better graphics protocol than sixel, I really don't get why ppl insist on using sixel (besides for historical reasons).

Interesting, what exactly is IIP? And is it available on all OS distributions as a portable (and signed ...) application, and also in xterm.js?

windows : mintty provides sixel.
linux : mlterm provides sixel.
web: xterm.js provides sixel.
osx/ios: safari + xterm.js of course !

@jerch
Owner

jerch commented Mar 5, 2024

@pmp-p

Interesting, what exactly is IIP? And is it available on all OS distributions as a portable (and signed ...) application, and also in xterm.js?

IIP is the "Inline Images Protocol" as invented by iTerm2. You don't need any application or special lib for that; it basically consists of a sequence with a base64-encoded PNG/JPEG payload.

Docs: https://iterm2.com/documentation-images.html
Example in Bash: https://iterm2.com/utilities/imgcat

Currently not many TEs support it; most opted for sixel instead. Which is a pity, since it has much better quality and can be handled by any semi-experienced developer with stdlibs.

@AnonymouX47
Author

AnonymouX47 commented Mar 5, 2024

@jerch

Without a doubt, I agree the bottom-left cursor placement is the most reasonable, straightforward and reliable for various reasons (I think documenting these reasons will be necessary for a pitch). Even if a protocol has alternate modes, this should be the default IMO. My concern is mainly about the amount of time and effort it'll take to make this go round (assuming other TE maintainers would be willing to change), and just like you, there's the case of those for which I have low hopes.

On the happier side, changing the cursor placement for IIP to the bottom-left cell shouldn't have much adverse effect on applications, as I'm not aware of any project other than mine (and maybe Jexer, which has been archived for a while) that depends on the cursor's horizontal position after drawing graphics.


Like you said, I don't think the corner marking sequence is needed.


About sixel scrolling:
That's implemented as a terminal mode, thus applies to any graphics sequence.

I see.


On a sidenote
Imho IIP is a much better graphics protocol than sixel, I really don't get why ppl insist on using sixel (besides for historical reasons)

Same thoughts...

Without a doubt, IIP is far easier, both for TE and app devs, and provides a richer feature set. Though, I guess some factors would be:

  • XTerm implemented sixels a long time ago and many TEs have followed in its steps.
  • A higher percentage of TE users are only aware of sixels (also kinda due to the previous reason) and that's mostly what gets mentioned in feature requests.
  • IIP would require external dependencies for decoding various image formats, unlike sixels.

Side note:

IIP is the "Inline Images Protocol" as invented by iTerm2. You don't need any application or special lib for that; it basically consists of a sequence with a base64-encoded PNG/JPEG payload.

That reminds me, I recently tried out chafa's IIP output format and images weren't displayed. I chased down the possible causes and it boils down to the fact that chafa currently uses uncompressed TIFF format for IIP. So, it's either:

  • this addon lacks TIFF support (most likely)
  • payload sizes are too high, since they're uncompressed

The iTerm2 doc actually states:

Any image format that macOS supports will display inline, including PDF, PICT, or any number of bitmap data formats (PNG, GIF, etc.).

thereby leaving supported formats essentially unspecified/open-ended.
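
Since the protocol doesn't say which format a payload is in, a receiver has to inspect the data itself. Below is a minimal sketch of sniffing the format from the leading magic bytes of the base64 payload. The function name and the small set of formats are illustrative only, and this is not a description of how this addon or iTerm2 actually does it.

```python
import base64
from typing import Optional

# Well-known file signatures (magic bytes) for a few common formats.
_MAGIC = (
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"II*\x00", "tiff"),  # little-endian TIFF
    (b"MM\x00*", "tiff"),  # big-endian TIFF
)


def sniff_iip_payload(b64_payload: str) -> Optional[str]:
    """Guess the image format of a base64 IIP payload, or return None."""
    # 16 base64 characters decode to 12 bytes, enough for the signatures above.
    head = base64.b64decode(b64_payload[:16])
    for magic, name in _MAGIC:
        if head.startswith(magic):
            return name
    return None
```

With something like this, a terminal could at least report "unsupported format: tiff" instead of silently showing nothing.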

I discussed this with @hpjansson (on the chafa matrix chat), and his major reason was that PNG encoding would require an external dependency as it's non-trivial, unlike uncompressed TIFF... but now he's willing to allow an external dependency for PNG encoding.

If you would like me to open a new issue for TIFF support, please let me know.

EDIT: GuardKenzie/chafa.py#54 was actually what triggered my investigation.

@jerch
Owner

jerch commented Mar 5, 2024

  • IIP would require external dependencies for decoding various image formats, unlike sixels

Yeah, that is indeed a downside, up to security issues in foreign libs for more complicated formats. I am really a fan of QOI in this regard; such a small code size makes it almost impossible for bugs to slip in. And it is still reasonably fast with good enough compression, at least for local delivery.

about TIFF support:
Yes, I already figured that out several weeks ago when I was chatting with @hpjansson. Browsers don't provide built-in TIFF support, so this has to be built in JS/wasm. Not sure yet about its complexity, but I have that on my longer TODO list, so yes - plz feel free to create a feature request for it.
(Sorry for not showing up in the matrix channel for a while, have pretty limited time atm...)

@hpjansson

An open-ended TIFF decoder would be complex to implement from scratch. @AnonymouX47 mentioned this places an unfair burden on the TE, and I agree. I'll move Chafa to emit PNG in the future. Haven't decided whether to do so with an external dep or embedded. The CLI application embeds a PNG decoder already.

(Sorry for not showing up in the matrix channel for a while, have pretty limited time atm...)

Figured :-) Similar thing going on here, it's that time of the year I think.

@gnachman

gnachman commented Mar 5, 2024

As far as cursor positioning goes, I should specify it. The cursor should be after the last cell in the image. Future bidi support could affect the definition of "last", but for now that is the bottom right corner. That could leave the cursor in the position after the last column.

For an image taking width x height cells, you would move the cursor to its new position by following these steps:

  1. Execute height linefeeds
  2. Move the cursor right by width cells.

For image decoding, I strongly recommend doing that out-of-process. iTerm2 uses a sandbox like Chrome and there has never been a security issue.

Regarding formats, I think it makes sense to aim for parity with web browsers; at a minimum that's JPEG, PNG, GIF, SVG, WebP, BMP, TIFF, ICO.

@AnonymouX47
Author

@jerch

  • IIP would require external dependencies for decoding various image formats, unlike sixels

Yeah, that is indeed a downside, up to security issues in foreign libs for more complicated formats. I am really a fan of QOI in this regard; such a small code size makes it almost impossible for bugs to slip in. And it is still reasonably fast with good enough compression, at least for local delivery.

If only it were as widely adopted already. I see it's propagating fast but still not sufficiently widely supported.

about TIFF support:
Yes, I already figured that out several weeks ago when I was chatting with @hpjansson. Browsers don't provide built-in TIFF support, so this has to be built in JS/wasm. Not sure yet about its complexity, but I have that on my longer TODO list, so yes - plz feel free to create a feature request for it.

I see... I'm really not that keen on it due to my understanding of the required complexity, and since it's already on your TODO list, it's okay.

(Sorry for not showing up in the matrix channel for a while, have pretty limited time atm...)

That's totally understandable.


@gnachman

That could leave the cursor in the position after the last column.

@jerch, I guess this was what I was referring to by "just as the cursor behaves when text reaches the rightmost column of a row" in the original post.

  1. Execute height linefeeds

Umm... shouldn't this be height - 1? 🤔

For image decoding, I strongly recommend doing that out-of-process. iTerm2 uses a sandbox like Chrome and there has never been a security issue.

Not sure how feasible that is in a web browser though 🤔... @jerch, any ideas?

@gnachman

gnachman commented Mar 6, 2024

  1. Execute height linefeeds

Umm... shouldn't this be height - 1? 🤔

Uh, yeah. Code reading fail.
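
With the corrected count (height - 1 linefeeds, then width cells to the right), the rule can be sketched as below. The names are hypothetical and this is not iTerm2's code. It assumes 0-based coordinates, that linefeeds on the bottom row scroll the screen instead of moving the cursor, and it models "the position after the last column" as the cursor sitting in the last column with a pending-wrap flag, the way terminals commonly treat text that reaches the right edge.

```python
from typing import NamedTuple


class Cursor(NamedTuple):
    row: int
    col: int
    pending_wrap: bool  # True when the cursor is "after the last column"


def iip_cursor_after(row: int, col: int, width: int, height: int,
                     screen_rows: int, screen_cols: int) -> Cursor:
    """Cursor after a width x height (in cells) image drawn at (row, col)."""
    # height - 1 linefeeds; once on the bottom row, further linefeeds scroll.
    new_row = min(row + height - 1, screen_rows - 1)
    # Then move right by width cells.
    target_col = col + width
    if target_col >= screen_cols:
        # Image reached the right edge: stay on this row, do not wrap yet.
        return Cursor(new_row, screen_cols - 1, True)
    return Cursor(new_row, target_col, False)
```

For a 10x4 image drawn at the top-left of an 80x24 screen this gives row 3, column 10, i.e. the cell immediately right of the image's bottom-right cell.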

@hackerb9

hackerb9 commented Mar 7, 2024

On a sidenote
Imho IIP is a much better graphics protocol than sixel, I really don't get why ppl insist on using sixel (besides for historical reasons).

A sidenote to the sidenote

I don't insist on using sixels, but I use them because all the other alternatives I've looked into have been half-baked or were aimed at making the console more like a GUI. Sixels aren't a GUI or even a TUI -- they are integrated with the character cells, so any graphics protocol that treats images as a separate layer from the text is already worse for my use case: an infinitely scrolling command-line.

Don't get me wrong. Sixels are crufty and the limitations are godawful. I look forward to something better to replace them, but I'm not holding my breath. And, in the meantime, I keep using sixels because they meet my needs.

...Oh, and most of all, because xterm doesn't support IIP. (Or kitty) (Or any of the defunct streaming image protocols, like TGP, or RIPscrip, or Turbo56k, or RLE 😄 ).

@AnonymouX47
Author

@jerch, in view of the last couple comments, does your stance:

Or to say it differently - yes I deviate here intentionally from what iTerm implements for its sequence, but since it's not documented behavior, it cannot be expected as such.

remain the same?

6 participants