timg is not selecting sixel on Windows Terminal Preview #145

Open
gestate-8 opened this issue Jan 2, 2025 · 32 comments

Comments

@gestate-8

Windows Terminal Preview supports sixel images.

image

timg is not auto-detecting sixel support on WT.

timg -ps works well.

@j4james
Contributor

j4james commented Jan 3, 2025

The way timg currently detects the supported protocol is with an XTVERSION query that returns the terminal name, which is compared against a hardcoded list of terminals. This is not ideal, and misses quite a few existing Sixel terminals.

The preferred way to detect Sixel is with a Primary Device Attributes query, which should report a 4 in the extension list on terminals that support Sixel.
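
As a rough illustration of that check, here is a minimal sketch of parsing a Primary Device Attributes (DA1) reply. `Da1ReportsSixel` is a hypothetical helper, not timg's actual code. It assumes the reply has the shape `ESC [ ? <level> ; <ext> ; ... c`, treats the first parameter as the conformance level, and compares whole parameters so that a value such as 42 is not mistaken for 4.

```cpp
#include <string_view>

// Hypothetical helper (not timg's implementation): returns true if a DA1
// reply of the form  ESC [ ? <level> ; <ext> ; <ext> ... c  lists the
// Sixel extension, which is reported as the parameter "4".
bool Da1ReportsSixel(std::string_view response) {
  constexpr std::string_view kPrefix = "\033[?";
  const size_t start = response.find(kPrefix);
  if (start == std::string_view::npos) return false;
  const size_t end = response.find('c', start + kPrefix.size());
  if (end == std::string_view::npos) return false;

  std::string_view params =
      response.substr(start + kPrefix.size(), end - start - kPrefix.size());

  // Skip the first parameter: it is the conformance level, not an extension.
  const size_t first_sep = params.find(';');
  if (first_sep == std::string_view::npos) return false;
  params.remove_prefix(first_sep + 1);

  // Compare whole parameters so that "42" is never mistaken for "4".
  while (true) {
    const size_t sep = params.find(';');
    const std::string_view token = params.substr(0, sep);
    if (token == "4") return true;
    if (sep == std::string_view::npos) break;
    params.remove_prefix(sep + 1);
  }
  return false;
}
```

Because the sketch only looks for the `ESC [ ?` prefix and the final `c`, it does not depend on the conformance level having any particular value.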

@hzeller
Owner

hzeller commented Jan 3, 2025

Determining the features is a careful dance, as not all queries to determine features will work and some actually end up messing up some terminals as they can't deal with the escape codes they were given (e.g. the kitty graphics query is known to make some terminals glitch). So only queries that don't interfere are selected for such things.
Having said that, Primary Device Attributes seems benign, so this might indeed be a good fallback solution.

There is one additional property to figure out: sixel terminal implementations behave slightly differently with respect to cursor placement after an image has been sent, which currently requires manual testing and hardcoding (setting known_broken_sixel_cursor_placement in the timg TermGraphicsInfo). So there is a good chance that, even if we know the terminal supports sixel, we pick the wrong workaround.

Anyway, while I investigate implementing the fallback solution, let's add your terminal to the hardcoded list.
What is the output in your terminal when you emit the CSI >q query manually?

echo -e "\033[>q"

To determine what kind of sixel workaround we need: if you run an animation

timg -ps someanimation.gif

Does it work correctly, or is the image 'wandering' up or down?

... or show multiple images in a grid:

timg -ps --grid=3 *.png

... do they align properly, or do they look like they are arranged in a staircase?

hzeller added a commit that referenced this issue Jan 3, 2025
Use this as a fallback query if the primary query based on environment
variables and terminal name is not conclusive. The DA1 query might
tell us if the terminals support the Sixel protocol.

Best effort: we will still not know if the newline workaround is needed.

While at it, some smallish clean-ups, e.g. low-level memmem() is only
used in one location.

Issue #145
@hzeller
Owner

hzeller commented Jan 3, 2025

I've implemented the Primary Device Attributes query to determine Sixel capability. Please try again from head to see if this works for you (please also test whether animations and grid work as described before, as that behavior can't be auto-detected AFAIK).

@gestate-8
Author

gestate-8 commented Jan 3, 2025

echo -e "\033[>q"

Nothing prints. (Tested on vanilla bash shell)

timg -ps someanimation.gif

The image is pushed to the top of the terminal, flickering between a random colored background and frames.
The default terminal background is black.
But at least it does play.

image

timg -ps --grid=3 *.png

This works well, but also with a random colored background.

image

This was tested in latest commit.

❯ timg --version
timg 1.6.1+  <https://timg.sh/>
Copyright (c) 2016..2024 Henner Zeller. This program is free software; license GPL 2.0.

Image decoding GraphicsMagick 1.3.45 (2024-08-27)
Openslide 4.0.0
Turbo JPEG
librsvg 2.59.2 + cairo 1.18.2
PDF rendering with poppler 24.12.0 + cairo 1.18.2
QOI image loading
swscale 8.3.100
Video decoding libav 61.7.100; avdevice 61.3.100
Libsixel version 1.10.3
Half, quarter, iterm2, and kitty graphics output: timg builtin.

@hzeller
Owner

hzeller commented Jan 3, 2025

With the latest commit: does it work without -ps ?

@gestate-8
Author

No, it still uses fallback with Unicode characters.

@hzeller
Owner

hzeller commented Jan 3, 2025

Sounds like the terminal is neither implementing the >q query nor the primary device attributes, so we don't have anything to auto-detect. Maybe file a bug with the terminal authors.

For now, I recommend setting the environment variable TIMG_PIXELATION=sixel so that you can use timg without having to remember to pass -ps every time.

@gestate-8
Author

Thank you. I will make an issue on the WT repository.

@j4james
Contributor

j4james commented Jan 3, 2025

Sounds like the terminal is neither implementing the >q query nor the primary device attributes

Windows Terminal doesn't support the >q query, but it does support device attributes. And your latest commit does work for me without the -ps option (built and run from within a WSL session).

I have some feedback on your fix though. If I'm reading it correctly, I think the test below will produce false positives on terminals that implement features like the Latin-2 character set (which is identified by the number 42), but which don't support Sixel.

if (find_str(start + strlen(kExpect), end - start, ";4")) {

And regarding the comment below, that first value identifies the terminal conformance level. Technically it could be anything in the range 61 to 65. The example in the linked documentation is 64 because it's identifying a level 4 device.

timg/src/term-query.cc

Lines 285 to 288 in dbe0299

// https://vt100.net/docs/vt510-rm/DA1.html
// says CSI ?64 is returned, but I've only encountered
// CSI ?62 in various terminals.
// So, let's just watch for common prefix CSI ?6

So what you have there is technically correct as it is, but it's worth noting that some modern terminal emulators don't use a strictly correct DA response, so you may fail to detect them. Some examples that I've encountered in the past:

  • Rxvt (with a Sixel patch) - \033[?1;2;4c
  • St (with a Sixel patch) - \033[?12;4c

I don't think it's a big deal if you choose to ignore those, though, since people can always still use the -ps option.

This works well, but also with a random colored background.

This is the result of timg wrapping the libsixel output with an extra Sixel introducer. See here:

WriteStringToOutBuffer("\033Pq", &out); // Start sixel data

So the start of the Sixel image sequence actually ends up being \033Pq\033Pq. That's interpreted as an additional "blank" image, and because it has no parameters, it triggers a background erase, and with no declared dimension, the default size is the maximum extent of the screen. As a result, it'll generate a background fill from the current cursor position up to the bottom right of the page. You may not see that on a lot of terminals, because most don't implement background erase correctly.

There's also an unnecessary extra ST terminator here:

WriteStringToOutBuffer("\033\\", &out); // end sixel data
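
One defensive way to avoid the doubled framing is sketched below. `FrameSixel` is a hypothetical helper, not part of timg or libsixel. It assumes the encoder's output may or may not already begin with a DCS introducer (`ESC P`) and end with ST (`ESC \`), and adds each only when it is missing.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper, not timg's actual code: frame a Sixel payload with
// the DCS introducer (ESC P q) and the ST terminator (ESC \) exactly once.
std::string FrameSixel(std::string_view payload) {
  constexpr std::string_view kDcs = "\033P";     // start of any DCS sequence
  constexpr std::string_view kIntro = "\033Pq";  // parameterless Sixel intro
  constexpr std::string_view kSt = "\033\\";     // string terminator

  std::string out;
  // Only add an introducer if the encoder has not already emitted one.
  if (payload.substr(0, kDcs.size()) != kDcs) out.append(kIntro);
  out.append(payload);
  // Likewise, only terminate if the payload is not already terminated.
  const bool terminated =
      payload.size() >= kSt.size() &&
      payload.substr(payload.size() - kSt.size()) == kSt;
  if (!terminated) out.append(kSt);
  return out;
}
```

If the encoder is known to always emit its own introducer and terminator, simply dropping the two extra writes would be the simpler fix. The sketch only matters when that is uncertain.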

@hzeller
Owner

hzeller commented Jan 3, 2025

The ambiguity between 4 and 42 was deliberate as I see this as a best-effort fallback solution, and I have also not seen any terminal I played with return that. If it becomes a problem, this should be addressed (only a line more code). The DA responses that don't start with 6 might be more of an issue. Maybe it is a good idea to simply look for \033[? ... since we anyway have only one query queued and expect the answer.

If you see something in the sixel implementation that causes issues like the blank image situation, please send a pull request. I don't remember exactly why these code-pieces are in there, it might well be that it was needed when testing on the various terminals when I implemented it.

hzeller added a commit that referenced this issue Jan 3, 2025
@j4james
Contributor

j4james commented Jan 3, 2025

The ambiguity between 4 and 42 was deliberate as I see this as a best-effort fallback solution, and I have also not seen any terminal I played with return that.

I've personally seen a few - the current release version of Windows Terminal being one example - but I've just tested that now, and surprisingly it still works with the latest commit. I'm guessing it's because you can't detect the font size, and without that you won't even attempt to detect sixel. I'm not sure about the other terminals, but I suspect they probably wouldn't report the font size either, so this is likely not as big a deal as I initially thought.

If you see something in the sixel implementation that causes issues like the blank image situation, please send a pull request.

OK. I might give that a go this weekend.

hzeller added a commit that referenced this issue Jan 4, 2025
This can help figuring out details while attempting to detect new
terminals.

Issues: #145
@hzeller
Owner

hzeller commented Jan 4, 2025

I already changed the detection so that only 4 not 42 is detected.

I've now added information about what happens in the terminal query phase if --verbose is given. Can you compile timg from head and then call it with --verbose and some image ? Before the image, there should be some information about the terminal queries. I think this will help in the future when adding new auto-detection.

Output looks something like this (here on my local terminal):

Query: '\033[>q\033[5n' Response: '\033P>|Konsole 24.12.0\033\\\033[0n' (0ms)
Query: '\033]11;?\033\\' Response: '\033]11;rgb:0000/0000/0000\033\\' (0ms)
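
For readers unfamiliar with the reply format in the first line of that output, here is a minimal sketch of pulling the terminal name out of an XTVERSION reply. `ParseXtversion` is a hypothetical name, and the sketch assumes the reply is framed as `ESC P > | <text> ESC \`, as in the Konsole example above.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: extract the terminal name/version from an XTVERSION
// reply of the form  ESC P > | <text> ESC \ .  Returns nullopt if the reply
// contains no such sequence (e.g. only the CSI 0 n status report arrived).
std::optional<std::string> ParseXtversion(std::string_view response) {
  constexpr std::string_view kStart = "\033P>|";
  constexpr std::string_view kSt = "\033\\";
  const size_t start = response.find(kStart);
  if (start == std::string_view::npos) return std::nullopt;
  const size_t text_begin = start + kStart.size();
  const size_t end = response.find(kSt, text_begin);
  if (end == std::string_view::npos) return std::nullopt;
  return std::string(response.substr(text_begin, end - text_begin));
}
```

A reply that carries only the status report yields no name, so a caller would have to fall back to some other detection method.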

@j4james
Contributor

j4james commented Jan 4, 2025

I already changed the detection so that only 4 not 42 is detected.

Thanks. I only noticed that after I'd sent my message, but I've tested both versions now and can confirm that they both work.

And regarding the --verbose output, this is what I get on Windows Terminal Preview v1.22.2702.0 (which has Sixel support):

No usable TIOCGWINSZ, trying cell query.
Query: '\033[16t' Response: '\033[6;20;10t' (1ms)
Query: '\033[>q\033[5n' Response: '\033[0n' (0ms)
Query: '\033[c' Response: '\033[?61;4;6;7;14;21;22;23;24;28;32;42c' (2ms)
Query: '\033]11;?\033\\' Response: '\033]11;rgb:0c0c/0c0c/0c0c\033\\' (2ms)

Terminal cells: 80x24  cell-pixels: 10x20
Active Geometry: 78x22; canvas-pixels: 780x440
Effective pixelation: Using sixel graphics (with default cursor placement).
Background color for transparency 'auto'; effective RGB #0c0c0c
Alpha-channel merging with background color done by timg.
1 file (1 successful); 477.0 Bytes written (81.4 KiB/s) 1 frames
Environment variables
 TIMG_PIXELATION           (not set)
 TIMG_DEFAULT_TITLE        (not set)
 TIMG_ALLOW_FRAME_SKIP     (not set)
 TIMG_USE_UPPER_BLOCK      (not set)
 TIMG_FONT_WIDTH_CORRECT   (not set)

And this is what I get on the stable version, Windows Terminal v1.21.3231.0 (which doesn't have sixel):

No usable TIOCGWINSZ, trying cell query.
Query: '\033[16t' Response: '' (53ms)
Query: '\033]11;?\033\\' Response: '' (1504ms)

Terminal cells: 80x24  cell-pixels: -1x-2
Note: Terminal does not return ws_xpixel and ws_ypixel in TIOCGWINSZ ioctl or "\033[16t" query.
        ->Aspect ratio might be off.
        ->File a feature request with the terminal emulator program you use
Active Geometry: 78x22
Effective pixelation: Using quarter block.
Background color for transparency 'auto'; effective RGB #000000
Alpha-channel merging with background color done by timg.
1 file (1 successful); 3306.0 Bytes written (2175.0 Bytes/s) 1 frames
Environment variables
 TIMG_PIXELATION           (not set)
 TIMG_DEFAULT_TITLE        (not set)
 TIMG_ALLOW_FRAME_SKIP     (not set)
 TIMG_USE_UPPER_BLOCK      (not set)
 TIMG_FONT_WIDTH_CORRECT   (not set)

Note that TIOCGWINSZ pixel dimensions aren't supported on Windows, and the cell size query (CSI 16t) and palette query (OSC 11) were only recently added to Windows Terminal Preview, so everything there looks to me like it's working as expected.
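
To make the cell size query in the output above concrete, here is a minimal sketch of parsing its reply. `ParseCellSizeReply` and `CellSize` are hypothetical names. The sketch assumes the reply has the form `ESC [ 6 ; <height> ; <width> t`, which matches the `\033[6;20;10t` reply and the reported cell-pixels of 10x20.

```cpp
#include <cstdio>
#include <optional>
#include <string>

// Illustrative type: size of one character cell in pixels.
struct CellSize {
  int width_px;
  int height_px;
};

// Hypothetical helper: parse the reply to CSI 16 t, assumed to look like
// ESC [ 6 ; <height> ; <width> t.  Returns nullopt for an empty or
// unrelated reply, e.g. when the terminal does not answer the query.
std::optional<CellSize> ParseCellSizeReply(const std::string &response) {
  int height = 0, width = 0;
  char final_char = 0;
  const int matched = std::sscanf(response.c_str(), "\033[6;%d;%d%c",
                                  &height, &width, &final_char);
  if (matched != 3 || final_char != 't') return std::nullopt;
  if (height <= 0 || width <= 0) return std::nullopt;
  return CellSize{width, height};
}
```

Note the order: the height comes before the width in the reply, which is easy to get backwards.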

@j4james
Contributor

j4james commented Jan 4, 2025

I should also mention that the "default cursor placement" is technically incorrect, so you tend to get text overlapping the bottom of images in Windows Terminal, and the grid layouts produce the staircase pattern you mentioned (you may have already noticed this in the screenshot that gestate-8 posted above). But if that works best for the majority of your users, then I'm not expecting you to fix it. Although a command line override for the cursor placement might be a useful option if you don't already have one.

@gestate-8
Author

gestate-8 commented Jan 4, 2025

I tested again and sixel works well without -ps or TIMG_PIXELATION=sixel on latest commit.

What does the 'staircase' effect refer to? Does that mean images not being centered in each grid column, making them appear like a staircase?
I also tested on my Linux machine using Wezterm, and it looked the same (left aligned), so I assumed this was intentional.

@hzeller
Owner

hzeller commented Jan 4, 2025

The staircase effect would be if the images in a grid were not all aligned vertically, but stepped one character up with each image in a row. So if that looks great, then I think we're good here.

@j4james will have a look if the outputs at the beginning and end of the image can actually be removed (which might improve the background color glitch)

@hzeller
Owner

hzeller commented Jan 7, 2025

As suggested in #145 (comment) I've now added a way to choose the handling of newlines in sixel terminals.

TIMG_SIXEL_NEWLINE_WORKAROUND
Set this environment variable to true or 1 if you are on a Sixel terminal and notice
that videos 'scroll' or grid-view items are not perfectly aligned
vertically.

With that, everything should be covered to fix this issue.

@j4james
Contributor

j4james commented Jan 7, 2025

With TIMG_SIXEL_NEWLINE_WORKAROUND set to 1, that's definitely an improvement, but videos do still scroll sometimes, and grid view items aren't always aligned. It just depends on the image height and how well it aligns with the cell height, or maybe the sixel row count. I'm not sure this is a solvable problem, though (other than just positioning the cursor manually). But the basic single image output works reasonably well now I think.

@hzeller
Owner

hzeller commented Jan 7, 2025

I think that is a problem of the terminal, not properly reporting the size in pixels of their fonts.
Unfortunately, terminals sometimes use fractional font sizes or have extra pixel lines between rows, so timg's calculation of how many rows to jump back is off, because it relies on an integer cell size.

I suspect it is something like that, so there is not much timg can do about it; it is probably worth reporting to the implementors of the terminal with some concrete examples (and possibly the output of timg --verbose and the cell-pixels it reports).

@j4james
Contributor

j4james commented Jan 7, 2025

I think that is a problem of the terminal, not properly reporting the size in pixels of their fonts.

The Windows Terminal sixel implementation is designed to be compatible with the original VT340, so it emulates a cell size of 10x20 pixels, regardless of the font used (and that's what it reports to apps querying CSI 16 t). Images are then scaled to match that cell size, so for example a 100x100 pixel image will always occupy exactly 10x5 cells.
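
The arithmetic behind that example can be sketched as follows. `CellsCovered` and `CellExtent` are illustrative names, and counting partially covered cells as occupied is an assumption of this sketch rather than something stated in the thread.

```cpp
// Illustrative type: extent of an image measured in character cells.
struct CellExtent {
  int columns;
  int rows;
};

// Sketch: number of cells an image touches when the terminal emulates a
// fixed cell size (10x20 pixels by default, as on the VT340). Partially
// covered cells are counted as occupied, hence the rounding up.
constexpr CellExtent CellsCovered(int width_px, int height_px,
                                  int cell_w_px = 10, int cell_h_px = 20) {
  return {(width_px + cell_w_px - 1) / cell_w_px,
          (height_px + cell_h_px - 1) / cell_h_px};
}
```

With the default 10x20 cell, `CellsCovered(100, 100)` gives 10 columns and 5 rows, matching the example in the comment above.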

The source of the problem is likely the algorithm used to determine where the cursor ends up after a sixel image has been output. On a real VT340 - which again is what Windows Terminal emulates - the final cursor position overlaps the bottom of the image (this ensures that sixels can be output on the last row of the display without triggering a scroll). To be more precise, the cursor should be on the text row that intersects the top of the last sixel row. So if the last sixel row spans two text rows, the cursor ends up on the top one (this allows for images inlined with text).

Very few modern terminals follow this algorithm correctly, though, and there is a lot of variation in the cursor placement algorithms that they do use. I think I counted around 8 last time I tested, although in practice it's probably not as bad as that for an app that always produces images in a specific format. It also doesn't help that terminals sometimes change the algorithm they're using, so it's a bit of a moving target.

In short, it's almost impossible for an app to predict where the cursor is going to end up for an image of arbitrary size. The fact that you've managed to narrow it down to one of two algorithms that work for the majority of your user base is impressive. But it's never going to work for everyone, all of the time.

@hzeller
Owner

hzeller commented Jan 7, 2025

Thanks for your insights! (now I am almost tempted to get an original vt340 from ebay if one shows up :) ).

timg always attempts to stay one row short to avoid triggering scrolling, to at least dodge that possible bullet. Also, the emitted image is adapted so that height_pixels % 6 == 0 to avoid weird rounding effects.

I am wondering what triggers the issues you see sometimes; it sounds like there might be off-by-one situations in which the terminal places the cursor differently from what timg thinks. Is it with smaller images that don't fill the full screen, but have a particular height? Or is it with images that fill the full terminal (e.g. with the -U option) and sometimes cross the scroll trigger?

@j4james
Contributor

j4james commented Jan 7, 2025

I am almost tempted to get an original vt340 from ebay if one shows up

There's a guy who did exactly that a couple of years ago, and he has a repo on github where he documents all the information he has gathered about the terminal. See hackerb9/vt340test.

And if you decide to get one of your own, he has some useful notes on what USB serial adapters work best for hooking up the terminal to a PC, and things of that sort (for example, see the flow control page).

I am wondering what triggers the issues you see sometimes

I haven't looked at it in much detail, but the one obvious pattern I noticed is when testing with an animated gif and setting the cell height with -gx. If I used a multiple of 3 (i.e. -gx3, -gx6, -gx9, etc.) then the video remained in place. If it wasn't a multiple of 3, the video would scroll up the page. In all cases this is with TIMG_SIXEL_NEWLINE_WORKAROUND=1, otherwise it never works.

Bear in mind that the cell height is always 20 pixels on Windows Terminal, so the working cases are all a multiple of 60 pixels, which means they're an exact number of sixel rows. But an image that is, say, 2 text cells high would be 40 pixels, which is 6.67 sixel rows, so the last sixel row would end up overlapping the third text row. Some terminals will then leave the cursor on that third row, but it should really be on the second row. I'm assuming timg expects the former, and that's why it fails on Windows Terminal.
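
The geometry in that example can be checked with a small sketch. `LastSixelBand` and `LastBand` are illustrative names, rows are counted from 0, and the only assumption is that each sixel row is 6 pixels tall.

```cpp
// Illustrative type describing where the final 6-pixel sixel row lands.
struct LastBand {
  int sixel_rows;       // number of 6-pixel sixel rows needed for the image
  int top_text_row;     // 0-based text row containing the band's top pixel
  int bottom_text_row;  // 0-based text row containing the band's bottom pixel
};

// Sketch: locate the final sixel row relative to the text rows.
// Expects image_height_px > 0 and cell_height_px > 0.
constexpr LastBand LastSixelBand(int image_height_px, int cell_height_px) {
  const int rows = (image_height_px + 5) / 6;  // round up to whole sixel rows
  const int top_px = (rows - 1) * 6;           // first pixel line of last row
  const int bottom_px = rows * 6 - 1;          // last pixel line of last row
  return {rows, top_px / cell_height_px, bottom_px / cell_height_px};
}
```

For a 40-pixel image and 20-pixel cells this gives 7 sixel rows, with the last one starting in text row 1 (the second row) and ending in text row 2 (the third row). For 60 pixels the last sixel row starts and ends in the same text row, so terminals that differ on this rule still agree.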

@hzeller
Owner

hzeller commented Jan 7, 2025

The calculation for how far to jump up in character cells given an image with a certain pixel height is in cell_height_for_pixels() ... maybe this is not calculating the correct thing ?

(and then tested in the myriad of other Sixel terminals...)

I am open for pull requests :)

@j4james
Contributor

j4james commented Jan 8, 2025

I could "fix" it so that it's VT340 compatible, and thus should work in Windows Terminal, but there's a good chance it'll then no longer work in multiple Linux terminals. And since I expect you have more Linux users than Windows users, that doesn't seem like a great idea.

Maybe there'll eventually come a time when all terminals agree to follow standards, and they'll work in the same way, but until then you kind of need to pick a side. Or as I suggested before, find a way to avoid relying on the cursor placement after an image has been output.

@hzeller
Owner

hzeller commented Jan 8, 2025

If Windows Terminal implemented the >q query, we could use that to do the correct thing for that terminal.

Should be fairly straightforward for the terminal to implement, and then timg has enough information to make it work.

@hzeller
Owner

hzeller commented Jan 8, 2025

Most of the issues are addressed here. Leaving this open to refine the automatic application of at least the newline workaround once >q is implemented in the terminal and we have a string to latch on to.

The sometimes-scroll issue might then be worked around as well.

@hzeller
Owner

hzeller commented Jan 8, 2025

If the cursor position is not rounded up but rounded down, would the following cure the scrolling issue you observed with some picture heights @j4james ?

--- a/src/sixel-canvas.cc
+++ b/src/sixel-canvas.cc
@@ -156,8 +156,7 @@ int SixelCanvas::cell_height_for_pixels(int pixels) const {
     // Unlike the other exact pixel canvases where we have to round to the
     // next even cell_y_px, here we first need to round to the next even 6
     // first.
-    return -((round_to_sixel(pixels) + options_.cell_y_px - 1) /
-             options_.cell_y_px);
+    return -(round_to_sixel(pixels) / options_.cell_y_px);
 }
 
 }  // namespace timg

@j4james
Contributor

j4james commented Jan 9, 2025

That's not quite right. You need something more like this.

// This is the pixel offset of the final sixel row.
int final_sixel_offset = round_to_sixel(pixels) - 6;

// This is the cell/row offset of the final cursor position.
int final_cell_offset = final_sixel_offset / options_.cell_y_px;

// Calling this the cell height is somewhat misleading, but if you're
// just using this to reverse the cursor movement, it should suffice.
int cell_height = final_cell_offset + 1;

return -cell_height;

The above is deliberately verbose to make the reasoning clear, but it could just be:

return -((round_to_sixel(pixels) - 6) / options_.cell_y_px + 1);

Either way, note that this still requires TIMG_SIXEL_NEWLINE_WORKAROUND=1 for it to work correctly.
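
To see where the two approaches diverge, the sketch below puts the round-up expression from the earlier diff next to the one-liner above. All names are illustrative and the functions return positive row counts rather than the negated value. It assumes `round_to_sixel()` rounds up to the next multiple of 6, which the thread does not show.

```cpp
// Assumption: timg's round_to_sixel() rounds up to the next multiple of 6.
constexpr int RoundToSixel(int pixels) { return (pixels + 5) / 6 * 6; }

// The expression on the '-' lines of the earlier diff: round the
// sixel-aligned height up to whole text cells.
constexpr int RowsRoundUp(int pixels, int cell_y_px) {
  return (RoundToSixel(pixels) + cell_y_px - 1) / cell_y_px;
}

// The expression suggested above: the text row that contains the top of
// the final sixel row, plus one.
constexpr int RowsVt340(int pixels, int cell_y_px) {
  return (RoundToSixel(pixels) - 6) / cell_y_px + 1;
}
```

With 20-pixel cells, both give 3 rows for a 60-pixel image, but for a 40-pixel image the round-up form gives 3 rows and the VT340 form gives 2. That is consistent with the earlier observation that only heights of a multiple of 3 cells stayed in place.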

@hzeller
Owner

hzeller commented Jan 9, 2025

Thanks. So once we can recognize the windows terminal by name (>q), we can enable all necessary compatibility behavior.

hzeller added a commit that referenced this issue Jan 10, 2025
Right now only known to be implemented by development version of
Windows Terminal.
Until we can detect that this is the terminal we're running, requires
to set environment variable `TIMG_SIXEL_FULL_CELL_JUMP=1`

Issues: #145
hzeller added a commit that referenced this issue Jan 11, 2025
Right now only known to be implemented by development version of
Windows Terminal.

`TIMG_SIXEL_NEWLINE_WORKAROUND` now is a bitmap that allows for the
various newline behaviors to be set

  * 1 = regular newline/carriage return choice
  * 2 = full cell jump (this new feature)
  * 3 = the OR'ed value out of 1 and 2

Until we can detect that this is the terminal we're running, requires
to set environment variable; for windows terminal, that is
`TIMG_SIXEL_NEWLINE_WORKAROUND=3`.

Issues: #145
@hzeller
Owner

hzeller commented Jan 11, 2025

Alright, merged the other PR, so now the windows terminal should work with environment variable set to
TIMG_SIXEL_NEWLINE_WORKAROUND=3
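
For illustration, the bitmap described in the commit message above could be decoded roughly as follows. `SixelNewlineFlags` and `DecodeNewlineWorkaround` are hypothetical names, and the sketch handles only numeric values, so how timg itself parses the variable (including the earlier `true` spelling) is not shown here.

```cpp
#include <cstdlib>

// Illustrative type; the bit values follow the commit message above.
struct SixelNewlineFlags {
  bool newline_workaround = false;  // bit value 1
  bool full_cell_jump = false;      // bit value 2
};

// Hypothetical decoder for the value of TIMG_SIXEL_NEWLINE_WORKAROUND.
// A null pointer (variable not set) leaves both flags off.
SixelNewlineFlags DecodeNewlineWorkaround(const char *value) {
  SixelNewlineFlags flags;
  if (value == nullptr) return flags;
  const long bits = std::strtol(value, nullptr, 10);
  flags.newline_workaround = (bits & 1) != 0;
  flags.full_cell_jump = (bits & 2) != 0;
  return flags;
}
```

A caller would pass in `std::getenv("TIMG_SIXEL_NEWLINE_WORKAROUND")`. The value 3 then turns on both behaviors, which is the setting recommended above for Windows Terminal.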

@hzeller
Owner

hzeller commented Jan 14, 2025

I've already primed the detection for the time we actually do get the name from the terminal, but since we don't know yet what the implementation of microsoft/terminal#18382 will bring, just generically looking for WindowsTerminal here. Might need updating once we know the definite name.

timg/src/term-query.cc

Lines 303 to 308 in cd5cead

if (find_str(data, len, "WindowsTerminal")) {
    // TODO: check again once name is established
    // https://github.com/microsoft/terminal/issues/18382
    result.sixel.known_broken_cursor_placement = true;
    result.sixel.full_cell_jump = true;
}

Having said that, all the relevant stuff to support Windows Terminal with Sixel is done. I am considering pushing a new release 1.6.2, but holding off for a few days in case the query appears soon. Do you think it is reasonable to be expected within days or are we more looking at weeks @j4james - in which case I'd be ready to release a new timg soonish.

@j4james
Contributor

j4james commented Jan 14, 2025

Do you think it is reasonable to be expected within days or are we more looking at weeks

I don't have any insider knowledge of what MS devs are planning, but I don't think there's anyone actively working on that issue (there's nobody assigned, and there's no pending PR). If I had to guess, I'd say it's more likely to be years rather than days or weeks. This is something that has already been in discussion for several years, and it's not clear how it would be implemented in Windows Terminal even if someone was interested in getting it done.

In short, I'd recommend you go ahead with your release as soon as you are ready.
