Add support for custom parsing of APC, SOS and PM sequences. #115

Open
wants to merge 5 commits into
base: master

@boazy boazy commented Aug 4, 2024

Fixes #109.

This attempts the same goal as John-Toohey's PR #110, but since that PR was not accepted, I wanted to try another approach and see whether it is acceptable.

Rationale for support

APC sequences are rarely supported by general-purpose terminals (if we put aside Kermit clients, tmux and screen), but there is one exception: The Kitty Terminal Graphics Protocol. While the Kitty Image Protocol is not as ubiquitous as Sixel, it is more robust, accurate and powerful and it's implemented by several prominent terminal emulators:

  • Kitty (obviously)
  • Konsole
  • Wezterm
  • Wayst
  • Ghostty

It's also supported by a large number of programs and libraries, some of them are mentioned here.

Design

Design Goals

  • Support SOS, PM and APC sequences (just like the original PR)
  • Support streaming parsing of SOS, PM and APC sequence payloads.
  • Avoid adding new conditional branches in existing code paths. This should minimize performance impact on existing code.
  • Avoid re-purposing OSC state variables (in order to reduce complexity)
  • Avoid adding any new stored state variable
  • No embedded parsers - use existing state change tables for parsing

Design Choices

Reorder actions into packable and non-packable actions

Currently, the resulting state and action are packed into an 8-bit value in the state change table (with a 4-bit nibble allocated for each). All values from 0 to 15 are already used for both states and actions, so I could not add new states or actions that would be packable. Unfortunately, I needed to support a state change with the OpaquePut action. The best way I've found is to move the integer values of actions that do not need to be packed into the state change table (such as Hook, Unhook and Clear) to values higher than 15, and reserve the lower values for actions that need to be packed.

I'm not sure if my current solution is acceptable or not but I have no other idea on how to resolve this situation without repurposing the OscString state and actions and re-introducing an embedded parser.
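
To illustrate the packing constraint described above, here is a minimal sketch. The enum members, their numeric values, and the choice of which nibble holds the action are assumptions made for illustration; they are not taken from the vte source.

```rust
/// Hypothetical stand-ins for the parser's state enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
pub enum State {
    Ground = 0,
    Escape = 1,
    OpaqueString = 15,
}

/// Hypothetical stand-ins for the parser's action enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
pub enum Action {
    None = 0,
    Print = 1,
    OpaquePut = 15, // fits in a nibble, so it can be stored in the table
    Hook = 16,      // never stored in the table, so it may exceed 15
}

/// Pack a state and an action into one table byte (assumed layout:
/// action in the high nibble, state in the low nibble).
/// Returns `None` when either value does not fit into 4 bits.
pub fn pack(state: State, action: Action) -> Option<u8> {
    let (s, a) = (state as u8, action as u8);
    if s > 0x0F || a > 0x0F {
        return None;
    }
    Some((a << 4) | s)
}

/// Split a packed table byte back into its raw (state, action) nibbles.
pub fn unpack(packed: u8) -> (u8, u8) {
    (packed & 0x0F, packed >> 4)
}
```

With this layout, an action numbered 16 or higher can still be matched in code, but it can never appear as a table entry, which is the trade-off the paragraph above describes.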

Action matching order

This should have minimal impact, but in order to avoid any performance impact for programs which do not use APC sequences, I made sure that when matching APC/SOS/PM-specific actions, they are always matched last.

Streaming

Unlike #110, this PR does not collect the sequence payload inside an array. Instead, the sequence payload is streamed directly to the Perform trait, one byte at a time. I believe this approach is more flexible and would lead to better performance overall. The key gains here are:

  • Avoiding extra vector allocation for the payload (I don't want to re-use osc_raw and in any case we would have to increase its default size to 4096 to be compatible with the max payload size).
  • Avoiding unnecessary copying. The payload often just needs to be parsed directly. This is especially useful when sending images directly, since the image data would often just be forwarded to an image format parser anyway.
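
As a rough illustration of what a streaming consumer could look like, the sketch below keeps only a small control header in memory and processes payload bytes as they arrive. The type name, its methods, and the `<control data>;<payload>` layout are assumptions for this example, and the running checksum is only a stand-in for forwarding bytes to an image decoder.

```rust
/// Hypothetical streaming consumer for a payload laid out as
/// `<control data>;<payload>`.
#[derive(Default)]
pub struct StreamingApc {
    control: Vec<u8>,   // small header, kept in memory
    in_payload: bool,   // true once the first `;` has been seen
    payload_len: usize, // number of payload bytes processed so far
    checksum: u32,      // stand-in for "forward to an image decoder"
}

impl StreamingApc {
    /// Reset all state at the start of a new sequence.
    pub fn start(&mut self) {
        *self = Self::default();
    }

    /// Consume one byte of the sequence without buffering the payload.
    pub fn put(&mut self, byte: u8) {
        if self.in_payload {
            self.payload_len += 1;
            self.checksum = self.checksum.wrapping_mul(31).wrapping_add(byte as u32);
        } else if byte == b';' {
            self.in_payload = true;
        } else {
            self.control.push(byte);
        }
    }

    /// Return the collected header and the number of payload bytes seen.
    pub fn end(&self) -> (&[u8], usize) {
        (&self.control, self.payload_len)
    }
}
```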

Names

I've changed SosPmApcString to OpaqueString, since I wanted to have a general name for all these types of sequences where the VTE parser has no understanding of the internal structure of the sequence payload. This is different from OSC and CSI sequences, where the VTE parser is aware of the high-level structure of the sequence (if not the semantics).

@boazy boazy force-pushed the parserless-apc-support branch 2 times, most recently from 04fbc6e to 255c2fe Compare August 4, 2024 14:41
Member

@chrisduerr chrisduerr left a comment

I'll give a detailed review later today/tomorrow, but assuming that this was benchmarked it seems like a reasonable approach.

Generally I think the VTE crate needs a little rewrite, but I don't want to hold up other changes for that since it will likely take a long time before I get around to that.

@boazy
Author

boazy commented Aug 5, 2024

I'll give a detailed review later today/tomorrow, but assuming that this was benchmarked it seems like a reasonable approach.

I ran cargo bench but I'm a bit unsure on how to run the full vtebench suite. Is there a guide for that?

I don't expect any performance impact since I've added no new state and I've only added trailing branches to an already-long match expression, but you never know until you test...

@kchibisov
Member

You can open a PR against alacritty with your changes, so we can use our bot for it. But generally you just build alacritty with your changes and run the vtebench.

@boazy
Author

boazy commented Aug 5, 2024

You can open a PR against alacritty with your changes, so we can use our bot for it. But generally you just build alacritty with your changes and run the vtebench.

Thanks! Here are the results:

Master Branch:

  dense_cells (625 samples @ 1 MiB):
    15.47ms avg (90% < 16ms) +-1.93ms

  scrolling (199 samples @ 1 MiB):
    37.19ms avg (90% < 42ms) +-4.62ms

  scrolling_bottom_region (361 samples @ 1 MiB):
    27.25ms avg (90% < 28ms) +-2.28ms

  scrolling_bottom_small_region (369 samples @ 1 MiB):
    26.69ms avg (90% < 27ms) +-2.3ms

  scrolling_fullscreen (185 samples @ 1 MiB):
    41.15ms avg (90% < 45ms) +-2.28ms

  scrolling_top_region (247 samples @ 1 MiB):
    40.09ms avg (90% < 42ms) +-2.92ms

  scrolling_top_small_region (364 samples @ 1 MiB):
    27.06ms avg (90% < 28ms) +-2.27ms

  unicode (611 samples @ 1.06 MiB):
    15.86ms avg (90% < 17ms) +-1.55ms

This PR:

  dense_cells (606 samples @ 1 MiB):
    16ms avg (90% < 17ms) +-1.26ms

  scrolling (198 samples @ 1 MiB):
    37.39ms avg (90% < 42ms) +-5.06ms

  scrolling_bottom_region (361 samples @ 1 MiB):
    27.25ms avg (90% < 28ms) +-2.12ms

  scrolling_bottom_small_region (362 samples @ 1 MiB):
    27.19ms avg (90% < 28ms) +-2.25ms

  scrolling_fullscreen (182 samples @ 1 MiB):
    41.63ms avg (90% < 45ms) +-3.56ms

  scrolling_top_region (250 samples @ 1 MiB):
    39.6ms avg (90% < 41ms) +-2.85ms

  scrolling_top_small_region (362 samples @ 1 MiB):
    27.16ms avg (90% < 28ms) +-1.28ms

  unicode (654 samples @ 1.06 MiB):
    14.79ms avg (90% < 16ms) +-2.14ms

The scrolling_* benchmarks are unstable and give a different leader on every run. Unicode seems to be consistently slightly faster on this PR while dense_cells is slightly faster on the master branch, but the difference is quite small compared to the variance.

@kchibisov
Member

@boazy just open a PR against alacritty's repo, it should be better for us to review that way, since we can run the bot.

@boazy
Author

boazy commented Aug 7, 2024

@kchibisov I've added a pull request. I'm not sure how to trigger the benchmarks bot.

@chrisduerr
Member

I'm not sure how to trigger the benchmarks bot.

You can't. I did.

@chrisduerr
Member

@boazy This has stalled out for a while now, are you still interested in working on this? I'm slightly concerned because the longer we wait, the higher the chance that a vte rewrite will make all your work worthless.

@boazy
Author

boazy commented Oct 1, 2024

I would expect three sets of functions. How that is done is an implementation detail irrelevant to the consumer (as long as it's fast).

@boazy This has stalled out for a while now, are you still interested in working on this? I'm slightly concerned because the longer we wait, the higher the chance that a vte rewrite will make all your work worthless.

I didn't have time for a while, but I've had some time again today and I've checked your suggestion. Unfortunately, I don't think it would be possible without completely re-architecting the way state works. As I've mentioned before, we only have 4 bits for actions that change the state, but we also only have 4 bits for the state itself, and here all possible values (0-15) are used.

If we want to support three sets of functions, we would need to track three separate states for each type of string (SOS, PM, APC). This is not possible without either:

  1. Adding a new state variable for tracking the type of opaque string. (with possible performance impact)
  2. Modifying the state variable to a u16 or better yet something like struct ResultingState(State, Action).
    Would this impact performance? I don't know, but it's also a big change I'm a little bit uneasy with...

The only thing I can do without any major change is to have 3 different functions instead of opaque_start, but opaque_end and opaque_put would have to remain a single function in this case.

@chrisduerr
Member

Just to recap because this is quickly fading into obscurity and I don't want to forget about it entirely:

Splitting opaque_start up would be no problem, since we have access to the three states anyway, but splitting opaque_put and opaque_end is a problem because it means we'd have to keep track of the state we're currently in?

But couldn't you just store that on Parser (the size of which is irrelevant), then branch inside Action::OpaquePut => and Action::OpaqueEnd => since doing so should leave performance of the other branches unaffected?

I believe that's kinda your first suggestion? Why do you think this would impact performance?

@boazy
Author

boazy commented Oct 16, 2024

But couldn't you just store that on Parser (the size of which is irrelevant), then branch inside Action::OpaquePut => and Action::OpaqueEnd => since doing so should leave performance of the other branches unaffected?

I believe that's kinda your first suggestion? Why do you think this would impact performance?

I honestly don't know. I assumed the size is important for performance since we want to be able to have all state transitions in one table (which only has 256 entries?). If we go back to the start, I created this PR trying to answer the comments to the original PR that tried to add APC support (#110), particularly this comment:

This basically feels like an embedded parser. Most of our parser is structured in a way to stream through bytes and process them as they're coming in, however rather than creating state transitions you're storing a state transition to later then match on it instead. VTE is already too slow and I'm afraid this will just amplify that problem.

#110 (comment)

I realize I can add additional state that doesn't affect OSC sequences, but seeing this comment I was a bit worried. Do you believe adding an additional field to Parser (let's say opaque_sequence_kind) should be ok?
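
The idea of an extra field on the parser can be sketched roughly as follows. `MiniParser`, `Handler`, and `ApcOnly` are hypothetical names for illustration; only `opaque_sequence_kind` and the opaque_start/opaque_put/opaque_end naming come from the discussion. The point is that the branch on the stored kind happens only inside the opaque put path, so other actions are not touched.

```rust
/// The kind of opaque string currently being parsed.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum OpaqueSequenceKind {
    Sos,
    Pm,
    Apc,
}

/// Hypothetical subset of handler callbacks, one `put` per sequence kind.
pub trait Handler {
    fn sos_put(&mut self, _byte: u8) {}
    fn pm_put(&mut self, _byte: u8) {}
    fn apc_put(&mut self, _byte: u8) {}
}

/// Toy parser that holds only the extra field discussed above.
#[derive(Default)]
pub struct MiniParser {
    opaque_sequence_kind: Option<OpaqueSequenceKind>,
}

impl MiniParser {
    /// Remember which kind of opaque string was started.
    pub fn opaque_start(&mut self, kind: OpaqueSequenceKind) {
        self.opaque_sequence_kind = Some(kind);
    }

    /// Branch on the stored kind; this is the only place that reads it.
    pub fn opaque_put<H: Handler>(&mut self, handler: &mut H, byte: u8) {
        match self.opaque_sequence_kind {
            Some(OpaqueSequenceKind::Sos) => handler.sos_put(byte),
            Some(OpaqueSequenceKind::Pm) => handler.pm_put(byte),
            Some(OpaqueSequenceKind::Apc) => handler.apc_put(byte),
            None => {}
        }
    }

    /// Forget the kind once the string has ended.
    pub fn opaque_end(&mut self) {
        self.opaque_sequence_kind = None;
    }
}

/// Example handler that only records APC bytes.
#[derive(Default)]
pub struct ApcOnly(pub Vec<u8>);

impl Handler for ApcOnly {
    fn apc_put(&mut self, byte: u8) {
        self.0.push(byte);
    }
}
```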

@chrisduerr
Member

I honestly don't know. I assumed the size is important for performance since we want to be able to have all state transitions in one table (which only has 256 entries?). If we go back to the start, I created this PR trying to answer the comments to the original PR that tried to add APC support (#110), particularly this comment:

Yeah that's fair. This would make this a slow sloppy implementation and it might not be worth adding it for that reason. But that applies to your current solution too and the only alternative is a bigger rework that integrates this as a core part of our parser.

I realize I can add additional state that doesn't affect OSC sequences, but seeing this comment I was a bit worried. Do you believe adding an additional field to Parser (let's say opaque_sequence_kind) should be ok?

I think we should either do this, or go for something that properly integrates everything into our standard processing pipeline, which of course is difficult without affecting performance. I'm very hesitant to accept this sort of "second rate escape sequence", but I would prefer that over the current implementation.

@boazy
Author

boazy commented Oct 17, 2024

I think we should either do this, or go for something that properly integrates everything into our standard processing pipeline, which of course is difficult without affecting performance. I'm very hesitant to accept this sort of "second rate escape sequence", but I would prefer that over the current implementation.

Ok, I think I've understood you now. I'll modify the pull request along these lines.

@boazy boazy force-pushed the parserless-apc-support branch 2 times, most recently from cbbdde3 to 45061fd Compare October 17, 2024 06:20
@boazy boazy requested a review from chrisduerr October 17, 2024 13:11
@boazy boazy requested a review from chrisduerr October 21, 2024 08:33
@boazy boazy force-pushed the parserless-apc-support branch from d511cdd to 4839d4d Compare October 21, 2024 08:33
@chrisduerr
Member

Going to tentatively approve the current revision, I don't think there are any other changes necessary from your end.

While I'm not entirely convinced I'll give this a closer look and if there are no performance regressions it should be fine.

@chrisduerr
Member

@boazy Just a heads-up: Once #118 is merged, it might make some sense to reinvestigate this. There's certainly been some changes and we do have a lot of extra free states/actions that could be made use of without affecting performance.

@kchibisov
Member

We've reworked vte a bit, so it would probably be easier to add things without hurting perf, since all the code that caused problems here is gone now.

@heysweet

heysweet commented Feb 3, 2025

Hello! Just wanted to re-ping this thread to see what progress was being made here. We'd really like to support Kitty Image protocol in the Warp terminal, and this PR would help us get there

@boazy
Author

boazy commented Feb 4, 2025

This PR has been languishing for a while, and now that vte is ready, I probably need to rewrite it, but I haven't had time to revisit this recently. I'll try to find time to do it.

The good news is that it shouldn't impact performance. I hope.

@heysweet

heysweet commented Mar 6, 2025

Hello again, sorry to be a bother, just wanted to check in on the status of this. We’re kicking off the full-time effort to support kitty image protocol starting this week, and just want to get an understanding of how much of a risk this PR is to our project. Thank you!

@boazy boazy force-pushed the parserless-apc-support branch 3 times, most recently from a816971 to dc153f4 Compare March 7, 2025 07:52
@boazy
Author

boazy commented Mar 7, 2025

@chrisduerr , @kchibisov I've finally got time to rebase the PR to work with the new version. I haven't written any unit tests yet, since I want to see if you are ok with the direction.

Some points to take note of:

  1. I'm currently not keeping any extra state for opaque sequence parsing, but that is introducing some limitations, which I will describe below.
  2. There is no clear specification on how to handle any character besides 0x20-0x7F, 0x9C (C1 ST) and the ESC ST sequence (0x1B 0x5C) in APC/SOS/PM sequences. The original DEC VT* terminals seem to have ignored these sequences completely, so essentially everything until the ST sequence (which puts the terminal back in ground state) was ignored.
  3. Which characters are valid inside an opaque sequence is also not clear, and different terminals have different implementations. I went with supporting any non-control character, but that means that C1 ST (0x9C) currently does not terminate an opaque sequence. I didn't check all terminals that implement APC for Kitty protocol features, but Kitty and Ghostty do not seem to support C1 ST either. In any case, I believe we should only implement C1 ST support if we support the C1 APC/SOS/PM codes for starting the sequence (which vte does not support), so this behavior is probably correct.
  4. On the other hand, the opaque sequence can be terminated with a non-standard bell (0x07), since kitty supports bell-terminated sequences and the main usage for these sequences seems to be implementing Kitty protocol features anyway.
  5. It is unclear how to handle escape sequences other than ST (0x1b 0x5c) inside an SOS/APC/PM string. My current approach is to terminate the sequence on any escape code, just as vte does with OSC sequences, but this is not the most correct approach. A better approach would be to ignore these sequences, but that requires keeping more state.
  6. I've tried to inline everything, and avoid both code duplication and unnecessary branching (by using the OpaqueDispatch trait) wherever possible.
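
The per-byte handling described in the points above can be summarized with a small sketch. `OpaqueStep` and `classify` are hypothetical names, and the CAN/SUB arm is an assumption based on the general "reset to ground from anywhere" behavior rather than something stated in this list; this is not the PR's actual code.

```rust
/// What a single byte means while inside an SOS/PM/APC string.
#[derive(Debug, PartialEq, Eq)]
pub enum OpaqueStep {
    /// Forward the byte to the sequence's put callback.
    Put(u8),
    /// BEL (0x07): non-standard terminator accepted in this revision.
    End,
    /// CAN (0x18) or SUB (0x1A): assumed to abort back to ground.
    Cancel,
    /// ESC (0x1B): ends the string; the next byte decides whether it was ST.
    Escape,
    /// Any other C0 control byte is dropped.
    Ignore,
}

/// Classify one byte according to the rules listed above.
pub fn classify(byte: u8) -> OpaqueStep {
    match byte {
        0x07 => OpaqueStep::End,
        0x18 | 0x1A => OpaqueStep::Cancel,
        0x1B => OpaqueStep::Escape,
        // Includes 0x9C, so C1 ST does not terminate the string.
        0x20..=0xFF => OpaqueStep::Put(byte),
        _ => OpaqueStep::Ignore,
    }
}
```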

There are two approaches I can see for keeping track of additional state (particularly important for item 5):

  1. Add 3 additional states: SOSEscape, APCEscape and PMEscape. I will add a special advance_esc implementation for each of these states, which ignores any other ESC sequence apart from ESC-\ (ST).
  2. Add an additional field to the parser that specifies the current opaque sequence mode (or none if no such mode exists). This field will be checked every time we handle an escape sequence, so it may impact the performance of non-SOS/APC/PM usage.

I believe the first approach should be better.
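
A very reduced sketch of the first approach, for the APC case only, might look like this. The state names and the `step` function are hypothetical, and the sketch simplifies "ignoring an escape sequence" to dropping ESC plus the single byte that follows it; a real implementation would need to decide how much of a longer sequence to skip.

```rust
/// Toy state set: ground, inside an APC string, and "ESC seen inside APC".
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum St {
    Ground,
    Apc,
    ApcEscape,
}

/// Advance the toy state machine by one byte, collecting payload bytes.
pub fn step(state: St, byte: u8, payload: &mut Vec<u8>) -> St {
    match (state, byte) {
        // ESC inside the string: wait for the next byte before deciding.
        (St::Apc, 0x1B) => St::ApcEscape,
        // Ordinary payload byte.
        (St::Apc, 0x20..=0xFF) => {
            payload.push(byte);
            St::Apc
        }
        // Other control bytes are ignored.
        (St::Apc, _) => St::Apc,
        // ESC \ is ST: the string ends.
        (St::ApcEscape, b'\\') => St::Ground,
        // ESC ESC: keep waiting for the byte after the latest ESC.
        (St::ApcEscape, 0x1B) => St::ApcEscape,
        // Any other escape is dropped and the string continues.
        (St::ApcEscape, _) => St::Apc,
        (St::Ground, _) => St::Ground,
    }
}
```

Because the "which string am I in" information lives in the state itself, no separate field has to be consulted on the normal escape path.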

@heysweet Sorry for the time it took me to rebase this PR, I basically had to rewrite it from scratch.

@j4james

j4james commented Mar 8, 2025

On the other hand, the opaque sequence can be terminated with non-standard bell (0x07)

Note that most terminals only accept BEL as a terminator for OSC sequences, strictly for backwards compatibility. In other string sequences it's just another data character. Kitty only started accepting it as a terminator for all string sequences a couple of years ago - I don't know why. Personally I don't think it's a good idea to encourage non-standard usage like that.

@boazy boazy force-pushed the parserless-apc-support branch from dc153f4 to 6421e17 Compare March 8, 2025 23:17
@boazy boazy force-pushed the parserless-apc-support branch from 6421e17 to c515809 Compare March 8, 2025 23:18
@boazy
Author

boazy commented Mar 8, 2025

One more point of concern:
[sos/apc/pm]_dispatch() is called individually for each byte. For any use case which is not a byte-by-byte parser, it would probably be more efficient to call it on byte slices instead (I believe this was the entire point of #118).
The main issue with calling this on byte slices is that the SOS/APC/PM sequences can be arbitrarily long; indeed, with the most common use case (Kitty Image Protocol) these sequences could easily be several megabytes long. I am a little bit unsure how accumulating these sequences in a Vec<u8> (like what the parser is doing for OSC) would impact performance.

@j4james

j4james commented Mar 9, 2025

with the most common use case (Kitty Image Protocol) these sequences could easily be several megabytes long.

Assuming apps actually follow the spec, then kitty image sequences should never be more than 4K ("the pixel data must first be base64 encoded then chunked up into chunks no larger than 4096 bytes"). But otherwise you're right - in theory these string sequences can be infinitely large.

@boazy
Author

boazy commented Mar 9, 2025

with the most common use case (Kitty Image Protocol) these sequences could easily be several megabytes long.

Assuming apps actually follow the spec, then kitty image sequences should never be more than 4K ("the pixel data must first be base64 encoded then chunked up into chunks no larger than 4096 bytes"). But otherwise you're right - in theory these string sequences can be infinitely large.

Makes sense. I wasn't aware of the actual Kitty Image Protocol spec so I've missed that.
VTE could also implement its own chunking and make that configurable in the parser.
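
One way such configurable chunking could look is sketched below. `Chunker` is a hypothetical adapter, not part of vte: it buffers bytes up to a fixed size and hands each full chunk to a callback as a slice, so memory use stays bounded even for very long sequences.

```rust
/// Hypothetical adapter that turns byte-at-a-time input into bounded slices.
pub struct Chunker<F: FnMut(&[u8])> {
    buf: Vec<u8>,
    chunk_size: usize,
    sink: F,
}

impl<F: FnMut(&[u8])> Chunker<F> {
    /// Create a chunker that flushes every `chunk_size` bytes.
    pub fn new(chunk_size: usize, sink: F) -> Self {
        assert!(chunk_size > 0, "chunk size must be non-zero");
        Self { buf: Vec::with_capacity(chunk_size), chunk_size, sink }
    }

    /// Accept one byte and flush when the buffer is full.
    pub fn put(&mut self, byte: u8) {
        self.buf.push(byte);
        if self.buf.len() == self.chunk_size {
            self.flush();
        }
    }

    /// Flush whatever is left when the sequence ends.
    pub fn end(&mut self) {
        self.flush();
    }

    fn flush(&mut self) {
        if !self.buf.is_empty() {
            (self.sink)(&self.buf);
            self.buf.clear();
        }
    }
}
```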

@kchibisov
Member

The perf looks pretty much the same: alacritty/alacritty#8507

I can't really answer the design questions you've raised right now.

Member

@chrisduerr chrisduerr left a comment

Honestly seems fine to me. Just some minor nits.

src/lib.rs Outdated
Comment on lines 456 to 457
// ESC-ST (ESC-\) and C1-ST (0x9C), but kitty (and probably some other
// terminals) also support bell-terminated strings. Some
Member

This comment is a bit silly because "some other terminals" includes Alacritty. This is mostly an extension to how other sequences like OSCs are handled, which also support the bell terminator and do so already in Alacritty. So this might need some rephrasing to not sound strange.

Author

I was referring to SOS/PM/APC sequences, which are not yet supported by Alacritty. The current behavior (treating them as the anywhere state) doesn't seem to support termination with BEL, so this would be a behavior change for Alacritty, wouldn't it?

But since, as you've said, this is how OSC sequences already behave, I think we should change this behavior to be more consistent.

As stated above, I added these comments only to explain my reasoning, I didn't intend them to be included in the code in the final version of the PR.

src/lib.rs Outdated
Comment on lines 465 to 466
// XTerm terminates SOS/APC/PM strings on C1 CAN (^X) and SUB (^Z). This is also
// the same behavior we implement for OSC strings.
Member

0x18/0x1A is just a general-purpose reset into ground from anywhere, the transition is unrelated to its origin state. There's not really any reason for this comment to exist.

Author

I was a bit confused, since not all terminals seem to do that (at least when handling SOS/PM/APC), but I'll remove the comment.

Member

@chrisduerr chrisduerr Mar 12, 2025

Yeah I'll refer to https://vt100.net/emu/dec_ansi_parser for "general" guidance on Alacritty's parser here (even though it doesn't handle opaque strings). It's not really a SOS/PM/APC, but more of an "anywhere" thing. As such a SOS/PM/APC-specific comment is unnecessary.

src/lib.rs Outdated
Comment on lines 472 to 473
// Any escape code ends the SOS/APC/PM string. This is not standard behavior,
// but avoids having to keep additional state.
Member

This comment seems misleading. Escape resetting ongoing escape sequences from anywhere is a de-facto standard implemented by many terminal emulators. It might not be explicitly called for in older specifications, but is not incorrect behavior either, it's mostly an implementation detail really.

Author

I'm sorry if I conveyed my intentions poorly. The comments are not meant to stay here. They mostly document my thoughts while I look for feedback on this pull request. I have little experience with de-facto terminal implementations, so I can only refer to the specifications (which are a bit vague).
If escape sequences resetting the state is common de-facto behavior, then I would be happy to keep it this way, since it keeps the implementation clean and simple.

src/lib.rs Outdated
Comment on lines 478 to 479
// Only dispatch valid characters.
dispatcher.opaque_put(byte)
Member

Why is this comment on a method call that dispatches bytes indiscriminately? It might be more appropriate on the match arm instead, but even then I question whether the 0x80 to 0xFF range can be considered "valid characters", considering they're not printable characters (so effectively the same as any other byte).

Author

Sorry for that. This is a leftover comment from a previous implementation attempt that I gave up on and forgot to remove.

src/lib.rs Outdated
// Only dispatch valid characters.
dispatcher.opaque_put(byte)
},
// Ignore all other control codes
Member

Suggested change
- // Ignore all other control codes
+ // Ignore all other control bytes.

src/lib.rs Outdated
Comment on lines 877 to 878
/// Invoked for every valid character (0x20-0xFF) in a SOS (Start of String)
/// sequence.
Member

Suggested change
- /// Invoked for every valid character (0x20-0xFF) in a SOS (Start of String)
- /// sequence.
+ /// Invoked for every byte (0x20-0xFF) in a SOS (Start of String) sequence.

Same comment applies for the other functions. Calling these "valid characters" could be misleading to consumers.

src/lib.rs Outdated
Comment on lines 865 to 875
/// Invoked when the beginning of a new SOS (Start of String) sequence is
/// encountered.
fn sos_start(&mut self) {}

/// Invoked when the beginning of a new APC (Application Program Command)
/// sequence is encountered.
fn apc_start(&mut self) {}

/// Invoked when the beginning of a new PM (Privacy Message) sequence is
/// encountered.
fn pm_start(&mut self) {}
Member

Rather than sorting these by start/put/end, I'd prefer grouping them by sos/apc/pm. I think it makes more sense since the grouping into start/put/end is somewhat arbitrary considering they aren't really interconnected.

Author

Makes sense. I'll do it.

src/lib.rs Outdated
Comment on lines 879 to 887
fn sos_dispatch(&mut self, _byte: u8) {}

/// Invoked for every valid character (0x20-0xFF) in an APC (Application
/// Program Command) sequence.
fn apc_dispatch(&mut self, _byte: u8) {}

/// Invoked for every valid character (0x20-0xFF) in a PM (Privacy Message)
/// sequence.
fn pm_dispatch(&mut self, _byte: u8) {}
Member

I'm not sure the naming _dispatch is consistent with the rest of our methods. The existing _dispatch functions generally represent the dispatch of an entire escape sequence in full, while this function just represents the dispatch of a single byte in the sequence.

I think this is much closer to the way esc works in VTE, where it is split in hook, put, and unhook.

Other parts of this patch already make use of the _put nomenclature, so I think it's better to be consistent and rename these functions to sos/pm/apc_put.

Author

You're absolutely right, considering the method on the internal OpaqueDispatch trait is called opaque_put and then it ends up calling sos/apc/pm_dispatch()...

Comment on lines +914 to +918
trait OpaqueDispatch {
fn execute(&mut self, byte: u8);
fn opaque_put(&mut self, byte: u8);
fn opaque_end(&mut self);
}
Member

Honestly this trait and its implementations is the main issue I have with this patch. It should work fine since it all just gets inlined and optimized out essentially, but it's still a whole lot of boilerplate.

I'm not sure I have any better ideas for now, but at the very least this trait needs a comment explaining that it's just a helper for dispatching over the opaque string escapes and in practice is inlined everywhere to function as static dispatch without conditional indirection.

Author

This trait was the best solution I could think of until now, which means it is the least terrible one. I'll add a comment.
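
For readers unfamiliar with the pattern under discussion, here is a minimal sketch of a helper trait used purely for static dispatch. Only the `OpaqueDispatch` name and its `opaque_put`/`opaque_end` methods come from the diff above (the `execute` method is omitted for brevity); the `Perform` subset, the wrapper types, `advance_opaque`, and `Recorder` are hypothetical and simplified.

```rust
/// Hypothetical subset of the public handler trait.
pub trait Perform {
    fn sos_put(&mut self, _byte: u8) {}
    fn sos_end(&mut self) {}
    fn apc_put(&mut self, _byte: u8) {}
    fn apc_end(&mut self) {}
}

/// Helper trait: one shared byte loop, monomorphized per sequence kind.
pub trait OpaqueDispatch {
    fn opaque_put(&mut self, byte: u8);
    fn opaque_end(&mut self);
}

/// Forwards opaque callbacks to the SOS methods of the handler.
pub struct SosDispatcher<'a, P: Perform>(pub &'a mut P);
/// Forwards opaque callbacks to the APC methods of the handler.
pub struct ApcDispatcher<'a, P: Perform>(pub &'a mut P);

impl<P: Perform> OpaqueDispatch for SosDispatcher<'_, P> {
    #[inline(always)]
    fn opaque_put(&mut self, byte: u8) { self.0.sos_put(byte) }
    #[inline(always)]
    fn opaque_end(&mut self) { self.0.sos_end() }
}

impl<P: Perform> OpaqueDispatch for ApcDispatcher<'_, P> {
    #[inline(always)]
    fn opaque_put(&mut self, byte: u8) { self.0.apc_put(byte) }
    #[inline(always)]
    fn opaque_end(&mut self) { self.0.apc_end() }
}

/// Shared loop; the compiler generates one copy per dispatcher type,
/// so there is no runtime branch on the sequence kind.
#[inline(always)]
pub fn advance_opaque<D: OpaqueDispatch>(mut dispatcher: D, bytes: &[u8]) {
    for &byte in bytes {
        match byte {
            0x07 => {
                dispatcher.opaque_end();
                return;
            }
            0x20..=0xFF => dispatcher.opaque_put(byte),
            _ => {}
        }
    }
}

/// Example handler that records what it receives.
#[derive(Default)]
pub struct Recorder {
    pub sos: Vec<u8>,
    pub apc: Vec<u8>,
    pub ended: usize,
}

impl Perform for Recorder {
    fn sos_put(&mut self, byte: u8) { self.sos.push(byte) }
    fn sos_end(&mut self) { self.ended += 1 }
    fn apc_put(&mut self, byte: u8) { self.apc.push(byte) }
    fn apc_end(&mut self) { self.ended += 1 }
}
```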

@chrisduerr
Member

Just to be clear on this:

[sos/apc/pm]_dispatch() is called individually for each byte. For any use case which is not a byte-by-byte parser, it would probably be more efficient to call it on byte slices instead (I believe this was the entire point of #118).
The main issue with calling this on byte slices is that the SOS/APC/PM sequences can be arbitrarily long; indeed, with the most common use case (Kitty Image Protocol) these sequences could easily be several megabytes long. I am a little bit unsure how accumulating these sequences in a Vec (like what the parser is doing for OSC) would impact performance.

I personally don't mind this "simplistic" approach because I have very little interest in sos/pm/apc escape sequences, so this way there is only a minimal amount of impact on Alacritty while VTE is kept simple. That's also why I don't care if these escapes are dispatched particularly efficiently or not, since ideally nobody ever uses them.

That said, the collection into an (Array)Vec in VTE is essentially a solved issue. We have this for OSCs already where things are somewhat dynamically limited and having a matching/similar approach probably wouldn't be a huge deal for these escapes either. Ideally to ensure parser state doesn't get too big one would likely reuse the same byte buffers for both, since it's not possible to have two different escape sequences at the same time anyway.

But going for something like that is likely to slow down the implementation process, since it doesn't sound like that's something you're particularly familiar with. So I don't mind just going with the status quo in this patch.

@boazy
Author

boazy commented Mar 12, 2025

That said, the collection into an (Array)Vec in VTE is essentially a solved issue. We have this for OSCs already where things are somewhat dynamically limited and having a matching/similar approach probably wouldn't be a huge deal for these escapes either. Ideally to ensure parser state doesn't get too big one would likely reuse the same byte buffers for both, since it's not possible to have two different escape sequences at the same time anyway.

Familiarity is not my issue here. I do understand how to collect bytes into a Vec (I hope). I thought of reusing osc_raw, but there may be several (minor) performance implications for Alacritty (and any other user of Parser which ignores SOS/PM/APC sequences):

  • If SOS/PM/APC sequences do get sent by a program, they'll be written to a buffer and then just get ignored later. In theory, this would impact performance.
  • Kitty Image Protocol sequences are still bigger than typical APC sequences, even with 4096 byte chunking (together with all the parameters that come before the payload, the actual chunk size would be a bit larger than 4096 bytes). If any program sends these sequences by mistake, the Vec would grow to that size and stay there (the non-std implementation will not be impacted, if we keep MAX_OSC_RAW the same).

I don't think these two issues are big in practice. Programs that send Kitty Image Protocol sequences to a terminal that doesn't support that are already misbehaving, and 4096 or 8192 bytes of extra memory per terminal are not a big thing. I also don't think cache locality would be impacted, since the shorter OSC sequences would keep using just the beginning of the buffer.

I will create another branch and a draft PR with a buffered version for comparison.

@boazy boazy requested a review from chrisduerr March 12, 2025 02:32
@boazy
Author

boazy commented Mar 12, 2025

On second thought, I'm not sure if buffering opaque sequences will be such a great boon for clients of VTE anyway. Just like VTE itself is inlining methods, clients can just choose to inline their apc_put() implementation and put everything inside a Vec. So I think I would be okay with this implementation.
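
A client-side buffer along these lines could be as small as the following sketch. `ApcCollector` and its size cap are hypothetical; in a real client these methods would be the bodies of the corresponding handler callbacks.

```rust
/// Hypothetical client-side collector for APC payloads with a size cap.
pub struct ApcCollector {
    buf: Vec<u8>,
    max_len: usize,
    overflowed: bool,
}

impl ApcCollector {
    /// Create a collector that keeps at most `max_len` bytes per sequence.
    pub fn new(max_len: usize) -> Self {
        Self { buf: Vec::new(), max_len, overflowed: false }
    }

    /// Called when a new APC sequence starts; reuses the old allocation.
    #[inline]
    pub fn apc_start(&mut self) {
        self.buf.clear();
        self.overflowed = false;
    }

    /// Called for every payload byte; bytes past the cap are dropped.
    #[inline]
    pub fn apc_put(&mut self, byte: u8) {
        if self.buf.len() < self.max_len {
            self.buf.push(byte);
        } else {
            self.overflowed = true;
        }
    }

    /// Called at the end; returns the payload unless it was truncated.
    pub fn apc_end(&self) -> Option<&[u8]> {
        if self.overflowed {
            None
        } else {
            Some(&self.buf)
        }
    }
}
```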

@chrisduerr
Member

On second thought, I'm not sure if buffering opaque sequences will be such a great boon for clients of VTE anyway. Just like VTE itself is inlining methods, clients can just choose to inline their apc_put() implementation and put everything inside a Vec. So I think I would be okay with this implementation.

If the goal was to write the ideal implementation as far as performance goes, we'd probably special-case the OSC state to look for its termination sequence using memchr. However, while that would speed up opaque strings, it would probably slow down non-opaque string escape parsing slightly, so I don't think it's worth it.
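
The scan-ahead idea can be sketched with the standard library alone. This example uses `Iterator::position` as a stand-in for the memchr crate, and the set of terminating bytes (BEL, CAN, SUB, ESC) is an assumption carried over from the earlier discussion. It also does not filter out other control bytes inside the payload, which a real parser would still have to handle.

```rust
/// Find the index of the first byte that ends or interrupts an opaque string.
/// Stand-in for a memchr-style search; the byte set is an assumption.
pub fn find_terminator(input: &[u8]) -> Option<usize> {
    input
        .iter()
        .position(|&b| matches!(b, 0x07 | 0x18 | 0x1A | 0x1B))
}

/// Split the input into the payload slice and the unparsed remainder,
/// which starts at the terminating byte (or is empty if none was found).
pub fn split_payload(input: &[u8]) -> (&[u8], &[u8]) {
    match find_terminator(input) {
        Some(index) => (&input[..index], &input[index..]),
        None => (input, &input[input.len()..]),
    }
}
```

The payload slice could then be handed to the consumer in one call instead of one call per byte.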

input: &[u8],
kind: OpaqueSequenceKind,
expected_payload: &[u8],
expected_trailer: &[Sequence],
Member

This can just be a boolean to simplify this function and remove the necessity for a constant. st_terminated: bool or something like that.

expected_payload: &[u8],
expected_trailer: &[Sequence],
) {
let mut expected_dispatched: Vec<Sequence> = vec![Sequence::OpaqueStart(kind)];
Member

Suggested change
- let mut expected_dispatched: Vec<Sequence> = vec![Sequence::OpaqueStart(kind)];
+ let mut expected_dispatched = vec![Sequence::OpaqueStart(kind)];

Is this really necessary? I'd be extremely surprised if Rust wasn't able to infer it here.

@chrisduerr
Member

Added some small comments, but didn't want to actually do a review yet. Please just re-request one once you think you're done with your changes.

Development

Successfully merging this pull request may close these issues.

Support for APC?
5 participants