Add support for custom parsing of APC, SOS and PM sequences. #115

Open
wants to merge 5 commits into
base: master
292 changes: 287 additions & 5 deletions src/lib.rs
@@ -179,7 +179,9 @@ impl<const OSC_RAW_BUF_SIZE: usize> Parser<OSC_RAW_BUF_SIZE> {
             State::Escape => self.advance_esc(performer, byte),
             State::EscapeIntermediate => self.advance_esc_intermediate(performer, byte),
             State::OscString => self.advance_osc_string(performer, byte),
-            State::SosPmApcString => self.anywhere(performer, byte),
+            State::SosString => self.advance_opaque_string(SosDispatch(performer), byte),
+            State::ApcString => self.advance_opaque_string(ApcDispatch(performer), byte),
+            State::PmString => self.advance_opaque_string(PmDispatch(performer), byte),
             State::Ground => unreachable!(),
         }
     }
@@ -356,7 +358,10 @@ impl<const OSC_RAW_BUF_SIZE: usize> Parser<OSC_RAW_BUF_SIZE> {
                 performer.esc_dispatch(self.intermediates(), self.ignoring, byte);
                 self.state = State::Ground
             },
-            0x58 => self.state = State::SosPmApcString,
+            0x58 => {
+                performer.sos_start();
+                self.state = State::SosString
+            },
             0x59..=0x5A => {
                 performer.esc_dispatch(self.intermediates(), self.ignoring, byte);
                 self.state = State::Ground
@@ -374,7 +379,14 @@ impl<const OSC_RAW_BUF_SIZE: usize> Parser<OSC_RAW_BUF_SIZE> {
                 self.osc_num_params = 0;
                 self.state = State::OscString
             },
-            0x5E..=0x5F => self.state = State::SosPmApcString,
+            0x5E => {
+                performer.pm_start();
+                self.state = State::PmString
+            },
+            0x5F => {
+                performer.apc_start();
+                self.state = State::ApcString
+            },
             0x60..=0x7E => {
                 performer.esc_dispatch(self.intermediates(), self.ignoring, byte);
                 self.state = State::Ground
@@ -434,6 +446,28 @@ impl<const OSC_RAW_BUF_SIZE: usize> Parser<OSC_RAW_BUF_SIZE> {
}
}

#[inline(always)]
fn advance_opaque_string<D: OpaqueDispatch>(&mut self, mut dispatcher: D, byte: u8) {
match byte {
0x07 => {
dispatcher.opaque_end();
self.state = State::Ground
},
0x18 | 0x1A => {
dispatcher.opaque_end();
dispatcher.execute(byte);
self.state = State::Ground
},
0x1B => {
dispatcher.opaque_end();
self.state = State::Escape
},
0x20..=0xFF => dispatcher.opaque_put(byte),
// Ignore all other control bytes.
_ => (),
}
}

#[inline(always)]
fn anywhere<P: Perform>(&mut self, performer: &mut P, byte: u8) {
match byte {
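The byte handling in `advance_opaque_string` can be read as a small classification table. The sketch below restates it as a standalone function so each case can be checked in isolation; the `OpaqueAction` enum and `classify_opaque_byte` are hypothetical names used only for illustration and are not part of the patch, which calls the dispatcher methods directly.

```rust
/// What the parser does with one byte while inside an SOS, PM, or APC string.
/// Hypothetical enum for illustration; the patch has no such type.
#[derive(Debug, PartialEq, Eq)]
enum OpaqueAction {
    /// BEL (0x07): end the string and return to the ground state.
    End,
    /// CAN (0x18) or SUB (0x1A): end the string, then execute the control byte.
    EndAndExecute(u8),
    /// ESC (0x1B): end the string and continue in the escape state.
    EndAndEscape,
    /// 0x20..=0xFF: forward the byte as payload.
    Put(u8),
    /// Any other C0 control byte: dropped.
    Ignore,
}

/// Mirrors the `match byte` arms of `advance_opaque_string` in the diff.
fn classify_opaque_byte(byte: u8) -> OpaqueAction {
    match byte {
        0x07 => OpaqueAction::End,
        0x18 | 0x1A => OpaqueAction::EndAndExecute(byte),
        0x1B => OpaqueAction::EndAndEscape,
        0x20..=0xFF => OpaqueAction::Put(byte),
        _ => OpaqueAction::Ignore,
    }
}
```

Because ESC only ends the string and hands over to the escape state, a two-byte `ESC \` terminator surfaces as an `*_end` call followed by a separate `esc_dispatch` for `0x5C`, which is what the tests in this patch expect.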
@@ -743,7 +777,9 @@ enum State {
     Escape,
     EscapeIntermediate,
     OscString,
-    SosPmApcString,
+    SosString,
+    ApcString,
+    PmString,
     #[default]
     Ground,
 }
@@ -811,6 +847,40 @@ pub trait Perform {
/// subsequent characters were ignored.
fn esc_dispatch(&mut self, _intermediates: &[u8], _ignore: bool, _byte: u8) {}

/// Invoked when the beginning of a new SOS (Start of String) sequence is
/// encountered.
fn sos_start(&mut self) {}

/// Invoked for every valid byte (0x20-0xFF) in an SOS (Start of String)
/// sequence.
fn sos_put(&mut self, _byte: u8) {}

/// Invoked when the end of an SOS (Start of String) sequence is
/// encountered.
fn sos_end(&mut self) {}

/// Invoked when the beginning of a new PM (Privacy Message) sequence is
/// encountered.
fn pm_start(&mut self) {}

/// Invoked for every valid byte (0x20-0xFF) in a PM (Privacy Message)
/// sequence.
fn pm_put(&mut self, _byte: u8) {}

/// Invoked when the end of a PM (Privacy Message) sequence is encountered.
fn pm_end(&mut self) {}

/// Invoked when the beginning of a new APC (Application Program Command)
/// sequence is encountered.
fn apc_start(&mut self) {}

/// Invoked for every valid byte (0x20-0xFF) in an APC (Application Program
/// Command) sequence.
fn apc_put(&mut self, _byte: u8) {}

/// Invoked when the end of an APC (Application Program Command) sequence is
/// encountered.
fn apc_end(&mut self) {}

/// Whether the parser should terminate prematurely.
///
/// This can be used in conjunction with
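As a rough illustration of how a consumer might use the new start/put/end callbacks, the sketch below collects APC payloads into buffers. It assumes a stand-in trait, `ApcCallbacks`, that copies only the three APC method signatures from this diff so the example compiles without the `vte` crate; `ApcCollector` and `feed_apc` are likewise hypothetical helpers.

```rust
/// Stand-in for the three APC callbacks this patch adds to `Perform`.
/// Hypothetical trait; the real methods live on `Perform`.
trait ApcCallbacks {
    fn apc_start(&mut self) {}
    fn apc_put(&mut self, _byte: u8) {}
    fn apc_end(&mut self) {}
}

/// Accumulates each APC payload into its own buffer.
#[derive(Default)]
struct ApcCollector {
    /// Payload of the sequence currently being received, if any.
    current: Option<Vec<u8>>,
    /// Payloads of all completed sequences, in arrival order.
    finished: Vec<Vec<u8>>,
}

impl ApcCallbacks for ApcCollector {
    fn apc_start(&mut self) {
        self.current = Some(Vec::new());
    }

    fn apc_put(&mut self, byte: u8) {
        if let Some(buf) = self.current.as_mut() {
            buf.push(byte);
        }
    }

    fn apc_end(&mut self) {
        if let Some(buf) = self.current.take() {
            self.finished.push(buf);
        }
    }
}

/// Issues the callbacks in the order the diff suggests for one complete
/// APC sequence: one start, one put per payload byte, one end.
fn feed_apc(collector: &mut ApcCollector, payload: &[u8]) {
    collector.apc_start();
    for &byte in payload {
        collector.apc_put(byte);
    }
    collector.apc_end();
}
```

Since the callbacks are byte-at-a-time, any buffering or size limit is left to the implementor, in contrast to the fixed-size raw buffer the parser keeps for OSC.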
@@ -825,6 +895,73 @@ pub trait Perform {
}
}

/// This trait is used internally to provide a common implementation for Opaque
/// Sequences (SOS, APC, PM). Implementations of this trait will just forward
/// calls to the equivalent method on [Perform]. Implementations of this trait
/// are always inlined to avoid overhead.
trait OpaqueDispatch {
fn execute(&mut self, byte: u8);
fn opaque_put(&mut self, byte: u8);
fn opaque_end(&mut self);
}
Comment on lines +902 to +906
Member

Honestly, this trait and its implementations are the main issue I have with this patch. It should work fine, since it all gets inlined and essentially optimized out, but it's still a whole lot of boilerplate.

I'm not sure I have any better ideas for now, but at the very least this trait needs a comment explaining that it's just a helper for dispatching over the opaque string escapes and in practice is inlined everywhere to function as static dispatch without conditional indirection.

Author

This trait is the best solution I have come up with so far, which means it is the least terrible one. I'll add a comment.
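One possible way to trim the boilerplate discussed in this thread, offered as a sketch and not as what the patch does: generate the three wrapper types with a `macro_rules!` macro. The `Perform` trait below is a trimmed stand-in holding only the methods the wrappers forward to, and `opaque_dispatch!` and `Recorder` are hypothetical names. Dispatch stays static, since each wrapper is still a distinct type with `#[inline(always)]` forwarding methods.

```rust
/// Trimmed stand-in for `Perform` (only the methods the wrappers call).
trait Perform {
    fn execute(&mut self, _byte: u8) {}
    fn sos_put(&mut self, _byte: u8) {}
    fn sos_end(&mut self) {}
    fn pm_put(&mut self, _byte: u8) {}
    fn pm_end(&mut self) {}
    fn apc_put(&mut self, _byte: u8) {}
    fn apc_end(&mut self) {}
}

/// Same shape as the helper trait in the diff.
trait OpaqueDispatch {
    fn execute(&mut self, byte: u8);
    fn opaque_put(&mut self, byte: u8);
    fn opaque_end(&mut self);
}

/// Hypothetical macro: defines one wrapper struct and its forwarding impl.
macro_rules! opaque_dispatch {
    ($name:ident, $put:ident, $end:ident) => {
        struct $name<'a, P: Perform>(&'a mut P);

        impl<P: Perform> OpaqueDispatch for $name<'_, P> {
            #[inline(always)]
            fn execute(&mut self, byte: u8) {
                self.0.execute(byte);
            }

            #[inline(always)]
            fn opaque_put(&mut self, byte: u8) {
                self.0.$put(byte);
            }

            #[inline(always)]
            fn opaque_end(&mut self) {
                self.0.$end();
            }
        }
    };
}

opaque_dispatch!(SosDispatch, sos_put, sos_end);
opaque_dispatch!(PmDispatch, pm_put, pm_end);
opaque_dispatch!(ApcDispatch, apc_put, apc_end);

/// Hypothetical performer that records only APC activity.
#[derive(Default)]
struct Recorder {
    apc_payload: Vec<u8>,
    apc_ended: bool,
}

impl Perform for Recorder {
    fn apc_put(&mut self, byte: u8) {
        self.apc_payload.push(byte);
    }

    fn apc_end(&mut self) {
        self.apc_ended = true;
    }
}
```

The trade-off is readability: the macro removes roughly forty lines of near-identical code but hides the wrapper definitions from a plain text search, so whether it is an improvement is a matter of taste.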


struct SosDispatch<'a, P: Perform>(&'a mut P);

impl<P: Perform> OpaqueDispatch for SosDispatch<'_, P> {
#[inline(always)]
fn execute(&mut self, byte: u8) {
self.0.execute(byte);
}

#[inline(always)]
fn opaque_put(&mut self, byte: u8) {
self.0.sos_put(byte);
}

#[inline(always)]
fn opaque_end(&mut self) {
self.0.sos_end();
}
}

struct ApcDispatch<'a, P: Perform>(&'a mut P);

impl<P: Perform> OpaqueDispatch for ApcDispatch<'_, P> {
#[inline(always)]
fn execute(&mut self, byte: u8) {
self.0.execute(byte);
}

#[inline(always)]
fn opaque_put(&mut self, byte: u8) {
self.0.apc_put(byte);
}

#[inline(always)]
fn opaque_end(&mut self) {
self.0.apc_end();
}
}

struct PmDispatch<'a, P: Perform>(&'a mut P);

impl<P: Perform> OpaqueDispatch for PmDispatch<'_, P> {
#[inline(always)]
fn execute(&mut self, byte: u8) {
self.0.execute(byte);
}

#[inline(always)]
fn opaque_put(&mut self, byte: u8) {
self.0.pm_put(byte);
}

#[inline(always)]
fn opaque_end(&mut self) {
self.0.pm_end();
}
}

#[cfg(all(test, not(feature = "std")))]
#[macro_use]
extern crate std;
@@ -842,12 +979,21 @@ mod tests {
         b'c', b'r', b'i', b't', b't', b'y', 0x07, // End OSC
     ];

+    const ST_ESC_SEQUENCE: &[Sequence] = &[Sequence::Esc(vec![], false, 0x5C)];
+
     #[derive(Default)]
     struct Dispatcher {
         dispatched: Vec<Sequence>,
     }

-    #[derive(Debug, PartialEq, Eq)]
+    #[derive(Copy, Clone, Debug, PartialEq, Eq)]
+    enum OpaqueSequenceKind {
+        Sos,
+        Pm,
+        Apc,
+    }
+
+    #[derive(Clone, Debug, PartialEq, Eq)]
     enum Sequence {
         Osc(Vec<Vec<u8>>, bool),
         Csi(Vec<Vec<u16>>, Vec<u8>, bool, char),
@@ -856,6 +1002,9 @@ mod tests {
         DcsPut(u8),
         Print(char),
         Execute(u8),
+        OpaqueStart(OpaqueSequenceKind),
+        OpaquePut(OpaqueSequenceKind, u8),
+        OpaqueEnd(OpaqueSequenceKind),
         DcsUnhook,
     }

@@ -897,6 +1046,42 @@ mod tests {
fn execute(&mut self, byte: u8) {
self.dispatched.push(Sequence::Execute(byte));
}

fn sos_start(&mut self) {
self.dispatched.push(Sequence::OpaqueStart(OpaqueSequenceKind::Sos));
}

fn sos_put(&mut self, byte: u8) {
self.dispatched.push(Sequence::OpaquePut(OpaqueSequenceKind::Sos, byte));
}

fn sos_end(&mut self) {
self.dispatched.push(Sequence::OpaqueEnd(OpaqueSequenceKind::Sos));
}

fn pm_start(&mut self) {
self.dispatched.push(Sequence::OpaqueStart(OpaqueSequenceKind::Pm));
}

fn pm_put(&mut self, byte: u8) {
self.dispatched.push(Sequence::OpaquePut(OpaqueSequenceKind::Pm, byte));
}

fn pm_end(&mut self) {
self.dispatched.push(Sequence::OpaqueEnd(OpaqueSequenceKind::Pm));
}

fn apc_start(&mut self) {
self.dispatched.push(Sequence::OpaqueStart(OpaqueSequenceKind::Apc));
}

fn apc_put(&mut self, byte: u8) {
self.dispatched.push(Sequence::OpaquePut(OpaqueSequenceKind::Apc, byte));
}

fn apc_end(&mut self) {
self.dispatched.push(Sequence::OpaqueEnd(OpaqueSequenceKind::Apc));
}
}

#[test]
@@ -1386,6 +1571,103 @@ mod tests {
}
}

fn expect_opaque_sequence(
input: &[u8],
kind: OpaqueSequenceKind,
expected_payload: &[u8],
expected_trailer: &[Sequence],
Member

This can just be a boolean, which simplifies this function and removes the need for the constant: `st_terminated: bool` or something like that.
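A minimal sketch of what the boolean variant suggested here could look like, limited to building the expected event list. The two enums are cut-down copies of the test types in this diff so the snippet is self-contained, and `expected_opaque_events` is a hypothetical helper name; feeding the parser and the final `assert_eq!` would stay as in the existing helper.

```rust
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
enum OpaqueSequenceKind {
    Sos,
    Pm,
    Apc,
}

/// Cut-down copy of the test `Sequence` enum (only the variants used here).
#[derive(Clone, Debug, PartialEq, Eq)]
enum Sequence {
    Esc(Vec<u8>, bool, u8),
    OpaqueStart(OpaqueSequenceKind),
    OpaquePut(OpaqueSequenceKind, u8),
    OpaqueEnd(OpaqueSequenceKind),
}

/// Builds the events expected for one opaque sequence. With
/// `st_terminated` set, the trailing `\` of `ESC \` is expected to be
/// dispatched as a plain escape, matching `ST_ESC_SEQUENCE` in the patch.
fn expected_opaque_events(
    kind: OpaqueSequenceKind,
    payload: &[u8],
    st_terminated: bool,
) -> Vec<Sequence> {
    let mut expected = vec![Sequence::OpaqueStart(kind)];
    expected.extend(payload.iter().map(|&byte| Sequence::OpaquePut(kind, byte)));
    expected.push(Sequence::OpaqueEnd(kind));
    if st_terminated {
        expected.push(Sequence::Esc(vec![], false, 0x5C));
    }
    expected
}
```

A call site would then read `expected_opaque_events(OpaqueSequenceKind::Sos, b"Test", true)` for the ST-terminated cases and pass `false` for the BEL-terminated ones.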

) {
let mut expected_dispatched: Vec<Sequence> = vec![Sequence::OpaqueStart(kind)];
Member

Suggested change
- let mut expected_dispatched: Vec<Sequence> = vec![Sequence::OpaqueStart(kind)];
+ let mut expected_dispatched = vec![Sequence::OpaqueStart(kind)];

Is this really necessary? I'd be extremely surprised if Rust wasn't able to infer it here.
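To illustrate the inference point with a reduced case: when `vec!` is given at least one element, that element fixes the vector's item type, so no annotation on the binding is needed. `Event` and `build_events` below are hypothetical stand-ins for the test's `Sequence` type and helper.

```rust
#[derive(Debug, PartialEq)]
enum Event {
    Start,
    Put(u8),
    End,
}

fn build_events(payload: &[u8]) -> Vec<Event> {
    // No `: Vec<Event>` annotation: `Event::Start` already determines the
    // element type of the vector.
    let mut events = vec![Event::Start];
    for &byte in payload {
        events.push(Event::Put(byte));
    }
    events.push(Event::End);
    events
}
```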

for byte in expected_payload {
expected_dispatched.push(Sequence::OpaquePut(kind, *byte));
}
expected_dispatched.push(Sequence::OpaqueEnd(kind));
for item in expected_trailer {
expected_dispatched.push(item.clone());
}

let mut dispatcher = Dispatcher::default();
let mut parser = Parser::new();
parser.advance(&mut dispatcher, input);

assert_eq!(dispatcher.dispatched, expected_dispatched);
}

#[test]
fn sos_c0_st_terminated() {
expect_opaque_sequence(
b"\x1bXTest\x20\xFF;xyz\x1b\\",
OpaqueSequenceKind::Sos,
b"Test\x20\xFF;xyz",
ST_ESC_SEQUENCE,
);
}

#[test]
fn sos_bell_terminated() {
expect_opaque_sequence(
b"\x1bXTest\x20\xFF;xyz\x07",
OpaqueSequenceKind::Sos,
b"Test\x20\xFF;xyz",
&[],
);
}

#[test]
fn sos_empty() {
expect_opaque_sequence(b"\x1bX\x1b\\", OpaqueSequenceKind::Sos, &[], ST_ESC_SEQUENCE);
}

#[test]
fn pm_c0_st_terminated() {
expect_opaque_sequence(
b"\x1b^Test\x20\xFF;xyz\x1b\\",
OpaqueSequenceKind::Pm,
b"Test\x20\xFF;xyz",
ST_ESC_SEQUENCE,
);
}

#[test]
fn pm_bell_terminated() {
expect_opaque_sequence(
b"\x1b^Test\x20\xFF;xyz\x07",
OpaqueSequenceKind::Pm,
b"Test\x20\xFF;xyz",
&[],
);
}

#[test]
fn pm_empty() {
expect_opaque_sequence(b"\x1b^\x1b\\", OpaqueSequenceKind::Pm, &[], ST_ESC_SEQUENCE);
}

#[test]
fn apc_c0_st_terminated() {
expect_opaque_sequence(
b"\x1b_Test\x20\xFF;xyz\x1b\\",
OpaqueSequenceKind::Apc,
b"Test\x20\xFF;xyz",
ST_ESC_SEQUENCE,
);
}

#[test]
fn apc_bell_terminated() {
expect_opaque_sequence(
b"\x1b_Test\x20\xFF;xyz\x07",
OpaqueSequenceKind::Apc,
b"Test\x20\xFF;xyz",
&[],
);
}

#[test]
fn apc_empty() {
expect_opaque_sequence(b"\x1b_\x1b\\", OpaqueSequenceKind::Apc, &[], ST_ESC_SEQUENCE);
}

#[test]
fn unicode() {
const INPUT: &[u8] = b"\xF0\x9F\x8E\x89_\xF0\x9F\xA6\x80\xF0\x9F\xA6\x80_\xF0\x9F\x8E\x89";