Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add support for RegExp.leftContext RegExp.rightContext #923

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

ammarahm-ed
Copy link
Contributor

@ammarahm-ed ammarahm-ed commented Feb 20, 2025

This pull request to quickjs.c introduces several changes to enhance the handling of regular expression contexts within the JSContext structure. The most important changes include adding new fields to store RegExp context information, initializing these fields in context creation, freeing them in context destruction, and implementing new functions to manage these contexts.

Enhancements to RegExp context handling:

  • quickjs.c: Added new fields regexp_left_ctx, regexp_right_ctx, regexp_last_match_str, regexp_last_match_start, and regexp_last_match_end to the JSContext structure to store RegExp context information.
  • quickjs.c: Initialized the new RegExp context fields in the JS_NewContextRaw function.
  • quickjs.c: Freed the new RegExp context fields in the JS_FreeContext function to prevent memory leaks.
  • quickjs.c: Implemented new functions js_regexp_get_leftContext and js_regexp_get_rightContext to retrieve and manage the left and right context of the last RegExp match.
  • quickjs.c: Updated the js_regexp_exec function to store the last match string and its start and end positions in the new context fields.
// Define a string to test
const testString = "Hello, world!";

// Create a regular expression to match the word "world"
const regex = /world/;

// Execute the regular expression on the test string
const result = regex.exec(testString);

// Check if the match was successful
if (result) {
    // Print the matched result
    console.log("Matched:", result[0]);

    // Print the left context
    console.log("Left Context:", RegExp.leftContext);

    // Print the right context
    console.log("Right Context:", RegExp.rightContext);
} else {
    console.log("No match found.");
}

Closes #921

Copy link
Contributor

@bnoordhuis bnoordhuis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One problem I see with this approach is that regexp_last_match_str can stay around indefinitely (also applies to the substrings.)

Imagine you run a regex on a 100 MB string. That string can't be reclaimed until the next time a regexp runs - which may be never.

@ammarahm-ed ammarahm-ed requested a review from bnoordhuis March 2, 2025 04:54
@ammarahm-ed
Copy link
Contributor Author

One thing we can do is probably calculate left & right contexts together and free the last matched string. Might help with reducing memory usage.

@bnoordhuis
Copy link
Contributor

That might ameliorate it but won't fix it structurally.

I'm personally leaning towards not merging this for that reason (users shouldn't pay the cost for a feature they don't use) but a possible midway solution is adding a void JS_AddIntrinsicLegacyRegExp(JSContext *ctx) function (actual name TBD) that opts into them.

If you decide to go down that path, I'd also like to see that tested in run-test262.c (but only when a test specifies legacy-regexp) and maybe a qjs command line flag.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RegExp.leftContext unsupported
2 participants