Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[XMLParser] Fix reentrancy issue around currentParser #5061

Merged

Conversation

kateinoigakukun
Copy link
Member

@kateinoigakukun kateinoigakukun commented Aug 9, 2024

Instead of thread-local storage, use TaskLocal to store the current
parser. This solves three issues:

  1. If someone calls XMLParser.parse() with a new parser instance in
    a delegate method call, it overwrote the current parser and wrote
    it back after the call as nil, not the previous current parser.
    This reentrancy issue can be a problem especially when someone uses
    external entity resolving since the feature depends on the current
    parser tracking. Using TaskLocal solves this issue since it tracks
    values as a stack and restores the previous value at the end of the
    withValue call.
  2. Since jobs of different tasks can be scheduled on the same thread,
    different tasks can refer to the same thread-local storage. This
    wouldn't be a problem for now since the parse() method doesn't
    have any suspention points and different tasks can't run on the same
    thread during the parsing. However, it's better to use TaskLocal
    to leverage the concurrency model of Swift.
  3. The global variable _currentParser existed in the WASI platform
    path but it's unsafe in the Swift concurrency model. It wouldn't be a
    problem on WASI since it's always single-threaded, we should avoid
    platform-specific assumption as much as possible.

The issue was revealed in #5057 (comment)

@kateinoigakukun
Copy link
Member Author

@swift-ci test

@kateinoigakukun
Copy link
Member Author

@swift-ci test

@MaxDesiatov
Copy link
Contributor

@swift-ci test windows

@MaxDesiatov MaxDesiatov requested a review from jmschonfeld August 9, 2024 11:02
@kateinoigakukun
Copy link
Member Author

kateinoigakukun commented Aug 9, 2024

Hmm, the new test case fails only in Linux CI...

@kateinoigakukun
Copy link
Member Author

@swift-ci test Linux

@kateinoigakukun kateinoigakukun force-pushed the yt/fix-xml-parser-reentrant branch from f75c04b to cc82da1 Compare August 10, 2024 07:30
@kateinoigakukun
Copy link
Member Author

@swift-ci test

@kateinoigakukun
Copy link
Member Author

Ok, a part of the new test depended on the behavior of libxml2 v2.9.11 and later, but the CI uses v2.9.10. Fixed it by adjusting test case.

/// While the ``XMLParser`` class itself is not `Sendable`, `TaskLocal`
/// requires the value type to be `Sendable`. The sendability requirement
/// of `TaskLocal` is only for the "default" value and values set with
/// `withValue` will not be shared between tasks.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've actually learned recently while investigating replacing FoundationEssentials._ThreadLocal with TaskLocal that this is not the case sadly. TaskLocal values are automatically inherited by child tasks. For that reason, the value needs to be Sendable because it is possible for a detached child task to also reference this value. I think that means that if someone were to call an API that used this value from a detached task created during a delegate call that unfortunately we'd have a concurrency safety violation

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a really good point. So this means we have to keep it in TLS instead of TaskLocal...

@kateinoigakukun kateinoigakukun force-pushed the yt/fix-xml-parser-reentrant branch from eed9992 to 32025d2 Compare August 23, 2024 06:29
@kateinoigakukun
Copy link
Member Author

@swift-ci test

@kateinoigakukun kateinoigakukun force-pushed the yt/fix-xml-parser-reentrant branch from 32025d2 to a4a80e1 Compare August 28, 2024 07:47
@kateinoigakukun
Copy link
Member Author

@swift-ci test

1 similar comment
@MaxDesiatov
Copy link
Contributor

@swift-ci test

@kateinoigakukun
Copy link
Member Author

Ok, test fixed. @jmschonfeld Could you take a look again? 🙏

@kateinoigakukun kateinoigakukun changed the title [XMLParser] Use TaskLocal for storing the current parser [XMLParser] Fix reentrancy issue around currentParser Aug 30, 2024
@kateinoigakukun
Copy link
Member Author

@jmschonfeld Gentle ping :)

Instead of thread-local storage, use `TaskLocal` to store the current
parser. This solves three issues:

1. If someone calls `XMLParser.parse()` with a new parser instance in
   a delegate method call, it overwrote the current parser and wrote
   it back after the call as `nil`, not the previous current parser.
   This reentrancy issue can be a problem especially when someone uses
   external entity resolving since the feature depends on the current
   parser tracking. Using `TaskLocal` solves this issue since it tracks
   values as a stack and restores the previous value at the end of the
   `withValue` call.
2. Since jobs of different tasks can be scheduled on the same thread,
   different tasks can refer to the same thread-local storage. This
   wouldn't be a problem for now since the `parse()` method doesn't
   have any suspention points and different tasks can't run on the same
   thread during the parsing. However, it's better to use `TaskLocal`
   to leverage the concurrency model of Swift.
3. The global variable `_currentParser` existed in the WASI platform
   path but it's unsafe in the Swift concurrency model. It wouldn't be a
   problem on WASI since it's always single-threaded, we should avoid
   platform-specific assumption as much as possible.
TaskLocal storage is inherited by non-detached child tasks, which can
lead to the parser being shared between tasks. This is not our intention
and can lead to inconsistent state. Instead, we should keep the current
parser in thread-local storage. This should be safe as long as we don't
have any structured suspension points in `withCurrentParser` block.
@kateinoigakukun kateinoigakukun force-pushed the yt/fix-xml-parser-reentrant branch from a4a80e1 to 54b22fe Compare September 15, 2024 02:45
@kateinoigakukun
Copy link
Member Author

@swift-ci test

@MaxDesiatov
Copy link
Contributor

@swift-ci test windows

@grynspan
Copy link
Contributor

I'm not a maintainer/codeowner of this repo so I'm going to hold off on approving, but don't take that as a blocker for merging. :)

Copy link
Contributor

@jmschonfeld jmschonfeld left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working through this one! This one LGTM now

@jmschonfeld jmschonfeld merged commit 86a733f into swiftlang:main Sep 16, 2024
2 of 3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants