You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
For a while I’ve been wondering what it would be like to use
arenas in Rust.
In C and C++ I have been turning to arenas more and more as a fast alternative
to heap allocation. If you have a bunch of objects that share a common
lifetime, arenas offer cheaper allocation and much cheaper deallocation than
the heap. The more I use this pattern, the more it feels downright wasteful to
use heap allocation when an arena would do.
I’ve been wanting to know how arenas would play with Rust’s lifetime semantics.
An arena must always outlive all the objects allocated from that arena. Rust’s
lifetime system seems ideal for expressing a condition like this. I was
curious to see how this plays out in practice.
Arena APIs
C and C++
First I will present the arena APIs I am familiar with in C and C++. Here is a
simplified version of the C++ Arena API for
protobuf:
// The C++ Arena is thread-safe (Functions taking Arena* may be called// concurrently, except the destructor).classArena{public:Arena();~Arena();// Frees all objects in the arena.
// Creates an object on the arena. The object is freed when the arena is // destroyed. The destructor will be run unless it is trivial. template<classT,class...Args> staticTCreate(Arenaarena,Args&&...args); }
// When the arena is destroyed, the individual objects are freed. }
Here is a similar but somewhat different example in C, from the upb protobuf library:
// The C arena is thread-compatible, but not thread-safe (functions that take// upb_arena* may not be called concurrently).upb_arena*upb_arena_new(void);
// Frees all memory in the arena. voidupb_arena_free(upb_arena*a);
// Allocates so memory from the arena. voidupb_arena_malloc(upb_arenaa,size_tsize);
upb_arena_free(arena);// All the individual objects are freed. }
In both the C and C++ versions, it is the user’s responsibility to make sure
that arena-allocated objects are not used past the lifetime of the arena. C
and C++ are not memory safe languages, and they offer no lifetime checking that
would help us avoiding dangling pointers here.
Even techniques like unique_ptr can’t help us here since the objects cannot
be freed independently. We could get dynamic checking by using unique_ptr
with a custom deleter that decrements a refcount on the arena, and then panic if
an arena is destroyed with a non-zero refcount. This would be better than
nothing, but it’s still a runtime check (not compile time). In any case
neither arena uses this technique at the moment, so respecting the arena’s
lifetime is entirely the responsibility of the user.
The C and C++ versions above have different thread-safety properties: the C++
arena is thread-safe (concurrent allocations are allowed), while the C arena is
thread-compatible (concurrent allocations are not allowed, because they access
a mutable pointer). This means the C++ API is paying an efficiency/complexity
overhead to allow concurrent mutable access.
Rust
The most established Rust Arena library I could find was called
Bumpalo. Its API is slightly different
than both of these:
pubstructBump{/* ... */}
implBump{ pubfnnew()->Bump{/* ... */}
<span>/// Allocate an object in this `Bump` and return an exclusive reference to</span>
<span>/// it.</span>
<span>pub</span> <span>fn</span> <span>alloc</span><span><</span><span>T</span><span>></span><span>(</span><span>&</span><span>self</span><span>,</span> <span>val</span><span>:</span> <span>T</span><span>)</span> <span>-></span> <span>&</span><span>mut</span> <span>T</span> <span>{</span> <span>/* ... */</span> <span>}</span>
}
// Bump cannot be shared between threads! impl!SyncforBump{}
implDropforBump{ /// Frees all the objects in the arena, without calling their Drop. fndrop(&mutself){/* ... */} }
fnTest(){ letbump=Bump::new();
<span>let</span> <span>i1</span> <span>=</span> <span>bump</span><span>.alloc</span><span>(</span><span>1</span><span>);</span>
<span>let</span> <span>i1</span> <span>=</span> <span>bump</span><span>.alloc</span><span>(</span><span>2</span><span>);</span>
<span>let</span> <span>i1</span> <span>=</span> <span>bump</span><span>.alloc</span><span>(</span><span>3</span><span>);</span>
<span>// When bump is dropped, the integers are freed.</span>
}
The Bump::alloc() function returns a mutable reference &mut T. Rust’s
lifetime system will statically ensure that the reference doesn’t outlive the
arena. We will explore some consequences of this below.
From a thread-safety perspective, Rust’s Bump is distinct from both the
C and C++ arenas above. Like the C arena, Bump does not perform any internal
synchronization, so it avoids the overheads of the C++ thread-safe arena.
But unlike the C arena, Bump allows allocation through an immutable reference;
in other words it uses “interior mutability.” To ensure that it is not used
concurrently, it expressly forbids sharing between threads by being !Sync.
If we use the analysis in my previous article, Bump lives in the upper right quadrant
with Cell (indeed Bump internally uses Cell, which is what prevents it
from being Sync).
Why is Rust’s “Bump” not “Sync”?
An interesting question is why Bump in Rust chooses not to implement Sync.
If we want to avoid synchronization overhead, our two main choices are:
Avoid interior mutability (use &mut self), and implement Sync.
Use interior mutability (use &self), and do not implement
Sync (likely using Cell internally).
(1) is safe because it only mutates through a &mut reference, which is
guaranteed to be unique and therefore cannot race with anything. (2) is safe
because while mutation can happen from any reference, all references are bound
to a single thread, so this cannot create a data race.
Bump chooses (2), but we could imagine an alternate world where it had
chosen (1) instead:
pubstructSyncBump{/* ... */}
// Alternate version of Bump that is sync, but stil avoids syncronization // overhead: implSyncBump{ pubfnnew()->Bump{/* ... */}
We have to use different, arena-aware variations of containers like String
and Vec. This is intended to be a temporary situation which will eventually
be remedied by adding allocator support to the standard
containers.
More interesting and more fundamental is that our struct now has a lifetime
parameter 'a that constrains the struct in a way that the original struct was
not constrained. This makes sense, as it expresses the fact that our struct
must be outlived by the arena bump. Rust will require that we propagate
this lifetime parameter to any other struct that contains ArenaFoo.
In my experience trying this out on a hobby project PDF parser, this all worked
out reasonably well. I had to put lifetime parameters on most of my types, but
it didn’t really cause a problem in my testing.
Having the compiler automatically check the lifetimes was very satisfying; it
was a level of static safety checking that I have never gotten to experience in
C or C++.
Hiding the lifetime parameter
Things changed once I tried to expose my parser to Python through the
PyO3 Python bindings for Rust. A Rust struct exposed as a
Python class through
#[pyclass]cannot
have type parameters. Here is my
attempt at a basic “Hello, World” of exposing ArenaFoo to Python:
This naturally led me to brainstorm how I could put my ArenaFoo<'a> inside of
a struct with no lifetime constraints. This question is not specific to my
problem of trying to bind to Python, it is a more general question: is it
possible for a type to use arenas internally, in a way that is invisible to the
user? Can we hide the lifetime parameter, so that our usage of arenas can be a
purely internal concern that does not impose any extra constraints on users of
the type?
My first thought was to bundle the Bump inside of the same struct, thereby
guaranteeing that it will have the same lifetime:
By itself this does nothing to convince the compiler that the 'a is
unnecessary on ArenaFoo<'a>.
I asked about this on Rust’s Discord, and I was informed that this pattern is
known as the “self-referential struct”, and it is notorously difficult to make
sound. The problem is that str and vec would have references to bump in
the steady state. Rust’s normal rules would allow anyone with &mut ArenaFoo
to get &mut Bump, but this would be unsound if other members of the struct
have &Bump references.
I was referred to the ouroboros
and owning_ref
crates, which can help users construct self-referencing structs in a reasonably
sound way (I was told that both have soundness holes, but that these are
apparently fixable).
error[E0277]: `NonNull<i32>` cannot be sent between threads safely
--> src/lib.rs:8:1
|
8 | #[pyclass]
| ^^^^^^^^^^ `NonNull<i32>` cannot be sent between threads safely
|
::: /Users/haberman/.cargo/registry/src/github.lhy31512.workers.dev-1ecc6299db9ec823/pyo3-0.14.5/src/class/impl_.rs:328:33
|
328 | pub struct ThreadCheckerStub<T: Send>(PhantomData<T>);
| ---- required by this bound in `ThreadCheckerStub`
|
= help: within `SelfReferencingFoo`, the trait `Send` is not implemented for `NonNull<i32>`
= note: required because it appears within the type `bumpalo::collections::raw_vec::RawVec<'static, i32>`
= note: required because it appears within the type `bumpalo::collections::Vec<'static, i32>`
note: required because it appears within the type `SelfReferencingFoo`
--> src/lib.rs:10:8
|
10 | struct SelfReferencingFoo {
| ^^^^^^^^^^^^^^^^^^
= note: this error originates in the attribute macro `pyclass` (in Nightly builds, run with -Z macro-backtrace for more info)
error[E0277]: Cell<NonNull<bumpalo::ChunkFooter>> cannot be shared between threads safely
--> src/lib.rs:8:1
|
8 | #[pyclass]
| ^^^^^^^^^^ Cell<NonNull<bumpalo::ChunkFooter>> cannot be shared between threads safely
|
::: /Users/haberman/.cargo/registry/src/github.lhy31512.workers.dev-1ecc6299db9ec823/pyo3-0.14.5/src/class/impl_.rs:328:33
|
328 | pub struct ThreadCheckerStub<T: Send>(PhantomData<T>);
| ---- required by this bound in ThreadCheckerStub
|
= help: within bumpalo::Bump, the trait Sync is not implemented for Cell<NonNull<bumpalo::ChunkFooter>>
= note: required because it appears within the type bumpalo::Bump
= note: required because of the requirements on the impl of Send for &'static bumpalo::Bump
= note: required because it appears within the type bumpalo::collections::raw_vec::RawVec<'static, i32>
= note: required because it appears within the type bumpalo::collections::Vec<'static, i32>
note: required because it appears within the type SelfReferencingFoo
--> src/lib.rs:10:8
|
10 | struct SelfReferencingFoo {
| ^^^^^^^^^^^^^^^^^^
= note: this error originates in the attribute macro pyclass (in Nightly builds, run with -Z macro-backtrace for more info)
The main problem here is that Bump is not Sync. This unfortunately
prevents SelfReferencingFoo from being Send; a significant limitation.
For the PyO3 case, while PyO3 does allow you to specify
#[pyclass(unsendable)]
to indicate a class that doesn’t support Send, this will cause a runtime
panic if a Python user accesses the object from a different Python thread
than the one that created it. This might be ok for a library that is just
for experimentation, but it would not be an acceptable limitation for a
production-quality Python library.
Conclusion
Rust’s lifetime system offers the very attractive possibility
of having the type system automatically check lifetimes of arena-allocated
objects. And indeed, this lifetime checking worked great in the scenarios
where I was able to use it.
I was able to use ouroboros to make my usage of arenas an internal concern of
my type. A self-referencing struct can allow the arena to be packaged together
with the references, such that we do not need to impose a lifetime parameter on
users of the struct.
I was unfortunately not able to find a satisfactory solution to using arenas in
Rust while supporting Send, which hampered my ability to wrap my library in
Python. The combination of (1) !Sync on Bump and (2) arena-aware
containers that store &Bump resulted in types that are not Send, which for
my case was too big a limitation to move forward with.
This led me to conclude that, for the time being, we will need to avoid storing
Bump& in any structure if we need it to support Send.
The only other idea that came to mind was to make Bump truly thread-safe.
Then it could support allocation through &self but also support Sync.
Protobuf’s C++ thread-safe allocator has been optimized extensively to keep the
synchronization overhead low: a thread-local is used as a cache to quickly
resolve to a structure specific to that arena and thread, from which
allocations can happen without synchronization. There is definitely some
significant complexity and a bit of overhead to this, but it is one option that
could potentially solve the issue.
via Josh Haberman
August 21, 2024 at 09:35AM
The text was updated successfully, but these errors were encountered:
Arenas and Rust
https://ift.tt/KUlFZ6a
For a while I’ve been wondering what it would be like to use arenas in Rust. In C and C++ I have been turning to arenas more and more as a fast alternative to heap allocation. If you have a bunch of objects that share a common lifetime, arenas offer cheaper allocation and much cheaper deallocation than the heap. The more I use this pattern, the more it feels downright wasteful to use heap allocation when an arena would do.
I’ve been wanting to know how arenas would play with Rust’s lifetime semantics. An arena must always outlive all the objects allocated from that arena. Rust’s lifetime system seems ideal for expressing a condition like this. I was curious to see how this plays out in practice.
Arena APIs
C and C++
First I will present the arena APIs I am familiar with in C and C++. Here is a simplified version of the C++ Arena API for protobuf:
Here is a similar but somewhat different example in C, from the upb protobuf library:
In both the C and C++ versions, it is the user’s responsibility to make sure that arena-allocated objects are not used past the lifetime of the arena. C and C++ are not memory safe languages, and they offer no lifetime checking that would help us avoiding dangling pointers here.
Even techniques like
unique_ptr
can’t help us here since the objects cannot be freed independently. We could get dynamic checking by usingunique_ptr
with a custom deleter that decrements a refcount on the arena, and then panic if an arena is destroyed with a non-zero refcount. This would be better than nothing, but it’s still a runtime check (not compile time). In any case neither arena uses this technique at the moment, so respecting the arena’s lifetime is entirely the responsibility of the user.The C and C++ versions above have different thread-safety properties: the C++ arena is thread-safe (concurrent allocations are allowed), while the C arena is thread-compatible (concurrent allocations are not allowed, because they access a mutable pointer). This means the C++ API is paying an efficiency/complexity overhead to allow concurrent mutable access.
Rust
The most established Rust Arena library I could find was called Bumpalo. Its API is slightly different than both of these:
The
Bump::alloc()
function returns a mutable reference&mut T
. Rust’s lifetime system will statically ensure that the reference doesn’t outlive the arena. We will explore some consequences of this below.From a thread-safety perspective, Rust’s
Bump
is distinct from both the C and C++ arenas above. Like the C arena,Bump
does not perform any internal synchronization, so it avoids the overheads of the C++ thread-safe arena. But unlike the C arena,Bump
allows allocation through an immutable reference; in other words it uses “interior mutability.” To ensure that it is not used concurrently, it expressly forbids sharing between threads by being!Sync
.If we use the analysis in my previous article,
Bump
lives in the upper right quadrant withCell
(indeedBump
internally usesCell
, which is what prevents it from beingSync
).Why is Rust’s “Bump” not “Sync”?
An interesting question is why
Bump
in Rust chooses not to implementSync
. If we want to avoid synchronization overhead, our two main choices are:&mut self
), and implementSync
.&self
), and do not implementSync
(likely usingCell
internally).(1) is safe because it only mutates through a
&mut
reference, which is guaranteed to be unique and therefore cannot race with anything. (2) is safe because while mutation can happen from any reference, all references are bound to a single thread, so this cannot create a data race.Bump
chooses (2), but we could imagine an alternate world where it had chosen (1) instead:Unfortunately this will not actually work. Due to a limitation in Rust’s syntax, any call to
alloc()
will mutably borrowSyncBump
for the lifetime of the returned object. The net effect is that you can only allocate a single object from the arena at a time, a limitation so severe that it makesSyncBump
useless.This means that, for the time being, interior mutability (a la Bumpalo) is the only feasible way to implement a non-thread-safe arena in Rust.
Ergonomics of Arenas in Rust
What’s it like to use Arenas in Rust? Let’s take a simple struct that doesn’t use arenas:
If we want to put the string/vec data on an arena instead, it looks like this:
We have to use different, arena-aware variations of containers like
String
andVec
. This is intended to be a temporary situation which will eventually be remedied by adding allocator support to the standard containers.More interesting and more fundamental is that our struct now has a lifetime parameter
'a
that constrains the struct in a way that the original struct was not constrained. This makes sense, as it expresses the fact that our struct must be outlived by the arenabump
. Rust will require that we propagate this lifetime parameter to any other struct that containsArenaFoo
.In my experience trying this out on a hobby project PDF parser, this all worked out reasonably well. I had to put lifetime parameters on most of my types, but it didn’t really cause a problem in my testing.
Having the compiler automatically check the lifetimes was very satisfying; it was a level of static safety checking that I have never gotten to experience in C or C++.
Hiding the lifetime parameter
Things changed once I tried to expose my parser to Python through the PyO3 Python bindings for Rust. A Rust struct exposed as a Python class through
#[pyclass]
cannot have type parameters. Here is my attempt at a basic “Hello, World” of exposingArenaFoo
to Python:The compiler rejects this:
This naturally led me to brainstorm how I could put my
ArenaFoo<'a>
inside of a struct with no lifetime constraints. This question is not specific to my problem of trying to bind to Python, it is a more general question: is it possible for a type to use arenas internally, in a way that is invisible to the user? Can we hide the lifetime parameter, so that our usage of arenas can be a purely internal concern that does not impose any extra constraints on users of the type?My first thought was to bundle the
Bump
inside of the same struct, thereby guaranteeing that it will have the same lifetime:By itself this does nothing to convince the compiler that the
'a
is unnecessary onArenaFoo<'a>
.I asked about this on Rust’s Discord, and I was informed that this pattern is known as the “self-referential struct”, and it is notorously difficult to make sound. The problem is that
str
andvec
would have references tobump
in the steady state. Rust’s normal rules would allow anyone with&mut ArenaFoo
to get&mut Bump
, but this would be unsound if other members of the struct have&Bump
references.I was referred to the ouroboros and owning_ref crates, which can help users construct self-referencing structs in a reasonably sound way (I was told that both have soundness holes, but that these are apparently fixable).
Using ouroboros for self-referencing structs
With ouroboros, things were looking promising:
However the Rust compiler threw errors:
The main problem here is that
Bump
is notSync
. This unfortunately preventsSelfReferencingFoo
from beingSend
; a significant limitation.For the PyO3 case, while PyO3 does allow you to specify
#[pyclass(unsendable)]
to indicate a class that doesn’t supportSend
, this will cause a runtime panic if a Python user accesses the object from a different Python thread than the one that created it. This might be ok for a library that is just for experimentation, but it would not be an acceptable limitation for a production-quality Python library.Conclusion
Rust’s lifetime system offers the very attractive possibility of having the type system automatically check lifetimes of arena-allocated objects. And indeed, this lifetime checking worked great in the scenarios where I was able to use it.
I was able to use ouroboros to make my usage of arenas an internal concern of my type. A self-referencing struct can allow the arena to be packaged together with the references, such that we do not need to impose a lifetime parameter on users of the struct.
I was unfortunately not able to find a satisfactory solution to using arenas in Rust while supporting
Send
, which hampered my ability to wrap my library in Python. The combination of (1)!Sync
onBump
and (2) arena-aware containers that store&Bump
resulted in types that are notSend
, which for my case was too big a limitation to move forward with.This led me to conclude that, for the time being, we will need to avoid storing
Bump&
in any structure if we need it to supportSend
.The only other idea that came to mind was to make
Bump
truly thread-safe. Then it could support allocation through&self
but also supportSync
. Protobuf’s C++ thread-safe allocator has been optimized extensively to keep the synchronization overhead low: a thread-local is used as a cache to quickly resolve to a structure specific to that arena and thread, from which allocations can happen without synchronization. There is definitely some significant complexity and a bit of overhead to this, but it is one option that could potentially solve the issue.via Josh Haberman
August 21, 2024 at 09:35AM
The text was updated successfully, but these errors were encountered: