Move exactness to heap types #18

tlively · 2025-03-18T02:03:20Z

Rather than having exactness as a new attribute of reference types, make
it a new attribute of heap types. By restricting exact heap types to
using type indices, we make it impossible to construct an exact abstract
heap type and thereby avoid introducing new uninhabitable types. This
change also dramatically simplifies the handling of exact types in
existing instructions; existing heap type immediates can now serve to
encode exactness rather than needing new encodings for the base
instructions.

sjrd

This looks very neat!

sjrd · 2025-03-18T02:24:19Z

proposals/custom-descriptors/Overview.md

-
-Similarly, we allow combining `exact` with the established shorthands in the text format.
-For example `(exact anyref)` is a shorthand for `(ref null exact any)`.
+Note that the type index being encoded as a `u32` instead of a `u33`


Suggested change

-Note that the type index being encoded as a `u32` instead of a `u33`
+Note that the type index being encoded as a `u32` instead of an `s33`

sjrd · 2025-03-18T02:28:07Z

proposals/custom-descriptors/Overview.md

-reftype :: ...
-  | 0x62 0x64 ht:heaptype => ref exact ht
-  | 0x62 0x63 ht:heaptype => ref null exact ht
+heaptype :: ... | 0x62 x:u32 => exact x


0x62 made sense as the next opcode in the reftype production. In the heaptype production, the next available opcode is 0x69. But we might also want to leave the 0x6x range entirely available for future abstract heap types. Perhaps 0x5F could then start a new category for heap type prefixes?
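
To make the encoding under discussion concrete, here is a minimal decoder sketch in Python. It assumes the `0x62 x:u32 => exact x` production from the diff above and the existing convention that a heap type is otherwise an `s33`, with non-negative values being type indices and single-byte negative values being abstract heap types. The function names and the tuple results are hypothetical, chosen only for illustration.

```python
EXACT_PREFIX = 0x62  # prefix byte taken from the diff; the opcode choice is still debated


def read_uleb(data, pos, max_bits=32):
    """Decode an unsigned LEB128 integer; returns (value, next_pos)."""
    result, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if byte & 0x80 == 0:
            break
    if result >= 1 << max_bits:
        raise ValueError("integer does not fit in u%d" % max_bits)
    return result, pos


def read_sleb(data, pos):
    """Decode a signed LEB128 integer; returns (value, next_pos)."""
    result, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if byte & 0x80 == 0:
            if byte & 0x40:  # sign bit of the final group
                result -= 1 << shift
            break
    return result, pos


def read_heaptype(data, pos=0):
    """Return ((kind, payload), next_pos) for one heap type."""
    if data[pos] == EXACT_PREFIX:
        # After the prefix only a type index can follow, so a plain u32 suffices.
        index, pos = read_uleb(data, pos + 1)
        return ("exact", index), pos
    value, pos = read_sleb(data, pos)
    if value >= 0:
        return ("index", value), pos
    # Assumes a single-byte abstract heap type; recover its opcode byte.
    return ("abstract", value & 0x7F), pos
```

Because the prefix byte already rules out an abstract heap type, the index after it does not need the sign bit that `s33` uses to tell indices and abstract types apart.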

rossberg

LGTM

jakobkummerow

LGTM

titzer · 2025-03-18T14:29:54Z

Just to be clear, the implication of this is that due to canonicalization, there can be two different, non-equivalent declarations of an exact type for a given heap type H, because those declarations were in different recursion groups.

Does that mean that we just require some exact type for the input to a custom-RTT allocation?

rossberg · 2025-03-18T14:49:45Z

@titzer, I'm not sure I understand your question, can you make it more concrete?

Perhaps this answers it(?): Even with this refactoring, exactness remains an attribute at the use site of type definitions. Two exact types are equivalent if and only if both refer to the same type index, or to type indices for equivalent types, regardless of the context in which they occur. So with respect to RTT operands, yes, `(ref (exact $t))` and `(ref (exact $u))` are interchangeable as long as `$t` and `$u` are.
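
The equivalence rule described here can be sketched as follows. This is only an illustration: `HeapType`, `equivalent`, and the `canon` mapping (type index to a canonical id, standing in for whatever canonicalization an engine performs) are hypothetical names, not part of any spec or implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HeapType:
    index: int           # type index at the use site
    exact: bool = False  # exactness is an attribute of the use, not the definition


def equivalent(a, b, canon):
    """Two heap types are equivalent iff their exactness matches and their
    indices refer to equivalent definitions (same canonical id in `canon`)."""
    return a.exact == b.exact and canon[a.index] == canon[b.index]


# Hypothetical module: indices 0 and 1 are equivalent definitions, 2 is distinct.
canon = {0: 7, 1: 7, 2: 9}
```

With this `canon`, `HeapType(0, exact=True)` and `HeapType(1, exact=True)` are interchangeable, while neither is equivalent to `HeapType(2, exact=True)` or to the inexact `HeapType(0)`.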

jakobkummerow · 2025-03-18T15:35:27Z

FWIW, while this may seem elegant and minimal from a spec perspective, my initial assessment is that the implementation impact will be huge: in V8 (and presumably most other implementations too), we have a lot of code that assumes that, essentially, HeapType == int ("s33", but with tighter limits per spec), so things like buffer->write_i32v(heap_type) just work; and to reflect this change, they'll all have to be refactored to use some kind of struct/class type instead. That's algorithmically easy, of course, but it looks like many hours of busy-work.

I hope y'all are sure that this is the right change to make.
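
A small sketch of why a plain integer writer no longer covers every heap type. This is not V8 code; `sleb`, `uleb`, and `write_heaptype` are hypothetical helpers, and the `0x62` prefix is taken from the diff in this PR.

```python
EXACT_PREFIX = 0x62  # from the diff in this PR


def sleb(value):
    """Encode a signed integer as LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        sign_set = bool(byte & 0x40)
        done = (value == 0 and not sign_set) or (value == -1 and sign_set)
        out.append(byte if done else byte | 0x80)
        if done:
            return bytes(out)


def uleb(value):
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def write_heaptype(index, exact=False):
    """An inexact index is still a single signed integer, but an exact one
    needs a prefix byte followed by an unsigned index."""
    if exact:
        return bytes([EXACT_PREFIX]) + uleb(index)
    return sleb(index)
```

For index 64 the inexact form is `c0 00` (signed, so it needs a second byte), while the exact form is `62 40`. A writer that takes only an integer has no way to express the second case.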

rossberg · 2025-03-18T16:56:12Z

Given the implementation limit of 1M types per module in a JS embedding, don't you have plenty of spare bits in a 32-bit word that the implementation can store that flag in?
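
As a rough illustration of the spare-bits idea: a limit of 1,000,000 types fits in 20 bits, so a 32-bit word has room for an exactness flag. The bit position and the names below are assumptions made for this sketch, not any engine's actual layout.

```python
MAX_TYPES = 1_000_000          # the per-module limit mentioned above
INDEX_BITS = 20                # 2**20 = 1_048_576, enough for every valid index
EXACT_FLAG = 1 << INDEX_BITS   # hypothetical position for the exactness bit
INDEX_MASK = EXACT_FLAG - 1


def pack(index, exact=False):
    """Pack a type index and an exactness flag into one 32-bit word."""
    if not 0 <= index < MAX_TYPES:
        raise ValueError("type index out of range")
    return index | (EXACT_FLAG if exact else 0)


def unpack(word):
    """Return (index, exact) from a packed word."""
    return word & INDEX_MASK, bool(word & EXACT_FLAG)
```

Such a packed word is an in-memory representation only; it is not the binary encoding of the heap type.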

tlively · 2025-03-18T17:44:45Z

FWIW I'm somewhat ambivalent about this change. It's certainly cleaner in the spec, but I do think there is an advantage to handling exactness and nullability in the same place rather than splitting them between reference types and heap types. (And of course I never minded the extra uninhabitable types that came with that.)

I'm tempted to land this change for the spec, but continue treating exactness as a property of the reftype in the Binaryen implementation. I suspect deviating from the spec structure like that is asking for trouble down the road, though.

jakobkummerow · 2025-03-18T18:44:11Z

> don't you have plenty of spare bits in a 32-bit word that the implementation can store that flag in?

Yes, thank you, packing the struct into 32 bits is exactly what we do. But that doesn't help for code like this.

Clearly it's possible to rewrite all that. I just hope it's worth it. (Perhaps we'll just let that legacy code rot and write new tests based on different infrastructure.)

titzer · 2025-03-19T00:08:06Z

@rossberg Yes I think we're on the same page--I was just making it explicit if it wasn't obvious enough.

I'm also ambivalent towards this change, but haven't thought much about the implications of uninhabited types.

rossberg · 2025-03-19T08:16:24Z

@tlively, I think conceptually this is the semantics we want, given that the desired rules for bottom and null fall out naturally in exactly the right way without any hacky special cases. Of course, that does not prevent implementations from shifting the representation around in some equivalent way, if that's more convenient for them. Their ASTs will likely contain more redundancy and "junk" (unused representations) in that case, but that may be a reasonable trade-off with other considerations.

tlively · 2025-03-19T19:05:08Z

After thinking about this more, I'm cautiously optimistic that it will end up being simpler in the Binaryen implementation as well. I'll go ahead and merge this.

tlively · 2025-03-22T06:02:13Z

It turns out that we don't even need to use a new bit to encode exactness in Binaryen's heap type representation. Since we only store sharedness using a bit for abstract heap types, and because abstract heap types are never exact, we can reuse that same bit to encode exactness for non-abstract types.
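
The bit-sharing trick can be sketched like this. The layout is invented for illustration and is not Binaryen's actual representation: one tag bit separates abstract from defined heap types, and a single flag bit is read as "shared" for the former and "exact" for the latter.

```python
DEFINED_TAG = 0b01  # hypothetical: set for defined (indexed) heap types
FLAG = 0b10         # shared for abstract types, exact for defined types


def abstract(kind, shared=False):
    """Build an abstract heap type word; the flag bit means 'shared'."""
    return (kind << 2) | (FLAG if shared else 0)


def defined(type_id, exact=False):
    """Build a defined heap type word; the same flag bit means 'exact'."""
    return (type_id << 2) | (FLAG if exact else 0) | DEFINED_TAG


def is_defined(word):
    return bool(word & DEFINED_TAG)


def is_shared_abstract(word):
    return not is_defined(word) and bool(word & FLAG)


def is_exact(word):
    # Abstract heap types are never exact, so the flag is free for this use.
    return is_defined(word) and bool(word & FLAG)
```

The two readings never collide because the tag bit decides which one applies.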

tlively requested review from jakobkummerow and rossberg March 18, 2025 02:03

tlively mentioned this pull request Mar 18, 2025

Exact abstract types #15

Closed

sjrd reviewed Mar 18, 2025

rossberg approved these changes Mar 18, 2025

jakobkummerow approved these changes Mar 18, 2025

tlively merged commit ca66aad into main Mar 19, 2025

This was referenced Mar 19, 2025

Generalize the binary shorthand for exact references #9

Closed

Change exactness prefix opcode? #23

Open
