Replace the old topological sort everywhere #6902

tlively · 2024-09-04T21:02:40Z

To avoid having two separate topological sort utilities in the code base,
replace remaining uses of the old DFS-based, CRTP topological sort with the
newer Kahn's algorithm implementation.

This would be NFC, except that the new topological sort produces a different
order than the old topological sort, so the output of some passes is reordered.
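
To illustrate why swapping the algorithm reorders output, here is a minimal standalone sketch. It is not Binaryen's actual utility code: `Graph`, `kahnSort`, and `dfsSort` are hypothetical names, and the tie-breaking choices (a FIFO queue for Kahn's algorithm, reverse postorder for the DFS) are assumptions made for the example. Both functions return valid topological orders, but they generally differ whenever the graph leaves more than one valid choice.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Adjacency list: graph[i] holds the successors of vertex i.
// (Hypothetical representation, chosen only for this sketch.)
using Graph = std::vector<std::vector<std::size_t>>;

// Kahn's algorithm: repeatedly emit a vertex with no remaining predecessors.
std::vector<std::size_t> kahnSort(const Graph& graph) {
  std::vector<std::size_t> inDegree(graph.size(), 0);
  for (const auto& succs : graph) {
    for (auto s : succs) {
      ++inDegree[s];
    }
  }
  std::queue<std::size_t> ready;
  for (std::size_t i = 0; i < graph.size(); ++i) {
    if (inDegree[i] == 0) {
      ready.push(i);
    }
  }
  std::vector<std::size_t> order;
  while (!ready.empty()) {
    auto v = ready.front();
    ready.pop();
    order.push_back(v);
    // Emitting v releases any successor whose last predecessor was v.
    for (auto s : graph[v]) {
      if (--inDegree[s] == 0) {
        ready.push(s);
      }
    }
  }
  return order;
}

// DFS-based sort: emit vertices in reverse postorder.
std::vector<std::size_t> dfsSort(const Graph& graph) {
  std::vector<bool> visited(graph.size(), false);
  std::vector<std::size_t> postorder;
  std::function<void(std::size_t)> visit = [&](std::size_t v) {
    if (visited[v]) {
      return;
    }
    visited[v] = true;
    for (auto s : graph[v]) {
      visit(s);
    }
    postorder.push_back(v);
  };
  for (std::size_t i = 0; i < graph.size(); ++i) {
    visit(i);
  }
  return std::vector<std::size_t>(postorder.rbegin(), postorder.rend());
}
```

For the graph `{{2}, {}, {}}` (a single edge from 0 to 2 plus an unrelated vertex 1), `kahnSort` yields 0, 1, 2 while `dfsSort` yields 1, 0, 2. Both respect the edge, but a test that prints items in sorted order would see its output shuffled.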

kripken · 2024-09-04T23:01:06Z

test/lit/passes/unsubtyping.wast

 ;; CHECK:       (type $Y (sub $X (struct)))
 (type $Y (sub $X (struct)))

- ;; CHECK:       (type $A (sub (struct (field (ref null $X)))))
 (type $A (sub (struct (ref null $X))))


This test becomes less readable with this change, as it moves the check for $A away from $A. We've worked around this manually in the past, by moving the definition to where the output is, but on a change this large obviously that isn't practical. But just applying this change means making potentially many tests less readable, effectively undoing all the manual effort we've put into these tweaks.

Maybe now is a good time to automate those tweaks, before this PR? I mean that the auto-updater of lit tests could move definitions so that they appear together automatically. Or, it could move the checks if the auto-updater would also accept that change in order.

After talking offline, we decided to add an option that propagates the order of types from the input through to the output, similarly to how we propagate type names. Once we can use that option in some of these large tests with many types where the optimized output order doesn't matter, the diff here will become much smaller.

Unlike other module elements, types are not stored on the `Module`. Instead, they are collected by traversing the IR before printing and binary writing. The code that collects the types tries to optimize the order of rec groups based on the number of times each type is used. As a result, the output order of types generally has no relation to the input order of types. In addition, most type optimizations rewrite the types into a single large rec group, and the order of types in that group is essentially arbitrary. Changes to the code for counting type uses, sorting types, or sorting rec groups can yield very large changes in the output order of types, producing test diffs that are hard to review and potentially harming the readability of tests by moving output types away from the corresponding input types.

To help make test output more stable and readable, introduce a wasm-opt option that causes the order of output types to match the order of input types as closely as possible. It is implemented by having the parsers record the indices of the input types on the `Module` just like they already record the type names. The `GlobalTypeRewriter` infrastructure used by type optimizations associates the new types with the old indices just like it already does for names and also respects the input order when rewriting types into a large recursion group.

By default, wasm-opt clears the recorded type indices after parsing the module, in which case its behavior is not modified by this change. Other tools do not clear the recorded type indices, so their output types now match the order of their input types. While full fidelity round-tripping is not a goal of any Binaryen tool, there's no downside to making the round trip more exact for non-optimizing tools.

Follow-on PRs will use the new flag in more tests, which will generate large diffs but leave the tests in stable, more readable states that will no longer change due to other changes to the optimizing type sorting logic.
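
The ordering idea described above can be sketched roughly as follows. This is an illustrative model only, not the code in Binaryen: types are represented by their names as plain strings, and `TypeIndices` and `orderLikeInput` are hypothetical names standing in for the indices recorded on the `Module` and for the sorting step. The sketch also ignores rec group boundaries and the requirement that supertypes precede their subtypes, which the real implementation has to respect (hence "as closely as possible").

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the input type indices recorded by the parser.
using TypeIndices = std::unordered_map<std::string, std::size_t>;

// Reorder `types` so that types with a recorded input index come first, in
// input order. Types without a recorded index (for example, types created by
// an optimization) keep their relative order and are placed at the end.
std::vector<std::string> orderLikeInput(std::vector<std::string> types,
                                        const TypeIndices& inputIndices) {
  auto key = [&](const std::string& type) {
    auto it = inputIndices.find(type);
    return it == inputIndices.end() ? std::numeric_limits<std::size_t>::max()
                                    : it->second;
  };
  // stable_sort preserves the existing order among types with equal keys.
  std::stable_sort(types.begin(),
                   types.end(),
                   [&](const std::string& a, const std::string& b) {
                     return key(a) < key(b);
                   });
  return types;
}
```

With recorded indices `$X` = 0, `$Y` = 1, `$A` = 2, the collected list `$A, $New, $Y, $X` becomes `$X, $Y, $A, $New`. With an empty index map, which models wasm-opt clearing the indices by default, the list is returned unchanged.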

These are the tests that would otherwise have the largest diffs when changing the topological sort used to sort types. signature-refining_gto.wat also cannot be automatically updated, so there is extra benefit to making sure it has stable output.

tlively · 2024-09-07T03:05:55Z

I've now rebased this on top of #6917, so the test diff is much smaller. I'd be happy to apply --preserve-type-order more broadly if you would prefer, but I've erred on the side of not applying it for now. Unfortunately the changes to the ctor-eval tests are unrelated to type ordering, so there's nothing we can do to make those smaller.

tlively · 2024-09-07T03:08:47Z

test/lit/passes/type-merging.wast

There are still many changes in this file despite it using --preserve-type-order, but they're because the names used for the merged types changed.

kripken · 2024-09-09T19:54:28Z

src/ir/subtypes.h

@@ -18,7 +18,7 @@
 #define wasm_ir_subtypes_h


GitHub informs me that something changed in this file since I last read it. Looks like there were force-pushes, so I can't read the individual commit, and I tried the "show diff" on the force-push, which @brendandahl recommended to me a while ago. But the diff there is huge and dominated by unrelated changes, merges from main, I assume.

Am I missing a way that GitHub separates the changes in the PR from unrelated changes? Or is there another good way to see incremental updates in a large PR like this with force-pushes?

No, sorry, this was a more destructive force push than usual because there were many merge conflicts when rebasing on main. The only way to avoid that would be to use merge commits instead of rebasing, but that comes with its own headaches.

...what are the headaches of merge commits? I would humbly suggest that we consider using them more 🙏

To navigate stacked PRs effectively, you want to maintain a linear history from the tip of the stack back down to main. As such, my stack management script currently rebases when syncing new changes, which keeps the stack organization the same as the underlying commit organization. I could experiment with maintaining the linear stack of PRs but allowing the underlying commits to become arbitrary merge spaghetti, though. Maybe it wouldn't actually affect my workflow.

I don't have much trouble with stacked PRs myself, and I never rebase. Though I rarely have multiple parts of the stack open on GitHub at once. (I use branches in my personal repo, which would make the later parts PRs on my fork rather than upstream, so I just wait to open them.)

Each time I make a change in a PR in the middle of the stack, I need to update the ones "downstream" in the chain. I just do merge commits for those. I have had no issues when doing so.

I updated my script to use merges instead of rebases. Let's see what happens!

kripken · 2024-09-09T20:45:13Z

test/lit/passes/j2cl-merge-itables.wast

@@ -11,19 +11,20 @@
      (field $vtable (ref $Object.vtable))
      (field $itable (ref $Object.itable)))))

+    ;; CHECK:       (type $Object.vtable (sub (struct (field structref))))


This motion looks like it worsened the readability?

I can add --preserve-type-order to this file.

kripken · 2024-09-09T22:16:02Z

test/lit/passes/unsubtyping-casts.wast

@@ -237,10 +237,11 @@
 (module
 (rec
  ;; CHECK:      (rec
-  ;; CHECK-NEXT:  (type $unrelated (sub (func)))
+  ;; CHECK-NEXT:  (type $top (sub (func)))


Same here. Might be worth adding the flag to any that regress here.

Update the remaining tests whose readability will be affected by the removal of the old topological sort in #6902, no matter how small their diffs would have been.

kripken · 2024-09-10T20:52:28Z

test/lit/passes/type-merging.wast

@@ -42,7 +42,7 @@
  ;; CHECK-NEXT:  )
  ;; CHECK-NEXT: )
  (func $foo
-    ;; $A will remain the same.
+    ;; $A will remain the sam^e.


Suggested change

;; $A will remain the sam^e.

;; $A will remain the same.

Unless this has a meaning I am missing?

Oops, nope, I just fat fingered it.

tlively requested a review from kripken September 4, 2024 21:02

tlively force-pushed the min-topo-sort-recgroups branch from 94e4beb to 7c11d25 Compare September 4, 2024 21:09

tlively force-pushed the replace-old-topo-sort branch from 8082440 to dee59f2 Compare September 4, 2024 21:09

tlively force-pushed the min-topo-sort-recgroups branch from 7c11d25 to 9ee868a Compare September 4, 2024 22:44

tlively force-pushed the replace-old-topo-sort branch from dee59f2 to 7c5f31b Compare September 4, 2024 22:44

kripken reviewed Sep 4, 2024

Base automatically changed from min-topo-sort-recgroups to main September 5, 2024 00:16

tlively added 4 commits September 6, 2024 18:02

function reference type

cc96ff7

tlively force-pushed the replace-old-topo-sort branch from 7c5f31b to 6d4f529 Compare September 7, 2024 03:03

tlively changed the base branch from main to use-preserve-type-order September 7, 2024 03:03

tlively commented Sep 7, 2024

format

e6825b2

kripken reviewed Sep 9, 2024

tlively added 3 commits September 9, 2024 14:57

Merge branch 'main' into preserve-type-order

be5b406

Merge branch 'preserve-type-order' into use-preserve-type-order

9af2d3a

Merge branch 'use-preserve-type-order' into replace-old-topo-sort

cc0dafb

kripken reviewed Sep 9, 2024

tlively mentioned this pull request Sep 9, 2024

Add a --preserve-type-order option to wasm-opt #6916

Merged

tlively added 6 commits September 9, 2024 16:37

remove stray newline

65dcc65

update tools to _not_ preserve order by default

ace8feb

Merge branch 'preserve-type-order' into use-preserve-type-order

a92a650

Merge branch 'use-preserve-type-order' into replace-old-topo-sort

c606ef8

Use --preserve-type-order in more tests

5e44e47

Merge branch 'more-preserve-type-order' into replace-old-topo-sort

edb81d1

tlively mentioned this pull request Sep 10, 2024

Use --preserve-type-order in more tests #6923

Merged

tlively changed the base branch from use-preserve-type-order to more-preserve-type-order September 10, 2024 03:16

tlively added a commit that referenced this pull request Sep 10, 2024

Use --preserve-type-order in more tests (#6923)

7ce8484

Base automatically changed from more-preserve-type-order to main September 10, 2024 19:01

Merge branch 'main' into replace-old-topo-sort

98f9ab0

tlively requested a review from kripken September 10, 2024 20:32

kripken approved these changes Sep 10, 2024

remove stray caret

dbbea90

tlively enabled auto-merge (squash) September 10, 2024 21:54

tlively merged commit 1a2d26f into main Sep 10, 2024
13 checks passed

tlively deleted the replace-old-topo-sort branch September 10, 2024 22:24
