Looking at the conclusion of the 2:4 sparsity tutorial here, the claimed advantage of 2:4 sparsity over dense execution is 1.3x-2.0x. However, when I check the actual values printed in the terminal output of the dense and sparse sections, I get the following table:
| bs  | compile | Dense  | Sparse  | Speedup |
|-----|---------|--------|---------|---------|
| 4   | n       | 9.56   | 16.77   | 0.57x   |
| 4   | y       | 8.98   | 9.49    | 0.95x   |
| 16  | n       | 31.86  | 62.27   | 0.51x   |
| 16  | y       | 30.83  | 34.29   | 0.90x   |
| 64  | n       | 123.97 | 243.16  | 0.51x   |
| 64  | y       | 104.98 | 133.49  | 0.79x   |
| 256 | n       | 476.03 | 1195.23 | 0.40x   |
| 256 | y       | 397.13 | 542.30  | 0.73x   |
As can be seen, the sparse computation never beats the dense one, not in a single configuration. I reran these experiments with torch 2.5.1+cu2.4 on a single H100 and observed similar results.
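For reference, the "2:4" pattern being benchmarked keeps the 2 largest-magnitude values in every contiguous group of 4 along a row (the tutorial does this with PyTorch's sparsifier and `to_sparse_semi_structured`; the pure-Python sketch below only illustrates the pruning pattern itself, not the accelerated kernels whose timings are at issue):

```python
def prune_2_4(row):
    """Zero the 2 smallest-magnitude entries in each contiguous group of 4."""
    assert len(row) % 4 == 0, "2:4 sparsity requires the row length to be a multiple of 4"
    out = []
    for i in range(0, len(row), 4):
        group = row[i:i + 4]
        # indices of the 2 largest-magnitude entries in this group
        keep = sorted(range(4), key=lambda j: abs(group[j]), reverse=True)[:2]
        out.extend(v if j in keep else 0.0 for j, v in enumerate(group))
    return out

print(prune_2_4([0.9, -0.1, 0.3, 0.05, -2.0, 1.5, 0.2, -0.4]))
# -> [0.9, 0.0, 0.3, 0.0, -2.0, 1.5, 0.0, 0.0]
```

In principle this 50% structured sparsity is what lets the sparse tensor cores skip half the multiply-accumulates, which is why the mismatch with the measured timings above is surprising.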
Why are the measured values so much worse than the claimed speedups?