Looking at the conclusion of the 2:4 sparsity tutorial here, the claimed advantage of 2:4 sparsity over dense execution is 1.3x-2.0x. However, when I check the actual values printed in the terminal output of the dense and sparse sections, I get the following table:
| bs  | compile | Dense  | Sparse  | Speedup |
|-----|---------|--------|---------|---------|
| 4   | n       | 9.56   | 16.77   | 0.57x   |
| 4   | y       | 8.98   | 9.49    | 0.95x   |
| 16  | n       | 31.86  | 62.27   | 0.51x   |
| 16  | y       | 30.83  | 34.29   | 0.90x   |
| 64  | n       | 123.97 | 243.16  | 0.51x   |
| 64  | y       | 104.98 | 133.49  | 0.79x   |
| 256 | n       | 476.03 | 1195.23 | 0.40x   |
| 256 | y       | 397.13 | 542.30  | 0.73x   |
As can be seen, the sparse computation never beats the dense one, not in a single configuration. I reran these experiments with torch 2.5.1+cu2.4 on a single H100 and observed similar results.
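For reference, the "2:4" pattern being benchmarked keeps the 2 largest-magnitude values in every contiguous group of 4 along a row (the tutorial does this with PyTorch's sparsifier and `to_sparse_semi_structured`; the pure-Python sketch below only illustrates the pruning pattern itself, not the accelerated kernels whose timings are at issue):

```python
def prune_2_4(row):
    """Zero the 2 smallest-magnitude entries in each contiguous group of 4."""
    assert len(row) % 4 == 0, "2:4 sparsity requires the row length to be a multiple of 4"
    out = []
    for i in range(0, len(row), 4):
        group = row[i:i + 4]
        # indices of the 2 largest-magnitude entries in this group
        keep = sorted(range(4), key=lambda j: abs(group[j]), reverse=True)[:2]
        out.extend(v if j in keep else 0.0 for j, v in enumerate(group))
    return out

print(prune_2_4([0.9, -0.1, 0.3, 0.05, -2.0, 1.5, 0.2, -0.4]))
# -> [0.9, 0.0, 0.3, 0.0, -2.0, 1.5, 0.0, 0.0]
```

In principle this 50% structured sparsity is what lets the sparse tensor cores skip half the multiply-accumulates, which is why the mismatch with the measured timings above is surprising.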
Why are the measured values so much worse than the claimed speedups?