Commit 6227562

feat: add group size 3 to GQA decode dispatch (#558)
Llama 3.2 3B comes with 24 qo heads and 8 kv heads, so its GQA group size is 24 / 8 = 3.
1 parent 9e10936 commit 6227562

1 file changed, +3 -0 lines changed

include/flashinfer/utils.cuh

+3
@@ -126,6 +126,9 @@
   } else if (group_size == 2) { \
     constexpr size_t GROUP_SIZE = 2; \
     __VA_ARGS__ \
+  } else if (group_size == 3) { \
+    constexpr size_t GROUP_SIZE = 3; \
+    __VA_ARGS__ \
   } else if (group_size == 4) { \
     constexpr size_t GROUP_SIZE = 4; \
     __VA_ARGS__ \
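
For readers unfamiliar with this pattern, the following is a minimal, self-contained sketch of how a runtime group size can be mapped to a compile-time constant through an if/else-if macro chain. It is not the macro from `include/flashinfer/utils.cuh`: the macro name `SKETCH_DISPATCH_GROUP_SIZE`, the helper functions, the `group_size == 1` first branch, and the throwing fallback branch are all assumptions made for illustration, since the diff shows only the branches for 2, 3, and 4.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for a kernel templated on the GQA group size.
// It simply reports which instantiation was selected.
template <std::size_t GROUP_SIZE>
std::size_t group_size_of_instantiation() {
  return GROUP_SIZE;
}

// Hypothetical dispatch macro (the name and the first/last branches are
// assumptions). Each branch binds a constexpr GROUP_SIZE and then pastes the
// caller's code, so GROUP_SIZE can be used as a template argument inside it.
#define SKETCH_DISPATCH_GROUP_SIZE(group_size, ...)              \
  if (group_size == 1) {                                         \
    constexpr std::size_t GROUP_SIZE = 1;                        \
    __VA_ARGS__                                                  \
  } else if (group_size == 2) {                                  \
    constexpr std::size_t GROUP_SIZE = 2;                        \
    __VA_ARGS__                                                  \
  } else if (group_size == 3) {                                  \
    constexpr std::size_t GROUP_SIZE = 3;                        \
    __VA_ARGS__                                                  \
  } else if (group_size == 4) {                                  \
    constexpr std::size_t GROUP_SIZE = 4;                        \
    __VA_ARGS__                                                  \
  } else {                                                       \
    throw std::invalid_argument("unsupported GQA group size");   \
  }

// Computes group_size = num_qo_heads / num_kv_heads at runtime and returns
// the compile-time GROUP_SIZE of the branch that was taken.
std::size_t dispatched_group_size(std::size_t num_qo_heads,
                                  std::size_t num_kv_heads) {
  assert(num_kv_heads > 0 && num_qo_heads % num_kv_heads == 0);
  const std::size_t group_size = num_qo_heads / num_kv_heads;
  std::size_t result = 0;
  SKETCH_DISPATCH_GROUP_SIZE(group_size, {
    result = group_size_of_instantiation<GROUP_SIZE>();
  })
  return result;
}
```

In this sketch, `dispatched_group_size(24, 8)` takes the `group_size == 3` branch. Without that branch, the same call would fall through to the fallback, which is the kind of gap the three added lines close for a 24/8 head configuration.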
