k-quants with super-block size of 64 #2001
Merged
+1,915
−236
35 commits
d2f12ac k_quants: WIP super-blocks with 64 weights (Kawrakow)
9fe2a2b k_quants: WIP super-blocks with 64 weights (Kawrakow)
1f6195c k_quants: WIP super-blocks with 64 weights (Kawrakow)
aebd547 k_quants: WIP super-blocks with 64 weights (Kawrakow)
2b2ab31 k_quants: WIP super-blocks with 64 weights (Kawrakow)
bcf8c5c k_quants: WIP super-blocks with 64 weights (Kawrakow)
c6c3536 k_quants: WIP super-blocks with 64 weights (Kawrakow)
5aae4b8 k_quants: WIP super-blocks with 64 weights (Kawrakow)
41e46ec k_quants: WIP super-blocks with 64 weights (Kawrakow)
460dd84 k_quants: WIP super-blocks with 64 weights (Kawrakow)
3bd9ae7 k_quants: WIP super-blocks with 64 weights (Kawrakow)
03f30c8 k_quants: WIP super-blocks with 64 weights (Kawrakow)
cda47a6 k_quants: WIP super-blocks with 64 weights (Kawrakow)
80c75fe k_quants: WIP super-blocks with 64 weights (Kawrakow)
2b2a13c k_quants: WIP super-blocks with 64 weights (Kawrakow)
9d27d8d k_quants: WIP super-blocks with 64 weights (Kawrakow)
2ff543c k_quants: WIP super-blocks with 64 weights (Kawrakow)
d92c5a9 k_quants: WIP super-blocks with 64 weights (Kawrakow)
fae24af k_quants: WIP super-blocks with 64 weights (Kawrakow)
e1bbcfc k_quants: WIP super-blocks with 64 weights (Kawrakow)
167a0bb k_quants: WIP super-blocks with 64 weights (Kawrakow)
6081a65 k_quants: WIP super-blocks with 64 weights (Kawrakow)
ff83e32 k_quants: WIP super-blocks with 64 weights (Kawrakow)
285eeb1 k_quants: WIP super-blocks with 64 weights (Kawrakow)
8b98d01 k_quants: call them _K, not _k, also on Metal (Kawrakow)
558a194 k_quants: correctly define QK_K in llama.cpp (Kawrakow)
333ffcc Fixed bug in q4_K quantization added with the 64-block addition (Kawrakow)
88412a1 Simplify via lambda (Kawrakow)
aeefd4e k_quants: switch Q3_K to 4-bit scales when QK_K = 64 (Kawrakow)
ce19b96 k_quants: switch Q4_K to 4-bit scales when QK_K = 64 (Kawrakow)
4f61506 k_quants: forgot to add the Metal changes in last commit (Kawrakow)
ccf4901 k_quants: change Q5_K to be type 0 when QK_K = 64 (Kawrakow)
2da3a59 k_quants: AVX2 implementation for new 64-weight Q5_K (Kawrakow)
53e81ca k_quants: 10% faster ARM_NEON Q5_K dot product (Kawrakow)
5fd8337 k_quants: fixed issue caused by merging with master (Kawrakow)
k_quants: WIP super-blocks with 64 weights
commit d2f12ac354552bcfba1dbc9c8593296d81b70452
Postponed for later, or did you miss implementing this?