
Issues: EleutherAI/lm-evaluation-harness

reproduce llama 3 evals
#2557 opened Dec 10, 2024 by baberabb
Open 6


Issues list

np.NaN
#2935 opened Apr 27, 2025 by upunaprosk
Support for OlympiadBench and AMC
#2930 opened Apr 25, 2025 by Edric-Zhao
eval bbh failed
#2922 opened Apr 18, 2025 by godlikehhd
clean up process results (label: bug)
#2920 opened Apr 16, 2025 by baberabb
GPQA Preprocessing Function Results in Incorrect Physics Equations (labels: bug, validation)
#2907 opened Apr 14, 2025 by ShayekhBinIslam
Filter not extracting choice selection correctly (label: validation)
#2905 opened Apr 14, 2025 by 1jamesthompson1
Additional data for evaluation
#2896 opened Apr 9, 2025 by harshakokel
low eval accuracy with gguf
#2887 opened Apr 7, 2025 by jerryzh168
Hellaswag filenotfound error
#2886 opened Apr 7, 2025 by dsvilarkovic
LLaMA-4 evaluation
#2885 opened Apr 7, 2025 by jybbjybb