Why averaging LLM benchmark scores is fundamentally broken
(arxiv.org)
1 point | by testofschool | 2 hours ago | 1 comment
testofschool | 2 hours ago
[flagged]