openbench provides standardized, reproducible benchmarking for LLMs across 30+ evaluation suites (and growing) spanning knowledge, math, reasoning, coding, science, reading comprehension, health, long ...
From fine-tuning open source models to building agentic frameworks on top of them, the open source world is ripe with ...
Dokimos is an evaluation framework for LLM applications in Java. It helps you evaluate responses, track quality over time, and catch regressions before they reach production.
Abstract: Numerous methodologies have been introduced for code summarization and associated activities, including the utilization of large language model (LLM)-based code summarization, to aid ...
Abstract: Recent studies proposed to leverage large language models (LLMs) with In-Context Learning (ICL) to handle code intelligence tasks without fine-tuning. ICL employs task instructions and a set ...
NEW DELHI, Jan 12 (Reuters) - India proposes requiring smartphone makers to share source code with the government and make several software changes as part of a raft of security measures, prompting ...