Terminal-bench: a benchmark for AI agents in terminal environments

tbench.ai

3 points by cpard 4 hours ago