AlpacaEval 2.0 Leaderboard

From GM-RKB

An AlpacaEval 2.0 Leaderboard is an LLM benchmark leaderboard that evaluates instruction-following language models.



References

2024

  • (GitHub, 2024) ⇒ https://github.com/tatsu-lab/alpaca_eval
    • QUOTE: 🎉 AlpacaEval 2.0 is out and used by default! We improved the auto-annotator (better and cheaper) and use GPT-4 turbo as baseline. More details here. For the old version, set your environment variable IS_ALPACA_EVAL_2=False.
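The quoted note names an environment variable, IS_ALPACA_EVAL_2, for falling back to the old version. A minimal sketch of preparing such an environment in Python follows. The helper name `legacy_alpaca_eval_env` is hypothetical, and the sketch assumes the variable is read from the process environment when the evaluation starts; the quote does not say when or how the tool reads it.

```python
import os


def legacy_alpaca_eval_env(base_env=None):
    """Return a copy of an environment mapping with IS_ALPACA_EVAL_2 set to "False".

    Hypothetical helper for illustration only; it is not part of alpaca_eval.
    """
    # Copy so the caller's mapping (or os.environ) is not modified in place.
    env = dict(os.environ if base_env is None else base_env)
    # Variable name and value are taken verbatim from the quoted README note.
    env["IS_ALPACA_EVAL_2"] = "False"
    return env


if __name__ == "__main__":
    env = legacy_alpaca_eval_env()
    print(env["IS_ALPACA_EVAL_2"])
```

The returned mapping could then be passed as the `env` argument of `subprocess.run` when launching an evaluation, so the setting applies only to that child process.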
