MT-Bench

From GM-RKB

An MT-Bench is an LLM inference evaluation task that can be used to assess the multi-turn conversational and instruction-following capabilities of large language models (LLMs). It consists of a curated set of 80 open-ended, two-turn questions spanning eight categories (such as writing, roleplay, reasoning, math, and coding), with model responses graded automatically by a strong LLM judge (typically GPT-4).
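The grading step above can be sketched as follows. This is a minimal illustration, not the official MT-Bench harness: the `MTBenchQuestion` dataclass, the `build_judge_prompt` wording, and the stub `judge` function are hypothetical stand-ins, though the `Rating: [[X]]` extraction pattern mirrors the format the MT-Bench judge prompt asks for.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MTBenchQuestion:
    """A single MT-Bench item: two open-ended turns in one category (hypothetical schema)."""
    question_id: int
    category: str
    turns: List[str]


def build_judge_prompt(question: MTBenchQuestion, answers: List[str]) -> str:
    """Assemble a single-answer grading prompt for the LLM judge (illustrative wording)."""
    convo = "\n".join(
        f"Turn {i + 1} question: {q}\nTurn {i + 1} answer: {a}"
        for i, (q, a) in enumerate(zip(question.turns, answers))
    )
    return (
        "Please act as an impartial judge and rate the assistant's answers "
        "on a scale of 1 to 10. Output your rating as: Rating: [[X]]\n\n" + convo
    )


def parse_rating(judgment: str) -> Optional[float]:
    """Extract the numeric score from the judge's 'Rating: [[X]]' output."""
    match = re.search(r"\[\[(\d+(?:\.\d+)?)\]\]", judgment)
    return float(match.group(1)) if match else None


# Stub judge standing in for a real LLM call (assumption for this sketch).
def judge(prompt: str) -> str:
    return "The answers are helpful and accurate. Rating: [[8]]"


question = MTBenchQuestion(
    question_id=81,
    category="writing",
    turns=["Compose a travel blog post about Hawaii.", "Rewrite it as a haiku."],
)
answers = ["Aloha! My trip to Hawaii...", "Island breeze at dawn..."]
score = parse_rating(judge(build_judge_prompt(question, answers)))
print(score)
```

In the actual benchmark, each model's per-question scores are averaged into an overall MT-Bench score, and the judge may also grade the two turns separately.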


