LLM-based System Evaluation Framework
An LLM-based System Evaluation Framework is an AI system evaluation framework used to build LLM-based system evaluation systems.
- Context:
- It can include various LLM-based System Performance Measures, such as: Perplexity, BLEU, ROUGE, METEOR, Human Evaluation, Diversity, and Zero-shot Evaluation (see the perplexity sketch after this list).
- It can be assessed against LLM-based System Benchmarks, such as: Big Bench, GLUE Benchmark, SuperGLUE Benchmark, MMLU, LIT, ParlAI, CoQA, LAMBADA, and HellaSwag (see the benchmark-harness sketch after this list).
- ...
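Several of the measures above reduce to simple statistics over model outputs. As a concrete illustration, the sketch below computes Perplexity from per-token log-probabilities; the logprobs list is a hypothetical stand-in for values an LLM API might return, not output from any specific model.

```python
import math

def perplexity(token_logprobs):
    """Compute perplexity from per-token log-probabilities (natural log).

    Perplexity = exp(-mean(log p(token_i | context))); lower is better.
    """
    avg_neg_logprob = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg_neg_logprob)

# Hypothetical per-token log-probabilities (illustrative values only).
logprobs = [-0.2, -1.1, -0.5, -2.3, -0.7]
print(f"Perplexity: {perplexity(logprobs):.2f}")  # exp(0.96) ~= 2.61
```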
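For the benchmarks, evaluation harnesses automate running a model over many tasks at once. Below is a hedged sketch using the EleutherAI LM Eval harness (listed under See); the simple_evaluate entry point and its arguments follow the harness's documented Python API, but exact argument names and task identifiers can vary across versions, so treat this as an assumption to check against the harness docs.

```python
# Requires: pip install lm-eval (EleutherAI LM Evaluation Harness).
import lm_eval

# Run a Hugging Face causal LM against two of the benchmarks named above.
results = lm_eval.simple_evaluate(
    model="hf",                              # Hugging Face model backend
    model_args="pretrained=gpt2",            # any HF checkpoint name
    tasks=["hellaswag", "lambada_openai"],   # benchmark task identifiers
    num_fewshot=0,                           # zero-shot evaluation
)
print(results["results"])  # per-task metric dictionary
```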
- Example(s):
- Counter-Example(s):
- ...
- See: Large Language Model, Model Evaluation, OpenAI Moderation API, EleutherAI LM Eval.