LLM-as-Judge Calibration Library


An LLM-as-Judge Calibration Library is a Python evaluation library that provides software tools and calibration techniques for measuring, adjusting, and improving the accuracy and reliability of LLM-as-judge confidence scores, LLM-as-judge probability estimates, and LLM-as-judge uncertainty quantification in large language model judgment decisions.
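As a minimal sketch of one such calibration technique, the snippet below applies temperature scaling to raw judge confidence scores against human-labeled outcomes. The function and variable names (fit_temperature, calibrate, raw, human) are illustrative assumptions, not the API of any specific library; only NumPy and SciPy are assumed.

```python
# Illustrative sketch of temperature scaling for LLM-as-judge
# confidence scores; names are hypothetical, not a real package API.
import numpy as np
from scipy.optimize import minimize_scalar

EPS = 1e-6

def fit_temperature(raw_confidences, labels):
    """Fit the temperature T that minimizes the negative log-likelihood
    of human labels under temperature-scaled judge confidences."""
    p = np.clip(np.asarray(raw_confidences, dtype=float), EPS, 1 - EPS)
    y = np.asarray(labels, dtype=float)
    logits = np.log(p / (1 - p))  # invert the sigmoid to recover logits

    def nll(t):
        q = np.clip(1.0 / (1.0 + np.exp(-logits / t)), EPS, 1 - EPS)
        return -np.mean(y * np.log(q) + (1 - y) * np.log(1 - q))

    return minimize_scalar(nll, bounds=(0.05, 10.0), method="bounded").x

def calibrate(raw_confidences, temperature):
    """Apply a fitted temperature to new judge confidence scores."""
    p = np.clip(np.asarray(raw_confidences, dtype=float), EPS, 1 - EPS)
    logits = np.log(p / (1 - p))
    return 1.0 / (1.0 + np.exp(-logits / temperature))

# Example: an overconfident judge scored against human agreement labels.
raw = [0.95, 0.90, 0.85, 0.99, 0.70, 0.92]
human = [1, 0, 1, 1, 0, 1]
T = fit_temperature(raw, human)
print("fitted temperature:", round(T, 3))       # T > 1 softens overconfidence
print("calibrated scores:", np.round(calibrate(raw, T), 3))
```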