Hallucination Detection Metric
A Hallucination Detection Metric is an LLM Safety Metric and factuality measure that quantifies how far LLM-generated content deviates from factual accuracy and source grounding.
- AKA: Hallucination Score, Factuality Metric, Groundedness Measure, LLM Faithfulness Score, Confabulation Metric.
- Context:
- It can typically identify Intrinsic Hallucinations through source contradiction, internal inconsistency, and logical conflict.
- It can typically detect Extrinsic Hallucinations via unverifiable claims, fabricated information, and unsupported statements.
- It can typically measure Factual Consistency using claim verification, fact checking, and evidence alignment.
- It can typically assess Source Faithfulness through citation accuracy, reference grounding, and context adherence.
- It can typically evaluate Semantic Groundedness via meaning preservation, information accuracy, and content alignment.
- ...
- It can often quantify Entity Hallucinations through person fabrication, place invention, and organization creation.
- It can often identify Numeric Hallucinations via statistic fabrication, date invention, and quantity error.
- It can often detect Relational Hallucinations using false connections, incorrect attributions, and relationship errors.
- It can often measure Contextual Hallucinations through context violation, scope deviation, and domain inconsistency.
- It can often assess Temporal Hallucinations via anachronism detection, timeline errors, and sequence violation.
- ...
- It can range from being a Binary Hallucination Detection Metric to being a Continuous Hallucination Detection Metric, depending on its hallucination score granularity.
- It can range from being a Reference-Based Hallucination Detection Metric to being a Reference-Free Hallucination Detection Metric, depending on its hallucination ground truth requirement.
- It can range from being a Token-Level Hallucination Detection Metric to being a Sentence-Level Hallucination Detection Metric, depending on its hallucination detection granularity.
- It can range from being a Domain-Specific Hallucination Detection Metric to being a General Hallucination Detection Metric, depending on its hallucination domain coverage.
- It can range from being a Real-Time Hallucination Detection Metric to being an Offline Hallucination Detection Metric, depending on its hallucination detection latency.
- ...
- It can utilize Natural Language Inference Models through entailment checking, contradiction detection, and consistency verification (see the entailment-scoring sketch after this list).
- It can employ Fact Verification Systems via knowledge base lookup, claim validation, and evidence retrieval.
- It can leverage LLM-as-Judge Techniques using hallucination prompts, factuality scoring, and groundedness assessment (see the judge-prompt sketch after this list).
- It can implement Uncertainty Quantification through confidence scoring, perplexity measurement, and entropy calculation (see the entropy and perplexity sketch after this list).
- ...
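As a concrete illustration of the NLI-based approach above, the following is a minimal sketch, assuming the Hugging Face transformers library and the publicly available roberta-large-mnli checkpoint (any MNLI-style model would serve). It yields a continuous entailment score and a thresholded binary hallucination flag, illustrating the continuous-versus-binary range described earlier.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumption: any MNLI-style checkpoint works; roberta-large-mnli is one public example.
MODEL_NAME = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()

# Look up the entailment label index from the model config rather than hard-coding it.
ENTAIL_IDX = model.config.label2id.get("ENTAILMENT", 2)

def groundedness_score(source: str, claim: str) -> float:
    """Continuous metric: probability that the source passage entails the claim."""
    inputs = tokenizer(source, claim, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, ENTAIL_IDX].item()

def is_hallucinated(source: str, claim: str, threshold: float = 0.5) -> bool:
    """Binary metric: flag the claim when the entailment probability falls below a threshold."""
    return groundedness_score(source, claim) < threshold
```

Scoring each generated sentence against its source and averaging gives a simple sentence-level variant; the threshold converts the continuous score into the binary form.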
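The LLM-as-Judge approach noted above typically reduces to a prompt template plus a score parser. Below is a minimal, model-agnostic sketch; the prompt wording, the 1-5 scale, and how the judge model is actually invoked are illustrative assumptions rather than a fixed standard.

```python
import re

# Illustrative judge prompt; real deployments tune the wording and scale to their domain.
JUDGE_PROMPT = """You are a strict factuality judge.

Source:
{source}

Claim:
{claim}

Rate how well the claim is supported by the source on a 1-5 scale,
where 1 = entirely unsupported (hallucinated) and 5 = fully grounded.
Answer with a single integer."""

def build_judge_prompt(source: str, claim: str) -> str:
    """Fill the template; the resulting prompt is sent to whichever judge model is available."""
    return JUDGE_PROMPT.format(source=source, claim=claim)

def parse_judge_score(raw_reply: str) -> float | None:
    """Map the judge's reply to a groundedness score in [0, 1]; None if unparseable."""
    match = re.search(r"[1-5]", raw_reply)
    return (int(match.group()) - 1) / 4 if match else None
```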
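For the uncertainty-quantification signals mentioned above, the arithmetic is straightforward once per-token probabilities or log-probabilities are available from the generating model. The sketch below assumes natural-log probabilities and shows only the entropy and perplexity calculations; how the log-probabilities are obtained depends on the serving stack.

```python
import math

def token_entropy(next_token_probs: list[float]) -> float:
    """Shannon entropy (in nats) of one next-token distribution; higher means less confident."""
    return -sum(p * math.log(p) for p in next_token_probs if p > 0.0)

def sequence_perplexity(token_logprobs: list[float]) -> float:
    """Perplexity of a generated sequence from its per-token log-probabilities (natural log)."""
    avg_neg_loglik = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg_neg_loglik)
```

High entropy or perplexity on a span is only a weak hallucination signal on its own and is usually combined with the grounding checks above.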
- Example(s):
- Model-Based Hallucination Metrics, such as: NLI entailment-based consistency scores and LLM-as-Judge factuality scores.
- Benchmark-Based Hallucination Metrics, such as: HaluEval detection accuracy and TruthfulQA truthfulness scores (see the detection-accuracy sketch after this list).
- Reference-Based Hallucination Metrics, such as: source-claim consistency checks against reference documents or gold answers.
- Statistical Hallucination Metrics, such as: perplexity-based and entropy-based uncertainty scores.
- Task-Specific Hallucination Metrics, such as: summarization faithfulness metrics and retrieval-augmented generation groundedness scores.
- ...
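Benchmark-based metrics of the kind listed above usually reduce to simple aggregate statistics over labeled examples. A minimal sketch, using hypothetical boolean detector predictions and gold labels:

```python
def detection_accuracy(predictions: list[bool], gold_labels: list[bool]) -> float:
    """Fraction of examples where the detector's hallucination flag matches the gold label
    (the headline number reported by HaluEval-style detection benchmarks)."""
    assert len(predictions) == len(gold_labels)
    return sum(p == y for p, y in zip(predictions, gold_labels)) / len(gold_labels)

def hallucination_rate(flags: list[bool]) -> float:
    """Corpus-level statistic: fraction of generated claims flagged as hallucinated."""
    return sum(flags) / len(flags)
```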
- Counter-Example(s):
- Fluency Metric, which measures text quality rather than factual accuracy.
- Relevance Metric, which assesses topical alignment rather than truthfulness.
- Coherence Metric, which evaluates logical flow rather than factual grounding.
- Creativity Metric, which rewards novelty rather than factual constraint.
- See: LLM Safety Metric, Hallucinated Content, Fact Verification, Retrieval-Augmented Generation, LLM Evaluation Method, LLM-as-Judge, Source Grounding, Natural Language Inference, HaluEval Benchmark, TruthfulQA, AI Safety.