LLM-based Text Summarization Algorithm


An LLM-based Text Summarization Algorithm is a neural text summarization algorithm that uses a pre-trained large language model (LLM) to produce the summary.



References

2023

  • (Adams, Fabbri et al., 2023) ⇒ Griffin Adams, Alexander Fabbri, Faisal Ladhak, Eric Lehman, and Noémie Elhadad. (2023). “From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting.” doi:10.48550/arXiv.2309.04269
    • SUMMARY:
      • It introduces the Chain of Density (CoD) prompting technique to generate dense GPT-4 summaries without extending their length.
      • It employs an iterative method where GPT-4 starts with an entity-sparse summary and then incorporates missing salient entities, maintaining the summary's original length.
      • It reports that CoD summaries are more abstractive, exhibit more fusion, and show less lead bias than summaries produced by a vanilla GPT-4 prompt.
      • High-level Algorithm (a code sketch follows this list):
        1. Generate Initial Summary: Prompt GPT-4 to produce a verbose, entity-sparse initial summary containing minimal entities.
        2. Identify Missing Entities: Extract 1-3 concise, relevant, novel entities from the source text that are absent from the previous summary.
        3. Fuse Entities: Prompt GPT-4 to rewrite the previous summary, fusing in the missing entities without increasing its length; compression and abstraction techniques make space for them.
        4. Iterate: Repeat the Identify Missing Entities and Fuse Entities steps several times, incrementally densifying the summary by packing more entities per token through rewriting.
        5. Output Chain: The final output is a chain of fixed-length summaries of increasing density, produced through iterative abstraction, fusion, and compression.
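
The following Python sketch illustrates the loop above. It is a minimal illustration, not the paper's released code: llm_complete is a hypothetical placeholder for any GPT-4-style completion call, and the prompt wording is paraphrased from the listed steps rather than quoted from the paper (which issues a single prompt asking GPT-4 to perform all densification iterations at once and return them together).

```python
# Illustrative sketch of the Chain of Density (CoD) loop described above.
# `llm_complete` is a hypothetical helper standing in for any GPT-4-style
# chat-completion call; plug in an actual LLM client to run this.

def llm_complete(prompt: str) -> str:
    """Hypothetical wrapper around a GPT-4-style completion API."""
    raise NotImplementedError("Replace with a call to your LLM client.")

def chain_of_density(article: str, steps: int = 5) -> list[str]:
    """Return a chain of fixed-length summaries of increasing entity density."""
    # Step 1: verbose, entity-sparse initial summary.
    summary = llm_complete(
        "Write a concise but entity-sparse summary of the "
        f"following article:\n\n{article}"
    )
    chain = [summary]
    for _ in range(steps - 1):
        # Step 2: identify 1-3 salient entities missing from the current summary.
        missing = llm_complete(
            f"Article:\n{article}\n\nCurrent summary:\n{summary}\n\n"
            "List 1-3 concise, relevant, novel entities from the article "
            "that are missing from the current summary."
        )
        # Step 3: fuse the missing entities into a rewrite of the SAME length,
        # relying on compression and abstraction to make space.
        summary = llm_complete(
            f"Article:\n{article}\n\nCurrent summary:\n{summary}\n\n"
            f"Missing entities:\n{missing}\n\n"
            "Rewrite the summary to incorporate the missing entities "
            "without increasing its length."
        )
        chain.append(summary)  # Steps 4-5: iterate and collect the chain.
    return chain
```

Calling chain_of_density(article_text) would yield a list whose later entries pack in more entities per token; notably, the paper reports that human preference peaks at an intermediate density step rather than at the densest final summary.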