LLM-based Text Summarization Algorithm

An LLM-based Text Summarization Algorithm is a neural text summarization algorithm that uses a pre-trained large language model (LLM) to generate summaries.
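
As a minimal sketch of this idea, the snippet below delegates summarization to a pre-trained LLM through a single prompt. The OpenAI-style client, the "gpt-4" model name, and the prompt wording are illustrative assumptions, not part of the definition above.

# Minimal sketch: summarization as a single prompt to a pre-trained LLM.
# Client, model name, and prompt wording are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def summarize(text: str, max_words: int = 80) -> str:
    """Ask a pre-trained LLM for an abstractive summary of `text`."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": f"Summarize the following text in at most {max_words} words:\n\n{text}",
        }],
    )
    return response.choices[0].message.content.strip()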



References

2023

  • (Adams, Fabbri et al., 2023) ⇒ Griffin Adams, Alexander Fabbri, Faisal Ladhak, Eric Lehman, and Noémie Elhadad. (2023). “From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting.” https://arxiv.org/pdf/2309.04269.pdf doi:10.48550/arXiv.2309.04269
    • SUMMARY:
      • It introduces the Chain of Density (CoD) prompting technique to generate dense GPT-4 summaries without extending their length.
      • It employs an iterative method where GPT-4 starts with an entity-sparse summary and then incorporates missing salient entities, maintaining the summary's original length.
      • It emphasizes that CoD summaries are more abstractive, show more fusion, and reduce lead bias compared to the summaries produced by a vanilla GPT-4 prompt.
      • High-level Algorithm (a code sketch follows the steps below):
        1. Generate Initial Summary: Prompt GPT-4 to produce a verbose, entity-sparse initial summary.
        2. Identify Missing Entities: Extract 1-3 concise, relevant, novel entities from the source text that are not in the previous summary.
        3. Fuse Entities: Prompt GPT-4 to rewrite the previous summary, fusing in the missing entities without increasing its length; employ compression and abstraction to make space.
        4. Iterate: Repeat the Identify Missing Entities and Fuse Entities steps several times, incrementally densifying the summary by packing more entities into the same number of tokens.
        5. Output Chain: The final output is a chain of fixed-length summaries of increasing density, produced through iterative abstraction, fusion, and compression.
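
The sketch below implements the iterative loop described in the steps above. It is a paraphrase under assumptions, not the paper's exact method: the published CoD technique packs all densification rounds into a single GPT-4 prompt, whereas this version issues one call per round, and the OpenAI-style client, model name, and prompt wording are illustrative.

# Sketch of the Chain of Density loop described above; prompt wording
# is paraphrased from the paper's description and is an assumption.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def _complete(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4", messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content.strip()

def chain_of_density(article: str, rounds: int = 5) -> list[str]:
    """Return a chain of fixed-length summaries of increasing entity density."""
    # Step 1: verbose, entity-sparse initial summary (~80 words).
    summary = _complete(
        "Write a vague, verbose summary of about 80 words for this article, "
        f"mentioning very few specific entities:\n\n{article}"
    )
    chain = [summary]
    for _ in range(rounds - 1):
        # Steps 2-3: identify 1-3 missing salient entities, then rewrite the
        # summary to fuse them in at the same length, compressing to make space.
        summary = _complete(
            "Identify 1-3 informative entities from the article that are "
            "missing from the previous summary, then rewrite the summary to "
            "include them WITHOUT increasing its length. Use fusion and "
            "compression to make space; never drop entities already present.\n\n"
            f"Article:\n{article}\n\nPrevious summary:\n{summary}"
        )
        chain.append(summary)  # Step 4: each round densifies the summary.
    return chain  # Step 5: the chain of increasingly dense summaries.

Calling chain_of_density(article) would then return five summaries of roughly equal length, each packing in more entities than the last, mirroring the fixed-length densification the steps describe.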