Blockwise Approximate KV Cache Technique

From GM-RKB
(Redirected from Approximate Block Caching)

A Blockwise Approximate KV Cache Technique is an approximate, block-based model inference optimization technique that reuses cached key-value (KV) computations for sequence blocks that remain stable between inference steps and recomputes them only for blocks that are changing (enabling efficient caching in diffusion models).
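
The reuse-or-recompute idea can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration and does not come from a specific system: the class name `BlockwiseApproxKVCache`, the single linear key/value projections `w_k` and `w_v`, and in particular the rule that a block counts as "stable" when its inputs have drifted by less than a tolerance `tol` since its keys and values were last computed. Real systems may instead decide stability from the decoding schedule, for example by treating already finalized blocks as stable.

```python
import numpy as np


class BlockwiseApproxKVCache:
    """Toy blockwise approximate KV cache (illustrative sketch, not a library API)."""

    def __init__(self, w_k, w_v, block_size, tol=1e-3):
        self.w_k = w_k              # key projection, shape (d_model, d_head)
        self.w_v = w_v              # value projection, shape (d_model, d_head)
        self.block_size = block_size
        self.tol = tol              # hypothetical stability threshold
        self.ref = None             # inputs seen when each block was last computed
        self.k = None               # cached keys, shape (seq_len, d_head)
        self.v = None               # cached values, shape (seq_len, d_head)

    def update(self, x):
        """Return (K, V, recomputed) for inputs x of shape (seq_len, d_model).

        `recomputed` lists the indices of the blocks whose K/V were refreshed.
        The returned arrays are the cache's own buffers; copy them if they
        must survive a later call.
        """
        n = x.shape[0]
        if self.ref is None or self.ref.shape != x.shape:
            # Cold start (or shape change): compute every block exactly.
            self.ref = x.copy()
            self.k = x @ self.w_k
            self.v = x @ self.w_v
            n_blocks = -(-n // self.block_size)  # ceiling division
            return self.k, self.v, list(range(n_blocks))

        recomputed = []
        for b, start in enumerate(range(0, n, self.block_size)):
            sl = slice(start, start + self.block_size)
            drift = np.max(np.abs(x[sl] - self.ref[sl]))
            if drift > self.tol:
                # Changing block: recompute and refresh the reference inputs.
                self.k[sl] = x[sl] @ self.w_k
                self.v[sl] = x[sl] @ self.w_v
                self.ref[sl] = x[sl]
                recomputed.append(b)
            # Stable block: keep the (possibly slightly stale) cached K/V.
        return self.k, self.v, recomputed
```

In this sketch the approximation comes from the stable branch: cached keys and values are served even though the block's inputs may have moved by up to `tol`, so a smaller `tol` gives results closer to exact recomputation at the cost of fewer cache hits.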