Vision Transformer Architecture

From GM-RKB

A Vision Transformer Architecture is a patch-based, transformer-based neural network architecture for image processing that handles visual inputs by splitting each image into patches and treating those patches as a sequence of vision transformer tokens.
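
The patch-to-token step in this definition can be illustrated with a minimal NumPy sketch. It assumes a single image stored as a height x width x channels array, square non-overlapping patches, and a learned linear projection with additive position embeddings. The function names `image_to_patch_tokens` and `embed_tokens` and their parameters are hypothetical, chosen for illustration only, and are not taken from any particular library.

```python
import numpy as np


def image_to_patch_tokens(image, patch_size):
    """Split an (H, W, C) image into flattened, non-overlapping square patches.

    Returns an array of shape (num_patches, patch_size * patch_size * C),
    with patches ordered row by row (top-left patch first).
    """
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")
    grid_h, grid_w = h // patch_size, w // patch_size
    # Expose the patch grid: (grid_h, patch_size, grid_w, patch_size, C).
    patches = image.reshape(grid_h, patch_size, grid_w, patch_size, c)
    # Bring the two grid axes to the front: (grid_h, grid_w, patch_size, patch_size, C).
    patches = patches.transpose(0, 2, 1, 3, 4)
    # Flatten each patch into one vector, giving one row per token.
    return patches.reshape(grid_h * grid_w, patch_size * patch_size * c)


def embed_tokens(tokens, projection, position_embeddings):
    """Project flattened patches to the model width and add position embeddings.

    Assumed shapes: tokens (N, P), projection (P, D), position_embeddings (N, D).
    """
    return tokens @ projection + position_embeddings


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((32, 32, 3))                  # toy input image
    tokens = image_to_patch_tokens(image, patch_size=8)
    projection = rng.normal(size=(tokens.shape[1], 64))
    positions = rng.normal(size=(tokens.shape[0], 64))
    sequence = embed_tokens(tokens, projection, positions)
    print(tokens.shape, sequence.shape)
```

The resulting array has one row per patch, which is the token sequence a transformer encoder would then consume. Details such as a class token or the exact form of the position embeddings vary between designs and are omitted from this sketch.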