Transformer-Based OCR
As you probably already know, Optical Character Recognition (OCR) is the electronic conversion of images of typed, handwritten, or printed text into machine-encoded text. The source can be a scanned document, a photo of a document, or subtitle text superimposed on an image. Let’s understand how an OCR pipeline works before we dig deeper into Transformer-based OCR.
A typical OCR pipeline consists of two modules.