I am working on a project where I have to extract the sales tax invoice from a PDF document that contains other documents along with the invoice. I researched the topic and am considering two solutions:
- Converting the PDF pages to images and then performing image classification with a model such as VGG-16.
- Using a transformer model for document classification: it would convert each image/PDF page to text and then classify the page.

Both solutions have latency issues: the first requires converting the PDF to images (pdf2image), which is slow, and the second relies on OCR, which is also slow. So I need some advice on how to approach this problem.
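For reference, the text-based option boils down to a per-page loop like the rough sketch below. It is only an illustration of the pipeline shape, not a working solution: `find_invoice_pages`, `keyword_classifier`, and the sample page strings are all hypothetical, and the keyword check is a trivial stand-in for whatever transformer classifier would actually be used. The page texts are assumed to come from an earlier extraction or OCR step that is not shown.

```python
from typing import Callable, List, Sequence


def find_invoice_pages(
    page_texts: Sequence[str],
    is_invoice: Callable[[str], bool],
) -> List[int]:
    """Return zero-based indices of pages that the classifier flags as invoice pages."""
    return [i for i, text in enumerate(page_texts) if is_invoice(text)]


def keyword_classifier(text: str) -> bool:
    """Hypothetical placeholder for the real model: a case-insensitive keyword check."""
    lowered = text.lower()
    return "sales tax" in lowered and "invoice" in lowered


# Made-up page texts, standing in for the output of text extraction or OCR.
pages = [
    "Purchase order for office supplies",
    "SALES TAX INVOICE\nBuyer: Example Traders",
    "Delivery note",
]

invoice_pages = find_invoice_pages(pages, keyword_classifier)
print(invoice_pages)
```

Passing the classifier in as a function keeps the page loop independent of the model, so the placeholder can be swapped for an image-based or transformer-based classifier without changing the rest of the code.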