Questions tagged [ocr]

Optical character recognition, usually abbreviated to OCR, is the mechanical or electronic translation of scanned images of handwritten, typewritten or printed text into machine-encoded text.

Optical Character Recognition (OCR) is a computer vision task in which the input is an image containing characters and/or digits, and the goal is to recognize these characters and transform them into a machine representation that allows easier manipulation.

74 questions
9
votes
1 answer

How can you build a model that extracts data from receipts?

I'm trying to build a model that is capable of identifying information on receipts and invoices. I have used the Google Cloud Vision API for text extraction from the receipt, but the problem is that it just returns all the text from a receipt. I am looking to…
8
votes
1 answer

How can you build a model that reads out receipts and invoices?

The objective is to build a model that is capable of identifying information on receipts and invoices that can look completely different. I've had a discussion with my brother about the right approach. I have attached an example, here the original…
Spurious
6
votes
1 answer

Can I train two stacked models end-to-end on different resolutions?

Is it possible to stack two networks on top of each other that operate on different resolutions of input data? So here's my use case: like Google, I want to recognize text in images. Unlike Google, I have access to very limited computational…
user42031
6
votes
1 answer

How to pass features extracted using CNN into RNN?

I have word images; let's say each is a 256x64 image. My aim is to extract the text from the image as 73791096754314441539, which is basically what an OCR does. I am trying to build a model which can recognise words from images. When I am saying…
lordzuko
6
votes
2 answers

Document Layout Analysis - state-of-the-art?

What is the current state of the art in document layout analysis, i.e. detecting columns, separating images from text, distinguishing between page numbering and text, and so on? I am looking for papers and algorithms on the topic.
sebastian
5
votes
1 answer

Pretrained handwritten OCR model

I've been looking around for pretrained models dedicated to handwritten OCR. So far I've found very little. Could you please share any that you know of? I find that Tesseract struggles to parse anything that isn't Arial and perfectly captured.
Piotr Rarus
4
votes
1 answer

Fake News Detection problem

I would like to work on a project for Fake News Detection, especially for Indian news, which comes in different languages and different formats: fake news as an image with little or no text; fake news on a blog site; fake news as tweets; fake news in…
Akash
4
votes
2 answers

How can I solve a label shape problem in TensorFlow when using one-hot encoding?

I used TensorFlow to recognize text from natural images using a convolutional neural network; the text has no fixed number of characters. To train successfully, I should convert the categorical labels into binary using one-hot…
AB_
4
votes
1 answer

What is the loss function in character recognition using TensorFlow?

I have code in TensorFlow using a convolutional neural network to recognize the characters in Street View Text (SVT) data. Since the label type is string, what should I use instead of tf.nn.sparse_softmax_cross_entropy_with_logits() in the loss…
AB_
4
votes
1 answer

What is representation in optical character recognition?

I am learning OCR and reading this book. The authors define 8 processes to implement OCR that follow one another in order (2 after 1, 3 after 2, etc.): optical scanning, location segmentation, pre-processing, segmentation, representation, feature…
4
votes
2 answers

What is the best approach for specified optical character recognition?

I have a fairly straightforward request: extracting information (invoice number, invoice data, due date, total etc.) from scanned invoices (the digital format is an image, not PDF), preferably in Python. The good thing is that the necessary information…
Hendrik
3
votes
0 answers

Multi-page document image classification

Sorry for the long post, but I needed it to capture all the details and questions. I am working on a multi-page document image classification problem and am somewhat confused about which approach or model architecture to follow. Here is the…
2
votes
0 answers

What is the difference between ICR and OCR?

I've just found the term "Intelligent Character Recognition" (ICR) on Wikipedia and other pages. According to Wikipedia: In computer science, intelligent character recognition (ICR) is an advanced optical character recognition (OCR) or — rather…
Martin Thoma
2
votes
0 answers

Extracting document templates from similar documents

Using very basic techniques (zone segmentation + OPTICS) I was able to organize a set of around 10^4 business documents (invoices, receipts) into a hierarchy of clusters of documents with similar layouts. Now, for each cluster, I would like to extract a…
2
votes
1 answer

How to segment old digitized newspapers into articles

I'm working on a large corpus of French daily newspapers from the 19th century that have been digitized, where the data are in the form of raw OCR text files (one text file per day). In terms of size, one year of issues is around 350 000 words…
Tetro