Questions tagged [ocr]

Optical character recognition, usually abbreviated to OCR, is the mechanical or electronic translation of scanned images of handwritten, typewritten or printed text into machine-encoded text.

Optical Character Recognition (OCR) is a computer vision task in which the input is an image containing characters and/or digits, and the goal is to recognize those characters and convert them into a machine-readable representation that is easier to manipulate.

74 questions

1 answer

How can you build a model that extracts data from receipts?

I'm trying to build a model that is capable of identifying information on receipts and invoices. I have used the Google Cloud Vision API for text extraction from the receipt, but the problem is that it just returns all the text from a receipt. I am looking to…

asked Dec 28 '19 at 10:20

user_12

1 answer

How can you build a model that reads out receipts and invoices?

The objective is to build a model that is capable of identifying information on receipts and invoices that can look completely different. I've had a discussion with my brother about the right approach. I have attached an example, here the original…

image-recognition information-retrieval ocr

asked May 19 '18 at 10:19

Spurious

1 answer

Can I train two stacked models end-to-end on different resolutions?

Is it possible to stack two networks on top of each other that operate on different resolutions of input data? So here's my use case: like Google, I want to recognize text in images. Unlike Google, I have access to very limited computational…

machine-learning neural-network rnn ocr cnn

asked Nov 17 '17 at 15:23

user42031

1 answer

How to pass features extracted using CNN into RNN?

I have word images as below: Let's say it's a 256x64 image. My aim is to extract the text from the image as 73791096754314441539, which is basically what an OCR does. I am trying to build a model that can recognise words from images. When I am saying…

neural-network deep-learning tensorflow rnn ocr

asked Jul 11 '17 at 06:31

lordzuko

2 answers

Document Layout Analysis - state-of-the-art?

What is the current state of the art in document layout analysis, i.e. detecting columns, separating images from text, distinguishing between page numbering and text, and so on? I am looking for papers and algorithms on the topic.

ocr

asked Jun 01 '17 at 06:19

sebastian

1 answer

Pretrained handwritten OCR model

I've been looking around for pretrained models dedicated to handwritten OCR. So far I've found very little. Could you please share any that you know of? I find that Tesseract struggles to parse anything that isn't Arial and perfectly captured.

machine-learning deep-learning nlp ocr

asked Jan 17 '20 at 12:23

Piotr Rarus

1 answer

Fake News Detection problem

I would like to work on a project for fake news detection, especially for Indian news, which comes in different languages and different formats: fake news as an image with little or no text; fake news on a blog site; fake news as tweets; fake news in…

machine-learning deep-learning ocr

asked Jan 13 '20 at 09:55

Akash

2 answers

How can I solve a label shape problem in TensorFlow when using one-hot encoding?

I used TensorFlow to recognize text in natural images with a convolutional neural network; the text has no fixed number of characters. To train successfully, I should convert the categorical labels into binary using one-hot…

neural-network tensorflow convolution ocr

asked Aug 21 '17 at 18:23

AB_

1 answer

What is the loss function in character recognition using TensorFlow?

I have TensorFlow code that uses a convolutional neural network to recognize the characters in Street View Text (SVT) data. Since the label type is string, what should I use instead of tf.nn.sparse_softmax_cross_entropy_with_logits() in the loss…

deep-learning tensorflow convolution ocr

asked Aug 14 '17 at 19:26

AB_

1 answer

What is representation in optical character recognition?

I am learning OCR and reading this book. The authors define 8 processes to implement OCR that follow one another (2 after 1, 3 after 2, etc.): optical scanning, location segmentation, pre-processing, segmentation, representation, feature…

feature-selection feature-extraction feature-engineering ocr

asked Jun 06 '17 at 18:12

Pavel_K

2 answers

What is the best approach for specified optical character recognition?

I have a quite understandable request of extracting information (invoice number, invoice data, due date, total etc.) from scanned invoices (the digital format is image, not PDF), preferably in Python. The good thing is that the necessary information…

python deep-learning nlp regex ocr

asked Mar 23 '17 at 08:53

Hendrik

0 answers

Multi-page document image classification

Sorry for the long post, but I needed it to capture all the details and questions. I am working on a multi-page document image classification problem and am unsure which approach or model architecture to follow. Here is the…

deep-learning pytorch computer-vision ocr document-understanding

asked Jun 15 '23 at 00:26

asanoop24

0 answers

What is the difference between ICR and OCR?

I've just found the term "Intelligent Character Recognition" (ICR) on Wikipedia and other pages. According to Wikipedia: In computer science, intelligent character recognition (ICR) is an advanced optical character recognition (OCR) or — rather…

reference-request ocr

asked Jul 23 '20 at 08:50

Martin Thoma

0 answers

Extracting document templates from similar documents

Using very basic techniques (zone segmentation + OPTICS), I was able to organize a set of around 10^4 business documents (invoices, receipts) into a hierarchy of clusters of documents with similar layouts. Now, for each cluster, I would like to extract a…

computer-vision ocr similar-documents information-extraction

asked Jan 23 '20 at 16:42

dzieciou

1 answer

How to segment old digitized newspapers into articles

I'm working on a large corpus of French daily newspapers from the 19th century that have been digitized, where the data are raw OCR text files (one text file per day). In terms of size, one year of issues is around 350 000 words…

nlp text-mining ocr

asked Aug 06 '19 at 15:52

Tetro
