
Lecture 07: Character Recognition



1. Cluster the connected components of a binary page image, with the aim of discovering the letters. You can use scipy.ndimage.label() for connected-component labeling. Define a metric on the characters; the suggested approach is to rescale every component to a standard size, then compute a distance transform (see scipy.ndimage.distance_transform_edt) and penalize each pixel that differs between two images by the distance-transform value raised to some power (2 is suggested). Based on pairwise measurements of your metric between the connected components, define a clustering. Visualize the result.
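
One possible reading of task 1 is sketched below; it is a starting point under stated assumptions, not a reference solution. The function names (extract_glyphs, glyph_distance, cluster_glyphs), the 16-pixel standard size, the nearest-neighbour rescaling, and the average-linkage clustering are all assumptions made for illustration. The dissimilarity charges every pixel set in only one glyph with its distance to the nearest foreground pixel of the other glyph, raised to the given power; it is symmetric but not guaranteed to satisfy the triangle inequality.

```python
import numpy as np
from scipy import ndimage
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def extract_glyphs(page, size=16):
    """Return each connected component of a boolean page, rescaled to size x size."""
    labels, _ = ndimage.label(page)  # default structure: 4-connectivity
    glyphs = []
    for index, box in enumerate(ndimage.find_objects(labels), start=1):
        # Keep only this component; neighbours may share its bounding box.
        patch = labels[box] == index
        height, width = patch.shape
        # Nearest-neighbour resampling onto a size x size grid (assumed choice).
        rows = np.arange(size) * height // size
        cols = np.arange(size) * width // size
        glyphs.append(patch[np.ix_(rows, cols)])
    return glyphs


def glyph_distance(a, b, power=2):
    """Dissimilarity of two equally sized boolean glyphs."""
    # Distance from every pixel to the nearest foreground pixel of each glyph.
    to_a = ndimage.distance_transform_edt(~a)
    to_b = ndimage.distance_transform_edt(~b)
    # A pixel set in only one glyph is charged its distance to the other glyph.
    only_a = a & ~b
    only_b = b & ~a
    return float((to_b[only_a] ** power).sum() + (to_a[only_b] ** power).sum())


def cluster_glyphs(glyphs, n_clusters, power=2):
    """Return (cluster labels, pairwise distance matrix); needs at least 2 glyphs."""
    n = len(glyphs)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = glyph_distance(glyphs[i], glyphs[j], power)
    # Average-linkage hierarchical clustering on the condensed distance matrix.
    tree = linkage(squareform(dist), method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust"), dist
```

For the visualization step, the glyphs belonging to each cluster label can be shown side by side, for example with matplotlib's imshow. The number of clusters is passed in by hand here; cutting the tree at a distance threshold is an equally plausible choice.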

2. Given a set of alphabet letter images of a given height and a scanned line of the same height, apply the level-building algorithm to arrive at the best recognition. In more detail, consider a string composed of the letters of our alphabet and spaces; we define the rendering of this string as the concatenation of the letter samples we have, and we take a 1-pixel-wide blank image as the sample of the space. The goal is to find a string whose rendering 1) has the same size as the input image and 2) is the closest point to the input image in the Euclidean space of images of this size. This is equivalent to finding the shortest path in a graph where each node is an inter-column position and each arc represents the possibility of inserting a sample between two positions; the weight of an arc is the squared distance between the sample and the image columns it would cover.
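
The shortest-path formulation in task 2 can be sketched as a left-to-right dynamic program over column positions. This is a minimal illustration, not a reference solution: the function name recognize_line, the dict-of-arrays representation of the samples, and the assumption that background pixels have value 0 (so the space sample is a column of zeros) are all hypothetical. Because the squared Euclidean distance is a sum over columns, the cost of a rendering splits into independent per-sample costs, which is what makes the graph formulation valid.

```python
import numpy as np


def recognize_line(line, samples):
    """Return (best string, squared distance of its rendering to the line).

    line    -- 2-D array, height x width
    samples -- dict mapping a character to a 2-D array of the same height
    """
    line = np.asarray(line, dtype=float)
    height, width = line.shape
    templates = {ch: np.asarray(img, dtype=float) for ch, img in samples.items()}
    # Assumption: background is 0, so a space is one column of zeros.
    templates[" "] = np.zeros((height, 1))

    # best[p] is the cheapest cost of rendering columns 0..p-1 exactly.
    best = np.full(width + 1, np.inf)
    best[0] = 0.0
    back = [None] * (width + 1)  # back[p] = (previous position, character)

    for end in range(1, width + 1):
        for ch, tmpl in templates.items():
            start = end - tmpl.shape[1]
            if start < 0 or not np.isfinite(best[start]):
                continue
            # Arc weight: squared distance between the sample and the columns it covers.
            cost = best[start] + np.sum((line[:, start:end] - tmpl) ** 2)
            if cost < best[end]:
                best[end] = cost
                back[end] = (start, ch)

    # Follow the back-pointers from the last column to recover the string.
    chars = []
    pos = width
    while pos > 0:
        pos, ch = back[pos]
        chars.append(ch)
    return "".join(reversed(chars)), float(best[width])
```

Since the space sample is one column wide, every position is reachable and a full-width rendering always exists. The running time is proportional to the line width times the total number of sample pixels.
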
Thomas M. Breuel,
Jun 23, 2010, 8:27 AM
Ilya Mezhirov,
Jun 30, 2011, 8:14 AM