
Lecture 06: Document Image Analysis



RapidMiner Exercise:
a) Install RapidMiner on your machine (download source: http://rapid-i.com/content/view/26/84/).
b) Build a pattern recognition system as illustrated in the exercise.
c) Split the attached data into 70% training and 30% testing.
d) Send me the best percentage accuracy, the model used, the parameter settings, and the time taken to complete the activity after installing RapidMiner.

Use Python for the following exercises.

Exercise 6.2 (Binarization)
Implement Otsu's binarization algorithm and apply it to the image sample.png.
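
As a starting point, here is a minimal sketch of Otsu's method using NumPy. It assumes an 8-bit grayscale image stored as a 2-D uint8 array and treats dark pixels (at or below the threshold) as foreground. The function names otsu_threshold and binarize are illustrative and not part of the assignment.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the gray level t that maximizes the between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    Assumes `gray` is a 2-D uint8 array (values 0..255).
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()

    omega = np.cumsum(prob)                  # probability of the class "<= t"
    mu = np.cumsum(prob * np.arange(256))    # cumulative mean up to t
    mu_total = mu[-1]                        # mean of the whole image

    # Between-class variance: (mu_total*omega - mu)^2 / (omega * (1 - omega)).
    # Thresholds that leave one class empty are skipped (variance kept at 0).
    denom = omega * (1.0 - omega)
    valid = denom > 0
    sigma_b = np.zeros(256)
    sigma_b[valid] = (mu_total * omega[valid] - mu[valid]) ** 2 / denom[valid]

    return int(np.argmax(sigma_b))

def binarize(gray):
    """Return a boolean mask where True marks foreground (dark) pixels."""
    return gray <= otsu_threshold(gray)
```

One possible way to load the input, assuming the Pillow library is installed, is np.asarray(Image.open("sample.png").convert("L")), after which binarize can be called on the resulting array.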

Exercise 6.3 (Page Segmentation)
Implement the run-length smearing algorithm with all four steps:
    • Horizontally smear the binary image obtained in the previous task.
    • Vertically smear the binary image obtained in the previous task.
    • Obtain an image by performing a pixel-wise AND operation on the two smeared images.
    • Horizontally smear the obtained image to obtain the final bitmap.
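
The four steps above can be sketched as follows. This is only one possible reading of the algorithm: it assumes a boolean image where True marks black (foreground) pixels, fills a white run only when it lies between two black pixels of the same row and is no longer than the given gap, and leaves white runs touching the image border untouched. The names smear_rows and rlsa and the three gap parameters are illustrative, and suitable threshold values have to be chosen for the image at hand.

```python
import numpy as np

def smear_rows(binary, max_gap):
    """Fill horizontal white runs of length <= max_gap enclosed by black pixels."""
    out = np.array(binary, dtype=bool)       # copy, so the input is not modified
    for row in out:                          # each row is a view into `out`
        black = np.flatnonzero(row)
        for a, b in zip(black[:-1], black[1:]):
            # The white run between a and b has length b - a - 1.
            if 1 < b - a <= max_gap + 1:
                row[a + 1:b] = True
    return out

def rlsa(binary, h_gap, v_gap, final_gap):
    """Run-length smearing: horizontal, vertical, AND, final horizontal pass."""
    binary = np.asarray(binary, dtype=bool)
    horizontal = smear_rows(binary, h_gap)        # step 1
    vertical = smear_rows(binary.T, v_gap).T      # step 2: smear columns via transpose
    combined = horizontal & vertical              # step 3: pixel-wise AND
    return smear_rows(combined, final_gap)        # step 4
```

Vertical smearing reuses the row routine on the transposed image, which keeps the sketch short at the cost of an extra copy.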

Exercise 6.4 (Connected Component Analysis)
Perform connected component analysis on the final image to extract image segments.
Draw rectangular bounding boxes around the segments in the original input image.
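
A minimal sketch of this step, assuming 8-connectivity and a boolean mask where True marks foreground, is given below. It uses a breadth-first flood fill to collect each component and records its bounding box as (top, left, bottom, right) with inclusive coordinates. The names connected_components and draw_boxes are illustrative, and the boxes are drawn as one-pixel outlines on a copy of a grayscale array.

```python
from collections import deque
import numpy as np

def connected_components(binary):
    """Return bounding boxes (top, left, bottom, right) of 8-connected regions."""
    binary = np.asarray(binary, dtype=bool)
    h, w = binary.shape
    seen = np.zeros((h, w), dtype=bool)
    boxes = []
    for y in range(h):
        for x in range(w):
            if not binary[y, x] or seen[y, x]:
                continue
            # Start a new component and flood-fill it breadth-first.
            seen[y, x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                # Visit the 3x3 neighbourhood, clipped to the image.
                for ny in range(max(cy - 1, 0), min(cy + 2, h)):
                    for nx in range(max(cx - 1, 0), min(cx + 2, w)):
                        if binary[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes

def draw_boxes(gray, boxes, value=0):
    """Return a copy of `gray` with a one-pixel outline drawn for each box."""
    out = gray.copy()
    for top, left, bottom, right in boxes:
        out[top, left:right + 1] = value
        out[bottom, left:right + 1] = value
        out[top:bottom + 1, left] = value
        out[top:bottom + 1, right] = value
    return out
```

The pure-Python loops are slow on large pages. A library routine such as scipy.ndimage.label could replace the flood fill if the exercise permits it.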

Ilya Mezhirov,
May 26, 2011, 9:52 AM
Thomas M. Breuel,
May 27, 2010, 4:42 AM
Thomas M. Breuel,
May 20, 2010, 11:43 AM