Mobile Document Scanner
The main objective of this project is to build a document scanner with OpenCV and Python. Given a photo of a document, the task is to produce a top-down view of that document.
This task can be accomplished step by step, as shown in the following block diagram:
- We load the image using the OpenCV (cv2) library.
- In the preprocessing stage, we convert the input color image to grayscale (note that OpenCV loads images in BGR channel order, not RGB) and apply Gaussian blurring to the grayscale image to remove high-frequency noise.
- We then perform Canny edge detection.
After performing these steps, we obtain the following image.
Contour detection is a very important step in this project: it finds the four edges, or outline, of the document. It is not as difficult as it sounds. We approach the problem with a simple assumption: “Any document is assumed to be a rectangle, and any rectangle has four edges.” Hence, our idea is to find the largest contour in the image that can be approximated by exactly four points; we take this to be the outline of the document. The green line in the image below shows the detected contour.
The final step is to apply a four-point transformation: we take the four points obtained previously, which represent the outline of the document, and apply a perspective transformation to obtain a top-down, “bird's-eye view” of the document. Here we use a function called ‘four_point_transform’.
More Examples
Summary
In this project we showed how to build a mobile document scanner in three major steps: edge detection, finding contours, and a four-point perspective transformation. An optional improvement is to apply thresholding to the image to get a clean black-and-white image and then find contours. A few parameters also determine how good the black-and-white output is, namely the Gaussian blur inputs and the Canny detection inputs. For more information, see the Canny and Gaussian blur documentation in opencv-python-tutorials.