Detecting the text within the rectangular box
For this work I collected a signature dataset on A4 sheets of paper. Each sheet contained about 12 rectangular boxes, and each box held one sample. The main problem was locating each box on the page so that the data inside it could be extracted for pre-processing. The steps involved in detecting the boxes are explained below:
For the extraction I followed the approach described at https://medium.com/coinmonks/a-box-detection-algorithm-for-any-image-containing-boxes-756c15d7ed26, where the author shows how to extract data laid out in a row-column (grid) format.
The uploaded image can contain noise that leads to poor results, so it is important to pre-process it first. The pre-processing involves the following steps:
The image may be in a colour format such as RGB, so it is first converted to grayscale, which reduces it to a single intensity channel that can be thresholded directly. Thresholding then partitions the image between the actual data and the background noise, isolating the data by converting the grayscale image into binary values. The image after applying the thresholding looks like the figure below.
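As a rough illustration of this step, the sketch below uses OpenCV; the file name "form.png" and the choice of Otsu's method for the threshold are assumptions, not taken from the original code.

```python
# Minimal sketch: grayscale conversion + inverse binary thresholding (assumed: OpenCV, Otsu).
import cv2

img = cv2.imread("form.png")                       # scanned A4 sheet (BGR); file name is assumed
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)       # single intensity channel

# Inverse binary threshold: box lines and ink become white (255) on a black
# background, which is convenient for the morphological step that follows.
_, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)
cv2.imwrite("thresholded.png", thresh)
```

The inverted polarity is a design choice: erosion and dilation in the next step operate on white foreground pixels, so the box lines need to be white.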
Now, from the thresholded image we need to recover the box structure, for which we use morphological operations. These help in detecting the lines that form the rows and columns of the boxes. The result looks something like the figure below.
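A minimal sketch of this step is given below, assuming the thresholded image from the previous snippet; the kernel length (a fraction of the image width) and the iteration counts are illustrative assumptions rather than the author's exact values.

```python
# Minimal sketch: morphological extraction of vertical and horizontal box lines (parameters assumed).
import cv2

thresh = cv2.imread("thresholded.png", cv2.IMREAD_GRAYSCALE)

kernel_len = thresh.shape[1] // 40                                        # assumed fraction of the width
vert_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, kernel_len))  # tall, thin kernel
hori_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernel_len, 1))  # short, wide kernel

# Erode then dilate with a tall kernel to keep only vertical lines,
# and with a wide kernel to keep only horizontal lines; everything else
# (including the handwriting inside the boxes) is erased.
vertical_lines = cv2.dilate(cv2.erode(thresh, vert_kernel, iterations=3), vert_kernel, iterations=3)
horizontal_lines = cv2.dilate(cv2.erode(thresh, hori_kernel, iterations=3), hori_kernel, iterations=3)

# Combine the two line images so only the box grid remains.
grid = cv2.addWeighted(vertical_lines, 0.5, horizontal_lines, 0.5, 0.0)
_, grid = cv2.threshold(grid, 128, 255, cv2.THRESH_BINARY)
cv2.imwrite("grid.png", grid)
```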
From the above images it is clear that the data inside the boxes is erased and only the grid lines remain, which, according to the author, gives accurate results when the contour method is applied. The findContours() method is used to detect the boxes, and the resulting contours are sorted with a top-to-bottom approach. Each detected box is then cropped from the original image and saved in a separate folder, ready for feature extraction. This folder may also contain images the user does not require, and those can be removed according to the user's needs. In my code I got nearly 32 images, out of which 12 were unnecessary for my work.
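The sketch below illustrates this last step under a few assumptions: OpenCV 4.x (where findContours() returns two values), the "grid.png" and "form.png" files from the earlier snippets, an output folder named "boxes/", and illustrative size limits for filtering out the page outline and tiny artefacts.

```python
# Minimal sketch: find box contours, sort them top-to-bottom, crop and save each box (assumptions noted above).
import os
import cv2

img = cv2.imread("form.png")
grid = cv2.imread("grid.png", cv2.IMREAD_GRAYSCALE)

# OpenCV 4.x signature: returns (contours, hierarchy).
contours, _ = cv2.findContours(grid, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

# Sort bounding boxes top-to-bottom (then left-to-right within a row).
boxes = sorted((cv2.boundingRect(c) for c in contours), key=lambda b: (b[1], b[0]))

os.makedirs("boxes", exist_ok=True)
idx = 0
for x, y, w, h in boxes:
    # Keep only regions that plausibly correspond to form boxes, discarding the
    # full-page contour and tiny fragments; the thresholds here are assumed.
    if 50 < w < img.shape[1] and 20 < h < img.shape[0] // 2:
        idx += 1
        cv2.imwrite(os.path.join("boxes", f"box_{idx}.png"), img[y:y + h, x:x + w])
```

The saved crops can then be inspected and the unwanted ones deleted by hand, as described above.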
The entire source code of this work is available on my GitHub page.