And what does it do?
Our BIQE Archive version is for those who want to preserve paper originals digitally for the future.
This pre- and post-processing software works in batches: you set up the processing on a few pages of a book or document, then apply the same settings to the whole book or document in one run.
Scanning
The first and most important step in the digitization process is scanning. The better the scan, the easier the post-processing will be.
Poor-quality scans lose a lot of pixel information during post-processing, which leads to a poorer OCR result.
For older works, always scan in color at 600 dpi.
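To give a feel for what 600 dpi means in practice, the short sketch below converts a page size to pixel dimensions and to an uncompressed file size. The function names are illustrative only and are not part of BIQE Archive; the A4 page size and the 24-bit color depth are assumptions chosen for the example.

```python
MM_PER_INCH = 25.4


def pixels_for(width_mm, height_mm, dpi):
    """Return (width_px, height_px) for a page scanned at the given dpi."""
    return (
        round(width_mm / MM_PER_INCH * dpi),
        round(height_mm / MM_PER_INCH * dpi),
    )


def uncompressed_bytes(width_px, height_px, channels=3):
    """Raw image size, assuming 8 bits per channel (3 channels = 24-bit color)."""
    return width_px * height_px * channels


if __name__ == "__main__":
    # Assumed example page: A4, 210 mm x 297 mm.
    for dpi in (300, 600):
        w, h = pixels_for(210, 297, dpi)
        size_mb = uncompressed_bytes(w, h) / 1_000_000
        print(f"{dpi} dpi: {w} x {h} px, about {size_mb:.0f} MB uncompressed")
```

Under these assumptions an A4 page at 600 dpi is about 4961 x 7016 pixels, four times as many pixels as at 300 dpi, which is the extra detail the later processing steps and the OCR can draw on.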
Note:
BIQE Archive has a TWAIN driver that automatically recognizes your scanner. There is also our API driver for the more professional (and more expensive) scanners.
Split and crop
The red line in the center divides the pages of the book into two.
The green frame is the crop of the left page.
The blue frame is the crop of the right page. (fig. 1)
Figure 1
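The split-and-crop step can be pictured with a small sketch: a scanned spread is a grid of pixels, the red line is an x position, and the green and blue frames are rectangles on either side of it. This is only an illustration under those assumptions; the `Frame` class and the function names are hypothetical and do not reflect how BIQE Archive implements the step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """Crop rectangle in spread coordinates; right and bottom are exclusive."""
    left: int
    top: int
    right: int
    bottom: int


def crop(pixels, frame):
    """Return the part of a row-major pixel grid that lies inside the frame."""
    return [row[frame.left:frame.right] for row in pixels[frame.top:frame.bottom]]


def split_and_crop(pixels, split_x, left_frame, right_frame):
    """Cut a two-page spread into a left page and a right page.

    split_x plays the role of the red line; left_frame and right_frame
    play the roles of the green and blue frames.
    """
    if left_frame.right > split_x or right_frame.left < split_x:
        raise ValueError("each crop frame must stay on its own side of the split line")
    return crop(pixels, left_frame), crop(pixels, right_frame)


if __name__ == "__main__":
    # Toy 8 x 4 spread in which every pixel stores its own (x, y) position.
    spread = [[(x, y) for x in range(8)] for y in range(4)]
    left, right = split_and_crop(
        spread, split_x=4,
        left_frame=Frame(1, 1, 4, 3),
        right_frame=Frame(4, 1, 7, 3),
    )
    print(left[0][0], right[0][0])
```

Keeping the split line and the two frames as plain data is what makes the step repeatable: the same three values can be applied to every spread in a batch.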
Image profiles
To use batch processing, you must first create and save an image profile.
A profile is built from the 39 available processing (image) filters, which you place in a specific order to get the best OCR result.
You can also use a particular image filter more than once in a profile, such as the Despeckle or Dewarp filters.
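An image profile can be thought of as an ordered list of filters that is applied, unchanged, to every page in the batch. The sketch below shows that idea with two toy filters on small grayscale grids. The filters, their names and the `apply_profile` function are stand-ins invented for this example; they are not the filters or the API of BIQE Archive.

```python
# A page is a grid of grayscale values: 0 = black, 255 = white.

def binarize(image, threshold=128):
    """Map every pixel to pure black or pure white."""
    return [[0 if value < threshold else 255 for value in row] for row in image]


def despeckle(image, min_neighbours=2):
    """Toy despeckle: whiten black pixels with too few black 8-neighbours."""
    height = len(image)
    width = len(image[0]) if height else 0

    def black_neighbours(y, x):
        count = 0
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                if (dy, dx) == (0, 0):
                    continue
                ny, nx = y + dy, x + dx
                if 0 <= ny < height and 0 <= nx < width and image[ny][nx] == 0:
                    count += 1
        return count

    result = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if image[y][x] == 0 and black_neighbours(y, x) < min_neighbours:
                result[y][x] = 255
    return result


def apply_profile(profile, pages):
    """Run every filter of the profile, in order, on every page of the batch."""
    processed = []
    for page in pages:
        for image_filter in profile:
            page = image_filter(page)
        processed.append(page)
    return processed


if __name__ == "__main__":
    page = [
        [255, 255, 255, 255, 255],
        [255, 10, 20, 30, 255],
        [255, 255, 255, 255, 255],
    ]
    once = apply_profile([binarize, despeckle], [page])
    twice = apply_profile([binarize, despeckle, despeckle], [page])
    print(once[0][1])
    print(twice[0][1])
```

In this toy case one despeckle pass leaves the middle pixel of the three-pixel speck, and a second pass removes it, which illustrates why the same filter can usefully appear more than once in a profile and why the order of the filters matters.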
Optical Character Recognition (OCR)
The ultimate goal of the previous steps is to achieve the best possible OCR result, so that the images you have scanned or pasted into the Gallery become searchable.
This means that after post-processing, we need to write the processed images to a specific file type that will allow us to search the OCR result.
The best-known format is PDF. In BIQE Archive, you can save the images not only as PDF but also in many other file types (fig. 4), and you can choose (add) the appropriate OCR language for each file type (fig. 5).