Image Manipulation and Pre-processing
The ABBYY FineReader Engine 8.0 for Mac can receive images in two ways: directly from memory or by opening them from files. It supports major image formats, including multi-page TIFF and JPEG 2000 (Part 1), and works with black-and-white, grayscale, and color images. It also opens PDF files.
The FineReader Engine 8.0 for Mac can also save original and modified images in various formats. For the full list of input/output image formats, see the Specifications section.
Upon receiving images, the FineReader Engine 8.0 for Mac can perform the following pre-processing functions to improve recognition:
* Adaptive Binarization. This technology dynamically adjusts the brightness threshold for each image fragment during recognition. Because each fragment is processed with its own parameters, recognition is significantly more accurate for documents with gray or colored backgrounds, variable contrast, or textures.
* Auto-detection of page orientation (90, 180, 270 degrees).
This feature is important when the direction in which an image was scanned is unknown. The FineReader Engine automatically detects the orientation of each page and corrects it if needed.
* Adaptive image pre-processing for camera images. This technology applies different processing algorithms and corrects the specific distortions typically seen in digital camera images. It improves the accuracy of digital camera OCR by 40% on average.
* Despeckling of an image in individual blocks (or zones), with the ability to specify the size of black dots.
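The adaptive binarization item in the list above describes a threshold that changes from one image fragment to the next. The following is a minimal sketch of that general idea using a local-mean threshold in plain Python. It is not ABBYY's algorithm and does not use the FineReader Engine API; the function name `adaptive_binarize` and the `window` and `offset` parameters are assumptions made for illustration.

```python
def adaptive_binarize(gray, window=3, offset=10):
    """Binarize a grayscale image given as a list of rows of 0-255 ints.

    Each pixel is compared with the mean brightness of its own
    neighbourhood (minus a small offset) instead of one global threshold.
    Returns rows of 0 (ink) and 1 (background).
    """
    height, width = len(gray), len(gray[0])
    half = window // 2
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the neighbourhood, clipped at the image borders.
            neighbours = [
                gray[ny][nx]
                for ny in range(max(0, y - half), min(height, y + half + 1))
                for nx in range(max(0, x - half), min(width, x + half + 1))
            ]
            local_threshold = sum(neighbours) / len(neighbours) - offset
            row.append(1 if gray[y][x] > local_threshold else 0)
        result.append(row)
    return result


if __name__ == "__main__":
    # Bright background on the left, dark background on the right,
    # with one slightly darker "ink" pixel in each half.
    sample = [
        [200, 200, 200, 100, 100, 100],
        [200, 160, 200, 100, 60, 100],
        [200, 200, 200, 100, 100, 100],
    ]
    for line in adaptive_binarize(sample):
        print(line)
```

A single global threshold cannot separate both ink pixels in this sample from their backgrounds, because the ink on the bright side (160) is lighter than the background on the dark side (100). A local threshold handles each neighbourhood on its own terms, although it can also mark pixels along a sharp background edge as ink.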
The FineReader Engine also offers a number of image manipulation functions: scaling, clipping, creating previews, rotating (90, 180, 270 degrees), mirroring, and inverting.
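To make the manipulation functions concrete, here is a small sketch of clipping, rotating, mirroring, and inverting on an image stored as a list of pixel rows. These helpers are hypothetical and illustrate the operations only; they are not FineReader Engine API calls.

```python
def rotate90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def rotate(img, degrees):
    """Rotate clockwise by 0, 90, 180 or 270 degrees."""
    if degrees not in (0, 90, 180, 270):
        raise ValueError("degrees must be 0, 90, 180 or 270")
    result = [list(row) for row in img]
    for _ in range(degrees // 90):
        result = rotate90(result)
    return result


def mirror(img):
    """Flip the image horizontally (left becomes right)."""
    return [list(row[::-1]) for row in img]


def invert(img, max_value=255):
    """Swap dark and light: each pixel p becomes max_value - p."""
    return [[max_value - p for p in row] for row in img]


def clip(img, left, top, right, bottom):
    """Keep the rectangle [left, right) x [top, bottom)."""
    return [list(row[left:right]) for row in img[top:bottom]]
```

Every helper returns a new image and leaves its input untouched, so the original can still be saved alongside the modified copy.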

Document Analysis and Full Layout Retention
The document analysis functions of the FineReader Engine API handle tasks such as automatic document conversion with full-page layout retention and zonal OCR with manually placed blocks. They include:
* auto-detection of page orientation - 90, 180, 270 degrees;
* auto-detection of text blocks, tables, barcodes and pictures;
* auto-detection of vertical text in table cells;
* manual block zoning (adding, removing and editing blocks).
Features unique to the FineReader Engine are:
* Document Analysis for Archiving Tasks
This function automatically detects and recognizes all text in a document, including text embedded in pictures, charts, and diagrams. Developers can use this mode of document analysis to extract the exhaustive full-text information needed to build document indexes (as in DMS, CMS, and archiving systems).
* Document Analysis for Invoices
A special document analysis function designed as a preprocessing stage for converting semi-structured documents such as invoices, payment drafts, checks, transfers, business cards, agreements, health claim forms, and resumes. In this role, it finds as much text in these documents as possible, including characters and numbers, even when the information is located within stamps, pictures, logos, or small-text areas. Unlike standard full-page document analysis, this specialized analysis assumes that all printed information in the document is text. It also ensures that important text is not identified as a graphic element and that words and numerical values are not split into separate characters. As a result, the maximum amount of information about the text, including its coordinates, is available for analysis, field-by-field processing, and parsing at subsequent processing stages by other systems.
* Export to Multiple Formats, including PDF, RTF/DOC/WordML XML, and HTML, with exact layout retention.

Recognition
OCR
* Recognition of 175 languages.
* 170 languages with Latin, Cyrillic, Greek, Armenian and Hebrew characters.
* 43 languages have dictionary/morphology support.
* Recognition of multilingual documents.
* Recognition of dot-matrix documents.
* Recognition of typewritten documents.
* Fast mode recognition.
Designed for high-volume document processing applications where speed is more important than accuracy. This mode increases processing speed by 200-250%, making it particularly useful with document management and archiving systems.
* Recognition of OCR-A and OCR-B.
* FineReader XIX.
Many old documents, books, and newspapers were published all over the world from the 17th to the 20th century. The set of functions called “FineReader XIX” provides a unique capability to recognize texts published between 1600 and 1937 in English, French, German, Italian, and Spanish. FineReader XIX supports special fonts such as Fraktur, Schwabacher, and the majority of Gothic fonts.
For the entire list of supported OCR languages, see the Specifications section.
Barcode Recognition
* Recognition of 1D barcodes.
For the full list of supported 1D barcodes, see the Specifications section.
* 2D barcode recognition (PDF417).
2D barcode recognition supports PDF417, the industry standard for 2D barcodes. PDF417 encodes up to 1.1 kilobytes of data, including text and graphics information.
* Fast barcode extraction.
This feature automatically finds and recognizes barcodes at any angle on a document. It works for both 1D and 2D barcodes.
Field-level/Zonal Recognition Support
The SDK provides field-level (zonal) recognition capabilities that improve accuracy and speed on small fields and zones. This functionality is crucial for processing tasks such as data extraction, keyword indexing, and keyword classification. Field-level recognition covers multilingual OCR and barcode recognition, and includes:
* Definition of field content by setting alphabets and dictionaries.
* Detection of in-field spacing - Detection and recognition of fields in which spaces are allowed. The FineReader Engine 8.0 also allows the use of dictionaries that contain word combinations with spaces.
* Text block despeckle, with the ability to specify the size of white or black "garbage".
* External recognition tuning features - Provides integrators with multiple word-level and character-level hypotheses and allows them to influence the choice of hypothesis by inserting additional ranking criteria during the recognition process.
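The field-level features listed above (alphabets, dictionaries that may contain spaces, and extra ranking criteria over multiple hypotheses) can be pictured with the following sketch. It assumes each hypothesis is a text string with a confidence score; the `Hypothesis` class, the `rerank` function, and the `dictionary_bonus` value are hypothetical and do not reflect the actual FineReader Engine API.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    confidence: float  # assumed engine score in [0, 1]; hypothetical field


def rerank(hypotheses, alphabet=None, dictionary=None, dictionary_bonus=0.2):
    """Return hypotheses sorted best-first after applying extra criteria.

    alphabet   -- set of characters allowed in the field, or None
    dictionary -- set of allowed entries (may contain spaces), or None
    """
    scored = []
    for hyp in hypotheses:
        # Drop hypotheses that use a character the field does not allow.
        if alphabet is not None and not set(hyp.text) <= alphabet:
            continue
        score = hyp.confidence
        # Additional ranking criterion: favour dictionary entries.
        if dictionary is not None and hyp.text in dictionary:
            score += dictionary_bonus
        scored.append((score, hyp))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [hyp for _, hyp in scored]


if __name__ == "__main__":
    # A numeric field: the letter "O" is not in the alphabet.
    amount = [Hypothesis("1O24", 0.62), Hypothesis("1024", 0.58)]
    print(rerank(amount, alphabet=set("0123456789")))

    # A city field: the dictionary holds a word combination with a space.
    city = [Hypothesis("New Vork", 0.55), Hypothesis("New York", 0.50)]
    print(rerank(city, dictionary={"New York", "Boston"}))
```

In this sketch the alphabet acts as a hard filter and the dictionary as a soft preference, so an out-of-dictionary reading can still win if its confidence is high enough.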
PDF Conversion
The SDK includes powerful PDF conversion technology with extensive functions for PDF input and output including:
* PDF Security and Encryption Support:
o The SDK supports a variety of PDF security settings, increasing its applicability for government agencies and other organizations demanding high security.
o "Open File" password settings designed to prevent unauthorized access to a document.
o Restriction of certain operations, such as printing, editing or extracting file content, by assigning permission passwords.
o Support for the latest encryption standards.
* Output in tagged PDF format – Tagged PDF can be "reflowed" to fit different page or screen widths, which makes it well suited to handheld devices (PDAs) and to the screen readers typically used by visually impaired users.
* Page size – Ability to set the size for all pages of an output PDF file.
* Links in PDF files – Re-creates hyperlinks within a PDF file.
Development Platform Functionality and Throughput Management
The FineReader Engine 8.0 for Mac provides several features that allow integrators to achieve optimal recognition accuracy and processing speed for their applications. It supports a balanced processing mode and provides "ready-to-load" samples that reduce the time needed to choose proper parameters for common usage scenarios (e.g. conversion to searchable PDF, field-level recognition, archiving, and indexing).
* Easy-to-use development tools.
In addition to the API, the FineReader Engine 8.0 provides a Command Line Interface (CLI) and "ready-to-use" samples for rapid implementation.
* External recognition tuning features:
+ Providing integrators with multiple word-level and character-level hypotheses;
+ The ability to influence the choice of hypothesis by inserting additional ranking criteria during the recognition process.
* Throughput management - The balance between speed and accuracy can be adjusted through three processing modes: thorough, balanced, and fast.
* CLI-based 24/7 service, a background OCR process useful for customers who work in multitasking mode (e.g. ASPs and in-house services).
Receiving and Exporting Recognized Text
The FineReader Engine API provides a wide range of options for export of recognition results, including different levels of document reconstruction:
* A set of different levels of text format retention during export to external formats (from simple text with no formatting to complete page layout retention, including columns, tables, frames, fonts, font size, paragraph styles, borders, etc.).
* Providing access to detailed information about each recognized character.
* A set of functions for post-editing and post-formatting the recognized text before export.
* Exporting recognized text into various formats (for the full list of formats, see the Specifications section).
* Retaining full page layout of documents.
* Replacing uncertain characters with their corresponding images when saving in PDF format.
* Retaining picture and text color in full.
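Two of the options above, access to per-character information and replacement of uncertain characters with their images, can be illustrated with a short sketch. It assumes each recognized character carries its text, its rectangle on the page image, and an uncertainty flag; the `RecognizedChar` class and `to_export_runs` function are hypothetical and are not part of the FineReader Engine API.

```python
from dataclasses import dataclass


@dataclass
class RecognizedChar:
    char: str
    rect: tuple       # (left, top, right, bottom) in page-image pixels
    uncertain: bool   # True if the engine was not confident about this char


def to_export_runs(chars):
    """Group characters into runs for export.

    Confident characters are merged into ("text", string) runs.
    Each uncertain character becomes an ("image", rect) run, meaning the
    exporter should paste that region of the page image instead of text.
    """
    runs = []
    for c in chars:
        if c.uncertain:
            runs.append(("image", c.rect))
        elif runs and runs[-1][0] == "text":
            # Extend the current text run.
            runs[-1] = ("text", runs[-1][1] + c.char)
        else:
            runs.append(("text", c.char))
    return runs


if __name__ == "__main__":
    word = [
        RecognizedChar("c", (0, 0, 10, 10), False),
        RecognizedChar("a", (10, 0, 20, 10), True),
        RecognizedChar("t", (20, 0, 30, 10), False),
    ]
    print(to_export_runs(word))
```

Keeping the rectangle with every character is what makes the substitution possible: the exporter only needs the coordinates to cut the matching fragment out of the source image.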