OCR
This group contains functions for Optical Character Recognition.
The OCR functions identify characters in an image and return the recognized characters as a string. The classification is based on classic Machine Learning models. The models are provided in a trained form so that the functions can be used directly. No additional settings or training are required. The supported characters and fonts depend on the data used in the training. For this reason, there are various models whose range of functions, as well as the general preconditions and requirements, are described below.
For all characters that are recognized in an image, the models assign each character to a known class with the highest match. The result therefore always consists of the known characters of the model used. Rejection or removal of unknown characters is not included.
The functions only support single-line character strings; several lines must be divided into individual ROIs and read with several function calls.
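One common way to divide a multi-line text block into single-line ROIs is a horizontal projection: rows that contain no character pixels separate the lines. The following Python sketch illustrates the idea on a binarized image given as nested lists. It is not part of the library; the function name `split_into_line_rois` and the `min_gap` parameter are assumptions made for this illustration.

```python
from typing import List, Tuple


def split_into_line_rois(image: List[List[int]], min_gap: int = 1) -> List[Tuple[int, int]]:
    """Return (top, bottom_exclusive) row ranges, one per text line.

    `image` is a binarized image: rows of 0 (background) and 1 (character pixel).
    A line ends once `min_gap` consecutive empty rows have been seen.
    """
    rois: List[Tuple[int, int]] = []
    start = None  # first row of the line currently being collected
    gap = 0       # number of consecutive empty rows seen inside that line
    for y, row in enumerate(image):
        if any(row):
            if start is None:
                start = y
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                # The line ended at the first empty row of this gap.
                rois.append((start, y - gap + 1))
                start = None
                gap = 0
    if start is not None:
        # The image ended while a line was still open.
        rois.append((start, len(image) - gap))
    return rois


if __name__ == "__main__":
    demo = [
        [0, 0, 0],
        [1, 0, 1],
        [1, 1, 1],
        [0, 0, 0],
        [0, 1, 0],
        [0, 0, 0],
    ]
    print(split_into_line_rois(demo))
```

Each returned row range would then be read with its own function call, as described above.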
General requirements for recognizing characters:
- Character height min. 20 pixels
- Line width min. 3 pixels
- Points min. 3 x 3 pixels
- Lines min. 3 x 6 pixels
- Character spacing / separation min. 4 pixels
- Characters must not overlap
- Only horizontal alignment / arrangement of characters max. ± 6° deviation
- The line / contour of a character must not be interrupted
- The characters must not be mirrored or reversed
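The numeric limits above can be checked before an image is passed to the OCR function. The Python sketch below assumes that the measurements (character height, line width, spacing, rotation) have already been determined by some other means; the `CharMeasurement` type and the `check_requirements` function are hypothetical helpers for illustration, not part of the library.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CharMeasurement:
    """Measured properties of one character (hypothetical helper type)."""
    height_px: float
    line_width_px: float
    gap_to_next_px: float
    rotation_deg: float


def check_requirements(m: CharMeasurement) -> List[str]:
    """Return the list of violated requirements; an empty list means all checks passed."""
    problems: List[str] = []
    if m.height_px < 20:
        problems.append("character height below 20 pixels")
    if m.line_width_px < 3:
        problems.append("line width below 3 pixels")
    if m.gap_to_next_px < 4:
        problems.append("character spacing below 4 pixels")
    if abs(m.rotation_deg) > 6:
        problems.append("deviation from horizontal exceeds 6 degrees")
    return problems


if __name__ == "__main__":
    print(check_requirements(CharMeasurement(24, 3, 4, 0.0)))
    print(check_requirements(CharMeasurement(18, 2, 5, -7.5)))
```

The remaining requirements (no overlap, uninterrupted contours, no mirroring) are qualitative and are not covered by this sketch.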
General requirements for the image:
- ROI only with the characters to be recognized and an interference-free zone around the characters
- Within the ROI, the characters must not be enclosed by another contour such as a rectangle
- Good contrast between character and background
- Homogeneous, non-noisy or disturbed, non-transparent background
Requirements for the fonts:
- Only proportional fonts with equal character widths
- Larger gaps are always recognized as a single space
- Only sans serif fonts such as Arial, Tahoma, Courier, Univers, Frutiger, Verdana, OCR-B
- No mixed fonts
- No dot print or italic fonts
Models:
The enum ETcVnOcrModelType allows access to the following models:
TCVN_OMT_NUMBERS
- enables the classification of numbers
- contains the characters 0-9
TCVN_OMT_NUMBERS_SC
- enables the classification of numbers and special characters
- contains the characters 0-9 and the 6 special characters . / - : = +
TCVN_OMT_UCLETTERS
- enables the classification of capital letters
- contains the characters A-Z
TCVN_OMT_NUMBERS_SC_UCLETTERS
- enables the classification of numbers, special characters and capital letters
- contains the characters 0-9, the 6 special characters . / - : = + and A-Z
When using the combined model (TCVN_OMT_NUMBERS_SC_UCLETTERS), confusion can occur due to the great similarity of certain characters. Examples of this are O and 0, S and 5, and B and 8.
If the position of a number or letter is known, the Expert function should be used with the formatting specification (sPattern) and a combination of the separate models. Alternatively, the characters of the result string (ipCharacters) can be analyzed individually, so that you can decide for each character whether, for example, a 0 is also accepted instead of an O.
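The individual analysis of the result characters can be sketched as follows. The Python example assumes that the result string has already been copied into an ordinary string and that the expected layout is known. The layout notation (`N` for a number position, `L` for a letter position) and the function `resolve_confusables` are hypothetical and are not the sPattern syntax of the library.

```python
# Character pairs named in the text as easily confused: O/0, S/5, B/8.
LETTER_TO_DIGIT = {"O": "0", "S": "5", "B": "8"}
DIGIT_TO_LETTER = {digit: letter for letter, digit in LETTER_TO_DIGIT.items()}


def resolve_confusables(text: str, layout: str) -> str:
    """Replace confusable characters according to the expected layout.

    `layout` uses 'N' where a number is expected and 'L' where a letter is
    expected; any other layout character leaves the position unchanged.
    """
    if len(text) != len(layout):
        raise ValueError("text and layout must have the same length")
    resolved = []
    for char, kind in zip(text, layout):
        if kind == "N":
            resolved.append(LETTER_TO_DIGIT.get(char, char))
        elif kind == "L":
            resolved.append(DIGIT_TO_LETTER.get(char, char))
        else:
            resolved.append(char)
    return "".join(resolved)


if __name__ == "__main__":
    print(resolve_confusables("AB1O", "LLNN"))
    print(resolve_confusables("8B-12", "LL?NN"))
```

Whether such a substitution is acceptable depends on the application; the sketch only covers the three character pairs listed above.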
Initialize functions
In order to be able to use the OCR function with one or more models, the function must first be initialized with the respective models using the function block FB_VN_InitializeFunction.
The time required to load the models depends on the performance of the IPC used and the model size and can take several hundred milliseconds or even several seconds. Therefore, cycle time overruns are to be expected with shorter task cycle times.
If the combined model TCVN_OMT_NUMBERS_SC_UCLETTERS is to be used, the router memory must be set to at least 512 MB due to the file size.
The exact size depends on the other use of the router memory and any existing fragmentation. If the function block FB_VN_InitializeFunction returns the return code 0x80004005, for example, this indicates insufficient router memory. The router memory should therefore be checked first and increased if necessary. After loading a model, a large part of the memory is released again and is available for the application.
Interpretation of the HRESULT
- The standard function returns S_OK if characters are recognized on the image. With the Expert function, it depends on whether an sPattern has been specified. If no formatting specification has been defined, the return corresponds to the standard function. If a formatting specification has been passed, S_OK is only returned if the recognized characters match the pattern specification.
- If the standard function was executed successfully and no characters were found, or the characters in the Expert function do not match the pattern specification, S_FALSE is returned.
- If an sPattern entry is made in the Expert function for which the transferred model is missing, HRESULT = ADSERR_DEVICE_NOTINIT is returned.
- If several models are passed to the standard function, HRESULT = ADSERR_DEVICE_INVALIDPARM is always returned, as the function only supports one model per call. With the Expert function, it depends on whether an sPattern has been specified. If sPattern is empty, only the TCVN_OMT_NUMBERS_SC_UCLETTERS model is used. If this model has not been passed, HRESULT = ADSERR_DEVICE_INVALIDPARM is returned.
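The return value rules can be summarized as a small decision function. The Python sketch below only models the rules stated in this section. The function names, the boolean inputs and the order in which the conditions are checked are assumptions for illustration; the actual functions may evaluate them differently.

```python
from enum import Enum
from typing import Sequence

COMBINED_MODEL = "TCVN_OMT_NUMBERS_SC_UCLETTERS"


class Outcome(Enum):
    S_OK = "S_OK"
    S_FALSE = "S_FALSE"
    NOTINIT = "ADSERR_DEVICE_NOTINIT"
    INVALIDPARM = "ADSERR_DEVICE_INVALIDPARM"


def standard_outcome(models: Sequence[str], chars_found: bool) -> Outcome:
    """Expected result of the standard function according to the rules above."""
    if len(models) > 1:
        # The standard function supports only one model per call.
        return Outcome.INVALIDPARM
    return Outcome.S_OK if chars_found else Outcome.S_FALSE


def expert_outcome(models: Sequence[str], pattern: str,
                   pattern_models_present: bool,
                   chars_found: bool, pattern_matches: bool) -> Outcome:
    """Expected result of the Expert function according to the rules above.

    `pattern_models_present` is an assumed input: True if every model that
    the pattern entries require has been passed.
    """
    if pattern == "":
        # Without a pattern, only the combined model is used.
        if COMBINED_MODEL not in models:
            return Outcome.INVALIDPARM
        return Outcome.S_OK if chars_found else Outcome.S_FALSE
    if not pattern_models_present:
        return Outcome.NOTINIT
    return Outcome.S_OK if (chars_found and pattern_matches) else Outcome.S_FALSE


if __name__ == "__main__":
    print(standard_outcome(["TCVN_OMT_NUMBERS"], chars_found=True))
    print(expert_outcome(["TCVN_OMT_NUMBERS"], "", True, True, True))
```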
Examples for the use of sPattern
The F_VN_OCRExp function provides additional options with the parameters sPattern and eOcrOptions. Depending on the combination of characters on the input image, the information from sPattern and eOcrOptions and the recognized characters, S_FALSE is returned if there is a mismatch. The characters of the result string (ipCharacters) can then also be analyzed individually, as S_FALSE is also returned if the length differs, for example. The samples refer to the use of the TCVN_OMT_NUMBERS_SC_UCLETTERS model.
Characters on the ipSrcImage | sPattern | eOcrOptions | ipCharacters | HRESULT |
---|---|---|---|---|
12/34 | dd.dd | | 12/34 | S_OK |
12534 | dd.dd | | 12534 | S_OK |
12/34 | dd!dd | | 1234 | S_OK |
12 34 | dd!d | | 124 | S_OK |
12534 | dd!dd | | 1234 | S_OK |
12 34 | dd_dd | | 12 34 | S_OK |
AB12/ | uudd# | | AB12/ | S_OK |
12 34 | dddd | WITHBLANKS | 12 34 | S_OK |
12/4 | dd#d | WITHBLANKS | 12/4 | S_OK |
12 34 | dd_dd | WITHBLANKS | 12 34 | S_OK |
12 34 56 | dd_dddd | WITHBLANKS | 12 34 56 | S_OK |
12 3 4 | dd.dd | WITHBLANKS | 12 3 4 | S_FALSE |
12 34 | dd!dd | | 124 | S_FALSE |
12 34 | dd.dd | | 1234 | S_FALSE |
1234 | !dddd | | 234 | S_FALSE |
12 34 | dd_dd | | 12.34 | S_FALSE |
12 3 | dd_dd | | 123 | S_FALSE |
12 3 4 | dd_d_d | | 123 4 | S_FALSE |
AB12/ | uddd# | | AB12/ | S_FALSE |
AB12/ | uuddd | | AB12/ | S_FALSE |