OCR

This sample shows how characters such as dates and batch numbers on product packaging can be recognized using the OCR (Optical Character Recognition) functionality. The code illustrates the application of the F_VN_OCR standard function as well as the F_VN_OCRExp extended function, including the required pre- and post-processing.

Explanation

The functions recognize alphanumeric characters in binary images (white characters on a black background). Classification is based on machine learning models that differ in their range of functions and in the characters they cover.

The ETcVnOcrModelType enum can be used to access various models:

TCVN_OMT_NUMBERS: Numbers
TCVN_OMT_NUMBERS_SC: Numbers + special characters
TCVN_OMT_UCLETTERS: Capital letters
TCVN_OMT_NUMBERS_SC_UCLETTERS: Numbers + special characters + capital letters

The F_VN_OCRExp function offers extended options:

sPattern: defined format and character specifications (e.g. "uudd" for "AB12")
eOcrOptions: additional ETcVnOcrOptions that influence the functionality and the result output
ipBoundingBoxes: returns the bounding boxes of the recognized characters
ipConfidences: returns the confidence value of each character

The complete description, the requirements and restrictions as well as further information can be found in the chapter OCR and under the respective functions.

Model initialization

Before OCR can be used for the first time, the underlying character recognition model must be initialized. This is done with the FB_VN_InitializeFunction function block: the function to be initialized is specified at eFunction, and the models to be used later when calling the OCR function are specified at nOptions. Several models can be initialized in a single call by combining them with OR.

F_VN_CheckFunctionInitialization can then be used to check whether the respective model has been initialized correctly. If the model is to be changed at runtime or is no longer required, the model can be deinitialized with F_VN_DeinitializeFunction in order to release the memory again.

Variables

bInitialized    : BOOL := FALSE;
fbInit          : FB_VN_InitializeFunction;
nReturnCode     : UDINT;

Code

IF NOT bInitialized THEN
    fbInit(
        eFunction := TCVN_IF_OCR,
        nOptions  := ETcVnOcrModelType.TCVN_OMT_NUMBERS_SC OR ETcVnOcrModelType.TCVN_OMT_UCLETTERS, 
        bStart    := TRUE);

    IF NOT fbInit.bBusy THEN
        fbInit(bStart := FALSE);
        IF NOT fbInit.bError THEN
            bInitialized := TRUE;
        END_IF
        // Store the lower 12 bits of the error ID as return code in both cases
        nReturnCode := fbInit.nErrorId AND 16#FFF;
    END_IF
END_IF

Preprocessing

Some pre-processing steps are necessary or useful before the OCR functions are used. First, an ROI must be defined that contains only a single line of the characters to be recognized plus an interference-free zone around them. This image region must then be converted into a 1-channel binary image that contains only the characters to be recognized, in white on a black background. In this example, the F_VN_Threshold function is used for this; it is applied normally or inverted depending on the background and the character color.

As images of different products with different scenarios are used in the sample project, some parameters such as stRoi and fThreshold were stored individually for each test image within the F_GetROI function and retrieved using the image name. In practice, the different parameter values are usually handled via a recipe in the user interface. The following image, in which the specified ROI is shown in red, is used to further describe the example application.

In addition to the necessary pre-processing, as shown in the example, filter operations for noise suppression and morphological operations such as opening or closing can be used to remove small disturbances and to smooth or complete character shapes. Contrast enhancement functions are also frequently used. The choice of suitable filter functions and parameters depends heavily on the specific properties of the image material, so other functions, e.g. from ImageColorAndContrastProcessing, ImageFiltering or ImageSegmentation, can also contribute to the improvement.

Another helpful step can be the removal of border objects with the F_VN_BrightBorderObjects and F_VN_SubtractImages functions. This eliminates distracting objects at the edge of the image region. These often occur when the ROI definition is imprecise, such as when parts of adjacent characters or lines are inadvertently included in the section, especially if the distances between them are quite small.

The following pre-processing steps are carried out in the sample project:

Code

// Set ROI
hr := F_VN_SetRoi_TcVnRectangle_UDINT(stRoi, ipBinaryImage, hr);

// Filter image 
hr := F_VN_CreateStructuringElement(ipStructElem, ETcVnStructuringElementShape.TCVN_SES_RECTANGLE, 3,3, hr);
hr := F_VN_MorphologicalOperator(ipBinaryImage, ipBinaryImage, ETcVnMorphologicalOperator.TCVN_MO_OPENING, ipStructElem, hr);

// Binarize image depending on bright or dark text color
IF bInvertImage THEN
    hr := F_VN_Threshold(ipBinaryImage, ipBinaryImage, fThreshold, 255, TCVN_TT_BINARY_INV, hr);
ELSE
    hr := F_VN_Threshold(ipBinaryImage, ipBinaryImage, fThreshold, 255, TCVN_TT_BINARY, hr);
END_IF

// Remove border objects
hr := F_VN_BrightBorderObjects(ipBinaryImage, ipThreshBorder, hr);
hr := F_VN_SubtractImages(ipBinaryImage, ipThreshBorder, ipBinaryImage, hr);

OCR application

After successful pre-processing, the OCR function can be used. In the sample project, the bUseExpFunction parameter can be used to switch between the standard and the expert version. The following description refers to the use of the expert version and the additional options.

The sPattern parameter specifies the character format to be recognized. This option is particularly useful if the sequence and type of the characters are known in advance, because you can then specify which kind of character is expected at each position in the string. This tells the function which recognition model to use for each character, which can lead to better results: the more specific models contain fewer characters, which reduces the risk of confusion. Furthermore, the function directly returns S_FALSE if the recognized characters do not match the specified pattern.

For the image shown, the pattern "uu#dddddddd" is used to recognize the character string. This means that the TCVN_OMT_UCLETTERS model is used for the first two characters to identify capital letters. The function then uses the TCVN_OMT_NUMBERS_SC model for the following nine characters: a special character is expected in the first of these positions and a digit in each of the remaining eight.
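To make the position-by-position meaning of such a pattern concrete, the following sketch re-implements the check in plain PLC code. It is only an illustration and not how F_VN_OCRExp works internally. F_MatchesPattern is a hypothetical helper that is not part of TwinCAT Vision, and the rule that "#" accepts any character that is neither a digit nor a capital letter is an assumption made for this sketch.

```iecst
// Hypothetical helper (not part of TwinCAT Vision): checks a text against a pattern
FUNCTION F_MatchesPattern : BOOL
VAR_INPUT
    sText    : STRING(255);  // text to check, e.g. the exported OCR result
    sPattern : STRING(255);  // pattern, e.g. 'uudd'
END_VAR
VAR
    nIdx     : INT;
    nLen     : INT;
    nChar    : BYTE;
    bIsDigit : BOOL;
    bIsUpper : BOOL;
END_VAR

F_MatchesPattern := FALSE;
nLen := LEN(sPattern);

// Text and pattern must contain the same number of characters
IF LEN(sText) <> nLen THEN
    RETURN;
END_IF

FOR nIdx := 0 TO nLen - 1 DO
    nChar    := sText[nIdx];
    bIsDigit := (nChar >= 16#30) AND (nChar <= 16#39);  // '0'..'9'
    bIsUpper := (nChar >= 16#41) AND (nChar <= 16#5A);  // 'A'..'Z'

    CASE sPattern[nIdx] OF
        16#75:  // 'u': capital letter expected
            IF NOT bIsUpper THEN RETURN; END_IF
        16#64:  // 'd': digit expected
            IF NOT bIsDigit THEN RETURN; END_IF
        16#23:  // '#': special character expected (assumption: neither digit nor capital letter)
            IF bIsDigit OR bIsUpper THEN RETURN; END_IF
    ELSE
        RETURN;  // pattern symbol not handled by this sketch
    END_CASE
END_FOR

F_MatchesPattern := TRUE;
```

With this sketch, F_MatchesPattern('AB12', 'uudd') returns TRUE, whereas F_MatchesPattern('A112', 'uudd') returns FALSE because the second character is not a capital letter.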

Code

hrOCR := F_VN_OCRExp(
            ipSrcImage      := ipBinaryImage,
            eModel          := ETcVnOcrModelType.TCVN_OMT_NUMBERS_SC OR ETcVnOcrModelType.TCVN_OMT_UCLETTERS,
            ipCharacters    := ipOCRResult, 
            sPattern        := sPattern,
            eOcrOptions     := eOcrOptions,
            ipBoundingBoxes := ipBoundingBoxes, 
            ipConfidences   := ipConfidences,
            hrPrev          := hr,
            fMinConfidence  => fMinConfidence);

// Check if characters were found
hr := F_VN_GetNumberOfElements(ipOCRResult, nNumberOfElements, hrOCR);
IF SUCCEEDED(hr) AND nNumberOfElements > 0 THEN
    // Export character to string
    hr := F_VN_ExportSubContainer_String(ipOCRResult, 0, sText, 255, hr);
    // Write text result to image
    hr := F_VN_PutText(sText, ipOriginalImage, stRoi.nX + 5, stRoi.nY + 25, TCVN_FT_HERSHEY_DUPLEX, 1, aGreenColor, hr);
    // Further processing …
END_IF

OCR result evaluation

After calling the OCR function, first evaluate the HRESULT return value and check whether, and how many, characters were recognized. For the standard function, S_OK means that characters were recognized. For the expert version, the meaning depends on whether a pattern was passed in sPattern: if so, S_OK is returned only if the recognized characters match the pattern. Otherwise, S_FALSE is returned and you can examine the recognized characters yourself.
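The distinction described above could be evaluated as in the following sketch. It assumes that a pattern was passed in sPattern and that the constant S_OK is available in the project (as an HRESULT it has the value 0, S_FALSE the value 1). The flags bOcrError, bPatternMatched and bCheckManually are hypothetical names introduced only for this illustration.

```iecst
VAR
    hrOCR           : HRESULT;  // return value of F_VN_OCRExp (already declared in the sample)
    bOcrError       : BOOL;     // hypothetical: the call itself failed
    bPatternMatched : BOOL;     // hypothetical: result matches sPattern
    bCheckManually  : BOOL;     // hypothetical: result must be examined by the application
END_VAR

bOcrError       := FALSE;
bPatternMatched := FALSE;
bCheckManually  := FALSE;

IF NOT SUCCEEDED(hrOCR) THEN
    // Error code: the OCR call did not complete successfully
    bOcrError := TRUE;
ELSIF hrOCR = S_OK THEN
    // Pattern was passed and the recognized characters match it
    bPatternMatched := TRUE;
ELSE
    // Success code other than S_OK, e.g. S_FALSE: no match with the pattern,
    // so the recognized characters have to be examined separately
    bCheckManually := TRUE;
END_IF
```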

If characters are recognized, they can be exported to a string and, for example, drawn into an image for visual feedback. Furthermore, the number of characters, the entire string or individual characters can be compared with stored expected values. As these assessments are highly application-specific and can be implemented with standard PLC functions, they are not shown in this example.
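A comparison with stored expected values could look like the following minimal sketch, which uses only standard string functions (LEN, LEFT). The expected text in sExpected, the prefix 'LT' and the result flags are hypothetical values chosen for illustration; in an application they would typically come from a recipe.

```iecst
VAR
    sText     : STRING(255);                   // recognized text (already declared in the sample)
    sExpected : STRING(255) := 'LT:20250131';  // hypothetical expected value
    bLengthOk : BOOL;
    bTextOk   : BOOL;
    bPrefixOk : BOOL;
END_VAR

// Compare the number of characters
bLengthOk := LEN(sText) = LEN(sExpected);

// Compare the entire string
bTextOk := sText = sExpected;

// Compare only individual characters, here the first two
bPrefixOk := LEFT(sText, 2) = 'LT';
```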

When using the expert version, the individual classification confidences of the recognized characters can be exported, viewed and used as acceptance criteria for a more detailed analysis.

To identify character boundaries, overlaps or possibly misrecognized characters, the bounding boxes of the individual recognized characters can be retrieved, evaluated and drawn into an image. When calculating and drawing, the previously set ROI may have to be taken into account, as in the following code, where the ROI offset is added before the boxes are drawn into the original image.

Code

// Get number of Confidence elements and export them if array is large enough 
hr := F_VN_GetNumberOfElements(ipConfidences, nNumberOfElements, hr);

// Check if number of elements fits to array size
IF nNumberOfElements > 0 AND nNumberOfElements <= 12 THEN
    hr := F_VN_ExportContainer(ipConfidences, ADR(aConfidences), SIZEOF(aConfidences), hr);
END_IF

// Get bounding box rectangle and draw it to filtered and original image
hr := F_VN_GetNumberOfElements(ipBoundingBoxes, nNumberOfElements, hr);
IF nNumberOfElements > 0 THEN
    FOR nIterator := 0 TO nNumberOfElements -1 DO
        hr := F_VN_GetAt_TcVnRectangle_DINT(ipBoundingBoxes, stRectangle, nIterator, hr);
        hr := F_VN_DrawRectangle_TcVnRectangle_DINT(stRectangle, ipBinaryImage, aWhiteColor, 1, hr);
        // Add ROI Offset
        stRectangle.nX := stRectangle.nX + UDINT_TO_DINT(stRoi.nX);
        stRectangle.nY := stRectangle.nY + UDINT_TO_DINT(stRoi.nY);
        hr := F_VN_DrawRectangle_TcVnRectangle_DINT(stRectangle, ipOriginalImage, aBlueColor, 1, hr);
    END_FOR
END_IF

Presentation of results

The following image shows the result display of the expert function. The recognized characters are shown in green and the optional bounding boxes in blue.

The confidence values for each recognized character are available for further evaluation in the aConfidences array.
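One possible further evaluation is sketched below: it counts the characters whose confidence lies below a limit and determines the index of the least confident character, for example to highlight its bounding box. The sketch assumes that aConfidences is declared as ARRAY [0..11] OF REAL and that nConfCount holds the number of exported confidence values. The limit fAcceptanceLimit is a hypothetical, application-specific value that must be chosen to fit the value range of the confidences.

```iecst
VAR
    aConfidences     : ARRAY [0..11] OF REAL;  // assumption: type and size as used in the sample
    nConfCount       : INT;                    // assumption: number of exported values (0..12)
    fAcceptanceLimit : REAL;                   // hypothetical acceptance limit
    nIdx             : INT;
    nBelowLimit      : INT;                    // number of characters below the limit
    nWeakest         : INT;                    // index of the least confident character, -1 if none
    bAccepted        : BOOL;
END_VAR

nBelowLimit := 0;
nWeakest    := -1;

FOR nIdx := 0 TO nConfCount - 1 DO
    // Count characters that do not reach the acceptance limit
    IF aConfidences[nIdx] < fAcceptanceLimit THEN
        nBelowLimit := nBelowLimit + 1;
    END_IF

    // Track the character with the lowest confidence
    IF nWeakest < 0 THEN
        nWeakest := nIdx;
    ELSIF aConfidences[nIdx] < aConfidences[nWeakest] THEN
        nWeakest := nIdx;
    END_IF
END_FOR

// Accept the result only if values exist and all of them reach the limit
bAccepted := (nConfCount > 0) AND (nBelowLimit = 0);
```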