Machine Learning

In Machine Learning, a model is optimized to solve a task; this process is known as training. Trained models can carry out sophisticated data analyses automatically and can thus replace complex, manually programmed logic. Using the Machine Learning functions provided, the models are created, trained and executed in the TwinCAT 3 real-time environment. The tasks that can be solved in this way are diverse and can be roughly divided into five task areas: classification, anomaly detection, regression, cluster analysis and feature transformation. Depending on the task, different model types are available, each with general or task-specific advantages and disadvantages.

In Machine Learning, a distinction is made between supervised and unsupervised learning. Supervised learning requires additional information, such as annotated (e.g. labeled) data sets, so that the model can learn a mapping to known categories (classes) or to desired target variables. The model can then predict the categories or target variables for unknown data. Unsupervised learning, in contrast, works without additional information and is based solely on the features themselves.

All classification and regression models can generally be assigned to supervised learning, whereas all models for anomaly detection and clustering are assigned to unsupervised learning. Both options are available for feature transformation: the Linear Discriminant Analysis method is based on supervised learning, while Principal Component Analysis is based on unsupervised learning.
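To make the distinction concrete, the following minimal NumPy sketch implements Principal Component Analysis from scratch. It is a conceptual illustration only (not the TwinCAT API): the projection axes are derived purely from the feature values, without any class labels, which is what makes PCA an unsupervised feature transformation.

```python
# Conceptual sketch (not the TwinCAT API): PCA as an unsupervised feature
# transformation. The principal axes are computed from the features alone,
# without class labels.
import numpy as np

def pca_fit(samples: np.ndarray, n_components: int) -> np.ndarray:
    """Return the top principal axes of a (n_samples, n_features) batch."""
    centered = samples - samples.mean(axis=0)
    # Eigendecomposition of the covariance matrix yields the principal axes.
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]          # strongest variance first
    return eigvecs[:, order[:n_components]]

# Toy batch: points spread mainly along the first feature axis.
batch = np.array([[0.0, 0.0], [1.0, 0.1], [2.0, -0.1], [3.0, 0.0]])
axes = pca_fit(batch, n_components=1)
projected = batch @ axes                        # transformed 1-D features
```

An LDA, by contrast, would additionally require a class label per sample in order to find axes that separate the classes.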

The following assignment of model types to task areas helps with the choice of a suitable model. Note that it reflects the handling of machine learning in TwinCAT Vision specifically, not machine learning in general.

| Model types | Classification | Anomaly detection | Regression | Clustering | Feature transformation |
|---|---|---|---|---|---|
| Boost Classifier (Boosting) | x |  |  |  |  |
| K-Means++ (KMPP) |  | x |  | x |  |
| K-Nearest Neighbors (KNN) | x | x | x |  |  |
| Linde-Buzo-Gray (LBG) |  | x |  | x |  |
| Linear Discriminant Analysis (LDA) |  |  |  |  | x |
| Normal Bayes Classifier (NBC) | x | x |  |  |  |
| Principal Component Analysis (PCA) |  |  |  |  | x |
| Random Forest (RTrees) | x |  | x |  |  |
| Simplified TopoART (STA) | x | x | x | x |  |
| Support Vector Machine (SVM) | x | x | x |  |  |
| Support Vector Machine - Stochastic Gradient Descent (SVM-SGD) | x |  |  |  |  |

Only models that were created using one of the F_VN_Create functions from the Machine Learning group can be used. Models created and trained in this way can be saved to and loaded from the hard disk via the function blocks FB_VN_WriteMlModel and FB_VN_ReadMlModel. Importing externally trained models, e.g. as ONNX files, is not supported.

Basic procedure for machine learning:

The data must be available in the form of samples. A sample is a vector of features. The features can be tangible variables of the associated image (e.g. color values) or of a contour (e.g. its center of gravity), as well as abstract variables (e.g. the result of a PCA) that describe the image statistically in a meaningful way but cannot be interpreted directly by the user. For the training of supervised models, additional information, such as a class assignment, is required for each sample. A collection of several samples is referred to as a batch.
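The terms above can be sketched in plain NumPy (a conceptual illustration, not the TwinCAT data types): each sample is a feature vector, a batch stacks several samples, and supervised training additionally needs one label per sample.

```python
# Conceptual sketch: samples, batch, and labels for supervised training.
import numpy as np

# Three features per sample, e.g. mean color value, area, center-of-gravity x.
sample_a = np.array([128.0, 420.0, 31.5])
sample_b = np.array([ 64.0, 810.0, 57.2])
sample_c = np.array([130.0, 415.0, 30.9])

batch  = np.stack([sample_a, sample_b, sample_c])   # shape (3, 3)
labels = np.array([0, 1, 0])                        # class per sample (supervised only)
```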

Depending on the model type, functions for sample and/or batch training and prediction are available. Some models can be retrained at runtime in order to further optimize the model or to adapt it to new conditions.
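The interplay of batch training and sample prediction can be illustrated with a minimal 1-/k-nearest-neighbor classifier written from scratch in NumPy. This is a conceptual sketch only; the actual TwinCAT KNN functions are not used here.

```python
# Conceptual sketch: train on a batch, then predict the class of a new sample
# via a minimal k-nearest-neighbor vote (not the TwinCAT functions).
import numpy as np

def knn_predict(train_batch, train_labels, sample, k=1):
    """Predict the majority class among the k nearest training samples."""
    dists = np.linalg.norm(train_batch - sample, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = train_labels[nearest]
    return np.bincount(votes).argmax()

train_batch  = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
train_labels = np.array([0, 0, 1, 1])
pred = knn_predict(train_batch, train_labels, np.array([4.8, 5.1]), k=3)  # -> 1
```

Retraining would correspond to appending new samples to `train_batch` and `train_labels` at runtime.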

Feature extraction

Feature extraction is the derivation of numerical features from image data and is an essential component of a Machine Learning solution. A distinction must be made as to whether a feature can be determined directly by a single function, whether a combination of functions is required, or whether logical properties of an object, such as the number of holes, can be used for differentiation. An exemplary list can be found in the chapter Feature extraction functions.
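The difference between a directly determined feature and a combined one can be sketched as follows, using NumPy on a synthetic image (a conceptual illustration; the actual TwinCAT feature extraction functions are not used):

```python
# Conceptual sketch: extracting numerical features from image data.
import numpy as np

image = np.zeros((6, 6), dtype=np.uint8)
image[2:5, 1:4] = 200        # a bright 3x3 object on a dark background

mean_gray = image.mean()                           # direct feature
ys, xs = np.nonzero(image > 100)                   # combined feature: segment, then
center_of_gravity = (xs.mean(), ys.mean())         # compute the center of gravity
sample = np.array([mean_gray, *center_of_gravity]) # feature vector for the model
```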

Feature normalization

Features are scaled (normalized) primarily to ensure that all features contribute equally to the learning process of the model. Features with larger value ranges can dominate distance calculations, making distance-based models more sensitive to them. To avoid this and to meet the requirements of the model, performing a feature normalization is generally recommended.

The function F_VN_GetFeatureScales provides three scaling options (ETcVnFeatureScalingType); a custom scaling can also be created and used. The scaling is applied with the function F_VN_FeatureScaling.
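The principle can be sketched with a simple min-max scaling in NumPy. The function names below are illustrative only: `get_feature_scales` stands in for the role of F_VN_GetFeatureScales (determining per-feature scale parameters) and `apply_feature_scaling` for F_VN_FeatureScaling (applying them); neither matches the real TwinCAT signatures.

```python
# Conceptual sketch: min-max feature normalization to [0, 1].
# Function names are illustrative, not the TwinCAT API.
import numpy as np

def get_feature_scales(batch):
    """Determine per-feature offset and scale from a training batch."""
    lo, hi = batch.min(axis=0), batch.max(axis=0)
    return lo, hi - lo

def apply_feature_scaling(samples, offset, scale):
    return (samples - offset) / scale

train_batch = np.array([[10.0, 1000.0], [20.0, 3000.0], [30.0, 2000.0]])
offset, scale = get_feature_scales(train_batch)
scaled = apply_feature_scaling(train_batch, offset, scale)  # all values in [0, 1]
```

After scaling, neither feature dominates a distance calculation merely because of its larger raw value range.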

Figure: Same feature extraction and normalization during training and prediction.

Note that all processing steps must be carried out both during training and when using the model. The corresponding functions must therefore be executed in both places and, if necessary, their parameters must be saved for later use.
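In particular, scaling parameters determined during training must be persisted and reapplied unchanged at prediction time; recomputing them on runtime data would shift the features relative to those the model was trained on. A minimal sketch (illustrative names, not the TwinCAT API):

```python
# Conceptual sketch: reuse the training-time scale parameters at runtime.
import numpy as np

train_batch = np.array([[10.0, 1000.0], [30.0, 3000.0]])
offset = train_batch.min(axis=0)             # determined once, during training,
scale  = train_batch.max(axis=0) - offset    # and persisted alongside the model

new_sample = np.array([20.0, 2000.0])        # seen at runtime
scaled = (new_sample - offset) / scale       # same parameters as in training
```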