k-Means

The k-Means algorithm is an unsupervised learning method used for cluster analysis. k-Means attempts to partition a set of samples into k clusters of equal variance; the number of clusters k must be specified in advance. The algorithm scales well to large numbers of samples and is one of the most widely used clustering methods.

Unsupervised means that k-Means does not need to be trained with annotated (labeled) data. This property makes the algorithm very popular. Once training has been performed and the clusters have been defined, new data can be assigned to the already known clusters during inference.
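The following minimal sketch illustrates this workflow with scikit-learn. The data values, the choice of k = 3 and all variable names are illustrative assumptions and not part of the TwinCAT documentation example.

# Minimal k-Means sketch (scikit-learn); k must be chosen in advance.
import numpy as np
from sklearn.cluster import KMeans

# Unlabeled training data: no annotations are required (unsupervised).
X_train = np.array([[1.0, 2.0], [1.2, 1.8], [8.0, 8.5],
                    [7.8, 8.1], [0.5, 9.0], [0.7, 9.3]])

# Training: the k = 3 clusters are defined here.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
kmeans.fit(X_train)

# Inference: new data are assigned to the already known clusters.
X_new = np.array([[1.1, 2.1], [7.9, 8.3]])
print(kmeans.predict(X_new))      # cluster index per new sample
print(kmeans.cluster_centers_)    # learned cluster centers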

Supported properties

ONNX support

Currently, only export from Scikit-learn is supported. For k-Means models, the ONNX custom attribute with key "sklearn_model" and value "KMeans" must be specified so that the conversion step to XML and BML works.


Restriction

For classification models, only the label output is mapped in the PLC; the scores/probabilities are not available in the PLC.

An example of the export of an ONNX file from Scikit-learn for use in TwinCAT can be found here: ONNX export of a k-Means.
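A condensed sketch of such an export is given below. It assumes the skl2onnx converter and attaches the custom attribute via the ONNX metadata properties; the input name and shape, the file name and the training data are assumptions for illustration, so the linked example above remains the authoritative reference.

# Sketch: export a trained scikit-learn k-Means model to ONNX and add the
# custom attribute required for the TwinCAT conversion step (assumed workflow).
import numpy as np
import onnx
from sklearn.cluster import KMeans
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

X = np.random.rand(100, 2).astype(np.float32)          # illustrative data
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)

# Convert to ONNX; input name and shape are assumptions.
onnx_model = convert_sklearn(
    kmeans, initial_types=[("input", FloatTensorType([None, X.shape[1]]))]
)

# Custom attribute: key "sklearn_model", value "KMeans".
meta = onnx_model.metadata_props.add()
meta.key = "sklearn_model"
meta.value = "KMeans"

onnx.save(onnx_model, "kmeans.onnx")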

Supported data types

A distinction must be made between "supported datatype" and "preferred datatype". The preferred datatype corresponds to the precision of the execution engine.

The preferred datatype is floating point 64 (E_MLLDT_FP64-LREAL).

If a supported datatype other than the preferred one is used, an efficient type conversion takes place automatically in the library. This type conversion can cause slight performance losses.

A list of the supported datatypes can be found in ETcMllDataType.