Histogram-based Gradient Boosting

A histogram-based Gradient Boosting model can be used both for classification and for regression.

The model is based on Gradient Boosting. Here, however, the continuous input features are first discretized into bins with the help of a histogram. This considerably accelerates training of the model, in particular with very large data sets.
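The binning step can be illustrated with a minimal sketch. It assumes quantile-based bin edges and four bins; the function names `quantile_bin_edges` and `to_bin_index` are hypothetical and chosen for illustration only, and the binning strategy of a particular training framework may differ.

```python
from bisect import bisect_right


def quantile_bin_edges(values, n_bins):
    """Return n_bins - 1 thresholds that split the sample into roughly equal-sized bins."""
    ordered = sorted(values)
    edges = []
    for k in range(1, n_bins):
        # Position of the k-th quantile boundary in the sorted sample.
        idx = (k * len(ordered)) // n_bins
        edges.append(ordered[idx])
    return edges


def to_bin_index(value, edges):
    """Map a continuous value to an integer bin index in the range 0..len(edges)."""
    return bisect_right(edges, value)


# Hypothetical continuous feature column.
feature = [0.3, 7.1, 2.4, 9.8, 5.5, 1.2, 8.6, 4.0]

edges = quantile_bin_edges(feature, n_bins=4)
binned = [to_bin_index(v, edges) for v in feature]

print(edges)   # thresholds between neighbouring bins
print(binned)  # one small integer per sample instead of a float
```

After binning, a split search only has to consider the bin boundaries rather than every distinct feature value, which is where the speed-up on large data sets comes from.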

Supported properties

ONNX support

Examples of exporting Hist Gradient Boosting models can be found here: ONNX export of Hist Gradient Boosting.


Classification limitation

For classification models, only the predicted labels are output in the PLC. The scores/probabilities are not available in the PLC.

Supported data types

A distinction must be made between "supported data type" and "preferred data type". The preferred data type corresponds to the precision of the execution engine.

The preferred data type is 64-bit floating point (E_MLLDT_FP64-LREAL).

When a supported data type other than the preferred one is used, the library automatically performs an efficient type conversion. This conversion can cause a slight loss of performance.
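What such a conversion means numerically can be sketched in a few lines. The sketch assumes a 32-bit floating point input (REAL) that is widened to the preferred 64-bit type (LREAL); the helper `as_fp32` is hypothetical, and the sketch says nothing about how the library implements the conversion internally.

```python
import struct


def as_fp32(value):
    """Round a Python float (FP64) to the nearest value representable in FP32."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


sample_fp64 = 0.1                   # value held in 64-bit precision
sample_fp32 = as_fp32(sample_fp64)  # same value after storage in 32 bits

# Widening FP32 to FP64 is exact: the value survives another round trip unchanged.
roundtrip = as_fp32(sample_fp32)

print(sample_fp32 == sample_fp64)   # False: precision was lost when storing as FP32
print(roundtrip == sample_fp32)     # True: FP32 -> FP64 conversion adds no further error
```

The conversion to FP64 therefore does not change the values supplied; any precision difference stems from the narrower input type itself, while the cost of the conversion is the extra runtime mentioned above.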

A list of the supported data types can be found in ETcMllDataType.