Multitask, Concurrent Execution and OpenMP

In Simulink® you can configure your models to run on multicore target systems; further details can be found in the MathWorks® documentation. Beckhoff targets usually offer a multi-core architecture, which can be used efficiently with TwinCAT 3. This is also possible with the TwinCAT Target for Simulink®, as shown below.

A distinction is made in this description between Multitask, Concurrent Execution and OpenMP.

Multitask and Concurrent Execution

The following multirate system in Simulink® is considered for the description of the Multitask and Concurrent Execution options. The model contains an explicit and an implicit rate transition.

Multitask, Concurrent Execution and OpenMP 1:

Go to Configuration Parameters and select Solver. Here you can choose between the following options:

Multitask, Concurrent Execution and OpenMP 2:

Treat each discrete rate as separate task: Multitask

If a TcCOM object is created with the Treat each discrete rate as separate task option enabled, you will get an object to which you can assign multiple task contexts; in this case, three tasks.

Multitask, Concurrent Execution and OpenMP 3:

The inputs, outputs and all other DataAreas are divided among the different contexts, so in this case there are three Input DataAreas and three Output DataAreas.

Multitask, Concurrent Execution and OpenMP 4:

In this case, the cyclic tasks must all be placed on the same core. There is no parallel processing of the tasks.

The advantage over a TcCOM with only one task interface is that not all calculations have to be completed within the fastest task cycle time (see Scheduling). If the above Simulink® model were created with the default setting, i.e. without Treat each discrete rate as separate task, only a single task with 10 ms (the fastest rate) would be linkable, and all calculations would have to finish within this time. By distributing the rates across multiple tasks on the same core, this restriction is lifted, because the tasks can interrupt each other (see Priorities).

Multitask, Concurrent Execution and OpenMP 5:

Properties:

Scheduling Details:

The graphic below shows an example of how the computing times can be distributed. The hatched areas indicate that a task may not run during this time because of a higher-priority task. The solid blue areas indicate that the task is running. Note that the areas were overlaid on the real-time monitor image afterwards to aid comprehension; they are not part of the real recording.

Multitask, Concurrent Execution and OpenMP 6:

If cycle time overruns occur and the schedule cannot be adhered to, the execution of the respective task context is skipped until all relevant contexts are back in the appropriate state. On the TcCOM object this behavior can be observed via the online parameter SkippedExecutionCount.

Allow tasks to execute concurrently on target: Concurrent Execution

If a TcCOM object is created with the Allow tasks to execute concurrently on target option enabled, you will get an object to which you can assign multiple task contexts; in this case, as in the example above, three tasks.

Again, the DataAreas are separated into the different contexts. The difference from the multitask object is that the tasks can now be distributed to different cores, so that processing is actually parallelized.

Multitask, Concurrent Execution and OpenMP 7:

Properties:

Scheduling Details:

The graphic below shows an example of how the computing times can be distributed. The solid blue areas indicate that the task is running. Note that the areas were overlaid on the real-time monitor image afterwards to aid comprehension; they are not part of the real recording.

Multitask, Concurrent Execution and OpenMP 8:

If cycle time overruns occur and the schedule cannot be adhered to, the execution of the respective task context is skipped until all relevant contexts are back in the appropriate state. On the TcCOM object this behavior can be observed via the online parameter SkippedExecutionCount.

OpenMP

The Simulink Coder™ or the MATLAB Coder™ can generate OpenMP code. Please refer to the MathWorks® documentation for the exact cases in which this happens.

The following is an example using a MATLAB® Function in Simulink®. A MATLAB® example can be found in conjunction with the TE1401 TwinCAT Target for MATLAB® in the examples:
TwinCAT.ModuleGenerator.Samples.Start('Code parallelization with OpenMP').

Multitask, Concurrent Execution and OpenMP 9:

The parfor command is used to parallelize the FOR loop in the MATLAB® function. In this case, the number of parallel workers is limited to 4.

function y = MyFunction(u) %#codegen

A = ones(20,50);
t = 42;

% Parallelize the loop; limit the number of workers to a maximum of 4
parfor (i = 1:10, 4)
    A(i,1) = A(i,1) + t;
end

y = A(1,4) + u;

No special settings regarding OpenMP have to be made for the TwinCAT target; you generate your TwinCAT objects as usual. The Simulink Coder™ translates this code into OpenMP code, so that the generated C/C++ code is parallelized accordingly. The Embedded Coder® is not required for this feature.

In the TwinCAT XAE you can now instantiate the created TcCOM or the PLC FB and configure it accordingly. As usual, the object instance offers only one cyclic task interface under the Context tab. In this example, a Task 2 with a cycle time of 200 ms is created and assigned to the object.

Under Parameter (Init) there is a parameter JobPoolId. Here it is also shown, as far as it is known from the C/C++ code, how many workers can work in parallel. A JobPool is an organizational unit for JobTasks, which can be created in the Tasks node.

Multitask, Concurrent Execution and OpenMP 10:

Accordingly, an object of type TcJobPool must be added under TcCOM Objects via Add New Item. Under Parameter (Init) of the TcJobPool object, enter the JobPoolId and reference a group of JobTasks: first define how many JobTasks the pool should combine, then select the JobTasks from the drop-down menus.

Multitask, Concurrent Execution and OpenMP 11:

Under System > Real-Time you can distribute the JobTasks to different cores.

Multitask, Concurrent Execution and OpenMP 12:

Execution in the configuration shown above then takes place as follows. Task 2 is executed on core 4 and cyclically drives the OpenMP object. The code fragments generated as OpenMP code can then offload work to the configured JobTasks via the JobPool. When the JobTasks have finished their calculations, all partial results are bundled again and Task 2 on core 4 executes the code to the end.