Description
Crocoddyl's parallelization does not seem to scale as expected.
When measuring the execution time of the parallelized function calcdiff of the class IntegratedActionModelEuler on the robot Talos, we observe a decrease in performance. The repository https://gitlab.laas.fr/gsaurel/croco-benchs contains plots showing the results on multiple computers:
This is what the execution time looks like on multiple machines as we increase the number of cores. Now, if we plot the execution time multiplied by the number of cores, this product should stay constant as cores are added if the parallelization were successful. In practice, the more cores we use, the higher the product becomes:
The increase of this product shows the loss of performance.
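As a minimal sketch of the metric described above, the snippet below computes the execution time multiplied by the number of cores, along with the corresponding parallel efficiency T(1) / (n * T(n)). The function names and the timing values are hypothetical placeholders chosen for illustration. They are not measurements from the benchmark repository.

```python
def core_time_product(timings):
    """Map each core count to execution time multiplied by core count."""
    return {cores: t * cores for cores, t in sorted(timings.items())}


def parallel_efficiency(timings):
    """Efficiency relative to the single-core run: T(1) / (n * T(n))."""
    t1 = timings[1]
    return {cores: t1 / (t * cores) for cores, t in sorted(timings.items())}


# Hypothetical timings in seconds (placeholder values, NOT measurements).
ideal = {1: 8.0, 2: 4.0, 4: 2.0}      # perfect scaling: product stays constant
degraded = {1: 8.0, 2: 5.0, 4: 4.0}   # poor scaling: product grows with cores

print(core_time_product(ideal))
print(core_time_product(degraded))
print(parallel_efficiency(degraded))
```

With perfect scaling the product is flat and the efficiency stays at 1. A growing product (equivalently, an efficiency dropping below 1) is the symptom reported here.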
I am opening this issue with the aim of solving this problem, and I am currently investigating its causes.
I think I have found one of the causes of this issue: the function calcdiff in the class CostModelSum. If we agree, I can open a PR to discuss this first step in more detail.
I am also looking at calcdiff in the class IntegratedActionModelEuler: there may be an aliasing problem that also decreases performance.