##bbobECDFslegendrlbased## Bootstrapped empirical cumulative distribution of the number of objective function evaluations divided by dimension (FEvals/DIM) for all functions and subgroups in DIMVALUE-D. The targets are chosen from 10^[−8..2] such that the best algorithm from BBOB 2009 just failed to reach them within a given budget of k × DIM, with 31 different values of k chosen equidistant in logscale within the interval [0.5, 50]. The "best 2009" line corresponds to the best aRT observed during BBOB 2009 for each selected target.
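The run-length-based target construction described above can be sketched in a few lines (an illustrative Python sketch, not the actual cocopp implementation; the helper name `runlength_based_target` and the data layout are assumptions):

```python
def runlength_based_target(ref_art_by_target, budget):
    """Return the easiest target precision that the reference algorithm
    ("best 2009") just failed to reach within `budget` evaluations.

    ref_art_by_target: target precision -> reference aRT (evaluations);
    larger precision values are easier targets.
    """
    for target, art in sorted(ref_art_by_target.items(), reverse=True):
        if art > budget:  # reference needed more than the budget
            return target
    return min(ref_art_by_target)  # all targets reached: use the hardest

# 31 budget multipliers k, equidistant in log scale on [0.5, 50]
ks = [0.5 * (50 / 0.5) ** (i / 30) for i in range(31)]
```

For example, with reference aRTs {1.0: 10, 0.1: 100} and a budget of 50 evaluations, the sketch selects the 0.1 target, since the reference algorithm needed more than the budget to reach it.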
##bbobppfigslegendrlbased## Average running time (aRT in number of f-evaluations as log10 value) divided by dimension versus dimension. The target function value is chosen such that !!THE-REF-ALG!! just failed to achieve an aRT of !!PPFIGS−FTARGET!!×DIM. Different symbols correspond to different algorithms given in the legend of f1 and f24. Light symbols give the maximum number of function evaluations from the longest trial divided by dimension. Black stars indicate a statistically better result compared to all other algorithms with p < 0.01 and Bonferroni correction by the number of dimensions (six).
##bbobpprldistrlegendrlbased## Empirical cumulative distribution functions (ECDF), plotting the fraction of trials with an outcome not larger than the respective value on the x-axis. Left subplots: ECDF of the number of function evaluations (FEvals) divided by search space dimension D, to fall below fopt+∆f where ∆f is the target just not reached by the best algorithm from BBOB 2009 within a budget of k×DIM evaluations, where k is the first value in the legend. Legends indicate for each target the number of functions that were solved in at least one trial within the displayed budget. Right subplots: ECDF of the best achieved ∆f for running times of 0.5D, 1.2D, 3D, 10D, 100D, 1000D, ... function evaluations (from right to left cycling cyan-magenta-black...) and final ∆f-value (red), where ∆f and Df denote the difference to the optimal function value. Light brown lines in the background show ECDFs for the most difficult target of all algorithms benchmarked during BBOB-2009.
##bbobpprldistrlegendtworlbased## Empirical cumulative distributions (ECDF) of run lengths and speed-up ratios in 5-D (left) and 20-D (right). Left sub-columns: ECDF of the number of function evaluations divided by dimension D (FEvals/D) to fall below fopt+∆f for algorithmA (°) and algorithmB ( ) where ∆f is the target just not reached by the best algorithm from BBOB 2009 within a budget of k×DIM evaluations, with k being the value in the legend. Right sub-columns: ECDF of FEval ratios of algorithmA divided by algorithmB for run-length-based targets; all trial pairs for each function. Pairs where both trials failed are disregarded, pairs where one trial failed are visible in the limits being > 0 or < 1. The legends indicate the target budget of k×DIM evaluations and, after the colon, the number of functions that were solved in at least one trial (algorithmA first).
##bbobppfigdimlegendrlbased## Scaling of runtime with dimension to reach certain target values ∆f. Lines: average runtime (aRT); Cross (+): median runtime of successful runs to reach the most difficult target that was reached at least once (but not always); Cross (×): maximum number of f-evaluations in any trial. Notched boxes: interquartile range with median of simulated runs. All values are divided by dimension and plotted as log10 values versus dimension. Shown is the aRT for targets just not reached by the best algorithm from BBOB 2009 within the given budget k×DIM, where k is shown in the legend. Numbers above aRT-symbols (if appearing) indicate the number of trials reaching the respective target. The light thick line with diamonds indicates the best algorithm from BBOB 2009 for the most difficult target. Slanted grid lines indicate a scaling with O(DIM) compared to O(1) when using the respective reference algorithm.
##bbobpptablecaptionrlbased## Average running time (aRT in number of function evaluations) divided by the aRT of the best algorithm from BBOB 2009 in different dimensions. The aRT and, in braces as dispersion measure, the half difference between the 90 and 10%-tile of bootstrapped run lengths appear in the second row of each cell, the best aRT in the first. The different target ∆f-values are shown in the top row. #succ is the number of trials that reached the (final) target fopt + 10^−8. The median number of conducted function evaluations is additionally given in italics, if the target in the last column was never reached. Bold entries are statistically significantly better (according to the rank-sum test) compared to the best algorithm from BBOB 2009, with p = 0.05 or p = 10^−k when the number k > 1 follows the ↓ symbol, with Bonferroni correction by the number of functions.??COCOVERSION??
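The aRT value and the bootstrapped dispersion measure used throughout these table captions can be sketched as follows (illustrative only; the function names and the simulated-restart bootstrap here are simplified assumptions, not cocopp code):

```python
import random

def average_runtime(evals, successes):
    """aRT: evaluations summed over all trials, divided by the number
    of successful trials; infinite if no trial succeeded."""
    n_succ = sum(successes)
    return sum(evals) / n_succ if n_succ else float('inf')

def bootstrap_dispersion(evals, successes, n_boot=1000, seed=1):
    """Half the difference between the 90 and 10%-tile of bootstrapped
    run lengths, where each run length is built by simulated restarts:
    draw trials with replacement until a successful one is drawn."""
    rng = random.Random(seed)
    runlengths = []
    for _ in range(n_boot):
        total = 0
        while True:
            i = rng.randrange(len(evals))
            total += evals[i]
            if successes[i]:
                break
        runlengths.append(total)
    runlengths.sort()
    lo = runlengths[int(0.10 * (n_boot - 1))]
    hi = runlengths[int(0.90 * (n_boot - 1))]
    return (hi - lo) / 2
```

With three trials costing 100, 200, and 300 evaluations of which two succeed, the sketch gives an aRT of 600/2 = 300 evaluations.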
##bbobpptablestwolegendrlbased## Average running time (aRT in number of function evaluations) divided by the respective best aRT measured during BBOB-2009 in different dimensions. The aRT and, in braces as dispersion measure, the half difference between the 10 and 90%-tile of bootstrapped run lengths appear for each algorithm and run-length based target, the corresponding reference aRT (preceded by the target ∆f-value in italics) in the first row. #succ is the number of trials that reached the target value of the last column. The median number of conducted function evaluations is additionally given in italics, if the last target was never reached. 1:algorithmAshort is algorithmA and 2:algorithmBshort is algorithmB. Bold entries are statistically significantly better compared to the other algorithm, with p = 0.05 or p = 10^−k where k ∈ {2, 3, 4, ...} is the number following the ∗ symbol, with Bonferroni correction of 48. A ↓ indicates the same tested against the best algorithm from BBOB 2009. ??COCOVERSION??
##bbobpptablesmanylegendrlbased## Average runtime (aRT in number of function evaluations) divided by the respective best aRT measured during BBOB-2009 in different dimensions. The aRT and, in braces as dispersion measure, the half difference between the 10 and 90%-tile of bootstrapped run lengths appear for each algorithm and run-length based target, the corresponding reference aRT (preceded by the target ∆f-value in italics) in the first row. #succ is the number of trials that reached the target value of the last column. The median number of conducted function evaluations is additionally given in italics, if the target in the last column was never reached. Entries followed by a star are statistically significantly better (according to the rank-sum test) when compared to all other algorithms of the table, with p = 0.05 or p = 10^−k when the number k following the star is larger than 1, with Bonferroni correction of 48. A ↓ indicates the same tested against the best algorithm from BBOB 2009. Best results are printed in bold. ??COCOVERSION??
##bbobppscatterlegendrlbased## Average running time (aRT in log10 of number of function evaluations) of algorithmA (y-axis) versus algorithmB (x-axis) for !!NBTARGETS!! runlength-based target values for budgets between !!NBLOW!! and !!NBUP!! evaluations. Each runlength-based target !!F!!-value is chosen such that the aRTs of !!THE-REF-ALG!! for the given and a slightly easier target bracket the reference budget. Markers on the upper or right edge indicate that the respective target value was never reached. Markers represent dimension: 2:+, 3:\triangledown, 5:\star, 10:°, 20:\Box, 40:\Diamond.
##bbobloglosstablecaptionrlbased## aRT loss ratio versus the budget in number of f-evaluations divided by dimension. For each given budget FEvals, the target value ft is computed as the best target f-value reached within the budget by the given algorithm. Shown is then the aRT to reach ft for the given algorithm or the budget, if the best algorithm from BBOB 2009 reached a better target within the budget, divided by the aRT of the best algorithm from BBOB 2009 to reach ft. Line: geometric mean. Box-Whisker error bar: 25-75%-ile with median (box), 10-90%-ile (caps), and minimum and maximum aRT loss ratio (points). The vertical line gives the maximal number of function evaluations in a single trial in this function subset. See also the following figure for results on each function subgroup.??COCOVERSION??
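The loss-ratio aggregation ("Line: geometric mean") in the caption above amounts to the following (a minimal sketch under the caption's definitions; the helper names are hypothetical):

```python
import math

def art_loss_ratios(art_algorithm, art_best2009):
    """Per-function aRT loss ratio: the algorithm's aRT to reach the
    target f_t divided by the reference ("best 2009") aRT for f_t."""
    return [a / b for a, b in zip(art_algorithm, art_best2009)]

def geometric_mean(ratios):
    """Geometric mean of strictly positive loss ratios."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))
```

A loss ratio of 2 means the algorithm needed twice as many evaluations as the reference to reach the same target.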
##bbobloglossfigurecaptionrlbased## aRT loss ratios (see the previous figure for details).
Each cross (+) represents a single function, the line is the geometric mean.
##bbobECDFslegendfixed## Bootstrapped empirical cumulative distribution of the number of objective function evaluations divided by dimension (FEvals/DIM) for 51 targets with target precision in 10^[−8..2] for all functions and subgroups in DIMVALUE-D. The "best 2009" line corresponds to the best aRT observed during BBOB 2009 for each selected target.
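The 51 fixed target precisions and the ECDF itself can be sketched as follows (illustrative; cocopp's actual bootstrapping of runtimes is more involved, and the step width of 0.2 is inferred from the 51-target count over ten decades):

```python
# 51 target precisions in 10^[-8..2]: exponents from -8 to 2 in steps of 0.2.
targets = [10 ** (-8 + 0.2 * i) for i in range(51)]

def ecdf(runtimes, budgets):
    """Fraction of (trial, target) pairs whose runtime (FEvals/DIM) is
    not larger than each budget; unreached targets carry float('inf')."""
    n = len(runtimes)
    return [sum(r <= b for r in runtimes) / n for b in budgets]
```

Pairs whose target was never reached keep the curve below 1 no matter how large the budget grows.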
##bbobppfigslegendfixed## Average running time (aRT in number of f-evaluations as log10 value), divided by dimension for target function value !!PPFIGS−FTARGET!! versus dimension. Slanted grid lines indicate quadratic scaling with the dimension. Different symbols correspond to different algorithms given in the legend of f1 and f24. Light symbols give the maximum number of function evaluations from the longest trial divided by dimension. Black stars indicate a statistically better result compared to all other algorithms with p < 0.01 and Bonferroni correction by the number of dimensions (six).
##bbobpprldistrlegendfixed## Empirical cumulative distribution functions (ECDF), plotting the fraction of trials with an outcome not larger than the respective value on the x-axis. Left subplots: ECDF of the number of function evaluations (FEvals) divided by search space dimension D, to fall below fopt+∆f with ∆f = 10^k, where k is the first value in the legend. The thick red line represents the most difficult target value fopt + 10^−8. Legends indicate for each target the number of functions that were solved in at least one trial within the displayed budget. Right subplots: ECDF of the best achieved ∆f for running times of 0.5D, 1.2D, 3D, 10D, 100D, 1000D, ... function evaluations (from right to left cycling cyan-magenta-black...) and final ∆f-value (red), where ∆f and Df denote the difference to the optimal function value. Light brown lines in the background show ECDFs for the most difficult target of all algorithms benchmarked during BBOB-2009.
##bbobpprldistrlegendtwofixed## Empirical cumulative distributions (ECDF) of run lengths and speed-up ratios in 5-D (left) and 20-D (right). Left sub-columns: ECDF of the number of function evaluations divided by dimension D (FEvals/D) to reach a target value fopt+∆f with ∆f = 10^k, where k is given by the first value in the legend, for algorithmA (°) and algorithmB (). Light beige lines show the ECDF of FEvals for target value ∆f = 10^−8 of all algorithms benchmarked during BBOB-2009. Right sub-columns: ECDF of FEval ratios of algorithmA divided by algorithmB for target function values 10^k with k given in the legend; all trial pairs for each function. Pairs where both trials failed are disregarded, pairs where one trial failed are visible in the limits being > 0 or < 1. The legend also indicates, after the colon, the number of functions that were solved in at least one trial (algorithmA first).
##bbobppfigdimlegendfixed## Scaling of runtime with dimension to reach certain target values ∆f. Lines: average runtime (aRT); Cross (+): median runtime of successful runs to reach the most difficult target that was reached at least once (but not always); Cross (×): maximum number of f-evaluations in any trial. Notched boxes: interquartile range with median of simulated runs. All values are divided by dimension and plotted as log10 values versus dimension. Shown is the aRT for fixed values of ∆f = 10^k with k given in the legend. Numbers above aRT-symbols (if appearing) indicate the number of trials reaching the respective target. The light thick line with diamonds indicates the best algorithm from BBOB 2009 for the most difficult target. Horizontal lines mean linear scaling, slanted grid lines depict quadratic scaling.
##bbobpptablecaptionfixed## Average running time (aRT in number of function evaluations) divided by the aRT of the best algorithm from BBOB 2009 in different dimensions. The aRT and, in braces as dispersion measure, the half difference between the 90 and 10%-tile of bootstrapped run lengths appear in the second row of each cell, the best aRT (preceded by the target ∆f-value in italics) in the first. #succ is the number of trials that reached the target value of the last column. The median number of conducted function evaluations is additionally given in italics, if the target in the last column was never reached. Bold entries are statistically significantly better (according to the rank-sum test) compared to the best algorithm from BBOB 2009, with p = 0.05 or p = 10^−k when the number k > 1 follows the ↓ symbol, with Bonferroni correction by the number of functions.??COCOVERSION??
##bbobpptablestwolegendfixed## Average running time (aRT in number of function evaluations) divided by the respective best aRT measured during BBOB-2009 in different dimensions. The aRT and, in braces as dispersion measure, the half difference between the 10 and 90%-tile of bootstrapped run lengths appear for each algorithm and target, the corresponding reference aRT in the first row. The different target ∆f-values are shown in the top row. #succ is the number of trials that reached the (final) target fopt + 10^−8. The median number of conducted function evaluations is additionally given in italics, if the last target was never reached. 1:algorithmAshort is algorithmA and 2:algorithmBshort is algorithmB. Bold entries are statistically significantly better compared to the other algorithm, with p = 0.05 or p = 10^−k where k ∈ {2, 3, 4, ...} is the number following the ∗ symbol, with Bonferroni correction of 48. A ↓ indicates the same tested against the best algorithm from BBOB 2009. ??COCOVERSION??
##bbobpptablesmanylegendfixed## Average runtime (aRT in number of function evaluations) divided by the respective best aRT measured during BBOB-2009 in different dimensions. The aRT and, in braces as dispersion measure, the half difference between the 10 and 90%-tile of bootstrapped run lengths appear for each algorithm and target, the corresponding reference aRT in the first row. The different target ∆f-values are shown in the top row. #succ is the number of trials that reached the (final) target fopt + 10^−8. The median number of conducted function evaluations is additionally given in italics, if the target in the last column was never reached. Entries followed by a star are statistically significantly better (according to the rank-sum test) when compared to all other algorithms of the table, with p = 0.05 or p = 10^−k when the number k following the star is larger than 1, with Bonferroni correction of 48. A ↓ indicates the same tested against the best algorithm from BBOB 2009. Best results are printed in bold. ??COCOVERSION??
##bbobppscatterlegendfixed## Average running time (aRT in log10 of number of function evaluations) of algorithmA (y-axis) versus algorithmB (x-axis) for !!NBTARGETS!! target values !!DF!! ∈ [!!NBLOW!!, !!NBUP!!] in each dimension on functions f1 - f24. Markers on the upper or right edge indicate that the respective target value was never reached. Markers represent dimension: 2:+, 3:\triangledown, 5:\star, 10:°, 20:\Box, 40:\Diamond.
##bbobloglosstablecaptionfixed## aRT loss ratio versus the budget in number of f-evaluations divided by dimension. For each given budget FEvals, the target value ft is computed as the best target f-value reached within the budget by the given algorithm. Shown is then the aRT to reach ft for the given algorithm or the budget, if the best algorithm from BBOB 2009 reached a better target within the budget, divided by the aRT of the best algorithm from BBOB 2009 to reach ft. Line: geometric mean. Box-Whisker error bar: 25-75%-ile with median (box), 10-90%-ile (caps), and minimum and maximum aRT loss ratio (points). The vertical line gives the maximal number of function evaluations in a single trial in this function subset. See also the following figure for results on each function subgroup.??COCOVERSION??
##bbobloglossfigurecaptionfixed## aRT loss ratios (see the previous figure for details).
Each cross (+) represents a single function, the line is the geometric mean.
##bbobECDFslegendbiobjfixed## Bootstrapped empirical cumulative distribution of the number of objective function evaluations divided by dimension (FEvals/DIM) for 58 targets with target precision in {−10^−4, −10^−4.2, −10^−4.4, −10^−4.6, −10^−4.8, −10^−5, 0, 10^−5, 10^−4.9, 10^−4.8, ..., 10^−0.1, 10^0} for all functions and subgroups in DIMVALUE-D. The "best 2016" line corresponds to the best aRT observed during BBOB 2016 for each selected target.
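The 58 target precisions listed above can be reconstructed programmatically, which also verifies the count: six negative targets, zero, and 51 positive targets (a sketch under the caption's notation; variable names are arbitrary):

```python
# Six negative targets -10^-4, -10^-4.2, ..., -10^-5, then 0, then
# 51 positive targets 10^-5, 10^-4.9, ..., 10^0.
negative = [-10 ** (-4 - 0.2 * i) for i in range(6)]
positive = [10 ** (-5 + 0.1 * i) for i in range(51)]
targets = negative + [0.0] + positive
```

6 + 1 + 51 = 58, matching the caption.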
##bbobppfigslegendbiobjfixed## Average running time (aRT in number of f-evaluations as log10 value), divided by dimension for target function value !!PPFIGS−FTARGET!! versus dimension. Slanted grid lines indicate quadratic scaling with the dimension. Different symbols correspond to different algorithms given in the legend of f1 and f55. Light symbols give the maximum number of function evaluations from the longest trial divided by dimension. Black stars indicate a statistically better result compared to all other algorithms with p < 0.01 and Bonferroni correction by the number of dimensions (six).
##bbobpprldistrlegendbiobjfixed## Empirical cumulative distribution functions (ECDF), plotting the fraction of trials with an outcome not larger than the respective value on the x-axis. Left subplots: ECDF of the number of function evaluations (FEvals) divided by search space dimension D, to fall below Iref+∆I with ∆I = 10^k, where k is the first value in the legend. The thick red line represents the most difficult target value Iref + 10^−5. Legends indicate for each target the number of functions that were solved in at least one trial within the displayed budget. Right subplots: ECDF of the best achieved ∆I for running times of 0.5D, 1.2D, 3D, 10D, 100D, 1000D, ... function evaluations (from right to left cycling cyan-magenta-black...) and final ∆I-value (red), where ∆I and Df denote the difference to the optimal function value. Shown are aggregations over functions where the single objectives are in the same BBOB function class, as indicated on the left side and the aggregation over all 55 functions in the last row.
##bbobpprldistrlegendtwobiobjfixed## Empirical cumulative distributions (ECDF) of run lengths and speed-up ratios in 5-D (left) and 20-D (right). Left sub-columns: ECDF of the number of function evaluations divided by dimension D (FEvals/D) to reach a target value Iref+∆I with ∆I = 10^k, where k is given by the first value in the legend, for algorithmA (°) and algorithmB (). Right sub-columns: ECDF of FEval ratios of algorithmA divided by algorithmB for target function values 10^k with k given in the legend; all trial pairs for each function. Pairs where both trials failed are disregarded, pairs where one trial failed are visible in the limits being > 0 or < 1. The legend also indicates, after the colon, the number of functions that were solved in at least one trial (algorithmA first).
##bbobppfigdimlegendbiobjfixed## Scaling of runtime with dimension to reach certain target values ∆I. Lines: average runtime (aRT); Cross (+): median runtime of successful runs to reach the most difficult target that was reached at least once (but not always); Cross (×): maximum number of f-evaluations in any trial. Notched boxes: interquartile range with median of simulated runs. All values are divided by dimension and plotted as log10 values versus dimension. Shown is the aRT for fixed values of ∆I = 10^k with k given in the legend. Numbers above aRT-symbols (if appearing) indicate the number of trials reaching the respective target. The light thick line with diamonds indicates the best algorithm from BBOB 2016 for the most difficult target. Horizontal lines mean linear scaling, slanted grid lines depict quadratic scaling.
##bbobpptablecaptionbiobjfixed## Average running time (aRT in number of function evaluations) divided by the aRT of the best algorithm from BBOB 2016 in different dimensions. The aRT and, in braces as dispersion measure, the half difference between the 90 and 10%-tile of bootstrapped run lengths appear in the second row of each cell, the best aRT (preceded by the target ∆I-value in italics) in the first. #succ is the number of trials that reached the target value of the last column. The median number of conducted function evaluations is additionally given in italics, if the target in the last column was never reached. Bold entries are statistically significantly better (according to the rank-sum test) compared to the best algorithm from BBOB 2016, with p = 0.05 or p = 10^−k when the number k > 1 follows the ↓ symbol, with Bonferroni correction by the number of functions.??COCOVERSION??
##bbobpptablestwolegendbiobjfixed## Average running time (aRT in number of function evaluations) divided by the respective best aRT measured during BBOB-2016 in different dimensions. The aRT and, in braces as dispersion measure, the half difference between the 10 and 90%-tile of bootstrapped run lengths appear for each algorithm and target, the corresponding reference aRT in the first row. The different target ∆I-values are shown in the top row. #succ is the number of trials that reached the (final) target Iref + 10^−5. The median number of conducted function evaluations is additionally given in italics, if the last target was never reached. 1:algorithmAshort is algorithmA and 2:algorithmBshort is algorithmB. Bold entries are statistically significantly better compared to the other algorithm, with p = 0.05 or p = 10^−k where k ∈ {2, 3, 4, ...} is the number following the ∗ symbol, with Bonferroni correction of 48. A ↓ indicates the same tested against the best algorithm from BBOB 2016. ??COCOVERSION??
##bbobpptablesmanylegendbiobjfixed## Average runtime (aRT in number of function evaluations) divided by the respective best aRT measured during BBOB-2016 in different dimensions. The aRT and, in braces as dispersion measure, the half difference between the 10 and 90%-tile of bootstrapped run lengths appear for each algorithm and target, the corresponding reference aRT in the first row. The different target ∆I-values are shown in the top row. #succ is the number of trials that reached the (final) target Iref + 10^−5. The median number of conducted function evaluations is additionally given in italics, if the target in the last column was never reached. Entries followed by a star are statistically significantly better (according to the rank-sum test) when compared to all other algorithms of the table, with p = 0.05 or p = 10^−k when the number k following the star is larger than 1, with Bonferroni correction of 110. A ↓ indicates the same tested against the best algorithm from BBOB 2016. Best results are printed in bold. ??COCOVERSION??
##bbobppscatterlegendbiobjfixed## Average running time (aRT in log10 of number of function evaluations) of algorithmA (y-axis) versus algorithmB (x-axis) for !!NBTARGETS!! target values !!DF!! ∈ [!!NBLOW!!, !!NBUP!!] in each dimension on functions f1 - f55. Markers on the upper or right edge indicate that the respective target value was never reached. Markers represent dimension: 2:+, 3:\triangledown, 5:\star, 10:°, 20:\Box, 40:\Diamond.
##bbobloglosstablecaptionbiobjfixed## aRT loss ratio versus the budget in number of f-evaluations divided by dimension. For each given budget FEvals, the target value ft is computed as the best target I_HV^COCO-value reached within the budget by the given algorithm. Shown is then the aRT to reach ft for the given algorithm or the budget, if the best algorithm from BBOB 2016 reached a better target within the budget, divided by the aRT of the best algorithm from BBOB 2016 to reach ft. Line: geometric mean. Box-Whisker error bar: 25-75%-ile with median (box), 10-90%-ile (caps), and minimum and maximum aRT loss ratio (points). The vertical line gives the maximal number of function evaluations in a single trial in this function subset. See also the following figure for results on each function subgroup.??COCOVERSION??
##bbobloglossfigurecaptionbiobjfixed## aRT loss ratios (see the previous figure for details).
Each cross (+) represents a single function, the line is the geometric mean.
##bbobECDFslegendbiobjrlbased## Bootstrapped empirical cumulative distribution of the number of objective function evaluations divided by dimension (FEvals/DIM) for all functions and subgroups in DIMVALUE-D. The targets are chosen from {−10^−4, −10^−4.2, −10^−4.4, −10^−4.6, −10^−4.8, −10^−5, 0, 10^−5, 10^−4.9, 10^−4.8, ..., 10^−0.1, 10^0} such that the best algorithm from BBOB 2016 just failed to reach them within a given budget of k × DIM, with 31 different values of k chosen equidistant in logscale within the interval [0.5, 50]. The "best 2016" line corresponds to the best aRT observed during BBOB 2016 for each selected target.
##bbobppfigslegendbiobjrlbased## Average running time (aRT in number of f-evaluations as log10 value) divided by dimension versus dimension. The target function value is chosen such that !!THE-REF-ALG!! just failed to achieve an aRT of !!PPFIGS−FTARGET!!×DIM. Different symbols correspond to different algorithms given in the legend of f1 and f55. Light symbols give the maximum number of function evaluations from the longest trial divided by dimension. Black stars indicate a statistically better result compared to all other algorithms with p < 0.01 and Bonferroni correction by the number of dimensions (six).
##bbobpprldistrlegendbiobjrlbased## Empirical cumulative distribution functions (ECDF), plotting the fraction of trials with an outcome not larger than the respective value on the x-axis. Left subplots: ECDF of the number of function evaluations (FEvals) divided by search space dimension D, to fall below Iref+∆I where ∆I is the target just not reached by the best algorithm from BBOB 2016 within a budget of k×DIM evaluations, where k is the first value in the legend. Legends indicate for each target the number of functions that were solved in at least one trial within the displayed budget. Right subplots: ECDF of the best achieved ∆I for running times of 0.5D, 1.2D, 3D, 10D, 100D, 1000D, ... function evaluations (from right to left cycling cyan-magenta-black...) and final ∆I-value (red), where ∆I and Df denote the difference to the optimal function value. Shown are aggregations over functions where the single objectives are in the same BBOB function class, as indicated on the left side and the aggregation over all 55 functions in the last row.
##bbobpprldistrlegendtwobiobjrlbased## Empirical cumulative distributions (ECDF) of run lengths and speed-up ratios in 5-D (left) and 20-D (right). Left sub-columns: ECDF of the number of function evaluations divided by dimension D (FEvals/D) to fall below Iref+∆I for algorithmA (°) and algorithmB ( ) where ∆I is the target just not reached by the best algorithm from BBOB 2016 within a budget of k×DIM evaluations, with k being the value in the legend. Right sub-columns: ECDF of FEval ratios of algorithmA divided by algorithmB for run-length-based targets; all trial pairs for each function. Pairs where both trials failed are disregarded, pairs where one trial failed are visible in the limits being > 0 or < 1. The legends indicate the target budget of k×DIM evaluations and, after the colon, the number of functions that were solved in at least one trial (algorithmA first).
##bbobppfigdimlegendbiobjrlbased## Scaling of runtime with dimension to reach certain target values ∆I. Lines: average runtime (aRT); Cross (+): median runtime of successful runs to reach the most difficult target that was reached at least once (but not always); Cross (×): maximum number of f-evaluations in any trial. Notched boxes: interquartile range with median of simulated runs. All values are divided by dimension and plotted as log10 values versus dimension. Shown is the aRT for targets just not reached by the best algorithm from BBOB 2016 within the given budget k×DIM, where k is shown in the legend. Numbers above aRT-symbols (if appearing) indicate the number of trials reaching the respective target. The light thick line with diamonds indicates the best algorithm from BBOB 2016 for the most difficult target. Slanted grid lines indicate a scaling with O(DIM) compared to O(1) when using the respective reference algorithm.
##bbobpptablecaptionbiobjrlbased## Average running time (aRT in number of function evaluations) divided by the aRT of the best algorithm from BBOB 2016 in different dimensions. The aRT and, in braces as dispersion measure, the half difference between the 90 and 10%-tile of bootstrapped run lengths appear in the second row of each cell, the best aRT in the first. The different target ∆I-values are shown in the top row. #succ is the number of trials that reached the (final) target Iref + 10^−5. The median number of conducted function evaluations is additionally given in italics, if the target in the last column was never reached. Bold entries are statistically significantly better (according to the rank-sum test) compared to the best algorithm from BBOB 2016, with p = 0.05 or p = 10^−k when the number k > 1 follows the ↓ symbol, with Bonferroni correction by the number of functions.??COCOVERSION??
##bbobpptablestwolegendbiobjrlbased## Average running time (aRT in number of function evaluations) divided by the respective best aRT measured during BBOB-2016 in different dimensions. The aRT and, in braces as dispersion measure, the half difference between the 10 and 90%-tile of bootstrapped run lengths appear for each algorithm and run-length based target, the corresponding reference aRT (preceded by the target ∆I-value in italics) in the first row. #succ is the number of trials that reached the target value of the last column. The median number of conducted function evaluations is additionally given in italics, if the last target was never reached. 1:algorithmAshort is algorithmA and 2:algorithmBshort is algorithmB. Bold entries are statistically significantly better compared to the other algorithm, with p = 0.05 or p = 10^−k where k ∈ {2, 3, 4, ...} is the number following the ∗ symbol, with Bonferroni correction of 48. A ↓ indicates the same tested against the best algorithm from BBOB 2016. ??COCOVERSION??
##bbobpptablesmanylegendbiobjrlbased## Average runtime (aRT in number of function evaluations) divided by the respective best aRT measured during BBOB-2016 in different dimensions. The aRT and, in braces as dispersion measure, the half difference between the 10 and 90%-tile of bootstrapped run lengths appear for each algorithm and run-length based target, the corresponding reference aRT (preceded by the target ∆I-value in italics) in the first row. #succ is the number of trials that reached the target value of the last column. The median number of conducted function evaluations is additionally given in italics, if the target in the last column was never reached. Entries followed by a star are statistically significantly better (according to the rank-sum test) when compared to all other algorithms of the table, with p = 0.05 or p = 10^−k when the number k following the star is larger than 1, with Bonferroni correction of 110. A ↓ indicates the same tested against the best algorithm from BBOB 2016. Best results are printed in bold. ??COCOVERSION??
##bbobppscatterlegendbiobjrlbased## Average running time (aRT in log10 of number of function evaluations) of algorithmA (y-axis) versus algorithmB (x-axis) for !!NBTARGETS!! runlength-based target values for budgets between !!NBLOW!! and !!NBUP!! evaluations. Each runlength-based target !!F!!-value is chosen such that the aRTs of !!THE-REF-ALG!! for the given and a slightly easier target bracket the reference budget. Markers on the upper or right edge indicate that the respective target value was never reached. Markers represent dimension: 2:+, 3:\triangledown, 5:\star, 10:°, 20:\Box, 40:\Diamond.
##bbobloglosstablecaptionbiobjrlbased## aRT loss ratio versus the budget in number of f-evaluations divided by dimension. For each given budget FEvals, the target value ft is computed as the best target I_HV^COCO-value reached within the budget by the given algorithm. Shown is then the aRT to reach ft for the given algorithm or the budget, if the best algorithm from BBOB 2016 reached a better target within the budget, divided by the aRT of the best algorithm from BBOB 2016 to reach ft. Line: geometric mean. Box-Whisker error bar: 25-75%-ile with median (box), 10-90%-ile (caps), and minimum and maximum aRT loss ratio (points). The vertical line gives the maximal number of function evaluations in a single trial in this function subset. See also the following figure for results on each function subgroup. ??COCOVERSION??
##bbobloglossfigurecaptionbiobjrlbased## aRT loss ratios (see the previous figure for details).
Each cross (+) represents a single function, the line is the geometric mean.
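The aRT loss ratio described above can be sketched in a few lines. This is an illustrative sketch under assumed names (`average_runtime`, `art_loss_ratio` are not the COCO postprocessing API); it shows only the core ratio, not the bootstrapping or aggregation over targets.

```python
# Sketch (assumed, not COCO's API) of the aRT and the aRT loss ratio.
def average_runtime(evals, successes):
    """aRT: total evaluations over all trials divided by the number of
    trials that reached the target; infinite when no trial succeeded."""
    n_succ = sum(successes)
    return sum(evals) / n_succ if n_succ else float('inf')

def art_loss_ratio(art_algo, art_best, budget):
    """aRT of the given algorithm (or the budget, when the target was
    never reached within it) divided by the reference aRT."""
    numerator = art_algo if art_algo != float('inf') else budget
    return numerator / art_best
```

For example, an algorithm that needs an aRT of 300 evaluations where the reference needs 100 has a loss ratio of 3; the geometric mean of these ratios over functions gives the plotted line.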
##bbobECDFslegendbiobjextfixed## Bootstrapped empirical cumulative distribution of the number of objective function evaluations divided by dimension (FEvals/DIM) for 58 targets with target precision in {−10^−4, −10^−4.2, −10^−4.4, −10^−4.6, −10^−4.8, −10^−5, 0, 10^−5, 10^−4.9, 10^−4.8, ..., 10^−0.1, 10^0} for all functions and subgroups in DIMVALUE-D.
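The ECDFs in these plots report, for each x, the fraction of trials with an outcome not larger than x. A minimal sketch of that step function, assuming a plain list of per-trial outcomes (the function name `ecdf` is illustrative, not COCO's API):

```python
# Sketch (assumed): an empirical cumulative distribution function as a
# step function over a finite sample of trial outcomes.
import bisect

def ecdf(values):
    """Return F with F(x) = fraction of values not larger than x."""
    data = sorted(values)
    n = len(data)
    def F(x):
        # bisect_right counts how many sorted values are <= x
        return bisect.bisect_right(data, x) / n
    return F
```

Evaluating the returned function at the plotted FEvals/DIM values yields the displayed curve; the bootstrapping in the caption refers to how the underlying run lengths are resampled, not to the ECDF itself.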
##bbobppfigslegendbiobjextfixed## Average running time (aRT in number of f-evaluations as log10 value), divided by dimension for target function value !!PPFIGS−FTARGET!! versus dimension. Slanted grid lines indicate quadratic scaling with the dimension. Different symbols correspond to different algorithms given in the legend of f1 and f92. Light symbols give the maximum number of function evaluations from the longest trial divided by dimension. Black stars indicate a statistically better result compared to all other algorithms with p < 0.01 and Bonferroni correction by the number of dimensions (six).
##bbobpprldistrlegendbiobjextfixed## Empirical cumulative distribution functions (ECDF), plotting the fraction of trials with an outcome not larger than the respective value on the x-axis. Left subplots: ECDF of the number of function evaluations (FEvals) divided by search space dimension D, to fall below Iref + ∆I with ∆I = 10^k, where k is the first value in the legend. The thick red line represents the most difficult target value Iref + 10^−5. Legends indicate for each target the number of functions that were solved in at least one trial within the displayed budget. Right subplots: ECDF of the best achieved ∆I for running times of 0.5D, 1.2D, 3D, 10D, 100D, 1000D,... function evaluations (from right to left cycling cyan-magenta-black...) and final ∆I-value (red), where ∆I and Df denote the difference to the optimal function value. Shown are aggregations over functions where the single objectives are in the same BBOB function class, as indicated on the left side and the aggregation over all 92 functions in the last row.
##bbobpprldistrlegendtwobiobjextfixed## Empirical cumulative distributions (ECDF) of run lengths and speed-up ratios in 5-D (left) and 20-D (right). Left sub-columns: ECDF of the number of function evaluations divided by dimension D (FEvals/D) to reach a target value Iref + ∆I with ∆I = 10^k, where k is given by the first value in the legend, for algorithmA (∘) and algorithmB (▽). Right sub-columns: ECDF of FEval ratios of algorithmA divided by algorithmB for target function values 10^k with k given in the legend; all trial pairs for each function. Pairs where both trials failed are disregarded, pairs where one trial failed are visible in the limits being > 0 or < 1. The legend also indicates, after the colon, the number of functions that were solved in at least one trial (algorithmA first).
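The per-trial-pair speed-up ratios in the right sub-columns can be sketched as below. This is an assumed illustration (the name `feval_ratios` is not COCO's API); the mapping of single-failure pairs to 0 and infinity mirrors the caption's remark that such pairs remain visible at the plot limits.

```python
# Sketch (assumed): FEval ratios algorithmA / algorithmB per trial pair.
def feval_ratios(evals_a, ok_a, evals_b, ok_b):
    """Pairs where both trials failed are dropped; pairs where exactly
    one trial failed map to 0 or inf, visible at the plot limits."""
    ratios = []
    for ea, sa, eb, sb in zip(evals_a, ok_a, evals_b, ok_b):
        if not sa and not sb:
            continue          # both failed: disregarded
        if not sa:
            ratios.append(float('inf'))  # only A failed
        elif not sb:
            ratios.append(0.0)           # only B failed
        else:
            ratios.append(ea / eb)       # both reached the target
    return ratios
```

Feeding these ratios into an ECDF then gives the right sub-column curves: values below 1 mean algorithmA was faster on that trial pair.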
##bbobppfigdimlegendbiobjextfixed## Scaling of runtime with dimension to reach certain target values ∆I. Lines: average runtime (aRT); Cross (+): median runtime of successful runs to reach the most difficult target that was reached at least once (but not always); Cross (×): maximum number of f-evaluations in any trial. Notched boxes: interquartile range with median of simulated runs. All values are divided by dimension and plotted as log10 values versus dimension. Shown is the aRT for fixed values of ∆I = 10^k with k given in the legend. Numbers above aRT-symbols (if appearing) indicate the number of trials reaching the respective target. Horizontal lines mean linear scaling, slanted grid lines depict quadratic scaling.
##bbobpptablecaptionbiobjextfixed## Average runtime (aRT) to reach given targets, measured in number of function evaluations. For each function, the aRT and, in braces as dispersion measure, the half difference between the 10 and 90%-tile of (bootstrapped) runtimes is shown for the different target ∆I-values as shown in the top row. #succ is the number of trials that reached the last target Iref + 10^−5. The median number of conducted function evaluations is additionally given in italics if the target in the last column was never reached.
##bbobpptablestwolegendbiobjextfixed## Average runtime (aRT) to reach given targets, measured in number of function evaluations, in different dimensions. For each function, the aRT and, in braces as dispersion measure, the half difference between the 10 and 90%-tile of (bootstrapped) runtimes is shown for the different target ∆I-values as given in the top row, with the corresponding reference aRT in the first row. #succ is the number of trials that reached the (final) target Iref + 10^−5. The median number of conducted function evaluations is additionally given in italics if the last target was never reached. 1:algorithmAshort is algorithmA and 2:algorithmBshort is algorithmB. Bold entries are statistically significantly better compared to the other algorithm, with p = 0.05 or p = 10^−k where k ∈ {2, 3, 4, ...} is the number following the ∗ symbol, with Bonferroni correction of 48. ??COCOVERSION??
##bbobpptablesmanylegendbiobjextfixed## Average runtime (aRT) to reach given targets, measured in number of function evaluations, in different dimensions. For each function, the aRT and, in braces as dispersion measure, the half difference between the 10 and 90%-tile of (bootstrapped) runtimes is shown for the different target !!DI!!-values as shown in the top row. #succ is the number of trials that reached the last target Iref + 10^−5. The median number of conducted function evaluations is additionally given in italics if the target in the last column was never reached. Entries followed by a star are statistically significantly better (according to the rank-sum test) when compared to all other algorithms of the table, with p = 0.05 or p = 10^−k when the number k following the star is larger than 1, with Bonferroni correction of 184. Best results are printed in bold. ??COCOVERSION??
##bbobppscatterlegendbiobjextfixed## Average running time (aRT in log10 of number of function evaluations) of algorithmA (y-axis) versus algorithmB (x-axis) for !!NBTARGETS!! target values !!DF!! ∈ [!!NBLOW!!, !!NBUP!!] in each dimension on functions f1 - f92. Markers on the upper or right edge indicate that the respective target value was never reached. Markers represent dimension: 2:+, 3:▽, 5:⋆, 10:∘, 20:□, 40:◇.
##bbobloglosstablecaptionbiobjextfixed## aRT loss ratio versus the budget in number of f-evaluations divided by dimension. For each given budget FEvals, the target value ft is computed as the best target I_HV^COCO-value reached within the budget by the given algorithm. Shown is then the aRT to reach ft for the given algorithm or the budget, if the best algorithm from BBOB 2016 reached a better target within the budget, divided by the aRT of the best algorithm from BBOB 2016 to reach ft. Line: geometric mean. Box-Whisker error bar: 25-75%-ile with median (box), 10-90%-ile (caps), and minimum and maximum aRT loss ratio (points). The vertical line gives the maximal number of function evaluations in a single trial in this function subset. See also the following figure for results on each function subgroup. ??COCOVERSION??
##bbobloglossfigurecaptionbiobjextfixed## aRT loss ratios (see the previous figure for details).
Each cross (+) represents a single function, the line is the geometric mean.
###


File translated from TEX by TTH, version 4.08.
On 30 Mar 2017, 15:15.