Add a threadpool and run the comparison of each file with all files after it in a different thread #71

cgkantidis · 2025-05-17T18:38:40Z

Duplo's job is [embarrassingly parallel](https://en.wikipedia.org/wiki/Embarrassingly_parallel), so a threadpool is the easiest way to reduce its runtime.
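To illustrate the idea, here is a minimal sketch, not the code in this PR. It assumes a fixed set of worker threads that pull file indices from a shared atomic counter, which is a simplified stand-in for a full threadpool. The names `compare_with_later_files` and `compare_all` are hypothetical, and the "comparison" is a plain string equality check rather than Duplo's real duplicate detection.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the real comparison: counts how many of the
// files after index i have the same contents as files[i].
std::size_t compare_with_later_files(const std::vector<std::string>& files,
                                     std::size_t i) {
    std::size_t matches = 0;
    for (std::size_t j = i + 1; j < files.size(); ++j) {
        if (files[i] == files[j]) {
            ++matches;
        }
    }
    return matches;
}

// Runs one task per file: "compare file i with all files after it".
// Tasks are independent, so workers only share the index counter.
std::vector<std::size_t> compare_all(const std::vector<std::string>& files,
                                     unsigned num_threads) {
    assert(num_threads >= 1);
    std::vector<std::size_t> results(files.size(), 0);
    std::atomic<std::size_t> next{0};

    auto worker = [&]() {
        for (;;) {
            // Claim the next unprocessed file index.
            const std::size_t i = next.fetch_add(1);
            if (i >= files.size()) {
                break;
            }
            // Each task writes to its own slot, so no lock is needed here.
            results[i] = compare_with_later_files(files, i);
        }
    };

    std::vector<std::thread> pool;
    for (unsigned t = 1; t < num_threads; ++t) {
        pool.emplace_back(worker);
    }
    // The calling thread also works, so num_threads == 1 spawns no threads.
    worker();
    for (auto& th : pool) {
        th.join();
    }
    return results;
}
```

In this sketch, calling `compare_all(files, 1)` runs everything on the calling thread, which mirrors the single-threaded case described for `-j 1`.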

This is a classic case of trading larger memory consumption for faster runtime.

Even the most basic laptops nowadays have at least 8GB of RAM, and not using all of the available memory is a waste.

For enterprise machines, which can have 100s of GB of RAM, this trade-off of memory for runtime reduction is especially appreciated.

Users can elect to run the tool with a single thread (-j 1) in order to revert to the old behavior, with no extra cost in memory.

The only synchronization between the threads is for the exporter's log messages, which each thread prints when it finishes comparing a file with all files after it.
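As a rough sketch of this kind of synchronization (an illustration under assumptions, not the exporter code in this PR), the comparison work can run without locks while a mutex is held only for the moment a finished message is recorded. `SyncLog` and `run_and_log` are hypothetical names, and the message text is invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical log sink: the mutex guarantees that whole messages are
// recorded one at a time, even when several threads finish together.
class SyncLog {
public:
    void write(const std::string& message) {
        std::lock_guard<std::mutex> lock(mutex_);
        lines_.push_back(message);
    }

    std::vector<std::string> lines() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return lines_;
    }

private:
    mutable std::mutex mutex_;
    std::vector<std::string> lines_;
};

// Each thread handles files t, t + num_threads, ... and logs once per file.
std::vector<std::string> run_and_log(std::size_t file_count,
                                     unsigned num_threads) {
    assert(num_threads >= 1);
    SyncLog log;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&log, t, file_count, num_threads] {
            for (std::size_t i = t; i < file_count; i += num_threads) {
                // The real comparison would happen here, lock-free.
                // The message is built locally, then written under the lock.
                std::ostringstream msg;
                msg << "file " << i << " compared with "
                    << (file_count - i - 1) << " later files";
                log.write(msg.str());
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return log.lines();
}
```

The order of the recorded messages depends on thread scheduling, so only their number and content are deterministic.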

Quake 2 runtime and memory scaling:

GCC 15.1.0 runtime and memory scaling:


cgkantidis · 2025-05-17T19:04:53Z

Closing for a small fix regarding the tests.
I will re-open the PR shortly.

cgkantidis closed this May 17, 2025
