A Windows console application that reports file usage by grouping files by extension. For each extension, it counts the files and calculates their cumulative size.
- Lists files in a directory and groups them by their file extensions.
- Displays the total number of files per extension.
- Calculates the cumulative size of files for each extension.
- Simple, fast, and lightweight console-based interface.
For efficient and modern file management, the program uses the `<filesystem>` library, introduced in C++17. Compared to legacy, platform-specific methods (e.g. `FindFirstFile` and `FindNextFile` on Windows), which were often verbose and error-prone, this library provides a standardized, cross-platform, and intuitive approach to handling files and directories.
- Cross-Platform Compatibility: The `filesystem` library works across Windows, Linux, and macOS, eliminating the need for platform-specific code.
- Ease of Use: The library provides straightforward, high-level APIs for common operations such as iterating over directories, retrieving file sizes, and filtering files by extension.
    for (const auto& entry : std::filesystem::directory_iterator(dir_path)) {
        if (entry.is_regular_file()) {
            auto size{ entry.file_size() };
        }
    }
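- Performance: A `directory_entry` can cache attributes such as file type and size that the operating system returns during directory iteration, so `filesystem` can avoid extra system calls and perform well compared to hand-written legacy code.
- Built-in Error Handling: By default, `filesystem` operations throw `std::filesystem::filesystem_error` for invalid paths, permission issues, and similar failures, which simplifies error management. Overloads that take a `std::error_code` report errors without throwing.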
To improve performance, the program uses a custom `ThreadPool` for concurrent execution. By processing files in parallel on multiple threads, it can significantly reduce execution time, especially for large directory trees containing many files (e.g. `C:\`).
- Parallel Task Execution: The `ThreadPool` manages a pool of worker threads that handle tasks concurrently, which speeds up file scanning.
- Efficient Resource Utilization: The number of threads is based on the number of logical processors reported by the hardware, maximizing CPU utilization without overwhelming the system.
size_t num_threads{ thread::hardware_concurrency() };
Usage: fileusage [--help] [-hdrsv(x regularexpression)] [folder]
switches:
h help
d reverse the order of listing
r suppress recursion
s sort by file sizes
v verbose
b benchmark
x filter with a regular expression
folder
starting folder or current folder if not specified
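As an illustration of how the `s`, `d`, and `x` switches could act on the collected totals, here is a small sketch. It rests on several assumptions that the source does not confirm: the regular expression is matched against the whole extension string, the default order is alphabetical by extension, and `d` simply reverses the final listing. `Row`, `ListingOptions`, and `arrange` are hypothetical names.

```cpp
#include <algorithm>
#include <cstdint>
#include <regex>
#include <string>
#include <vector>

// One output line of the listing (hypothetical type).
struct Row {
    std::string extension;
    std::uintmax_t count;
    std::uintmax_t bytes;
};

// Assumed mapping of switches to flags: s -> sort_by_size, d -> reverse.
struct ListingOptions {
    bool sort_by_size{ false };
    bool reverse{ false };
};

// Filter, sort, and optionally reverse the rows.
// `filter` may be null, meaning that no x switch was given.
std::vector<Row> arrange(std::vector<Row> rows,
                         const ListingOptions& opts,
                         const std::regex* filter = nullptr) {
    if (filter != nullptr) {
        // Keep only rows whose extension fully matches the expression.
        rows.erase(std::remove_if(rows.begin(), rows.end(),
                                  [&](const Row& r) {
                                      return !std::regex_match(r.extension, *filter);
                                  }),
                   rows.end());
    }

    std::stable_sort(rows.begin(), rows.end(),
                     [&](const Row& a, const Row& b) {
                         return opts.sort_by_size ? a.bytes < b.bytes
                                                  : a.extension < b.extension;
                     });

    if (opts.reverse) std::reverse(rows.begin(), rows.end());
    return rows;
}
```

Under these assumptions, an invocation such as `fileusage -sd` would correspond to `ListingOptions{ true, true }`, which lists the extensions with the largest cumulative size first.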
The following files are intentionally excluded from the repository:
- scan_directory.cpp
- ThreadPool.hpp/ThreadPool.cpp
- benchmark.cpp
These files are excluded for personal and copyright reasons. If you need these files, please contact me via my email: manhkhang0305@gmail.com