A GUI application for extracting URLs from CSV files with master list management, deduplication, and configurable settings.
- Process multiple CSV files concurrently
- Extract URLs from a specified column
- Maintain a master list of previously processed URLs
- Exclude URLs using an exclude list file
- Auto-deduplicate URLs against master list and current batch
- Dark mode interface with three main sections:
  - Main: Primary processing controls
  - Statistics: Processing metrics and master list cleaning
  - Settings: Application configuration
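The exclude-list and deduplication behaviour described above can be sketched with a pair of hash sets. This is a minimal illustration, not the application's actual code: `filter_new_urls` and `FilterStats` are hypothetical names, and it assumes URLs are compared as exact strings after trimming whitespace.

```rust
use std::collections::HashSet;

/// Counts of how each candidate URL was handled (hypothetical struct).
#[derive(Debug, Default, PartialEq)]
pub struct FilterStats {
    pub unique: usize,
    pub excluded: usize,
    pub duplicate: usize,
}

/// Keep only URLs that are not excluded, not already in the master list,
/// and not seen earlier in the same batch. Kept URLs are added to `master`.
pub fn filter_new_urls(
    candidates: &[&str],
    master: &mut HashSet<String>,
    exclude: &HashSet<String>,
) -> (Vec<String>, FilterStats) {
    let mut kept = Vec::new();
    let mut stats = FilterStats::default();
    for raw in candidates {
        let url = raw.trim();
        if url.is_empty() {
            continue;
        }
        if exclude.contains(url) {
            stats.excluded += 1;
        } else if master.insert(url.to_string()) {
            // `insert` returns true only when the URL was not present yet,
            // which covers both the master list and the current batch.
            stats.unique += 1;
            kept.push(url.to_string());
        } else {
            stats.duplicate += 1;
        }
    }
    (kept, stats)
}
```

Because kept URLs go straight into `master`, a repeat later in the same batch is caught by the same lookup that catches URLs from earlier runs.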
- Timestamp Output Files: Automatically add timestamps to output filenames (e.g., output_20240216_235959.txt)
- Workers: Configure number of concurrent processing threads (1-16)
- Skip Header: Skip the first (header) row in CSV files
- Continue on Error: Keep processing if individual files fail
- Master List: Configure path to master list file for URL tracking
- Sample CSV: Set a sample CSV to automatically detect URL column headers
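As an illustration of how the Workers and Continue on Error settings could interact, here is a small standard-library sketch. The names `clamp_workers` and `process_all` are hypothetical, and the `job` closure stands in for whatever per-file processing the application performs. The real implementation may use a different threading model.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// Clamp a requested worker count to the 1-16 range offered in Settings.
pub fn clamp_workers(requested: usize) -> usize {
    requested.clamp(1, 16)
}

/// Run `job` over every input on up to `workers` threads.
/// Returns (successful outputs, error messages). When `continue_on_error`
/// is false, workers stop picking up new inputs after the first failure.
pub fn process_all<F>(
    inputs: &[String],
    workers: usize,
    continue_on_error: bool,
    job: F,
) -> (Vec<String>, Vec<String>)
where
    F: Fn(&str) -> Result<String, String> + Sync,
{
    let next = AtomicUsize::new(0); // index of the next unclaimed input
    let stop = AtomicBool::new(false); // set after a failure in strict mode
    let done = Mutex::new(Vec::new());
    let errors = Mutex::new(Vec::new());

    thread::scope(|scope| {
        for _ in 0..clamp_workers(workers) {
            scope.spawn(|| loop {
                if stop.load(Ordering::SeqCst) {
                    break;
                }
                let i = next.fetch_add(1, Ordering::SeqCst);
                let Some(input) = inputs.get(i) else { break };
                match job(input.as_str()) {
                    Ok(out) => done.lock().unwrap().push(out),
                    Err(e) => {
                        errors.lock().unwrap().push(format!("{input}: {e}"));
                        if !continue_on_error {
                            stop.store(true, Ordering::SeqCst);
                        }
                    }
                }
            });
        }
    });

    (done.into_inner().unwrap(), errors.into_inner().unwrap())
}
```

With more than one worker, the order of the successful outputs is not deterministic, so callers that need a stable order should sort the result.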
- Track total files processed
- Count total, unique, excluded and duplicate URLs
- Display processing time and last run timestamp
- Reset statistics as needed
- Clean master list to remove any duplicates
- Enhanced visualization features:
  - Interactive domain distribution chart
  - Top 10 domains bar chart with frequency analysis
  - Historical processing trends visualization
  - Detailed statistics report generation
  - Automatic www prefix removal for cleaner domain analysis
  - Charts and reports saved in 'statistics' directory:
    - domain_distribution.png: Visual breakdown of top domains
    - historical_trends.png: URL processing trends over time
    - statistics_report.md: Comprehensive statistics report
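The domain analysis behind the top-domains chart can be approximated as follows. This is a rough sketch under stated assumptions: `domain_of` and `top_domains` are hypothetical helpers, the host parsing is deliberately simple (lowercase `http://` or `https://` schemes only, no IPv6 literals), and the application itself may rely on a dedicated URL parser.

```rust
use std::collections::HashMap;

/// Extract the host from an http(s) URL, lowercased and without a leading "www.".
pub fn domain_of(url: &str) -> Option<String> {
    let rest = url
        .strip_prefix("https://")
        .or_else(|| url.strip_prefix("http://"))?;
    // The authority ends at the first path, query, or fragment delimiter.
    let authority = rest.split(|c: char| matches!(c, '/' | '?' | '#')).next()?;
    // Drop optional userinfo ("user@") and port (":8080").
    let host = authority.rsplit('@').next()?;
    let host = host.split(':').next()?.to_ascii_lowercase();
    let host = host.strip_prefix("www.").unwrap_or(&host);
    if host.is_empty() {
        None
    } else {
        Some(host.to_string())
    }
}

/// Count URLs per domain and return the `n` most frequent, highest first.
pub fn top_domains(urls: &[&str], n: usize) -> Vec<(String, usize)> {
    let mut counts: HashMap<String, usize> = HashMap::new();
    for url in urls {
        if let Some(domain) = domain_of(url) {
            *counts.entry(domain).or_insert(0) += 1;
        }
    }
    let mut ranked: Vec<(String, usize)> = counts.into_iter().collect();
    // Sort by count descending, then by name so ties are deterministic.
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    ranked.truncate(n);
    ranked
}
```

Calling `top_domains(&urls, 10)` would yield the data for a top 10 chart. Stripping `www.` first means `www.example.com` and `example.com` are counted as one domain.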
[Main Interface]
Dark theme interface with URL processing controls
[Statistics Dashboard]
Real-time processing statistics and history
[Settings Interface]
Configure application settings
- Set your processing options in Settings
- Select input directory containing CSV files
- Choose output file location
- Select URL column from detected headers
- Optional: Configure exclude file path
- Click Process to begin extraction
All settings are automatically saved between sessions.
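To make the column-selection step concrete, the sketch below looks up a header by name and pulls that column from each remaining row. It is an assumption-laden simplification: `column_index` and `extract_column` are hypothetical names, and the naive comma split does not handle quoted fields that contain commas, which a proper CSV parser would.

```rust
/// Find the position of `column` in a comma-separated header line
/// (case-insensitive, ignoring surrounding spaces and quotes).
pub fn column_index(header: &str, column: &str) -> Option<usize> {
    header
        .split(',')
        .position(|h| h.trim().trim_matches('"').eq_ignore_ascii_case(column))
}

/// Return the non-empty values of `column` from every row after the header.
/// Rows that are too short to contain the column are skipped.
pub fn extract_column(csv_text: &str, column: &str) -> Vec<String> {
    let mut lines = csv_text.lines();
    // The first line is treated as the header row.
    let Some(index) = lines.next().and_then(|h| column_index(h, column)) else {
        return Vec::new();
    };
    lines
        .filter_map(|row| row.split(',').nth(index))
        .map(|cell| cell.trim().trim_matches('"').to_string())
        .filter(|cell| !cell.is_empty())
        .collect()
}
```

For a file whose header is `name,url,notes`, `extract_column(text, "url")` would return the second field of each data row.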
cargo build --release
The compiled application will be available in target/release/export_csv_links.exe
- Windows operating system
- CSV files with consistent column headers
- URLs must be in standard HTTP/HTTPS format