Sanitizr is a modular URL-cleaning library and CLI tool that removes tracking parameters and decodes redirect URLs.
- 🧹 Clean URLs by removing tracking parameters
- 🔄 Decode redirect URLs (Google, Facebook, etc.)
- ⚙️ Customizable parameter whitelisting/blacklisting
- 🧰 Supports both Python API and CLI usage
- 📋 Process URLs from clipboard, files, or standard input
- 🔧 Configurable via JSON or YAML files
You can install Sanitizr from PyPI:
pip install sanitizr
For development setup:
pip install -e ".[dev]"
# Clean a single URL
sanitizr -u "https://example.com?id=123&utm_source=newsletter"
# Clean URLs from a file
sanitizr -i urls.txt -o cleaned_urls.txt
# Clean URLs from stdin
cat urls.txt | sanitizr > cleaned_urls.txt
# Use verbose output to see the changes
sanitizr -u "https://example.com?id=123&utm_source=newsletter" -v
from sanitizr.sanitizr import URLCleaner
cleaner = URLCleaner()
clean_url = cleaner.clean_url("https://example.com?id=123&utm_source=newsletter")
print(clean_url) # https://example.com?id=123
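To make the idea of "removing tracking parameters" concrete, here is a minimal standard-library sketch of the general technique. It is not Sanitizr's implementation: `strip_tracking`, `TRACKING_PREFIXES`, and `TRACKING_NAMES` are hypothetical names chosen for illustration, and the parameter list is an assumption rather than the library's built-in defaults.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical defaults for illustration only; not Sanitizr's built-in list.
TRACKING_PREFIXES = ("utm_",)
TRACKING_NAMES = {"fbclid", "gclid"}


def strip_tracking(url: str) -> str:
    """Return url with query parameters that look like trackers removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in TRACKING_NAMES and not name.startswith(TRACKING_PREFIXES)
    ]
    # Rebuild the URL with only the parameters that survived the filter.
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(strip_tracking("https://example.com?id=123&utm_source=newsletter"))
# https://example.com?id=123
```

Parsing the query into name/value pairs and rebuilding it avoids the edge cases that string or regex replacement tends to miss, such as a tracker appearing as the first parameter.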
Sanitizr can be configured via JSON or YAML files:
# config.yaml
tracking_params:
  - custom_tracker
  - another_tracker
redirect_params:
  custom.com:
    - redirect
    - goto
whitelist_params:
  - keep_this_param
blacklist_params:
  - remove_this_param
Use the configuration with the --config option:
sanitizr -u "https://example.com?id=123&custom_tracker=abc" --config config.yaml
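The `redirect_params` section in the configuration above pairs a host with a list of query parameter names. Assuming those names identify the parameters that carry the real destination URL, redirect decoding could be sketched as follows. This is an illustration under that assumption, not Sanitizr's actual code; `decode_redirect` and `REDIRECT_PARAMS` are hypothetical names.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical mapping in the same shape as the redirect_params section:
# host -> query parameters assumed to carry the real destination.
REDIRECT_PARAMS = {"custom.com": ["redirect", "goto"]}


def decode_redirect(url: str, redirect_params: dict[str, list[str]]) -> str:
    """Return the embedded destination if url is a known redirect, else url."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    for name in redirect_params.get(parts.hostname or "", []):
        values = query.get(name)
        # Only accept values that look like absolute web URLs.
        if values and values[0].startswith(("http://", "https://")):
            return values[0]  # parse_qs has already percent-decoded the value
    return url


wrapped = "https://custom.com/out?goto=https%3A%2F%2Fexample.com%2Fpage%3Fid%3D123"
print(decode_redirect(wrapped, REDIRECT_PARAMS))
# https://example.com/page?id=123
```

URLs whose host is not listed, or whose listed parameters are absent, are returned unchanged, so the function is safe to apply to every input.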
Sanitizr is licensed under the GNU General Public License v3.0 or later. See the LICENSE file for details.