A collection of tools and utilities for language learning and processing.
The words project extracts unique words from EPUB books to aid vocabulary learning.
Features:
- Extracts words from EPUB files
- Filters out short words, known words, and misspellings
- Converts words to their base form
- Outputs a clean list of unique words
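The filtering steps listed above could be sketched roughly as follows. This is an illustration only, not the project's actual code: the function name `extractVocabulary`, the `Options` fields, the minimum length of 3, the dictionary-based spell check, and the toy `naiveBaseForm` rule are all assumptions. A real implementation would likely use a proper spell checker and lemmatizer.

```typescript
// Hypothetical sketch: every name and threshold here is an assumption,
// not taken from the project's source.

type Options = {
  minLength: number;                     // words shorter than this are dropped
  knownWords: Set<string>;               // e.g. the contents of ignore_words.txt
  dictionary: Set<string>;               // stand-in for a real spell checker
  toBaseForm: (word: string) => string;  // stand-in for a real lemmatizer
};

function extractVocabulary(text: string, opts: Options): string[] {
  // Split the text into lowercase runs of letters (Unicode-aware).
  const tokens: string[] = text.toLowerCase().match(/\p{L}+/gu) || [];
  const unique = new Set<string>();

  for (const token of tokens) {
    if (token.length < opts.minLength) continue;   // filter short words
    const base = opts.toBaseForm(token);           // convert to base form
    if (!opts.dictionary.has(base)) continue;      // filter likely misspellings
    if (opts.knownWords.has(base)) continue;       // filter known words
    unique.add(base);                              // the Set keeps words unique
  }

  return Array.from(unique).sort();
}

// Toy base-form rule for the example: strip a trailing "s" from longer words.
const naiveBaseForm = (word: string): string =>
  word.length > 3 && word.endsWith("s") ? word.slice(0, -1) : word;

const words = extractVocabulary("The cats chased a mouse. Cats xqzt the mouse!", {
  minLength: 3,
  knownWords: new Set(["the"]),
  dictionary: new Set(["the", "cat", "chased", "mouse"]),
  toBaseForm: naiveBaseForm,
});
console.log(words.join("\n"));
```

In this sketch the known-word and dictionary checks run on the base form, so adding "cat" to the known words would also suppress "cats". Whether the real tool works this way is not stated in this README.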
Usage:
- Place an EPUB file as input.epub in the product/words directory
- Add known words to ignore_words.txt
- Run the script
- Get your word list in output.txt
# Install dependencies
yarn
# Navigate to the words project
cd product/words
# Run the tool
yarn start
Designed for language learners who listen to audiobooks and want to enhance their vocabulary.