Comic Translate

Intro

Many automatic manga translators exist, but very few properly support comics of other kinds in other languages. This project was created to use state-of-the-art (SOTA) large language models (LLMs) like GPT-4 to translate comics from all over the world. Currently, it supports translating to and from English, Korean, Japanese, French, Simplified Chinese, Traditional Chinese, Russian, German, Dutch, Spanish and Italian. It can translate to (but not from) Turkish, Polish, Portuguese and Brazilian Portuguese.

The State of Machine Translation

For a couple dozen languages, the best machine translator is not Google Translate, Papago or even DeepL, but a SOTA LLM like GPT-4o, and by a wide margin. This is most apparent for distant language pairs (Korean<->English, Japanese<->English, etc.), where other translators still often devolve into gibberish.

Excerpt from "The Walking Practice" (보행 연습) by Dolki Min (돌기민)

Comic Samples

GPT-4 as Translator. Note: Some of these also have Official English Translations

The Wretched of the High Seas

Journey to the West

The Wormworld Saga

Frieren: Beyond Journey's End

Days of Sand

Player (OH Hyeon-Jun)

Carbon & Silicon

Installation

Python

Install Python 3.12. Tick "Add python.exe to PATH" during the setup.
```
https://www.python.org/downloads/
```
Install git
```
https://git-scm.com/
```
Install uv
```
https://docs.astral.sh/uv/getting-started/installation/
```
Then, in the command line
```
git clone https://github.com/ogkalu2/comic-translate
cd comic-translate
uv init --python 3.12
```
and install the requirements
```
uv add -r requirements.txt --compile-bytecode
```
To Update, run this in the comic-translate folder
```
git pull
# Only run the next line if you did not use uv for the first-time installation
uv init --python 3.12
uv add -r requirements.txt --compile-bytecode
```
If you have an NVIDIA GPU, then it is recommended to run
```
uv remove torch torchvision
uv pip install torch torchvision --index-url https://download.pytorch.org/whl/cu126
```
Note: The 126 in cu126 represents the CUDA version, 12.6. Replace 126 with your CUDA version (or the version closest to yours), e.g. 118 if you are running CUDA 11.8.
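
The naming rule in the note above can be sketched as a small helper. This is only an illustration: `pytorch_index_url` is a hypothetical function, not part of this project, and it does not check that PyTorch actually publishes wheels for the version you pass in.

```python
def pytorch_index_url(cuda_version: str) -> str:
    """Build the PyTorch wheel index URL for a CUDA version such as '12.6'.

    Hypothetical helper: it only applies the 'cu' + major + minor naming
    rule and does not verify that the resulting index exists.
    """
    major, minor = cuda_version.strip().split(".")[:2]
    return f"https://download.pytorch.org/whl/cu{major}{minor}"


# CUDA 12.6 maps to the cu126 index used in the command above.
print(pytorch_index_url("12.6"))
# CUDA 11.8 maps to cu118.
print(pytorch_index_url("11.8"))
```

Pass the printed URL to `--index-url` in the `uv pip install` command.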

Usage

In the comic-translate directory, run
```
uv run comic.py
```
This will launch the GUI

Tips
- If you have a CBR file, you'll need to install WinRAR or 7-Zip, then add the folder it's installed in (e.g. "C:\Program Files\WinRAR" on Windows) to PATH. If it's installed but not on PATH, you may get the error:
```
raise RarCannotExec("Cannot find working tool")
```
In that case, follow the instructions for adding a folder to PATH on Windows, Linux or Mac.
- Make sure the selected Font supports characters of the target language
- v2.0 introduces a Manual Mode. When you run into issues with Automatic Mode (No text detected, Incorrect OCR, Insufficient Cleaning etc), you are now able to make corrections. Simply Undo the Image and toggle Manual Mode.
- In Automatic Mode, once an image has been processed, it is loaded in the viewer or stored to be loaded on switch, so you can keep reading in the app while the other images are being translated.
- Ctrl + Mouse Wheel to zoom; otherwise the wheel scrolls vertically
- The Usual Trackpad Gestures work for viewing the Image
- Right, Left Keys to Navigate Between Images
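
For the CBR tip above, a quick way to see whether an extraction tool is reachable through PATH is sketched below. The candidate executable names are assumptions and may differ on your system; `find_rar_tool` is a hypothetical helper, not part of the app.

```python
import shutil

# Executable names to look for. These are assumptions; adjust for your install.
CANDIDATES = ("unrar", "7z", "7zz")


def find_rar_tool(candidates=CANDIDATES):
    """Return the full path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


tool = find_rar_tool()
if tool is None:
    print("No extraction tool found on PATH; CBR files will likely fail to open.")
else:
    print(f"Found extraction tool: {tool}")
```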

API Keys

The following selections require access to closed resources and, consequently, API keys:
- GPT-4o or 4o-mini for Translation (Paid, about $0.01 USD/Page for 4o)
- DeepL Translator (Free for 500,000 characters/month)
- GPT-4o for OCR (Default Option for French, Russian, German, Dutch, Spanish, Italian) (Paid, about $0.02 USD/Page)
- Microsoft Azure Vision for OCR (Free for 5000 images/month)
- Google Cloud Vision for OCR (Free for 1000 images/month)

You can set your API Keys by going to Settings > Credentials
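
The approximate per-page prices quoted in the list above make a rough budget easy to work out. The sketch below only multiplies those approximate figures; actual costs depend on page content and current provider pricing, and `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Approximate per-page prices taken from the list above (USD).
TRANSLATION_PER_PAGE = Decimal("0.01")  # GPT-4o for translation
OCR_PER_PAGE = Decimal("0.02")          # GPT-4o for OCR


def estimate_cost(pages: int, use_gpt_ocr: bool = False) -> Decimal:
    """Rough cost estimate for translating `pages` pages with GPT-4o."""
    per_page = TRANSLATION_PER_PAGE
    if use_gpt_ocr:
        per_page += OCR_PER_PAGE
    return per_page * pages


# A 200-page volume: translation only, then translation plus GPT-4o OCR.
print(estimate_cost(200))
print(estimate_cost(200, use_gpt_ocr=True))
```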
Getting API Keys

Open AI (GPT)
- Go to OpenAI's Platform website at platform.openai.com and sign in with (or create) an OpenAI account.
- Hover your Mouse over the right taskbar of the page and select "API Keys."
- Click "Create New Secret Key" to generate a new API key. Copy and store it.

Google Cloud Vision
- Sign in/Create a Google Cloud account. Go to Cloud Resource Manager and click "Create Project". Set your project name.
- Select your project, then select "Billing" and "Create Account". In the pop-up, choose "Enable billing account" and accept the offer of a free trial account. Your "Account type" should be individual. Fill in a valid credit card.
- Enable Google Cloud Vision for your project.
- In the Google Cloud Credentials page, click "Create Credentials" then API Key. Copy and store it.

How it works

Speech Bubble Detection and Text Segmentation

Two YOLOv8m models, speech-bubble-detector and text-segmenter, trained on 8k and 3k images of comics (manga, webtoons, Western), respectively.

OCR

By Default:
- doctr for English, French, German, Dutch, Spanish and Italian.
- manga-ocr for Japanese
- Pororo for Korean
- PaddleOCR for Chinese
- GPT-4o for Russian. Paid, Requires an API Key.

Optional:

These can be used for any of the supported languages. An API Key is required.
- Google Cloud Vision
- Microsoft Azure Vision
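
The default and optional OCR choices listed above can be summarized as a lookup table. This is a sketch of the selection logic as described in this section, not the project's actual code; `DEFAULT_OCR`, `API_KEY_ENGINES` and `pick_ocr` are hypothetical names.

```python
# Default OCR engine per source language, as listed above.
DEFAULT_OCR = {
    "English": "doctr",
    "French": "doctr",
    "German": "doctr",
    "Dutch": "doctr",
    "Spanish": "doctr",
    "Italian": "doctr",
    "Japanese": "manga-ocr",
    "Korean": "Pororo",
    "Chinese": "PaddleOCR",
    "Russian": "GPT-4o",
}

# Engines that need an API key according to this section.
API_KEY_ENGINES = {"GPT-4o", "Google Cloud Vision", "Microsoft Azure Vision"}


def pick_ocr(language, override=None):
    """Return (engine, needs_api_key) for a source language.

    `override` stands for one of the optional engines chosen by the user.
    """
    engine = override or DEFAULT_OCR[language]
    return engine, engine in API_KEY_ENGINES


print(pick_ocr("Japanese"))
print(pick_ocr("Russian"))
print(pick_ocr("Korean", override="Google Cloud Vision"))
```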

Inpainting

A manga/anime fine-tuned LaMa checkpoint removes the text detected by the segmenter. Implementation courtesy of lama-cleaner.

Translation

Currently, this supports using GPT-4o, GPT-4o mini, DeepL, Claude-3-Opus, Claude-3.5-Sonnet, Claude-3-Haiku, Gemini-2.5-Flash, Gemini-2.5-Pro, Yandex, Google Translate and Microsoft Translator.

All LLMs are fed the entire page text to aid translations. There is also the Option to provide the Image itself for further context.
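
To illustrate what feeding the entire page text might look like, here is a minimal sketch that packs every text block on a page into one prompt. The prompt wording, the `block_N` keys and the `build_page_prompt` function are assumptions made for illustration, not the project's actual prompt or code.

```python
import json


def build_page_prompt(blocks, source_lang, target_lang):
    """Combine all text blocks of one page into a single translation prompt.

    Sending the blocks together lets the model use the rest of the page
    as context when translating each one.
    """
    # Key each block so the translated output can be matched back to it.
    payload = {f"block_{i}": text for i, text in enumerate(blocks)}
    instructions = (
        f"Translate the following comic page text from {source_lang} "
        f"to {target_lang}. Keep the same JSON keys and translate only the values."
    )
    return instructions + "\n" + json.dumps(payload, ensure_ascii=False, indent=2)


page_blocks = ["おはよう。", "今日はどこへ行く?"]
print(build_page_prompt(page_blocks, "Japanese", "English"))
```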

Text Rendering

Text is wrapped inside bounding boxes obtained from the detected bubbles and text.
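
One simple way to wrap text into a box is sketched below: try font sizes from large to small and keep the first one where the wrapped lines fit. This is not the project's renderer. It assumes a crude fixed-width font model (`char_ratio` and `line_ratio` are made-up metrics), whereas a real renderer would measure text with the actual font.

```python
import textwrap


def fit_text(text, box_w, box_h, max_size=40, min_size=8,
             char_ratio=0.6, line_ratio=1.2):
    """Return (font_size, lines) for the largest size that fits the box.

    Assumes each character is char_ratio * font_size wide and each line is
    line_ratio * font_size tall. Falls back to min_size if nothing fits.
    """
    for size in range(max_size, min_size - 1, -1):
        chars_per_line = int(box_w // (size * char_ratio))
        if chars_per_line < 1:
            continue  # the box is narrower than one character at this size
        lines = textwrap.wrap(text, width=chars_per_line)
        if len(lines) * size * line_ratio <= box_h:
            return size, lines
    # Nothing fit: wrap at the smallest size and let the text overflow.
    width = max(1, int(box_w // (min_size * char_ratio)))
    return min_size, textwrap.wrap(text, width=width)


size, lines = fit_text("I will not lose to you!", box_w=120, box_h=80)
print(size)
for line in lines:
    print(line)
```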

Acknowledgements
