Loosely couple the transcription and translation steps

The transcription and translation steps currently appear to be fairly tightly coupled. The subtitles produced by transcription are processed inside the translation step, even when translation is skipped:

# file: openlrc/openlrc.py
def process_translation(base_name, target_lang, transcribed_opt_sub, skip_trans):
    ...
    if skip_trans:
        shutil.copy(transcribed_opt_sub.filename, final_json_path)
        transcribed_opt_sub.filename = final_json_path
        return transcribed_opt_sub
    ...

The final subtitle files are then generated in the translation worker as well:

def translation_worker(self, transcription_queue, target_lang, skip_trans, bilingual_sub):
    ...
    # Handle translation
    final_subtitle = process_translation(base_name, target_lang, transcribed_opt_sub, skip_trans)

    # Generate and move subtitle files
    generate_subtitle_files(final_subtitle, base_name, subtitle_format)
    ...

This seems to violate the single-responsibility principle (SRP): the translation code also handles untranslated subtitles and writes the output files.
In addition, even when skip_trans=True is specified, the translation thread is still started, so users pay the overhead of a thread they are not using.

I would like to decouple the transcription and translation steps:
- The translation step no longer processes transcribed files when translation is skipped.
- The translation thread is no longer started when skip_trans=True is specified.
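
To make the proposal concrete, here is a minimal sketch of how the two points above could look. It is not openlrc's actual code: `translate`, `finalize`, `run_pipeline`, and the dict-based subtitles are hypothetical stand-ins I made up for illustration, and only the names `translation_worker`, `transcription_queue`, `target_lang`, and `skip_trans` come from the snippets above.

```python
import queue
import threading

_SENTINEL = None  # hypothetical marker for "no more transcriptions"


def translate(subtitle, target_lang):
    """Hypothetical stand-in for the real translation call."""
    return {**subtitle, "lang": target_lang}


def finalize(subtitle, outputs):
    """Hypothetical stand-in for writing the final subtitle files."""
    outputs.append(subtitle)


def translation_worker(transcription_queue, target_lang, outputs):
    """Only translates; it never sees subtitles that skip translation."""
    while True:
        item = transcription_queue.get()
        if item is _SENTINEL:
            break
        finalize(translate(item, target_lang), outputs)


def run_pipeline(transcribed, target_lang, skip_trans):
    """Return (outputs, thread_started) for a list of transcribed subtitles."""
    outputs = []
    if skip_trans:
        # No translation requested: finalize directly, no thread is created.
        for subtitle in transcribed:
            finalize(subtitle, outputs)
        return outputs, False

    # Translation requested: only now pay for the queue and the worker thread.
    transcription_queue = queue.Queue()
    worker = threading.Thread(
        target=translation_worker,
        args=(transcription_queue, target_lang, outputs),
    )
    worker.start()
    for subtitle in transcribed:
        transcription_queue.put(subtitle)
    transcription_queue.put(_SENTINEL)
    worker.join()
    return outputs, True
```

With this split, the skip_trans branch lives in the caller rather than inside the translation code, so the worker has a single job and the output step can be reused by both paths.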

I am not familiar with NLP, but if you agree, I could try to implement this improvement.
