loose coupling transcription and translation steps · Issue #59 · zh-plus/openlrc · GitHub
loose coupling transcription and translation steps #59
Open
@MaleicAcid

Description

The transcription and translation steps currently appear to be tightly coupled. For example, the subtitle file produced by transcription is handled inside the translation step, even when translation is skipped:

```python
# file: openlrc/openlrc.py
def process_translation(base_name, target_lang, transcribed_opt_sub, skip_trans):
    ...
    if skip_trans:
        # Even with translation skipped, this step copies and renames the transcribed file.
        shutil.copy(transcribed_opt_sub.filename, final_json_path)
        transcribed_opt_sub.filename = final_json_path
        return transcribed_opt_sub
    ...
```

The final subtitle files are then generated in the translation worker:

```python
def translation_worker(self, transcription_queue, target_lang, skip_trans, bilingual_sub):
        ...
        # Handle translation
        final_subtitle = process_translation(base_name, target_lang, transcribed_opt_sub, skip_trans)

        # Generate and move subtitle files
        generate_subtitle_files(final_subtitle, base_name, subtitle_format)
        ...
```

This seems to violate the single-responsibility principle (SRP).
In addition, even when skip_trans=True is specified, the translation thread is still started. Users pay the extra performance overhead even though they are not using translation.

I wish we could decouple the two steps of transcription and translation:
- The translation step no longer processes transcribed files.
- The translation thread is no longer started when skip_trans=True is specified.
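
A minimal sketch of what this decoupling could look like. It is only an illustration of the idea, not openlrc's actual code: `transcribe`, `translate`, `write_subtitle`, and `run` are hypothetical stand-ins, and the real pipeline would pass subtitle objects rather than strings.

```python
import queue
import threading


def transcribe(path):
    # Hypothetical stand-in for the transcription step.
    return f"transcript of {path}"


def translate(text, target_lang):
    # Hypothetical stand-in for the translation step.
    return f"[{target_lang}] {text}"


def write_subtitle(text):
    # Hypothetical stand-in for subtitle generation; it belongs to
    # neither the transcription step nor the translation step.
    return f"subtitle<{text}>"


def run(paths, target_lang, skip_trans):
    """Return (generated subtitles, number of translation threads started)."""
    if skip_trans:
        # No queue and no worker thread: transcription output goes
        # straight to subtitle generation.
        final = [transcribe(p) for p in paths]
        return [write_subtitle(t) for t in final], 0

    q = queue.Queue()
    translated = []

    def worker():
        # Consume transcripts until the None sentinel arrives.
        while True:
            item = q.get()
            if item is None:
                break
            translated.append(translate(item, target_lang))

    t = threading.Thread(target=worker)
    t.start()
    for p in paths:
        # Translation of earlier files overlaps with transcription of later ones.
        q.put(transcribe(p))
    q.put(None)
    t.join()
    return [write_subtitle(x) for x in translated], 1
```

With this shape, the translation worker only translates, subtitle generation is a separate final step, and the `skip_trans=True` path never creates a thread.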

I am not familiar with NLP, but if you agree, I can try to implement this improvement.
