• infjarchninja@lemmy.ml
    1 day ago

    I have used Open-whisper and Fast-whisper to generate subtitles.

    Open-whisper is easy to install and set up locally. I tried various models.

    Recently I tried to subtitle the French series En Therapie (In Therapy), which has 35 short episodes.

    https://www.arte.tv/fr/videos/RC-020578/en-therapie/

    Each episode is only 20 minutes long, so I thought Open-whisper would be great for translating from French to English.

    However, it failed dismally, constantly regurgitating repeated sentences. Throughout entire episodes, Open-whisper used “him” instead of “her”, and there were many misspellings. It would also fail whenever music was playing in the background.

    I extracted the audio from the videos into small .wav and .mp3 files, but both failed.

    I spent over a week trying to create suitable subtitles to no avail.

    • eRac@lemmings.world
      1 day ago

      Recognition in general is the main thing it’s powerful for: speech-to-text, OCR, etc.