
31 Commits (3b5b050a98e0a79a2b1f75658278e16b75dbddfb)

Author SHA1 Message Date
Chocobozzz b66963fe6f
Runner can choose job type 2024-06-28 08:44:59 +02:00
Chocobozzz 1bfb791e05
Integrate transcription in PeerTube 2024-06-28 08:44:58 +02:00
lutangar ef14cf4a5c
feat(transcription): groundwork
chore: fiddling around some more

chore: add ctranslate2 and timestamped

chore: add performance markers

chore: refactor test

chore: change workflow name

chore: ensure Python3

chore(duration): convert to chai/mocha syntax

chore(transcription): add individual tests for other transcribers

chore(transcription): implement formats test of all implementations

Also compare result of other implementation to the reference implementation

chore(transcription): add more test cases with other languages, model sizes and a local model

chore(test): wip ctranslate 2 adapat

chore(transcription): wip transcript file and benchmark

chore(test): clean a bit

chore(test): clean a bit

chore(test): refactor timestamped spec

chore(test): update workflow

chore(test): fix glob expansion with sh

chore(test): extract some hw info

chore(test): fix async tests

chore(benchmark): add model info

feat(transcription): allow use of a local model in timestamped-whisper

feat(transcription): extract run and profiling info in own value object

feat(transcription): extract run concept in own class and run more bench

chore(transcription): simplify run object, only a uuid is now needed, and add more benchmark scenarios

docs(transcription): creates own package readme

docs(transcription): add local model usage

docs(transcription): update README

fix(transcription): use fr video for better comparison

chore(transcription): make openai comparison pass

docs(timestamped): clea

chore(transcription): change transcribers transcribe method signature

Introduce whisper builtin model.

fix(transcription): activate language detection

Forbid transcript creation without a language.
Add `languageDetection` flag to an engine and some assertions.

Fix an issue in `whisper-ctranslate2` :
https://github.com/Softcatala/whisper-ctranslate2/pull/93

chore(transcription): use PeerTube time helpers instead of custom ones

Update existing time function to output an integer number of seconds and add a ms human-readable time formatter with hints of tests.

chore(transcription): use PeerTube UUID helpers

chore(transcription): enable CER evaluation

Thanks to this recent fix in Jiwer <3
https://github.com/jitsi/jiwer/issues/873

chore(jiwer): creates JiWer package

I'm not very happy with the TranscriptFileEvaluator constructor... suggestions?

chore(JiWer): add usage in README

docs(jiwer): update JiWer readme

chore(transcription): use FunMOOC video in fixtures

chore(transcription): add proper english video fixture

chore(transcription): use os tmp directory where relevant

chore(transcription): fix jiwer cli test reference.txt

chore(transcription): move benchmark out of tests

chore(transcription): remove transcription workflow

docs(transcription): add benchmark info

fix(transcription): use ms precision in other transcribers

chore(transcription): simplify most of the tests

chore(transcription): remove slashes when building path with join

chore(transcription): make fromPath method async

chore(transcription): assert path to model is a directory for CTranslate2 transcriber

chore(transcription): ctranslate2 assertion

chore(transcription): ctranslate2 assertion

chore(transcription): add preinstall script for Python dependencies

chore(transcription): add download and unzip utils functions

chore(transcription): add download and unzip utils functions

chore(transcription): download & unzip models fixtures

chore(transcription): zip

chore(transcription): raise download file test timeout

chore(transcription): simplify download file test

chore(transcription): add transcriptions test to CI

chore(transcription): raise test preconditions timeout

chore(transcription): run preinstall scripts before running ci

chore(transcription): create dedicated tmp folder for transcriber tests

chore(transcription): raise timeout some more

chore(transcription): raise timeout some more

chore(transcription): raise timeout some more

chore(transcription): raise timeout some more

chore(transcription): raise timeout some more

chore(transcription): raise timeout some more

chore(transcription): raise timeout some more

chore(transcription): raise timeout some more

chore(transcription): use short video for local model test

chore(transcription): raise timeout some more

chore(transcription): raise timeout some more

chore(transcription): raise timeout some more

chore(transcription): setup verbosity based on NODE_ENV value
2024-06-28 08:43:40 +02:00
Chocobozzz 4f4d3adf73
Update apps dependencies 2024-06-21 14:49:17 +02:00
Chocobozzz 29329d6c45 Implement auto tag on comments and videos
 * Comments and videos can be automatically tagged using core rules or
   watched word lists
 * These tags can be used to automatically filter videos and comments
 * Introduce a new video comment policy where comments must be approved
   first
 * Comments may have to be approved if the user auto-blocks them using
   core rules or watched word lists
 * Implement FEP-5624 to federate reply control policies
2024-05-29 15:03:14 +02:00
Chocobozzz 3bfecf4890
Update runner version 2024-04-04 16:33:06 +02:00
Chocobozzz 93b09bf891
Fix stuck runner 2024-04-04 16:31:01 +02:00
Chocobozzz faabe996ba
Update runner version 2024-04-03 14:25:14 +02:00
Chocobozzz 17fb4fd6d0
Update runner version 2024-04-03 09:17:58 +02:00
Chocobozzz 0794fe2ac1
Fix runner ffmpeg logger 2024-04-03 09:17:45 +02:00
Chocobozzz 121efedde2
Update peertube-runner version 2024-03-29 15:04:44 +01:00
Chocobozzz 33607e3268
Add ping debug in peertube-runner 2024-03-29 15:04:03 +01:00
Chocobozzz 128748e6e4
Update peertube-runner version 2024-03-19 09:59:47 +01:00
Chocobozzz c09e27d77a
Optimize transcoding profile building 2024-03-19 09:53:59 +01:00
Chocobozzz 4e98d843da
Success on update "not in processing state" error
Or the job is never "ended"
2024-03-19 09:26:40 +01:00
Chocobozzz c727a34cb6
Prevent aborting another live session 2024-03-18 16:09:22 +01:00
Chocobozzz 1a8b20ba30
Less verbose on expected error 2024-03-18 11:28:43 +01:00
Chocobozzz 54a7183b11
Check available jobs on reconnection 2024-03-08 10:29:41 +01:00
Chocobozzz 473840e890
Update apps versions 2024-02-23 15:49:09 +01:00
Chocobozzz ed77d65699
Upgrade peertube runner dependencies 2024-02-23 15:26:08 +01:00
Chocobozzz 1e17dece73
Update peertube-cli dependencies 2024-02-23 15:23:53 +01:00
Chocobozzz 9e2700b89d
Fix lint 2024-02-22 10:32:28 +01:00
Chocobozzz 780f17f116
Fix upload script preview short option 2024-02-22 10:07:03 +01:00
Chocobozzz 88006beeb3
Fix peertube-runner with node >= 20.11
See https://github.com/Chocobozzz/PeerTube/issues/6171
2024-01-19 10:53:57 +01:00
Chocobozzz f0b8938a80
Update peertube runner version 2023-12-20 15:30:14 +01:00
Henri BAUDESSON 83f8ea5c14 Runner download videoFileUrl follow redirect
The runner downloads the video file from the URL set in the payload
of a transcoding job. This URL points to our API, and the runner
makes a POST request to it with a jobToken and a runnerToken.
This ensures we can verify the tokens before returning the video file.
But returning the video file also means that we use server
resources to serve the file. If the runner is able to follow
redirects, we can do our usual verification and return a redirect
response to the URL of the video file, which the runner then downloads
using its own resources.
2023-12-20 15:26:43 +01:00
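
The redirect handling described in the commit above can be illustrated with a small sketch. This is not the runner's actual code: `resolveRedirect` and the example URLs are hypothetical, and the sketch only shows how a client might decide whether a response is a redirect and which URL to download from next.

```typescript
// Hypothetical helper, for illustration only (not taken from peertube-runner).
// Status codes that conventionally signal an HTTP redirect.
const REDIRECT_STATUSES = [ 301, 302, 303, 307, 308 ]

// Returns the absolute URL to follow, or undefined when the response
// is not a redirect (the body of the response is then the file itself).
function resolveRedirect (status: number, location: string | undefined, requestUrl: string): string | undefined {
  if (REDIRECT_STATUSES.indexOf(status) === -1 || !location) return undefined

  // The Location header may be relative, so resolve it against the request URL.
  return new URL(location, requestUrl).toString()
}
```

With such a helper, the server only has to verify the tokens and answer with a redirect; the transfer of the file itself is then made by the runner against the resolved URL.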
Chocobozzz bda1d751a5
Add warning for web_videos directory name 2023-11-29 09:28:12 +01:00
Chocobozzz 6b44f0b03c
Publish new version of peertube runner 2023-08-28 17:50:24 +02:00
Chocobozzz 6a85ec0480
Also handle SIGTERM to cleanup jobs 2023-08-28 16:52:08 +02:00
Chocobozzz 276f5fa24f
Fix peertube runner build 2023-08-18 07:54:31 +02:00
Chocobozzz 3a4992633e
Migrate server to ESM
Sorry for the very big commit that may lead to git log issues and merge
conflicts, but it's a major step forward:

 * Server can be faster at startup because import() calls are async and we
   can easily lazy import big modules
 * Angular doesn't seem to support ES import (with .js extension), so we
   had to correctly organize peertube into a monorepo:
    * Use yarn workspace feature
    * Use typescript reference projects for dependencies
    * Shared projects have been moved into "packages", each one is now a
      node module (with a dedicated package.json/tsconfig.json)
    * server/tools has been moved into apps/ and is now a dedicated app
      bundled and published on NPM so users don't have to build peertube
      cli tools manually
    * server/tests have been moved into packages/ so we don't compile
      them every time we want to run the server
 * Use isolatedModules option:
   * Had to move from const enum to const
     (https://www.typescriptlang.org/docs/handbook/enums.html#objects-vs-enums)
   * Had to explicitly specify "type" imports when used in decorators
 * Prefer tsx (that uses esbuild under the hood) instead of ts-node to
   load typescript files (tests with mocha or scripts):
     * To reduce test complexity as esbuild doesn't support decorator
       metadata, we only test server files that do not import server
       models
     * We still build tests files into js files for a faster CI
 * Remove unmaintained peertube CLI import script
 * Removed some barrels to speed up execution (fewer imports)
2023-08-11 15:02:33 +02:00
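
The "const enum to const" change mentioned in the ESM migration commit can be sketched as follows, along the lines of the TypeScript handbook section it links to. The names `JobState`, `JobStateType` and `isFinished` are hypothetical and not taken from the PeerTube codebase; the assumption is that the motivation is per-file transpilation, where a `const enum` declared in another file cannot be inlined.

```typescript
// Hypothetical names, for illustration only.
// Before: const enum JobState { PENDING = 1, PROCESSING = 2, COMPLETED = 3 }

// After: a plain object frozen at the type level with "as const",
// which exists at runtime and needs no cross-file inlining.
const JobState = {
  PENDING: 1,
  PROCESSING: 2,
  COMPLETED: 3
} as const

// Union of the object's values: 1 | 2 | 3
type JobStateType = typeof JobState[keyof typeof JobState]

function isFinished (state: JobStateType): boolean {
  return state === JobState.COMPLETED
}
```

Call sites keep the same shape as with an enum, for example `isFinished(JobState.PENDING)`, while the type union still rejects arbitrary numbers at compile time.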