# Transcription Video **transcription** consists in transcribing the audio content of a video to a text. > This process might be called __Automatic Speech Recognition__ or __Speech to Text__ in more general context. Provide a common API to many transcription backend, currently: - `openai-whisper` CLI - `faster-whisper` (*via* `whisper-ctranslate2` CLI) > Potential candidates could be: whisper-cpp, vosk, ... ## Requirements - Python 3 - PIP And at least one of the following transcription backend: - Python: - `openai-whisper` - `whisper-ctranslate2>=0.4.3` ## Usage Create a transcriber manually: ```typescript import { OpenaiTranscriber } from '@peertube/peertube-transcription' (async () => { // Optional if you want to use a local installation of transcribe engines const binDirectory = 'local/pip/path/bin' // Create a transcriber powered by OpenAI Whisper CLI const transcriber = new OpenaiTranscriber({ name: 'openai-whisper', command: 'whisper', languageDetection: true, binDirectory }); // If not installed globally, install the transcriber engine (use pip under the hood) await transcriber.install('local/pip/path') // Transcribe const transcriptFile = await transcriber.transcribe({ mediaFilePath: './myVideo.mp4', model: 'tiny', format: 'txt' }); console.log(transcriptFile.path); console.log(await transcriptFile.read()); })(); ``` Using a local model file: ```typescript import { WhisperBuiltinModel } from '@peertube/peertube-transcription/dist' const transcriptFile = await transcriber.transcribe({ mediaFilePath: './myVideo.mp4', model: await WhisperBuiltinModel.fromPath('./models/large.pt'), format: 'txt' }); ``` You may use the builtin Factory if you're happy with the default configuration: ```Typescript import { transcriberFactory } from '@peertube/peertube-transcription' transcriberFactory.createFromEngineName({ engineName: transcriberName, logger: compatibleWinstonLogger, transcriptDirectory: '/tmp/transcription' }) ``` > For further usage [../tests/src/transcription/whisper/transcriber/openai-transcriber.spec.ts](../tests/src/transcription/whisper/transcriber/openai-transcriber.spec.ts) ## Lexicon - ONNX: Open Neural Network eXchange. A specification, the ONNX Runtime run these models. - GPTs: Generative Pre-Trained Transformers - LLM: Large Language Models - NLP: Natural Language Processing - MLP: Multilayer Perceptron - ASR: Automatic Speech Recognition - WER: Word Error Rate - CER: Character Error Rate