23 packages found

youtube-transcript-api

This is a Python API that lets you get the transcripts/subtitles for a given YouTube video. It also works for automatically generated subtitles, supports translating subtitles, and does not require a headless browser, unlike Selenium-based solutions.
  1. cli
  2. subtitle
  3. subtitles
  4. transcript
  5. transcripts
  6. youtube
  7. youtube-api
  8. youtube-subtitles
  9. youtube-transcripts
  10. asr
  11. captions
  12. python
  13. translating-transcripts
  14. youtube-asr
  15. youtube-captions
  16. youtube-transcript
  17. youtube-video
17 Contributors
1.0.3 · published 3 weeks ago · MIT

whisper-timestamped

Multi-lingual Automatic Speech Recognition (ASR) based on Whisper models, with accurate word timestamps, access to language detection confidence, several options for Voice Activity Detection (VAD), and more.
  1. asr
  2. attention-is-all-you-need
  3. attention-mechanism
  4. attention-model
  5. attention-network
  6. attention-seq2seq
  7. attention-visualization
  8. deep-learning
  9. machine-learning
  10. multilingual-models
  11. python
  12. python3
  13. pytorch
  14. speaker-diarization
  15. speech
  16. speech-processing
  17. speech-recognition
  18. speech-to-text
  19. transformers
  20. whisper
8 Contributors
1.15.8 · published 5 months ago · GPL-3.0

deepgram-sdk

The official Python SDK for the Deepgram automated speech recognition platform.
  1. deepgram
  2. speech-to-text
  3. asr
  4. automated-speech-recognition
  5. hacktoberfest
  6. python
  7. speech-recognition
  8. text-to-speech
  9. voice-agent
  10. voice-ai
39 Contributors
3.11.0 · published 3 days ago · MIT

paddlespeech

Speech tools and models based on PaddlePaddle
  1. SSLspeech
  2. asr
  3. tts
  4. speaker
  5. verfication
  6. speech
  7. classfication
  8. text
  9. frontend
  10. MFA
  11. paddlepaddle
  12. paddleaudio
  13. streaming
  14. beam
  15. search
  16. ctcdecoder
  17. deepspeech2
  18. wav2vec2
  19. hubert
  20. wavlm
  21. transformer
  22. conformer
  23. fastspeech2
  24. hifigan
  25. gan
  26. vocoders
  27. code-switch
  28. kws
  29. punctuation-restoration
  30. self-supervised-learning
  31. sound-classification
  32. speech-alignment
  33. speech-recognition
  34. speech-synthesis
  35. speech-translation
  36. streaming-asr
  37. streaming-tts
  38. vocoder
  39. voice-cloning
  40. voice-recognition
  41. whisper
125 Contributors
1.4.2 · published 10 months ago · Apache-2.0

sherpa-onnx

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, and VAD using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, x86_64 servers, and a websocket server/client, with support for 11 programming languages.
  1. aarch64
  2. android
  3. arm32
  4. asr
  5. cpp
  6. csharp
  7. dotnet
  8. ios
  9. lazarus
  10. linux
  11. macos
  12. mfc
  13. object-pascal
  14. onnx
  15. raspberry-pi
  16. risc-v
  17. speech-to-text
  18. text-to-speech
  19. vits
  20. windows
121 Contributors
1.11.3 · published 2 weeks ago · Apache-2.0, Sendmail

vosk

Offline open source speech recognition API based on Kaldi and Vosk
  1. android
  2. asr
  3. deep-learning
  4. deep-neural-networks
  5. deepspeech
  6. google-speech-to-text
  7. ios
  8. kaldi
  9. offline
  10. privacy
  11. python
  12. raspberry-pi
  13. speaker-identification
  14. speaker-verification
  15. speech-recognition
  16. speech-to-text
  17. speech-to-text-android
  18. stt
  19. voice-recognition
  20. vosk
40 Contributors
0.3.45 · published 2 years ago · Apache-2.0

rustfst-python

Library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). Re-implementation of OpenFst in Rust.
  1. fst
  2. openfst
  3. graph
  4. transducer
  5. acceptor
  6. shortest-path
  7. minimize
  8. determinize
  9. wfst
  10. asr
  11. automata
  12. composition
  13. finite-state-acceptors
  14. finite-state-transducers
  15. fsts
  16. kaldi
  17. kaldi-asr
  18. rust
  19. rust-crate
  20. rust-lang
  21. speech-recognition
  22. tokenizer
  23. transducers
1.1.2 · published 8 months ago · Apache-2.0

paddleaudio

Speech audio tools based on PaddlePaddle
  1. audio
  2. processpaddlepaddle
  3. asr
  4. code-switch
  5. conformer
  6. kws
  7. punctuation-restoration
  8. self-supervised-learning
  9. sound-classification
  10. speech-alignment
  11. speech-recognition
  12. speech-synthesis
  13. speech-translation
  14. streaming-asr
  15. streaming-tts
  16. transformer
  17. tts
  18. vocoder
  19. voice-cloning
  20. voice-recognition
  21. wav2vec2
  22. whisper
125 Contributors
1.1.0 · published 2 years ago · Apache-2.0

rapid-paraformer

A speech recognition tool.
  1. asr
  2. paraformer
  3. wenet
  4. paddlespeech
2.0.5 · published 11 months ago · Apache-2.0

nemo-toolkit

NeMo - a toolkit for Conversational AI
  1. NLP
  2. NeMo
  3. deep
  4. gpu
  5. language
  6. learning
  7. machine
  8. nvidia
  9. pytorch
  10. speech
  11. torch
  12. tts
  13. asr
  14. deeplearning
  15. generative-ai
  16. large-language-models
  17. machine-translation
  18. multimodal
  19. neural-networks
  20. speaker-diariazation
  21. speaker-recognition
  22. speech-synthesis
  23. speech-translation
384 Contributors
2.2.1 · published 2 weeks ago · Other
Showing 1 to 10 of 23 results