Transcribe OGG(OPUS) format #351
Comments
Currently there isn't support for compressing the JavaScript SDK->Service audio stream. We have an item in our backlog for it, but it hasn't been scheduled.
Hi @rhurey, we are also interested in another missing codec: MULAW. Is it more complicated than I just described? Thanks.
This would be a huge win to reduce network usage for the JS SDK, which is often used client-side. Any estimate of when it can be expected? 🙏🏼
For your information, there is a workaround that uses a private API of the SDK. Sample code:

    import * as sdk from 'microsoft-cognitiveservices-speech-sdk';
    import {
      AudioStreamFormatImpl,
      AudioFormatTag
    } from 'microsoft-cognitiveservices-speech-sdk/distrib/lib/src/sdk/Audio/AudioStreamFormat';

    function startRecognition() {
      // Note: AudioStreamFormatImpl is private (not exposed by the Azure SDK's public API).
      // Arguments: 8000 Hz sample rate, 8 bits per sample, 1 channel, MuLaw encoding.
      const audioFormat = new AudioStreamFormatImpl(8000, 8, 1, AudioFormatTag.MuLaw);
      const pushStream = sdk.AudioInputStream.createPushStream(audioFormat);
      const audioConfig = sdk.AudioConfig.fromStreamInput(pushStream);
      const speechConfig = sdk.SpeechConfig.fromSubscription(/*...*/);
      const recognizer = new sdk.SpeechRecognizer(speechConfig, audioConfig);
      // ...
    }

We have tested it with MuLaw, but I guess it should also work for OPUS.
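The workaround above only declares the stream format; the audio pushed into the stream still has to be mu-law encoded. As a rough illustration, here is a minimal sketch of the standard G.711 mu-law encoding of 16-bit PCM samples. The function names `encodeMuLawSample` and `encodeMuLaw` are hypothetical helpers, not part of the Speech SDK, and the sketch assumes the input is already mono PCM at 8000 Hz (resampling is not covered).

```typescript
// Hypothetical helpers (not part of the Speech SDK): G.711 mu-law encoding.
const MULAW_BIAS = 0x84;  // bias added before the segment search
const MULAW_CLIP = 32635; // largest magnitude that fits after adding the bias

// Encode one signed 16-bit PCM sample into one 8-bit mu-law byte.
function encodeMuLawSample(sample: number): number {
  // The sign bit of the sample ends up in bit 7 of the mu-law byte.
  const sign = (sample >> 8) & 0x80;
  let magnitude = sign !== 0 ? -sample : sample;
  if (magnitude > MULAW_CLIP) {
    magnitude = MULAW_CLIP;
  }
  magnitude += MULAW_BIAS;

  // Find the segment (exponent): position of the highest set bit, from 7 down to 0.
  let exponent = 7;
  for (let mask = 0x4000; (magnitude & mask) === 0 && exponent > 0; mask >>= 1) {
    exponent--;
  }

  // Four mantissa bits taken just below the leading bit of the segment.
  const mantissa = (magnitude >> (exponent + 3)) & 0x0f;

  // mu-law bytes are stored bit-inverted.
  return ~(sign | (exponent << 4) | mantissa) & 0xff;
}

// Encode a whole buffer of 16-bit PCM samples (assumed mono, 8000 Hz).
function encodeMuLaw(pcm: Int16Array): Uint8Array {
  const out = new Uint8Array(pcm.length);
  for (let i = 0; i < pcm.length; i++) {
    out[i] = encodeMuLawSample(pcm[i]);
  }
  return out;
}
```

Assuming the `pushStream` from the workaround, the `ArrayBuffer` behind the encoded bytes could then be handed to `pushStream.write(...)`. Each sample shrinks from two bytes to one, which, together with the 8000 Hz sample rate, is where the bandwidth saving comes from.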
I would like a feature for transcribing OGG/OPUS audio data.
The current SDK version does not seem to support it.
https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/releasenotes
https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/how-to-use-codec-compressed-audio-input-streams?tabs=debian&pivots=programming-language-csharp
Or is there any solution with the current SDK version?
Here is my platform:
Windows 10
Electron@6.3.3 (Electron can use browser JS features, Node.js, and wasm)
OGG/OPUS is very good for network usage. This feature influences whether we choose Azure for transcription.
Thank you.