[Bug]: Browser Unable to Decode and Play Partial Speech Segments due to Missing Header Information #823

rohit1coding · 2024-05-09T05:33:05Z

What happened?

Issue Description:
I am currently working on a project where real-time audio playback is required while text-to-speech conversion is still in progress. To achieve lower latency, I am attempting to play partial speech segments as they become available instead of waiting for the complete text-to-speech data.

Problem:
The issue arises with the partial audio buffers: these buffers are raw and lack the necessary header information that browsers require for decoding and playback. Consequently, while complete speech data from the speakSsmlAsync method works correctly, partial speech data does not function as expected due to this missing header.

Code Implementation:
Backend Code:
const pushStream = SpeechSdk.AudioOutputStream.createPullStream();
const audioConfig = SpeechSdk.AudioConfig.fromStreamOutput(pushStream);
synthesizer = new SpeechSdk.SpeechSynthesizer(speechConfig, audioConfig);
pushStream.write = (audioData) => {
  playAudio(audioData);
};

Frontend Code:
const playAudio = async (audioData) => {
  const audioDataBufferArray = Uint8Array.from(audioData).buffer;
  try {
    const decodedAudioBuffer = await audioContext.decodeAudioData(audioDataBufferArray);
  } catch (error) {
    console.error('Error decoding audio data:', error);
  }
};

Expected Behavior:
The browser should be able to decode and play partial speech segments without any issues.

Current Behavior:
The browser fails to decode the partial audio data due to the absence of header information, leading to errors and inability to play the speech segments.

Steps to Reproduce:
1. Initiate the text-to-speech conversion process.
2. Attempt to play audio as it is being synthesized.
3. Observe that while complete audio data plays without issues, partial segments fail to decode and play.

Potential Solutions:
A possible approach to resolve this issue could involve dynamically adding the necessary header information to the partial buffers before attempting playback, or implementing a method to handle raw audio data more effectively in the browser.
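
As a rough illustration of the second option (handling raw audio directly), the sketch below skips decodeAudioData and converts each chunk to Float32 samples by hand. It assumes the chunks are headerless 16-bit little-endian mono PCM at a known sample rate, which this issue does not confirm. The names pcm16ToFloat32 and playPcmChunk are hypothetical helpers, not part of the Speech SDK.

```javascript
// Assumption: `bytes` is a Uint8Array of raw 16-bit little-endian mono PCM.
// Converts the bytes into Float32 samples in the range [-1, 1).
function pcm16ToFloat32(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const sampleCount = Math.floor(bytes.byteLength / 2); // a trailing odd byte is ignored
  const samples = new Float32Array(sampleCount);
  for (let i = 0; i < sampleCount; i++) {
    samples[i] = view.getInt16(i * 2, true) / 32768;
  }
  return samples;
}

// Browser-only: schedules one chunk on an existing AudioContext and returns
// the time at which the next chunk should start, so chunks play back to back.
// `sampleRate` must match the synthesizer output format (assumed, e.g. 16000).
function playPcmChunk(audioContext, bytes, sampleRate, startTime) {
  const samples = pcm16ToFloat32(bytes);
  if (samples.length === 0) {
    return startTime; // nothing to play
  }
  const buffer = audioContext.createBuffer(1, samples.length, sampleRate);
  buffer.copyToChannel(samples, 0);
  const source = audioContext.createBufferSource();
  source.buffer = buffer;
  source.connect(audioContext.destination);
  const when = Math.max(startTime, audioContext.currentTime);
  source.start(when);
  return when + buffer.duration;
}
```

A caller would keep a running nextStart value (initially 0) and update it with each call, for example nextStart = playPcmChunk(audioContext, chunk, 16000, nextStart). If a chunk boundary can split a sample in half, the leftover byte would need to be carried over to the next chunk; this sketch does not do that.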

This issue significantly affects the usability of real-time audio features in our application, and any guidance or solutions would be greatly appreciated.

Version

1.36.0 (Latest)

What browser/platform are you seeing the problem on?

Chrome

Relevant log output

No response


glharper · 2024-05-17T14:28:43Z

@rohit1coding Thank you for using the JS Speech SDK and for writing this issue up. Roughly how many bytes are the partial speech segments you want to decode? There is code in the JS Speech SDK for creating a WAV header here, if you'd like to reuse it in your own code to prepend to the audio stream before writing to the pushStream.
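
For readers following along, prepending a header could look roughly like the sketch below. It is not the SDK's own header code; it builds the standard 44-byte PCM WAV header by hand, assuming each chunk is raw 16-bit mono PCM at 16 kHz (an assumption, since the output format is not stated in this thread). buildWavHeader and wrapPcmAsWav are hypothetical names.

```javascript
// Builds a canonical 44-byte RIFF/WAVE header for uncompressed PCM data.
// Defaults (16 kHz, mono, 16-bit) are assumptions; match them to the
// synthesizer's actual output format.
function buildWavHeader(dataLength, sampleRate = 16000, channels = 1, bitsPerSample = 16) {
  const blockAlign = (channels * bitsPerSample) / 8;
  const view = new DataView(new ArrayBuffer(44));
  const writeAscii = (offset, text) => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };
  writeAscii(0, 'RIFF');
  view.setUint32(4, 36 + dataLength, true);          // file size minus 8 bytes
  writeAscii(8, 'WAVE');
  writeAscii(12, 'fmt ');
  view.setUint32(16, 16, true);                      // fmt chunk size for PCM
  view.setUint16(20, 1, true);                       // audio format 1 = PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * blockAlign, true); // byte rate
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, bitsPerSample, true);
  writeAscii(36, 'data');
  view.setUint32(40, dataLength, true);              // PCM payload size
  return new Uint8Array(view.buffer);
}

// Returns an ArrayBuffer holding header + PCM bytes, suitable for
// passing to audioContext.decodeAudioData in the browser.
function wrapPcmAsWav(pcmBytes, sampleRate, channels, bitsPerSample) {
  const header = buildWavHeader(pcmBytes.byteLength, sampleRate, channels, bitsPerSample);
  const wav = new Uint8Array(header.length + pcmBytes.byteLength);
  wav.set(header, 0);
  wav.set(pcmBytes, header.length);
  return wav.buffer;
}
```

In the frontend code from the report, the call would then become something like await audioContext.decodeAudioData(wrapPcmAsWav(Uint8Array.from(audioData))). This only applies when the chunks are uncompressed PCM; a compressed output format such as MP3 would need a different approach.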

rohit1coding added the bug Something isn't working label May 9, 2024

rohit1coding assigned glharper May 9, 2024

glharper added the question Further information is requested label May 22, 2024

glharper added pending close Ready for closure pending follow-up or prolonged inactivity and removed bug Something isn't working labels May 29, 2024
