synthesizer.speakSsmlAsync fails with mstts tags only #528

gad2103 · 2022-05-01T04:34:53Z

When I use the SDK to generate speech, it works fine with the following SSML:

let ssmlThatWorks = "<speak version=\"1.0\" xmlns=\"https://www.w3.org/2001/10/synthesis\" xml:lang=\"en-US\">\n" +
        "  <voice name=\"en-US-JennyNeural\">\n" +
        toContentString(parsedXml) + " \n" +
        "  </voice>\n" +
        "</speak>"

However, if I use the variant that includes the mstts tags, I get the following error message:

SpeechSynthesisResult {
  privResultId: '147A8275E1E34BE3AD49F7892846A194',
  privReason: 1,
  privErrorDetails: "Unexpected TextToSpeech.Protocols.Universal.Messages.AudioMetadataResponseMessage' message for Reque websocket error code: 1002",
  privProperties: PropertyCollection {
    privKeys: [ 'CancellationErrorCode' ],
    privValues: [ 'ConnectionFailure' ]
  },
  privAudioData: undefined,
  privAudioDuration: undefined
}

The SSML that consistently reproduces the error looks like this (replace the bracketed content with any public audio file):

let ssmlThatProducesErrors = "<speak version=\"1.0\" xmlns:mstts=\"http://www.w3.org/2001/mstts\" xml:lang=\"en-US\">\n" +
            "<mstts:backgroundaudio src=\"https://[PUT PUBLIC AUDIO FILE HERE TO REPRODUCE].mp3\" volume=\"0.7\" fadein=\"0\" fadeout=\"0\" />  <voice name=\"en-US-JennyNeural\">\n" +
            "Hello  \n" +
            "  </voice>\n" +
            "</speak>"

If I test the bad SSML in the browser on the official Azure TTS site, everything is generated correctly...

I would love to be able to use background music in my application!

Other things I tried:

- generating a new API key
- upgrading my account to pay-as-you-go

Any help would be greatly appreciated! Thanks in advance.
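
For anyone trying to reproduce this, here is a minimal sketch of how the call could be wrapped. It assumes the JavaScript Speech SDK's callback form `speakSsmlAsync(ssml, onResult, onError)`. The helper name `speakSsml` and the stand-in `makeFakeSynthesizer` are hypothetical and exist only so the snippet runs without a subscription key.

```javascript
// Hypothetical helper (not part of the SDK): wraps the callback-style
// speakSsmlAsync(ssml, onResult, onError) call in a Promise and closes
// the synthesizer once either callback fires.
function speakSsml(synthesizer, ssml) {
  return new Promise((resolve, reject) => {
    synthesizer.speakSsmlAsync(
      ssml,
      (result) => {
        synthesizer.close();
        resolve(result);
      },
      (error) => {
        synthesizer.close();
        reject(new Error(String(error)));
      }
    );
  });
}

// Stand-in for a real SpeechSynthesizer so the sketch runs without a key.
// It reports a canceled result (reason 1) through the result callback.
function makeFakeSynthesizer() {
  return {
    closed: false,
    speakSsmlAsync(ssml, onResult, onError) {
      onResult({ resultId: "demo-id", reason: 1, errorDetails: "simulated" });
    },
    close() {
      this.closed = true;
    },
  };
}

speakSsml(makeFakeSynthesizer(), "<speak></speak>").then((result) => {
  console.log(result.reason, result.errorDetails);
});
```

In the dump above, the failure arrived as a result object with `privReason: 1` rather than as a thrown error, so callers probably need to check the result's reason in addition to handling rejections.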


glharper · 2022-05-05T18:45:29Z

@yulin-li Is there a service contact we can pass this to?

yulin-li · 2022-05-11T05:13:37Z

Hi @gad2103, I still cannot repro your error, could you share the resultId with us? We can check at service side.

gad2103 · 2022-05-11T19:26:42Z

@yulin-li Can you share how you're trying to repro? Is the result ID not in the original error message I posted ☝️

SpeechSynthesisResult {
  privResultId: '147A8275E1E34BE3AD49F7892846A194',
  privReason: 1,
  privErrorDetails: "Unexpected TextToSpeech.Protocols.Universal.Messages.AudioMetadataResponseMessage' message for Reque websocket error code: 1002",
  privProperties: PropertyCollection {
    privKeys: [ 'CancellationErrorCode' ],
    privValues: [ 'ConnectionFailure' ]
  },
  privAudioData: undefined,
  privAudioDuration: undefined
}

If not, where do I find the correct ID?
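
On locating the ID: the `priv*` fields in the dump appear to correspond to public getters on the result object (`resultId`, `reason`, `errorDetails`). The sketch below shows one way to log them. The helper name `describeSynthesisResult` is hypothetical, and the numeric value for a canceled result is an assumption taken from `privReason: 1` in the dump; with the real SDK you would compare against `sdk.ResultReason.Canceled` instead.

```javascript
// Assumption: 1 corresponds to ResultReason.Canceled, matching
// privReason: 1 in the dump above. Prefer sdk.ResultReason.Canceled
// when the SDK module is available.
const CANCELED = 1;

// Hypothetical helper: summarizes a synthesis result in one line so the
// resultId can be copied into a bug report.
function describeSynthesisResult(result) {
  if (result.reason === CANCELED) {
    return `canceled, resultId=${result.resultId}: ${result.errorDetails}`;
  }
  return `reason=${result.reason}, resultId=${result.resultId}`;
}

// Example with a plain object standing in for a SpeechSynthesisResult.
console.log(
  describeSynthesisResult({
    resultId: "147A8275E1E34BE3AD49F7892846A194",
    reason: 1,
    errorDetails: "example error details",
  })
);
```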

gad2103 · 2022-05-11T19:30:16Z

Looks possibly related: Azure-Samples/cognitive-services-speech-sdk#1492

johnmalatras · 2022-05-11T22:27:46Z

I'm also seeing issues with background audio (that is my issue @gad2103 linked above). I've spent several hours debugging and have yet to get it to work. Unfortunately, this is necessary for our use case.

yulin-li · 2022-05-12T16:10:38Z

Hi @gad2103, sorry for missing the result id in your error message.

I can repro the issue now if I set the audio output format to Raw8Khz8BitMonoMULaw, as you did. I have reported this issue to the service team and they will take a look.

As a workaround, could you try formats other than the 8 kHz ones?
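
A minimal sketch of what that workaround could look like in the JavaScript SDK, assuming the output format is set through `speechConfig.speechSynthesisOutputFormat`. The helper name `useNon8kHzFormat` is hypothetical, and `Riff24Khz16BitMonoPcm` is just one example of a non-8 kHz format. A later reply in this thread reports the same failure with a 48 kHz MP3 format, so this may not help in every case.

```javascript
// Hypothetical helper: switches a SpeechConfig to a non-8 kHz output format.
// `sdk` is the imported microsoft-cognitiveservices-speech-sdk module and
// `speechConfig` is a SpeechConfig instance; both are passed in so this
// sketch does not need a subscription key to load.
function useNon8kHzFormat(sdk, speechConfig) {
  speechConfig.speechSynthesisOutputFormat =
    sdk.SpeechSynthesisOutputFormat.Riff24Khz16BitMonoPcm;
  return speechConfig;
}

// Usage with the real SDK (key and region are placeholders):
// const sdk = require("microsoft-cognitiveservices-speech-sdk");
// const speechConfig = sdk.SpeechConfig.fromSubscription(key, region);
// useNon8kHzFormat(sdk, speechConfig);
```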

johnmalatras · 2022-05-12T16:41:56Z

For what it's worth, I'm also seeing the issue with audio-48khz-96kbitrate-mono-mp3.

gad2103 · 2022-05-12T21:07:46Z

> Hi @gad2103, sorry for missing the result id in your error message.
>
> I can repro the issue now if I set the audio output format to Raw8Khz8BitMonoMULaw, as you did. I have reported this issue to the service team and they will take a look.
>
> As a workaround, could you try formats other than the 8 kHz ones?

@yulin-li I can try to see if that resolves the error. However, that's the audio format I need for my application.

yulin-li · 2022-05-13T03:59:46Z

I understand. The service team is working on this bug.

gad2103 · 2022-05-24T00:46:33Z

> I understand. The service team is working on this bug.

Just checking in on the status here.

johnmalatras · 2022-06-01T19:21:30Z

Also wanting to follow up on this. To add another data point: long-form synthesis fails entirely when I include the background audio tag.

ciaran-parloa · 2023-08-23T08:07:03Z

We are also affected by this issue. Any updates, @yulin-li?

sebvieux · 2024-04-11T16:33:21Z

Hi, any updates on this issue?

yulin-li self-assigned this May 11, 2022

yulin-li mentioned this issue May 12, 2022

Background audio SSML tag not working Azure-Samples/cognitive-services-speech-sdk#1492

Closed

yulin-li added the bug and text-to-speech labels May 13, 2022
