No PunctuationBoundary reported when a sentence ends with the word sun. #648

Open
fresswolf opened this issue Mar 13, 2023 · 3 comments

@fresswolf

I couldn't find any exact definition of the word boundary types reported by the API. So I'm not sure if this is a bug.

Usually, when a sentence ends with a period ".", a word boundary event with the type SpeechSynthesisBoundaryType.Punctuation and the text "." is fired.

However, if the sentence ends with "sun", only an event with type SpeechSynthesisBoundaryType.Word is fired, and the punctuation is included in that word's text ("sun."). The issue can't be reproduced with similar short words like "son", where it works as expected.

The following function reproduces the unexpected behaviour.

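const { SpeechConfig, SpeechSynthesizer, SpeechSynthesisOutputFormat } = require("microsoft-cognitiveservices-speech-sdk");

const wordBoundaryTest = async () => {
    const text = "Here comes the sun. Enjoy it!";
    const speechConfig = SpeechConfig.fromSubscription(process.env.AZURE_SPEECH_KEY, process.env.AZURE_SPEECH_REGION);
    speechConfig.speechSynthesisOutputFormat = SpeechSynthesisOutputFormat.Audio24Khz96KBitRateMonoMp3;
    speechConfig.speechSynthesisVoiceName = "en-US-EricNeural";
    const synthesizer = new SpeechSynthesizer(speechConfig);

    // Log the text and boundary type of every boundary event.
    synthesizer.wordBoundary = function (s, e) {
        console.log(`Word "${e.text}" with type ${e.boundaryType}"`);
    };

    // Release the synthesizer once synthesis has finished or failed.
    synthesizer.speakTextAsync(
        text,
        () => synthesizer.close(),
        (error) => {
            console.error(error);
            synthesizer.close();
        }
    );
};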

Actual behaviour:
Word "Here" with type WordBoundary"
Word "comes" with type WordBoundary"
Word "the" with type WordBoundary"
Word "sun." with type WordBoundary"
Word "Enjoy" with type WordBoundary"
Word "it" with type WordBoundary"
Word "!" with type PunctuationBoundary"

Expected behaviour:
Word "Here" with type WordBoundary"
Word "comes" with type WordBoundary"
Word "the" with type WordBoundary"
Word "sun" with type WordBoundary"
Word "." with type PunctuationBoundary"
Word "Enjoy" with type WordBoundary"
Word "it" with type WordBoundary"
Word "!" with type PunctuationBoundary"
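
Until the service reports the period separately, the expected output above can be approximated on the client. The following is only a sketch of a possible workaround: `normalizeBoundary` and `TRAILING_PUNCTUATION` are hypothetical names, not part of the Speech SDK, and the helper looks only at the `text` and `boundaryType` fields of an event (offsets and durations are not adjusted). It assumes that `boundaryType` carries the strings shown in the log output above.

```javascript
// Hypothetical helper, not part of the Speech SDK.
// Matches a word followed by one or more trailing punctuation marks.
const TRAILING_PUNCTUATION = /^(.*?[^.,!?;:])([.,!?;:]+)$/;

// Returns one event unchanged, or two events when a WordBoundary
// event carries trailing punctuation (for example "sun.").
function normalizeBoundary(event) {
    const match = event.boundaryType === "WordBoundary"
        ? TRAILING_PUNCTUATION.exec(event.text)
        : null;
    if (!match) {
        return [{ text: event.text, boundaryType: event.boundaryType }];
    }
    return [
        { text: match[1], boundaryType: "WordBoundary" },
        { text: match[2], boundaryType: "PunctuationBoundary" },
    ];
}
```

Collected events could then be passed through `events.flatMap(normalizeBoundary)`. Note that this also splits abbreviations such as "U.S." into "U.S" and ".", so it is a stopgap rather than a replacement for a service-side fix.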

Also, SpeechSynthesisBoundaryType.Sentence never seems to fire, but since these events are undocumented, it's unclear what that type is supposed to mean.

@glharper
Member

@yulin-li Could you take a look at this? Thanks.

@yulin-li
Contributor

@fresswolf thanks for reporting this. I can confirm this is a bug. My guess is that "sun." is recognized by the system as a single word. We are investigating.

@fresswolf
Author

@yulin-li Meanwhile, I came across cases where the punctuation is not only missing from the boundary events but is also ignored by the speech synthesis itself.
Example:
<speak version='1.0' xml:lang='en-US'><voice xml:lang='en-US' name='en-US-TonyNeural'>The brothers stand. Mario doesn't.</voice></speak>

In the resulting audio, the words "stand" and "Mario" are merged into a single word, "Standmario", and the punctuation isn't reported.
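
Cases like this could be flagged automatically by comparing the punctuation in the input text with the punctuation events that were actually reported. The following is a rough diagnostic sketch, not an SDK feature: `findUnreportedPunctuation` and `MARKS` are hypothetical names, the function expects plain text (for SSML input, the text content without tags), and it assumes the events were collected in the `wordBoundary` handler as objects with `text` and `boundaryType` fields.

```javascript
// Hypothetical diagnostic, not part of the Speech SDK.
// Punctuation marks that are expected to produce a PunctuationBoundary event.
const MARKS = /[.,!?;:]/g;

// Returns the punctuation marks that occur in plainText but were not
// covered by any PunctuationBoundary event.
function findUnreportedPunctuation(plainText, events) {
    // Count how often each mark occurs in the input text.
    const remaining = new Map();
    for (const mark of plainText.match(MARKS) || []) {
        remaining.set(mark, (remaining.get(mark) || 0) + 1);
    }
    // Subtract every mark that was reported as punctuation.
    for (const e of events) {
        if (e.boundaryType !== "PunctuationBoundary") continue;
        for (const mark of e.text.match(MARKS) || []) {
            if (remaining.get(mark) > 0) {
                remaining.set(mark, remaining.get(mark) - 1);
            }
        }
    }
    // Whatever is left was never reported.
    const missing = [];
    for (const [mark, count] of remaining) {
        for (let i = 0; i < count; i++) missing.push(mark);
    }
    return missing;
}
```

A non-empty result only shows that a boundary event is missing; it cannot tell whether the audio itself also skipped the pause, as in the "Standmario" example.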
