(German) hyphenation derailed by punctuation characters #37

Closed
allefeld opened this issue Apr 18, 2022 · 2 comments

Comments

@allefeld

I found this strange behavior:

> import pyphen
> dic = pyphen.Pyphen(lang='de')

> dic.inserted('begreifbar')
'be-greif-bar'

> dic.inserted('begreifbar.')
'be-greif-ba-r.'

> dic.inserted('begreifbar«.')
'be-greif-ba-r«.'

The first hyphenation is correct. The second and third have trailing punctuation characters (« is a common closing-quote in German printing), which leads to an additional incorrect hyphenation point being inserted.

I tried to use the local hunspell dictionary instead (/usr/share/hyphen/hyph_de_DE.dic), with the same result.

In this case, I could fix it by removing punctuation characters myself, but I'd still consider it to be a bug, possibly related to #24 and #26.

@liZe
Member

liZe commented Apr 19, 2022

Hello!

> In this case, I could fix it by removing punctuation characters myself

Yes, that’s a "problem" already answered in this comment. Short answer: as some details are specific to each language (and probably to each application), it’s easier to remove the punctuation in your application.
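For readers landing here, the workaround could look roughly like the sketch below. It is not part of Pyphen: the helper name `insert_ignoring_punctuation` is hypothetical, and treating every leading or trailing non-word character as punctuation is an assumption that may not suit every language or application. The helper takes the hyphenating function as a parameter, so `dic.inserted` from the report above can be passed in directly.

```python
import re
from typing import Callable

# Split a token into leading non-word characters, a core, and trailing
# non-word characters. In Python 3, \w is Unicode-aware for str patterns,
# so umlauts and ß stay in the core while « . , ! etc. are peeled off.
_EDGES = re.compile(r"^(\W*)(.*?)(\W*)$", re.DOTALL)


def insert_ignoring_punctuation(token: str, inserter: Callable[[str], str]) -> str:
    """Hyphenate only the core of `token`, then reattach its punctuation.

    `inserter` is any function mapping a bare word to its hyphenated form,
    for example the `inserted` method of a `pyphen.Pyphen` instance.
    (Hypothetical helper, not a Pyphen API.)
    """
    lead, core, trail = _EDGES.match(token).groups()
    if not core:
        # Token consists of punctuation only: nothing to hyphenate.
        return token
    return lead + inserter(core) + trail


# Possible usage with Pyphen (assumes the pyphen package is installed):
#
#   import pyphen
#   dic = pyphen.Pyphen(lang='de')
#   insert_ignoring_punctuation('begreifbar«.', dic.inserted)
```

Punctuation inside a word (for example a hyphen in a compound) is left in the core and still reaches the hyphenator, so this only addresses the leading and trailing case described in this issue.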

@liZe
Member

liZe commented Mar 12, 2023

Closing, as we don’t plan to handle punctuation in Pyphen.

@liZe closed this as not planned on Mar 12, 2023