ZeroDivisionError when computing bleu_score #2204

Closed
betterenvi opened this issue Dec 16, 2018 · 4 comments

@betterenvi

Python version: 3.5
NLTK version: 3.2.5

  File "/usr/local/lib/python3.5/dist-packages/nltk/translate/bleu_score.py", line 89, in sentence_bleu
    emulate_multibleu)
  File "/usr/local/lib/python3.5/dist-packages/nltk/translate/bleu_score.py", line 199, in corpus_bleu
    hyp_len=hyp_len, emulate_multibleu=emulate_multibleu)
  File "/usr/local/lib/python3.5/dist-packages/nltk/translate/bleu_score.py", line 544, in method4
    incvnt = i+1 * self.k / math.log(hyp_len) # Note that this K is different from the K from NIST.
ZeroDivisionError: float division by zero
@alvations
Contributor

@betterenvi thank you for raising the issue. Could you provide the hypothesis and reference(s) input that caused the zero division error?

@truthless11

truthless11 commented Jan 17, 2019

@alvations ZeroDivisionError also happens to me, here's an example of inputs:

from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
smoothie = SmoothingFunction().method4
ref = ['a', 'quick', 'brown', 'fox', 'jumped', 'over', 'the', 'lazy', 'dog']
hyp = ['a']
sentence_bleu([ref], hyp, smoothing_function=smoothie)

@purificant
Member

purificant commented Jan 18, 2019

> @alvations ZeroDivisionError also happens to me, here's an example of inputs:
>
> from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
> smoothie = SmoothingFunction().method4
> ref = ['a', 'quick', 'brown', 'fox', 'jumped', 'over', 'the', 'lazy', 'dog']
> hyp = ['a']
> sentence_bleu([ref], hyp, smoothing_function=smoothie)

len(hyp) needs to be > 1

It makes no sense to use method 4 smoothing here (see http://acl2014.org/acl2014/W14-33/pdf/W14-3346.pdf), because it performs a 1 / ln(len(hyp)) operation, which is a division by zero for len(hyp) = 1.
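
To make the arithmetic concrete, here is a minimal standalone sketch of the failing term. It does not call NLTK: `method4_term` and `safe_method4_term` are hypothetical helpers that mirror only the `self.k / math.log(hyp_len)` part of the line in the traceback, and `k=5` is an assumed value chosen for illustration.

```python
import math


def method4_term(hyp_len, k=5):
    """Hypothetical stand-in for the k / ln(len(hyp)) term from the traceback."""
    return k / math.log(hyp_len)


def safe_method4_term(hyp_len, k=5):
    """Return None when the term is undefined (hyp_len <= 1) instead of raising."""
    if hyp_len <= 1:
        return None
    return k / math.log(hyp_len)


# ln(1) is 0.0, so a one-token hypothesis divides by zero.
try:
    method4_term(1)
except ZeroDivisionError as exc:
    print("len(hyp) = 1 fails:", exc)

print(safe_method4_term(1))  # undefined case is reported as None
print(safe_method4_term(9))  # well defined for longer hypotheses
```

Returning `None` is just one way to signal the undefined case; a caller could equally skip smoothing or raise a clearer error for hypotheses shorter than two tokens.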

Even if method 4 is patched to work around that and essentially do nothing, there are still issues with calculating BLEU due to ln(0) operations for the example input.
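
The ln(0) problem can be illustrated the same way. The sketch below is a simplified, single-reference n-gram count written for this example, not NLTK's implementation; `ngrams` and `modified_precision_counts` are hypothetical names. For the example input, the one-token hypothesis has no n-grams at all for n >= 2, so the match count is zero and taking its logarithm fails.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision_counts(reference, hypothesis, n):
    """Return (clipped matches, total hypothesis n-grams) for one reference."""
    hyp_counts = Counter(ngrams(hypothesis, n))
    ref_counts = Counter(ngrams(reference, n))
    matches = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
    return matches, sum(hyp_counts.values())


ref = ['a', 'quick', 'brown', 'fox', 'jumped', 'over', 'the', 'lazy', 'dog']
hyp = ['a']

# Only the unigram order has anything to count; orders 2 to 4 are empty.
for n in range(1, 5):
    matches, total = modified_precision_counts(ref, hyp, n)
    print(n, matches, total)

# Taking the log of a zero match count is a domain error in Python.
try:
    math.log(0)
except ValueError as exc:
    print("log of zero matches fails:", exc)
```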

@tomaarsen
Member

I suspect this has been solved through #2839. Let us know if any normal uses of BLEU still throw unexpected exceptions!
