One-shot BLEU-[2, 3, 4] computation #2320
Comments
This would be nice to have... it's a question of who would like to implement it.
Hi, I was looking at getting involved with contributing to NLTK and saw this with the 'goodfirstbug' tag. I will take a crack at this problem if that is OK.
Feel free to take a look at https://github.com/nltk/nltk/blob/develop/CONTRIBUTING.md and create a pull request; someone will review it before merging.
nltk#2320 corpus_bleu function runs inefficiently when used with different weightings, because it recalculates the underlying values each time the function is called instead of reusing them.
* Creates a unit test with the expected behavior of a more general function that can take multiple weightings and return multiple BLEU scores
nltk#2320 corpus_bleu function runs inefficiently when used with different weightings, because it recalculates the underlying values each time the function is called instead of reusing them.
* Creates a more generalized function that can calculate BLEU scores for multiple pre-specified weightings
* Adjusts other impacted functions/tests accordingly
Hi @stevenbird @alvations, if this is still open, can I get involved with it?
@BatMrE Yes, you may. We welcome contributions!
@tomaarsen @alvations I will ultimately be calling the BLEU function for each of 2, 3, 4, ..., len(weights), so how will this avoid the wasted computation of calculating the same values more than once?
@BatMrE For each BLEU-k, the computation has to go over all k-grams, all (k-1)-grams, (k-2)-grams, ..., 2-grams and 1-grams. This can be seen in the following section of the corpus_bleu algorithm: nltk/nltk/translate/bleu_score.py Lines 174 to 179 in f989fe6
In short, calculating BLEU-4 requires going over all 4-grams, 3-grams, 2-grams and 1-grams. If we then also want to compute BLEU-3, then
There are a few options to implement this. One option is to always compute the lower-order n-gram BLEU scores and cache them, so that no recomputation is needed if they are requested. However, this would add a lot of overhead when a user isn't interested in the lower-order scores. Another option is to add a boolean parameter to
Something to think about: should the user instead be able to optionally supply a list of weights? In that case, multiple BLEU scores would be computed, one for each entry in the list. Perhaps it would then still be possible to reuse the n-gram computation.
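To make the overlap between BLEU-k scores concrete, here is a minimal sketch in plain Python. It is not NLTK's code: the helper names `ngrams` and `clipped_precisions` are hypothetical, and it scores a single hypothesis rather than summing counts over a corpus. It computes the clipped (modified) n-gram precision once per order, which is the part every BLEU-m with m <= max_order could share.

```python
from collections import Counter
from fractions import Fraction


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precisions(references, hypothesis, max_order):
    """Map each order n in 1..max_order to its modified n-gram precision.

    Hypothetical helper for illustration; sentence-level only.
    """
    precisions = {}
    for n in range(1, max_order + 1):
        hyp_counts = Counter(ngrams(hypothesis, n))
        # For each n-gram, keep the largest count seen in any single reference.
        max_ref_counts = Counter()
        for ref in references:
            for gram, count in Counter(ngrams(ref, n)).items():
                max_ref_counts[gram] = max(max_ref_counts[gram], count)
        # Clip each hypothesis count by the reference maximum.
        clipped = sum(min(c, max_ref_counts[g]) for g, c in hyp_counts.items())
        total = sum(hyp_counts.values())
        precisions[n] = Fraction(clipped, total) if total else Fraction(0)
    return precisions


refs = ["the cat is on the mat".split()]
hyp = "the cat sat on the mat".split()
print(clipped_precisions(refs, hyp, 4))
```

The dictionary returned for `max_order=4` already contains everything BLEU-2 and BLEU-3 need, so only the final weighting step differs between them.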
Hello everyone,
I need to compute the BLEU score with more than one ngram length (ideally, BLEU2, BLEU3, BLEU4, and BLEU5). In my case, this is a very long task, as every hypothesis has some thousand references.
Reading the implementation of the corpus_bleu function, which takes weights: Tuple among its parameters and thus calculates BLEU-[len(weights)], I found that it already gathers all the information needed to compute BLEU-m for 2 <= m < len(weights).
Wouldn't it be nice to have a more general function that can compute BLEU with different ngram lengths at the same time? A possible implementation would have the weights parameter accept a list of weight tuples, compute the precisions needed for the longest tuple, and use those precisions to get all the desired BLEU scores from the weights list.
This would result in a more general implementation and would avoid the wasted computation of calculating the same values more than once.
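A rough sketch of the combination step proposed here, assuming the per-order precisions and the reference and hypothesis lengths have already been computed once. The names `brevity_penalty` and `bleu_for_weights` are illustrative and not part of NLTK's API, and no smoothing is applied, so any zero precision simply yields a score of 0.

```python
import math


def brevity_penalty(ref_len, hyp_len):
    """Standard BLEU brevity penalty for the given lengths."""
    if hyp_len == 0:
        return 0.0
    if hyp_len > ref_len:
        return 1.0
    return math.exp(1 - ref_len / hyp_len)


def bleu_for_weights(precisions, ref_len, hyp_len, weights_list):
    """Return one BLEU score per weight tuple, reusing shared precisions.

    precisions[i] is the modified precision for (i + 1)-grams.
    Hypothetical helper: unsmoothed, for illustration only.
    """
    bp = brevity_penalty(ref_len, hyp_len)
    scores = []
    for weights in weights_list:
        if len(weights) > len(precisions):
            raise ValueError("not enough precisions for this weight tuple")
        used = precisions[:len(weights)]
        if any(p == 0 for p in used):
            # Without smoothing, log(0) is undefined; report 0 instead.
            scores.append(0.0)
            continue
        # Weighted geometric mean of the precisions, in log space.
        log_sum = sum(w * math.log(p) for w, p in zip(weights, used))
        scores.append(bp * math.exp(log_sum))
    return scores


# BLEU-2 and BLEU-3 from the same precomputed precisions.
weights_list = [(0.5, 0.5), (1 / 3, 1 / 3, 1 / 3)]
print(bleu_for_weights([0.8, 0.5, 0.25], 10, 10, weights_list))
```

Under these assumptions, the expensive n-gram counting happens once for the longest tuple, and each additional weight tuple costs only a handful of multiplications and logarithms.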