Panic in merge thread #1125
Comments
Thank you for the thorough bug report (again) :) Can you share the size of your documents (in KB is fine) and the number of documents you have in your segments today? Tantivy panics if it tries to merge segments and ends up with a result amounting to more than 4 billion tokens in the segment.
Oops, no, I don't think I analyzed the bug correctly.
This is pretty bad... The position file must be corrupted. Right now it acts as if one document has 4 billion tokens.
See https://gist.github.com/appaquet/54c30d8c7f82712934f58c82a6592e10 for the list of files and some more logging about the merged segments and their sizes. As you can see, it's definitely not hitting the 4B token limit.
The question is whether the corruption is coming from the merge code or from somewhere else.
Running on 0.15.3, the merge thread panics while trying to list segment positions. I haven't found a small reproducing example yet, and this seems to only happen in my project after a few days of usage, with probably about a hundred mutations per day. After trying to reproduce the issue with random mutations, I have the feeling that it's the result of some corruption that builds up after many merges.
Backtrace
After adding some debugging right before the panic point, it seems that the position values explode, get summed in a loop, and eventually overflow the u32.
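To illustrate the failure mode described above, here is a minimal, hypothetical sketch of summing `u32` values with overflow detection. It is not Tantivy's code or the debugging code I used; the function name `sum_position_values` and its input slice are assumptions made for illustration only. It reports the index at which the running total would overflow, instead of panicking.

```rust
/// Hypothetical helper (not part of Tantivy): sums `u32` values and returns
/// `Err(index)` for the first element that would push the running total
/// past `u32::MAX`, rather than panicking on overflow.
fn sum_position_values(values: &[u32]) -> Result<u32, usize> {
    let mut total: u32 = 0;
    for (index, &value) in values.iter().enumerate() {
        // `checked_add` returns `None` on overflow; map that to the index.
        total = total.checked_add(value).ok_or(index)?;
    }
    Ok(total)
}
```

A check like this would at least point at the first offending value, which may help narrow down which segment carries the corrupted positions.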
Debugging code
Outputs
Let me know if there is anything I could provide to help. Even though my code that uses Tantivy is open source, it's probably too complex to reproduce easily on your side and would require my corrupted index that contains private data. Feel free to contact me privately if it can help debugging (my gh username @ gmail.com)