Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bug where empty metadata List can result in "Index 0 out of bounds for length 0" exceptions in several scenarios #9509

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

tdonohue
Copy link
Member

@tdonohue tdonohue commented Apr 29, 2024

References

Description

This PR fixes scenarios where an empty metadata List can result in an "Index 0 out of bounds for length 0" exception.

There are (at least) two scenarios where I've encountered this exception:

  1. When versioning an Entity, as described in Versioning an Entity can throw an "Index 0 out of bounds for length 0" exception dspace-angular#3072
  2. While testing 8.0, I ran into a scenario where an AIP import threw this error:
    Replacing DSpace object(s) with package located at AIPs/COLLECTION@123456789-1145.zip
    java.lang.IndexOutOfBoundsException: Index 0 out of bounds for length 0
        at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:64)
        at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:70)
        at java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:266)
        at java.base/java.util.Objects.checkIndex(Objects.java:361)
        at java.base/java.util.ArrayList.get(ArrayList.java:427)
        at org.dspace.content.DSpaceObjectServiceImpl.addMetadata(DSpaceObjectServiceImpl.java:318)
        at org.dspace.content.crosswalk.XSLTIngestionCrosswalk.applyDimField(XSLTIngestionCrosswalk.java:115)
        at org.dspace.content.crosswalk.XSLTIngestionCrosswalk.applyDim(XSLTIngestionCrosswalk.java:81)
        at org.dspace.content.crosswalk.XSLTIngestionCrosswalk.applyDim(XSLTIngestionCrosswalk.java:84)
        at org.dspace.content.crosswalk.XSLTIngestionCrosswalk.ingestDIM(XSLTIngestionCrosswalk.java:231)
        at org.dspace.content.crosswalk.AIPDIMCrosswalk.ingest(AIPDIMCrosswalk.java:185)
        at org.dspace.content.packager.METSManifest.crosswalkXmd(METSManifest.java:1128)
        at org.dspace.content.packager.METSManifest.crosswalkItemDmd(METSManifest.java:1018)
        at org.dspace.content.packager.DSpaceAIPIngester.crosswalkObjectDmd(DSpaceAIPIngester.java:155)
        at org.dspace.content.packager.AbstractMETSIngester.ingestObject(AbstractMETSIngester.java:451)
        at org.dspace.content.packager.AbstractMETSIngester.replace(AbstractMETSIngester.java:1075)
        at org.dspace.content.packager.AbstractPackageIngester.replaceAll(AbstractPackageIngester.java:275)
        at org.dspace.content.packager.AbstractPackageIngester.replaceAll(AbstractPackageIngester.java:303)
        at org.dspace.app.packager.Packager.replace(Packager.java:684)
    
    • The code which is referenced is obviously buggy. If the List in question is ever empty (because code called it with an empty metadata field or some other error occurs), then it fails entirely.

This PR is a small fix to avoid the error. All it does is ensure the List is not empty before returning first value

Instructions for Reviewers

  • See Versioning an Entity can throw an "Index 0 out of bounds for length 0" exception dspace-angular#3072 for how to reproduce the error with Entity versioning.
  • The error with AIP import may be more difficult to reproduce. However, the code should be very self explanatory, and it matches the code from the getMetadataFirstValue() methods in the same DSpaceObjectServiceImpl
    • With this code in place, I was able to complete a successful AIP restoration (with no metadata loss). Without it, I could not restore from AIPs.

@tdonohue tdonohue added bug tools: packager Related to package or AIP importer/exporter 1 APPROVAL pull request only requires a single approval to merge. port to dspace-7_x This PR needs to be ported to `dspace-7_x` branch for next bug-fix release labels Apr 29, 2024
@tdonohue tdonohue added this to the 8.0 milestone Apr 29, 2024
@tdonohue tdonohue added the testathon Reported by a tester during Community Testathon label Apr 29, 2024
@tdonohue tdonohue requested a review from abollini May 9, 2024 14:44
Copy link
Member

@abollini abollini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @tdonohue,
IMHO the issue is caused by this hack org.dspace.content.DSpaceObjectServiceImpl.addMetadata(Context, T, MetadataField, String, List<String>, List<String>, List<Integer>, Supplier<Integer>) around line 255

 if (authorities != null && authorities.size() >= i) {
                    if (StringUtils.startsWith(authorities.get(i), Constants.VIRTUAL_AUTHORITY_PREFIX)) {
                        continue;
                    }
                }

that was obviously introduced to deal with management of relationship and virtual field.

I argue that it was a hack because it looks odd to me that a method named addMetadata should eventually return an empty list (i.e. don't add anything).

It is clear that modify the addMetadata method was an easier solution compared to rethink the way that virtual metadata are managed in dspace or to check all the places that would potentially pass a virtual metadata to the ItemServiceImpl but, IMHO, this is a clear sign of bad design that we should note as an "architecture technical debt".

That said, I'm ok with the changes that you have provided but I would suggest two improvements:

  • we need to add a test that demonstrate the bug and can be used to avoid regression in future refactoring (if my analysis is right an AIP containing virtual metadata should be enough)
  • we should check in the addMetadata method that the list received are not empty, otherwise we should throw an IllegalArgumentException. This to avoid that the changes would hide future bugs where the calling code is populating the list wrongly

@tdonohue
Copy link
Member Author

tdonohue commented May 23, 2024

This PR also fixes the newly reported bug DSpace/dspace-angular#3072, which can be reproduced by attempting to version an Entity that has relationships, as this same empty metadata list can be encountered (at least in some scenarios).

I'm working on addressing @abollini 's feedback to ensure this can be merged.

UPDATE: In the case of DSpace/dspace-angular#3072, the error there also seems to be caused by virtual metadata existing on the Entity that is being versioned. So, this bug is generally an issue with encountering empty metadata lists which seems to be caused by virtual metadata.

@tdonohue tdonohue changed the title Fix bug where empty metadata List can cause AIP import/restore to fail Fix bug where empty metadata List can result in "Index 0 out of bounds for length 0" exceptions in several scenarios May 23, 2024
…adata_5args_1. It appears this second test was meant to test a different addMetadata() method which accepts a single Value instead of a List
…y values. Minor refactors to MetadataImportIT to make findItemByName more efficient.
@tdonohue
Copy link
Member Author

tdonohue commented May 24, 2024

@abollini : Thank you for your feedback on this PR! As I dug into your suspicion, I found that you are 100% correct in your analysis. The IndexOutOfBoundsException was caused by virtual metadata fields being passed to addMetadata()... namely it occurs when addMetadata() is called with a single value and that value is a virtual metadata field. I also found this same IndexOutOfBoundsException can be triggered when versioning an Entity as described in DSpace/dspace-angular#3072 (so the issue is not unique to AIPs)

I took the approaches you suggested with a few minor modifications:

  1. I updated addMetadata() to throw an IllegalArgumentException if it is called with empty value(s). I added Unit tests to ItemTest to verify that is functioning properly. See commit ed918a8
    • This change resulted in test failures in MetadataImportIT. These failures were the result of an undiscovered bug in MetadataImport where it would sometimes call addMetadata() with empty values. I fixed that bug & cleaned up the tests (made them more efficient) in commit f8ac8ed
  2. I added two new tests to ItemTest which can reproduce the IndexOutOfBoundsException easily by passing virtual metadata values to addMetadata(). The specific test which triggers this exception is the new testAddMetadata_single_virtual_metadata() test. See commit 2eb7dbc
  3. Finally, while I was doing this work, I noticed an duplicate test in ItemTest. The existing testAddMetadata_5args_2 and testAddMetadata_5args_1 tests are identical on main. So, I updated the former test to call a different addMetadata() method (which was missing from our tests). See commit 25f722e

When you have a chance, please give this another review. I believe it now has sufficient tests to prove the bug exists on main and prove it is fixed by the code in this PR.

@tdonohue tdonohue requested review from abollini and J4bbi May 24, 2024 19:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1 APPROVAL pull request only requires a single approval to merge. bug high priority port to dspace-7_x This PR needs to be ported to `dspace-7_x` branch for next bug-fix release testathon Reported by a tester during Community Testathon tools: packager Related to package or AIP importer/exporter
Projects
Status: 👀 Under Review
Development

Successfully merging this pull request may close these issues.

Versioning an Entity can throw an "Index 0 out of bounds for length 0" exception
2 participants