Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDDS-10777. S3 gateway error when parsing xml in concurrent execution #6609

Merged
merged 3 commits into from May 8, 2024

Conversation

guohao-rosicky
Copy link
Contributor

@guohao-rosicky guohao-rosicky commented Apr 30, 2024

What changes were proposed in this pull request?

An exception occurs occasionally when the business uses s3 gateway to perform Complete Multipart Upload or Delete Objects.

It was analyzed as a result of concurrent requests; See details: https://issues.apache.org/jira/browse/HDDS-10777

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-10777

How was this patch tested?

UT

passed ci: https://github.com/guohao-rosicky/ozone/actions/runs/8890601157

Copy link
Contributor

@ivandika3 ivandika3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the fix. Left a few comments.

I did some research on the issue and yes it seems the concurrent requests might cause xml parsing to fail.

It seems the error (FWK005...) is thrown from XMLParserConfiguration#parse implementation (e.g. XML11Configuration#parse) because it's found that fParseInProgress flag is set during the parse. The fParseInProgress is set at the start of the method and unset at the end of it (not really sure why not use a synchronized method instead).

If we see the XmlNamespaceFilter#parse method, it will call XMLReader#parse method which will eventually calls XMLParserConfiguration#parse. In the unmarshaller code it instantiates XMLReader once in the constructor and the XMLReader instance will be passed in setParent call for each request. So, it seems the issue was because of the same XMLReader is used by concurrent requests.

Therefore, instead of instantiating JAXBContext, SAXParserFactory for every request, maybe we can try to only instantiate a new XMLReader in the readFrom and see whether if it resolves the issue?

Another possible solution might be to use a ThreadLocal for XMLReader. From my understanding currently the S3G Jetty thread model is "thread-per-request" model, so IMO it should be fine.

Comment on lines 59 to 61
JAXBContext context = JAXBContext.newInstance(MultiDeleteRequest.class);
SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
saxParserFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might not need to instantiate this for every request. Could you help check whether instantiating only XMLReader for each request is enough?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ivandika3 Thanks for the review.
Yes, this modification solves the problem as well, I've done a test, please see the new code.

Comment on lines 50 to 58
try {
context = JAXBContext.newInstance(CompleteMultipartUploadRequest.class);
SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
saxParserFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
xmlReader = saxParserFactory.newSAXParser().getXMLReader();
} catch (Exception ex) {
throw new AssertionError("Can not instantiate " +
"CompleteMultipartUploadRequest parser", ex);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might also need to change for PutBucketAclRequestUnmarshaller since it has a similar pattern (although not used as often).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure.

Copy link
Contributor

@ivandika3 ivandika3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the update. LGTM +1. Let's wait for the CI run.

@ivandika3
Copy link
Contributor

@kerneltime Would you like to take a look?

@adoroszlai adoroszlai merged commit ff78dc8 into apache:master May 8, 2024
39 checks passed
@adoroszlai
Copy link
Contributor

Thanks @guohao-rosicky for the fix, @ivandika3 for the review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants