New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HDDS-10777. S3 gateway error when parsing xml in concurrent execution #6609
HDDS-10777. S3 gateway error when parsing xml in concurrent execution #6609
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the fix. Left a few comments.
I did some research on the issue and yes it seems the concurrent requests might cause xml parsing to fail.
It seems the error (FWK005...
) is thrown from XMLParserConfiguration#parse
implementation (e.g. XML11Configuration#parse
) because it's found that fParseInProgress
flag is set during the parse. The fParseInProgress
is set at the start of the method and unset at the end of it (not really sure why not use a synchronized
method instead).
If we see the XmlNamespaceFilter#parse
method, it will call XMLReader#parse
method which will eventually calls XMLParserConfiguration#parse
. In the unmarshaller code it instantiates XMLReader
once in the constructor and the XMLReader
instance will be passed in setParent
call for each request. So, it seems the issue was because of the same XMLReader
is used by concurrent requests.
Therefore, instead of instantiating JAXBContext
, SAXParserFactory
for every request, maybe we can try to only instantiate a new XMLReader
in the readFrom
and see whether if it resolves the issue?
Another possible solution might be to use a ThreadLocal
for XMLReader
. From my understanding currently the S3G Jetty thread model is "thread-per-request" model, so IMO it should be fine.
JAXBContext context = JAXBContext.newInstance(MultiDeleteRequest.class); | ||
SAXParserFactory saxParserFactory = SAXParserFactory.newInstance(); | ||
saxParserFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might not need to instantiate this for every request. Could you help check whether instantiating only XMLReader
for each request is enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ivandika3 Thanks for the review.
Yes, this modification solves the problem as well, I've done a test, please see the new code.
try { | ||
context = JAXBContext.newInstance(CompleteMultipartUploadRequest.class); | ||
SAXParserFactory saxParserFactory = SAXParserFactory.newInstance(); | ||
saxParserFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true); | ||
xmlReader = saxParserFactory.newSAXParser().getXMLReader(); | ||
} catch (Exception ex) { | ||
throw new AssertionError("Can not instantiate " + | ||
"CompleteMultipartUploadRequest parser", ex); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might also need to change for PutBucketAclRequestUnmarshaller
since it has a similar pattern (although not used as often).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the update. LGTM +1. Let's wait for the CI run.
@kerneltime Would you like to take a look? |
Thanks @guohao-rosicky for the fix, @ivandika3 for the review. |
What changes were proposed in this pull request?
An exception occurs occasionally when the business uses s3 gateway to perform Complete Multipart Upload or Delete Objects.
It was analyzed as a result of concurrent requests; See details: https://issues.apache.org/jira/browse/HDDS-10777
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-10777
How was this patch tested?
UT
passed ci: https://github.com/guohao-rosicky/ozone/actions/runs/8890601157