Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WstxValidationException: Unknown reason (at end element </nl:nillableIntElement>) when validating a document with nillable elements #179

Open
ppalaga opened this issue Oct 28, 2023 · 14 comments · May be fixed by #187
Labels
has-failing-test Indicates that there exists a test case (in git or inlined in issue) to reproduce the issue

Comments

@ppalaga
Copy link
Contributor

ppalaga commented Oct 28, 2023

This issue occurs when CXF is validating an incoming SOAP message against W3CSchema and the message contains a an xs:dateTime or xs:int elements having xsi:nil="true":

2023-10-28 22:40:16,639 WARN  [org.apa.cxf.pha.PhaseInterceptorChain] (executor-thread-0) Interceptor for {http://nl.schiphol.asb.services/SPLDataService}SPLDataService#{http://nl.schiphol.asb.services/SPLDataService}AddSPLData has thrown exception, unwinding now: org.apache.cxf.interceptor.Fault: Unknown reason (at end element </spl:intElem>)
        at org.apache.cxf.staxutils.validation.Stax2ValidationUtils$2.reportProblem(Stax2ValidationUtils.java:147)
        at com.ctc.wstx.sw.BaseStreamWriter.reportProblem(BaseStreamWriter.java:1230)
        at com.ctc.wstx.msv.GenericMsvValidator.reportError(GenericMsvValidator.java:562)
        at com.ctc.wstx.msv.GenericMsvValidator.reportError(GenericMsvValidator.java:554)
        at com.ctc.wstx.msv.GenericMsvValidator.reportError(GenericMsvValidator.java:548)
        at com.ctc.wstx.msv.GenericMsvValidator.validateElementEnd(GenericMsvValidator.java:390)
        at com.ctc.wstx.sw.BaseNsStreamWriter.doWriteEndTag(BaseNsStreamWriter.java:713)
        at com.ctc.wstx.sw.BaseNsStreamWriter.writeEndElement(BaseNsStreamWriter.java:285)
        at org.apache.cxf.staxutils.StaxUtils.copy(StaxUtils.java:745)
        at org.apache.cxf.staxutils.StaxUtils.copy(StaxUtils.java:707)
        at org.apache.cxf.databinding.source.XMLStreamDataReader.validate(XMLStreamDataReader.java:244)
        at org.apache.cxf.databinding.source.XMLStreamDataReader.read(XMLStreamDataReader.java:115)
        at org.apache.cxf.databinding.source.XMLStreamDataReader.read(XMLStreamDataReader.java:83)
        at org.apache.cxf.databinding.source.XMLStreamDataReader.read(XMLStreamDataReader.java:67)
        at org.apache.cxf.wsdl.interceptors.DocLiteralInInterceptor.handleMessage(DocLiteralInInterceptor.java:192)
        at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307)
        at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121)
        at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:265)
        at org.apache.cxf.transport.servlet.ServletController.invokeDestination(ServletController.java:234)
        at org.apache.cxf.transport.servlet.ServletController.invoke(ServletController.java:208)
        at org.apache.cxf.transport.servlet.ServletController.invoke(ServletController.java:160)
        at io.quarkiverse.cxf.transport.CxfHandler.process(CxfHandler.java:288)
        at io.quarkiverse.cxf.transport.CxfHandler.handle(CxfHandler.java:225)
        at io.quarkiverse.cxf.transport.CxfHandler.handle(CxfHandler.java:46)
        at io.vertx.ext.web.impl.BlockingHandlerDecorator.lambda$handle$0(BlockingHandlerDecorator.java:48)
        at io.vertx.core.impl.ContextBase.lambda$executeBlocking$0(ContextBase.java:137)
        at io.vertx.core.impl.ContextInternal.dispatch(ContextInternal.java:264)
        at io.vertx.core.impl.ContextBase.lambda$executeBlocking$1(ContextBase.java:135)
        at io.quarkus.vertx.core.runtime.VertxCoreRecorder$14.runWith(VertxCoreRecorder.java:576)
        at org.jboss.threads.EnhancedQueueExecutor$Task.run(EnhancedQueueExecutor.java:2449)
        at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1478)
        at org.jboss.threads.DelegatingRunnable.run(DelegatingRunnable.java:29)
        at org.jboss.threads.ThreadLocalResettingRunnable.run(ThreadLocalResettingRunnable.java:29)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:840)

CXF calls validateAgainst(schema) on a writer which gets the events copied from the reader.

A plain Woodstox reproducer is available in PR #180 .
It also makes sure that the failing XML documents are accepted by javax.xml.validation.

ppalaga added a commit to ppalaga/woodstox that referenced this issue Oct 28, 2023
…(at end element </nl:nillableIntElement>) when validating a document with nillable elements
@cowtowncoder cowtowncoder added the has-failing-test Indicates that there exists a test case (in git or inlined in issue) to reproduce the issue label Nov 4, 2023
cowtowncoder pushed a commit that referenced this issue Nov 4, 2023
…lement </nl:nillableIntElement>) when validating a document with nillable elements (#180)
@cowtowncoder
Copy link
Member

cowtowncoder commented Nov 4, 2023

Failing test merged; interesting that validation works for string value but fails for other scalars (int, date-time).

And apparently only for writer-side. I wonder if typed write methods are missing some callbacks...

Oh wait. No, can't be since this is just read/write'ing content, not using typed API.

@cowtowncoder
Copy link
Member

Hmmh. Not super easy to figure out. But the difference wrt xs:string is in GenericMsvValidator.validateElementAndAttributes():

int stringChecks = mCurrAcceptor.getStringCareLevel();

where xs:string gets Acceptor.STRING_IGNORE (i.e. anything goes) but others Acceptor.STRING_STRICT.

@cowtowncoder
Copy link
Member

Ok, can't quite figure it out, but I assume sequence of validation calls in this case is incorrect for writer side; some call missing (or extra call).

At least there is reproduction if I have time to go back digging or (ideally) someone else has itch and does spelunking into code :)
(would be happy to merge a PR for fix)

@ppalaga
Copy link
Contributor Author

ppalaga commented Nov 7, 2023

Thanks for having a look @cowtowncoder!

assume sequence of validation calls in this case is incorrect for writer side; some call missing (or extra call).

Something like that would also be my hypothesis: The code perhaps first checks whether the content is an int before it checks whether xsi:nil is true? It does not matter for strings because those can have zero length.

@cowtowncoder
Copy link
Member

Thanks for having a look @cowtowncoder!

assume sequence of validation calls in this case is incorrect for writer side; some call missing (or extra call).

Something like that would also be my hypothesis: The code perhaps first checks whether the content is an int before it checks whether xsi:nil is true? It does not matter for strings because those can have zero length.

All actual validation state handling is done by MSV; Woodstox side just calls various methods to "feed" content.

As per my other note one observable difference is that xs:string can have exactly any content so MSV does not check anything.

@ppalaga
Copy link
Contributor Author

ppalaga commented Jan 5, 2024

I think I figured out, what is the problem.

I traced the method calls using AspectJ for both the reader and writer validation - here is the code: https://github.com/ppalaga/woodstox/commits/i179-trace/

To produce the traces you need to do the following:

mvn clean test -Dtest=TestW3CSchemaNillable179#testNillableIntReader > reader-trace.txt

mvn clean test -Dtest=TestW3CSchemaNillable179#testNillableInt > writer-trace.txt

I am attaching the traces:

In those the following parts are interesting:

In the reader scenario, reading the nil attribute returns true as expected:

    > com.ctc.wstx.msv.GenericMsvValidator.validateElementStart("nillableInt":java.lang.String, "http://server.hw.demo/nillable":java.lang.String, "nl":java.lang.String)
     ...
     > com.ctc.wstx.msv.AttributeProxy.getValue("http://www.w3.org/2001/XMLSchema-instance":java.lang.String, "nil":java.lang.String)
      > com.ctc.wstx.sr.InputElementStack.getAttributeValue("http://www.w3.org/2001/XMLSchema-instance":java.lang.String, "nil":java.lang.String)
       > com.ctc.wstx.sr.InputElementStack.findAttributeIndex("http://www.w3.org/2001/XMLSchema-instance":java.lang.String, "nil":java.lang.String)
        > com.ctc.wstx.sr.AttributeCollector.findIndex("http://www.w3.org/2001/XMLSchema-instance":java.lang.String, "nil":java.lang.String)
         > com.ctc.wstx.sr.Attribute.hasQName("http://www.w3.org/2001/XMLSchema-instance":java.lang.String, "nil":java.lang.String)
         < com.ctc.wstx.sr.Attribute.hasQName("http://www.w3.org/2001/XMLSchema-instance":java.lang.String, "nil":java.lang.String) : true:boolean
        < com.ctc.wstx.sr.AttributeCollector.findIndex("http://www.w3.org/2001/XMLSchema-instance":java.lang.String, "nil":java.lang.String) : 0:int
       < com.ctc.wstx.sr.InputElementStack.findAttributeIndex("http://www.w3.org/2001/XMLSchema-instance":java.lang.String, "nil":java.lang.String) : 0:int
       > com.ctc.wstx.sr.InputElementStack.getAttributeValue(0:int)
        > com.ctc.wstx.sr.InputElementStack.getAttrCollector()
        < com.ctc.wstx.sr.InputElementStack.getAttrCollector() : com.ctc.wstx.sr.AttributeCollector
        > com.ctc.wstx.sr.AttributeCollector.getValue(0:int)
         > com.ctc.wstx.util.TextBuilder.getAllValues()
         < com.ctc.wstx.util.TextBuilder.getAllValues() : "true":java.lang.String
         > com.ctc.wstx.sr.Attribute.getValue("true":java.lang.String)
         < com.ctc.wstx.sr.Attribute.getValue("true":java.lang.String) : "true":java.lang.String

but it is a bit different in the writer scenario:

    > com.ctc.wstx.msv.GenericMsvValidator.validateElementStart("nillableInt":java.lang.String, "http://server.hw.demo/nillable":java.lang.String, "nl":java.lang.String)
     ...
     > com.ctc.wstx.msv.AttributeProxy.getValue("http://www.w3.org/2001/XMLSchema-instance":java.lang.String, "nil":java.lang.String)
      > com.ctc.wstx.sw.BaseStreamWriter.getAttributeValue("http://www.w3.org/2001/XMLSchema-instance":java.lang.String, "nil":java.lang.String)
      < com.ctc.wstx.sw.BaseStreamWriter.getAttributeValue("http://www.w3.org/2001/XMLSchema-instance":java.lang.String, "nil":java.lang.String) : null:java.lang.String
     < com.ctc.wstx.msv.AttributeProxy.getValue("http://www.w3.org/2001/XMLSchema-instance":java.lang.String, "nil":java.lang.String) : null:java.lang.String

I was wondering why BaseStreamWriter.getAttributeValue("http://www.w3.org/2001/XMLSchema-instance":java.lang.String, "nil":java.lang.String) returns null. Here is why:

// // // Attribute access: not yet implemented:
/* !!! TODO: Implement attribute access (iff validate-attributes
* enabled?
*/
@Override
public int getAttributeCount() { return 0; }
@Override
public String getAttributeLocalName(int index) { return null; }
@Override
public String getAttributeNamespace(int index) { return null; }
@Override
public String getAttributePrefix(int index) { return null; }
@Override
public String getAttributeValue(int index) { return null; }
@Override
public String getAttributeValue(String nsURI, String localName) {
return null;
}

It always returns null, because it is not implemented.

@ppalaga
Copy link
Contributor Author

ppalaga commented Jan 5, 2024

@cowtowncoder, could you please suggest some way forward?

@cowtowncoder
Copy link
Member

Great detective work @ppalaga !

I wish I remembered the context here, but from what you include I think it is what it say: on writer-side, attribute info (wrt currently written element) is not accessible. Likely because it is not being retained.
And if so, it'd need to be kept track of the case where there is validator to call (otherwise it is additional overhead with no benefit).

@ppalaga
Copy link
Contributor Author

ppalaga commented Jan 5, 2024

Thanks, sounds good. The main questions from my side are: (1) is it worth implementing (2) any chance that somebody from the project does that?

@cowtowncoder
Copy link
Member

cowtowncoder commented Jan 5, 2024

@ppalaga Right now I can't promise to work on this, but I always try to make time to help with PRs if someone else does. If that helps at all?

Fixing this would definitely be useful, for writer-side validation.

@ppalaga
Copy link
Contributor Author

ppalaga commented Jan 10, 2024

Thanks for the statement, @cowtowncoder.

Could you please give some hints about the design of writer validation? Especially, the following topics would be interesting:

  • Do we need Typed and Validating subclasses of BaseStreamWriter as they exist for BasicStreamReader?
  • I guess it is OK to use the InputElementStack for the writer scenario as well?
  • Could you please point me to the relevant validating reader tests, that could be adapted for the validating writer?

@ppalaga
Copy link
Contributor Author

ppalaga commented Jan 10, 2024

  • Do we need Typed and Validating subclasses of BaseStreamWriter as they exist for BasicStreamReader?

I see there is TypedStreamWriter, but where under its subclasses Simple/Repairing should the new Validating writer be hanged?

@cowtowncoder
Copy link
Member

No, should not need new sub-classes as typed writes and pluggable validation already exist on writer side.

Validating reader tests are in packages ending with .vstream (v for validating), split in 3 ones I think:

  • src/test/java/org/codehaus/stax/test/vstream
  • src/test/java/wstxtest/vstream
  • src/test/java/stax2/vstream

but there are also Validating Write tests in:

  • src/test/java/stax2/vwstream

@cowtowncoder
Copy link
Member

cowtowncoder commented Jan 10, 2024

One confusing aspect of naming is that there are, I think, 2 hierarchies of -Writers:

  • -StreamWriters extending Stax XMLStreamWriter
  • other -Writers with XMLWriter, used by stream writers for actual content generation

Validation seems to be mostly (but maybe not completely) handled by StreamWriter implementations; actual validation is mValidator, and it is being called by stream writer based on write calls to essentially maintain and check validator state in parallel with actual token stream writes.

So: I do not think new classes are needed; most other validation functionality exists. But I do not remember what my thinking was with missing state keeping for attributes -- or, possibly just wiring access to state that might be kept.
It is possible state is not kept, however, since whereas on StreamReader side there is natural need to access attribute values, on writer side there isn't (they are just to be output and XMLStreamWriter has no accessors).

ppalaga added a commit to ppalaga/woodstox that referenced this issue Jan 10, 2024
</nl:nillableIntElement>) when validating a document with nillable
elements fix FasterXML#179
ppalaga added a commit to ppalaga/woodstox that referenced this issue Jan 14, 2024
</nl:nillableIntElement>) when validating a document with nillable
elements fix FasterXML#179 fix FasterXML#190
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
has-failing-test Indicates that there exists a test case (in git or inlined in issue) to reproduce the issue
Projects
None yet
2 participants