XMLUnit 2 API to specify XSD for similarity for ATTR_VALUE_EXPLICITLY_SPECIFIED #88

Open
tarilabs opened this issue Oct 13, 2016 · 7 comments


@tarilabs

Hello, I'm trying to compare two XML documents where the "control" document relies on an attribute's default value, while the "test" document sets that attribute explicitly to the default value.

I have noted that DifferenceEvaluators.Default maps ComparisonType.ATTR_VALUE_EXPLICITLY_SPECIFIED to ComparisonResult.SIMILAR, as I would expect.

However, I'm unable to obtain the desired result with the XMLUnit 2 API, as there does not appear to be a way to specify the XSD schema in which this definition (the attribute is optional with a default value) is declared.

This is my XSD:

<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema" xmlns:tns="http://www.example.org/schema/" targetNamespace="http://www.example.org/schema/">
    <element name="myelement">
      <complexType>
        <attribute name="myAttribute" type="string" default="myDefaultValue" use="optional"></attribute>
      </complexType>
    </element>
</schema>

This is the "control" XML, which relies on the default attribute value:

<?xml version="1.0" encoding="UTF-8"?>
<myelement  xmlns="http://www.example.org/schema/" />

This is the "test" XML, where the attribute is set explicitly to the default value:

<?xml version="1.0" encoding="UTF-8"?>
<myelement  myAttribute="myDefaultValue"
            xmlns="http://www.example.org/schema/" />

I would have expected an XMLUnit Diff of the two documents to report "similar", but the result is actually "different".

Here is a snippet of my JUnit test, inlined for convenience:

@Test
public void testInline() {
    final String XSD = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" + 
            "<schema xmlns=\"http://www.w3.org/2001/XMLSchema\" xmlns:tns=\"http://www.example.org/schema/\" targetNamespace=\"http://www.example.org/schema/\">\n" + 
            "    <element name=\"myelement\">\n" + 
            "      <complexType>\n" +
            "        <attribute name=\"myAttribute\" type=\"string\" default=\"myDefaultValue\" use=\"optional\"></attribute>\n" + 
            "      </complexType>\n" + 
            "    </element>\n" + 
            "</schema>"; 

    final String explicitXML = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" + 
            "<myelement  myAttribute=\"myDefaultValue\"\n" + 
            "            xmlns=\"http://www.example.org/schema/\" />";

    final String implicitXML = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" + 
            "<myelement  xmlns=\"http://www.example.org/schema/\" />";

    validateAgainstXSD(XSD, explicitXML);
    validateAgainstXSD(XSD, implicitXML);

    Diff myDiff = DiffBuilder.compare( Input.fromString( implicitXML ).build() )
            .withTest( Input.fromString( explicitXML ).build() )
            .ignoreWhitespace()
            .checkForSimilar()
            .build();
    for (Difference diff : myDiff.getDifferences()) {
        System.out.println(diff);
    }
    Assert.assertFalse(myDiff.toString(), myDiff.hasDifferences());


}

private void validateAgainstXSD(final String xsdContent, final String xmlContent) {
    Validator v = Validator.forLanguage(Languages.W3C_XML_SCHEMA_NS_URI);
    v.setSchemaSource( Input.fromString( xsdContent ).build() );
    ValidationResult r = v.validateInstance( Input.fromString( xmlContent ).build()  );
    if (!r.isValid()) {
        for (ValidationProblem p : r.getProblems()) {
            System.err.println(p);
        }
    }
    assertTrue(r.isValid());
}

Both XML files validate against the XSD schema; however, the test fails with the result "different".

I've also checked the JUnit test suite included with XMLUnit, but the only relevant test case I found is DefaultComparisonFormatterTest.testComparisonType_ATTR_VALUE_EXPLICITLY_SPECIFIED() here: https://github.com/xmlunit/xmlunit/blob/master/xmlunit-core/src/test/java/org/xmlunit/diff/DefaultComparisonFormatterTest.java#L409
However, this test uses a DTD.

Is it possible to achieve the same result using the XSD schema described above instead of a DTD, please?

PS: thank you, XMLUnit is great! =)

@bodewig
Member

bodewig commented Oct 14, 2016

To be honest, I don't really know whether it is possible.

XMLUnit uses https://docs.oracle.com/javase/7/docs/api/org/w3c/dom/Attr.html#getSpecified() to know whether the attribute has been specified explicitly. I'm not really sure what needs to be done to make the XML parser add the implicit values of attributes from a schema.

You may want to play with something like https://xerces.apache.org/xerces2-j/faq-pcfp.html#faq-4 and compare Documents you've created that way first. From a cursory glance you may need to enable https://xerces.apache.org/xerces2-j/features.html#validation.schema.normalized-value when using Xerces.
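A minimal sketch of that kind of experiment, using only JAXP rather than Xerces-specific features: it attaches the schema to the DocumentBuilderFactory so that validation happens during parsing, which per the JAXP documentation may add missing default values to the resulting DOM. The class name SchemaAwareParse and its constants are made up for illustration; whether Attr.getSpecified() reports false for such attributes depends on the parser, so the demo only prints it.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of XMLUnit.
class SchemaAwareParse {
    // Same schema as in the issue, condensed to one string.
    static final String XSD =
        "<schema xmlns=\"http://www.w3.org/2001/XMLSchema\""
        + " targetNamespace=\"http://www.example.org/schema/\">"
        + "<element name=\"myelement\"><complexType>"
        + "<attribute name=\"myAttribute\" type=\"string\""
        + " default=\"myDefaultValue\" use=\"optional\"/>"
        + "</complexType></element></schema>";

    static final String IMPLICIT_XML =
        "<myelement xmlns=\"http://www.example.org/schema/\"/>";

    /** Parses xml with a builder that validates against xsd while parsing. */
    static Document parse(String xsd, String xml) {
        try {
            Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)));
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // schema validation needs namespaces
            dbf.setSchema(schema);       // validator may add schema defaults
            return dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(XSD, IMPLICIT_XML);
        Attr attr = doc.getDocumentElement().getAttributeNode("myAttribute");
        if (attr == null) {
            System.out.println("parser did not add the default attribute");
        } else {
            System.out.println(attr.getValue()
                + " specified=" + attr.getSpecified());
        }
    }
}
```

Documents produced this way could then be handed to the comparison instead of raw strings, so that both sides carry the attribute.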

I'm absolutely open to modifying the API in order to make the process easier once we know what exactly is involved.

@tarilabs
Author

Thank you for the feedback.

Okay, so for the moment I'm using a custom DifferenceEvaluator to take into account my specific use case as described above, along the lines of:

Set<QName> attrWhichCanDefault = new HashSet<QName>();
attrWhichCanDefault.addAll(Arrays.asList(new QName[] {
        new QName("attr1"), 
        new QName("attr2"), 
        new QName("attr3")
        }));
Set<String> nodeHavingDefaultableAttr = new HashSet<>();
nodeHavingDefaultableAttr.addAll(Arrays.asList(new String[]{"elemA", "elemB"}));
Diff checkSimilar = DiffBuilder
        .compare( control )
        .withTest( test )
        .withDifferenceEvaluator(
                DifferenceEvaluators.chain(DifferenceEvaluators.Default,
                ((comparison, outcome) -> {
                    if (outcome == ComparisonResult.DIFFERENT && comparison.getType() == ComparisonType.ELEMENT_NUM_ATTRIBUTES) {
                        if (comparison.getControlDetails().getTarget().getNodeName().equals( comparison.getTestDetails().getTarget().getNodeName() )
                                && nodeHavingDefaultableAttr.contains( comparison.getControlDetails().getTarget().getNodeName() )) {
                            return ComparisonResult.SIMILAR;
                        }
                    }
                    if (outcome == ComparisonResult.DIFFERENT && comparison.getType() == ComparisonType.ATTR_NAME_LOOKUP) {
                        boolean testIsDefaultableAttribute = false;
                        QName whichDefaultableAttr = null;
                        if (comparison.getControlDetails().getValue() == null && attrWhichCanDefault.contains(comparison.getTestDetails().getValue())) {
                            for (QName a : attrWhichCanDefault) {
                                // the test-side XPath must point at one of the defaultable attributes
                                if (comparison.getTestDetails().getXPath().endsWith("@" + a)) {
                                    testIsDefaultableAttribute = true;
                                    whichDefaultableAttr = a;
                                    break; // first match is enough
                                }
                            }
                        }
                        if (testIsDefaultableAttribute) {
                            // the attribute must belong to the same element on both sides
                            if (comparison.getTestDetails().getXPath().equals(comparison.getControlDetails().getXPath() + "/@" + whichDefaultableAttr)) {
                                return ComparisonResult.SIMILAR;
                            }
                        }
                    }
                return outcome;
            })))
        .ignoreWhitespace()
        .checkForSimilar()
        .build();
assertFalse("XML are NOT similar: " + checkSimilar.toString(), checkSimilar.hasDifferences());

However, the code is far from complete:

  • it should check that the now-explicit optional attribute in the "test" XML is actually set to the default value
  • the "attribute count difference" check could hide a real difference for a node with mixed optional and required attributes; it should verify that the count difference is caused only by the optional attributes, or assume that the "control" XML has at least all the mandatory attributes, which should be the case by definition.

For my use case these gaps should have no impact, so the code is safe enough, but I thought it would be helpful to sketch how I'm currently approaching the problem differently.

Thank you for the pointers on how XMLUnit expects to be notified about optional attributes; I'll try to dig further in that direction as well!
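Both gaps listed above could be closed by one check along these lines. This is a rough sketch on plain DOM elements, not XMLUnit API: the class DefaultAttrCheck, its method names and the name-to-default map are hypothetical, and it assumes unprefixed attribute names. It only covers the direction described in this issue, where "test" has extra attributes that "control" omits.

```java
import java.io.StringReader;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of XMLUnit.
class DefaultAttrCheck {
    /**
     * Returns true if every attribute present on test but absent from
     * control is a known defaultable attribute carrying its default value.
     */
    static boolean extrasAreDefaults(Element control, Element test,
                                     Map<String, String> defaults) {
        NamedNodeMap attrs = test.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Attr a = (Attr) attrs.item(i);
            String name = a.getName();
            if (name.equals("xmlns") || name.startsWith("xmlns:")) {
                continue; // namespace declarations are not real attributes
            }
            if (control.hasAttribute(name)) {
                continue; // present on both sides, compared elsewhere
            }
            // Extra attribute: must be defaultable and equal to its default.
            if (!a.getValue().equals(defaults.get(name))) {
                return false;
            }
        }
        return true;
    }

    /** Small parsing helper so the sketch can be tried stand-alone. */
    static Element root(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            return dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse", e);
        }
    }
}
```

Inside the evaluator, the ELEMENT_NUM_ATTRIBUTES branch could return SIMILAR only when such a check passes for the control and test target nodes.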

@bodewig
Member

bodewig commented Oct 14, 2016

Your approach will certainly work but a generalized solution would require somebody to parse and understand XML Schema. I prefer this "somebody" to be the XML parser :-)

I'll be very interested in your results.

@laeubi

laeubi commented Dec 10, 2022

I have a similar use case where two documents are considered different although they differ only in a default value that is given explicitly. I'd therefore like to know whether anything has changed here.

From the API I would expect something like ignoreDefaultAttributes(XSD). Is something like this possible or planned?

@bodewig
Member

bodewig commented Dec 12, 2022

No, I am not aware of anybody working on this.

The main problem is that the solution requires something to read the XML Schema and understand it well enough to expose the default values for attributes. Right now XMLUnit doesn't contain any code that would understand XML Schema, it purely relies on the JAXP XML parser to do so when validating, for example.
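To illustrate why "understanding" the schema is the hard part, here is a deliberately naive sketch that reads an XSD as plain XML and collects every named attribute declaration that has a default. The class NaiveXsdDefaults is hypothetical and this is not a schema processor: it ignores which element an attribute belongs to, attribute references and groups, type derivation, imports and includes.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of XMLUnit.
class NaiveXsdDefaults {
    static final String XSD_NS = "http://www.w3.org/2001/XMLSchema";

    // Same schema as in the issue, condensed to one string.
    static final String EXAMPLE_XSD =
        "<schema xmlns=\"http://www.w3.org/2001/XMLSchema\""
        + " targetNamespace=\"http://www.example.org/schema/\">"
        + "<element name=\"myelement\"><complexType>"
        + "<attribute name=\"myAttribute\" type=\"string\""
        + " default=\"myDefaultValue\" use=\"optional\"/>"
        + "</complexType></element></schema>";

    /** Maps attribute name to default value for all xs:attribute declarations. */
    static Map<String, String> attributeDefaults(String xsd) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // needed to match on the XSD namespace
            Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xsd)));
            NodeList decls = doc.getElementsByTagNameNS(XSD_NS, "attribute");
            Map<String, String> defaults = new LinkedHashMap<>();
            for (int i = 0; i < decls.getLength(); i++) {
                Element decl = (Element) decls.item(i);
                // Only local, named declarations with a default are picked up.
                if (decl.hasAttribute("name") && decl.hasAttribute("default")) {
                    defaults.put(decl.getAttribute("name"),
                                 decl.getAttribute("default"));
                }
            }
            return defaults;
        } catch (Exception e) {
            throw new IllegalStateException("could not read schema", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(attributeDefaults(EXAMPLE_XSD));
    }
}
```

Two attributes with the same name on different elements would already overwrite each other here, which is one reason a real solution would rather lean on the XML parser.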

@laeubi

laeubi commented Dec 13, 2022

I see. Would it be an option to specify default values for an attribute manually, e.g. something like

ignoreDefaultAttribute(XPath, "default value")

If that were available, one could later enhance it to extract the data from the XSD.
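Until something like that exists, a similar effect might be approximated outside XMLUnit by normalizing both documents before the comparison. The sketch below is only an assumption of how that could look: ManualDefaults and applyDefault are made-up names, the XPath is evaluated without namespace bindings (hence local-name() in the usage), and attribute names are assumed to be unprefixed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of XMLUnit.
class ManualDefaults {
    /**
     * Adds attrName=defaultValue to every element selected by elementXPath
     * that does not carry the attribute yet; returns how many were added.
     */
    static int applyDefault(Document doc, String elementXPath,
                            String attrName, String defaultValue) {
        try {
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(elementXPath, doc, XPathConstants.NODESET);
            int added = 0;
            for (int i = 0; i < hits.getLength(); i++) {
                Node n = hits.item(i);
                if (n instanceof Element
                        && !((Element) n).hasAttribute(attrName)) {
                    // no-namespace attribute, as for unprefixed XML attributes
                    ((Element) n).setAttributeNS(null, attrName, defaultValue);
                    added++;
                }
            }
            return added;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad XPath: " + elementXPath, e);
        }
    }

    /** Namespace-aware parse helper so the sketch can be tried stand-alone. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            return dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse", e);
        }
    }
}
```

For the documents in this issue the call would be applyDefault(doc, "//*[local-name()='myelement']", "myAttribute", "myDefaultValue") on both control and test, after which both carry the attribute explicitly.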

@bodewig
Member

bodewig commented Dec 13, 2022

If you know this upfront precisely enough to specify the default value, then you would probably be able to write a DifferenceEvaluator that turns differences stemming from missing default values into EQUAL results.

What you describe would certainly be possible, but not without adding new ways to extend the DifferenceEngine, i.e. you cannot do that with any of the existing extension points.

It can certainly be done but it would only be useful in a pretty rare situation where (a) you rely on default attribute values and (b) are willing to list them all explicitly once again in addition to the schema. You may be willing to do so, but I doubt many other people are :-)
