Design Decisions
The public API of the core libraries will be very similar between the Java and .NET implementations. At the same time, each implementation will use the established idioms of its platform: interfaces are prefixed with I in the .NET version but not in the Java version, delegates replace single-method interfaces, properties replace getter/setter pairs, and events replace addXYZListener methods.
In order to make the implemented algorithms similar, implementation differences between Java's and .NET's XML stacks are hidden behind helper methods in the org.xmlunit.util package and the Org.XmlUnit.Util namespace.
The Java implementations of the validation, XSLT and XPath parts are based on JAXP packages that all use javax.xml.transform.Source to specify XML documents or schema definitions. This makes Source the natural choice for a unified input type for all parts of XMLUnit.
For .NET there is no such common type; most of the class library implementations are based on reader abstractions. javax.xml.transform.Source only specifies a getter/setter pair for a system ID. For .NET, Org.XmlUnit.ISource extends this with a read-only property for an XmlReader. The Org.XmlUnit.Input namespace provides ISource implementations similar to the Source implementations of the Java class library.
XMLUnit for Java 1.x had a few static properties that controlled how documents were interpreted, for example whether whitespace or comments were significant. Rather than repeating a similar design, XMLUnit 2.x makes these options available as decorators for Source. CommentLessSource, for example, uses XSLT to strip comments from an arbitrary (I)Source. This way new "interpretations" can be added as new classes without touching the whole library. At the same time they are available to all parts of XMLUnit, not just to comparisons.
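The decorator idea can be sketched with nothing but JAXP. The class and method names below are hypothetical and this is not the code of CommentLessSource; it only assumes that an identity stylesheet with an empty template for comments is an acceptable way to strip them.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper for illustration, not XMLUnit's CommentLessSource.
final class CommentStrippingSketch {

    // Identity transformation plus a higher-priority empty template
    // that swallows every comment node.
    private static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:template match=\"@*|node()\">"
        + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
        + "</xsl:template>"
        + "<xsl:template match=\"comment()\" priority=\"1\"/>"
        + "</xsl:stylesheet>";

    // Wraps an arbitrary Source in a new Source that contains no comments.
    static Source withoutComments(Source original) throws TransformerException {
        Transformer stripper = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        DOMResult result = new DOMResult();
        stripper.transform(original, result);
        DOMSource stripped = new DOMSource(result.getNode());
        // Keep the system ID so relative references still resolve.
        stripped.setSystemId(original.getSystemId());
        return stripped;
    }

    // Serializes a Source with an identity transformer, for inspection only.
    static String serialize(Source source) throws TransformerException {
        StringWriter out = new StringWriter();
        TransformerFactory.newInstance().newTransformer()
            .transform(source, new StreamResult(out));
        return out.toString();
    }
}
```

Because the result is again a Source, it can be handed to validation, XPath or comparison code alike, which is the point of modelling the option as a decorator.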
The Transformation class in org.xmlunit.transform provides a thin layer over TraX or System.Xml.Xsl respectively and really only exists to support specifying results of XSLT transformations as inputs. It is not supposed to be a general-purpose XSLT API but will remain tailored to XMLUnit's needs.
The XPathEngine interface is minimal, as it is expected that more advanced features, such as expecting the outcome to match a certain regular expression, can better be implemented by matchers or constraints. In fact Hamcrest's StringMatcher and NUnit's StringAssert should be able to go a long way for testing.
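A minimal engine of this kind can be sketched on top of javax.xml.xpath. The class below is a hypothetical stand-in, not the XPathEngine interface itself; it only assumes two operations, one returning a string and one returning the selected nodes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical minimal engine for illustration, not XMLUnit's XPathEngine.
final class MinimalXPathSketch {
    private final XPath xpath = XPathFactory.newInstance().newXPath();

    // Evaluates the expression and returns its XPath string value.
    String evaluate(String expression, Node context) throws XPathExpressionException {
        return xpath.evaluate(expression, context);
    }

    // Evaluates the expression and returns the selected nodes in document order.
    List<Node> selectNodes(String expression, Node context) throws XPathExpressionException {
        NodeList nodes = (NodeList) xpath.evaluate(expression, context, XPathConstants.NODESET);
        List<Node> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i));
        }
        return result;
    }

    // Test helper: parses a string into a namespace-aware DOM document.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }
}
```

Anything beyond that stays outside the engine. A regular-expression check, for instance, is plain string code applied to the result: `engine.evaluate("/order/@id", doc).matches("[A-Z]{2}-\\d+")`.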
Should there be support for evaluating an XPath as a qualified name, the way XMLUnit for Java's assertXpathEvaluatesTo with a QualifiedName argument does, or is this better implemented in a matcher on top of selectNodes?
The primary target for validation support is XML Schema, which is well covered by JAXP and System.Xml.Schema. Still, the API shall be open to extension to other "schema" languages. The schema languages are identified by strings rather than enums, as string values are needed at least for JAXP anyway and .NET enums are not very convenient if you want to do more than just enumerate a set of values.
The Java version supports DTDs via a validating SAXParser and any schema language JAXP supports - this includes Relax NG's XML syntax if all the required libraries are available. The .NET version supports DTD and the deprecated XDR - at least as long as the .NET Framework version still supports XDR.
Should there be a way to register additional languages and custom validators for them?
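On the Java side, identifying the schema language by a string maps directly onto javax.xml.validation. The following is a minimal sketch with hypothetical class and method names, not XMLUnit's validation API; it assumes that collecting problems in a list is preferable to failing on the first one.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.transform.Source;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper for illustration, not XMLUnit's validation API.
final class ValidationSketch {

    // The language is a plain string such as XMLConstants.W3C_XML_SCHEMA_NS_URI;
    // JAXP picks a matching SchemaFactory or throws IllegalArgumentException.
    static List<String> validate(String language, Source schemaSource, Source instance)
            throws SAXException, IOException {
        SchemaFactory factory = SchemaFactory.newInstance(language);
        Schema schema = factory.newSchema(schemaSource);
        Validator validator = schema.newValidator();

        // Collect validity problems instead of aborting on the first one.
        List<String> problems = new ArrayList<>();
        validator.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) {
                problems.add("warning: " + e.getMessage());
            }
            @Override public void error(SAXParseException e) {
                problems.add("error: " + e.getMessage());
            }
            @Override public void fatalError(SAXParseException e) throws SAXException {
                throw e; // not well-formed, nothing sensible left to validate
            }
        });
        validator.validate(instance);
        return problems;
    }
}
```

Supporting another language would then come down to having a SchemaFactory for its identifier on the classpath, which is how JAXP itself is extended.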
The core DifferenceEngine only performs comparisons and provides hooks for all kinds of decision making. It must not modify inputs, ignore content or stop the comparison by itself; this is the job of interface implementations the user can select. The compare method drives a single comparison of two inputs and provides information about the atomic comparisons it performs to the registered listeners.
Many of the "hooks" or "helpers" are similar to what XMLUnit for Java 1.x provided, some have changed their name or the method signatures of the interface.
The 1.x DifferenceListener interface had two responsibilities: recording the differences found and determining the severity of a difference. It was never informed of comparisons the DifferenceEngine considered equal; this was the job of MatchTracker, which could not alter a comparison's outcome. Whether the comparison as a whole should continue or not was decided by a ComparisonController.
In 2.x the responsibilities are distributed differently. DifferenceEvaluator is responsible for determining the severity of all comparisons, even those that seem to be equal. The ComparisonListener is notified of any kind of comparison, and it is possible to subscribe selectively to comparisons whose outcome is equal, to those whose outcome is different, or to all comparisons. ComparisonController can halt the comparison as a whole; only its interface has changed compared to XMLUnit for Java 1.x.
The DifferenceEvaluators class contains a few implementations of DifferenceEvaluator, including the Default implementation, which uses rules similar to those of XMLUnit for Java 1.x's DifferenceEngine.
The ComparisonControllers class contains two trivial implementations, Default which behaves like DetailedDiff in 1.x and StopWhenDifferent which behaves like Diff.
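The division of labour between the three roles can be illustrated with simplified stand-ins. Every type and signature below is hypothetical and deliberately smaller than the real interfaces; the sketch only assumes that the engine produces a raw outcome, lets the evaluator have the final say on severity, notifies the listener, and then asks the controller whether to go on.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Simplified, hypothetical stand-ins; not the real XMLUnit interfaces.
final class ComparisonRolesSketch {
    enum Result { EQUAL, SIMILAR, DIFFERENT }

    // One atomic comparison: what is compared and the two values.
    static final class Comparison {
        final String type;
        final Object control;
        final Object test;
        Comparison(String type, Object control, Object test) {
            this.type = type;
            this.control = control;
            this.test = test;
        }
    }

    // Decides the severity of every comparison, including equal ones.
    interface Evaluator { Result evaluate(Comparison c, Result outcome); }
    // Is told about comparisons but cannot change their outcome.
    interface Listener { void comparisonPerformed(Comparison c, Result outcome); }
    // Decides whether the comparison as a whole continues.
    interface Controller { boolean stopDiffing(Comparison c, Result outcome); }

    // Two trivial controllers in the spirit of Default and StopWhenDifferent.
    static final Controller NEVER_STOP = (c, r) -> false;
    static final Controller STOP_WHEN_DIFFERENT = (c, r) -> r == Result.DIFFERENT;

    // The engine itself only compares and delegates every decision.
    static List<String> run(List<Comparison> comparisons, Evaluator evaluator,
                            Controller controller) {
        List<String> log = new ArrayList<>();
        Listener listener = (c, r) -> log.add(c.type + "=" + r);
        for (Comparison c : comparisons) {
            Result raw = Objects.equals(c.control, c.test) ? Result.EQUAL : Result.DIFFERENT;
            Result evaluated = evaluator.evaluate(c, raw);
            listener.comparisonPerformed(c, evaluated);
            if (controller.stopDiffing(c, evaluated)) {
                break;
            }
        }
        return log;
    }
}
```

With an evaluator that downgrades differences in a namespace prefix to SIMILAR, NEVER_STOP records every comparison, while STOP_WHEN_DIFFERENT ends the run at the first comparison that is still DIFFERENT after evaluation.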
In order to deal with documents that differ in the order of the child nodes of a given parent, XMLUnit for Java 1.x allowed the algorithm that decided which child element of the test document is compared with which child element of the control document to be overridden by custom ElementQualifier implementations. There was no way to influence the selection of pairs for non-element children like comments.
XMLUnit 2.x uses the NodeMatcher interface, which is more general. Its default implementation DefaultNodeMatcher performs matching similar to XMLUnit for Java 1.x, where ElementSelector replaces ElementQualifier. The additional NodeTypeMatcher interface allows nodes of different types to be compared with each other - the default implementation allows CDATA sections and text nodes to be compared with each other.
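Order-insensitive matching of child elements can be sketched as follows. The Selector interface and the greedy pairing strategy are assumptions made for this illustration and are not taken from DefaultNodeMatcher; the sketch pairs each control child with the first unused test child the selector accepts.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch of element matching, not XMLUnit's DefaultNodeMatcher.
final class NodeMatchingSketch {

    // Stand-in for an element selector: may these two elements be compared?
    interface Selector { boolean canBeCompared(Element control, Element test); }

    static final Selector BY_NAME = (c, t) -> c.getNodeName().equals(t.getNodeName());

    // Collects the element children of a parent, skipping text and comments.
    static List<Element> childElements(Element parent) {
        List<Element> children = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                children.add((Element) n);
            }
        }
        return children;
    }

    // Returns {control, test} pairs; children without a partner are left out.
    static List<Element[]> match(Element controlParent, Element testParent, Selector selector) {
        List<Element> unmatched = childElements(testParent);
        List<Element[]> pairs = new ArrayList<>();
        for (Element control : childElements(controlParent)) {
            for (Iterator<Element> it = unmatched.iterator(); it.hasNext();) {
                Element test = it.next();
                if (selector.canBeCompared(control, test)) {
                    pairs.add(new Element[] { control, test });
                    it.remove(); // each test child is used at most once
                    break;
                }
            }
        }
        return pairs;
    }

    // Test helper: parses a string and returns its document element.
    static Element parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml))).getDocumentElement();
    }
}
```

Swapping BY_NAME for a selector that also looks at an attribute or at the text content changes which pairs are formed without touching the pairing loop.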
XMLUnit 2.x will always ignore the order of attributes, as the order is irrelevant according to the standard and XML parsers are free to modify the order as they see fit anyway. The default DifferenceEvaluator considers text nodes and CDATA sections similar, as CDATA sections are really only serialization artifacts.
DOMDifferenceEngine is a long class, and the fact that it recurses into the structure makes it difficult to stop the comparison at an arbitrary level. The XMLUnit for Java 1.x version used exceptions for control flow, which didn't feel right. The first 2.x implementations were littered with
    result = someComparison();
    if (result == CRITICAL) {
        return CRITICAL;
    }
    result = nextComparison();
    if (result == CRITICAL) {
        return CRITICAL;
    }
    result = ...
(at that time DifferenceEvaluator was responsible for stopping the comparison process with a special ComparisonResult), which wasn't any better either. We even used code generation at one point, but it was ugly as well. Right now comparisons are chained in constructs that perform a certain deferred comparison only if the ComparisonController hasn't signalled to stop the whole comparison process. This doesn't really look pretty in Java without lambdas either. It would be good to find a nicer approach.
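For comparison, here is how such a chain of deferred comparisons might read once lambdas are available. This is a sketch under stated assumptions and not the current implementation: Chain, andThen and the stop predicate are hypothetical names, and the predicate stands in for the ComparisonController.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical sketch of chaining deferred comparisons with lambdas.
final class ComparisonChainSketch {
    enum Result { EQUAL, DIFFERENT }

    static final class Chain {
        private final Predicate<Result> stopWhen; // stand-in for the controller
        private final List<Result> results = new ArrayList<>();
        private boolean stopped;

        Chain(Predicate<Result> stopWhen) {
            this.stopWhen = stopWhen;
        }

        // Runs the deferred comparison unless an earlier step stopped the chain.
        Chain andThen(Supplier<Result> comparison) {
            if (!stopped) {
                Result result = comparison.get();
                results.add(result);
                stopped = stopWhen.test(result);
            }
            return this;
        }

        boolean isStopped() { return stopped; }
        List<Result> results() { return results; }
    }

    // A trivial atomic comparison used by the chain.
    static Result compare(Object control, Object test) {
        return control.equals(test) ? Result.EQUAL : Result.DIFFERENT;
    }
}
```

A chain such as `new Chain(r -> r == Result.DIFFERENT).andThen(() -> compare("a", "a")).andThen(() -> compare(1, 2)).andThen(() -> compare("x", "y"))` never evaluates the third supplier, so the repeated "check the result and return" blocks collapse into a single place.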
All parts of XMLUnit described so far provide traditional APIs that may be cumbersome to use in certain contexts, such as when writing unit tests. This is particularly true when configuring a DifferenceEngine with various options and performing a comparison.
Builders using a fluent style are provided to create Source instances from various inputs or to perform XSLT transformations. There will be a builder for comparing XML documents and probably a related builder that helps with configuring the node matching algorithms.