Investigate which spec this package implements and identify deviations #41

Open
2 of 3 tasks
joewiz opened this issue Jun 30, 2021 · 2 comments

joewiz commented Jun 30, 2021

Is your feature request related to a problem? Please describe.

The EXPath Crypto spec has at least two significant versions: v1.0, dated 14 Feb 2015 and an unnumbered version, dated 20 Mar 2017. Users and maintainers need to have a clear sense of which spec this package implements and what deviations, if any, there are in eXist's implementation. The README references both versions of the spec but refers to the 2nd version as "the latest version of this specification for this module" and says, "The implementation follows this specification." (For the spec's sources, see https://github.com/expath/expath-cg/tree/master/specs/crypto.)

However, it appears there are deviations between the spec and eXist's implementation. The test suite references error codes that are in neither version of the spec. There are mysterious fragments in the test suite regarding keystores - not mentioned in the latest spec. The README lists "currently implemented functions," but the listed limitations do not clearly align with the function documentation.

To disentangle these issues and clarify what users can use, we should investigate which spec is currently implemented and what, if any, actions might be needed to align the package and the specification. (Perhaps we might even identify improvements needed in the specification and ways to better align with the BaseX implementation—itself "based on an early draft" of the EXPath spec. See the latest discussion at expath/expath-cg#132.)

Describe the solution you'd like

  • Compare (a) the README's list of implemented functions, (b) the generated function documentation, and (c) the "latest version" of the specification
  • Identify divergences
  • Discuss findings
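
A rough way to mechanize the comparison step is to treat each source as a table keyed by function name and arity, then diff the tables. The sketch below is only an illustration in Python: the return types are transcribed from the signatures quoted in this issue, while the `compare` helper and the dictionary layout are my own assumptions, not part of the package or the spec.

```python
# Return types keyed by (function name, arity), transcribed from this issue.
SPEC = {
    ("hash", 2): "xs:string",
    ("hash", 3): "xs:string",
    ("hmac", 3): "xs:byte*",
    ("hmac", 4): "xs:string",
    ("generate-signature", 2): "document()*",
    ("validate-signature", 1): "xs:boolean",
    ("encrypt", 3): "xs:base64Binary",
    ("decrypt", 3): "xs:string",
}

EXIST = {
    ("hash", 2): "xs:byte*",
    ("hash", 3): "xs:byte*",
    ("hmac", 3): "xs:byte*",
    ("hmac", 4): "xs:byte*",
    ("generate-signature", 3): "item()",
    ("generate-signature", 6): "item()",
    ("generate-signature", 7): "item()",
    ("generate-signature", 8): "item()",
    ("validate-signature", 1): "xs:boolean",
    ("encrypt", 6): "xs:string",
    ("decrypt", 6): "xs:string",
}


def compare(spec, impl):
    """Hypothetical helper: report signatures that are missing, extra, or differ."""
    shared = set(spec) & set(impl)
    return {
        # In the spec, but no implementation with that name and arity.
        "missing": sorted(set(spec) - set(impl)),
        # Implemented, but not in the spec with that name and arity.
        "extra": sorted(set(impl) - set(spec)),
        # Same name and arity, different declared return type.
        "return_type_mismatch": sorted(k for k in shared if spec[k] != impl[k]),
    }


if __name__ == "__main__":
    for kind, keys in compare(SPEC, EXIST).items():
        print(kind, keys)
```

This only compares arity and return type; parameter types and cardinalities would need their own columns.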

joewiz commented Jun 30, 2021

In this post I look at each of the functions that appear in the latest spec and compare the function signatures to the corresponding functions in eXist's implementation, as well as the README's remarks, the test suite, and the BaseX implementation.

crypto:hash

latest spec

hash($data      as xs:anyAtomicType,
     $algorithm as xs:string) as xs:string

hash($data      as xs:anyAtomicType,
     $algorithm as xs:string,
     $encoding  as xs:string) as xs:string

eXist function documentation

hash($data      as xs:anyType, 
     $algorithm as xs:string) as xs:byte*

hash($data      as xs:anyType, 
     $algorithm as xs:string, 
     $encoding  as xs:string) as xs:byte*

divergences

  • latest spec says $data parameter is xs:anyAtomicType (and that xs:string, xs:base64Binary, and xs:hexBinary are allowed), but eXist implementation says it's xs:anyType
  • latest spec says function returns xs:string (exactly 1), but eXist implementation says it's xs:byte* (0 or more)
  • latest spec is essentially identical to v1.0 spec ($encoding was called $format in v1.0)
  • in some cases, the test suite invokes hash#3 in which the 3rd parameter is an empty sequence, a case which shouldn't be possible since the cardinality of the 3rd parameter is exactly 1, not 0 or 1.
  • BaseX doesn't implement crypto:hash; its hashing functions are in its hash module, where each hashing algorithm has a dedicated function; the functions all return xs:base64Binary instead of xs:string or xs:byte*.
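
To make the return-type question concrete, here is a small Python sketch (not XQuery, and not code from any of the implementations) showing the same digest in three forms: a hex-encoded string, a Base64-encoded string, and a sequence of signed 8-bit integers, which is how I assume an xs:byte* result would look. The function name `digest_forms` is hypothetical.

```python
import base64
import hashlib


def digest_forms(data, algorithm="sha256"):
    """Return one digest in three representations (illustrative only)."""
    raw = hashlib.new(algorithm, data.encode("utf-8")).digest()
    return {
        # An encoded string, as an xs:string return type would carry.
        "hex_string": raw.hex(),
        "base64_string": base64.b64encode(raw).decode("ascii"),
        # xs:byte is a signed 8-bit integer, so values above 127 wrap negative.
        "signed_bytes": [b - 256 if b > 127 else b for b in raw],
    }


if __name__ == "__main__":
    forms = digest_forms("abc")
    print(forms["hex_string"])
    print(forms["base64_string"])
    print(forms["signed_bytes"][:4])
```

The three forms carry the same information, but callers cannot swap one for another without an explicit conversion, which is why the declared return type matters.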

crypto:hmac

latest spec

hmac($data       as xs:anyAtomicType,
     $key        as xs:anyAtomicType,
     $algorithm  as xs:string) as xs:byte*

hmac($data       as xs:anyAtomicType,
     $key        as xs:anyAtomicType,
     $algorithm  as xs:string,
     $encoding   as xs:string) as xs:string

eXist function documentation

hmac($data      as xs:anyAtomicType*, 
     $key       as xs:anyAtomicType*, 
     $algorithm as xs:string) as xs:byte*

hmac($data      as xs:anyAtomicType*, 
     $key       as xs:anyAtomicType*, 
     $algorithm as xs:string, 
     $encoding  as xs:string) as xs:byte*

divergences

  • latest spec says $data parameter is xs:anyAtomicType (exactly 1; and that xs:string, xs:byte*, xs:base64Binary, and xs:hexBinary are allowed), but eXist implementation says it's xs:anyAtomicType* (0 or more)
  • latest spec says $key parameter is xs:anyAtomicType (exactly 1), but eXist implementation says it's xs:anyAtomicType* (0 or more)
  • latest spec says hmac#4 returns xs:string (exactly 1), but eXist implementation says it's xs:byte* (0 or more)
  • latest spec says hmac#3 returns xs:byte* (0 or more), but v1.0 spec says it's xs:string (exactly 1)
  • latest spec discusses the xs:byte* vs. xs:string difference: "[this function] has two signatures; the first one outputs the result as xs:byte*, while the second one outputs the result as encoded xs:string." No further explanation. But eXist's function signature for hmac#4 doesn't follow this distinction.
  • README states "only for xs:string data for now"—presumably referring to the $data parameter
  • BaseX follows the v1.0 spec in returning xs:string in both function signatures. The v1.0 spec has no mention of xs:byte at all.
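
The two-signature distinction quoted from the latest spec can be sketched in Python as follows. This is an illustration under assumptions, not the eXist or BaseX code: the helper names are hypothetical, and I am assuming "hex" and "base64" as the accepted `$encoding` values.

```python
import base64
import hashlib
import hmac


def hmac_bytes(data, key, algorithm):
    """Analogue of hmac#3 in the latest spec: result as signed bytes (xs:byte*)."""
    raw = hmac.new(key, data, algorithm).digest()
    return [b - 256 if b > 127 else b for b in raw]


def hmac_encoded(data, key, algorithm, encoding):
    """Analogue of hmac#4 in the latest spec: result as an encoded string."""
    raw = hmac.new(key, data, algorithm).digest()
    if encoding == "hex":
        return raw.hex()
    if encoding == "base64":
        return base64.b64encode(raw).decode("ascii")
    # Assumed behavior: reject anything other than the two encodings above.
    raise ValueError("unsupported encoding: " + encoding)


if __name__ == "__main__":
    message, secret = b"message", b"secret"
    print(hmac_bytes(message, secret, "sha256")[:4])
    print(hmac_encoded(message, secret, "sha256", "hex"))
```

Under this reading, an implementation whose 4-parameter variant also returns bytes, as eXist's documentation states, leaves the `$encoding` argument without an obvious effect on the result type.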

crypto:generate-signature

latest spec

crypto:generate-signature($data       as document()?,
                          $parameters as map(xs:string, item()+)?) as document()* 

eXist function documentation

generate-signature($data                       as item(), 
                   $canonicalization-algorithm as xs:string, 
                   $digest-algorithm           as xs:string, 
                   $signature-algorithm        as xs:string, 
                   $signature-namespace-prefix as xs:string, 
                   $signature-type             as xs:string) as item()

generate-signature($data                       as item(),
                   $canonicalization-algorithm as xs:string, 
                   $digest-algorithm           as xs:string, 
                   $signature-algorithm        as xs:string, 
                   $signature-namespace-prefix as xs:string, 
                   $signature-type             as xs:string, 
                   $xpath-expression           as xs:anyType) as item()

generate-signature($data                       as item(),
                   $canonicalization-algorithm as xs:string, 
                   $digest-algorithm           as xs:string, 
                   $signature-algorithm        as xs:string, 
                   $signature-namespace-prefix as xs:string, 
                   $signature-type             as xs:string, 
                   $xpath-expression           as xs:anyType, 
                   $digital-certificate        as xs:anyType) as item()

generate-signature($data                       as item(),
                   $private-key                as xs:string, 
                   $signature-algorithm        as xs:string) as item()

divergences

  • latest spec (and v1.0 spec) lists only one function whose 2nd parameter is a map, but eXist implementation contains 4 functions with 3-8 parameters
  • latest spec says $data is of type document()?, but eXist implementation says it's item()
  • latest spec's list of parameter map entries says $key is of type xs:anyAtomicType (and that xs:string, xs:base64Binary, and xs:hexBinary are allowed), but the eXist implementation says it's xs:string
  • latest spec's list of parameter map entries lacks the $digital-certificate parameter that eXist's and BaseX's function documentation shows
  • eXist's generate-signature#3 is the only version of this function that takes a $private-key parameter; it has no counterpart in the latest spec.
  • The test suite only tests generate-signature#6, not 7-8 or 3
  • BaseX lacks generate-signature#3

crypto:validate-signature

latest spec

crypto:validate-signature($data as document()) as xs:boolean

eXist function documentation

validate-signature($data as node()) as xs:boolean

divergences

  • spec says $data is of type document(), but eXist implementation says it's node() (as does BaseX)

crypto:encrypt

latest spec

encrypt($data       as xs:anyAtomicType,
        $type       as xs:string,
        $parameters as map(xs:string, item())?) as xs:base64Binary

eXist function documentation

encrypt($data            as xs:anyAtomicType, 
        $encryption-type as xs:string, 
        $secret-key      as xs:string, 
        $algorithm       as xs:string, 
        $iv              as xs:string?, 
        $provider        as xs:string?) as xs:string

divergences

  • latest spec (and v1.0 spec) says 3rd parameter is a map, but eXist implementation contains 6 parameters (BaseX lacks eXist's parameters 5-6).
  • latest spec's list of parameter map entries says $key is of type xs:anyAtomicType (and that xs:string, xs:base64Binary, and xs:hexBinary are allowed), but the eXist implementation says it's xs:string.
  • latest spec's list of parameter map entries says $iv is of type xs:string (exactly 1), but the eXist implementation says it's xs:string? (0 or 1).
  • README says, "only for xs:string data and symmetric encryption for now"
  • the test suite only tests encrypt#6, not 3

crypto:decrypt

latest spec

decrypt($data       as xs:anyAtomicType,
        $type       as xs:string,
        $parameters as map(xs:string, item())?) as xs:string

eXist function documentation

decrypt($data            as xs:anyAtomicType,
        $decryption-type as xs:string,
        $secret-key      as xs:string,
        $algorithm       as xs:string,
        $iv              as xs:string?,
        $provider        as xs:string?) as xs:string

divergences

  • latest spec (and v1.0 spec) says 3rd parameter is a map, but eXist implementation contains 6 parameters (BaseX lacks eXist's parameters 5-6).
  • latest spec's list of parameter map entries says $key is of type xs:anyAtomicType (and that xs:string, xs:base64Binary, and xs:hexBinary are allowed), but the eXist implementation says it's xs:string.
  • latest spec's list of parameter map entries says $iv is of type xs:string (exactly 1), but the eXist implementation says it's xs:string? (0 or 1).
  • README says, "only for xs:string data and symmetric decryption for now"
  • the test suite only tests decrypt#6, not 3

other notes

  • The v1.0 spec lists many more functions than are in the "latest spec". None of these additional v1.0 functions appear in the eXist or BaseX implementation. Categories of functions dropped in "latest spec" and unimplemented in eXist or BaseX include:
    • Cryptographic Service Providers (list-providers, list-services)
    • Key Management (generate-key-pair, generate-secret-key, compare-keys, key-agreement, convert-key-specification-to-key-object, convert-key-object-to-key-specification)
    • Secure Storing of Sensitive Keying and Data Material (create-secure-store, load-secure-store, convert-secure-store, get-secure-store-metadata, metadata, add-entry, get-entry, delete-entry, get-entry-metadata, list-trusted-certificate-authorities)
    • Digital Certificates (generate-certificate, validate-certificate, parse-certificate, generate-certification-path, validate-certification-path, generate-certification-request, validate-certification-request, validate-certification-revocation-list)
    • Random Sequences Generation (generate-random-number).
  • BaseX's error conditions use numbers with a CX prefix (some of which appear in the test suite!), whereas the latest spec's error conditions use hyphen-delimited phrases
  • The latest spec contains error codes for "error reading keystore or password incorrect", but keystores and passwords aren't mentioned anywhere else in the spec. Perhaps these errors are left over from the v1.0 spec, which mentioned keystores and passwords in the context of the generate-signature function. It seems eXist and BaseX have not implemented functions related to keystores or passwords.
  • Saxon does not appear to have an implementation of any crypto functions (only file, binary, and archive are listed on Saxon's list of supported EXPath extensions)


joewiz commented Jun 30, 2021

In conclusion, there are numerous minor divergences between the latest spec and the eXist implementation.

There is a surprising number of differences in parameter types and cardinalities in function signatures. Questions for further investigation: Which of these, if any, are significant? (The xs:string vs. xs:byte* return types for crypto:hash and crypto:hmac seem significant; cf. BaseX's use of xs:base64Binary for the return type of its hashing functions.) And would updating eXist's implementation to match the spec break or fix the tests (and thus code using these functions)?

The biggest difference between the latest spec and the eXist implementation is in the functions where the spec uses a map for parameters, whereas eXist (and BaseX) do not appear to support this and instead use multi-parameter function signatures. How to reconcile this difference between the spec and the implementations is perhaps the most glaring issue.
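
One conceivable way to reconcile the two styles is a thin adapter that accepts the spec's parameter map and forwards to the existing positional signature. The Python sketch below shows only the shape of that idea. The map key names are assumptions (only `key` and `iv` are named in the comparison above; `algorithm` and `provider` are guesses based on eXist's positional parameter names), and `map_to_positional` is a hypothetical helper.

```python
# Assumed map keys, listed in the order of eXist's positional parameters 3-6:
# $secret-key, $algorithm, $iv, $provider.
POSITIONAL_ORDER = ("key", "algorithm", "iv", "provider")


def map_to_positional(data, type_, parameters=None):
    """Turn a spec-style (data, type, map) call into a 6-argument tuple."""
    params = parameters or {}
    unknown = set(params) - set(POSITIONAL_ORDER)
    if unknown:
        # Assumed behavior: unknown entries are an error, not silently ignored.
        raise KeyError("unsupported parameter(s): " + ", ".join(sorted(unknown)))
    # Absent entries become None, standing in for an empty sequence.
    return (data, type_) + tuple(params.get(name) for name in POSITIONAL_ORDER)


if __name__ == "__main__":
    args = map_to_positional(
        "plain text",
        "symmetric",
        {"key": "1234567890abcdef", "algorithm": "AES"},
    )
    print(args)
```

An adapter like this would let the map-based signature be added without removing the positional one, though whether that is desirable is a separate question for the spec discussion.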

The test suite is another window into what eXist supports (alongside the function signatures). Without knowing which tests previously passed, it's hard to say which test failures represent a regression with the new 6.0.0-SNAPSHOT version. A lot of work appears to remain in investigating the failing tests.

The low-hanging fruit, which would fix 3 test failures, is the error codes: the tests still use a pre-1.0 draft set of error codes (seen in the BaseX documentation). Updating the error codes in the 3 tests annotated with %test:assertError should fix those, leaving only 10 failures.
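
Finding the affected tests could be scripted. The Python sketch below scans XQuery source for `%test:assertError` annotations whose code has a CX prefix followed by digits. The sample source, the function names in it, and both error codes are made up for illustration; the real test suite's codes should be checked against the spec before anything is replaced.

```python
import re

# Matches %test:assertError("...") and captures the quoted error code.
ASSERT_ERROR = re.compile(r'%test:assertError\(\s*"([^"]*)"\s*\)')
# Legacy-style code: "CX" followed by digits, optionally after a prefix and colon.
LEGACY_CODE = re.compile(r"(^|:)CX\d+$")

# Hypothetical test module fragment; neither code is taken from the real suite.
SAMPLE = '''
declare
    %test:assertError("crypto:CX21")
function t:legacy() { () };
declare
    %test:assertError("crypto:some-hyphenated-code")
function t:current() { () };
'''


def legacy_error_assertions(xquery_source):
    """Return (line number, code) for each assertion using a CX-style code."""
    hits = []
    for lineno, line in enumerate(xquery_source.splitlines(), start=1):
        for code in ASSERT_ERROR.findall(line):
            if LEGACY_CODE.search(code):
                hits.append((lineno, code))
    return hits


if __name__ == "__main__":
    for lineno, code in legacy_error_assertions(SAMPLE):
        print(lineno, code)
```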
