New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[C#,C++,Java] Generate DTOs for non-perf-sensitive usecases. #957
base: master
Are you sure you want to change the base?
Commits on May 15, 2024
-
[C#] Generate DTOs from SBE IR for non-perf-sensitive usecases.
In some applications performance is not cricital. Some users would like to use SBE across their whole "estate", but don't want the "sharp edges" associated with using flyweight codecs, e.g., accidental escape. In this commit, I've added a first cut of DTO generation for C# and a simple test based on the Car Example. The DTOs support encoding and decoding via the generated codecs using `EncodeInto(CodecT codec)` and `DecodeFrom(CodecT codec)` methods. Currently there is no support for equality/comparison or read-only views over the data; although, these have been requested. Here are some points that we may or may not wish to change in the future: 1. Non-present (due to the encoded version) string/array data and repeating groups are represented as `null` rather than empty. 2. Non-present primitive values are represented as their associated null value rather than using nullable types. 3. Non-present bitsets are represented as `0`. 4. DTOs are generated via a separate `CodeGenerator` rather than a flag to the existing C# `CodeGenerator`.
Configuration menu - View commit details
-
Copy full SHA for c5067e6 - Browse repository at this point
Copy the full SHA c5067e6View commit details -
[Java] Start to introduce property based testing (PBT).
In this commit, I've: 1. Added a test dependency on JQwik. 2. Added a JQwik-based generator of arbitrary valid SBE message schemas. This generator exercises a lot of possibilities but not all. In particular, it is missing the generation of constants, min/max, and custom offsets. 3. Added a test that shows the parser parses any arbitrary SBE message schema. Why have I introduced PBT? My aim is to eventually test (an approximation of) the following property: ``` ∀ msgSchema ∈ PossibleMessageSchemas, ∀ bytes ∈ PossibleValidValues(msgSchema), DtoEncode(DtoDecode(bytes)) = bytes ``` I.e., that `DtoEncode` is the inverse of `DtoDecode` and that it preserves the information in an encoded message.
Configuration menu - View commit details
-
Copy full SHA for 440a0b0 - Browse repository at this point
Copy the full SHA 440a0b0View commit details -
Configuration menu - View commit details
-
Copy full SHA for 9d038c9 - Browse repository at this point
Copy the full SHA 9d038c9View commit details -
[Java] Move property tests to their own source set.
Gradle 8 complains about deprecations when using the "old style" test source set creation. Therefore, I have had to use `jvm-test-suites`, which is the Gradle 9 way of doing things. I have moved the property tests into their own source set and own test task: `propertyTest`, as they may be more expensive to run than the other unit tests. We may only wish to run them nightly in CI. I have also profiled the `propertyTest` task and removed usage of JQwik list item duplication, as it dominates the time taken during testing. The time taken is now dominated by compiling XPaths.
Configuration menu - View commit details
-
Copy full SHA for e988b51 - Browse repository at this point
Copy the full SHA e988b51View commit details -
Configuration menu - View commit details
-
Copy full SHA for 5118d06 - Browse repository at this point
Copy the full SHA 5118d06View commit details -
[Java] Split apart schema generation files.
I think separate files improves the readability of the arbitrary message schema generation.
Configuration menu - View commit details
-
Copy full SHA for 3ccb3e8 - Browse repository at this point
Copy the full SHA 3ccb3e8View commit details -
[Java] Extend PBT to generate arbitrary encoded messages.
In this commit, I've also demonstrated the usage of arbitrary encoded messages in a JsonPrinter test. In the test, I print an arbitrary value of an arbitrary schema and check that it is valid JSON. The test passes as long as we constrain the generation of strings, as JsonPrinter currently does not do any escaping. I also noted that it uses `'` rather than `"`, which AFAICS is not permitted by the JSON specification, but is permitted by our (less strict) JSON testing tool.
Configuration menu - View commit details
-
Copy full SHA for 31d8d48 - Browse repository at this point
Copy the full SHA 31d8d48View commit details -
[Java] Test DTOs preserve information.
In this commit, I've added a round trip property test, where I show `dto.EncodeInto(...)` is the inverse of `dto.DecodeFrom(...)` and that the original encoded value is preserved through the transformation.
Configuration menu - View commit details
-
Copy full SHA for 41c01ba - Browse repository at this point
Copy the full SHA 41c01baView commit details -
[Java,C#] Make dotnet executable customisable in tests.
Addresses CodeQL comment.
Configuration menu - View commit details
-
Copy full SHA for dd36f31 - Browse repository at this point
Copy the full SHA dd36f31View commit details -
Configuration menu - View commit details
-
Copy full SHA for c03e53e - Browse repository at this point
Copy the full SHA c03e53eView commit details -
[Java] Use arbitrary fixed-length arrays in property tests.
Previously, we were only generating non-array fields in blocks. Now, we can exercise more possibly specifications in our property-based tests.
Configuration menu - View commit details
-
Copy full SHA for 5377d9a - Browse repository at this point
Copy the full SHA 5377d9aView commit details -
[Java] Model optional fields in property-based tests.
Previously, we were only supplying required fields in the schema. Now, we allow fields and primitive types to be marked as optional. Note that they can be marked as optional in two places. Optional => null value. There are some oddities around specifying `nullValue`s: real-logic#429 We currently use the default `nullValue`s for types rather than generating arbitrary ones.
Configuration menu - View commit details
-
Copy full SHA for 0d3a6bc - Browse repository at this point
Copy the full SHA 0d3a6bcView commit details -
[Java, C#] Add a GitHub workflow for slow checks.
This workflow is configured to run once per day or when manually requested. At the moment, the slow checks include several property-based tests. These tests take a significant amount of time to execute; therefore, we decided it was better to keep them out of the CI build.
Configuration menu - View commit details
-
Copy full SHA for ead428c - Browse repository at this point
Copy the full SHA ead428cView commit details -
[Java] Extend arbitrary varData encodings.
We now explore more of the "schema space", as we generate lengths of different encodings. It surfaces an issue, reported in real-logic#955, which was fixed in another outstanding PR's commit: 3885a63.
Configuration menu - View commit details
-
Copy full SHA for a207daf - Browse repository at this point
Copy the full SHA a207dafView commit details -
[Java, C#] Tidy up encoded message writing in tests.
Previously, we were writing all bytes in a buffer rather than just those used for encoding the message. Now, we keep track of the "limit" of the message and only use the bytes in the buffer up to the limit.
Configuration menu - View commit details
-
Copy full SHA for a83ee9a - Browse repository at this point
Copy the full SHA a83ee9aView commit details -
[C#] Upgrade to latest LTS .NET version.
To support records in DTOs, we require C# 9+. This is not supported by .NET 3. Also, this old version is no longer supported by Microsoft. Therefore, I have upgraded to the latest LTS release.
Configuration menu - View commit details
-
Copy full SHA for d312555 - Browse repository at this point
Copy the full SHA d312555View commit details -
[C#] Generate DTOs using C# 9+ records.
The reason behind this change is to support equality and comparison of DTOs in tests by leveraging records.
Configuration menu - View commit details
-
Copy full SHA for ff99679 - Browse repository at this point
Copy the full SHA ff99679View commit details -
[C#] Use more-idiomatic null representations.
Also, make records immutable, as it is more idiomatic to use `with` expressions. In the future, it may be preferable to declare record fields in the constructor rather than as init-only properties; however, this would be a much larger change. This is the current mapping of optional fields: - Primitives: as nullable value types - Composites: as nullable references - Enums: we already generate a `NULL_VALUE` case and we use that - Bitsets: as a flags enum where `0` is the null value - Strings: as empty string - Arrays: as empty read-only lists
Configuration menu - View commit details
-
Copy full SHA for a58e36c - Browse repository at this point
Copy the full SHA a58e36cView commit details -
[IR] Change default float and double values to minimum representable …
…values. Previously, we were using the minimum positive value as the `minValue`.
Configuration menu - View commit details
-
Copy full SHA for 8f6b74f - Browse repository at this point
Copy the full SHA 8f6b74fView commit details -
[C#] Add validation to DTOs and improve use of records.
The aim of these changes are to avoid multiple representations in DTOs and to support read-only views over data. The new validation includes: - Checking null values "idiomatic null values" rather than the reserved null value, to prevent against multiple kinds of null in DTOs. - Checking primitive field values are at least `minValue` and at most `maxValue`. Note that this validation is not applied to fixed-size arrays as the specification says, "Data range attributes minValue and maxValue do not apply", under the "Fixed-length data" section. Records are now immutable. Record expressions, i.e., using `with`, are supported and will apply validation, as we have customised the `init` property accessor. I have not included support for encoding null composite values from DTOs; however, in theory, this could be added later.
Configuration menu - View commit details
-
Copy full SHA for 92331c8 - Browse repository at this point
Copy the full SHA 92331c8View commit details -
[C#] Support DTO generation via system property.
Previously, you had to pass the `CSharpDtos` FQN as the `TargetCodeGenerator` when running sbe-tool; however, that means the XML schema was parsed multiple times, as DTOs depend on flyweights, which seems wasteful. Therefore, I have introduced a system property, `sbe.csharp.generate.dtos` that also controls the generation of DTOs when targetting `CSharp`.
Configuration menu - View commit details
-
Copy full SHA for 3f4168e - Browse repository at this point
Copy the full SHA 3f4168eView commit details -
[C#] Extend tests to cover extended schemas.
These tests showed some deficiencies in the DTO generation. For example, added variable-length data was not represented as nullable nor properly handled in `EncodeInto(...)`. During this work, I noticed composite "field" tokens (i.e., types) take their `token.version()` from their containing message/group field. I had to adjust some code that was using `token.version() > 0` to determine whether a field had been added, as this only works with message/group fields.
Configuration menu - View commit details
-
Copy full SHA for 04ee2f7 - Browse repository at this point
Copy the full SHA 04ee2f7View commit details -
[C#] Address some of Martin's feedback re "added" var data representa…
…tion. The SBE spec doesn't explicitly allow variable-length data or groups to be added to existing message schemas, i.e., only block-level fields may be added; however in practice the encoders/decoders for some languages do support the addition of var data fields. Previously, I had represented these "added" var data fields as optional strings or byte arrays, but having chatted with Martin, we think it is better to mimick the existing decoder representation, i.e., use empty strings/arrays to represent missing elements. In this commit, I've also fixed some instances where I was checking `token.version() > 0` where I should have been checking `token.version() > sinceVersionOfParentContainer`. Ideally, I would have liked to remove such checks entirely, but the C# and C++ codecs do not give sensible responses when decoding older versions in some cases, i.e., you _must_ check the presence of the field before accessing it.
Configuration menu - View commit details
-
Copy full SHA for cc6514d - Browse repository at this point
Copy the full SHA cc6514dView commit details -
[C#] Address feedback around encode/decode methods.
This commit addresses some feedback from Martin: - Makes encode and decode methods static - Improves naming: - `EncodeInto(codec)` -> `EncodeWith` - `DecodeFrom(codec)` -> `DecodeWith` - Introduces version that works directly with buffer
Configuration menu - View commit details
-
Copy full SHA for b19227e - Browse repository at this point
Copy the full SHA b19227eView commit details -
[C++] Generate DTOs for non-perf-sensitive usecases.
In some applications performance is not cricital. Some users would like to use SBE across their whole "estate", but don't want the "sharp edges" associated with using flyweight codecs, e.g., accidental escape. In this commit, I've added a first cut of DTO generation for C++ and a simple test based on the Car Example. The DTOs support encoding and decoding via the generated codecs using `DtoT::encode(CodecT& codec, const DtoT& dto)` and `DtoT::decode(CodecT& codec, Dto& dto)` methods. Generation can be enabled specifying the target code generator class, `uk.co.real_logic.sbe.generation.cpp.CppDtos`, or by passing a system property `-Dsbe.cpp.generate.dtos=true`.
Configuration menu - View commit details
-
Copy full SHA for 64f6516 - Browse repository at this point
Copy the full SHA 64f6516View commit details -
Configuration menu - View commit details
-
Copy full SHA for 4ff1e11 - Browse repository at this point
Copy the full SHA 4ff1e11View commit details -
[C++] Improve naming conventions in DTOs.
Changes: - `encode` -> `encodeWith` - `decode` -> `decodeWith`
Configuration menu - View commit details
-
Copy full SHA for b8b3e01 - Browse repository at this point
Copy the full SHA b8b3e01View commit details -
[C++] Support "to string" without specification of buffer length.
As we use the generated codecs to create a string representation of our DTOs, we don't use Agrona buffers in C++, and there is no concept of resizing, it is necessary to size a temporary buffer during the construction of the string data. Previously, we were letting the user supply this value, which wasn't a very friendly API. Now, we use the `computeLength` methods on the codec to determine how big of a temporary buffer we need. Perhaps the methods will also be useful for avoiding a buffer copy when used in conjunction with Aeron. For example, a developer could use `dto.computeEncodedLength()` to initialise a buffer claim rather than copying via the `offer(...)` API.
Configuration menu - View commit details
-
Copy full SHA for bc3e523 - Browse repository at this point
Copy the full SHA bc3e523View commit details -
[C++] Fix issues with length computation.
I had incorrectly assumed that the `Flyweight::computeLength` method took _encoded lengths_ of groups etc., but actually it takes a complicated structure of group counts and variable lengths. As it was hard to build this list, I've opted for a simpler approach: do the length calculation within the generated DTO message and its groups. In this commit, I've also added some convenience methods for converting between DTOs and "byte arrays".
Configuration menu - View commit details
-
Copy full SHA for fd06b2d - Browse repository at this point
Copy the full SHA fd06b2dView commit details -
[C++] Address feedback from Todd re var data representation.
It is more-idiomatic to represent variable-length data using `std::string` even when there is no character encoding specified, the the `std::string` API provides useful utilities regardless.
Configuration menu - View commit details
-
Copy full SHA for 4cc68f1 - Browse repository at this point
Copy the full SHA 4cc68f1View commit details -
The main change in this commit is to add property-based testing around the C++ DTOs. We check the same property as the C# tests, i.e., that decoding and re-encoding via a DTO preserves the original bytes. There are some other smaller changes in this commit: 1. We now only generate arbitrary schemas where enums have at least one case, as this is required by the spec. 2. We now log details about how we encoded the `input.dat` file used in property-based tests, which can help diagnose problems in the test.
Configuration menu - View commit details
-
Copy full SHA for 57fcebd - Browse repository at this point
Copy the full SHA 57fcebdView commit details -
[Java, C++, C#] Ensure buffer is large enough to contain message.
Previously, when generating arbitrary encoded values the generator would not necessarily extend the buffer to the size necessary to hold the message, e.g., if optional fields were left unset at the end of the message block. We now call `checkLimit` to "reserve" space for the full block length at the message and group level. This fixes an exception seen in the slow tests: ``` java.lang.IndexOutOfBoundsException: index=0 length=129 capacity=128 at org.agrona.AbstractMutableDirectBuffer.boundsCheck0(AbstractMutableDirectBuffer.java:1719) at org.agrona.AbstractMutableDirectBuffer.getBytes(AbstractMutableDirectBuffer.java:464) at org.agrona.io.DirectBufferInputStream.read(DirectBufferInputStream.java:175) at uk.co.real_logic.sbe.properties.DtosPropertyTest.writeInputFile(DtosPropertyTest.java:199) at uk.co.real_logic.sbe.properties.DtosPropertyTest.cppDtoEncodeShouldBeTheInverseOfDtoDecode(DtosPropertyTest.java:147) at java.lang.reflect.Method.invoke(Method.java:498) at net.jqwik.engine.execution.CheckedPropertyFactory.lambda$createRawFunction$1(CheckedPropertyFactory.java:84) at net.jqwik.engine.execution.CheckedPropertyFactory.lambda$createRawFunction$2(CheckedPropertyFactory.java:91) at net.jqwik.engine.properties.CheckedFunction.execute(CheckedFunction.java:17) at net.jqwik.api.lifecycle.AroundTryHook.lambda$static$0(AroundTryHook.java:57) at net.jqwik.engine.execution.lifecycle.HookSupport.lambda$wrap$2(HookSupport.java:48) at net.jqwik.engine.hooks.lifecycle.TryLifecycleMethodsHook.aroundTry(TryLifecycleMethodsHook.java:57) at net.jqwik.engine.execution.lifecycle.HookSupport.lambda$wrap$3(HookSupport.java:53) at net.jqwik.engine.execution.lifecycle.HookSupport.lambda$wrap$2(HookSupport.java:48) at net.jqwik.engine.hooks.lifecycle.BeforeTryMembersHook.aroundTry(BeforeTryMembersHook.java:69) at net.jqwik.engine.execution.lifecycle.HookSupport.lambda$wrap$3(HookSupport.java:53) at net.jqwik.engine.execution.CheckedPropertyFactory.lambda$createTryExecutor$0(CheckedPropertyFactory.java:60) at net.jqwik.engine.execution.lifecycle.AroundTryLifecycle.execute(AroundTryLifecycle.java:23) at net.jqwik.engine.properties.GenericProperty.testPredicate(GenericProperty.java:166) at net.jqwik.engine.properties.GenericProperty.check(GenericProperty.java:68) at net.jqwik.engine.execution.CheckedProperty.check(CheckedProperty.java:67) at net.jqwik.engine.execution.PropertyMethodExecutor.executeProperty(PropertyMethodExecutor.java:90) at net.jqwik.engine.execution.PropertyMethodExecutor.executeMethod(PropertyMethodExecutor.java:69) at net.jqwik.engine.execution.PropertyMethodExecutor.lambda$execute$0(PropertyMethodExecutor.java:49) at net.jqwik.api.lifecycle.AroundPropertyHook.lambda$static$0(AroundPropertyHook.java:46) at net.jqwik.engine.execution.lifecycle.HookSupport.lambda$wrap$0(HookSupport.java:26) at net.jqwik.api.lifecycle.PropertyExecutor.executeAndFinally(PropertyExecutor.java:39) at net.jqwik.engine.hooks.lifecycle.PropertyLifecycleMethodsHook.aroundProperty(PropertyLifecycleMethodsHook.java:56) at net.jqwik.engine.execution.lifecycle.HookSupport.lambda$wrap$1(HookSupport.java:31) at net.jqwik.engine.execution.lifecycle.HookSupport.lambda$wrap$0(HookSupport.java:26) at net.jqwik.engine.hooks.statistics.StatisticsHook.aroundProperty(StatisticsHook.java:37) at net.jqwik.engine.execution.lifecycle.HookSupport.lambda$wrap$1(HookSupport.java:31) at net.jqwik.engine.execution.lifecycle.HookSupport.lambda$wrap$0(HookSupport.java:26) at net.jqwik.engine.hooks.lifecycle.AutoCloseableHook.aroundProperty(AutoCloseableHook.java:13) at net.jqwik.engine.execution.lifecycle.HookSupport.lambda$wrap$1(HookSupport.java:31) at net.jqwik.engine.execution.PropertyMethodExecutor.execute(PropertyMethodExecutor.java:47) at net.jqwik.engine.execution.PropertyTaskCreator.executeTestMethod(PropertyTaskCreator.java:166) at net.jqwik.engine.execution.PropertyTaskCreator.lambda$createTask$1(PropertyTaskCreator.java:51) at net.jqwik.engine.execution.lifecycle.CurrentDomainContext.runWithContext(CurrentDomainContext.java:28) at net.jqwik.engine.execution.PropertyTaskCreator.lambda$createTask$2(PropertyTaskCreator.java:50) at net.jqwik.engine.execution.pipeline.ExecutionTask$1.lambda$execute$0(ExecutionTask.java:31) at net.jqwik.engine.execution.lifecycle.CurrentTestDescriptor.runWithDescriptor(CurrentTestDescriptor.java:17) at net.jqwik.engine.execution.pipeline.ExecutionTask$1.execute(ExecutionTask.java:31) at net.jqwik.engine.execution.pipeline.ExecutionPipeline.runToTermination(ExecutionPipeline.java:82) at net.jqwik.engine.execution.JqwikExecutor.execute(JqwikExecutor.java:46) at net.jqwik.engine.JqwikTestEngine.executeTests(JqwikTestEngine.java:70) at net.jqwik.engine.JqwikTestEngine.execute(JqwikTestEngine.java:53) ```
Configuration menu - View commit details
-
Copy full SHA for f05f63a - Browse repository at this point
Copy the full SHA f05f63aView commit details -
[CI] Fix dependency conflict in JQwik.
Recently JUnit was upgraded. Rebasing revealed a dependency conflict.
Configuration menu - View commit details
-
Copy full SHA for 79b4c61 - Browse repository at this point
Copy the full SHA 79b4c61View commit details -
Configuration menu - View commit details
-
Copy full SHA for 3d7a381 - Browse repository at this point
Copy the full SHA 3d7a381View commit details -
Configuration menu - View commit details
-
Copy full SHA for d6ca75d - Browse repository at this point
Copy the full SHA d6ca75dView commit details -
Configuration menu - View commit details
-
Copy full SHA for 1a7a9dc - Browse repository at this point
Copy the full SHA 1a7a9dcView commit details -
[Java] Add support for DTO generation.
This commit adds support for generating SBE DTOs in Java. Unlike the C# and C++ support, there isn't a natural candidate for the idiomatic representation of optional fields, e.g., `Optional<T>` is only meant for use as a return type. Therefore, the mapping between Java types and the encoding is simpler than in these other languages. In a future commit, I will extend the property-based tests to cover Java DTOs.
Configuration menu - View commit details
-
Copy full SHA for bdfbc8f - Browse repository at this point
Copy the full SHA bdfbc8fView commit details -
[Java] Extend property-based tests to exercise Java DTOs.
The added test (non-exhaustively) checks the property: ``` ∀ msg ∈ MessageSchemas, ∀ encoding ∈ EncodingsOf(msg), encoding = dtoEncode(dtoDecode(encoding)) ``` The added test shows some compilation failures, which need to be resolved before enabling it.
Configuration menu - View commit details
-
Copy full SHA for a8fa1a4 - Browse repository at this point
Copy the full SHA a8fa1a4View commit details -
[Java] Avoid checking parentMessage.actingVersion inside composite.
Most not present conditions were already elided via `generateFieldNotPresentCondition(inComposite=true, ...)` but the condition within the wrap method was not. Here's an example of a failing schema found with the DTO property-based tests: ``` <?xml version="1.0" encoding="UTF-8" standalone="no"?> <sbe:messageSchema id="42" package="uk.co.real_logic.sbe.properties" version="1" xmlns:sbe="http://fixprotocol.io/2016/sbe"> <types> <composite name="messageHeader"> <type name="blockLength" primitiveType="uint16"/> <type name="templateId" primitiveType="uint16"/> <type name="schemaId" primitiveType="uint16"/> <type name="version" primitiveType="uint16"/> </composite> <composite name="groupSizeEncoding"> <type name="blockLength" primitiveType="uint16"/> <type name="numInGroup" primitiveType="uint16"/> </composite> <composite name="Type1"> <type length="2" name="member0OfType0" presence="required" primitiveType="uint8"/> </composite> </types> <sbe:message id="1" name="TestMessage"> <field id="0" name="member0" presence="optional" sinceVersion="1" type="Type1"/> </sbe:message> </sbe:messageSchema> ``` And generated code (within a composite without a `parentMessage`): ``` public void wrapMember0OfType0(final DirectBuffer wrapBuffer) { if (parentMessage.actingVersion < 1) { wrapBuffer.wrap(buffer, offset, 0); return; } wrapBuffer.wrap(buffer, offset + 0, 2); } ```
Configuration menu - View commit details
-
Copy full SHA for 3e95123 - Browse repository at this point
Copy the full SHA 3e95123View commit details -
[Java] Fix remaining DTO issues uncovered with PBT.
The main issue was around determining when a field might not be present. We were checking `token.version() > 0`, which isn't accurate in composites or groups. In this commit, I've also started to use the `Footnotes` API in JQWIK, which allows useful debug information to be captured _only_ for failing attempts. It then prints out the shrunken footnotes upon a failure.
Configuration menu - View commit details
-
Copy full SHA for e4479b2 - Browse repository at this point
Copy the full SHA e4479b2View commit details -
Configuration menu - View commit details
-
Copy full SHA for eacdb50 - Browse repository at this point
Copy the full SHA eacdb50View commit details