Support serialization as [u8; 16] rather than as &[u8] #329
Comments
Hi @smarnach, thanks for reporting this, and sorry you had to stop using the crate because of it. Are you open to writing a PR? If not, it's okay, not a big deal.
@Dylan-DPC No worries! I don't have time to prepare a PR during the week, but I may have time at the weekend, so I will try to do it then. No guarantees, though. :)
Cool :) Now we have labelled it "hacktoberfest", so there is a chance someone might pick it up before you get to it. Will keep this issue updated if that happens 👍
I'd be interested in taking a stab at this. Would the code I'm changing be under serde_support?
@Redrield I'm just the reporter of this issue, and I'm not particularly familiar with the code base, but I had a quick look anyway. The current serialization code is in For human-readable codecs, no change is required. For non-human-readable codecs, the new implementation should use @Dylan-DPC Could you please comment on whether backwards compatibility is required, given that this goes into a new version? Having both versions in the codebase could be confusing in the long run, in particular since the difference is somewhat subtle. On the other hand, breaking compatibility would make it harder for people to upgrade.
Hi, thanks for showing interest. Yeah, @smarnach has got everything covered in his last comment. I had discussed this with my co-maintainer earlier and we prefer if the feature-gate is named Backwards compatibility is preferred in this case, as we don't want to introduce a backward-incompatible minor version and don't have much queued up for a major release. Also, some users might want the old functionality, so we'll keep both and the user can opt for either based on a feature gate.
* This commit adds a feature `dense_serde` which enables serialization and deserialization of a Uuid as a [u16; 8]
As @Dylan-DPC said, we need backward compat... As such, if the feature is not enabled, the current behaviour should exist otherwise
@kinggoesgaming Hmm, I'm not sure a feature flag is the best way to go here, because you can't really guarantee what flags a dependency will have toggled when you pull it in (thanks to it maybe being a dependency of one of your dependencies already). It means we effectively couldn't guarantee You can get a copy of the
@KodrAus you can make a feature flag depend on a dependency feature flag, which afaik should toggle it for the dependency.
@Dylan-DPC, the issue is that any dependency could toggle that flag on for you, and the introduction of any new dependency could bring in I think we could explore what potential breakage there is in a patch that uses more space-efficient array serialization, but supports both fixed-size arrays and slices in deserialization. That would be backwards compatible, but not forwards compatible.
@KodrAus that's true, but what I was saying is that someone who wants that deserialisation will opt in to the feature; otherwise they can use what's already available. I can't think of a better approach; if you have one, let us know. I feel we could try this approach, see how it goes, and then take a call on it in a later version?
I see what you mean, and it's not very likely, but the problem is for folks that don't have the feature enabled. Any new dependency they add, or maybe anytime they run I wouldn't recommend landing this with a feature flag. They're only suited to features that add functionality that wasn't available before in a backwards compatible way. If we want to support this at all, and in particular within the current minor version, then I think we would introduce the same breakage over the lifetime of that minor version by just making the change to the serializer and ensuring the deserializer is resilient, with no feature flags. That would be a tidier solution I think.
Yes, fair, but wouldn't that be a breaking change as well, since we are changing what we are serialising?
That depends on whether we consider the format itself in scope for breakage, or just the fact that any serialized I guess my point is really that a feature flag doesn't actually prevent this same kind of breakage in this case, because you're not solely in control of whether or not it's toggled. And if the feature flag isn't protecting us then we may as well not have it. If we consider the output of the
Agreed. The reason for the feature flag was to ensure backwards compatibility, but seeing that people are most likely going to use it on the Uuid type instead of the slice directly, I'm fine with doing this directly without the feature flag if the output is the same.
Could implementation detail discussion go in the PR that I have opened, so I can check there to see what changes to what I have implemented may need to be made?
It is not possible to make the deserializer support both formats for the

Here are a few more details about the use case we encountered. The superblock of an ext2 partition contains a struct with close to 50 fields, and two of them are UUIDs:

#[repr(C)]
pub struct Superblock {
    pub s_inodes_count: u32,
    pub s_blocks_count: u32,
    [...]
    pub s_uuid: uuid::Uuid,
    [...]
    pub s_journal_uuid: uuid::Uuid,
    [...]
}

The superblock is stored on disk in packed binary format, one field after the other, with all integers in little endian format. On a little endian machine, we can load it from disk using a bitwise copy. If we want a platform-independent solution, though, we need to use the serde bincode codec, which only works if we replace
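To make the packed layout concrete, here is a minimal sketch of reading such a record by hand, using only the standard library. `MiniSuperblock` and `parse` are hypothetical names, and the three-field layout is a heavily reduced stand-in, not the real ext2 offsets. The point is only that the UUID occupies exactly 16 bytes with no length tag in front of it.

```rust
/// Hypothetical, heavily reduced stand-in for the superblock above:
/// two little-endian integers followed by one 16-byte UUID, packed
/// back to back with no padding and no length prefix.
#[derive(Debug, PartialEq)]
pub struct MiniSuperblock {
    pub s_inodes_count: u32,
    pub s_blocks_count: u32,
    pub s_uuid: [u8; 16],
}

/// Size of the packed record: 4 + 4 + 16 bytes.
pub const PACKED_LEN: usize = 24;

/// Parse the packed little-endian layout in a platform-independent way.
/// Returns `None` if the buffer is shorter than one record.
pub fn parse(buf: &[u8]) -> Option<MiniSuperblock> {
    if buf.len() < PACKED_LEN {
        return None;
    }
    // Integers are decoded explicitly as little endian, so this also
    // works on big-endian hosts where a bitwise copy would not.
    let s_inodes_count = u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]);
    let s_blocks_count = u32::from_le_bytes([buf[4], buf[5], buf[6], buf[7]]);
    // The UUID is just the next 16 bytes; nothing precedes it.
    let mut s_uuid = [0u8; 16];
    s_uuid.copy_from_slice(&buf[8..24]);
    Some(MiniSuperblock {
        s_inodes_count,
        s_blocks_count,
        s_uuid,
    })
}
```

A codec that writes a length tag before the UUID bytes would expect extra bytes at offset 8 that the on-disk format does not contain, which is why the slice-style encoding cannot be used for this kind of structure.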
@smarnach Ah, so Slightly off topic, if you need to guarantee the byte layout of
@KodrAus Thanks for your help. Yes, using a fixed-length array is fine for our use case, but I still think it's a good thing to have a version of the serialization that behaves the same as a fixed-size array. The solution turned out quite nice.
Would it make sense to make the
@Marwes, that seems like a fair suggestion. I think we'd have to explore how this might break any currently persisted data though.
uuid::Uuid serializes as a vector, when it really should be serializing as a fixed-size array. This change creates a local wrapper class that passes through every method except serialize and deserialize. uuid-rs/uuid#329
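The wrapper approach mentioned in that commit could look roughly like the sketch below. It is not the code from that commit: `Uuid` here is a hypothetical stand-in for `uuid::Uuid` so the example stays dependency-free, and `to_dense_bytes` / `from_dense_bytes` are made-up names standing in for the `Serialize` / `Deserialize` impls a real wrapper would provide.

```rust
use std::ops::Deref;

/// Hypothetical stand-in for `uuid::Uuid` (assumption: 16 raw bytes inside).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Uuid([u8; 16]);

impl Uuid {
    pub fn as_bytes(&self) -> &[u8; 16] {
        &self.0
    }
    pub fn is_nil(&self) -> bool {
        self.0.iter().all(|b| *b == 0)
    }
}

/// Local wrapper: behaves like `Uuid` everywhere except encoding.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DenseUuid(pub Uuid);

/// `Deref` passes every `Uuid` method through to the wrapper.
impl Deref for DenseUuid {
    type Target = Uuid;
    fn deref(&self) -> &Uuid {
        &self.0
    }
}

impl DenseUuid {
    /// Encode as exactly 16 bytes, like a fixed-size array.
    pub fn to_dense_bytes(&self) -> [u8; 16] {
        *self.0.as_bytes()
    }
    /// Decode from exactly 16 bytes.
    pub fn from_dense_bytes(bytes: [u8; 16]) -> Self {
        DenseUuid(Uuid(bytes))
    }
}
```

With a real codec, the two hand-written methods would be replaced by trait impls that write and read the inner bytes as a `[u8; 16]`, and the rest of the code could keep calling `Uuid` methods on the wrapper unchanged.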
Revisiting this now. I think we should fix this but need to come up with a plan for any previously serialized data. The problem is a library that derives So I think our only path here is to add a I’m erring on the side of caution here just because of how widely used the library is and how long its format has been around for. I could probably be convinced otherwise 🙂
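To illustrate the concern about previously serialized data, here is a small standard-library sketch. It assumes the old format is a u64 little-endian length followed by the 16 bytes (my understanding of how a non-self-describing codec like bincode writes a slice by default, not something spelled out in this thread). `encode_legacy` and `decode_dense` are hypothetical helpers, not part of any crate.

```rust
/// Old-style encoding (assumed): u64 little-endian length, then the bytes.
pub fn encode_legacy(id: &[u8; 16]) -> Vec<u8> {
    let mut out = Vec::with_capacity(8 + 16);
    out.extend_from_slice(&(id.len() as u64).to_le_bytes());
    out.extend_from_slice(id);
    out
}

/// New-style dense decoder: the first 16 bytes of the input are the UUID.
/// Returns `None` only if fewer than 16 bytes are available.
pub fn decode_dense(buf: &[u8]) -> Option<[u8; 16]> {
    if buf.len() < 16 {
        return None;
    }
    let mut id = [0u8; 16];
    id.copy_from_slice(&buf[..16]);
    Some(id)
}
```

Feeding `encode_legacy` output into `decode_dense` does not fail. It quietly returns a value made of the 8 length bytes followed by the first 8 bytes of the original UUID, and the remaining 8 bytes are left in the stream to corrupt whatever field comes next. Nothing in the byte stream tells the decoder which format it is looking at, which is what makes this kind of breakage easy to miss.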
Ah and the original advice in this thread that we don’t guarantee
I think that solution was already implemented in #331.
As long as the breaking change is done in a semver incompatible release it is up to users to ensure that they don't break their format (and if you have a binary format I'd really hope you are careful about breaking changes in that format!). Being forced into a suboptimal format forever because users can't be trusted doesn't strike me as a better idea.
@Marwes I think the trouble is you don’t necessarily know your type with a private field It would be a shame to be stuck with the redundant length forever though. Maybe it’s ok if we are clear in our release notes that the format can change, so if you’re using
Anything that implements
🤷 indeed. I’m just a bit worried about it being especially easy to miss in libraries that are just updating things without the context we’ve got in this thread that makes it seem really obvious. The next breaking release I was hoping to do is
I think I’m convinced then we should use the dense format by default in
Specifically the case I was thinking about is something like:

// On inspection Uuid looks private
pub struct MyData {
    field_a: i32,
    field_b: bool,
    field_n: Uuid,
}
// lots of other code here
#[cfg(some_feature)]
impl Serialize for MyData {
    // Actually Uuid is not private
}

The
Is your feature request related to a problem? Please describe.
Currently, serialization and deserialization are performed as a slice rather than a fixed-size array. When using the bincode codec, this means that the raw data is prefixed with a redundant length tag (16usize), which is unsuitable for loading binary data structures containing UUIDs. It is also unexpected that a UUID is serialized in a different way than [u8; 16], since it essentially is a [u8; 16].
Describe the solution you'd like
I would like to be able to serialize a UUID as if it was a [u8; 16] in non-human-readable codecs. This could be protected by a new feature flag to avoid breaking compatibility with old serialized data.
Is it blocking?
Not really, since we stopped using this crate because of this issue. While it would be possible to implement a different serialization for the UUID type, it was easier for us to use a plain [u8; 16] instead.
Describe alternatives you've considered
I looked into writing my own serialization and deserialization functions. It would be nice if this was supported out of the box, though.
Additional context
N/A
Other
N/A
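For readers who want to see the size difference described in the request, here is a hand-rolled sketch using only the standard library. It does not call bincode. It assumes a slice is written as a u64 little-endian length followed by the bytes, while a fixed-size array is written as the bytes alone. `encode_as_slice` and `encode_as_array` are hypothetical names.

```rust
/// Slice-style encoding (assumed): the length is not known from the type,
/// so it is written first as a u64 in little-endian order.
pub fn encode_as_slice(bytes: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(8 + bytes.len());
    out.extend_from_slice(&(bytes.len() as u64).to_le_bytes());
    out.extend_from_slice(bytes);
    out
}

/// Array-style encoding: the length is part of the type `[u8; 16]`,
/// so only the 16 payload bytes are written.
pub fn encode_as_array(bytes: &[u8; 16]) -> Vec<u8> {
    bytes.to_vec()
}
```

Under these assumptions a UUID takes 24 bytes in the slice form and 16 bytes in the array form, and the first 8 bytes of the slice form are always the constant 16, which is the redundant tag the request refers to.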