New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
V7 and project status #106
Comments
Hi @bgadrian , the PRs that are open are either controversial or are waiting for the correct reviewers. We've been merging the v6 and v7 updates we've received from contributors. |
Closing issue because it's a more of a question |
Thanks for the quick answer. I think the V7 especially from #99 does not fully implement even the v3 of the draft, more specifically
Meaning that only relying on the nanosecond from time may not be enough. Especially on commodity hardware in clouds, the nanoseconds increment are not guaranteed, and generating multiple UUIDs in the same MS could lead to collisions. In Draft v4 there are more options defined in how to counteract this https://www.ietf.org/archive/id/draft-peabody-dispatch-new-uuid-format-04.html#monotonicity_counters and/or https://www.ietf.org/archive/id/draft-peabody-dispatch-new-uuid-format-04.html#distributed_shared_knowledge I am afraid that trying to implement these features will break the current UUID structure simplicity, as it would require some sort of internal in-memory state (counter) and/or new parameters (nodeID), so I would want to get some opinions before opening a PR on one approach, and define a design agreed by the current project maintainers. One valid way would be to keep it simple and say that this implementation will not support "Distributed Nodes" and only add support for single node (Monotonicity and Counters), or not ... |
Those are fair concerns @bgadrian. I'd counter, though, with the fact that there is no API in this library for generating batches of UUIDs. Each call to
I'd definitely argue for this path. There's definitely nothing in this library today related to synchronized generation across multiple machines.
IMO the simplicity is one of the advantages of this library so I would be very hesitant to start down a path that would lead to breaking that. For what it's worth, I've used this library to generate sometimes millions of UUIDs per second and have never run into a collision. |
I don't think batches are supported per se in any implementation, but rather generating a large amount of UUIDs in a small amount of time can be considered a batch (example an insert baching algorithm that shuffles the inserts each X seconds). One simple implementation for monotonic UUIDs would be to have an internal counter that resets at each ms (lowest V7 granularity) and increments at each generated UUID. From the client point of view/the interface will remain the same (simple).
Actually there is, the V3 HardwareAddress
I think we can easily implement this by generating a random UUID at construct (or singleton per process) and inject it in each generated UUID V7. These 2 simple additions can guarantee the monotonic and local uniqueness (and avoid collisions) properties while keeping the interface simple. |
While that's theoretically possible, injecting a function that does RPCs to other machines to do synchronization definitely goes against the spirit of that API. Again, though, we'll definitely review a PR to close any gaps in V6 or V7 UUIDs. |
I am not aware of any RPC or cross process communication and I was not suggesting one. I was only referring to the existing code |
I so glad to see this discussion! This was also a point I was difficult to decide on a policy for implementation when I submitted PR #99. First, the number of random bits in UUIDv7 without awareness of monotonic counter is 74 bits. Next, when the number of random bits is N and the collision probability P=0.5, the number of trials required can be calculated as 2^(N/2). I have often seen this value used in discussions of UUIDs and other unit time generation capabilities. From these, UUIDv7, which is unaware of the monotonic counter, has a 74-bit random, so its generation capacity can be calculated to be 2^37 when P=0.5, that is 137.4 billion (Incidentally, the generation capacity of UUIDv4 at P=0.5 is 2^61, that is 2.3 quintillion). I think that a generation capacity of 137.4 billion in msec is extremely unlikely to cause a collision in almost all use cases (even for batch applications). In the Rev04 draft, the specification for monotonic counters has been changed to should I didn't implement it at the time for these reasons, but it's important for library users to have a lot of options, so if you could implement it, that would be great! |
Hello,
What is the status of this repo? I see some PRs unmerged 1y old.
I am interested to update or write a new implementation for V6-7 UUIDs (v4 draft including the distributed requirements) and wondering if I can contribute to this project or create another one.
The text was updated successfully, but these errors were encountered: