Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support identifying potentially errorneous characters #95

Open
Kixunil opened this issue Mar 29, 2023 · 7 comments
Open

Support identifying potentially errorneous characters #95

Kixunil opened this issue Mar 29, 2023 · 7 comments
Labels
1.0 Issues and PRs required or helping to stabilize the API

Comments

@Kixunil
Copy link

Kixunil commented Mar 29, 2023

The checksum algorithm has error detection that can identify which characters were actually wrong (with some probability). It'd be great to support this in the library.

@clarkmoody
Copy link
Member

From the text of BIP-0173:

Because of this, implementations SHOULD NOT implement correction beyond potentially suggesting to the user where in the string an error might be found, without suggesting the correction to make.

I would support including some position information in one of the enum variants. This feature might necessitate a new API function that decodes with detection, since it seems to be a more intense operation than simple checksum validation.

@Kixunil
Copy link
Author

Kixunil commented Apr 5, 2023

Indeed, that's why I wrote "identify", not "correct" :)

And yes, position information should be in error variant. I wonder if feature flag is enough.

@apoelstra
Copy link
Member

@Kixunil I think we should have a separate function call for identifying errors. Locating errors is very costly -- on the order of 20-50x as expensive as verifying a checksum for a normal sized address, I"d estimate. So we can't modify the normal checksum API to return this, even with a feature gate.

Then as for locating vs correcting errors .... I think it'd be fine to also put the corrections into the error variant, and just document in the API that users should not use these values to do automatic corrections, and probably they should just ignore them entirely and just use the locations.

@Kixunil
Copy link
Author

Kixunil commented Apr 5, 2023

OK, that makes sense for this library. But what about bitcoin::Address? It'd suck to not have locations returned from from_str. I guess I'd skip them in case of serde deserialization though.

I wouldn't include corrections at all. Seems like a noise. At least until someone asks for them.

@apoelstra
Copy link
Member

For bitcoin::Address we probably also don't want a 50x slowdown in parsing.

Though I wouldn't mind if Address::from_strdid this and we had a separate parse_without_error_locations method or something that performance-minded people could use. And serde sholud not do locating, agreed.

@Kixunil
Copy link
Author

Kixunil commented Apr 5, 2023

Yeah, also we would only calculate them on errors, which should be cold path in vast majority of applications.

@apoelstra
Copy link
Member

Yeah, that's a good point -- the first step of error locating is doing the normal verification anyway, and we know immediately whether there is even any point in doing the heavy computations.

@tcharding tcharding added the 1.0 Issues and PRs required or helping to stabilize the API label Feb 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1.0 Issues and PRs required or helping to stabilize the API
Projects
None yet
Development

No branches or pull requests

4 participants