Add optional ASSUME_COMPACT_PDF_417 decode hint (fixes #1624) #1627

Closed · wants to merge 1 commit

Conversation

@gredler (Contributor) commented May 23, 2023

This is a proposed fix for bug #1624. Tests 01 and 02 test basic decoding capability with the new flag enabled, and test 03 is confirmed to fail without the flag.
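For readers following along, here is a minimal sketch of how a caller would use the proposed hint. Note that ASSUME_COMPACT_PDF_417 is the DecodeHintType constant added by this PR and may not exist in your ZXing version; the image-loading plumbing uses the usual javase helper classes.

```java
import com.google.zxing.BinaryBitmap;
import com.google.zxing.DecodeHintType;
import com.google.zxing.Result;
import com.google.zxing.client.j2se.BufferedImageLuminanceSource;
import com.google.zxing.common.HybridBinarizer;
import com.google.zxing.pdf417.PDF417Reader;

import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.File;
import java.util.EnumMap;
import java.util.Map;

public class CompactPdf417Demo {
    public static void main(String[] args) throws Exception {
        BufferedImage image = ImageIO.read(new File(args[0]));
        BinaryBitmap bitmap = new BinaryBitmap(
                new HybridBinarizer(new BufferedImageLuminanceSource(image)));

        // Tell the detector the symbol is a compact (truncated) PDF417, so it
        // does not try to lock onto a spurious match for the standard stop pattern.
        Map<DecodeHintType, Object> hints = new EnumMap<>(DecodeHintType.class);
        hints.put(DecodeHintType.ASSUME_COMPACT_PDF_417, Boolean.TRUE); // hint proposed in this PR

        Result result = new PDF417Reader().decode(bitmap, hints);
        System.out.println(result.getText());
    }
}
```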

@srowen (Contributor) commented May 23, 2023

I'm ok with it. Is there any reasonable way to do this automatically without a flag, or does the caller really have to hint what it's looking for?

@gredler (Contributor, Author) commented May 23, 2023

The root issue is that there are magic fudge factors intended to make the system lenient to imperfect images, and these leniency factors end up considering a legitimately non-matching module sequence as a "good enough" match for the standard PDF417 stop pattern. This messes up the whole symbol detection algorithm.

These leniency factors are Detector.MAX_AVG_VARIANCE = 0.42 and Detector.MAX_INDIVIDUAL_VARIANCE = 0.8. If these numbers are lowered to 0.15 and 0.79 respectively, this one failing example passes without the new flag (only one value needs to change, it's not required that both values change).

The problem is that I have no idea how these magic factors were derived. The change needed for Detector.MAX_AVG_VARIANCE is very large, so I have to imagine it would cause regressions elsewhere. But the change needed for Detector.MAX_INDIVIDUAL_VARIANCE is small, so perhaps changing it to 0.7 or 0.75 would fix the issue without causing regressions? I just have no idea. Do you know how these numbers were derived, and how to quantify the risk of any changes?
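For context, here is roughly how those two thresholds are applied. This is a simplified sketch modeled on the patternMatchVariance logic used across ZXing's detectors, not the exact pdf417 Detector code: measured bar/space widths are compared against the expected pattern, and the candidate is rejected if any single module deviates by more than MAX_INDIVIDUAL_VARIANCE (in module units), or if the average deviation exceeds MAX_AVG_VARIANCE.

```java
// Simplified sketch of ZXing's pattern-variance check (illustration only).
// counters: measured widths in pixels; pattern: expected widths in modules.
static float patternMatchVariance(int[] counters, int[] pattern, float maxIndividualVariance) {
    int total = 0;
    int patternLength = 0;
    for (int i = 0; i < counters.length; i++) {
        total += counters[i];
        patternLength += pattern[i];
    }
    if (total < patternLength) {
        return Float.POSITIVE_INFINITY; // can't possibly match
    }
    float unitBarWidth = (float) total / patternLength; // scale: pixels per module
    maxIndividualVariance *= unitBarWidth;

    float totalVariance = 0.0f;
    for (int x = 0; x < counters.length; x++) {
        float variance = Math.abs(counters[x] - pattern[x] * unitBarWidth);
        if (variance > maxIndividualVariance) {
            return Float.POSITIVE_INFINITY; // one module is too far off
        }
        totalVariance += variance;
    }
    return totalVariance / total; // caller compares this against MAX_AVG_VARIANCE
}
```

Raising either threshold makes the matcher more forgiving of blur and noise, but also more likely to accept a sequence that isn't really the stop pattern, which is exactly the failure mode in #1624.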

@srowen (Contributor) commented May 23, 2023

Oh, just crudely hand tuned to give the best perf on a small training dataset probably 15 years ago. Not that magic, but yeah changing it significantly could shift results. Not worth rocking the boat much. I think this is pretty fine, if the use case is a bit of a special case.

@gredler (Contributor, Author) commented May 23, 2023

I just realized that some other symbologies have similar MAX_INDIVIDUAL_VARIANCE factors, and none of them are as high as this one:

ITF: 0.5
Code128: 0.7
RSS: 0.45
UPC/EAN: 0.7

Maybe changing it to 0.7 would be OK?

@srowen (Contributor) commented May 23, 2023

Try it; if it doesn't make tests worse net-net, then it'd be OK too. You can see the # of images that pass and fail and it'll highlight where it's better or worse.

@gredler (Contributor, Author) commented May 23, 2023

Every PDF417 image in the current test set decodes correctly at MAX_INDIVIDUAL_VARIANCE = 0.7, using mvn --projects core test "-Dtest=com.google.zxing.pdf417.*TestCase". A full mvn test also passes, though I didn't check every image count on the full run.

Running that same command, I have to reduce MAX_INDIVIDUAL_VARIANCE all the way to 0.56 before any existing tests start failing. So either 0.7 is pretty safe, or the test suite is incomplete... but I was impressed with the real-world samples in the pdf417-2 test directory.

I'm fine either way (flag or adjust factor to 0.7) -- it's your call!

@srowen (Contributor) commented May 23, 2023

If you have a sec, open a PR with just the threshold change. That's simpler and, if it's effective, might be a nicer place to start.

@gredler (Contributor, Author) commented May 23, 2023

OK, I'll create a separate PR and you can reject this one -- that way we can reference this PR later if needed (if the threshold change doesn't work out for whatever reason).
