Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add extension method for generating Norwegian national identity numbers. #272

Merged
merged 2 commits into from Dec 10, 2019
Merged

Conversation

mika-s
Copy link
Contributor

@mika-s mika-s commented Dec 9, 2019

This adds an extension method (Person.Fødselsnummer()) to Bogus.Extensions.Norway for generating Norwegian national identity numbers (fødselsnummer).

It generates numbers for people born between 1834 and 2039, for both men and women. It will throw an exception if the birth year is before or after the years mentioned.

@bchavez
Copy link
Owner

bchavez commented Dec 9, 2019

Hi Mika,

Thank you for the pull request. I will review the changes when I get back from work later today. If all looks good I will send out a release tonight.

Quickly looking at the changes it looks like you did great work.

-Brian

Copy link
Owner

@bchavez bchavez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @mika-s, thanks for the PR. If you could check my comments, about the do-while loop and the checksum weights that would be great.

{
individualNumber = GenerateIndividualNumber(r, p.Gender, p.DateOfBirth.Year);
isOkChecksum = GenerateChecksum(birthDate, individualNumber, out checksum);
} while (!isOkChecksum);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @mika-s,

First off, great work! It's very well done and looks good!

However, I have a question about this do-while loop:

  • Why do we need to "check if the checksum is okay" and if the checksum is bad, loop again?
  • Is there some reason why we can't calculate the checksum directly from birthDate and individualNumber?

I guess I feel a little uneasy about this do-while loop because generating the check digits should be a deterministic result of the input arguments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @bchavez,

The reason for the loop is because the checksums can end up getting 10 as calculated value , either cs1, cs2 or both. In that case the number is invalid, and a new one has to be generated. The only thing that can be regenerated is the individual number, because the birthdate is fixed. So in the loop I generate a new individual number and try calculating the checksum again.

I don't think it will break determinism, as long as the global Random object is used. Given a seed the loop will run the same number of iterations every time.

As for the second comment:

I am not sure why the weights are as they are in the Python library, but they should be correct in the PR. The offical spec is here (page 12-13):

In English:

The weights are:
3, 7, 6, 1, 8, 9, 4, 5 and 2
k1 = 11 - ((3D1 + 7D2 + 6M1 + 1M2 + 8Y1 + 9Y2 + 4I1 + 5I2 + 2I3) - 11Q1)

Q1 is the the integer quotient for the product sum divided by 11, meaning the highest number that when multiplied with 11 gives the product sum or less.

[Q1 is just modulo.]

There are no birth numbers assigned that gives k1 = 10.
If k1 = 11, set K1 equal to 0, otherwise K1 is equal k1.

The second control digits, K2, is calculated in the same way of the first 9 digits and the first
control digit.

The weights here are:
5, 4, 3, 2, 7, 6, 5, 4, 3 and 2
k2 = 11 - ((5D1 + 4D2 + 3M1 + 2M2 + 7Y1 + 6Y2 + 5I1 + 4I2 + 3I3 + 2K1) - 11*Q2)

Q2 is the the integer quotient for the product sum divided by 11, meaning the highest number that when multiplied with 11 gives the product sum or less.

There are no birth numbers assigned that gives k2 = 10.
If k2 = 11, set K2 equal to 0, otherwise K2 is equal k2.

I generated a couple of numbers and double checked their validity with a couple of other tools, e.g. this and this (first textbox), and they said the numbers were valid.

These are the numbers I checked (base64 encoded to make sure Google doesn't index numbers that belong to real people):

MjQwMzk3MDQxNDENCjEwMDE4MTI5MDIyDQowNDExOTkwMjQzMQ0KMTgwMTYyMDQxNjgNCjEzMDM4NzM1NzQy

The calculation of the checksum could of course be made in a different way. A functional approach would be to use map-fold. Another imperative approach is to use a loop that iterates over a split birthdate+indnum string and then multiplies each number with the weight and adds it to a total variable (like in the Python code).

As for the do-while loop, another approach there could be to increment or decrement the individual number if the control digits become 10, and then recalculate the control digits with the incremented/decremented values. The control digits should then become something else than 10, but I cannot guarantee it. The incremented/decremented individual number would become the new individual number of course, it's not just done in the checksum calculation.

int i2 = int.Parse(individualNumber.Substring(1, 1));
int i3 = int.Parse(individualNumber.Substring(2, 1));

int cs1 = 11 - (((3 * d1) + (7 * d2) + (6 * m1) + (1 * m2) + (8 * y1) + (9 * y2) + (4 * i1) + (5 * i2) + (2 * i3)) % 11);
Copy link
Owner

@bchavez bchavez Dec 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps related to my previous do-while comment:
I'm trying to figure out why the method needs to return bool for isOkChecksum in the do-while loop.

I found some open-source python code shows the check digit weight calculations as:
3, 7, 6, 1, 8, 9, 4, 5, 2, 1
https://github.com/magnuswatn/fodselsnummer/blob/5452318bf99ffba5efd4b4dbaa4f3ae9bc17e319/fodselsnummer.py#L122-L123

Here, in the PR, the current C# code weights are:
3, 7, 6, 1, 8, 9, 4, 5, 2
Is it possible that the weights are a little off because the last 1 is missing from the calculation here in the PR?

Again, great work @mika-s! I apologize for the delay.

@bchavez bchavez merged commit 1bf2b72 into bchavez:master Dec 10, 2019
@bchavez
Copy link
Owner

bchavez commented Dec 10, 2019

Hi @mika-s ,

Thanks so much for the clarification. Also, thank you for helping translate the official spec. It helps a lot!

Your changes should be ready in Bogus v28.4.4 shortly after CI/CD is done.

https://ci.appveyor.com/project/bchavez/bogus

https://www.nuget.org/packages/Bogus/28.4.4
(link will show correct version number when artifacts are deployed)

Thanks again for your contribution!
Brian

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants