Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests(refactor): open_dkim.bats #3060

Merged

Conversation

polarathene
Copy link
Member

@polarathene polarathene commented Feb 5, 2023

Description

2 years later, but we've finally done it 😛

This test has been refactored heavily and should address the concerns raised in the comments and past discussions about it. There was a mention of a checkkey feature that could be used, but that requires setting up DNS. In future once we do have a DNS service in the tests, I'd rather just do a test that sends and receives mail, while verifying DKIM was handled correctly.

I staged out changes across commits with commit messages in each if anyone wants to compare to the original with easier diffs to follow.

Feature history

Type of change

  • Improvement (non-breaking change that does improve existing functionality)

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • New and existing unit tests pass locally with my changes

- These all do roughly the same logic that can be split into two separate methods.
- `_should_generate_dkim_key()` covers a bit more logic as it can be leveraged to handle other test cases that also perform the same logic.
- The `config/opendkim/` doesn't seem necessary for tests. Only the first few test cases here are testing against it, so we can conditionally make that available. `process_check_restart.bats` also depended on it to run OpenDKIM successfully, but this was due to the `setup-stack.sh` config defaults failing to find an "empty" file forcing `supervisord` to constantly restart the process..
- With this, there we inverse the default opendkim config, so we don't have to mount unique / empty subfolders for each test case, followed by copying over the two extra configs.
All the remaining test cases but the last one were refactored here for a clean commit diff. The last test case will be refactored in the following commit.

Plenty of repeated logic spread across these test cases, now condensed into shared methods.
- As the majority of test cases are only running `open-dkim` helper, we don't actually have to wait for a full container setup. So an alternative container start is called.
- Also improves assertions a bit more instead of just counting lines.
- Some test cases don't bind mount all of `/tmp/docker-mailserver` contents, thus don't raise permission errors on subsequent test runs.
- Instead of `rm -f` on some config files, have opted to mount them read-only instead, or alternatively mount an anonymous empty volume instead.
- Collapsed the first three test cases into one, thus no `setup_file()` necessary.
- Shift the `_wait_for_finished_setup_in_container()` method into `_common_container_setup()` instead since nothing else is using `_common_container_start()` yet, this allows for avoiding the wait.
This makes these tests a bit more DRY, and enhances the raised quality issue with these tests. Now not only is the domain checked in the generated DNS dkim record, but we also verify the key size is corrected in the public and private keys via openssl.
- `__should_have_tables_trustedhosts_for_domain` shifted in each test case to just after generating the domains keys.
- Asserts `open-dkim` logs instead of just counting them.
- Added checks for domains that should not be present in a test case.
- Additional coverage and notes about the alias from vhost `@localdomain.com`
- Single assert statement with switch statement as all are using common args.
@polarathene polarathene added area/tests kind/improvement Improve an existing feature, configuration file or the documentation service/security/dkim-dmarc-spf labels Feb 5, 2023
@polarathene polarathene added this to the v12.0.0 milestone Feb 5, 2023
@polarathene polarathene self-assigned this Feb 5, 2023
The order printed from local system vs CI differed causing the CI to fail. The order of lines is irrelevant so `--index` is not required.

Additionally correct the prefix of the called method to be only one `_` now that it's a `common.bash` helper method.
These cover the same test logic for the most part, the first domain could also be testing the custom selector.

`special_use_folders.bats` + `mailbox_format_dbox` can assert lines instead, removing the need for `--partial`.
Copy link
Member

@georglauterbach georglauterbach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really good, nice work! Just some minor suggestions from my side :)

test/tests/parallel/set1/dovecot/mailbox_format_dbox.bats Outdated Show resolved Hide resolved
test/tests/serial/open_dkim.bats Outdated Show resolved Hide resolved
local EXPECTED_DOMAIN=${2}
local EXPECTED_SELECTOR=${3:-'mail'}

case "${ASSERT_FOR}" in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want / need a default ( * ) for this switch-case statement? Usually good practice to catch issues.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can add one, but it's only a method used by the direct 3 wrapping methods preceding it. And only this test would use this method.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I just remove the switch statement method, and copy/paste the 3 parameters into the wrapper ones with their relevant assertion? 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I just remove the switch statement method, and copy/paste the 3 parameters into the wrapper ones with their relevant assertion? 🤔

This may actually be the best idea, yes. It removes the need for the default case and eliminates some complexity as well.

test/tests/serial/open_dkim.bats Outdated Show resolved Hide resolved
test/tests/serial/open_dkim.bats Outdated Show resolved Hide resolved
test/tests/serial/open_dkim.bats Outdated Show resolved Hide resolved
test/tests/serial/open_dkim.bats Outdated Show resolved Hide resolved
test/tests/serial/open_dkim.bats Outdated Show resolved Hide resolved
test/tests/serial/open_dkim.bats Outdated Show resolved Hide resolved
Co-authored-by: Georg Lauterbach <44545919+georglauterbach@users.noreply.github.com>
georglauterbach
georglauterbach previously approved these changes Feb 7, 2023
Copy link
Member

@georglauterbach georglauterbach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍🏼

Now you just have to take care of removing the extra slash for tests to pass :D (see https://github.com/docker-mailserver/docker-mailserver/actions/runs/4112519553/jobs/7097549778)

@polarathene
Copy link
Member Author

polarathene commented Feb 7, 2023

Initial investigation (with CI run logs linked)

It seems our CI is buggered:

Docker Engine: 20.10.22
Buildx: 0.10.0

/usr/bin/docker buildx build --cache-from type=local,src=/tmp/.buildx-cache --iidfile /tmp/docker-build-push-AubJbU/iidfile --platform linux/amd64 --provenance false --tag mailserver-testing:ci --load --metadata-file /tmp/docker-build-push-AubJbU/metadata-file .

ERROR: attestations are not supported by the current buildkitd
Error: buildx failed with: ERROR: attestations are not supported by the current buildkitd

BuildKit in Docker Engine would be 0.8 I think, but buildx should be using it's own BuildKit which should be at least 0.10 🤷‍♂️

Weird since we have explicitly opted out of provenance 🤔


Serial test did work however.

It has the same Docker Engine version, but Buildx was 0.10.2. Same buildx action and command with --provenance false, and same cache used. For some reason the parallel test sets used runners with an older buildx version 🤔

I am going to restart the parallel tests and see if that make a difference.


Further investigation
  • Multiple re-runs on failed jobs was unsuccessful. Serial runner also started using buildx 0.10.0 and failing too.
  • Reported issue upstream to docker/build-push-action and docker/buildx.

Attempt to resolve by clearing CI cache (no luck)
  • Observed buildx release notes for 0.10.1 that describe images built with attestations are not compatible with buildx --load, as we previously used docker/build-push-action where release notes describe using attestations by default, it's possible we hit a compatibility issue with --provenance false.
  • Deleted relevant image cache to do a full re-run of the CI workflow, and we'll see if a fresh image build without attestations was the problem 👍
  • Realized our CI caching is too good, I need to delete all image cache prior to us upgrading to docker/build-push-action 3.3 😅
  • Entire cache cleared (only went back as far as 1 week), workflow built new image from scratch but still failed 😭

A CI run for this PR was working yesterday before the docker/build-push-action upgrade:

Docker Engine: 20.10.22
Buildx: 0.10.0

/usr/bin/docker buildx build --cache-from type=local,src=/tmp/.buildx-cache --iidfile /tmp/docker-build-push-FJmVIc/iidfile --platform linux/amd64 --tag mailserver-testing:ci --load --metadata-file /tmp/docker-build-push-FJmVIc/metadata-file .

#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 12.34kB done
#1 DONE 0.0s

I'm wondering if we remove provenance: false to allow the attestations, perhaps we'll get successful runs with buildx again? 🤷‍♂️

I have noticed that the builds are performed with the buildx docker-container driver, where it can leverage a newer BuildKit. But the test workflow which only needs to run buildx build --load while pulling in the build image cache from earlier, is using the Docker Engine BuildKit version via buildx docker driver.

Hence the BuildKit version is too low and a safeguard is causing the false-positive. That seems to not be triggered with newer buildx 0.10.2, and within a week Docker Engine should be updated to v23 which provides a compatible BuildKit version. This will resolve itself by then.

Problem is due to --provenance option (even when false) causing the BuildKit compatibility check (and likewise failure I assume, which has since been fixed).

PR raised to resolve it: #3072

@polarathene polarathene merged commit 88767f7 into docker-mailserver:master Feb 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/tests kind/improvement Improve an existing feature, configuration file or the documentation service/security/dkim-dmarc-spf
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants