Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow failing tests without immediate abort #676

Open
nathanperkins opened this issue Jun 28, 2023 · 6 comments
Open

Allow failing tests without immediate abort #676

nathanperkins opened this issue Jun 28, 2023 · 6 comments

Comments

@nathanperkins
Copy link

nathanperkins commented Jun 28, 2023

For testing, golang best practices state that tests should keep going as long as possible to give as much information about the failed test as possible. This is particularly critical in flaky and/or long-running tests where reproducing the failure takes considerable time and effort.

In standard golang testing this is achieved using t.Errorf, which marks the test as failed and continues, instead of t.Fatalf, which marks the test as failed and aborts immediately. I don't see a similar mechanism in gomega for making assertions which should continue.

From what I can tell, these solutions available in gomega today:

  • Gather your assertions into a map or struct which can be checked in a single Expect.
  • Write a custom matcher.

But these solutions can be clunky, tedious, and complex cases require a deep understanding of gomega. It becomes particularly difficult and unergonomic when trying to combine complex assertions involving a few different objects.

Could it be considered to provide another assertion option which would fail the test and continue? This feature would make it relatively trivial to follow testing best practices to expose as much information about test failures as possible.

In case, I've missed an existing solution which is similarly trivial, it may be worth highlighting more prominently in the docs. I did Ctrl-F Errorf and Fatalf in https://onsi.github.io/gomega/ and scanned through but I didn't see anything.

@onsi
Copy link
Owner

onsi commented Jul 7, 2023

Hey @nathanperkins I tend to take a different view on how failure should be handled, particularly for long-running/flaky tests. But we don't have to dig into those more philosophical/subjective differences here 😅

There are a few mechanisms that enable you to allow test failures without immediately aborting the current test.

  1. You can wrap your assertions with InterceptGomegaFailures()
  2. You can override the global failure handler associated with the default Gomega (the one that handles bare invocations of Expect and Eventually) with RegisterFailHandler
  3. You can create a new Gomega with a custom fail handler and use the new Gomega in all your assertions

I'd probably lean towards #3. The simplest thing you could do (assuming you are using testing and not Ginkgo) is something like:

g := NewGomega(func(message string, _ ...int) {
    t.Helper()
    t.Errorf("%s\n",  message)
})

and then use g.Expect(), g.Eventually(), etc...

@nathanperkins
Copy link
Author

Thanks for your helpful responses! I hope this follow up isn't a bother :)

(assuming you are using testing and not Ginkgo)

We are using Ginkgo :)

Hey @nathanperkins I tend to take a different view on how failure should be handled, particularly for long-running/flaky tests. But we don't have to dig into those more philosophical/subjective differences here 😅

No problem! I hope you don't mind if I explain with a more concrete example, in case this is an XY Problem.

We create and maintain Kubernetes controllers (using kubebuilder) which manage advanced networking functionality on Kubernetes clusters.

  1. Our E2E tests take about 45 minutes to set up the environment, which is orchestrated by bash script which calls out to some internal cluster creation services.
  2. Once the cluster is ready, we transfer the e2e test binary (with ginkgo and gomega tests) to our cluster and then execute it.
  3. We use Context + BeforeEach + JustBeforeEach to deploy K8s objects in the cluster to set up the state we are looking for. This process can take multiple minutes.
  4. We use It() blocks to validate the result. Usually this means doing ping, curl, checking the state of external networking components or simply getting objects from K8s and checking its state. It's common for us to be validating the state on multiple objects, which may be a list of the same type or may be different types.
  5. We use AfterEach and JustAfterEach blocks to delete the K8s objects and wait for the environment to be clean.

In one It block, we may have Expects on the state of Object A, object B and object C. If our check on object A fails, it is still incredibly valuable for us to see what object B and C look like. It could reveal that there is a problem isolated to object A or it could reveal that the problem is across all objects. If the test is flaking frequently, knowing that one object failed vs all is helpful. When the test aborts immediately, we can only see one object failed at a time. We have to correlate the failure across multiple runs to get a full picture (or it may be impossible to get a full picture at all).

Now, I understand that the common ginkgo practice would be for us to try to separate these checks into different It blocks. But in practice, it may make the tests harder to follow. And in our case, it increases the runtime greatly because each It block requires destroying and recreating objects then waiting for them to be ready. Each It block adds 1-3 minutes to the total test runtime.

It may be that we need to change how we use ginkgo and gomega to make our tests more efficient but I haven't found any good patterns for reducing runtime here without asserting multiple things in each It block.

@nathanperkins
Copy link
Author

Another approach I was thinking about would be if there was a wrapper similar to InterceptGomegaFailures() which could unconditionally run all of the Expects in a block and if any failed, log the failures and mark the test as failed.

g.ExpectAll(func() {
  g.Expect(ObjectA).Should(Equal(wantObjectA))
  g.Expect(ObjectB).Should(Equal(wantObjectB))
  g.Expect(ObjectC).Should(Equal(wantObjectC))
})

@onsi
Copy link
Owner

onsi commented Jul 10, 2023

first off ❤️ ❤️ ❤️ for wanting to avoid the XY problem!

Yeah I've been seeing a lot of k8s usecases with super expensive setup at the Describe/Context level. Ginkgo was originally built around the notion that you could do the expensive stuff in a single BeforeSuite - but I appreciate that there are plenty of usecases in k8s where a collection of tests need to operate on a particular set of controllers/operators and you don't want to spend time setting up/tearing down the controllers/operators. If you have lots of Contexts that do this I sometimes point folks at Ordered containers so they can perform the expensive setup once and run a bunch specs in order - you lose parallelization within the container but often there are enough other tests to run that Ginkgo can keep the running Ginkgo processes saturated with work. So Ordered containers with ContinueOnFailure could be a potential solution for you...

...but, honestly, I'd probably lean in the direction of your last comment and build a custom VeryifyAll function. I could imagine adding something like it to Gomega but I think you have all the building blocks you need to do it yourself:

If you are using the global Gomega DSL (e.g. either import . "github.com/onsi/gomega" or import g "github.com/onsi/gomega" then you can just:

func VerifyAll(assertions func()) {
    GinkgoHelper()
    failures := InterceptGomegaFailures(assertions)
    if (len(failures) > 0) {
        Fail("Unexpected Assertion Failures:\n", strings.Join(failures,"\n"))
    }
}

if you're making and managing your own gomega instance I can share an alternative example.

This allows you to explicitly opt into this behavior when you want to make multiple assertions - while preserving the default behavior (which, honestly, I think is usually what you want. e.g. No sense pushing a container to the cluster to test an operator if the api call you made to deploy the operator failed!).

@nathanperkins
Copy link
Author

That helps a lot. Thank you so much for the thoughtful response!

We will try a combination of both:

  • Ordered blocks to reduce the number of setup+teardown cycles.
  • VerifyAll to combine multiple assertions.

Is VerifyAll something you'd want contributed to ginkgo? I can do a PR if you tell me where it goes.

@onsi
Copy link
Owner

onsi commented Jul 11, 2023

thanks @nathanperkins - glad we could land on a few things that might help. I'm open to a PR, sure - I think it would make sense to go in Gomega alongside the Intercept... functions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants