Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature idea: aggregate_failures #733

Closed
myronmarston opened this issue Feb 12, 2015 · 20 comments
Closed

Feature idea: aggregate_failures #733

myronmarston opened this issue Feb 12, 2015 · 20 comments

Comments

@myronmarston
Copy link
Member

When you've got multiple independent expectations to make about a particular result, there are generally two routes people take:

RSpec.describe MyThing do
  it "does the thing and makes many expectations about it" do
    result = MyThing.do_it

    expect(result).to blah
    expect(result).to blah_blah
    expect(result).to have_other_interesting_properties
  end
end

# or

RSpec.describe MyThing do
  let(:result) { MyThing.do_it }
  specify { expect(result).to blah }
  specify { expect(result).to blah_blah }
  specify { expect(result).to have_other_interesting_properties }
end

Nether is perfectly ideal -- the latter gives you failure output for each expectation that failed, whereas the former runs faster, particularly if MyThing.do_it is slow. People ask if RSpec supports a better solution for this from time to time (the most recent example is on the mailing list).

One solution is available in rspec-given: it's And API. That's not going to work in vanilla RSpec, of course, but it got me thinking about this, and I've come up with an idea that I think would work quite well: a new block API like aggregate_failures. Here's how it would work:

RSpec.describe MyThing do
  it "does the thing and makes many expectations about it" do
    result = MyThing.do_it

    aggregate_failures do
      expect(result).to blah
      expect(result).to blah_blah
      expect(result).to have_other_interesting_properties
    end
  end
end

The idea is that you can wrap any set of expectations with it, and it aggregates all the expectation failures that occur in the block, and, if any occur, raises a single exception at the end of the block containing all the failure messages.

Implementation wise, I've thought of a super simple way to do this: change fail_with so that rather than calling raise directly, it instead invokes RSpec::Expectations.configuration.failure_notifier.call(failure). We'd implement failure_notifier so that it uses a thread local for storage and defaults to Kernel.raise. aggregate_failures would then simply change the failure_notifier for the current thread to an alternate implementation that aggregates the failures rather than raising immediately, yield, change it back, and then raise with all the failure if there are any.

I'm also thinking that we could automatically make this available via metadata as well:

RSpec.describe MyThing do
  it "does the thing and makes many expectations about it", :aggregate_failures do
    result = MyThing.do_it

    expect(result).to blah
    expect(result).to blah_blah
    expect(result).to have_other_interesting_properties
  end
end

...which would use an around hook to wrap the example in aggregate_failures.

I'd like to hear from others about this:

  • Good idea or bad idea? (And why?)
  • Is the name aggregate_failures the right name for it? (It's the first OK name I thought of but I don't love it). Other ideas I have are collect_failures and delay_failures. I don't love any of my ideas, though.
  • Are there any gotchas or caveats we'd have to document with this that would make it more complex to use?

/cc @rspec/rspec

@JonRowe
Copy link
Member

JonRowe commented Feb 12, 2015

I'm not sure this will solve the issue because people want multiple failures, not one massive failure, but it is a good compromise and I personally like it as a starter.

I wonder if we could somehow report (we probably can using the reporter), multiple failures from one test (so it's 1 pass, or x failures)... This gives you the advantage of having the compact output when passing, but having individual failures you can fix...

👍 on the metadata option too...

@xaviershay
Copy link
Member

I don't necessarily think people want multiple failures, they just want to see all the failure messages.

I would use this. I've wasted more time than I care to admit flipping between whether I do the first or second of the current ways to do this currently.

Implementation wise the only part that scares me a little is the thread local, though I don't have any concrete issues with it. I like the metadata option, I'd be leaning towards only supporting that. (And maybe even making it the default eventually?) I don't see a strong reason (beyond esoteric hypotheticals) for either using multiple blocks, nor having some expects inside and some outside. At least, I'd start with metadata option first and then only add block if we found it lacking.

Could you instance eval the block with a redefined expect? That's probably just as weird as using a thread local...

@myronmarston
Copy link
Member Author

I'm not sure this will solve the issue because people want multiple failures, not one massive failure, but it is a good compromise and I personally like it as a starter.

As @xaviershay said, I think people just want to see all the failure messages. Ultimately one example is one example and can only count as one failure in the formatter summary. This feature would never change that.

We could, however, add some additional metadata to RSpec::Expectations::ExpectationNotMetError such that it would have an attribute that returns a list of the failures (in addition to having a message that returns all of them formatted into one string). We could potentially put some logic in rspec-core's formatters to format the output more intelligently to make it clear you are getting failures from multiple expectations in one example. I think that would come later, though.

Implementation wise the only part that scares me a little is the thread local, though I don't have any concrete issues with it.

Thread locals always feel a bit like a hack but it's the best solution I could come up with, and making it thread local seems important in light of our plans to support a multi-threaded runner at some point (e.g. rspec/rspec-core#1254).

I like the metadata option, I'd be leaning towards only supporting that.

Problem is, rspec-expectations has no concept of example metadata because it has no concept of examples. Metadata is purely an rspec-core concept. Given that we intend for rspec-expectations to be usable on its own (or in a context like cucumber), it needs to expose an API for this, and then rspec-core could use that and wire up the metadata.

I can also see cases where it would be useful to only apply this to part of an example. For example:

context "when a record has been saved" do
  it "returns a response with particular properties" do
    expect(MyModel.create(valid_attributes)).to be true

    get '/some/endpoint'

    aggregate_failures do
      expect(response).to foo
      expect(response).to bar
    end
  end
end

In this example, the first expectation is validating an assumption/pre-condition that the record got saved properly. (In practice you'd probably use create! but this is just meant as an example...). The last two expectations are independent from each other and can be aggregated, but if the first fails, you'd want the example to bail at that point like it does now.

And maybe even making it the default eventually?

Maybe, although again I can see cases where you may not want it, like the case above. It's something we can consider down the line if we see this being used in most situations by users.

Could you instance eval the block with a redefined expect? That's probably just as weird as using a thread local...

It's not just as weird. It's far, far weirder. I briefly thought about taking that route, but consider:

  • In normal instance_eval situations you don't have access to the same methods as outside the block because the self is different. That would be hugely problematic (even basic constructs like let would not work with this as it's just a method def). You can do hacks like method_missing to forward messages to the outer context but I really don't want that complexity.
  • If you call a helper method that has an expectation within it, it would be the normal expect and would not get the same behavior which would be confusing.
  • This wouldn't work for should -- which isn't something we want to keep investing effort into maintaining, but given there's a simpler solution that works just fine with should this is another negative of the instance_eval solution.

So one big question is: what do we call this? My best ideas so far are:

  • aggregate_failures
  • collect_failures
  • delay_failure (or delay_failures -- either way works)
  • allow_multiple_failures

All those names work OK for me but none is obviously the best name. Any better ideas (or does one of those standout as best?)

A couple other issues to consider...

  • What do we do if some other type of exception is raised in the block? I think the right behavior would be to rescue the exception, and:
    • If no expectation failures have occurred, re-raise the exception
    • If expectation failures have occurred, append the exception's message to the list of failures and raise an ExpectationNotMetError exception with the full failure message (containing both the expectation failures and the additional exception)
  • How do rspec-mocks message expectations (and various other types of mock failures) play into this? I have no great ideas here yet.
  • As around hooks work now, I'm realizing that using this API within one is potentially problematic. Consider this spec file:
RSpec.describe "Around hooks" do
  around do |ex|
    puts "Before outer around"
    ex.run
    puts "After outer around"
  end

  context "when an error is raised in an example" do
    it "still runs the after logic of the around hook" do
      raise "boom"
    end
  end

  context "when an error is raised in the after logic of an around hook" do
    around do |ex|
      puts "Before inner around"
      ex.run
      raise "boom"
    end

    it "aborts the after logic of other around hooks" do
    end
  end
end

Here's the output of rspec --format doc --order defined:

Around hooks
  when an error is raised in an example
Before outer around
After outer around
    still runs the after logic of the around hook (FAILED - 1)
  when an error is raised in the after logic of an around hook
Before outer around
Before inner around
After inner around
    aborts the after logic of other around hooks (FAILED - 2)

...

Notice that After outer around is only printed for the first spec. In general, you can put cleanup logic in the after portion of an around hook and trust that it will get run even if the example fails, because RSpec itself rescues exceptions in the example, marks it as failed, and continues to run your around hook. But when an exception occurs in an around hook (as is the case in the 2nd context), it can prevent the after logic of the outer around hook from running.

This matters here because the metadata integration we have discussed with trigger this issue, since it would raise an error from the after portion of an around, potentially preventing other user hooks from completing. We should probably look into a fix in rspec-core for that before implementing the metadata integration (or find an alternate way to make the metadata integration work).

@ianchesal
Copy link

I too would use this. I've run in to a case where I'm driving long-running integration tests with Rspec and would rather all my expect() blocks execute than failing on the first missed expectation because setup for the tests take a long time and I want a full list of problems that need attention while only incurring one setup.

@waterlink
Copy link
Contributor

I would definitely love this feature, both metadata and block approach

@fables-tales
Copy link
Member

I'm hugely 👍 on this feature.

@myronmarston
Copy link
Member Author

I plan to tackle this in the next week or so and then I'm hoping to ship RSpec 3.3 after that.

@cupakromer
Copy link
Member

👍 for this feature. While I 💙 💙 composable + compound matchers, there are times they do not work cleanly to define a "behavior". At that point I must decide how if it's worth writing a custom matcher or not. This feature would allow me to skip those times where I do not want or need a custom matcher but still want to group a set of expectations.

I like the metadata option, I'd be leaning towards only supporting that.

I would really like to see the block form as well. This would come in handy for "workflow" based specs. Often I care about the set of expectations at a particular point in the flow, not every expectation throughout the entire flow.

Would it be possible to assign the block form an additional description?

aggregate_failures "this explains some additional context" do
  expect(result).to blah
  expect(result).to blah_blah
  expect(result).to have_other_interesting_properties
end

So one big question is: what do we call this? My best ideas so far are:

  • aggregate_failures
  • collect_failures
  • delay_failure (or delay_failures -- either way works)
  • allow_multiple_failures

Personally I like aggregate_failures. It tells me all of their messages will be combined but that it will still trigger a failure at that point. With collect_failures it feels like it may not stop the spec execution at that point, this goes for delay_failure as well. The allow_multiple_failures could work and but I feel we are overloading allow with rspec-mocks usage.

I haven't spent too much time thinking about names but some others:

  • expect_set
  • expect_group
  • expect_all

Though that may be a bit too much expect for the block name and all expectations inside it.

What do we do if some other type of exception is raised in the block? I think the right behavior would be to rescue the exception, and:

  • If no expectation failures have occurred, re-raise the exception
  • If expectation failures have occurred, append the exception's message to the list of failures and raise an ExpectationNotMetError exception with the full failure message (containing both the expectation failures and the additional exception)

Personally, I would expect the exception to terminate the spec and be reported as the failure. For example, I can see someone writing:

aggregate_failures do
  expect(obj.foo).to blah

  bar.do_something(foo) # raises exception

  expect(obj.some_thing).to blah_blah
  expect(obj.foo).to have_other_interesting_properties
end

In the above contrived example, the last two expectations will fail because foo wasn't properly updated. However, the failure causes will be misleading as the real problem was the raised exception.

How do rspec-mocks message expectations (and various other types of mock failures) play into this? I have no great ideas here yet.

I don't have any real ideas. I would expect that any spy expectation to be verified at the end of the block. Likewise, any mock defined in the block would be verified at that point.

@myronmarston
Copy link
Member Author

Would it be possible to assign the block form an additional description?

Yes, although I would only want to support that if the additional description was used in some fashion. Do you have something in mind for how we would use that? If we're not going to use it for anything, the user can just use ruby comments to add additional context -- no need for us to support an argument we don't do anything with.

@cupakromer
Copy link
Member

Do you have something in mind for how we would use that?

I was thinking as part of the aggregated message. Did you have an idea what an aggregate failure message would look like?

@myronmarston
Copy link
Member Author

Did you have an idea what an aggregate failure message would look like?

I haven't given it too much thought, but something like:

Got 2 failed expectations from block at ./path/to/spec.rb:15:

1) <failure message one>

2) <failure message two>

I suppose the string could be used in place of the "at " so that aggregate_failures("checking JSON response") results in a message like:

Got 2 failed expectations from block "checking JSON response":

1) <failure message one>

2) <failure message two>

Although, perhaps we should still include the block source location.

@JonRowe
Copy link
Member

JonRowe commented Apr 12, 2015

I think it should still include the source location :)

@cupakromer
Copy link
Member

👍 for that output. I agree with Jon we should always include the source location.

@myronmarston
Copy link
Member Author

One potential issue with that output: when it is printed in the end-of-run list of failures, it could get confusing with the fact that the failures are numbered with 1), 2), etc, and then it also being used within and individual failure to list all the failed expectations. Any ideas for how to avoid that confusion?

myronmarston added a commit to rspec/rspec-support that referenced this issue Apr 12, 2015
This is intended to support the new `aggregate_failures`
feature discussed in rspec/rspec-expectations#733. I
put the `failure_notifier` here so that rspec-mocks can
use it as well -- that way, when we are aggregating failures,
we can aggregate rspec-mocks failures as well.
myronmarston added a commit to rspec/rspec-support that referenced this issue Apr 12, 2015
This is intended to support the new `aggregate_failures`
feature discussed in rspec/rspec-expectations#733. I
put the `failure_notifier` here so that rspec-mocks can
use it as well -- that way, when we are aggregating failures,
we can aggregate rspec-mocks failures as well.
myronmarston added a commit to rspec/rspec-support that referenced this issue Apr 12, 2015
This is intended to support the new `aggregate_failures`
feature discussed in rspec/rspec-expectations#733. I
put the `failure_notifier` here so that rspec-mocks can
use it as well -- that way, when we are aggregating failures,
we can aggregate rspec-mocks failures as well.
@fables-tales
Copy link
Member

Nested enumeration?

Failures:

1) example_that_aggregates_failures
  <message about aggregating failures> 
   1.1) first_expectation_that_failed
   1.2) Second_expectation_that_failed

2) example_that_doesnt_aggregate_failures_with_expectation_that_failed

myronmarston added a commit that referenced this issue Apr 19, 2015
@myronmarston myronmarston mentioned this issue Apr 19, 2015
6 tasks
myronmarston added a commit that referenced this issue Apr 19, 2015
myronmarston added a commit that referenced this issue Apr 20, 2015
myronmarston added a commit that referenced this issue Apr 20, 2015
myronmarston added a commit that referenced this issue Apr 20, 2015
myronmarston added a commit that referenced this issue Apr 20, 2015
myronmarston added a commit that referenced this issue Apr 21, 2015
myronmarston added a commit that referenced this issue Apr 21, 2015
@myronmarston
Copy link
Member Author

So I've started on this in #776. So far, here's the output we're getting:

RSpec.describe "An aggregated failure" do
  it "lists each of the individual failures in the failure output" do
    aggregate_failures "testing equality" do
      expect(1).to eq(2)
      expect(2).to eq(3)
      expect(3).to eq(4)
    end
  end
end
Failures:

  1) An aggregated failure lists each of the individual failures in the failure output
     Failure/Error: aggregate_failures "testing equality" do
       Got 3 failures from failure aggregation block "testing equality" at /Users/myron/code/rspec-dev/repos/rspec-expectations/tmp/aruba/spec/aggregated_failure_spec.rb:3:

         1) expected: 2
                 got: 1

            (compared using ==)

         2) expected: 3
                 got: 2

            (compared using ==)

         3) expected: 4
                 got: 3

            (compared using ==)
     # ./spec/aggregated_failure_spec.rb:3:in `block (2 levels) in <top (required)>'

Pretty decent, but what I'd like is this:

Failures:

  1) An aggregated failure lists each of the individual failures in the failure output
     Failure/Error: aggregate_failures "testing equality" do
       Got 3 failures from failure aggregation block "testing equality" at /Users/myron/code/rspec-dev/repos/rspec-expectations/tmp/aruba/spec/aggregated_failure_spec.rb:3:

         1.1) Failure/Error: expect(1).to eq(2)

                expected: 2
                     got: 1

                (compared using ==)
              # ./spec/aggregated_failure_spec.rb:4:in `block (2 levels) in <top (required)>'

         1.2) Failure/Error: expect(2).to eq(3)

                expected: 3
                     got: 2

                (compared using ==)
              # ./spec/aggregated_failure_spec.rb:5:in `block (2 levels) in <top (required)>'

         1.3) Failure/Error: expect(3).to eq(4)

                expected: 4
                     got: 3

                (compared using ==)
              # ./spec/aggregated_failure_spec.rb:6:in `block (2 levels) in <top (required)>'

     # ./spec/aggregated_failure_spec.rb:3:in `block (2 levels) in <top (required)>'

The latter includes the line of source code that rspec-core's failure formatting adds as well as the stacktrace for each failure. I haven't yet figured out the best way to get that but the error being raised exposes the data about all the failures so that rspec-core can apply its formatting logic to it.

I'm thinking we may actually want to cut the "Got 3 failures from the failure aggregation block" line, at least from the rspec-core failure output. It's redundant -- the line above it shows the aggregate_failures "label", and the block's location is shown in the backtrace at the end. I think it's worth keeping in the message as formatted by rspec-expectations, though, so that it is displayed when used in other test frameworks like cucumber.

Thoughts?

@JonRowe
Copy link
Member

JonRowe commented Apr 22, 2015

Yep, cut it :)

myronmarston added a commit that referenced this issue Apr 24, 2015
@myronmarston
Copy link
Member Author

So I've bee making progress on this. The formatting is honestly the hardest part! I've got some aweful code that needs refactoring that produces output like this:

1) An aggregated failure lists each of the individual failures in the failure output
   Got 3 failures from failure aggregation block "testing equality":
   # ./spec/aggregated_failure_spec.rb:3:in `block (2 levels) in <top (required)>'

   1.1) Failure/Error: expect(1).to eq(2)

          expected: 2
               got: 1

          (compared using ==)
        # ./spec/aggregated_failure_spec.rb:4:in `block (3 levels) in <top (required)>'

   1.2) Failure/Error: expect(2).to eq(3)

          expected: 3
               got: 2

          (compared using ==)
        # ./spec/aggregated_failure_spec.rb:5:in `block (3 levels) in <top (required)>'

   1.3) Failure/Error: expect(3).to eq(4)

          expected: 4
               got: 3

          (compared using ==)
        # ./spec/aggregated_failure_spec.rb:6:in `block (3 levels) in <top (required)>'

A few things to note:

  • I cut the Failure/Error: aggregate_failures "testing equality" do line. That line is not actually a failure -- it's just the start of the block that is aggregating failures. As such, it's not interesting to print the actual code there.
  • Since we're cutting that line, I added back the aggregation block label, since we are no longer printing the line of code. I like this better, anyway. Among other things, it means the string argument still has a purpose :).
  • I cut the at <source location> bit. The stack trace tells us where it is -- no need to state that twice.
  • I originally put the stacktrace for the aggregation block at the end, but that kind floats there weirdly and could be confusing when there are lots of failures (e.g. 10+) as it may be pages and pages of text between the aggregation description and the stack trace for it. So I've moved the aggregation block stack trace to just after the Got 3 failures from failure aggregation block "testing equality": line.
  • For the backtraces of the individual failures, I drop the frames that are included in the aggregation block stack trace -- no need to put those duplicate frames on the trace for every failure.

Thoughts? I definitely think this is improved from what I put above, but there is one thing that bugs me about it....the aggregation block stacktrace is in between the Got 3 failures from failure aggregation block "testing equality": line and the list of failures, which seems odd since the : suggest the list will come next. One solution might be to change the : to a . -- that way, it doesn't suggest the failures list will come next.

1) An aggregated failure lists each of the individual failures in the failure output
   Got 3 failures from failure aggregation block "testing equality".
   # ./spec/aggregated_failure_spec.rb:3:in `block (2 levels) in <top (required)>'

   1.1) Failure/Error: expect(1).to eq(2)

          expected: 2
               got: 1

          (compared using ==)
        # ./spec/aggregated_failure_spec.rb:4:in `block (3 levels) in <top (required)>'

   1.2) Failure/Error: expect(2).to eq(3)

          expected: 3
               got: 2

          (compared using ==)
        # ./spec/aggregated_failure_spec.rb:5:in `block (3 levels) in <top (required)>'

   1.3) Failure/Error: expect(3).to eq(4)

          expected: 4
               got: 3

          (compared using ==)
        # ./spec/aggregated_failure_spec.rb:6:in `block (3 levels) in <top (required)>'

Is that what we want?

@JonRowe
Copy link
Member

JonRowe commented Apr 27, 2015

That last output looks good to me, but are we printing the entire backtrace or just the one liner version? I'm thinking theres a strong case for only printing a truncated "this is where the error" came from rather than a full backtrace as these often get rather large and will just swamp the output.

@myronmarston
Copy link
Member Author

That last output looks good to me, but are we printing the entire backtrace or just the one liner version? I'm thinking theres a strong case for only printing a truncated "this is where the error" came from rather than a full backtrace as these often get rather large and will just swamp the output.

My plan is to print the entire backtrace of the aggregate_failures callsite as the overall backtrace, and then the backtrace from that point to the specific expectation failure for the sub-failures. That'll keep the sub-failure backtraces nice and short.

We could truncate the aggregate_failures backtrace but that would be a bad idea, IMO. Consider a case like:

def test_endpoint_with(version)
  get "/endpoint?v=#{version}" 
  aggregate_failures "checking response" do
    # expectations here
  end
end

it 'tests endpoint x' do
  test_endpoint_with 1
  test_endpoint_with 2
end

If we truncated to a single frame, you'd have no way to know whether the failure happened for the first test_endpoint_with call or the second. You need the stacktrace to be able to differentiate.

myronmarston added a commit that referenced this issue May 2, 2015
myronmarston added a commit that referenced this issue May 2, 2015
myronmarston added a commit that referenced this issue May 2, 2015
myronmarston added a commit that referenced this issue May 5, 2015
myronmarston added a commit that referenced this issue May 12, 2015
myronmarston added a commit to rspec/rspec-core that referenced this issue May 12, 2015
myronmarston added a commit to rspec/rspec-core that referenced this issue May 16, 2015
MatheusRich pushed a commit to MatheusRich/rspec-core that referenced this issue Oct 30, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants