Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New matcher idea: match_pattern for Ruby's pattern-matching #1436

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

r7kamura
Copy link
Contributor

@r7kamura r7kamura commented Nov 11, 2023

Summary

Pattern-matching has been available since Ruby 3.0. How about providing this as a RSpec matcher?

Examples

Basic example

# Pass
expect([1, 2, 3]).to match_pattern([Integer, Integer, Integer])

# Fail
expect([1, 2, 3]).to match_pattern([Integer, Integer, String])

More realistic example

node = Nokogiri::HTML5.parse(response.body).at('.foo')
expect(node).to match_pattern(name: 'a', href: 'http://example.com')

This is done using nokogiri's pattern-matching feature:

Background

As a background, I am usually involved in Rails app development, and since recent Rails makes it easier to do pattern matching using response.parsed_body, so I thought it would be nice if RSpec had the ability to support this.

Implementation

I have come up with two implementation ideas and I think either is fine, but the latter seems closer to what RSpec is aiming for, so I decided to propose this one for the time being, even if it is a bit magical.

The former has its advantages here because it is simple to implement and easy to understand. Please let me know what you think.

Using block

First in thinking about this, I referred to assert_pattern in minitest:

Minitest's pattern-matching feature provides an interface that encloses the entire expression in a block, as shown below:

assert_pattern { [1, 2, 3] => [Integer, Integer, Integer] }

A similar implementation could be done in RSpec. In short, we can have a match_pattern that does the following:

expect { [1, 2, 3] => [Integer, Integer, Integer] }.not_to raise_error(NoMatchingPatternError)
expect { [1, 2, 3] => [Integer, Integer, Integer] }.to match_pattern

Using instance_eval

In RSpec, it is common to write expect(actual).to ... , so it would be nice if we could provide pattern-matching in a way that follows this convention.

I thought of a way to achieve this by using inspect and instance_eval to carry around the given expected pattern as a value.

expect([1, 2, 3]).to match_pattern([Integer, Integer, Integer])

In this implementation, the following code will be evaluated in the above code:

[1, 2, 3] in [Integer, Integer, Integer]

@pirj
Copy link
Member

pirj commented Nov 11, 2023

@zverok what do you think?

@zverok
Copy link

zverok commented Nov 11, 2023

So... Just thinking out loud!

My design space analysis is as follows:

  1. I personally would like any new matcher to be fully on the right side of to (i.e., allowing to always write expect(subject).to matcher or expect { subject }.to matcher, without any change to the left side — also allowing implicit subjects); which, for me, excludes the first option.
  2. I personally believe I would be confused if the new feature would be pretending to accept a Ruby's pattern-matching, but actually wouldn't, i.e., accepted some subset of Ruby that "looks like" some patterns but wouldn't be handled with the same syntactical rules actually. I believe it is a great source of possible confusion. There are many valid options that people might want, and which are valid patterns yet invalid in other contexts (and will be therefore invalid in this "imitation" syntax), like find patterns [*, {id: 3}, *], variable pinning: {id:, ... {linked_id: ^id}}, or even just (if "we have a full pattern matching here! it is more expressive!") trying to use {x:, y:} instead of hash_including(:x, :y). Which, basically, makes the second option undesirable.
  3. An aside note: Ideally, and for realistic large data structures (like JSON responses), it would be cool to provide a detailed report of "which part exactly didn't match"

So, what options do I see viable for this matcher?

1. A block on the right side that re-accepts the subject and just allows to write the check:

expect([1, 2, 3]).to match_pattern { _1 in [Integer, String, Integer] }
# expected to match pattern [Integer, String, Integer], but didn't match

This is currently possible in almost the same form this way:

expect([1, 2, 3]).to satisfy { _1 in [Integer, String, Integer] }
# expected [1, 2, 3] to satisfy expression `_1 in [Integer, String, Integer]`

(I believe the expression is extracted by just reading the file source, and the same can be done by match_pattern, just cleaning up _1 in part)

2. The same block, yet also (or only) expecting you to write => which then analyzes the error:

expect([1, 2, 3]).to match_pattern { _1 => [Integer, String, Integer] }
# expected to match pattern [Integer, String, Integer], but didn't match:
#  String === 2 does not return true

Here, we are doing the same as above yet also catching the NoMatchingPattern error and extract the relevant part of text from it.

(That's how it will fail in this case... I don't think it is ideal, and probably there is some room for improvement in Ruby core with good proposals for better structuring "what haven't matched" info, and probably even providing it in a structured form in the exception object!)

Obviously, both forms would be less appealing for those who HATE numbered block params (and those people are plenty!), so they'll need to write

expect([1, 2, 3]).to match_pattern { |val| val => [Integer, String, Integer] }

With a bit of trickery (providing proper evaluation context), we can provide some nice name available inside a block instead (but this "trickery" might lead to undesirable effects either, and deciding on the name is not easy):

expect([1, 2, 3]).to match_pattern { value => [Integer, String, Integer] }

3. The third idea is much more unholy and also currently impossible, but looks more appealing as a matcher syntax:

expect([1, 2, 3]).to match_pattern => [Integer, String, Integer]

In my imagination, it would work by having match_pattern just provide an object to match (the same as subject or some wrapper), and then adding some "custom handler" to RSpec core, which will catch NoMatchingPattern error, and treat it as "that matcher that added the handler, have failed". This is kinda tricky, but wouldn't work in the current Ruby nevertheless: => has a "void context" (same as, say, return), and can't be put in the expression in this place.

4. The fourth one is a compromise on the above:

expect([1, 2, 3]).to { match_pattern => [Integer, String, Integer] }

This is valid syntax, and while also requires adjusting RSpec core, the adjustment seems more understandable than "adding a global handler" (in this case, the exception needs to be handled in the limited scope of the "block passed to to", not in the scope of the "entire example"). Maybe fiddling a bit around "how it can be implemented" might lead to some insights about adjustments in RSpec core (expect/to) that would be useful... Or maybe not!

Anyway, atm, honestly, no option looks super-persuasive for myself either. Would I need a lot of PM-in-spec (unfortunately, on my dayjob I still stuck with Ruby 2.7 where pattern matching was experimental and we prefer to avoid that unfinished-yet syntax), I would start with just satisfy { _1 in pattern } and after several weeks/months would try to rethink it and see whether it requires some more tinkering.

@r7kamura
Copy link
Contributor Author

Very thoughtful input, thank you!

Indeed, the instance_eval idea is not a good one where a lot of pattern-matching syntax doesn't actually work properly.

The expect([1, 2, 3]).to match_pattern { |val| val => [Integer, String, Integer] } idea seems the most reasonable of the current proposals for me, although I think the drawback is that it looks a bit clunky. Even from my personal observation, the use of numbered parameters does not seem to be widespread yet, nor does it seem to be that popular, so many people will probably use the normal block argument format.

Regarding the "what haven't matched" info, I was thinking that since pattern matching has variable capturing feature, it would be nice to be able to partially verify it using aggregated failures as follows:

expect(hash).to match_pattern { _1 => { user: user } }
expect(user).to match_pattern { _1 => { name: 'foo' } }

However, blocks create local variable scopes, so we have to do something tricky to make this work well.

Thinking about it this way, I wonder if it is no longer necessary to use any matcher and just do the assignment as usual...🤔

it 'returns something' do
  subject => { user: user }
  user => { name: 'foo' }
end

@r7kamura
Copy link
Contributor Author

r7kamura commented Dec 8, 2023

By the way, I heard that it might be available in Ruby 3.4 instead of _1.

If that happens, we may be able to write the following:

expect([1, 2, 3]).to match_pattern { it => [Integer, String, Integer] }

But interestingly (unfortunately), the naming conflicts with RSpec's #it, which is very, very confusing...

Copy link
Member

@JonRowe JonRowe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this idea but would have a preference for this sort of style:

expect([1, 2, 3]).to match_pattern { _1 in [Integer, String, Integer] }


context 'when pattern-matching is not supported on the current Ruby version' do
before do
stub_const('RUBY_VERSION', '2.7.0')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering we run builds on multiple rubies, you don't need to do this, you can add a flag to rspec-support for wether pattern matching is supported, then flag one set of tests as :if => ... and the other has :unless => ...


def supported_ruby_version?
RUBY_VERSION.to_f >= 3.0
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would typically conditionally define the entire matcher on supported rubies, and then have a dummy implementated that raises the not implemented error.


begin
instance_eval(<<~RUBY, __FILE__, __LINE__ + 1)
actual in #{expected.inspect}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If theres a way to avoid using eval that would be ideal...

expect([1, 2, 3]).to match_pattern([Integer, Integer, String])
```

Scenario: Pattern usage
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would also have to be tagged and excluded on unsupported ruby, and as it is used in our docs we would need a note about supported ruby versions in that,

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants