Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add support for matching multiple patterns #1139

Merged
merged 5 commits into from
Nov 21, 2022
Merged

Conversation

Uthar
Copy link
Contributor

@Uthar Uthar commented Oct 15, 2022

Added the --and --expr --and flags that search for matches containing all the patterns, similar to the -and flag of GNU find.

minad/consult#6
#315

@Uthar Uthar force-pushed the master branch 2 times, most recently from c6877b6 to c409682 Compare October 15, 2022 12:33
@sharkdp
Copy link
Owner

sharkdp commented Oct 31, 2022

Thank you very much for your contribution!

Before we consider the integration of this new feature, please see my comment in #315 (comment):

Ok, I'm inclined to accept a feature request to support --and <pattern>. Before we implement this, we need:

  • A short discussion about the command-line option name. What do other tools use?
  • A detailed analysis if this could clash with any of the other command-line options or features of fd. There are some immediate questions like: what does fd patternA --and patternB --type f mean? (we are not going to support the meaning patternA AND (patternB AND type==file)).

Ideally, the second point should be discussed in detail and represented in corresponding unit tests.

@Uthar
Copy link
Contributor Author

Uthar commented Oct 31, 2022

A short discussion about the command-line option name. What do other tools use?

  • ripgrep uses -e/--regexp
-e, --regexp <PATTERN>...                    
  A pattern to search for. This option can be provided multiple times, where
  all patterns given are searched. Lines matching at least one of the provided
  patterns are printed. This flag can also be used when searching for patterns
  that start with a dash.
  
  For example, to search for the literal '-foo', you can use this flag:
  
      rg -e -foo
  
  You can also use the special '--' delimiter to indicate that no more flags
  will be provided. Namely, the following is equivalent to the above:
  
      rg -- -foo

  • GNU find uses the following syntax:
expr1 expr2
       Two expressions in a row are taken to be joined with an implied -a; expr2 is not evaluated if expr1 is false.

expr1 -a expr2
       Same as expr1 expr2.

expr1 -and expr2
       Same as expr1 expr2, but not POSIX compliant.

IMHO I like fd first --and second --and third, because it keeps backwards compatibility while giving new functionality/use case

@Uthar
Copy link
Contributor Author

Uthar commented Oct 31, 2022

what does fd patternA --and patternB --type f mean?

I think the --and flag would simply mean "Find all files whose names match all patterns specified by the initial [pattern] and all the --and flags"

No such nesting like you described would be possible

I don't know all of fd's flags, but it looks to me like it will compose well, being just an additional filter for names.

unit tests

I created some basic unit tests. What other cases should be covered?

Some ideas:

  • test with empty patterns
  • test with unreasonable pattern should match nothing
  • a broken pattern in any of the --and flags should signal an error

@sharkdp
Copy link
Owner

sharkdp commented Oct 31, 2022

I think the --and flag would simply mean "Find all files whose names match all patterns specified by the initial [pattern] and all the --and flags"

Sounds reasonable. I would still like to see some thought going into potential conflicts (or chances for combination) with other command line options. To further elaborate what I meant by the quoted example, consider --extension/-e instead of --type, which acts similar:

fd patternA --and --patternB --extension jpg --extension png

Since --extension (like --type) combines in an OR-sense, this would currently mean:

  • the filename has to match patternA AND patternB
  • the extension has to match jpg OR png

which might certainly be a bit confusing. Especially if someone writes this as:

fd patternA --extension jpg --and --patternB --extension png

(which would do the same thing, as we don't care about the order of options)

Other potentially interesting options to look at:

  • --glob/--fixed-strings
  • --ignore-case/--case-sensitive
  • --full-path
  • --extension/--type/--size/etc.

a broken pattern in any of the --and flags should signal an error

That is definitely worth checking.

Another thing to potentially test would be patterns starting with a dash. Usually, we solve this using the -- separator: fd -- '-foo'. Not sure if we could make it work for --and as well. Might not even require the -- separator, but maybe a special clap setting to allow the value to start with a dash.

@Uthar
Copy link
Contributor Author

Uthar commented Oct 31, 2022

patterns starting with a dash

One could also use ^[-]

@Uthar
Copy link
Contributor Author

Uthar commented Oct 31, 2022

Hmm you're right, maybe an --expr flag would sound better?

@Uthar
Copy link
Contributor Author

Uthar commented Oct 31, 2022

Wrote tests for all these in new commit:

--glob

--and patterns contribute to the search as globs, not regexps

--fixed-strings

--and patterns are matched as whole strings, not regexp

--ignore-case

case is ignored in --and patterns as well as the primary pattern

--case-sensitive

case is sensitive in --and patterns as well as the primary pattern

--full-path

--and patterns contribute to the search in the whole path, not just file or dir name

--extension

--extension and --and patterns are composed in a logical way, i.e. patterns match file names via AND and extension matches extension via OR

--type

--type additionally restricts the search done by all the --and patterns

--size

didn't check, I didn't know how

Might not even require the -- separator,

With --and flags it works with a single dash, but not with two.

also added these from my post:

test with empty patterns
test with unreasonable pattern should match nothing
a broken pattern in any of the --and flags should signal an error

tests fails for non-ubuntu CI... let me check...

@Uthar
Copy link
Contributor Author

Uthar commented Oct 31, 2022

tests fails for non-ubuntu CI... let me check...

Can't make sense out of it. Is there some Mac and Windows experts on board?

@tmccombs
Copy link
Collaborator

tmccombs commented Nov 1, 2022

We probably want to clearly document how --and interacts with other options. The behaviors described above makes sense to me and are probably what I would expect in the absence of documentation, but I could easily see someone else expecting different behavior.

Regarding hyphens. We probably want to use the allow_hyphen_values option on the argument.

@sharkdp
Copy link
Owner

sharkdp commented Nov 1, 2022

@Uthar Thank you very much for the updates! Unfortunately, there are some larger merge conflicts now that I merged a few other PRs, especially #1067. I hope you can sort this out, otherwise please let us know.

@Uthar
Copy link
Contributor Author

Uthar commented Nov 1, 2022

Yes, I will attempt to solve this, stay tuned

@Uthar
Copy link
Contributor Author

Uthar commented Nov 1, 2022

Damn, looks like I will have to write it from scratch
It will take some time, I haven't written any rust code before

@Uthar
Copy link
Contributor Author

Uthar commented Nov 1, 2022

I will reopen once I have some more time

@Uthar Uthar closed this Nov 1, 2022
@tmccombs
Copy link
Collaborator

tmccombs commented Nov 2, 2022

I'm willing to help if you would like. After all, I made the PR that caused most of the conflicts.

@Uthar
Copy link
Contributor Author

Uthar commented Nov 2, 2022

Very nice of you tmccombs. My main problem is to write the code again, it takes a lot of time especially debugging the type errors with borrow and so on, because I lack experience. I just don't have the time right now

Ultimately we just have to change pattern to patterns in this line:

fd/src/walk.rs

Line 153 in 567ce26

spawn_senders(&config, &quit_flag, pattern, parallel_walker, tx);

Everything else is just argument parsing/changing parameters to functions

@sharkdp sharkdp reopened this Nov 2, 2022
@Uthar
Copy link
Contributor Author

Uthar commented Nov 2, 2022

I started some work here, the easy stuff :-) https://github.com/Uthar/fd/tree/conflicts

@Uthar
Copy link
Contributor Author

Uthar commented Nov 2, 2022

Ok, I've got this working now... but there are some issues that Rust experienced programmer will probably easily solve:

I'm trying to push the "initial" pattern to the optional vector of "and" patterns, but can't do that without borrowing, do you know how to handle it?

Also its panicking with no --expr flag currently, maybe unwrap() is not the best there?

I also don't know how to exit the program from the match .. Ok .. Err clause, when there is an error, it still requires a regex type.

Could you help me with these issues? See this commit: Uthar@d66470a

@Uthar
Copy link
Contributor Author

Uthar commented Nov 2, 2022

Full diff here master...Uthar:fd:conflicts

@Uthar
Copy link
Contributor Author

Uthar commented Nov 3, 2022

Done after all, we're back in the game :-)

But you should review the changes

@sharkdp
Copy link
Owner

sharkdp commented Nov 3, 2022

Thank you. The tests are currently failing on Windows and macOS

@Uthar
Copy link
Contributor Author

Uthar commented Nov 4, 2022

yeah... the failuress are really weird, dont make sense...

My friend has a macbook, I will ask them to try on it, also try to find a windows machine

@Uthar
Copy link
Contributor Author

Uthar commented Nov 4, 2022

fixed the tests

src/main.rs Outdated Show resolved Hide resolved
src/main.rs Outdated Show resolved Hide resolved
src/main.rs Outdated Show resolved Hide resolved
src/main.rs Outdated Show resolved Hide resolved
src/main.rs Outdated Show resolved Hide resolved
src/main.rs Outdated Show resolved Hide resolved
src/main.rs Outdated Show resolved Hide resolved
@Uthar
Copy link
Contributor Author

Uthar commented Nov 8, 2022

Thank you, tmccombs, for the thorough review.

See the last commit, where I added your solution to avoid match and panic! during generation of the regexp and changed type hints as you requested.

Let me know if you have some other improvement ideas.

src/cli.rs Outdated Show resolved Hide resolved
src/cli.rs Outdated Show resolved Hide resolved
src/cli.rs Outdated Show resolved Hide resolved
src/main.rs Show resolved Hide resolved
src/main.rs Show resolved Hide resolved
src/main.rs Outdated Show resolved Hide resolved
@Uthar
Copy link
Contributor Author

Uthar commented Nov 20, 2022

fd/src/walk.rs

Line 47 in 882e13a

/// Recursively scan the given search path for files / pathnames matching the pattern.

pattern -> patterns

@Uthar Uthar force-pushed the master branch 2 times, most recently from 9290923 to d498724 Compare November 20, 2022 11:13
@Uthar
Copy link
Contributor Author

Uthar commented Nov 20, 2022

So the --expr flag instead of --and sounds better to you?

@Uthar
Copy link
Contributor Author

Uthar commented Nov 20, 2022

fd/src/main.rs

Lines 438 to 442 in d498724

Err(anyhow!(
"The pattern seems to only match files with a leading dot, but hidden files are \
filtered by default. Consider adding -H/--hidden to search hidden files as well \
or adjust your search pattern."
))

The pattern -> A pattern / Some patterns ?

@sharkdp
Copy link
Owner

sharkdp commented Nov 20, 2022

So the --expr flag instead of --and sounds better to you?

No, I like --and.

The pattern -> A pattern / Some patterns ?

Maybe: "The pattern(s) seems to only match… or adjust your search pattern(s)".

src/cli.rs Outdated Show resolved Hide resolved
src/main.rs Outdated Show resolved Hide resolved
@Uthar Uthar force-pushed the master branch 2 times, most recently from d8196a1 to 1409ad6 Compare November 20, 2022 12:55
Copy link
Owner

@sharkdp sharkdp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. I made a couple of minor improvements.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants