TreeSpellChecker: cases where it doesn't work as expected? #166

Open
pudiva opened this issue Oct 18, 2021 · 1 comment

pudiva commented Oct 18, 2021

I've just started playing with TreeSpellChecker and found a couple of cases where it doesn't give me the corrections I would expect.

If I run this script in ruby-3.0.3:

#!/usr/bin/env ruby
require 'set'

paths = [
  "dir/typo",
  "typo",
]
sc = DidYouMean::TreeSpellChecker.new(dictionary: paths)
puts "*** none slashed:"
puts sc.correct("tyop")

paths = [
  "dir/typo",
  "typo",
]
sc = DidYouMean::TreeSpellChecker.new(dictionary: paths)
puts "*** query slashed:"
puts sc.correct("/tyop")

paths = [
  "dir/typo",
  "/typo",
]
sc = DidYouMean::TreeSpellChecker.new(dictionary: paths)
puts "*** dict slashed:"
puts sc.correct("tyop")

paths = [
  "dir/typo",
  "/typo",
]
sc = DidYouMean::TreeSpellChecker.new(dictionary: paths)
puts "*** both slashed:"
puts sc.correct("/tyop")

I get this output:

*** none slashed:
*** query slashed:
dir/typo
*** dict slashed:
*** both slashed:
/typo

I think that in all four cases the second path in the list (either typo or /typo) should be suggested.
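
One way to probe this expectation is to run the same query through the flat DidYouMean::SpellChecker, which ignores path structure, next to TreeSpellChecker. This is only a sketch for comparison, not a proposed fix. It reuses the dictionary from the "none slashed" case; the variable names are illustrative, and no particular output is assumed since results may differ between did_you_mean versions.

```ruby
require 'set'          # mirrors the scripts in this issue; harmless if already loaded
require 'did_you_mean'

# Same dictionary as the "none slashed" case.
paths = ["dir/typo", "typo"]

# Tree-aware checker (the one this issue is about).
tree = DidYouMean::TreeSpellChecker.new(dictionary: paths)

# Flat checker that treats each path as a plain word.
flat = DidYouMean::SpellChecker.new(dictionary: paths)

# Print both results side by side for the same misspelled query.
puts "tree: #{tree.correct("tyop").inspect}"
puts "flat: #{flat.correct("tyop").inspect}"
```

If the flat checker suggests typo while the tree checker returns nothing, that would support the idea that the top-level entry is being missed because the query has no directory part.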

What do you think? ✨✨✨
cc @yuki24

pudiva commented Oct 18, 2021

Trying to work around the issue above, I decided to just prefix everything with ROOT/ and unprefix it later on, but then I found some other cases...

Script:

#!/usr/bin/env ruby
require 'set'
paths = [
  "ROOT/dir/typo",
  "ROOT/typo",
]
sc = DidYouMean::TreeSpellChecker.new(dictionary: paths)

puts "*** matches typo :)"
puts sc.correct("ROOT/tyop")

puts "*** doesn't match if too distant :)"
puts sc.correct("ROOT/a")

puts "*** matches too distant :("
puts sc.correct("ROOT/asduhij2ed8uuo3iekd3e/238eoiu3jkr3o48if")

Output:

*** matches typo :)
ROOT/typo
*** doesn't match if too distant :)
*** matches too distant :(
ROOT/dir/typo
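
For reference, the prefix-and-unprefix workaround described in this comment can be sketched as below. The helper names (add_root, strip_root, correct_with_root) and the ROOT_PREFIX constant are hypothetical and exist only for this illustration; they are not part of did_you_mean. The sketch inherits the "too distant" behaviour shown in the output above, so it is a workaround for the first problem only.

```ruby
require 'set'          # mirrors the scripts in this issue; harmless if already loaded
require 'did_you_mean'

# Hypothetical artificial root segment; assumed not to collide with a real path.
ROOT_PREFIX = "ROOT/"

# Prepend the artificial root so every entry has at least one directory level.
def add_root(path)
  "#{ROOT_PREFIX}#{path}"
end

# Remove the artificial root from a suggestion before showing it to the user.
def strip_root(path)
  path.delete_prefix(ROOT_PREFIX)
end

# Hypothetical wrapper: prefix the dictionary and the query, run the tree
# checker, then unprefix whatever it suggests.
def correct_with_root(paths, input)
  checker = DidYouMean::TreeSpellChecker.new(dictionary: paths.map { |p| add_root(p) })
  Array(checker.correct(add_root(input))).map { |suggestion| strip_root(suggestion) }
end

puts correct_with_root(["dir/typo", "typo"], "tyop")
```

Keeping the prefixing in one wrapper means callers never see ROOT/ in either the query or the suggestions.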
