mecab-ipadic: character encoding corruption #171137

Closed
4 tasks done
otegami opened this issue May 8, 2024 · 9 comments
Labels
bug Reproducible Homebrew/homebrew-core bug

Comments

@otegami

otegami commented May 8, 2024

brew gist-logs <formula> link OR brew config AND brew doctor output

% brew gist-logs mecab-ipadic                   
Error: No logs.
% brew config                
HOMEBREW_VERSION: 4.2.21
ORIGIN: https://github.com/Homebrew/brew
HEAD: 82c2e743a5bcea725f9ca1429e3e21c3088ff904
Last commit: 2 days ago
Core tap JSON: 08 May 07:34 UTC
Core cask tap JSON: 08 May 07:34 UTC
HOMEBREW_PREFIX: /opt/homebrew
HOMEBREW_CASK_OPTS: []
HOMEBREW_MAKE_JOBS: 8
Homebrew Ruby: 3.1.4 => /opt/homebrew/Library/Homebrew/vendor/portable-ruby/3.1.4/bin/ruby
CPU: octa-core 64-bit arm_firestorm_icestorm
Clang: 15.0.0 build 1500
Git: 2.44.0 => /opt/homebrew/bin/git
Curl: 8.4.0 => /usr/bin/curl
macOS: 14.4.1-arm64
CLT: 15.3.0.0.1.1708646388
Xcode: N/A
Rosetta 2: false
% brew doctor
Your system is ready to brew.

Verification

  • My brew doctor output says "Your system is ready to brew." and I am still able to reproduce my issue.
  • I ran brew update and am still able to reproduce my issue.
  • I have resolved all warnings from brew doctor and that did not fix my problem.
  • I searched for recent similar issues at https://github.com/Homebrew/homebrew-core/issues?q=is%3Aissue and found no duplicates.

What were you trying to do (and why)?

Output from mecab using the mecab-ipadic dictionary is garbled on an M1 Mac running macOS 14.4.1.
The problem appears after reinstalling mecab-ipadic with Homebrew: Japanese characters are now printed as garbled text.

What happened (include all command output)?

My Environment

% sw_vers
ProductName:        macOS
ProductVersion:     14.4.1
BuildVersion:       23E224
% sysctl machdep.cpu.brand_string
machdep.cpu.brand_string: Apple M1
% brew --version
Homebrew 4.2.21
% mecab --version
mecab of 0.996

What I tried to do

Before reinstalling mecab-ipadic, the output is fine.

% echo 'A' | mecab
A	名詞,固有名詞,組織,*,*,*,*
EOS

After reinstalling mecab-ipadic, the output is garbled text.

% brew reinstall mecab-ipadic
% echo 'A' | mecab               
A	̾??,??ͭ̾??,?ȿ?,*,*,*,*
EOS
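
Note: the garbled bytes above look like EUC-JP text being displayed in a UTF-8 terminal. This is an inference from the output, not something confirmed in this thread. A minimal sketch to test that hypothesis, assuming `iconv` is available with EUC-JP support; `decode_as_eucjp` is a hypothetical helper name:

```shell
# Re-decode bytes that were written as EUC-JP but displayed as UTF-8.
# Reads raw bytes on stdin and prints them converted to UTF-8.
decode_as_eucjp() {
  iconv -f EUC-JP -t UTF-8
}

# Example (requires mecab to be installed):
#   echo 'A' | mecab | decode_as_eucjp
```

If the converted output shows readable Japanese, the installed dictionary is probably emitting EUC-JP rather than UTF-8.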

What did you expect to happen?

Expected output

After reinstalling mecab-ipadic, the output should not be garbled.

% echo 'A' | mecab
A	名詞,固有名詞,組織,*,*,*,*
EOS

Step-by-step reproduction instructions (by running brew commands)

% brew install mecab mecab-ipadic
% echo 'A' | mecab               
A	̾??,??ͭ̾??,?ȿ?,*,*,*,*
EOS
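
Note: to see which encoding the installed dictionary was compiled with, `mecab -D` (`--dictionary-info`) prints dictionary metadata. The sketch below assumes that output contains a `charset:` line, as in stock MeCab builds; `dict_charset` is a hypothetical helper name.

```shell
# Print the charset field from `mecab -D` style output read on stdin.
# Assumes lines of the form "charset:<TAB>utf8".
dict_charset() {
  awk -F: '/^charset:/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Example (requires mecab to be installed):
#   mecab -D | dict_charset
```

A value other than `utf8` on a UTF-8 terminal would be consistent with the garbled output shown in the reproduction steps.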
otegami added the "bug" label (Reproducible Homebrew/homebrew-core bug) on May 8, 2024
otegami changed the title from "mecab-ipadic: Character Encoding Corruption After Homebrew Reinstallation" to "mecab-ipadic: Character Encoding Corruption" on May 8, 2024
otegami changed the title from "mecab-ipadic: Character Encoding Corruption" to "mecab-ipadic: character encoding corruption" on May 8, 2024
@carlocab
Member

carlocab commented May 8, 2024

The newer bottles seem broken for some reason. As a workaround, you can try

brew uninstall mecab-ipadic
brew fetch --os=big_sur mecab-ipadic
brew install "$(brew --cache --os=big_sur mecab-ipadic)"

But we should try to understand why the newer bottles are broken.

@carlocab
Member

carlocab commented May 8, 2024

It might be more helpful to consult the mailing list at https://lists.osdn.me/mailman/listinfo/mecab-users.

carlocab added a commit that referenced this issue May 8, 2024
This should help prevent #171137.
@otegami
Author

otegami commented May 8, 2024

Thank you for addressing it and suggesting the workaround.
If there is anything I can help with, please let me know 🙏🏾

@carlocab
Member

carlocab commented May 8, 2024

Based on my testing at #171139, the issue you are having seems to be specific to macOS 14 Sonoma running on arm64/Apple Silicon. In particular, it doesn't appear on macOS versions 12-14 running on x86_64/Intel, or on macOS 12 or 13 running on arm64/Apple Silicon.

Could you report this to the mailing list I linked above? (I don't speak Japanese, unfortunately, so doing it myself is difficult.)

Attaching the build logs from the broken build might be helpful: logs.tar.gz

@otegami
Author

otegami commented May 9, 2024

Sure. I will report it. Thank you for the build logs too.

@otegami
Author

otegami commented May 9, 2024

I've just reported it to the mailing list now.

@carlocab
Member

Thanks. Please update us if they have any insight about what's gone wrong, and if there's a fix available.

carlocab added a commit that referenced this issue May 14, 2024
carlocab added a commit that referenced this issue May 14, 2024
This should help prevent #171137.
@carlocab
Member

Should be fixed in #171139. After that's merged, please do

brew update && brew reinstall mecab mecab-ipadic

to pick up the fix.
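
Note: one way to confirm the reinstall helped is to check that mecab's output is valid UTF-8. A minimal sketch, assuming `iconv` is available and that your terminal expects UTF-8; `is_valid_utf8` is a hypothetical helper name:

```shell
# Succeed (exit 0) only if stdin is well-formed UTF-8.
# iconv exits non-zero when it meets an invalid byte sequence.
is_valid_utf8() {
  iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# Example (requires mecab to be installed):
#   echo 'A' | mecab | is_valid_utf8 && echo "output is valid UTF-8"
```

This only checks that the bytes are well-formed UTF-8. It does not verify that the analysis itself is correct.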

abetomo added a commit to abetomo/pgroonga that referenced this issue May 14, 2024
@otegami
Author

otegami commented May 20, 2024

Thank you for addressing it! 🙏🏾
