Allow manual selection of UTF-8 BOM encoding #6597

Open
tfmorris opened this issue May 10, 2024 · 0 comments
Labels
  • encoding (Selection of encoding at import time, or encoding issues in data cleaning)
  • Status: Pending Review (Indicates that the issue or pull request is awaiting review by project maintainers or collaborators)
  • Theme: UX/Usability (Focuses on issues related to improving the overall user experience and interaction flow)
  • Type: Bug (Issues related to software defects or unexpected behavior, which require resolution)

Comments

@tfmorris
Member

We invented a private encoding, "UTF-8-BOM", to handle the wonky Microsoft format (UTF-8 with a leading byte order mark), but because it's not listed in the standard Java character sets, it's not available in the manual selection dialog.

This means that a user can't select it by hand when they know it's the right format but it wasn't guessed. It also means that if the format was guessed correctly and they then change it, they have no way to return to the original guess.
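
For context, here is a minimal sketch of why the name never shows up in a list built from the JDK and what handling the format amounts to. It is not OpenRefine's actual code: the class `BomSketch` and its helper methods are hypothetical, and only the `java.nio.charset` calls are real JDK API. On a stock JDK the lookup in `main` is expected to report that no charset is registered under the private name.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class BomSketch {
    // The private encoding name from this issue; assumed not to be a JDK charset.
    static final String PRIVATE_NAME = "UTF-8-BOM";

    // The UTF-8 byte order mark: EF BB BF.
    static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // True when the data begins with the three BOM bytes.
    static boolean startsWithBom(byte[] data) {
        return data.length >= UTF8_BOM.length
            && data[0] == UTF8_BOM[0]
            && data[1] == UTF8_BOM[1]
            && data[2] == UTF8_BOM[2];
    }

    // Decode as plain UTF-8, skipping the BOM if one is present.
    static String decode(byte[] data) {
        int offset = startsWithBom(data) ? UTF8_BOM.length : 0;
        return new String(data, offset, data.length - offset, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A list built only from the JDK's charsets cannot contain the private name.
        System.out.println("JDK lists " + PRIVATE_NAME + ": "
            + Charset.availableCharsets().containsKey(PRIVATE_NAME));

        byte[] sample = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'i', 'd'};
        System.out.println(decode(sample)); // prints "id"
    }
}
```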

To Reproduce

Steps to reproduce the behavior:

  1. Create a project and click on the encoding to bring up the encoding modal dialog

Current Results

The character encoding list doesn't include our private encoding.

Expected Behavior

The character set list includes "UTF-8 with BOM" with the code "UTF-8-BOM", preferably collated correctly in the list.
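
One possible shape for the fix, sketched under assumptions: the option list is built from `Charset.availableCharsets()`, the private entry is appended, and the result is sorted by display label so the new entry lands next to "UTF-8". The class `EncodingListSketch`, the nested `EncodingOption` type, and `buildOptions()` are hypothetical names for illustration, not existing OpenRefine code, and the real dialog may populate its list differently.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class EncodingListSketch {
    // Hypothetical pairing of the code sent to the importer and the label shown to the user.
    static final class EncodingOption {
        final String code;
        final String label;

        EncodingOption(String code, String label) {
            this.code = code;
            this.label = label;
        }
    }

    static List<EncodingOption> buildOptions() {
        List<EncodingOption> options = new ArrayList<>();

        // Standard entries: every charset the running JVM knows about.
        for (String name : Charset.availableCharsets().keySet()) {
            options.add(new EncodingOption(name, name));
        }

        // The private entry from this issue, which the JDK does not list.
        options.add(new EncodingOption("UTF-8-BOM", "UTF-8 with BOM"));

        // Sort by label, ignoring case, so the private entry is collated with the rest.
        options.sort(Comparator.comparing((EncodingOption o) -> o.label,
                                          String.CASE_INSENSITIVE_ORDER));
        return options;
    }

    public static void main(String[] args) {
        for (EncodingOption option : buildOptions()) {
            if (option.code.startsWith("UTF-8")) {
                System.out.println(option.label + " [" + option.code + "]");
            }
        }
    }
}
```

Sorting on the label rather than the code is a design choice in this sketch: it keeps "UTF-8 with BOM" after "UTF-8" because the shorter string is a prefix of the longer one.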

@tfmorris added the Type: Bug, Theme: UX/Usability, encoding, and Status: Pending Review labels on May 10, 2024