Allow manual selection of UTF-8 BOM encoding #6597
Labels
encoding
Selection of encoding at import time, or encoding issues in data cleaning
Status: Pending Review
Indicates that the issue or pull request is awaiting review by project maintainers or collaborators
Theme: UX/Usability
Focuses on issues related to improving the overall user experience and interaction flow.
Type: Bug
Issues related to software defects or unexpected behavior, which require resolution.
We invented a private encoding, "UTF-8-BOM", to handle the wonky Microsoft format, but because it's not listed in the standard Java character sets, it's not available in the manual selection dialog.
This means a user can't select it by hand when they know it's the right format but it wasn't guessed. It also means that if the format was guessed correctly but the user then changes it, they have no way to return to the original guess.
To Reproduce
Steps to reproduce the behavior:
Current Results
The character encoding list doesn't include our private encoding.
Expected Behavior
The character set list includes "UTF-8 with BOM" with the code "UTF-8-BOM", preferably collated correctly in the list.
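One possible way to get there, sketched below, is to build the dialog's list from `Charset.availableCharsets()` and then add the private entry, using a case-insensitive sorted map so it collates next to "UTF-8". This is only an illustration: the class `EncodingChoices` and its `build()` method are hypothetical names, not existing project code, and the sketch covers only the list contents, not how the importer decodes the file once "UTF-8-BOM" is chosen.

```java
import java.nio.charset.Charset;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical helper, not part of the existing code base.
final class EncodingChoices {
    // Code and label taken from the expected behavior described in this issue.
    static final String BOM_CODE = "UTF-8-BOM";
    static final String BOM_LABEL = "UTF-8 with BOM";

    // Returns encoding code -> display label, sorted case-insensitively by code.
    static SortedMap<String, String> build() {
        SortedMap<String, String> choices = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        // Every charset the running JVM knows about.
        for (Charset cs : Charset.availableCharsets().values()) {
            choices.put(cs.name(), cs.displayName());
        }
        // Add the private encoding; it is not a registered Java charset,
        // so it never shows up in availableCharsets() on its own.
        choices.putIfAbsent(BOM_CODE, BOM_LABEL);
        return choices;
    }

    public static void main(String[] args) {
        // Print only the UTF-* entries to show where the private code lands.
        build().forEach((code, label) -> {
            if (code.toUpperCase().startsWith("UTF-")) {
                System.out.println(code + "  (" + label + ")");
            }
        });
    }
}
```

Because the map is ordered by code, "UTF-8-BOM" sorts after "UTF-8" rather than being appended to the end of the list.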
Screenshots
Versions