html file is not reported as UTF8 after conversion #381
Labels
detection
Related to the charset detection mechanism, chaos/mess/coherence
help wanted
Extra attention is needed
Provide the file
110-original.zip
Verbose output
Using the CLI, run
normalizer -v ./my-file.txt
and paste the result in here.
enca will, however, detect UTF-8 as it should.
Expected encoding
Expected normalizer to show UTF-8 encoding after conversion to UTF-8.
Am I wrong here?
Additional context
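I know that HTML is not the same as plain text, but I will document this here anyway.
I think that the "declarative mark" (the charset declared inside the HTML) should not take over like that when the bytes themselves are valid UTF-8. But I am new to this encoding world...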
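To make the situation concrete, here is a minimal sketch of what I believe is going on, using only the Python standard library. It assumes the converted file still carries a `<meta>` charset declaration naming its old encoding; `iso-8859-1` is a made-up stand-in, since I have not confirmed what the attached file declares. The helper names `declared_charset` and `is_valid_utf8` are hypothetical and are not part of charset_normalizer.

```python
import re

# Hypothetical helpers for illustration only; not charset_normalizer's API.
META_CHARSET = re.compile(
    rb"""<meta[^>]+charset\s*=\s*["']?\s*([A-Za-z0-9_\-]+)""",
    re.IGNORECASE,
)


def declared_charset(payload: bytes):
    """Return the charset named in a <meta> tag within the first 4 KiB, or None."""
    match = META_CHARSET.search(payload[:4096])
    return match.group(1).decode("ascii").lower() if match else None


def is_valid_utf8(payload: bytes) -> bool:
    """Return True if the payload decodes as strict UTF-8."""
    try:
        payload.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Simulate a file converted to UTF-8 whose <meta> tag still names the old
# encoding (assumed to be iso-8859-1 for this example).
html = '<html><head><meta charset="iso-8859-1"></head><body>café</body></html>'
converted = html.encode("utf-8")

print(declared_charset(converted))  # the stale declaration
print(is_valid_utf8(converted))     # the bytes themselves
```

In this sketch the declaration and the bytes disagree: the tag says `iso-8859-1`, while the content decodes cleanly as UTF-8. A detector that trusts the declaration first would report the declared charset, which may be why the result differs from what enca reports.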