Tweak diagnostics with invalid UTF-8 so they can pass over the wire #237

Merged
2 commits merged on Nov 22, 2022

Commits on Nov 18, 2022

  1. Tweak diagnostics with invalid UTF-8 so they can pass over the wire

    A correct provider should only ever return valid UTF-8 strings as the
    diagnostic Summary or Detail, but since diagnostics tend to describe
    unexpected situations and are often derived from errors in downstream
    libraries, it's possible that a provider might mistakenly return invalid
    bytes as part of a diagnostic message.
    
    The protobuf serializer rejects non-UTF8 strings with a generic message
    that is unhelpful to end-users:
        string field contains invalid UTF-8
    
    Here we make the compromise that it's better to make a best effort to
    return a diagnostic that is probably only partially invalid so that the
    end user has a chance of still getting some clue about what problem
    occurred. The new helper functions here achieve that by replacing any
    invalid bytes with a correctly-encoded Unicode Replacement Character
    (U+FFFD), which allows the string to pass over the wire protocol
    successfully and hopefully show up as an obviously-invalid character in
    the CLI output or web UI that's rendering the diagnostics.
    
    This does introduce some slight additional overhead when returning
    responses, but it should be immaterial for any response that doesn't
    include any diagnostics, relatively minor for responses that include
    valid diagnostics, and only markedly expensive for a diagnostic string
    with invalid bytes that will therefore need to be re-encoded on a
    rune-by-rune basis.
    apparentlymart committed Nov 18, 2022
    11ceb48

Commits on Nov 22, 2022

  1. 10d03bc