Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support encoding comments and specifying the encoding format #328

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Commits on Nov 24, 2021

  1. Support encoding comments and specifying the encoding format

    This allows encoding comments and setting some flags to control the
    format. While toml.Marshaler added in #327 gives full control over how
    you want to format something, I don't think it's especially
    user-friendly to tell everyone to create a new type with all the
    appropriate formatting, escaping, etc. The vast majority of of use cases
    probably just call in to a few simple categories such as "use `"""` or
    "encode this number as a hex".
    
    I grouped both features together as they're closely related: both set
    additional information on how to write keys.
    
    What I want is something that:
    
    1. allows setting attributes programmatically;
    2. supports round-tripping by default on a standard struct;
    3. plays well with other encoders/decoders;
    4. has a reasonable uncumbersome API.
    
    Most options (custom types, struct tags) fail at least one of these;
    there were some PRs for struct tags, but they fail at 1, 2, and
    (arguably) 4. Custom types fail at 2, 3, and probably 4.
    
    ---
    
    This adds SetMeta() to the Encoder type; this is already what we have
    when decoding, and all additional information will be set on it. On
    MetaData we add the following types:
    
        SetType()       Set TOML type info.
        TypeInfo()      Get TOML type info.
        Doc()           Set "doc comment" above the key.
        Comment()       Set "inline comment" after the key.
    
    Every TOML type has a type in this package, which support different
    formatting options (see type_toml.go):
    
        Bool
        String
        Int
        Float
        Datetime
        Table
        Array
        ArrayTable
    
    For example:
    
        meta := toml.NewMetaData().
                SetType("key", toml.Int{Width: 4, Base: 16}).
                Doc("key", "A codepoint").
        Comment("key", "ë")
        toml.NewEncoder(os.Stdout).SetMeta(meta).Encode(struct {
                Key string `toml:"key"`
        }{"ë")
    
    Would write:
    
        # A codepoint.
        key = 0x00eb  # ë
    
    It also has Key() to set both:
    
        toml.NewMetaData().
            Key("key", toml.Int{Width: 4, Base: 16}, toml.Doc("A codepoint"), toml.Comment("ë")).
            Key("other", toml.Comment("..."))
    
    The advantage of this is that it reduces the number of times you have to
    type the key string to 1, but it uses interface{}. Not yet decided which
    one I'll stick with, and also not a huge fan of Doc() and Comment(), but
    I can't really think of anything clearer at the moment (these are the
    names the Go ast uses).
    
    ---
    
    The Decode() sets all this information on the MetaData, so this:
    
        meta, _ := toml.Decode(..)
        toml.NewEncoder(os.Stdout).SetMeta(meta).Encode(..)
    
    Will write it out as "key = 0x00eb" again, rather than "key = 235".
    
    This way, pretty much any flag can be added programmatically without
    getting in the way of JSON/YAML/whatnot encoding/decoding.
    
    ---
    
    I don't especially care how you need to pass the keys as strings, but
    there isn't really any good way to do it otherwise. There is also the
    problem that the "key" as found in the parser may be different than the
    "key" the user is expecting if you don't use toml struct tags:
    
        type X struct { Key int }
    
    Will read "key = 2" in to "Key", but when encoding it will write as
    "Key" rather than "key". The type information will be set to "key", but
    when encoding it will look for "Key", so round-tripping won't work
    correct and has the potential for confusion if the wrong key is set.
    
    This is not so easy to fix since we don't have access to the struct in
    the parser. I think it's fine to just document this as a caveat and tell
    people to use struct tags, which is a good idea in any case.
    
    ---
    
    I'm not necessarily opposed to also adding struct tags for most of these
    things, although I'm not a huge fan of them. Since struct tags can't be
    set programmatically it's not really suitable for many use cases (e.g.
    setting comments dynamically, using multiline strings only if the string
    contains newlines, etc.) It's something that could maybe be added in a
    future PR, if a lot of people ask for it.
    
    Fixes #64
    Fixes #75
    Fixes #160
    Fixes #192
    Fixes #213
    Fixes #269
    arp242 committed Nov 24, 2021
    Configuration menu
    Copy the full SHA
    e891083 View commit details
    Browse the repository at this point in the history