Add preserveFormatting option for comments/whitespace #38
@magiconair could you take a look at this PR and let me know if you'll consider accepting it?
I've started reviewing this but stopped since this solution seems too complex. If the goal is to preserve whitespace for comments, then the lexer should handle that transparently. In this approach the change permeates everywhere. That doesn't look right.
load.go
Outdated
@@ -178,6 +183,10 @@ func LoadMap(m map[string]string) *Properties {
	return p
}

func GetLoader() (*Loader, error) {
We don't need this and it is not idiomatic. It would need to be called NewLoader(), but since it just returns an empty Loader we can drop this method.
Removed this method. Callers can just instantiate the struct themselves.
load.go
Outdated
}

func (l *Loader) loadBytes(buf []byte, enc Encoding) (*Properties, error) {
	p, err := parse(convert(buf, enc))
func (l *Loader) loadBytes(buf []byte, enc Encoding, preserveFormatting bool) (*Properties, error) {
Since all loadXXX functions eventually call loadBytes, you don't need to pass the parameter through. You can just use l.PreserveFormatting.
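As a rough illustration of the point above, here is a minimal, self-contained sketch in which an option lives on the receiver instead of being threaded through every call. The Loader type, LoadString method, and the line-splitting body are simplified stand-ins invented for this example; they are not the library's actual implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// Loader is a simplified stand-in; only the option under discussion is modeled.
type Loader struct {
	PreserveFormatting bool
}

// LoadString is a hypothetical entry point that funnels into loadBytes
// without passing the option along as an argument.
func (l *Loader) LoadString(s string) []string {
	return l.loadBytes([]byte(s))
}

// loadBytes reads the option from the receiver. The body is a toy:
// it either keeps every line verbatim or drops blank lines and trims the rest.
func (l *Loader) loadBytes(buf []byte) []string {
	lines := strings.Split(string(buf), "\n")
	if l.PreserveFormatting {
		return lines
	}
	out := make([]string, 0, len(lines))
	for _, ln := range lines {
		if t := strings.TrimSpace(ln); t != "" {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	in := "  # a\n\nk = v"
	fmt.Printf("%q\n", (&Loader{}).LoadString(in))
	fmt.Printf("%q\n", (&Loader{PreserveFormatting: true}).LoadString(in))
}
```

Adding another option later would then only touch the struct and the one function that reads it.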
Removed the passed parameters in load.go.
lex.go
Outdated
@@ -162,8 +163,9 @@ func (l *lexer) nextItem() item {
}

// lex creates a new scanner for the input string.
func lex(input string) *lexer {
func lex(input string, preserveFormatting bool) *lexer {
Since #37 wants to add another config parameter to the lexer, I think we should move the go l.run() call to the parse function and refactor this function as follows:
func lex(input string) *lexer {
	return &lexer{
		input: input,
		items: make(chan item),
		runes: make([]rune, 0, 32),
	}
}
Then parse looks like this:
func parse(input string) (properties *Properties, err error) {
	l := lex(input)
	go l.run()
	p := &parser{lex: l}
	...
}
This way it is possible to pass additional configuration to the lexer without adding them as function arguments.
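To make the benefit concrete, the following self-contained sketch sets an option on the lexer between construction and the start of its goroutine. The item type, the line-based run loop, and the keepWS parameter on parse are simplifications assumed for this example only; the real lexer is a state machine and the real parse signature differs.

```go
package main

import (
	"fmt"
	"strings"
)

type item struct{ val string }

type lexer struct {
	input  string
	items  chan item
	keepWS bool // option set by the caller before run starts
}

// lex only constructs the lexer; it no longer starts the goroutine.
func lex(input string) *lexer {
	return &lexer{input: input, items: make(chan item)}
}

// run is a toy scanner: it emits one item per line and then closes the channel.
func (l *lexer) run() {
	defer close(l.items)
	for _, line := range strings.Split(l.input, "\n") {
		if !l.keepWS {
			line = strings.TrimSpace(line)
		}
		l.items <- item{val: line}
	}
}

// parse configures the lexer and only then starts it, so the goroutine
// observes the option without any extra function arguments on lex.
func parse(input string, keepWS bool) []string {
	l := lex(input)
	l.keepWS = keepWS
	go l.run()
	var out []string
	for it := range l.items {
		out = append(out, it.val)
	}
	return out
}

func main() {
	fmt.Printf("%q\n", parse("  # a\nk = v", false))
	fmt.Printf("%q\n", parse("  # a\nk = v", true))
}
```

Because the field is written before the go statement, the goroutine is guaranteed to see it; no extra synchronization is needed for this pattern.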
Done.
lex.go
Outdated
@@ -60,6 +60,7 @@ type stateFn func(*lexer) stateFn

// lexer holds the state of the scanner.
type lexer struct {
	preserveFormatting bool // whether to scan EOLs/whitespace as part of comments
Let's rename it to:
keepWS bool // keepWS retains whitespace in comments
Done.
lex.go
Outdated
@@ -189,14 +191,36 @@ func lexBeforeKey(l *lexer) stateFn {
		return nil

	case isEOL(r):
		l.ignore()
		return lexBeforeKey
		if l.preserveFormatting {
Please make these two separate switch cases (also for the other ones):
	case isEOL(r) && l.keepWS:
		l.appendRune(r)
		l.backup()
		return lexComment
	case isEOL(r):
		l.ignore()
		return lexBeforeKey
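The ordering matters in this style: Go evaluates the cases of an expressionless switch top to bottom and takes the first one that is true, so the more specific guarded case has to come first. A small standalone sketch, where isEOL is reimplemented and classify is a hypothetical helper that only reports which branch would be taken:

```go
package main

import "fmt"

// isEOL is reimplemented here so the example compiles on its own.
func isEOL(r rune) bool { return r == '\n' || r == '\r' }

// classify reports which branch a rune would take. It stands in for the
// state function and returns a label instead of the next state.
func classify(r rune, keepWS bool) string {
	switch {
	case isEOL(r) && keepWS:
		return "keep" // the more specific case must come first
	case isEOL(r):
		return "ignore"
	default:
		return "other"
	}
}

func main() {
	fmt.Println(classify('\n', true))  // keep
	fmt.Println(classify('\n', false)) // ignore
	fmt.Println(classify('k', true))   // other
}
```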
Done.
properties.go
Outdated
@@ -48,6 +48,11 @@ func PanicHandler(err error) {

// -----------------------------------------------------------------------------

type Comment struct {
This should not be exported and does not look properly formatted but I also don't understand why we need this in the first place. The only difference is whether the comments have whitespace or not. Shouldn't the lexer handle that transparently for the parser?
Unexported this struct. It is mainly needed to preserve the formatting of the prefix. Since Java allows either # or ! as a comment prefix, with optional leading whitespace before it, I wanted to be able to preserve that, e.g.:
#############################
!! !!
!! Section 1 !!
!! !!
#############################
# comment 1
key = value
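For readers who want to see what preserving the prefix could involve, here is a minimal sketch that splits a comment line into its leading whitespace, prefix character, and remaining text so that the line can be written back unchanged. The comment struct's fields and the splitComment helper are assumptions made for this example; the struct in the PR is not shown in this conversation and may look different.

```go
package main

import (
	"fmt"
	"strings"
)

// comment is a hypothetical shape for a formatting-preserving comment.
type comment struct {
	indent string // whitespace before the prefix
	prefix byte   // '#' or '!'
	text   string // everything after the prefix, verbatim
}

// splitComment reports whether line is a comment and, if so, returns its parts.
// Space, tab, and form feed are treated as leading whitespace.
func splitComment(line string) (comment, bool) {
	rest := strings.TrimLeft(line, " \t\f")
	if rest == "" || (rest[0] != '#' && rest[0] != '!') {
		return comment{}, false
	}
	return comment{
		indent: line[:len(line)-len(rest)],
		prefix: rest[0],
		text:   rest[1:],
	}, true
}

// String reassembles the original line exactly.
func (c comment) String() string {
	return c.indent + string(c.prefix) + c.text
}

func main() {
	for _, line := range []string{"#####", "  !! Section 1 !!", "key = value"} {
		c, ok := splitComment(line)
		fmt.Printf("%q comment=%v roundtrip=%q\n", line, ok, c.String())
	}
}
```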
parser.go
Outdated

for {
	token := p.expectOneOf(itemComment, itemKey, itemEOF)
	switch token.typ {
	case itemEOF:
		if preserveFormatting && (len(comments) > 0 || token.val != "") {
			// There are comments at the end of the input that are not tied to a particular key
			// Save these off against a special empty key when preserving formatting
Instead of a sentinel key, I suggest using a separate field for heading and trailing comments, e.g. trailComment comment in the Properties.
Also, please invert the logic of the if statement to avoid the indent.
if !preserveFormatting || len(comments) == 0 {
	goto done
}
if token.val == "" {
	p.trailingComments = comments
	goto done
}
...
goto done
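The two suggestions above (a dedicated field instead of a sentinel key, and an inverted condition that exits early) can be sketched together in a standalone form. The props type, its trailingComments field, and the finish helper are hypothetical names for this example, and an early return is used in place of goto done.

```go
package main

import "fmt"

// props is a simplified stand-in for the Properties type.
type props struct {
	values           map[string]string
	trailingComments []string // comments after the last key, kept in their own field
}

// finish stores any comments left over at end of input. The inverted
// condition exits early, so the main path stays unindented.
func finish(p *props, comments []string, preserveFormatting bool) {
	if !preserveFormatting || len(comments) == 0 {
		return
	}
	p.trailingComments = comments
}

func main() {
	p := &props{values: map[string]string{"key": "value"}}
	finish(p, []string{"# end of file"}, true)
	fmt.Printf("%q\n", p.trailingComments)

	q := &props{}
	finish(q, []string{"# end of file"}, false)
	fmt.Println(len(q.trailingComments))
}
```

With a separate field, no key in the map has to be reserved, so an input that legitimately contains an empty key cannot collide with the stored comments.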
Done
		continue
	case itemKey:
		key = token.val
		key = strings.TrimSpace(token.val)
This does not look right since whitespace parsing should only affect comments.
This is necessary to maintain previous behavior. Previously, whitespace around the key was being ignored in lexBeforeKey. However, now the lexer doesn't know whether the whitespace in lexBeforeKey is part of a comment and needs to be preserved or whitespace before a key and needs to be ignored. So I do it here.
I've put up an additional commit addressing your comments so far, and also removed some of the extra parameters, like |
b2c5409 to be36114
- Original behavior is still retained for calls to the original Write methods.
be36114 to c94c1ec
I've refactored the WriteComment method and moved writing formatted comments out into a new method. The individual commit diff makes it look odd, but looking at the FilesChanged view shows the diff better. This in combination with the previous commit's changes does simplify the original change a bit. However, in order to maintain backward compatibility in the lexer, the parser and in properties, I don't think that this can be achieved transparently in the lexer. I think that the parser needs to understand Let me know if you agree, or have any further suggestions to modify this PR. However, I also understand if you think it's still not the right approach. If you believe this is not the right direction for this feature, I'm ok with closing this out. |
This pull request addresses a use case for reading and updating properties files that have:
Changes:
Notes: