Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proper parser for Dockerfile #2266

Closed
wants to merge 1 commit into from
Closed

Proper parser for Dockerfile #2266

wants to merge 1 commit into from

Conversation

joelreymont
Copy link

Closes #2225. This pull request implements a proper parser for Docker build files.

@jpetazzo
Copy link
Contributor

Whoa, that's a lot of lines of code :-)
(Yeah, I know, I'm easily impressed...)
It looks like some files are generated by some others, right?

@joelreymont
Copy link
Author

buildfile_lexer.go and buildfile_parser.go are autogenerated by nex and 'go tool yacc' respectively.

@SvenDowideit
Copy link
Contributor

excellent - this should fix the things I was just exploring -

  1. losing tabs inside quotes in RUN, ENV
  2. space hash being treated as a comment, though the spec suggests it should not :)

excellent 👍

@joelreymont
Copy link
Author

This commit is superseded by #2337.

@@ -2,6 +2,16 @@

DEST=$1

if ! go run vendor/src/github.com/wagerlabs/nex/nex.go -e -o buildfile_lexer.go buildfile.nex; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the original dependency. We can't vendor a personal fork.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will fix this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed now.

@shykes
Copy link
Contributor

shykes commented Nov 12, 2013

I am not convinced I want to merge this.

  • It is a complete rewrite of a critical part of docker..
  • .. which has a lot of legacy quirks..
  • .. and for which reverse-compatibility is extremely important (breaking change = incorrect or failed builds)
  • .. by a first-time contributor with no track record of maintaining his changes..
  • .. and who is openly using this pull request as advertisement material to sell contract work to my employer (will they get an invoice when you need to maintain this?) No offense Joel but it's true :)
  • All this without fixing an immediate problem.

Lastly, I have no plan to extend the Dockerfile syntax into a powerful DSL. Its purpose is to be a simple "assembly-style" language. The future of Dockerfiles is to make it easier to call build scripts of your own, in your favorite syntax: shell scripts, chef scripts, puppet scripts... These are the tools which need a sophisticated lexer - I'm not sure we do.

Thoughts?

@jpetazzo
Copy link
Contributor

I can't tell about the track record of the author or the plug-in ad (didn't see that?) but if the Dockerfile parser has "legacy quirks", shouldn't we fix it now (i.e. the earlier the better)? Looks like minor parsing issues pop up on regular intervals.

@SvenDowideit
Copy link
Contributor

+1 in the wiki world, we always regretted not fixing the legacy quirks in the language - it made maintenance so much harder 10 years later - and if it would mean i won't need to fix the TAB issue I found :)

mmm, maybe i need to go test this

@shykes
Copy link
Contributor

shykes commented Nov 13, 2013

Guys, the time to worry about quirks was before shipping. Once it's shipped, we need to support it. Especially when it affects a file format which many people have already checked into source control and use actively.

@solomonstre
@docker

On Tue, Nov 12, 2013 at 10:16 PM, Sven Dowideit notifications@github.com
wrote:

+1 in the wiki world, we always regretted not fixing the legacy quirks in the language - it made maintenance so much harder 10 years later - and if it would mean i won't need to fix the TAB issue I found :)

mmm, maybe i need to go test this

Reply to this email directly or view it on GitHub:
#2266 (comment)

@solomonstre
Copy link

I don't mean that we necessarily need to preserve all quirks in formol forever. But once it's shipped, how fast we fix it is no longer the most important. The most important is to carefully plan out the least disruptive sunset possible (let's call it LDSP).

Rushing the merge of a completely new lexer definitely doesn't qualify as the LDSP, I think we can agree on that much :)

@solomonstre
@docker

On Tue, Nov 12, 2013 at 10:19 PM, Solomon Hykes notifications@github.com
wrote:

Guys, the time to worry about quirks was before shipping. Once it's shipped, we need to support it. Especially when it affects a file format which many people have already checked into source control and use actively.

@solomonstre
@docker
On Tue, Nov 12, 2013 at 10:16 PM, Sven Dowideit notifications@github.com
wrote:

+1 in the wiki world, we always regretted not fixing the legacy quirks in the language - it made maintenance so much harder 10 years later - and if it would mean i won't need to fix the TAB issue I found :)

mmm, maybe i need to go test this

Reply to this email directly or view it on GitHub:

#2266 (comment)

Reply to this email directly or view it on GitHub:
#2266 (comment)

@joelreymont
Copy link
Author

I'm not charging for this PR, it's free.

The fact that I'm using this PR to showcase my coding skills is orthogonal to its technical merits.

This PR does fix a number of issues in the current ad-hoc parser and is easily maintainable by me or @jpetazzo ;-).

@joelreymont
Copy link
Author

To be more precise, this PR fixes issues #2315, #2072 and makes it easy to resolve #2210, #2167, #1149, #1996, #1834, #1352 and #1321 among others.

I did a bit of maintenance and improvement on #2337 (rejected), following suggestions from @tianon y @jpetazzo.

I also contributed to other open source projects where, dare I say, accepting my patches took less than a few weeks and turned out to be FAR less emotional.

@solomonstre
Copy link

Hi Joel, your contribution is always accepted whether we merge it or not, there is nothing emotional about this.

#2337 was rejected because I don't think we need an INCLUDE verb. @jpetazzo and @tianon are not core project maintainers so their suggestions, though informative, don't guarantee anything.

You didn't address my last question which is: how confident are you that 100% of existing Dockerfiles will work unmodified with this lexer? Can we increase that confidence? If it's not 100%, what's the plan for a graceful sunset? That's what I'm interested in - not how fast we merge it. 

 

Sorry about the "weeks of delay". We have huge numbers of contributions and are doing our best to keep up and give useful feedback. You will increase your chances by discussing your design earlier in the mailing list or irc, checking for interest from the actual maintainers, keeping your changes small, and most of all, not making me feel like I somehow need to rush something in. I really don't give a shit how fast your patches were merged in other projects - mentioning it comes across as condescending and self-centered.

@solomonstre
@docker

On Wed, Nov 13, 2013 at 3:59 AM, Joel Reymont notifications@github.com
wrote:

To be more precise, this PR fixes issues #2315, #2072 and makes it easy to resolve #2210, #2167, #1149, #1996, #1834, #1352 and #1321 among others.
I did a bit of maintenance and improvement on #2337 (rejected), following suggestions from @tianon y @jpetazzo.

I also contributed to other open source projects where, dare I say, accepting my patches took less than a few weeks and turned out to be FAR less emotional.

Reply to this email directly or view it on GitHub:
#2266 (comment)

@jpetazzo
Copy link
Contributor

Hmmm (deep thinking)

I still believe that having a proper parser is important; but I understand the fear of breaking stuff.

@wagerlabs, do you think that it would be doable to have the two parsers in parallel, and compare their output? (I don't even know if they output a kind of AST or comparable structure, so this might be a completely silly suggestion.)

@shykes, would that be an acceptable way? I.e. having both parsers in parallel, and something showing huge warnings if a difference is found during a build?

Alternatively — have the ability to switch between parsers with a command-line (/API) flag?

IMHO, if a Dockerfile breaks, it's either a problem with the parser (and it should be trivial to fix), or because the Dockerfile is broken. Random example: a Dockerfile using a weird UTF8 character to skip the next command, or some weird crap like that :)

@joelreymont
Copy link
Author

@jpetazzo I don't see a lot of advantage in keeping two parsers afloat. I agree with you that any problems with the new parser should be trivial to fix.

@solomonstre I'm confident that 99% of existing Dockerfiles will work unmodified with this parser. There are tests for the current parser and the new parser passes them. I added a few extra tests and the confidence can be increased by further expanding the test suite.

@joelreymont
Copy link
Author

@solomonstre @shykes

No hard feelings about #2337. I will also refrain from needless jabs in the future ;-).

Please see buildfile_test.go and buildfile_parser_test.go for the tests that apply to this PR and ensure that existing Dockerfiles will work unmodified with this parser.

I take pride in my work and will quickly fix any bugs in the new parser, regardless of my contractual arrangements with Docker.

@joelreymont
Copy link
Author

What's the next step here?

@joelreymont
Copy link
Author

I fixed the vendoring so we are pulling in the original nex repo now.

What's the next to merge this pull request?

@shykes
Copy link
Contributor

shykes commented Nov 30, 2013

Hi @wagerlabs, now that the craziness of the 0.7 release is behind us, we can take a second look at this.

@gabrtv
Copy link

gabrtv commented Dec 18, 2013

+1 for a proper lexer. It would have made #3521 easier (in part because I spent 2 hours debugging what turned out to be #3026!)

@drewcrawford
Copy link

@solomonstre I really want an include verb.

@crosbymichael
Copy link
Contributor

I'm going to close this now. When we are ready to look at the dockerfile parser again in the future and evaluate if the added complexity and code will be worth the feature gains we can reopen.

Thanks.

@0xcaff
Copy link

0xcaff commented Jun 2, 2014

It is probably best to get this done before 1.0.

@moby moby locked and limited conversation to collaborators Jul 30, 2014
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Proper parser for build files
9 participants