Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AST for rightward-assignment #738

Closed
marcandre opened this issue Sep 3, 2020 · 15 comments
Closed

AST for rightward-assignment #738

marcandre opened this issue Sep 3, 2020 · 15 comments

Comments

@marcandre
Copy link
Contributor

marcandre commented Sep 3, 2020

I must apologize to be late to the party (again 😅), but I have to ask... is the AST for rightward assignment the best we can output? #682

I feel the same arguments as for the endless method definition's AST in #676 stand... Rightward assignment is pure syntax sugar and there is never any difference in execution between foo => var and var = foo (that I know of) so ideally the AST would reflect that.

If we output a standard lvasgn (and it is a local var assignment), then many tools that only care about which variables are assigned to what will not even need to be updated, since in this case the locations are even exactly the same (except that the name location is after the operator location instead of before). For example, I suspect unparser would work out of the box with lvasgn and produce 100% correct code. I didn't check, but I believe DeepCover would too.

cc/ @palkan

@iliabylich
Copy link
Collaborator

Right, but what about linters? IIRC rubocop is based on a source rewriter, not on unparser, and so potentially it can easily reformat rasgn (if it's emitted as lvasgn) into invalid ruby code.

What's gonna happen if you have code like

a
  .b
  .c
  .d => e

and you run rubocop in rewrite mode? e is aligned incorrectly (at least for lvasgn, I guess it should have <start of the block> + 2 spaces, right?), how can a linter fix it if it doesn't know about rvasgn? Also, can rubocop "move" it to the first line? Can it emit this instead?

a
  .b
  .c
e=.d

I'm not super opinionated, we can easily change it. The main reason why I like adding new nodes is that some users don't read updates or don't update their code that depends on parser, and so new parser + old code combo can easily cause bugs. It's better to get error: unsupported node X on the user side than some kind of a NPE. By adding a new node we force everyone to update their code according to new language features. If you want it's possible to override Parser::Builder and "merge" these assignments into a single node.

@marcandre
Copy link
Contributor Author

Right, for sure new syntax will probably require some kind of update to RuboCop, one way or another.

A quick search for lvasgn in the source has only 28 matches, and 100% of them are in the VariableForce, which is not a Cop but helps Cops that need it by determining the scope of local variables. So VariableForce would benefit from having a single lvasgn for either type of assignment. I'm not too sure how the cops that actually deal with alignment do it 😅 I can check tomorrow

@palkan
Copy link
Contributor

palkan commented Sep 4, 2020

By adding a new node we force everyone to update their code according to new language features

+1

then many tools that only care about which variables are assigned to what will not even need to be updated

Do we want to be not only backward-compatible but forward-compatible as well 🙂?

On the other hand, tools that care about the original source format would have to switch from AST-defined difference to some internal knowledge (and deal with if and else).

VariableForce would benefit from having a single lvasgn for either type of assignment

Since the feature is pretty simple, adding support for it would also be simple for any library. For example, that's what we need to make RuboCop VariableForce understand it:
ruby-next/ruby-next@b4c08ba#diff-c129c78dd84946780d5c9f222a11c960

@marcandre
Copy link
Contributor Author

I feel we've had this whole discussion before, I'm not sure we want to have it again 😅

One of the conclusion was that the principle that a any new language feature implies a new AST type was not necessarily the right way to go.

The other conclusion was that "how" something is written is in the location (think if / else / end vs ? :), "what" is written is in the locations.

For the one-line def the final conclusion was to keep the same AST types, despite the location potentially be nil.

For the right assign, there is not even a location that can newly be potentially nil.

The AST produced feels even more wrong than in the def case. The same idea "assign 42 to the instance variable @foo" generates two really different ASTs, different depth and structure:

> ruby-parse --28 -e '@foo = 1'
(ivasgn :@foo
  (int 1))
> ruby-parse --28 -e '1 => @foo'
(rasgn
  (int 1)
  (ivasgn :@foo))

What argument / principle leads to the conclusion that one-line definition => keep same AST type & structure and right assign => use different AST type and structure? For me, it's the same thing, and the same answer applies.

@palkan
Copy link
Contributor

palkan commented Sep 4, 2020

One argument I have is having a natural order of nodes in AST (i.e., similar to the source code):

// the AST looks like I'm reading code RTL 🤷🏻‍♂️
> ruby-parse --28 -e '1 => @foo'
((ivasgn :@foo
  (int 1))

@iliabylich
Copy link
Collaborator

Ok, I agree, let's remove rasgn. Even MRI internally treats rasgn as a regular lvasgn:

> ./miniruby --dump=parsetree -e 'a = 1'
###########################################################
## Do NOT use this node dump for any purpose other than  ##
## debug and research.  Compatibility is not guaranteed. ##
###########################################################

# @ NODE_SCOPE (line: 1, location: (1,0)-(1,5))
# +- nd_tbl: :a
# +- nd_args:
# |   (null node)
# +- nd_body:
#     @ NODE_LASGN (line: 1, location: (1,0)-(1,5))*
#     +- nd_vid: :a
#     +- nd_value:
#         @ NODE_LIT (line: 1, location: (1,4)-(1,5))
#         +- nd_lit: 1

> ./miniruby --dump=parsetree -e '1 => a'
###########################################################
## Do NOT use this node dump for any purpose other than  ##
## debug and research.  Compatibility is not guaranteed. ##
###########################################################

# @ NODE_SCOPE (line: 1, location: (1,0)-(1,6))
# +- nd_tbl: :a
# +- nd_args:
# |   (null node)
# +- nd_body:
#     @ NODE_LASGN (line: 1, location: (1,0)-(1,6))*
#     +- nd_vid: :a
#     +- nd_value:
#         @ NODE_LIT (line: 1, location: (1,0)-(1,1))
#         +- nd_lit: 1

@marcandre
Copy link
Contributor Author

Great. I'll make a PR

@iliabylich
Copy link
Collaborator

Fixed in #739

@mbj
Copy link
Collaborator

mbj commented Sep 7, 2020

Thanks! rasgn to me was leaking syntax not semantics into the AST which is supposed (outside of source maps!) to only care about semantics.

@palkan
Copy link
Contributor

palkan commented Oct 9, 2020

@mbj Well, maybe, it was leaking syntax, but now Unparser fails to correctly handle it: 13.divmod(5) => a,b => c,d => (c, d) = (a, b) = 13.divmod(5) (should be c, d = (a, b = 13.divmod(5)) 🤷🏻‍♂️

@mbj
Copy link
Collaborator

mbj commented Oct 9, 2020

@palkan interesting, right assignment is > 2.7, correct? I've lost track.

Unparser currently only aims at 2.7, but I'm making a list for > 2.7 already.

@mbj mbj mentioned this issue Oct 9, 2020
1 task
@palkan
Copy link
Contributor

palkan commented Oct 10, 2020

right assignment is > 2.7, correct?

@mbj Yep. I posted it here as an example demonstrating that the existing tools still should care about this syntax, even though the AST is the same as for the left assignment.

@mbj
Copy link
Collaborator

mbj commented Oct 10, 2020

Yeah, it likely requires some sensing, just like in case of (or a b) nodes.

It could be that some AST structures "only" can be produced from a righthand assignment, which than needs unparser to also emit it for round tripping.

@marcandre
Copy link
Contributor Author

It could be that some AST structures "only" can be produced from a righthand assignment, which than needs unparser to also emit it for round tripping.

That's the case here. foo => a, b => c, d can't be produced with standard assignment, as you need to put parens in c, d = (a, b = foo) and you get an extra begin node compared to rightward assignment.

@mbj
Copy link
Collaborator

mbj commented Oct 10, 2020

Yeah, I think unparser, once I get to 3.x support can easily detect these from the AST structure.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants