Compiler loses location information before calling macros (sometimes) #43081

Closed
alexcrichton opened this issue Jul 6, 2017 · 38 comments
Labels
A-diagnostics Area: Messages for errors, warnings, and lints A-macros-1.2 Area: Declarative macros 1.2 A-macros-2.0 Area: Declarative macros 2.0 (#39412) C-bug Category: This is a bug. P-medium Medium priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@alexcrichton
Member

alexcrichton commented Jul 6, 2017

Updated description

As first implemented in #43230, the compiler can now losslessly tokenize a few AST nodes from the original token stream when necessary. Currently, however, this is somewhat buggy:

  • It only happens for items in certain situations (i.e. when there are no inner attributes). The consequence of this is that more and more attributed items will have invalid span information when sent to procedural macros.
  • We don't invalidate the cache when later changing the item. The consequence of this is that procedural macros will receive a buggy view of what the AST node actually represents.

The "real bug" here is that we haven't actually implemented converting an AST to a token tree. The initial implementation in #43230 was effectively just a heuristic, and not even a great one! As a result, we still need the ability to losslessly tokenize an AST node back to its original set of tokens.
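
The cache-invalidation problem from the second bullet can be sketched with a toy model. This is only an illustration: `Item`, `cached_tokens`, and the two `rename_*` methods are hypothetical names and are not rustc's actual data structures.

```rust
/// Toy stand-in for an AST item that remembers the tokens it was parsed from.
/// Hypothetical type for illustration; not rustc's real AST.
#[derive(Debug, Clone, PartialEq)]
struct Item {
    name: String,
    /// Tokens captured at parse time, if any.
    cached_tokens: Option<Vec<String>>,
}

impl Item {
    fn parse(name: &str) -> Item {
        Item {
            name: name.to_string(),
            cached_tokens: Some(vec![
                "struct".to_string(),
                name.to_string(),
                ";".to_string(),
            ]),
        }
    }

    /// Buggy mutation: changes the item but leaves the cached tokens alone.
    fn rename_keeping_cache(&mut self, new_name: &str) {
        self.name = new_name.to_string();
    }

    /// Safer mutation: any change to the item drops the cache.
    fn rename_invalidating_cache(&mut self, new_name: &str) {
        self.name = new_name.to_string();
        self.cached_tokens = None;
    }

    /// What a macro would be handed: the cached tokens if present,
    /// otherwise a fresh rendering of the current item.
    fn tokens_for_macro(&self) -> Vec<String> {
        match &self.cached_tokens {
            Some(tokens) => tokens.clone(),
            None => vec!["struct".to_string(), self.name.clone(), ";".to_string()],
        }
    }
}
```

In this model, calling `rename_keeping_cache("Bar")` on an item parsed as `Foo` still hands the macro the tokens for `Foo`, which is the "buggy view" described above. The invalidating variant hands over `Bar`, but in the real compiler the regenerated tokens would not carry the original spans.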

Some bugs that arise from this are:

Original Description

There's an associated FIXME in the code right now, and to fix this we'll need to implement tokenization of an AST node. The current thinking on how to implement this is to save all TokenStream instances adjacent to an AST node, and use those instead of converting back into a token stream.

cc @dtolnay, @nrc, @jseyfried

@alexcrichton alexcrichton added the A-macros-2.0 Area: Declarative macros 2.0 (#39412) label Jul 6, 2017
@Mark-Simulacrum Mark-Simulacrum added the C-feature-request Category: A feature request, i.e: not implemented / a PR. label Jul 28, 2017
alexcrichton added a commit to alexcrichton/rust that referenced this issue Jul 28, 2017
This partly resolves the `FIXME` located in `src/libproc_macro/lib.rs` when
interpreting interpolated tokens. All instances of `ast::Item` which have a list
of tokens attached to them now use that list of tokens to losslessly get
converted into a `TokenTree` instead of going through stringification and losing
span information.

cc rust-lang#43081
bors added a commit that referenced this issue Jul 28, 2017
Implement tokenization for some items in proc_macro

This PR is a partial implementation of #43081 targeted towards preserving span information in attribute-like procedural macros. Currently all attribute-like macros will lose span information with the input token stream if it's iterated over due to the inability of the compiler to losslessly tokenize an AST node. This PR takes a strategy of saving off a list of tokens in particular AST nodes to return a lossless tokenized version. There are a few limitations with this PR, however, so the old fallback remains in place.
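
A minimal sketch of the "saved tokens with a stringify fallback" strategy described above, using a toy whitespace tokenizer. All names here (`Token`, `Item`, `tokenize`, `item_to_tokens`) are assumptions made for illustration and do not correspond to rustc's internals.

```rust
/// Toy token: its text plus an optional byte range in the original source.
#[derive(Debug, Clone, PartialEq)]
struct Token {
    text: String,
    span: Option<(usize, usize)>,
}

/// Toy AST item: a printable form plus the tokens saved at parse time, if any.
struct Item {
    pretty: String,
    saved_tokens: Option<Vec<Token>>,
}

/// Splits on whitespace and records the byte range of every token.
fn tokenize(src: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut start: Option<usize> = None;
    for (i, c) in src.char_indices() {
        if c.is_whitespace() {
            if let Some(s) = start.take() {
                tokens.push(Token { text: src[s..i].to_string(), span: Some((s, i)) });
            }
        } else if start.is_none() {
            start = Some(i);
        }
    }
    if let Some(s) = start {
        tokens.push(Token { text: src[s..].to_string(), span: Some((s, src.len())) });
    }
    tokens
}

/// Lossless path when tokens were saved; otherwise print and re-tokenize.
/// The fallback cannot know the original positions, so it clears the spans.
fn item_to_tokens(item: &Item) -> Vec<Token> {
    match &item.saved_tokens {
        Some(tokens) => tokens.clone(),
        None => tokenize(&item.pretty)
            .into_iter()
            .map(|t| Token { span: None, ..t })
            .collect(),
    }
}
```

With saved tokens, `item_to_tokens` returns every token with its original byte range. Without them, the token text is the same but every span is `None`, which mirrors the loss of location information this issue is about.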
alexcrichton added a commit to alexcrichton/rust that referenced this issue May 22, 2018
This commit updates the `Mac_` AST structure to keep track of the delimiters
that it originally had for its invocation. This allows us to faithfully
pretty-print macro invocations not using parentheses (e.g. `vec![...]`). This in
turn helps procedural macros due to rust-lang#43081.

Closes rust-lang#50840
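
To illustrate why the delimiter has to be stored, here is a toy pretty-printer for macro invocations. `MacroCall`, `Delimiter`, and `print_macro_call` are hypothetical names for this sketch and are not the actual `Mac_` structure.

```rust
/// Which delimiter the macro invocation was written with.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Delimiter {
    Parenthesis,
    Bracket,
    Brace,
}

/// Toy macro invocation that remembers its original delimiter.
struct MacroCall {
    path: String,
    delim: Delimiter,
    args: String,
}

/// Prints the invocation using the delimiter it was written with,
/// instead of always falling back to parentheses.
fn print_macro_call(call: &MacroCall) -> String {
    let (open, close) = match call.delim {
        Delimiter::Parenthesis => ("(", ")"),
        Delimiter::Bracket => ("[", "]"),
        Delimiter::Brace => ("{", "}"),
    };
    format!("{}!{}{}{}", call.path, open, call.args, close)
}
```

A call built with `Delimiter::Bracket` prints as `vec![1, 2, 3]` rather than `vec!(1, 2, 3)`, so the printed form tokenizes the same way as what the user wrote.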
@alexcrichton alexcrichton added the A-macros-1.2 Area: Declarative macros 1.2 label May 22, 2018
@alexcrichton alexcrichton added C-bug Category: This is a bug. and removed C-feature-request Category: A feature request, i.e: not implemented / a PR. labels May 22, 2018
bors added a commit that referenced this issue May 24, 2018
rustc: Correctly pretty-print macro delimiters

Closes #50840
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue Oct 1, 2020
…nd, r=petrochenkov

Fix recursive nonterminal expansion during pretty-print/reparse check

Makes progress towards rust-lang#43081

In PR rust-lang#73084, we started recursively expanding nonterminals during the
pretty-print/reparse check, allowing them to be properly compared
against the reparsed tokenstream.

Unfortunately, the recursive logic in that PR only handles the case
where a nonterminal appears inside a `TokenTree::Delimited`. If a
nonterminal appears directly in the expanded tokens of another
nonterminal, the inner nonterminal will not be expanded.

This PR fixes the recursive expansion of nonterminals, ensuring that
they are expanded wherever they occur.
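
The difference between the two cases can be sketched on a toy token tree. `Tree` and `expand` are hypothetical names for this illustration and are much simpler than rustc's real `TokenTree` handling.

```rust
/// Toy token tree with three shapes: a plain token, a delimited group,
/// and an already-parsed fragment (a "nonterminal") that wraps more trees.
#[derive(Debug, Clone, PartialEq)]
enum Tree {
    Token(String),
    Delimited(Vec<Tree>),
    Nonterminal(Vec<Tree>),
}

/// Replaces every nonterminal by its expanded contents, wherever it occurs:
/// at the top level, inside a delimited group, or directly inside the
/// expansion of another nonterminal.
fn expand(trees: &[Tree]) -> Vec<Tree> {
    let mut out = Vec::new();
    for tree in trees {
        match tree {
            Tree::Token(text) => out.push(Tree::Token(text.clone())),
            // Groups keep their structure, but their contents are expanded.
            Tree::Delimited(inner) => out.push(Tree::Delimited(expand(inner))),
            // Recursing here is what handles a nonterminal nested directly
            // in another nonterminal's tokens.
            Tree::Nonterminal(inner) => out.extend(expand(inner)),
        }
    }
    out
}
```

In this model, a nonterminal whose tokens are `[Nonterminal([a]), b]` expands to the flat sequence `a b`. A version that only recursed in the `Delimited` arm would leave the inner nonterminal unexpanded.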
Aaron1011 added a commit to Aaron1011/rust that referenced this issue Oct 11, 2020
Fixes rust-lang#74616
Makes progress towards rust-lang#43081
Unblocks PR rust-lang#76130

When pretty-printing an AST node, we may insert additional parentheses
to ensure that precedence is properly preserved in code we output.
However, the proc macro implementation relies on comparing a
pretty-printed AST node to the captured `TokenStream`. Inserting extra
parentheses changes the structure of the reparsed `TokenStream`, making
the comparison fail.

This PR refactors the AST pretty-printing code to allow skipping the
insertion of additional parentheses. Several freestanding methods are
moved to trait methods on `PrintState`, which keep track of an internal
`insert_extra_parens` flag. This flag is normally `true`, but we expose
a public method which allows pretty-printing a nonterminal with
`insert_extra_parens = false`.

To avoid changing the public interface of `rustc_ast_pretty`, the
freestanding `_to_string` methods are changed to delegate to a
newly-created `State`. The main pretty-printing code is moved to a new
`state` module to ensure that it does not accidentally call any of these
public helper functions (instead, the internal functions with the same
name should be used).
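
A toy version of a printer with such a flag might look like the following. Only the flag name `insert_extra_parens` comes from the commit message; `Expr`, `Printer`, and the method names are assumptions for this sketch, and associativity is ignored for brevity.

```rust
/// Toy expression AST. A tree such as Mul(Add(1, 2), 3) can be built
/// programmatically without any parentheses appearing in source tokens.
enum Expr {
    Lit(i64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

struct Printer {
    /// When true, parenthesize operands that bind less tightly than their parent.
    insert_extra_parens: bool,
}

impl Printer {
    /// Higher number means tighter binding.
    fn precedence(expr: &Expr) -> u8 {
        match expr {
            Expr::Add(..) => 1,
            Expr::Mul(..) => 2,
            Expr::Lit(_) => 3,
        }
    }

    fn print(&self, expr: &Expr) -> String {
        match expr {
            Expr::Lit(n) => n.to_string(),
            Expr::Add(l, r) => {
                format!("{} + {}", self.print_operand(l, 1), self.print_operand(r, 1))
            }
            Expr::Mul(l, r) => {
                format!("{} * {}", self.print_operand(l, 2), self.print_operand(r, 2))
            }
        }
    }

    /// Prints an operand, adding parentheses only if the flag is set and the
    /// operand binds less tightly than the enclosing operator.
    fn print_operand(&self, operand: &Expr, parent_prec: u8) -> String {
        let text = self.print(operand);
        if self.insert_extra_parens && Self::precedence(operand) < parent_prec {
            format!("({})", text)
        } else {
            text
        }
    }
}
```

For `Mul(Add(1, 2), 3)`, the printer emits `(1 + 2) * 3` with the flag set and `1 + 2 * 3` without it. The second form is not meaning-preserving if read as source, but it avoids introducing a parenthesized group that the captured tokens never contained, which is what matters for the comparison described above.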
bors added a commit to rust-lang-ci/rust that referenced this issue Oct 14, 2020
…ochenkov

Refactor AST pretty-printing to allow skipping insertion of extra parens

Fixes rust-lang#75734
Makes progress towards rust-lang#43081
Unblocks PR rust-lang#76130

@thiolliere
Contributor

thiolliere commented Nov 13, 2020

Some questions on this issue:

  • Is there any work in progress, or has anyone thought about what is needed to fix the fact that using #[cfg(features="some_not_set")] loses the spans? I would like to help (if it's within my skill/time).

  • Do we have a more complete list of all the situations that lose spans in proc macros? I am aware of:

Thanks all

@Aaron1011
Member

Aaron1011 commented Nov 13, 2020

@thiolliere: Summary of the current progress:

Once all of the pieces of #76130 are merged, any code that compiles on stable should have all spans available (except for any additional pretty-printer issues that may get discovered).

Using proc-macros as inner attributes will still cause spans to be lost, but such usage is nightly-only.

bors added a commit to rust-lang-ci/rust that referenced this issue Dec 30, 2020
…k, r=petrochenkov

Replace pretty-print/compare/retokenize hack with targeted workarounds

Based on rust-lang#78296
cc rust-lang#43081

The 'pretty-print/compare/retokenize' hack is used to try to avoid passing an outdated `TokenStream` to a proc-macro when the underlying AST is modified in some way (e.g. cfg-stripping before derives). Unfortunately, retokenizing throws away spans (including hygiene information), which causes issues of its own. Every improvement to the accuracy of the pretty-print/retokenize comparison has resulted in non-trivial ecosystem breakage due to hygiene changes. In extreme cases, users deliberately wrote unhygienic `macro_rules!` macros (likely because they did not realize that the compiler's behavior was a bug).

Additionally, the comparison between the original and pretty-printed/retokenized token streams comes at a non-trivial runtime cost, as shown by rust-lang#79338.

This PR removes the pretty-print/compare/retokenize logic from `nt_to_tokenstream`. We only discard the original `TokenStream` under two circumstances:
* Inner attributes are used (detected by examining the AST)
* `cfg`/`cfg_attr` processing modifies the AST. This is detected by making the visitor update a flag when it performs a modification, instead of trying to detect the modification after-the-fact. Note that a 'matching' `cfg` (e.g. `#[cfg(not(FALSE))]`) does not actually get removed from the AST, allowing us to preserve the original `TokenStream`.

In all other cases, we preserve the original `TokenStream`.
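
The "update a flag while modifying" approach can be sketched as follows. This is a toy model: `Field`, `CfgStripper`, and `can_keep_original_tokens` are hypothetical names, and the `cfg` field stands for an already-evaluated predicate rather than a real attribute.

```rust
/// Toy field: `cfg` is `None` when there is no `#[cfg]` attribute, and
/// `Some(result)` when there is one whose predicate evaluated to `result`.
struct Field {
    name: &'static str,
    cfg: Option<bool>,
}

/// Toy visitor that records whether it actually removed anything.
struct CfgStripper {
    modified: bool,
}

impl CfgStripper {
    /// Drops fields whose `cfg` evaluated to false and sets the flag at the
    /// moment of removal, so no after-the-fact comparison is needed.
    fn strip(&mut self, fields: Vec<Field>) -> Vec<Field> {
        let mut kept = Vec::new();
        for field in fields {
            match field.cfg {
                Some(false) => self.modified = true,
                // No `cfg`, or a matching `cfg`: the field stays untouched.
                _ => kept.push(field),
            }
        }
        kept
    }
}

/// The original tokens are discarded only in the two cases listed above.
fn can_keep_original_tokens(has_inner_attrs: bool, stripper: &CfgStripper) -> bool {
    !has_inner_attrs && !stripper.modified
}
```

In this model, an item whose only `cfg` predicates evaluate to true leaves `modified` as `false`, so the original tokens (and their spans) can be reused. A single removed field is enough to flip the flag.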

This could use a bit of refactoring/renaming - opening for a Crater run.

r? `@ghost`
@Aaron1011
Member

Progress update: The pretty-print/retokenize check was removed in #79472. We now only lose spans when we explicitly remove the original TokenStream. This happens in two cases:

  1. The input to a derive proc-macro has cfg or cfg_attr attributes (except when a #[cfg] attribute is kept due to its predicate evaluating to true)
  2. Inner attributes are present
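
For reference, a small example of the kind of derive input that falls under case 1. The struct and field names are made up for illustration; `cfg(any())` is always false and `cfg(all())` is always true.

```rust
// The derive only sees the struct after cfg-stripping, so the generated
// `Debug` impl never mentions the `removed` field.
#[derive(Debug)]
struct Config {
    name: &'static str,
    // `any()` with no arguments is always false: this field is stripped,
    // which modifies the AST before the derive runs.
    #[cfg(any())]
    removed: u32,
    // `all()` with no arguments is always true: this field is kept.
    #[cfg(all())]
    kept: u32,
}
```

Formatting `Config { name: "demo", kept: 7 }` with `{:?}` shows only `name` and `kept`. Per the description above, it is the stripped `removed` field that would have caused the original `TokenStream` to be discarded at the time of this comment; a struct containing only the `#[cfg(all())]` field would fall under the stated exception.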

PR #80689 completely resolves this issue. However, it is quite large, so I'm in the process of splitting it into independent pieces for easier review.

@Aaron1011
Member

Aaron1011 commented Apr 11, 2021

PR #82608 has been merged! We now preserve spans for all stable uses of macros (e.g. custom derives), and nearly all unstable usages (e.g. inner attrs).

If anyone encounters a loss of location information, please open a new issue for the specific problem you're having.

KarimHamidou added a commit to KarimHamidou/godot-rust that referenced this issue Feb 6, 2023
537: Add workaround for a rustc bug (rust/43081) r=halzy a=toasteater

This works around a compiler bug which can sometimes cause E0425 when the `#[methods]` macro is used. The error, when triggered, would show up as `error[E0425]: cannot find value `builder` in this scope` without any span information.

This is probably an instance of rust-lang/rust#43081

Co-authored-by: toasteater <48371905+toasteater@users.noreply.github.com>