Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement token-based handling of attributes during expansion #82608

Merged
merged 1 commit into from Apr 11, 2021

Conversation

Aaron1011
Copy link
Member

@Aaron1011 Aaron1011 commented Feb 27, 2021

This PR modifies the macro expansion infrastructure to handle attributes
in a fully token-based manner. As a result:

  • Derives macros no longer lose spans when their input is modified
    by eager cfg-expansion. This is accomplished by performing eager
    cfg-expansion on the token stream that we pass to the derive
    proc-macro
  • Inner attributes now preserve spans in all cases, including when we
    have multiple inner attributes in a row.

This is accomplished through the following changes:

  • New structs AttrAnnotatedTokenStream and AttrAnnotatedTokenTree are introduced.
    These are very similar to a normal TokenTree, but they also track
    the position of attributes and attribute targets within the stream.
    They are built when we collect tokens during parsing.
    An AttrAnnotatedTokenStream is converted to a regular TokenStream when
    we invoke a macro.
  • Token capturing and LazyTokenStream are modified to work with
    AttrAnnotatedTokenStream. A new ReplaceRange type is introduced, which
    is created during the parsing of a nested AST node to make the 'outer'
    AST node aware of the attributes and attribute target stored deeper in the token stream.
  • When we need to perform eager cfg-expansion (either due to #[derive] or #[cfg_eval]), we tokenize and reparse our target, capturing additional information about the locations of #[cfg] and #[cfg_attr] attributes at any depth within the target. This is a performance optimization, allowing us to perform less work in the typical case where captured tokens never have eager cfg-expansion run.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 27, 2021
@Aaron1011
Copy link
Member Author

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 27, 2021
@bors
Copy link
Contributor

bors commented Feb 27, 2021

⌛ Trying commit ac71b40b84add10df1cc6b4394ca53f5c3bfe16d with merge b86108850e24b4dd26ad05cb34d04ef489f2385a...

@bors
Copy link
Contributor

bors commented Feb 27, 2021

☀️ Try build successful - checks-actions
Build commit: b86108850e24b4dd26ad05cb34d04ef489f2385a (b86108850e24b4dd26ad05cb34d04ef489f2385a)

@rust-timer
Copy link
Collaborator

Queued b86108850e24b4dd26ad05cb34d04ef489f2385a with parent ec7f8d9, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (b86108850e24b4dd26ad05cb34d04ef489f2385a): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 27, 2021
@petrochenkov
Copy link
Contributor

@Aaron1011
Could you move tests to a separate PR or commit? I'm interested in diffs in test outputs before and after the compiler changes.

@Aaron1011
Copy link
Member Author

@petrochenkov Sure

@Aaron1011
Copy link
Member Author

Opened #82643

compiler/rustc_ast/src/ast_like.rs Outdated Show resolved Hide resolved
compiler/rustc_ast/src/attr/mod.rs Outdated Show resolved Hide resolved
compiler/rustc_ast/src/tokenstream.rs Outdated Show resolved Hide resolved
compiler/rustc_ast/src/tokenstream.rs Outdated Show resolved Hide resolved
compiler/rustc_ast/src/tokenstream.rs Outdated Show resolved Hide resolved
compiler/rustc_ast/src/mut_visit.rs Outdated Show resolved Hide resolved
compiler/rustc_expand/src/config.rs Outdated Show resolved Hide resolved
compiler/rustc_expand/src/config.rs Outdated Show resolved Hide resolved
compiler/rustc_expand/src/config.rs Outdated Show resolved Hide resolved
compiler/rustc_expand/src/lib.rs Outdated Show resolved Hide resolved
@petrochenkov
Copy link
Contributor

(Still need to review changes in rustc_parse.)

@Aaron1011
Copy link
Member Author

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 28, 2021
@bors
Copy link
Contributor

bors commented Feb 28, 2021

⌛ Trying commit 87977c6a776ca0b4075f0998bde21267b3addcd6 with merge f1c431c58af7789983cbdfd61c470034c37eb631...

@bors
Copy link
Contributor

bors commented Feb 28, 2021

☀️ Try build successful - checks-actions
Build commit: f1c431c58af7789983cbdfd61c470034c37eb631 (f1c431c58af7789983cbdfd61c470034c37eb631)

@rust-timer
Copy link
Collaborator

Queued f1c431c58af7789983cbdfd61c470034c37eb631 with parent 573a697, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (f1c431c58af7789983cbdfd61c470034c37eb631): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 1, 2021
@Aaron1011
Copy link
Member Author

There appears to be a significant hit from being unable to bail out early from collect_tokens_trailing_token when parsing an expression. We have several different options:

  1. Accept the performance hit. This is a pretty bad option, as I suspect this could be a measurable slowdown on large, complex projects.
  2. Refactor expression parsing to determine in advance if the specific type of expression we are parsing (e.g. an if expression) supports inner attributes. However, the expression parsing code is already quite complicated, and this would require introducing even more complexity.
  3. Speed up the slower path of collect_tokens_trailing_token. Given the current design of the parser, I think this will prove somewhat difficult. We need to clone the current token and the TokenCursor, which I think is causing most of the slowdown
  4. Try to detect if inner attributes are definitely not present in the token stream. This would be kind of a hack, since we'd be effectively re-implementing part of the parser in collect_tokens_trailing_token. If inner attributes are present, then we will have the tokens <open_delimiter> # ! ... <close_delimiter> somewhere in the input. However, src/test/ui/proc-macro/weird-braces.rs shows that finding where this occurs is very tricky without actually parsing all of the tokens.

@bors

This comment has been minimized.

@Aaron1011
Copy link
Member Author

@EliaGeretto: Thanks for finding that! I've opened #84130 to fix it

bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 14, 2021
…=petrochenkov

Fix lookahead with None-delimited group

Fixes rust-lang#84162, a regression introduced by rust-lang#82608.
bors added a commit to rust-lang-ci/rust that referenced this pull request May 3, 2021
parser: Remove support for inner attributes on non-block expressions

Remove support for attributes like
```rust
fn attrs() {
    (#![print_target_and_args(fifth)] 1, 2);

    [#![print_target_and_args(sixth)] 1 , 2];
    [#![print_target_and_args(seventh)] true ; 5];

    match 0 {
        #![print_target_and_args(eighth)]
        _ => {}
    }

    MyStruct { #![print_target_and_args(ninth)] field: true };
}
```
They are
- useless
- unstable (modulo holes like rust-lang#65860)
- pessimize compiler performance, namely token collection for macros (cc rust-lang#82608)

I still want to run crater on this to check whether the stability holes are exploited in practice, and whether such attributes are used at all.
wip-sync pushed a commit to NetBSD/pkgsrc-wip that referenced this pull request Jun 20, 2021
Pkgsrc changes:
 * Bump bootstrap requirements to 1.52.1
 * Adjust patches, adapt to upstream changes, adjust cargo checksums
 * If using an external llvm, require >= 10.0

Upsteream changes:

Version 1.53.0 (2021-06-17)
============================

Language
-----------------------
- [You can now use unicode for identifiers.][83799] This allows
  multilingual identifiers but still doesn't allow glyphs that are
  not considered characters such as `#` (diamond) or `<U+1F980>`
  (crab). More specifically you can now use any identifier that
  matches the UAX #31 "Unicode Identifier and Pattern Syntax" standard. This
  is the same standard as languages like Python, however Rust uses NFC
  normalization which may be different from other languages.
- [You can now specify "or patterns" inside pattern matches.][79278]
  Previously you could only use `|` (OR) on complete patterns. E.g.
  ```rust
  let x = Some(2u8);
  // Before
  matches!(x, Some(1) | Some(2));
  // Now
  matches!(x, Some(1 | 2));
  ```
- [Added the `:pat_param` `macro_rules!` matcher.][83386] This matcher
  has the same semantics as the `:pat` matcher. This is to allow `:pat`
  to change semantics to being a pattern fragment in a future edition.

Compiler
-----------------------
- [Updated the minimum external LLVM version to LLVM 10.][83387]
- [Added Tier 3\* support for the `wasm64-unknown-unknown` target.][80525]
- [Improved debuginfo for closures and async functions on Windows MSVC.][83941]

\* Refer to Rust's [platform support page][platform-support-doc] for more
information on Rust's tiered platform support.

Libraries
-----------------------
- [Abort messages will now forward to `android_set_abort_message` on
  Android platforms when available.][81469]
- [`slice::IterMut<'_, T>` now implements `AsRef<[T]>`][82771]
- [Arrays of any length now implement `IntoIterator`.][84147]
  Currently calling `.into_iter()` as a method on an array will
  return `impl Iterator<Item=&T>`, but this may change in a
  future edition to change `Item` to `T`. Calling `IntoIterator::into_iter`
  directly on arrays will provide `impl Iterator<Item=T>` as expected.
- [`leading_zeros`, and `trailing_zeros` are now available on all
  `NonZero` integer types.][84082]
- [`{f32, f64}::from_str` now parse and print special values
  (`NaN`, `-0`) according to IEEE RFC 754.][78618]
- [You can now index into slices using `(Bound<usize>, Bound<usize>)`.][77704]
- [Add the `BITS` associated constant to all numeric types.][82565]

Stabilised APIs
---------------
- [`AtomicBool::fetch_update`]
- [`AtomicPtr::fetch_update`]
- [`BTreeMap::retain`]
- [`BTreeSet::retain`]
- [`BufReader::seek_relative`]
- [`DebugStruct::non_exhaustive`]
- [`Duration::MAX`]
- [`Duration::ZERO`]
- [`Duration::is_zero`]
- [`Duration::saturating_add`]
- [`Duration::saturating_mul`]
- [`Duration::saturating_sub`]
- [`ErrorKind::Unsupported`]
- [`Option::insert`]
- [`Ordering::is_eq`]
- [`Ordering::is_ge`]
- [`Ordering::is_gt`]
- [`Ordering::is_le`]
- [`Ordering::is_lt`]
- [`Ordering::is_ne`]
- [`OsStr::is_ascii`]
- [`OsStr::make_ascii_lowercase`]
- [`OsStr::make_ascii_uppercase`]
- [`OsStr::to_ascii_lowercase`]
- [`OsStr::to_ascii_uppercase`]
- [`Peekable::peek_mut`]
- [`Rc::decrement_strong_count`]
- [`Rc::increment_strong_count`]
- [`Vec::extend_from_within`]
- [`array::from_mut`]
- [`array::from_ref`]
- [`char::MAX`]
- [`char::REPLACEMENT_CHARACTER`]
- [`char::UNICODE_VERSION`]
- [`char::decode_utf16`]
- [`char::from_digit`]
- [`char::from_u32_unchecked`]
- [`char::from_u32`]
- [`cmp::max_by_key`]
- [`cmp::max_by`]
- [`cmp::min_by_key`]
- [`cmp::min_by`]
- [`f32::is_subnormal`]
- [`f64::is_subnormal`]

Cargo
-----------------------
- [Cargo now supports git repositories where the default `HEAD` branch is not
  "master".][cargo/9392] This also includes a switch to the version
  3 `Cargo.lock` format which can handle default branches correctly.
- [macOS targets now default to `unpacked` split-debuginfo.][cargo/9298]
- [The `authors` field is no longer included in `Cargo.toml` for new
  projects.][cargo/9282]

Rustdoc
-----------------------
- [Added the `rustdoc::bare_urls` lint that warns when you have URLs
  without hyperlinks.][81764]

Compatibility Notes
-------------------
- [Implement token-based handling of attributes during expansion][82608]
- [`Ipv4::from_str` will now reject octal format IP addresses in addition
  to rejecting hexadecimal IP addresses.][83652] The octal format can lead
  to confusion and potential security vulnerabilities and [is no
  longer recommended][ietf6943].

Internal Only
-------------
These changes provide no direct user facing benefits, but represent significant
improvements to the internals and overall performance of rustc and
related tools.

- [Rework the `std::sys::windows::alloc` implementation.][83065]
- [rustdoc: Don't enter an infer_ctxt in get_blanket_impls for
  impls that aren't blanket impls.][82864]
- [rustdoc: Only look at blanket impls in `get_blanket_impls`][83681]
- [Rework rustdoc const type][82873]

[83386]: rust-lang/rust#83386
[82771]: rust-lang/rust#82771
[84147]: rust-lang/rust#84147
[84082]: rust-lang/rust#84082
[83799]: rust-lang/rust#83799
[83681]: rust-lang/rust#83681
[83652]: rust-lang/rust#83652
[83387]: rust-lang/rust#83387
[82873]: rust-lang/rust#82873
[82864]: rust-lang/rust#82864
[82608]: rust-lang/rust#82608
[82565]: rust-lang/rust#82565
[80525]: rust-lang/rust#80525
[79278]: rust-lang/rust#79278
[78618]: rust-lang/rust#78618
[77704]: rust-lang/rust#77704
[83941]: rust-lang/rust#83941
[83065]: rust-lang/rust#83065
[81764]: rust-lang/rust#81764
[81469]: rust-lang/rust#81469
[cargo/9298]: rust-lang/cargo#9298
[cargo/9282]: rust-lang/cargo#9282
[cargo/9392]: rust-lang/cargo#9392
[`char::MAX`]: https://doc.rust-lang.org/std/primitive.char.html#associatedconstant.MAX
[`char::REPLACEMENT_CHARACTER`]: https://doc.rust-lang.org/std/primitive.char.html#associatedconstant.REPLACEMENT_CHARACTER
[`char::UNICODE_VERSION`]: https://doc.rust-lang.org/std/primitive.char.html#associatedconstant.UNICODE_VERSION
[`char::decode_utf16`]: https://doc.rust-lang.org/std/primitive.char.html#method.decode_utf16
[`char::from_u32`]: https://doc.rust-lang.org/std/primitive.char.html#method.from_u32
[`char::from_u32_unchecked`]: https://doc.rust-lang.org/std/primitive.char.html#method.from_u32_unchecked
[`char::from_digit`]: https://doc.rust-lang.org/std/primitive.char.html#method.from_digit
[`AtomicBool::fetch_update`]: https://doc.rust-lang.org/std/sync/atomic/struct.AtomicBool.html#method.fetch_update
[`AtomicPtr::fetch_update`]: https://doc.rust-lang.org/std/sync/atomic/struct.AtomicPtr.html#method.fetch_update
[`BTreeMap::retain`]: https://doc.rust-lang.org/std/collections/struct.BTreeMap.html#method.retain
[`BTreeSet::retain`]: https://doc.rust-lang.org/std/collections/struct.BTreeSet.html#method.retain
[`BufReader::seek_relative`]: https://doc.rust-lang.org/std/io/struct.BufReader.html#method.seek_relative
[`DebugStruct::non_exhaustive`]: https://doc.rust-lang.org/std/fmt/struct.DebugStruct.html#method.finish_non_exhaustive
[`Duration::MAX`]: https://doc.rust-lang.org/std/time/struct.Duration.html#associatedconstant.MAX
[`Duration::ZERO`]: https://doc.rust-lang.org/std/time/struct.Duration.html#associatedconstant.ZERO
[`Duration::is_zero`]: https://doc.rust-lang.org/std/time/struct.Duration.html#method.is_zero
[`Duration::saturating_add`]: https://doc.rust-lang.org/std/time/struct.Duration.html#method.saturating_add
[`Duration::saturating_mul`]: https://doc.rust-lang.org/std/time/struct.Duration.html#method.saturating_mul
[`Duration::saturating_sub`]: https://doc.rust-lang.org/std/time/struct.Duration.html#method.saturating_sub
[`ErrorKind::Unsupported`]: https://doc.rust-lang.org/std/io/enum.ErrorKind.html#variant.Unsupported
[`Option::insert`]: https://doc.rust-lang.org/std/option/enum.Option.html#method.insert
[`Ordering::is_eq`]: https://doc.rust-lang.org/std/cmp/enum.Ordering.html#method.is_eq
[`Ordering::is_ge`]: https://doc.rust-lang.org/std/cmp/enum.Ordering.html#method.is_ge
[`Ordering::is_gt`]: https://doc.rust-lang.org/std/cmp/enum.Ordering.html#method.is_gt
[`Ordering::is_le`]: https://doc.rust-lang.org/std/cmp/enum.Ordering.html#method.is_le
[`Ordering::is_lt`]: https://doc.rust-lang.org/std/cmp/enum.Ordering.html#method.is_lt
[`Ordering::is_ne`]: https://doc.rust-lang.org/std/cmp/enum.Ordering.html#method.is_ne
[`OsStr::eq_ignore_ascii_case`]: https://doc.rust-lang.org/std/ffi/struct.OsStr.html#method.eq_ignore_ascii_case
[`OsStr::is_ascii`]: https://doc.rust-lang.org/std/ffi/struct.OsStr.html#method.is_ascii
[`OsStr::make_ascii_lowercase`]: https://doc.rust-lang.org/std/ffi/struct.OsStr.html#method.make_ascii_lowercase
[`OsStr::make_ascii_uppercase`]: https://doc.rust-lang.org/std/ffi/struct.OsStr.html#method.make_ascii_uppercase
[`OsStr::to_ascii_lowercase`]: https://doc.rust-lang.org/std/ffi/struct.OsStr.html#method.to_ascii_lowercase
[`OsStr::to_ascii_uppercase`]: https://doc.rust-lang.org/std/ffi/struct.OsStr.html#method.to_ascii_uppercase
[`Peekable::peek_mut`]: https://doc.rust-lang.org/std/iter/struct.Peekable.html#method.peek_mut
[`Rc::decrement_strong_count`]: https://doc.rust-lang.org/std/rc/struct.Rc.html#method.increment_strong_count
[`Rc::increment_strong_count`]: https://doc.rust-lang.org/std/rc/struct.Rc.html#method.increment_strong_count
[`Vec::extend_from_within`]: https://doc.rust-lang.org/beta/std/vec/struct.Vec.html#method.extend_from_within
[`array::from_mut`]: https://doc.rust-lang.org/beta/std/array/fn.from_mut.html
[`array::from_ref`]: https://doc.rust-lang.org/beta/std/array/fn.from_ref.html
[`cmp::max_by_key`]: https://doc.rust-lang.org/beta/std/cmp/fn.max_by_key.html
[`cmp::max_by`]: https://doc.rust-lang.org/beta/std/cmp/fn.max_by.html
[`cmp::min_by_key`]: https://doc.rust-lang.org/beta/std/cmp/fn.min_by_key.html
[`cmp::min_by`]: https://doc.rust-lang.org/beta/std/cmp/fn.min_by.html
[`f32::is_subnormal`]: https://doc.rust-lang.org/std/primitive.f64.html#method.is_subnormal
[`f64::is_subnormal`]: https://doc.rust-lang.org/std/primitive.f64.html#method.is_subnormal
[ietf6943]: https://datatracker.ietf.org/doc/html/rfc6943#section-3.1.1
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 1, 2021
…tatable, r=Aaron1011

Avoid uneccessary clone of Annotatable

Addresses FIXME comment created in rust-lang#82608

r? `@Aaron1011`
@nnethercote nnethercote mentioned this pull request Apr 27, 2022
nnethercote added a commit to nnethercote/rust that referenced this pull request Apr 29, 2022
`make_tokenstream` has three commented hacks, and a comment at the top
referring to rust-lang#67062. These hacks have no observable effect, at least as judged
by running the test suite. The hacks were added in rust-lang#82608, with an explanation
[here](rust-lang#82608 (comment)). It
appears that one of the following is true: (a) they never did anything useful,
(b) they do something useful but we have no test coverage for them, or (c)
something has changed in the meantime that means they are no longer necessary.

This commit removes the hacks and the comments, in the hope that (b) is not
true.
JohnTitor added a commit to JohnTitor/rust that referenced this pull request May 11, 2022
…cks, r=Aaron1011

Remove hacks in `make_token_stream`.

`make_tokenstream` has three commented hacks, and a comment at the top
referring to rust-lang#67062. These hacks have no observable effect, at least as judged
by running the test suite. The hacks were added in rust-lang#82608, with an explanation
[here](rust-lang#82608 (comment)). It
appears that one of the following is true: (a) they never did anything useful,
(b) they do something useful but we have no test coverage for them, or (c)
something has changed in the meantime that means they are no longer necessary.

This commit removes the hacks and the comments, in the hope that (b) is not
true.

r? `@Aaron1011`
calebcartwright pushed a commit to calebcartwright/rustfmt that referenced this pull request Jun 8, 2022
…ron1011

Remove hacks in `make_token_stream`.

`make_tokenstream` has three commented hacks, and a comment at the top
referring to #67062. These hacks have no observable effect, at least as judged
by running the test suite. The hacks were added in #82608, with an explanation
[here](rust-lang/rust#82608 (comment)). It
appears that one of the following is true: (a) they never did anything useful,
(b) they do something useful but we have no test coverage for them, or (c)
something has changed in the meantime that means they are no longer necessary.

This commit removes the hacks and the comments, in the hope that (b) is not
true.

r? `@Aaron1011`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. relnotes Marks issues that should be documented in the release notes of the next release. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants