Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression: path.parentPath.get(path.listKey) can clobber a traversal's visitors with those from a separate traversal in the global path cache #12570

Closed
1 task done
jedwards1211 opened this issue Dec 29, 2020 · 13 comments · Fixed by #15750
Labels
i: bug i: regression outdated A closed issue/PR that is archived due to age. Recommended to make a new issue pkg: traverse

Comments

@jedwards1211
Copy link
Contributor

Bug Report

  • I would like to work on a fix!

Input Code

const {parse} = require('@babel/core')
const generate = require('@babel/generator').default
const traverse = require('@babel/traverse').default

const code = `
import { Foo } from './Foo'
import { Bar } from './Bar'
`

const presets = [
  ['@babel/preset-env', { targets: { node: 'current' } }],
]

const ast = parse(code, {plugins, presets})

traverse(ast, {
  Program(path) {
    path.traverse({
      ImportDeclaration: {
        enter(path) {
          console.log('ENTER', generate(path.node).code)
          if (path.node.source.value === './Bar')
            path.parentPath.get(path.listKey)
        },
        exit(path) {
          console.log('EXIT ', generate(path.node).code)
        }
      }
    })
  }
})

Expected behavior

I haven't bisected to find the version that introduced the regression yet, but with @babel/core, @babel/traverse, and @babel/preset-env 7.4.5, the exit visitor is called for both import statements:

ENTER import { Foo } from './Foo';
EXIT  import { Foo } from './Foo';
ENTER import { Bar } from './Bar';
EXIT  import { Bar } from './Bar';

Current behavior

With these versions:
├─ @babel/core@7.12.10
├─ @babel/preset-env@7.12.11
└─ @babel/traverse@7.12.12

The exit visitor for the second import statement isn't called because path.parentPath.get(path.listKey) has the side effect of replacing path.opts with those of the parent traversal (which only has a Program visitor)

ENTER import { Foo } from './Foo';
EXIT  import { Foo } from './Foo';
ENTER import { Bar } from './Bar';

(missing EXIT import { Bar } from './Bar')

Additional context

babel-plugin-flow-runtime is no longer working after I upgraded @babel/*, and I uncovered this issue debugging it.

@babel-bot
Copy link
Collaborator

Hey @jedwards1211! We really appreciate you taking the time to report an issue. The collaborators on this project attempt to help as many people as possible, but we're a limited number of volunteers, so it's possible this won't be addressed swiftly.

If you need any help, or just have general Babel or JavaScript questions, we have a vibrant Slack community that typically always has someone willing to help. You can sign-up here for an invite."

@jedwards1211 jedwards1211 changed the title Regression: path.parentPath.get(path.listKey) can clobbers nested traversal's visitors with parent traversal's visitors Regression: path.parentPath.get(path.listKey) can clobber nested traversal's visitors with parent traversal's visitors Dec 29, 2020
@jedwards1211
Copy link
Contributor Author

jedwards1211 commented Dec 29, 2020

@nicolo-ribaudo okay actually I think the nested traversal isn't even the issue, even if I do a non-nested traversal like traverse(path.node, { ... }) within the Program visitor, because there's an apparently global cache of paths (https://github.com/babel/babel/blob/main/packages/babel-traverse/src/path/index.js#L70) it pulls in information that shouldn't be related in any way at all to the non-nested traversal, and clobbers its options.

Shouldn't each traversal own a separate path cache (or at least each root traversal)? This feels very wrong, how this global cache is causing unexpected effects.

@jedwards1211 jedwards1211 changed the title Regression: path.parentPath.get(path.listKey) can clobber nested traversal's visitors with parent traversal's visitors Regression: path.parentPath.get(path.listKey) can clobber a traversal's visitors with those from a separate traversal, because of the global path cache Dec 29, 2020
@jedwards1211 jedwards1211 changed the title Regression: path.parentPath.get(path.listKey) can clobber a traversal's visitors with those from a separate traversal, because of the global path cache Regression: path.parentPath.get(path.listKey) can clobber a traversal's visitors with those from a separate traversal in the global path cache Dec 29, 2020
jedwards1211 referenced this issue Dec 29, 2020
As mentioned on the task https://phabricator.babeljs.io/T7179 having
this cache on the AST leads to all sorts of portability and reuse
bugs.

This moves the cache into a clearable WeakMap which will fix the
following:

1. Moving the AST between different babel versions or tools will not
lead into sharing potentially outdated cached information

2. `.clear()` can be called on the cache by a plugin to clear
potentially outdated information. This is helpful when implementing two
seperate pipelines that should not share information.

I think the next step (which is harder, I tried) is to isolate cache and
make it live on a transform or pipeline level state (like the `hub`).

The reason it is hard is because the `babel-traverse` main API -- although
requires the state object to be passed -- not many callers do. To fix
this we should release a patch version that warns about this and fix all
the internal callers. Next couple of releases we can start throwing when
no state is passed (or we can create our own state).
@jedwards1211
Copy link
Contributor Author

jedwards1211 commented Dec 29, 2020

Okay I found the explanation for why cache ownership was moved from the traversal to the instance of @babel/traverse. But it seems it isn't safe for either one to own the cache.

It seems like the only safe design is either to not cache, or if so, each combination of root traversal and @babel/traverse version has a separate cache, instead of a cache owned by a root traversal accidentally getting accessed by different versions of @babel/traverse or a cache owned by @babel/traverse accidentally getting accessed by different root traversals.

@jedwards1211
Copy link
Contributor Author

jedwards1211 commented Dec 30, 2020

This regression was introduced in @babel/traverse 7.12.7 (seems that PR #12302 did it)

@fedeci
Copy link
Member

fedeci commented Feb 20, 2021

@jedwards1211 I just bisected and it seems to be a regression of #12331.

@jedwards1211
Copy link
Contributor Author

Huh. I didn't do an actual git bisect, just tried subsequent versions, so i bet you're right

@fedeci
Copy link
Member

fedeci commented Feb 21, 2021

After further investigation I discovered that the bug was already there since v6 😄 and #12331 just exposed it.

// quick temporary fix
path.parentPath.get(path.listKey, false)

@JLHwung
Copy link
Contributor

JLHwung commented Nov 18, 2021

I think the behaviour is expected because path.traverse does not visit path itself. In this case, for an ImportDeclaration path, its parentPath is always a Program, whose context are the root context because the Program path is not visited in path.traverse. Now when we call .get("body"): all the immediate children of Program, including the ImportDeclaration, inherit the root context and hence the current context is overridden by root context.

A workaround for

path.parentPath.get(path.listKey)

can be

const siblings = [
  ...path.getAllPrevSiblings(),
  path,
  ...path.getAllNextSiblings(),
];

When parentPath is involved, it is difficult to preserve context as it may run out of the sub-traversal, unless we mark the whole AST as the new context (so plugin is free to visit anywhere), or we provide a getParentPath(context: TraversalContext).

@jedwards1211
Copy link
Contributor Author

Huh I see. FWIW, I wonder if a different API design would be possible where the context information is passed around instead of being a property on the paths that gets mutated. This issue certainly wasn't a showstopper for me but I did burn a lot of time in a painful debugging session on it.

@JLHwung
Copy link
Contributor

JLHwung commented Nov 18, 2021

Indeed, I will see if I can take on #12634

@jedwards1211
Copy link
Contributor Author

@JLHwung oh interesting.

I noticed that visitor is already an argument to path.traverse etc. And this issue involved traversal getting the visitors from path.opts in some cases, and that getting blown away by context changes. So aside from changes in that PR, I wonder if it would be possible for traversal to always use the visitor passed around via function arguments instead of anything related to context?

@nicolo-ribaudo
Copy link
Member

The original example and the test in the attached PR both pass on main.

@nicolo-ribaudo
Copy link
Member

Fixed by #13813 (found by git bisect), I will add a test.

@github-actions github-actions bot added the outdated A closed issue/PR that is archived due to age. Recommended to make a new issue label Oct 5, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 5, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
i: bug i: regression outdated A closed issue/PR that is archived due to age. Recommended to make a new issue pkg: traverse
Projects
None yet
5 participants