Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(gatsby): detect node mutations (enabled by flag or env var) #34006

Merged
merged 12 commits into from Dec 8, 2021
101 changes: 101 additions & 0 deletions docs/docs/how-to/local-development/debugging-missing-data.md
@@ -0,0 +1,101 @@
---
title: Debugging Missing or Stale Data Fields on Nodes
---

## Overview of `LMDB` datastore behavior changes

In Gatsby 3 we introduced a new data store called `LMDB` (enabled with `LMDB_STORE` and/or `PARALLEL_QUERY_RUNNING` flags). In Gatsby 4 it became the default data store. It allows Gatsby to execute data layer related processing outside of the main build process and enables Gatsby to run queries in multiple processes as well as support additional rendering strategies ([DSG](/docs/reference/rendering-options/deferred-static-generation/) and [SSR](/docs/reference/rendering-options/server-side-rendering/)).

In a lot of cases that change is completely invisible to users, but there are cases where things behave differently.

Direct node mutations in various API lifecycles are not persisted anymore. In previous Gatsby versions it did work because source of truth for the data layer was directly in Node.js memory, so mutating node was in fact mutating source of truth. Now Gatsby edits data it receives from the database, so unless it explicitly upserts data to this database after edits, those edits will not be persisted (even for same the same build).

Common errors when doing swap to `LMDB` will be that some fields don't exist anymore or are `null`/`undefined` when trying to execute GraphQL queries.

## Diagnostic mode

Gatsby (starting with version 4.4) can detect those node mutations. Unfortunately it adds measurable overhead so it isn't enabled by default. You can opt into it when you see data related issues (particularly when you didn't have this issue before using `LMDB`). To enable diagnostic mode:

- Use truthy environment variable `GATSBY_DETECT_NODE_MUTATIONS`:
```
GATSBY_DETECT_NODE_MUTATIONS=1 gatsby build
```
- Or use `DETECT_NODE_MUTATIONS` config flag:
```javascript:title=gatsby-config.js
module.exports = {
flags: {
DETECT_NODE_MUTATIONS: true,
},
}
```

Example diagnostic message you might see:

```
warn Node mutation detected

File <rootDir>/plugins/transform/gatsby-node.js:4:20
2 | if (node.internal.type === `Test`) {
3 | // console.log(`[Mutating node in onCreateNode]`)
> 4 | node.nested.a2 = `edited`
| ^
5 | }
6 | }
7 |

Stack trace:
at Object.exports.onCreateNode (<rootDir>/plugins/transform/gatsby-node.js:4:20)
at runAPI (<rootDir>/node_modules/gatsby/src/utils/api-runner-node.js:462:22)
at Promise.catch.decorateEvent.pluginName
(<rootDir>/node_modules/gatsby/src/utils/api-runner-node.js:613:13)
at Promise._execute
... Rest of Stacktrace
```

It will point you to the code that is trying to mutate nodes. Note that it might also point to installed plugins in which case you should notify the plugin maintainer about it.

**Please Note:** Be sure to stop using this mode once you find and handle all problematic code paths as it will decrease performance.

## Migration

### Mutating a node in `onCreateNode`

Instead of mutating nodes directly, `createNodeField` action should be used instead. This way Gatsby will update the source of truth (to actually update the node in the datastore). `createNodeField` will create that additional field under `fields.fieldName`. If you want to preserve schema shape, so that additional field is on the root of a node, you can use schema customization.

```diff
const { createRemoteFileNode } = require(`gatsby-source-filesystem`)

exports.onCreateNode = async ({
node, // the node that was just created
- actions: { createNode },
+ actions: { createNode, createNodeField },
createNodeId,
getCache,
}) => {
if (node.internal.type === `SomeNodeType`) {
const fileNode = await createRemoteFileNode({
// the url of the remote image to generate a node for
url: node.imgUrl,
parentNodeId: node.id,
createNode,
createNodeId,
getCache,
})

if (fileNode) {
- node.localFile___NODE = fileNode.id
+ createNodeField({ node, name: 'localFile', value: fileNode.id })
}
}
}
+
+exports.createSchemaCustomization = ({ actions }) => {
+ const { createTypes } = actions
+
+ createTypes(`
+ type SomeNodeType implements Node {
+ localFile: File @link(from: "fields.localFile")
+ }
+ `)
+}
```
3 changes: 1 addition & 2 deletions docs/docs/reference/release-notes/migrating-from-v3-to-v4.md
Expand Up @@ -583,8 +583,7 @@ This was never an intended feature of Gatsby and is considered an anti-pattern (
Starting with v4 Gatsby introduces a persisted storage for nodes and thus this pattern will no longer work
because nodes are persisted after `createNode` call and all direct mutations after that will be lost.

Unfortunately it is hard to detect it automatically (without sacrificing performance), so we recommend you to
check your code to ensure you don't mutate nodes directly.
Gatsby provides diagnostic mode to detect those direct mutations, unfortunately it has noticeable performance overhead so we don't enable it by default. See [Debugging missing data](/docs/how-to/local-development/debugging-missing-data/) for more details on it.

Gatsby provides several actions available in `sourceNodes` and `onCreateNode` APIs to use instead:

Expand Down
3 changes: 2 additions & 1 deletion packages/gatsby/src/redux/actions/public.js
Expand Up @@ -28,6 +28,7 @@ const normalizePath = require(`../../utils/normalize-path`).default
import { createJobV2FromInternalJob } from "./internal"
import { maybeSendJobToMainProcess } from "../../utils/jobs/worker-messaging"
import { reportOnce } from "../../utils/report-once"
import { wrapNode } from "../../utils/detect-node-mutations"

const isNotTestEnv = process.env.NODE_ENV !== `test`
const isTestEnv = process.env.NODE_ENV === `test`
Expand Down Expand Up @@ -868,7 +869,7 @@ actions.createNode =

const { payload: node, traceId, parentSpan } = createNodeAction
return apiRunnerNode(`onCreateNode`, {
node,
node: wrapNode(node),
traceId,
parentSpan,
traceTags: { nodeId: node.id, nodeType: node.internal.type },
Expand Down
14 changes: 9 additions & 5 deletions packages/gatsby/src/schema/node-model.js
Expand Up @@ -22,6 +22,7 @@ import {
} from "../datastore"
import { GatsbyIterable, isIterable } from "../datastore/common/iterable"
import { reportOnce } from "../utils/report-once"
import { wrapNode, wrapNodes } from "../utils/detect-node-mutations"

type TypeOrTypeName = string | GraphQLOutputType

Expand Down Expand Up @@ -149,7 +150,7 @@ class LocalNodeModel {
this.trackInlineObjectsInRootNode(node)
}

return this.trackPageDependencies(result, pageDependencies)
return wrapNode(this.trackPageDependencies(result, pageDependencies))
}

/**
Expand Down Expand Up @@ -180,7 +181,7 @@ class LocalNodeModel {
result.forEach(node => this.trackInlineObjectsInRootNode(node))
}

return this.trackPageDependencies(result, pageDependencies)
return wrapNodes(this.trackPageDependencies(result, pageDependencies))
}

/**
Expand Down Expand Up @@ -221,7 +222,7 @@ class LocalNodeModel {
typeof type === `string` ? type : type.name
}

return this.trackPageDependencies(result, pageDependencies)
return wrapNodes(this.trackPageDependencies(result, pageDependencies))
}

/**
Expand Down Expand Up @@ -346,7 +347,10 @@ class LocalNodeModel {
pageDependencies.connectionType = gqlType.name
}
this.trackPageDependencies(result.entries, pageDependencies)
return result
return {
entries: wrapNodes(result.entries),
totalCount: result.totalCount,
}
}

/**
Expand Down Expand Up @@ -383,7 +387,7 @@ class LocalNodeModel {
// the query whenever any node of this type changes.
pageDependencies.connectionType = gqlType.name
}
return this.trackPageDependencies(first, pageDependencies)
return wrapNode(this.trackPageDependencies(first, pageDependencies))
}

prepareNodes(type, queryFields, fieldsToResolve) {
Expand Down
5 changes: 5 additions & 0 deletions packages/gatsby/src/services/initialize.ts
Expand Up @@ -21,6 +21,7 @@ import { IBuildContext } from "./types"
import { detectLmdbStore } from "../datastore"
import { loadConfigAndPlugins } from "../bootstrap/load-config-and-plugins"
import type { InternalJob } from "../utils/jobs/types"
import { enableNodeMutationsDetection } from "../utils/detect-node-mutations"

interface IPluginResolution {
resolve: string
Expand Down Expand Up @@ -185,6 +186,10 @@ export async function initialize({
}
const lmdbStoreIsUsed = detectLmdbStore()

if (process.env.GATSBY_DETECT_NODE_MUTATIONS) {
enableNodeMutationsDetection()
}

if (config && config.polyfill) {
reporter.warn(
`Support for custom Promise polyfills has been removed in Gatsby v2. We only support Babel 7's new automatic polyfilling behavior.`
Expand Down
40 changes: 36 additions & 4 deletions packages/gatsby/src/utils/api-runner-node.js
Expand Up @@ -30,6 +30,7 @@ const { requireGatsbyPlugin } = require(`./require-gatsby-plugin`)
const { getNonGatsbyCodeFrameFormatted } = require(`./stack-trace-utils`)
const { trackBuildError, decorateEvent } = require(`gatsby-telemetry`)
import errorParser from "./api-runner-error-parser"
import { wrapNode, wrapNodes } from "./detect-node-mutations"

if (!process.env.BLUEBIRD_DEBUG && !process.env.BLUEBIRD_LONG_STACK_TRACES) {
// Unless specified - disable longStackTraces
Expand All @@ -40,6 +41,21 @@ if (!process.env.BLUEBIRD_DEBUG && !process.env.BLUEBIRD_LONG_STACK_TRACES) {
Promise.config({ longStackTraces: false })
}

const nodeMutationsWrappers = {
getNode(id) {
return wrapNode(getNode(id))
},
getNodes() {
return wrapNodes(getNodes())
},
getNodesByType(type) {
return wrapNodes(getNodesByType(type))
},
getNodeAndSavePathDependency(id) {
return wrapNode(getNodeAndSavePathDependency(id))
},
}

// Bind action creators per plugin so we can auto-add
// metadata to actions they create.
const boundPluginActionCreators = {}
Expand Down Expand Up @@ -372,6 +388,14 @@ const runAPI = async (plugin, api, args, activity) => {
runningActivities.forEach(activity => activity.end())
}

const shouldDetectNodeMutations = [
`sourceNodes`,
`onCreateNode`,
`createResolvers`,
`createSchemaCustomization`,
`setFieldsOnGraphQLNodeType`,
].includes(api)

const apiCallArgs = [
{
...args,
Expand All @@ -383,11 +407,19 @@ const runAPI = async (plugin, api, args, activity) => {
store,
emitter,
getCache,
getNodes,
getNode,
getNodesByType,
getNodes: shouldDetectNodeMutations
? nodeMutationsWrappers.getNodes
: getNodes,
getNode: shouldDetectNodeMutations
? nodeMutationsWrappers.getNode
: getNode,
getNodesByType: shouldDetectNodeMutations
? nodeMutationsWrappers.getNodesByType
: getNodesByType,
reporter: extendedLocalReporter,
getNodeAndSavePathDependency,
getNodeAndSavePathDependency: shouldDetectNodeMutations
? nodeMutationsWrappers.getNodeAndSavePathDependency
: getNodeAndSavePathDependency,
cache,
createNodeId: namespacedCreateNodeId,
createContentDigest,
Expand Down