Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[v8] Update for ESM monkeypatch + ESM file structure #10046

Closed
12 tasks done
Tracked by #9508
AbhiPrasad opened this issue Jan 4, 2024 · 15 comments
Closed
12 tasks done
Tracked by #9508

[v8] Update for ESM monkeypatch + ESM file structure #10046

AbhiPrasad opened this issue Jan 4, 2024 · 15 comments
Assignees
Milestone

Comments

@AbhiPrasad
Copy link
Member

AbhiPrasad commented Jan 4, 2024

We’re going to be adding proper support for ESM in the Sentry repo.

Requirement:

  • There is no require in ESM bundles at all
  • ESM files have .mjs file extension
  • CJS files have .cjs file extension

Tasks

  1. AbhiPrasad
  2. 2 of 2
  3. Package: remix Type: Improvement

Read below for some justification and information

Module Customization Hooks for Monkey Patching

For ESM bundles on server-side, we’ll require a minimum Node version of 18.6.0 or 20.6.0 or higher. Otherwise we’ll support CJS bundles for Node 14+ generally. This is because we want access to Node customization hooks, which allows us to add monkeypatching for esm modules programatically (doesn't require command-line loader).

We can take some inspiration here from https://github.com/DataDog/import-in-the-middle

import { register } from 'node:module';

register('./sentry-patch.mjs', import.meta.url);

Registering hooks to affect an import still needs to happen before the import is resolved, so there are ordering issues, but I think we can work around this by recommending that Sentry.init is called as soon as possible, and encouraging Sentry.addIntegration patterns for integrations that require objects from different libraries (like express with app).

The modules we need register via register API:

  • Node node:http and node:https (diagnostics channels only works well Node 16+, so we need to maintain this for Node 14
  • Remix @remix-run/server-runtime and react-router-dom - monkeypatch
  • OTEL esm instrumentation
  • GoogleCloudGrpc integration require('google-gax')
  • GoogleCloudHttp integration with require('@google-cloud/common')
  • AWSServices integration with require('aws-sdk/global')

We need to re-evaluate if we need monkeypatching for the database integrations now that we have OpenTelemetry.

Change File Structure

We’ll need to move our file structure to the following:

{
  // at build time we strip `build/X`
  "main": "build/cjs/index.cjs",
  "module": "build/esm/index.mjs",
  "types": "build/types/index.d.cts",
  // note: this is an opportunity for us to explore multiple subpaths, like
  // exporting from `@sentry/browser/utils`
  "exports": {
    "./package.json": "./package.json",
    ".": {
      // esm
      "import": {
        // path must be relative
        "types": "./build/types/index.d.mts",
        "default": "./build/esm/index.mjs"
      },
      // cjs
      "require": {
        // path must be relative
        "types": "./build/types/index.d.cts",
        "default": "./build/esm/index.mjs"
      }
      // we might need to expose an ESM only export for customization hooks
      // to use via node --loader=... API
    }
  },
  "typesVersions": {
    "<4.9": {
      "build/types/index.d.cts": [
        "build/types-ts3.8/index.d.cts"
      ],
      "build/types/index.d.mts": [
        "build/types-ts3.8/index.d.mts"
      ]
    }
  },
  "files": [
    "cjs",
    "esm",
    "types",
    "types-ts3.8"
  ]
}

Shameless promo: Watch my talk if you want more details about this.

We also should change all node standard library imports to import from node:X.

Make dynamicRequire/loadModule work with ESM import

This exists to trick webpack, but we use it all over the codebase. We need to add an esm compatible way.

One thing we can do is introduce an async dynamicRequire that simulates await import, and then build time replace the functionality of dynamicRequire to use import or require under the hood?

  • packages/node/src/integrations/anr/index.ts - to conditionally import worker_threads. This is just because of Node 12 support, we’ll remove this import entirely
  • packages/node/src/integrations/local-variables/local-variables-async.ts - to conditionally import node:inspector/promises.
  • All of our database integrations - these are all going away
  • packages/utils/src/time.ts - this needs to be refactored entirely. We still want an isomorphic solution because we need date-time helpers that work across core/browser/node and all other packages, but what should happen is that we look for the globalThis.performance instead of relying on perf_hooks. This means that if the global performance API is not available we will fall back to plain [Date.now](http://Date.now) and similar helpers (basically only Node 14 for us). We also need to add a smarter timeOrigin calculation that reset’s itself to help alleviate issues with drifting monotonic clock. See Provide a supported way to get anchored clock times open-telemetry/opentelemetry-js#3279 (comment)
@AbhiPrasad AbhiPrasad added this to the 8.0.0 milestone Jan 4, 2024
@AbhiPrasad
Copy link
Member Author

To validate our package.json setup, we'll need to add https://www.npmjs.com/package/@arethetypeswrong/cli as a validation check in CI.

@mydea
Copy link
Member

mydea commented Jan 4, 2024

I don't think we'll need to worry about node:http and node:https, as I guess for node this will be taken care of by the otel instrumentation?

@timfish
Copy link
Collaborator

timfish commented Jan 4, 2024

There is also usage of require.main.filename for module name resolution

@AbhiPrasad
Copy link
Member Author

I don't think we'll need to worry about node:http and node:https, as I guess for node this will be taken care of by the otel instrumentation?

We'll still need something to add breadcrumbs right? Or does that work?

@AbhiPrasad
Copy link
Member Author

There is also usage of require.main.filename for module name resolution

Related GH issue: nodejs/node#49440

@mydea
Copy link
Member

mydea commented Jan 4, 2024

We'll still need something to add breadcrumbs right? Or does that work?

For this we hook into the otel instrumentation and add breadcrumbs from there - see the http integration in node-experimental!

@timfish
Copy link
Collaborator

timfish commented Jan 4, 2024

There is also usage of require.main.filename for module name resolution

Turns out process.argv[1] will be and has always been the entry point in Node so we can use that over require.main.filename. The only place this doesn't hold true is Electron where this is automatically fetched from package.json > main and doesn't end up in argv

@timfish
Copy link
Collaborator

timfish commented Jan 4, 2024

// note: this is an opportunity for us to explore multiple subpaths, like
// exporting from @sentry/browser/utils

package.json exports support was added in Node v12.7.0. Before that, the only way to support subpaths is to output the cjs files into a matching directory in the package root.

https://nodejs.org/api/packages.html#exports

The history suggests conditional exports was added even later but I guess it will still fallback to main.

@Lms24
Copy link
Member

Lms24 commented Jan 5, 2024

One thing we might need to consider is our usage of http-proxy-agent in the Node transport. Whenever I used ESM versions of our SDKs in Vite-based frameworks (Sveltekit, Astro) it seemed like Vite wasn't able to correctly work with this dependency because it's CJS only (so even after experimentally removing stuff like the http integrations to avoid dynamic requires, this seemed to be the problem I didn't find a workaround for).

Can we somehow get rid of this dependency?

Chances are that this problem is limited to Vite as #10042 shows. However, while we can change the respective vite config setting in Astro for our users, it's not as easy in other frameworks.

I'd vote we look at this once we made the other changes to check if it's actually still a problem by then. Maybe it was just a by-product of us using older package.json properties or the generic file extensions.

@timfish
Copy link
Collaborator

timfish commented Jan 5, 2024

One thing we might need to consider is our usage of http-proxy-agent in the Node transport
Can we somehow get rid of this dependency?

It's on my todo list. The latest version is fully typescript so I think I started a PR to vendor it.

EDIT

Ah I remembered what the issue was, they appear to have dropped support for older node versions:
#9199 (comment)

@timfish
Copy link
Collaborator

timfish commented Jan 6, 2024

import { register } from 'node:module';

register('./sentry-patch.mjs', import.meta.url);

Will this work with bundling?

Since register accepts a URL, it also supports data URLs so we can pass the script in the same way we do for the Anr worker.

So the following code:

import { register } from "node:module";

const hookScript = Buffer.from(`
export async function initialize() {
  console.log("initialize");
}

export async function resolve(specifier, context, nextResolve) {
  console.log("resolve", specifier, context);
  return nextResolve(specifier);
}

export async function load(url, context, nextLoad) {
  console.log("load", url, context);
  return nextLoad(url);
}
`);

register(
  new URL(`data:application/javascript;base64,${hookScript.toString("base64")}`),
  import.meta.url
);

const inspect = await import("node:inspector/promises");

Outputs:

initialize
resolve node:inspector/promises {
  conditions: [ 'node', 'import', 'node-addons' ],
  importAttributes: {},
  parentURL: 'file:///Users/tim/Documents/Repositories/node-test/index.mjs'
}
load node:inspector/promises { format: undefined, importAttributes: {} }

@AbhiPrasad
Copy link
Member Author

Will this work with bundling?

I have to test it out, but luckily the opt-out is just asking folks to use node --loader=...

AbhiPrasad added a commit that referenced this issue Jan 18, 2024
Closes #9199

This PR vendors the `https-proxy-agent` code and in the process updates
to v7.0.0.

`https-proxy-agent` is our last remaining cjs-only dependency so this is
required for #10046.

This removes the following dependencies:
- `https-proxy-agent@5.0.1`
- `agent-base@6.0.2`
- `debug@4.3.4`
- `ms@2.1.2`

The vendored code has been modified to use the Sentry logger rather than
`debug`.

Initially, rather than modify the vendored code substantially just to
pass our tight lint rules, I've disabled a few of the less important
lints that would make it particularly tricky to pull in upstream bug
fixes:
```ts
/* eslint-disable @typescript-eslint/explicit-member-accessibility */
/* eslint-disable @typescript-eslint/member-ordering */
/* eslint-disable jsdoc/require-jsdoc */
``` 

## Min supported Node version

`https-proxy-agent` has a `@types/node@14.18.45` dev dependency but
apart from adding an import for `URL`, I can't find anything that would
stop it working on older versions of node and there is nothing in the
changelogs that would suggest the min supported version has changed.

---------

Co-authored-by: Abhijeet Prasad <aprasad@sentry.io>
@GauBen
Copy link

GauBen commented Jan 26, 2024

To validate our package.json setup, we'll need to add npmjs.com/package/@arethetypeswrong/cli as a validation check in CI.

I'd recommend publint too as it currently finds error that arethetypeswrong does not find: https://publint.dev/@sentry/node

@timfish
Copy link
Collaborator

timfish commented Feb 25, 2024

We don't need to concern ourselves with bundling issues for now. If users bundle their code, libraries will not be loaded via the import hook anyway!

I was initially left confused over how import-in-the-middle actually works. The docs say that module customisation hooks run in their own thread so how does it call the hooks in "userland" from the hook thread? It turns out they inject code into the "userland" side to hook this up and then they can use shared state in global variables:
https://github.com/DataDog/import-in-the-middle/blob/c3c2c52c1915b47994af59d507c59029c1f1fae9/hook.js#L302-L324

After experimenting and consulting the docs I can conclude that there is no way to have import hooks work fully without passing a command line argument to node. As per the ESM spec, imports are resolved and loaded before any code is executed.

The node.js docs say the import hook registration should be via the --import arg:

Using --import ensures that the hooks are registered before any application files are imported, including the entry point of the application. Alternatively, register can be called from the entry point, but dynamic import() must be used for any code that should be run after the hooks are registered.

So calling register in the regular entry code can only affect dynamic imports!

If we add a package export for /register which registers our hook, the cli args will be reduced to:

node --import @sentry/node/register app.js

@AbhiPrasad
Copy link
Member Author

I'm going to close this issue. We now have 3 follow up areas:

  1. Remix SDK monkeypatches @remix-run/server-runtime via a dynamic require. With vite powered remix, we may get better APIs to do monkeypatching here. For now we should debate if we temporarily make the Remix SDK

Opened #11067 to track ESM support for Remix SDK.

  1. The serverless SDKs use require to monkeypatch GCP and AWS related libraries. This does not work with ESM (and it gets even more complicated with bundling). For now we made those SDKs CJS only.

Opened #11066 to track ESM support for serverless SDKs.

  1. OpenTelemetry ESM/Bundling/Docs gaps and problems

Overall we've identified some issues with OpenTelemetry instrumentation with ESM (even more problematic when bundling). We created #11070 to track these issues to follow up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Development

No branches or pull requests

5 participants