Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

err serializer should not copy every key #27

Open
stephenh opened this issue May 24, 2020 · 10 comments
Open

err serializer should not copy every key #27

stephenh opened this issue May 24, 2020 · 10 comments

Comments

@stephenh
Copy link

stephenh commented May 24, 2020

On every project I've used pino on (2 :-) ), we end up using our own err serializer that just does:

function errSerializer(err: Error): unknown {
  return {
    type: err.constructor.name,
    message: err.message,
    stack: err.stack,
  };
}

Because the for key in err behavior of the current serializer invariably ends up walking an Error that points to a connection pool or an HTTP request or an ORM metadata field or a Buffer that was not intended to put JSON.stringified and we get huge/nonsensical/corrupt output in our logs.

(The latest instance actually caused our app to crash b/c of an infinite loop in the fast-safe-stringify library: davidmarkclements/fast-safe-stringify#46 (comment))

So, admittedly based on my N=2, I think the err serializer should be changed to either:

  1. This above verison that just does type, message, stack, or
  2. Make the for key in err smarter and have it only copy over values that are one of:
    • a) primitives,
    • b) have a toJSON property, or
    • c) a is-plain-object (using that npm package) and then apply this same criteria to each value in that object (either as deep as it needs to go or for a fixed number of levels)

I realize both of these are breaking changes, but basically I think err serializer (and really everything in pino that leads to objects getting JSON.stringify'd) should not stringify classes unless it explicitly has a toJSON method on it because its not known whether the class really intended to have its potentially-large/circular 'private' data put on the wire.

(Basically only primitives, object literals, i.e. is-plain-object, and classes with toJSON, are fine to make this assumption.)

As an alternative given the breaking change, maybe having some sort of flag like doNotStringifyClasses (bad name, but you get the idea) on the top-level pino config that switched to different err and formatter behavior and recommend that in the docs as a better default going forward.

@mcollina
Copy link
Member

I think the idea for allowing a custom Error serializer is to cater to situations where libraries do not really keep things simple (more often than not, ORM).

I'm ok in changing the default serializers to what you propose. Can you please send a PR?

(Overall I would recommend not to use any ORM. Most people contributing to this library do not use them, and support for ORMs object is probably very low in our priority list.)

cc @jsumners @davidmarkclements

@stephenh
Copy link
Author

stephenh commented May 24, 2020

Overall I would recommend not to use any ORM

Haha, okay, well let's pretend I didn't mention that. :-)

Sure, I can work on a PR.

I had proposed two potential behaviors (my option 1 ^ was "only copy the 3 values", and option 2 ^ was "only copy primitive/object-literal/toJSON-aware things"). Do you have a preference for which one?

Option 1 is a lot simpler, but option 2 might be more useful in outputting "likely safe/useful" things that the application programmer would like to see in their log output.

@mcollina
Copy link
Member

Just go with option 1.

@jsumners
Copy link
Member

I'm pretty sure the current implementation was directly ported from the original implementation in core pino. I didn't write that implementation so I don't know the thinking around it. But I imagine it is about supporting the widest number of error objects possible in a generic way. For example, the current implementation would cover errors like:

{
  message: 'foo',
  cause: {
    message: 'original error',
   }
}

Whereas the proposed solution would not.

@mcollina
Copy link
Member

That's the likely reason we implemented it in that way @jsumners.
Specifically, there are a few interesting ones, and possibly having a stricter approach would not work at all.

Let's consider https://www.npmjs.com/package/http-errors, https://www.npmjs.com/package/@hapi/boom and others, they add several properties that we currently support.

I think we can consider libraries that add a connection pool to the error problematic in so many ways and they should likely be fixed. A better solution is to document this in our FAQ, so that people can find some guidance in the docs.

@robmcguinness
Copy link
Contributor

Joi validations errors can also print massive stack traces to console using the current err serliazer. If this isn't been worked on I can submit a PR.

@jsumners
Copy link
Member

jsumners commented Jul 6, 2020

A PR for documentation would be welcome. Basically, the situations described in this thread should be using the wrapper method to customize their serializer.

@voxpelli
Copy link
Contributor

I can make a PR for this after #110 is merged

@rektide
Copy link
Contributor

rektide commented Jul 22, 2022

I'd greatly prefer we find some way to help those who have too much information in their errors work through their problems, without adding additional logic intent on reducing Pino's versatility in getting us info. Pino has been wonderful about giving us a lot of freedom to make good errors, in whatever shape we like, that have specific information on the problem, that all shows up in the log. I would be very sad to see it change modes into being an extremely narrow straw of information that comes out. Such dogmatism does not suit the majority of users.

I surveyed HTTP client errors recently, to see who did what with results & specifically errors. I can definitely see & appreciate the challenge of something like axios http client errors being completely overwhelming. But I don't think Pino should change itself & start stripping out almost all information to serve the extreme cases. Better documentation to support difficult situations, & links to alternative serializers that have more restrictive/special-purpose needs feels in order.

Going further, having some toolkit available in pino-std-serializers to suppress certain kinds of things doesn't sound offensive. Ignoring certain property names, values of certain Classes, objects bearing some symbol... various things have been suggested. Picking what specifically to do seems not-super-critical but also non-trivial; it makes me want to see some links to alternative serializers, and a couple years of soak time/exploration, & maybe then a decision made on what to bring in-to "core" pino libraries.

I wrote about some of the impacts of recent work in #110 where we began throwing away most .cause data, which has seriously hampered usability. This issue asks for yet more winnowing down, which seems detrimental to most users. I hope we keep pino-std-serializers useful for not just minimal, tiny, specific use cases, but continue to keep it as an informative, helpful tool for most users. We have issues like #26 showing this is what users expect (and has been what we've given them, to few complaints).

@voxpelli
Copy link
Contributor

As was mentioned in #110 (comment) the design goal is to:

optimize for usefulness per size (message and stack are concatenated but causes are not serialized in full)

So pretty much a 80/20 split. Do a logging that makes sense for 80% of the use cases and leave the ones in need for more with extension points so that users can add the extra things they need on top.

start stripping out almost all information to serve the extreme cases

“Extreme” here feels subjective. What is extreme to one person can be common for another.

My personal view is that additional properties on Error are rarely surfaced anywhere else in the ecosystem. If you do a console.log() or similar none of that data appears. Only the .message, the .stack and sometimes the .name gets exposed typically.

So relying on such properties for logging and similar is the extreme use case to me.

Such properties are great for programmatic use when one wants to figure out what an error is. But for logging I think one should then either make custom serializers for those error types and/or catch them in one’s error handler and do custom logging there.

Teaching people that the addition of properties to Error objects is the proper way to log things is not something I think rhymes well with the wider ecosystem.

Though: I’m not a maintainer of this module of course, so this is just my personal thoughts :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants