Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding support for mix&match pipelines #1954

Open
wants to merge 18 commits into
base: main
Choose a base branch
from

Conversation

dbacarel
Copy link
Contributor

@dbacarel dbacarel commented Apr 27, 2024

This PR wants to address the use-case of mix matching pipelines and targets, see discussion #1302.
Currently only pipeline or targets can be defined.

As a stretch, the current prototype proposes to introduce support for named pipelines that can be referred within the definition of a target.

  const transport = pino.transport({
    targets: [
      {
        target: join(__dirname, '..', 'fixtures', 'to-file-transport.js'),
        options: { destination: '/path/to/A'},
        pipeline: 'pipelineA'
      },
      {
        target: join(__dirname, '..', 'fixtures', 'to-file-transport.js'),
        options: { destination: '/path/to/B' }
      },
      {
        target: join(__dirname, '..', 'fixtures', 'to-file-transport.js'),
        options: { destination: '/path/to/C'},
        pipeline: 'pipelineB'
      }
    ],
    pipelines: {
      pipelineA: [
        {
          target: join(__dirname, '..', 'fixtures', 'transform-1.js')
        },
        {
          target: join(__dirname, '..', 'fixtures', 'transform-2.js')
        }
      ],
      pipelineB: [
        {
          target: join(__dirname, '..', 'fixtures', 'transform-3.js')
        },
        {
          target: join(__dirname, '..', 'fixtures', 'transform-4.js')
        }
      ]
    }
  })

More details will follow

Update

The idea of named pipelines was abandoned.
This PR extends targets to support mixing in pipeline definitions within the array of transport configurations

Example:

const pino = require('pino')
const transport = pino.transport({
  targets: [{
    level: 'info',
    target: 'pino-pretty' // must be installed separately
  }, {
    level: 'trace',
    target: 'pino/file',
    options: { destination: '/path/to/store/logs' }
  }, {
    pipeline: [{
      target: 'pino-syslog' // must be installed separately
    }, {
      target: 'pino-socket' // must be installed separately
    }]
  }
  ]
})
pino(transport)

@mcollina
Copy link
Member

I think this is too complex, would avoid the "naming" altogether.

@dbacarel
Copy link
Contributor Author

dbacarel commented May 1, 2024

One of the advantages that I saw in adopting named pipeline would have been enabling the implementation of one input, multiple outputs, similarly to tee. Potentially, the transformation for an input would have been produced only once (for a given pipeline) and only then passed through to the destination target(s).

On the user side, this simplifies the configuration since the pipeline requires to be defined only once.
On the implementation side, it would have been easier to identify identical pipes, if the optimization mentioned above were to be implemented.

Besides the technical aspects, I also understand that it also comes down to how common such use case is so, I don't feel like pushing this idea too much.

@mcollina
Copy link
Member

mcollina commented May 1, 2024

I feel most devs won't understand that.

@dbacarel dbacarel changed the title Adding support for named, mix&match pipelines Adding support for mix&match pipelines May 4, 2024
@dbacarel
Copy link
Contributor Author

dbacarel commented May 4, 2024

Pity but fair enough.

I'll revert the changes related to the named pipelines in favor to a configuration like the following one, looks intuitive enough to me.

const pino = require('pino')
const transport = pino.transport({
  targets: [{
    level: 'info',
    target: 'pino-pretty' // must be installed separately
  }, {
    level: 'trace',
    target: 'pino/file',
    options: { destination: '/path/to/store/logs' }
  }, {
    pipeline: [{
      target: 'pino-syslog' // must be installed separately
    }, {
      target: 'pino-socket' // must be installed separately
    }]
  }
  ]
})
pino(transport)

dbacarel and others added 5 commits May 5, 2024 11:52
- Implemented support for mixed target&pipeline definitions  within `targets` in `transport.js`
- Merged logic from both `worker.js` and `worker-pipeline.js` into `worker.js`
- Fixed `pipeline.test.js`
- Fixed docs to reflect changes above

TODO:
 - Remove `worker-pipeline.js`
 - Fix `transport.js` to use only `worker.js`
 - Fix related docs
 - Fix UTs
- Updated docs to remove mentions of `worker-pipeline.js`
- Fixed failing UTs
- Fixed `transport.js` to use only `worker.js` also when `pipeline` is defined
- Fixed `worker.js` to work properly when only `pipeline` is defined
@dbacarel dbacarel marked this pull request as ready for review May 5, 2024 11:46
@dbacarel
Copy link
Contributor Author

dbacarel commented May 5, 2024

Removing the Draft mark as the code is now in a more decent shape for some feedback.

Besides the implementation of the support for pipelines illustrated above, I also thought to get rid of worker-pipeline and unify the logic into one worker only, worker.js.

The following schema illustrates how the streams in targets with pipelines and pipeline are handled in the new worker

┌──────────────────────────────────────────────────┐   ┌─────┐                
│                                                  │   │     │                
│                                                  │   │  p  │                
│                   target    ┌───────────────┐    │   │  i  │                
│                 ──────────► │               │    │   │  n  │                
│   targets         target    │  pino.        │    │   │  o  │                
│ ────────────►   ──────────► │  multistream  ├────┼──►│  .  │       source   
│                   target    │               │    │   │  m  │         │      
│                 ──────────► └───────────────┘    │   │  u  │         │write 
│                                                  │   │  l  │         ▼      
│                  pipeline   ┌───────────────┐    │   │  t  │      ┌────────┐
│                 ──────────► │ PassThrough   ├────┼──►│  i  ├──────┤        │
│                             └───────────────┘    │   │  s  │ write│ Thread │
│                                                  │   │  t  │◄─────┤ Stream │
│                  pipeline   ┌───────────────┐    │   │  r  │      │        │
│                 ──────────► │ PassThrough   ├────┼──►│  e  │      └────────┘
│                             └───────────────┘    │   │  a  │                
│                                                  │   │  m  │                
│                                                  │   │     │                
│                        OR                        │   │     │                
│                                                  │   │     │                
│  pipeline     ┌──────────────┐                   │   │     │                
│ ────────────► │ PassThrough  ├───────────────────┼──►│     │                
│               └──────────────┘                   │   │     │                
│                                                  │   │     │                
└──────────────────────────────────────────────────┘   └─────┘                

There're probably some naïve decisions here with impacts I don't see, for example pipelines defined via pipeline property now need to go through an additional stream layer represented by pino.multistream.

@mcollina
Copy link
Member

mcollina commented May 7, 2024

pino.multistream() has a few additional consideration (such as the minimum level), that makes this architecture a bit more fragile.

I would recommend:

  • moreving the 2nd multistream for all targets, just reusing the top one.
  • add a special cases for 1 target or 1 pipeline that bypass all the heavy machinery, essentially being used by ThreadStream directly (same as it is right now).

@dbacarel
Copy link
Contributor Author

dbacarel commented May 9, 2024

pino.multistream() has a few additional consideration (such as the minimum level), that makes this architecture a bit more fragile.

Yes, that's something I wanted follow up a bit as I wasn't sure about the implications of logging level vs pipelines in this context. I'll read more about this. Thanks for the heads up.

About the recommended changes:

moreving the 2nd multistream for all targets, just reusing the top one.

There is actually one layer of pino.multistream, the current state looks in fact like this

┌────────────────────────────────────────────────┐     ┌─────┐                
│                                                │     │     │                
│                                                │     │     │                
│                   target                       │     │     │                
│               │ ───────────────────────────────┼────►│     │                
│   targets     │   target                       │     │  p  │                
│ ────────────► │ ───────────────────────────────┼────►│  i  │       source   
│               │   target                       │     │  n  │         │      
│               │ ───────────────────────────────┼────►│  o  │         │write 
│               │                                │     │  .  │         ▼      
│               │  pipeline   ┌───────────────┐  │     │  m  │      ┌────────┐
│               │ ──────────► │ PassThrough   ├──┼────►│  u  ├──────┤        │
│               │             └───────────────┘  │     │  l  │ write│ Thread │
│               │                                │     │  t  │◄─────┤ Stream │
│               │  pipeline   ┌───────────────┐  │     │  i  │      │        │
│               │ ──────────► │ PassThrough   ├──┼────►│  s  │      └────────┘
│                             └───────────────┘  │     │  t  │                
│                                                │     │  r  │                
│                                                │     │  e  │                
│                        OR                      │     │  a  │                
│                                                │     │  m  │                
│  pipeline                   ┌───────────────┐  │     │     │                
│ ───────────────────────────►│ PassThrough   ├──┼────►│     │                
│                             └───────────────┘  │     │     │                
│                                                │     │     │                
└────────────────────────────────────────────────┘     └─────┘                

I drew the wrong schema, my bad.

add a special cases for 1 target or 1 pipeline that bypass all the heavy machinery, essentially being used by ThreadStream directly (same as it is right now).

Makes sense, will do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants