Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add insights instrumentation - events and metrics #539

Merged
merged 76 commits into from
Jun 4, 2024

Conversation

roelbondoc
Copy link
Member

@roelbondoc roelbondoc commented May 2, 2024

This PR adds automatic Insights data collection to the gem for certain libraries.

By default, all automatic data collection is disabled. To enable collection, add the following to your config/honeybadger.yml

insights:
  enabled: true

The list below are the names of the plugins that contain insight metrics. Some plugins will emit events that contain a duration attribute indicating how long that particular event took to invoke. Other plugins will poll for metrics. The polling occurs every second. As with regular plugins, they only load if the required gem is available, and they may be disabled through the skipped_plugins configuration setting.

rails

events

process_action.action_controller
send_file.action_controller
redirect_to.action_controller
halted_callback.action_controller
write_fragment.action_controller
read_fragment.action_controller
expire_fragment.action_controller
render_template.action_view
render_partial.action_view
render_collection.action_view
sql.active_record
process.action_mailer
service_upload.active_storage
service_download.active_storage

active_job

events

enqueue_at.active_job
enqueue.active_job
enqueue_retry.active_job
enqueue_all.active_job
perform.active_job
retry_stopped.active_job
discard.active_job

sidekiq

events

perform.sidekiq
enqueue.sidekiq

metrics

active_workers
active_processes
jobs_processed
jobs_failed
jobs_scheduled
jobs_enqueued
jobs_dead
jobs_retry
queue_latency
queue_depth
queue_busy
capacity
utilization

karafka

events

consumer.consumed.karafka

solid_queue (new)

metrics

jobs_in_progress
jobs_blocked
jobs_failed
jobs_scheduled
jobs_processed
active_workers
active_dispatchers
queue_depth

net_http (new, disabled by default)

events

request.net_http

To enable automatic instrumentation of Net::HTTP requests, add the following to the config:

net_http:
  insights:
    enabled: true

autotuner (new)

events

report.autotuner

metrics

diff.minor_gc_count
diff.major_gc_count
diff.time
request_time
heap_pages

puma (new)

metrics

pool_capacity
max_threads
requests_count
backlog
running

system (new)

events

report.system

Note: The puma metrics are enabled through the puma plugin. This isn't enabled by default, but must be added to your puma.rb file by adding plugin: honeybadger anywhere in the file.

Further Configuration

You may also disable insights for a specific plugin by adding a setting with the plugin name:

sidekiq:
  insights:
    enabled: false

Some plugins also collect metrics for a cluster. These metrics do not need to be collected from every running instance. To disable the collection of cluster metrics, add the following configuration:

sidekiq:
  insights:
    cluster_collection: false

This is available for the sidekiq and solid_queue plugins. It is recommended to enable this for a single instance. If you are running Sidekiq Enterprise and have this enabled, it will only run on the current leader instance.

By default, polled metrics collection occurs every one second, this can be configured per plugin:

sidekiq:
  insights:
    collection_interval: 10

This will modify the collector so that sidekiq metrics are checked every 10 seconds.

Ignoring events:

events:
  ignore:
    - 'enqueue.sidekiq'
    - !ruby/regexp '/.*.active_storage/'

Metrics and Registry

Components

Honeybadger::Registry - This class stores all tracked metrics registered in the gem. The registry ensures that only one instance of each unique metric is stored at any time. A metric is defined as unique by it's type, name, and set of attributes.

Honeybadger::RegistryExecution - This class is injected into the MetricsWorker and is run on a configurable interval. When called, it iterates over all registered metrics, compiles payloads for each, and sent as an event. The registry is then flushed.

Honeybadger::Metric - A base metric class from which all other metric classes inherit from. These metric classes record data and compute/track values as needed.

Shape of the Data

Each metric type's data is shaped differently. Below are examples of what that data looks like for each type.

counter

{
  "counter": 1
}

gauge

{
  "min": 1,
  "max": 10,
  "avg": 5.0,
  "latest": 5
}

timer

{
  "min": 1,
  "max": 10,
  "avg": 5.0,
  "latest": 5
}

Note: This may look the same as a gauge, however, the interface to use the metric provides a timing mechanism.

Example:

time 'some_operation', -> { code_to_be_timed() }

histogram

{
  "bins": [[10.0, 1], [50.0, 10], [100.0, 5]]
}

Changes to events

The event payloads will now include the hostname by default. This can be changed by setting the events.attach_hostname configuration parameter to false.

The event work now has its own config events.max_queue_size which defaults to 10_000. This will allow the agent to backlog more events. The batch size (events.batch_size) default has also been increased to 1_000.

Before submitting a pull request, please make sure the following is done:

  1. If you've fixed a bug or added code that should be tested, add tests!
  2. Run rake spec in the repository root.
  3. Use a pull request title that conforms to conventional commits.

This adds a `Honeybadger::Instrumentation` class to help facilitate
gathering metrics within the gem. It's used by calling class methods for
the various types of metrics.

The `Honeybadger::InstrumentationHelper` module can be included into any
class for a cleaner metric DSL.

For this round of implementaiton, the metrics are simply sent off as
events to Insights for collection.
This adds new configuration options for customizing the embedded
Insights experience in the gem.

`insights.enabled` - Allows you to turn off the event worker, this is
enabled by default.

`insights.metrics` - Allows you to turn off any plugin Insights data
gathering. This includes metrics and plugin events. This is disabled by
default.

Plugin specific configuration changes:

`[plugin name].insights.enabled` - Toggle the plugin Insights data
gather. This includes any metrics, or events the plugin emits.

`[plugin name].insights.cluster_configuration` - Allows you to toggle
cluster collection metrics.

`[plugin name].insights.collection_interval` - Adjust the frequency of
when the collection interval runs for the specified plugin (defaults to
1 per second).
This adds the ability for a plugin to defined a `collect` block, similar
to the `execution` block. During load, these `collect` blocks are
gathered and wrapped in a `CollectionExecution` class and passed to a
queue in a `CollectorWorker`.

The `CollectorWorker` is setup to sychronously execute each collect
block around 1 time per second. A collect block may be configured to
increase it's interval by passing the `interval` option:

```
collect(interval: 10) do
end
```

The above collect block will only be exectued every 10 intervals (about
10 seconds). The `CollectionExecution` class includes the
`Honeybadger::InstrumentationHelper` module, so it has access to any of
the metric methods.

This allows for polling of resources and reporting them to Insights.
This adds subscribers to a few ActiveSupport::Notification events and
emits them to Insights.
This adds a few events and cluster metrics gathering for Sidekiq.
This adds a Puma plugin for Honeybadger.

Metrics on the puma process are gathered in a background thread every 1
second and should work for both single threaded and clustered mode.
@roelbondoc roelbondoc changed the title Insights Instrumentation - events and metrics feature: Add Insights Instrumentation - events and metrics May 3, 2024
@roelbondoc roelbondoc changed the title feature: Add Insights Instrumentation - events and metrics feat: Add Insights Instrumentation - events and metrics May 3, 2024
@roelbondoc roelbondoc changed the title feat: Add Insights Instrumentation - events and metrics feat: add insights instrumentation - events and metrics May 3, 2024
@roelbondoc roelbondoc marked this pull request as ready for review May 9, 2024 15:20
lib/honeybadger/config.rb Outdated Show resolved Hide resolved
@roelbondoc
Copy link
Member Author

There has been some changes since the last approvals. Requesting again for a final review.

lib/honeybadger/plugins/autotuner.rb Show resolved Hide resolved
lib/honeybadger/metric.rb Show resolved Hide resolved
lib/honeybadger/version.rb Outdated Show resolved Hide resolved
Copy link
Contributor

@rabidpraxis rabidpraxis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks awesome @roelbondoc, great work!

I do have one quick question about the agent methods. If I were to instantiate a second agent and then call instrumentation methods on it, would the events report from that agent, or globally?

lib/honeybadger/agent.rb Outdated Show resolved Hide resolved
lib/honeybadger/config/defaults.rb Outdated Show resolved Hide resolved
lib/honeybadger/metric.rb Outdated Show resolved Hide resolved
end

def parsed_uri_data(request_data)
uri = request_data.uri || build_uri(request_data)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a concern with sensitive data in the URI here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good question. There could be sensitive data here that a user may not know is being logged. Perhaps we need a mechanism to disable this by default.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a default so this is disabled by default and must be explicitly enabled:

net_http:
  insights:
    enabled: true

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about leaving it enabled but reporting just the host name?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That works for me. I'll make it configurable to allow the full uri if set to true.

lib/honeybadger/plugins/rails.rb Show resolved Hide resolved
lib/honeybadger/agent.rb Outdated Show resolved Hide resolved
@roelbondoc
Copy link
Member Author

I do have one quick question about the agent methods. If I were to instantiate a second agent and then call instrumentation methods on it, would the events report from that agent, or globally?

That's a good question. Right now some of the internals rely on the global singleton agent, but with a few tweaks this shouldn't be too hard to fix.

@roelbondoc roelbondoc requested a review from stympy May 31, 2024 17:49
Copy link
Member

@joshuap joshuap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work @roelbondoc. Can't wait to start using this.

},
:'net_http.insights.enabled' => {
description: 'Allow automatic instrumentation of Net::HTTP requests.',
default: true,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this supposed to be false by default?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a conversation about disabling it by default, but ultimately we decided to enable it by default, but only log the domain, with an optional configuration to log the full path.

@roelbondoc roelbondoc merged commit d173ac5 into master Jun 4, 2024
41 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants