
Log driver for OpenTelemetry #99

Open
mikehaller opened this issue Nov 29, 2022 · 7 comments

@mikehaller

Currently, the available log driver implementations in container-management are json-file and none.

For the integration of OpenTelemetry, it would be great to have a log driver implementation speaking OTLP and sending container logs to an OpenTelemetry Collector endpoint.
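
For illustration only, a minimal sketch of what the write path of such a driver could look like once logs support is usable in the OpenTelemetry Go SDK. The package paths and option names (e.g. otlploggrpc.WithEndpoint, sdklog.NewBatchProcessor) are assumptions about the SDK layout, and none of this is existing container-management code:

```go
// Sketch: emit one container log line over OTLP/gRPC to a Collector endpoint.
// All SDK package paths and options here are assumptions, not project code.
package main

import (
	"context"
	"time"

	"go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploggrpc"
	"go.opentelemetry.io/otel/log"
	sdklog "go.opentelemetry.io/otel/sdk/log"
)

func main() {
	ctx := context.Background()

	// Exporter pointing at an OpenTelemetry Collector (endpoint is a placeholder).
	exp, err := otlploggrpc.New(ctx,
		otlploggrpc.WithEndpoint("otel-collector:4317"),
		otlploggrpc.WithInsecure(),
	)
	if err != nil {
		panic(err)
	}

	provider := sdklog.NewLoggerProvider(
		sdklog.WithProcessor(sdklog.NewBatchProcessor(exp)),
	)
	defer provider.Shutdown(ctx)

	logger := provider.Logger("container-log-driver")

	// A log driver would do something like this for every line read from stdout/stderr.
	var rec log.Record
	rec.SetTimestamp(time.Now())
	rec.SetSeverity(log.SeverityInfo)
	rec.SetBody(log.StringValue("hello from container 1e15be49"))
	logger.Emit(ctx, rec)
}
```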

@e-grigorov e-grigorov added the task Single unit of work label Nov 30, 2022
@e-grigorov e-grigorov added this to the M3 milestone Nov 30, 2022
@e-grigorov
Contributor

  • OpenTelemetry will bring a bunch of new dependencies, i.e. the daemon size will be affected. As a first step here, we have to evaluate the impact.
  • Another issue is that the Go implementation has to be checked. It seems that logs are not yet implemented there.

@antoniyatrifonova antoniyatrifonova self-assigned this Dec 14, 2022
@antoniyatrifonova
Contributor

The OpenTelemetry specification for Logs is not yet available in a stable state. Fortunately, after some brief research, we can note that the implementation is actively progressing. For now, in the OpenTelemetry specification, Logs already have a stable data model and OTLP support. There is an open and active discussion about the API.

In our opinion, for now, it is better to wait for a stable version before looking for alternative options.

@mikehaller
Author

Even getting the logs as a TCP stream from Kanto CM would be helpful.

Right now, we have to statically define filenames and watch individual files, which is really cumbersome with random UUIDs in the file paths.

Also, there is no API to get the filename of the logfile. If that existed, we could at least get the proper file paths for new containers.

@k-gostev
Member

@mikehaller Isn't that what we did in #98, or am I getting it wrong?
Basically, you can use kanto-cm logs <container-id> / -n <container-name> and it will fetch the logs using gRPC streaming. You don't need to know the file paths to get the logs of a container anymore.

@mikehaller
Author

kanto-cm logs is intended for a human user.

How would you do that on a production system where you do not even want to install a CLI tool?

How do you want to monitor 50+ containers? Spawning 50 separate CLI processes just to get the logs sounds like extreme overhead to me.

@dimitar-dimitrow
Contributor

dimitar-dimitrow commented Mar 8, 2023

The CLI can be used remotely via the --host flag, so during development it can be used without installing it on the actual device. Also, the logs of a container can be fetched directly through the gRPC Logs command without using the CLI.

To get the full logfile path of a container, you need the container manager home directory, the container identifier, and the logfile name.

  • The container manager home_dir by default is /var/lib/container-management and is configurable on daemon start-up, check here.
  • Container identifiers can be fetched using the gRPC List command.
  • The logfile name for the json driver is defined in github.com/eclipse-kanto/container-management/containerm/logger/jsonfile.JSONFileLogDriverName.

By default the logfile is located in <home_dir>/containers/<container_id> (e.g. /var/lib/container-management/containers/1e15be49-b587-47fa-aad7-cad16acd1859). The logfile path can be configured per container during creation - check Logging/root_dir in the doc.
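
As a rough sketch of assembling that path (the logfile name used here is a placeholder; in real code it would come from the jsonfile driver constant referenced above, and home_dir from the daemon configuration):

```go
// Sketch: build the default logfile path for a container from the pieces
// described above. Values below are examples/placeholders, not fixed facts.
package main

import (
	"fmt"
	"path/filepath"
)

// containerLogPath returns <home_dir>/containers/<container_id>/<logfile>.
func containerLogPath(homeDir, containerID, logFileName string) string {
	return filepath.Join(homeDir, "containers", containerID, logFileName)
}

func main() {
	homeDir := "/var/lib/container-management"            // default, configurable on daemon start-up
	containerID := "1e15be49-b587-47fa-aad7-cad16acd1859" // obtained via the gRPC List command
	logFileName := "json.log"                             // placeholder; see the driver constant above

	fmt.Println(containerLogPath(homeDir, containerID, logFileName))
}
```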

Defining a log driver to stream the logs over TCP to a remote endpoint is possible. However, there are some drawbacks that should be taken into account.

  • When this remote driver is used, the logs would not be preserved locally, so no logs would be available through the CLI and the gRPC API.
  • Logs would be lost if there is no connection to the remote endpoint.
  • Significant traffic (e.g. many log entries emitted from those 50+ containers) may become a problem for a resource-limited edge device.
  • Major expenses or running out of data quota due to increased traffic.

Fetching the logs after an issue is detected would use less traffic and resources. Do you think that this approach is applicable in your use case?
If we stick to streaming the logs over TCP, could you provide some requirements (log entry format, TCP or TCP+TLS, etc.)?

@mikehaller
Author

Interesting discussion, I like where this is going.

So, we already have two high-level requirements:

  1. Minimize disk I/O for production, so logging to a json.log file on disk is a no-go for automotive, and I would assume for IoT devices as well.
  2. Minimize traffic / bandwidth to reduce costs

To add:

  1. Minimize maintenance and integration effort by using a standard protocol such as OTel
  2. On-demand collection of logs at runtime (no reconfiguration of system or restarting of containers should be necessary)
  3. Streaming of logs (no zipping and uploading; devs need "real-time" streaming during app development, for example)

So, how about always attaching stdout and stderr but piping them into a ring buffer? Then, when a remote controller activates streaming of logs, the log stream is sent to a collector endpoint. The collector does compression and filtering and then streams to the remote endpoint.

You specify a fixed size for the ring buffer (say 25 MB).
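
A minimal sketch of such a size-bounded ring buffer for log lines (the type, its methods, and the 25 MB cap are purely illustrative, not an existing container-management API):

```go
// Sketch: a fixed-size ring buffer that keeps only the most recent log lines.
// Oldest lines are evicted once the byte budget is exceeded, so nothing is
// written to disk and memory use stays bounded.
package main

import (
	"container/list"
	"fmt"
	"sync"
)

type LogRingBuffer struct {
	mu    sync.Mutex
	lines *list.List // FIFO of buffered log lines
	size  int        // current size in bytes
	max   int        // maximum size in bytes
}

func NewLogRingBuffer(maxBytes int) *LogRingBuffer {
	return &LogRingBuffer{lines: list.New(), max: maxBytes}
}

// Write appends a line and evicts the oldest lines until the budget fits again.
func (b *LogRingBuffer) Write(line string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.lines.PushBack(line)
	b.size += len(line)
	for b.size > b.max && b.lines.Len() > 0 {
		oldest := b.lines.Remove(b.lines.Front()).(string)
		b.size -= len(oldest)
	}
}

// Snapshot returns the buffered lines, e.g. to hand them to a collector once
// a remote controller activates streaming.
func (b *LogRingBuffer) Snapshot() []string {
	b.mu.Lock()
	defer b.mu.Unlock()
	out := make([]string, 0, b.lines.Len())
	for e := b.lines.Front(); e != nil; e = e.Next() {
		out = append(out, e.Value.(string))
	}
	return out
}

func main() {
	buf := NewLogRingBuffer(25 * 1024 * 1024) // 25 MB cap, as in the proposal
	buf.Write("stdout: app started")
	buf.Write("stderr: connection retry")
	fmt.Println(buf.Snapshot())
}
```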

What do you think about such an approach?

@k-gostev k-gostev removed this from the M3 milestone May 18, 2023