Machine readable drenv results #1306

Open
nirs opened this issue Mar 31, 2024 · 0 comments
Labels: enhancement (New feature or request), test (Testing related issue)


nirs commented Mar 31, 2024

When drenv fails, we have good logging for humans:

drenv.commands.Error: Command failed:
   command: ('addons/ocm-cluster/start', 'dr1', 'hub')
   exitcode: 1
   error:
      Traceback (most recent call last):
        File "/home/nsoffer/ramen/test/addons/ocm-cluster/start", line 173, in <module>
          deploy(cluster_name, hub_name)
        File "/home/nsoffer/ramen/test/addons/ocm-cluster/start", line 60, in deploy
          wait_for_managed_cluster(cluster, hub)
        File "/home/nsoffer/ramen/test/addons/ocm-cluster/start", line 135, in wait_for_managed_cluster
          kubectl.wait(
        File "/home/nsoffer/ramen/test/drenv/kubectl.py", line 141, in wait
          _watch("wait", *args, context=context, log=log)
        File "/home/nsoffer/ramen/test/drenv/kubectl.py", line 157, in _watch
          for line in commands.watch(*cmd, input=input):
        File "/home/nsoffer/ramen/test/drenv/commands.py", line 155, in watch
          raise Error(args, error, exitcode=p.returncode)
      drenv.commands.Error: Command failed:
         command: ('kubectl', 'wait', '--context', 'hub', 'managedcluster/dr1', '--for=jsonpath={.spec.hubAcceptsClient}=true', '--timeout=60s')
         exitcode: 1
         error:
            error: timed out waiting for the condition on managedclusters/dr1

But this is not helpful when trying to analyze 300 runs. We need a machine-readable format that a program can consume to compute:

  • number of errors
  • which addon produces most of the errors?
  • which command in the addon produces most of the errors?
  • which time of day has the most errors?
  • run time stats (avg, min, max)

For this example we could use something like:

$ drenv start --output json envs/regional-dr.yaml
{
  "error": {
    "command": [
      "addons/ocm-cluster/start",
      "dr1",
      "hub"
    ],
    "exitcode": 1,
    "error": {
      "command": [
        "kubectl",
        "wait",
        "--context",
        "hub",
        "managedcluster/dr1",
        "--for=jsonpath={.spec.hubAcceptsClient}=true",
        "--timeout=60s"
      ],
      "exitcode": 1,
      "error": "error: timed out waiting for the condition on managedclusters/dr1"
    }
  },
  "time": 447,
  "started": "2024-03-30 20:11:18.147614957 -0400",
  "finished": "2024-03-30 20:18:48.656273959 -0400"
}
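
The statistics listed earlier could be computed from such documents roughly as follows. This is a minimal sketch, not part of drenv: the helper names (`parse_time`, `innermost`, `summarize`) are hypothetical, and it assumes that each run's output has been loaded into a dict, that a successful run simply has no "error" key, and that "time" is the run time in seconds.

```python
import collections
import datetime
import statistics


def parse_time(text):
    """Parse a timestamp like '2024-03-30 20:11:18.147614957 -0400'."""
    date, clock, offset = text.split()
    # strptime's %f accepts at most 6 fractional digits, so keep only
    # the microseconds and drop the rest of the nanosecond value.
    seconds, _, fraction = clock.partition(".")
    clock = f"{seconds}.{fraction[:6] or '0'}"
    return datetime.datetime.strptime(
        f"{date} {clock} {offset}", "%Y-%m-%d %H:%M:%S.%f %z"
    )


def innermost(error):
    """Return the deepest nested error: the command that actually failed."""
    while isinstance(error.get("error"), dict):
        error = error["error"]
    return error


def summarize(results):
    """Compute error and run time stats from a non-empty list of results."""
    failed = [r for r in results if "error" in r]
    # The first item of the top level command is the addon script.
    addons = collections.Counter(r["error"]["command"][0] for r in failed)
    # Identify the failing command by its first two words, e.g. "kubectl wait".
    commands = collections.Counter(
        " ".join(innermost(r["error"])["command"][:2]) for r in failed
    )
    hours = collections.Counter(parse_time(r["started"]).hour for r in failed)
    times = [r["time"] for r in results]
    return {
        "errors": len(failed),
        "addons": addons.most_common(),
        "commands": commands.most_common(),
        "hours": hours.most_common(),
        "time": {
            "avg": statistics.mean(times),
            "min": min(times),
            "max": max(times),
        },
    }
```

Counting failures by the innermost command separates "the addon failed" from "kubectl wait timed out inside the addon", which is the distinction the nested error is meant to preserve.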

The timing info could be computed by the program running drenv, but computing it in drenv itself makes it easier to collect this info in all environments.
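
Recording the timing inside drenv could be as simple as wrapping the top level command. The following is only a sketch under assumptions: `timed_run` is a hypothetical helper, "time" is assumed to be whole seconds, and Python's datetime provides microsecond rather than nanosecond precision, so the timestamps are shorter than in the example above.

```python
import datetime
import time

TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f %z"


def timed_run(func):
    """Run func() and return a result dict including timing info."""
    # Wall clock time with the local UTC offset, for the report.
    started = datetime.datetime.now().astimezone()
    # Monotonic clock for the duration, immune to wall clock changes.
    start = time.monotonic()
    result = {}
    try:
        func()
    except Exception as e:
        # Placeholder: a real implementation would store the structured
        # error (command, exitcode, error) instead of a plain string.
        result["error"] = str(e)
    elapsed = time.monotonic() - start
    finished = datetime.datetime.now().astimezone()
    result["time"] = round(elapsed)
    result["started"] = started.strftime(TIME_FORMAT)
    result["finished"] = finished.strftime(TIME_FORMAT)
    return result
```

Using a monotonic clock for the duration and the wall clock only for the "started" and "finished" fields keeps the run time correct even if the system clock is adjusted during a long run.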

The default format can be YAML, using the same internal representation:

error:
  command:
  - addons/ocm-cluster/start
  - dr1
  - hub
  exitcode: 1
  error:
    command:
    - kubectl
    - wait
    - --context
    - hub
    - managedcluster/dr1
    - --for=jsonpath={.spec.hubAcceptsClient}=true
    - --timeout=60s
    exitcode: 1
    error: 'error: timed out waiting for the condition on managedclusters/dr1'
time: 447
started: 2024-03-30 20:11:18.147614957 -0400
finished: 2024-03-30 20:18:48.656273959 -0400
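
One way to get a single internal representation is to convert the error into plain dicts, lists, and strings before formatting. The sketch below is an illustration only: `CommandError` is a hypothetical stand-in for `drenv.commands.Error`, and it assumes the inner error is available as an object rather than as text captured from the addon's stderr.

```python
import json


class CommandError(Exception):
    """Hypothetical stand-in for drenv.commands.Error."""

    def __init__(self, command, error, exitcode=None):
        super().__init__(command, error, exitcode)
        self.command = command
        self.error = error
        self.exitcode = exitcode

    def to_dict(self):
        """Return a representation built only from plain data types."""
        error = self.error
        # A nested command failure becomes a nested dict; anything else
        # (typically the command's stderr) is kept as text.
        if isinstance(error, CommandError):
            error = error.to_dict()
        return {
            "command": list(self.command),
            "exitcode": self.exitcode,
            "error": error,
        }


inner = CommandError(
    ("kubectl", "wait", "--context", "hub", "managedcluster/dr1"),
    "error: timed out waiting for the condition on managedclusters/dr1",
    exitcode=1,
)
outer = CommandError(("addons/ocm-cluster/start", "dr1", "hub"), inner, exitcode=1)

print(json.dumps({"error": outer.to_dict()}, indent=2))
```

Because the result contains only plain data types, the same dict could be passed to a YAML library (for example PyYAML's `yaml.safe_dump`, if it is available in the environment) to produce the default format.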
@nirs nirs added enhancement New feature or request test Testing related issue labels Mar 31, 2024