Finding a specific build log is incredibly slow #486

Open
tjprescott opened this issue Feb 15, 2024 · 1 comment
Comments

@tjprescott
Member

I have a script where I am trying to retrieve build logs for all PRs we have that are failing so I can parse the logs and collate the data.

I can get all of the PRs that are failing a particular check with the GitHub API pretty easily. I end up with detail links, which all point to our internal ADO project.

It is when I need to pull the logs from ADO that things get really slow. The code basically looks like this:

# PHASE 2: Collect logs for failed LintDiff runs from Azure DevOps
failure_logs = []
ado = get_ado_client()
for run in failed_lint_diff_runs:
    link = run.details_url
    build_id = get_build_id(link)
    build_client = ado.clients_v7_1.get_build_client()
    build_logs = build_client.get_build_logs("internal", build_id)
    for log in build_logs:
        lines = build_client.get_build_log_lines("internal", build_id, log.id)
        if "Starting: LintDiff" in lines[0]:
            print(f"Found LintDiff log for build {build_id} at log {log.id}")
            failure_logs.append(lines)
            break

When I run this, it takes about 20 minutes. Nearly all of that time is spent repeatedly calling get_build_log_lines to examine just the first line of each build log to see if it is the one I care about. The output looks like:

PRs with LintDiff failures: 16
Found LintDiff log for build 3497966 at log 297
Found LintDiff log for build 3493487 at log 375
Found LintDiff log for build 3497244 at log 242
Found LintDiff log for build 3471732 at log 295
Found LintDiff log for build 3468386 at log 417
Found LintDiff log for build 3468168 at log 395
Found LintDiff log for build 3459808 at log 307
Found LintDiff log for build 3475219 at log 296
Found LintDiff log for build 3469709 at log 307
Found LintDiff log for build 3497648 at log 310
Found LintDiff log for build 3462035 at log 212
Found LintDiff log for build 3370030 at log 342
Found LintDiff log for build 3349947 at log 391
Found LintDiff log for build 3328138 at log 306
Found LintDiff log for build 3327852 at log 304
Found LintDiff log for build 3327331 at log 306

As you can see, each build has hundreds of logs, so most of the time is wasted. I couldn't find any property on the build log object that identifies it. The ID changes from build to build, and while the log has a consistent name in the UI, that name doesn't seem to be included here. If it were, it would save a ton of time.

Is there a better way to collect this information?
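
One option that may avoid scanning every log is the build timeline. It lists a record per job and task, and each record carries a name plus a reference to its log. The sketch below is untested against the internal project. It assumes the task's timeline record is named "LintDiff" (inferred from the "Starting: LintDiff" line) and that build_client is the object returned by get_build_client() in the snippet above. The two helper names are hypothetical.

```python
def find_log_id_by_record_name(build_client, project, build_id, record_name):
    """Return the log id of the timeline record called record_name, or None.

    Assumption: the record's name matches the task display name shown in
    the ADO UI (here "LintDiff"). Adjust if the pipeline names it differently.
    """
    timeline = build_client.get_build_timeline(project, build_id)
    if timeline is None:
        return None
    for record in timeline.records or []:
        # Records that were skipped or never started may have no log attached.
        if record.name == record_name and record.log is not None:
            return record.log.id
    return None


def collect_logs_by_record_name(build_client, project, build_ids, record_name="LintDiff"):
    """Fetch one log per build: one timeline call plus one log call each."""
    logs = {}
    for build_id in build_ids:
        log_id = find_log_id_by_record_name(build_client, project, build_id, record_name)
        if log_id is None:
            continue  # no matching record, or the record has no log
        logs[build_id] = build_client.get_build_log_lines(project, build_id, log_id)
    return logs
```

If that assumption holds, each build costs two requests instead of one request per log, which is where the loop above spends its time.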

@tjprescott changed the title from "Find a specific build log is incredibly slow" to "Finding a specific build log is incredibly slow" on Feb 15, 2024
@konrad-jamrozik
Member

@tjprescott for LintDiff build logs you can query our pipeline witness Kusto table.

Example query:

let start = ago(6h);
let end = now();
let buildIds = "*";
SpecsPipelinesBuildsJobLogs("LintDiff", start, end, buildIds)
| where Message has "Starting: LintDiff"

This uses my SpecsPipelinesBuildsJobLogs function.
