Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create a nightly pipeline for GA/EUS releases #4046

Draft
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

tkoscieln
Copy link
Contributor

@tkoscieln tkoscieln commented Mar 28, 2024

Separate RHEL GA/EUS releases testing to nightly pipeline from the pipeline ran on every PR.

This pull request includes:

  • adequate testing for the new functionality or fixed issue

@tkoscieln tkoscieln force-pushed the optimize_pipeline_execution branch 3 times, most recently from 887bdd8 to 4905193 Compare April 18, 2024 13:41
@tkoscieln tkoscieln marked this pull request as ready for review April 18, 2024 13:43
@tkoscieln
Copy link
Contributor Author

Sample succesfull run of the GA/EUS Nightly pipeline (debug):
https://gitlab.com/redhat/services/products/image-builder/ci/osbuild-composer/-/pipelines/1258487788

Copy link
Contributor

@jrusz jrusz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please rename it to just GA pipeline? The nightly pipeline is called that way because it's testing nightly composes while this is supposed to be testing GA composes which is something else, nightly GA doesn't make sense. So please also update all instances of "ga" and "nightly" used together to just "ga".

The rules don't seem to work, I see many jobs on GA runners on the regular PR pipeline.

.gitlab-ci.yml Outdated Show resolved Hide resolved
.gitlab-ci.yml Show resolved Hide resolved
Copy link
Contributor

@atodorov atodorov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments

.gitlab-ci.yml Outdated
@@ -143,7 +158,7 @@ Packer:
stage: test
extends: .terraform
rules:
- if: '$CI_PIPELINE_SOURCE != "schedule"'
- !reference [.ga/eus_rules_all, rules]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this also include - !reference [.upstream_rules_all, rules] ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought since the container build runs on rhel 8.9 GA I would move it out of the regular pipeline as well, but I can add it back.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm, I don't really know what this job does but the old rule means "execute on PRs" (e.g. the job wasn't scheduled via pipeline schedules) while the new one means the opposite: "execute on this particular scheduled pipeline"

.gitlab-ci.yml Outdated
variables:
SCRIPT: regression-include-excluded-packages.sh

regression-old-worker-new-composer:
rules:
- !reference [.ga/eus_rules_all, rules]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing - !reference [.upstream_rules_all, rules] ? or rather shouldn't this be the same as other regression- jobs ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, due to both runners being rhel ga releases, I thought it would be best to move them out of the regular pipeline, but can add them back.

.gitlab-ci.yml Outdated Show resolved Hide resolved
Copy link
Contributor

@jrusz jrusz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking that it would be enough to just call it GA not GA/EUS because there is also ELS and idk what else and it's basically all a subset of GA. Every released RHEL version no matter the support level can still be considered GA. What do you think @atodorov ?

Also the rules don't seem to work, I see many GA jobs in the PR pipeline. The way these rules work by default is that whenever it matches the conditions it will schedule the job. Since you've added these new rules it will enabled to run it in the scheduled GA pipeline but there is nothing preventing it from running on PRs for that I think you'll need to modify the upstream rules to exclude these jobs

@@ -9,7 +9,11 @@ fi

COMPOSE_ID=$(cat COMPOSE_ID)
COMPOSER_NVR=$(cat COMPOSER_NVR)
MESSAGE="\"Nightly pipeline execution on *$COMPOSE_ID* with *$COMPOSER_NVR* finished with status *$1* $2 \n QE: @atodorov, @jrusz\n Link to results: $CI_PIPELINE_URL\n For edge testing status please see https://url.corp.redhat.com/edge-pipelines \""
if [ "$3" == "ga" ]; then
MESSAGE="\"Nightly GA/EUS releases pipeline execution finished with status *$1* $2 \n QE: @atodorov, @jrusz, @tkosciel\n Link to results: $CI_PIPELINE_URL \""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
MESSAGE="\"Nightly GA/EUS releases pipeline execution finished with status *$1* $2 \n QE: @atodorov, @jrusz, @tkosciel\n Link to results: $CI_PIPELINE_URL \""
MESSAGE="\"GA/EUS composes pipeline execution finished with status *$1* $2 \n QE: @atodorov, @jrusz, @tkosciel\n Link to results: $CI_PIPELINE_URL \""

@jrusz
Copy link
Contributor

jrusz commented Apr 30, 2024

So what seems to work is if you take your exclude rule and add it to the upstream rules and build rules for example:

.build_rules:
  rules:
    - if: '$CI_PIPELINE_SOURCE != "schedule" && $SKIP_CI == "false" && $RUNNER !~ /[\S]+rhel-[\S]+-(?:(?:ga)|(?:eus))[\S]+/'
    - if: '$CI_PIPELINE_SOURCE != "schedule" && $SKIP_CI == "true" && $RUNNER !~ /[\S]+rhel-[\S]+-(?:(?:ga)|(?:eus))[\S]+/'
      when: manual

.upstream_rules_all:
  rules:
    - if: '$CI_PIPELINE_SOURCE != "schedule" && $RUNNER !~ /[\S]+rhel-[\S]+-(?:(?:ga)|(?:eus))[\S]+/'

There is also the upstream rule for x86_64 which will require a little bit different regex and then there are a few jobs which have their own rules defined like aws.sh and some regression jobs. Use https://gitlab.com/redhat/services/products/image-builder/ci/osbuild-composer/-/ci/lint to check out if it works, just paste the whole CI config there, tick the checkbox to Simulate a pipeline created for the default branch and then just search for any -ga jobs that are still there.

@atodorov
Copy link
Contributor

I'm thinking that it would be enough to just call it GA not GA/EUS because there is also ELS and idk what else and it's basically all a subset of GA. Every released RHEL version no matter the support level can still be considered GA. What do you think @atodorov ?

+1 for the proposed simplified naming. Let's use what you propose for now and update it later if need be.

.gitlab-ci.yml Outdated
@@ -131,7 +146,7 @@ Container:
stage: rpmbuild
extends: .terraform
rules:
- !reference [.build_rules, rules]
- !reference [.ga/eus_rules_all, rules]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this change and keep the old rules.

We want to build the container image on pull requests to make sure that the build isn't broken, but we don't need the container when exercising the pipelines against a nightly RHEL compose or against the GA composes.

@tkoscieln tkoscieln marked this pull request as draft April 30, 2024 12:10
Copy link
Contributor

@jrusz jrusz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added some comments in the code. One of the main issues I see right now is that I don't think it will run on main after a PR is merged. It's defined as one of the goals in the Jira ticket, we just want to remove these jobs from running on each PR.

@thozza I wanted to ask you if you agree with this change. The goal here is to not run tests on GA releases on PRs but instead run them only on main and in a schedule similar to current nightly pipelines. This will make PR testing faster with minimum impact to quality if and an issue does occur on some of the GA releases we'll see it reported on main and in Slack along with nightly pipelines.

.gitlab-ci.yml Outdated
- schutzbot/slack_notification.sh SUCCESS ":partymeow:"
- schutzbot/slack_notification.sh SUCCESS ":partymeow:" nightly

GA/EUS_FAIL:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leftover from rename, please change to just GA

.gitlab-ci.yml Outdated
script:
- schutzbot/slack_notification.sh FAILED ":big-sad:" ga

GA/EUS_SUCCESS:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leftover from rename, please change to just GA

@@ -9,7 +9,11 @@ fi

COMPOSE_ID=$(cat COMPOSE_ID)
COMPOSER_NVR=$(cat COMPOSER_NVR)
MESSAGE="\"Nightly pipeline execution on *$COMPOSE_ID* with *$COMPOSER_NVR* finished with status *$1* $2 \n QE: @atodorov, @jrusz\n Link to results: $CI_PIPELINE_URL\n For edge testing status please see https://url.corp.redhat.com/edge-pipelines \""
if [ "$3" == "ga" ]; then
MESSAGE="\"GA/EUS composes pipeline execution finished with status *$1* $2 \n QE: @atodorov, @jrusz, @tkosciel\n Link to results: $CI_PIPELINE_URL \""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leftover from rename, please change to just GA

.gitlab-ci.yml Outdated
@@ -426,7 +447,7 @@ aws.sh:
extends: .integration
rules:
# Skip rhel-8.4-ga-x86_64
- if: '$CI_PIPELINE_SOURCE != "schedule" && $RUNNER !~ /[\S]+rhel-8.4-[\S]+/'
- if: '$CI_PIPELINE_SOURCE != "schedule" && $RUNNER !~ /[\S]+rhel-8.4-[\S]+/ && $RUNNER !~ /[\S]+rhel-[\S]+-(?:(?:ga)|(?:eus))[\S]+/'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this rule was just skipping 8.4 GA and your new rule skips ALL GA releases it should include that one as well and could be replaced instead of adding it on top.

Comment on lines +79 to +88
.ga_rules_all:
rules:
- if: '$CI_PIPELINE_SOURCE == "schedule" && $RUNNER =~ /[\S]+rhel-[\S]+-(?:(?:ga)|(?:eus))[\S]+/ && $NIGHTLY== "false"'
- if: '$CI_PIPELINE_SOURCE == "schedule" && $RUNNER =~ /[\S]+rhel-[\S]+-(?:(?:ga)|(?:eus))[\S]+/ && $NIGHTLY == "false"'

.ga_rules_x86_64:
rules:
- if: '$CI_PIPELINE_SOURCE == "schedule" && $RUNNER =~ /[\S]+rhel-[\S]+-(?:(?:ga)|(?:eus))[\S]+/ && $RUNNER =~ "/^.*(x86_64).*$/" && $NIGHTLY== "false"'
- if: '$CI_PIPELINE_SOURCE == "schedule" && $RUNNER =~ /[\S]+rhel-[\S]+-(?:(?:ga)|(?:eus))[\S]+/ && $RUNNER =~ "/^.*(x86_64).*$/" && $NIGHTLY == "false"'

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's sufficient to rely on NIGHTLY == false in addition to being a scheduled pipeline. Why not add a new variable like GA == true and check that instead? If we add a new schedule which will also set NIGHTLY=false but will do something else it would trigger these jobs as well.

Also the rules are duplicate, just extra space on the second one.

In addition to this we also want to run all these jobs on main just like before which I don't think will work now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not add a new variable like GA == true and check that instead?

@jrusz see #4046 (comment).

If we add a new schedule which will also set NIGHTLY=false but will do something else it would trigger these jobs as well.

That's true but let's worry about it when it happens. Currently this is not what we intend to do.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok fair enough, let's keep that part as it is now.

# WHITELIST
- if: $RUNNER =~ "/^.*(rhel-8.*|rhel-9.*|centos-stream-8|centos-stream-9).*$/" && $CI_PIPELINE_SOURCE != "schedule"
- if: $RUNNER =~ "/^.*(rhel-8.*|rhel-9.*|centos-stream-8|centos-stream-9).*$/" && $CI_PIPELINE_SOURCE != "schedule" && $RUNNER !~ /[\S]+rhel-[\S]+-(?:(?:ga)|(?:eus))[\S]+/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment above # WHITELIST means that the rule specified here is defining which runners are allowed to run this job. The rule you're adding is doing the opposite, it's defining which runner to NOT run this on which would be a # BLACKLIST rule.

We have added these comments in the past above these seemingly complicated rules to indicate what they're trying to do to help anyone looking at this understand what's the rule trying to do. This change you're adding would confuse the user now.

variables:
SCRIPT: regression-excluded-dependency.sh

regression-include-excluded-packages:
extends: .regression
rules:
# BLACKLIST: Skipped on fedora systems
- if: $RUNNER !~ "/^.*(fedora).*$/" && $CI_PIPELINE_SOURCE != "schedule"
- if: $RUNNER !~ "/^.*(fedora).*$/" && $CI_PIPELINE_SOURCE != "schedule" && $RUNNER !~ /[\S]+rhel-[\S]+-(?:(?:ga)|(?:eus))[\S]+/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to my comment above. The rule addition is applied correctly here as it is a blacklist rule just please update the comment above.

@@ -287,8 +304,9 @@ regression-insecure-repo:
extends: .regression
rules:
# WHITELIST
- if: $RUNNER =~ "/^.*(rhel-*).*$/" && $CI_PIPELINE_SOURCE != "schedule"
- if: $RUNNER =~ "/^.*(rhel-*).*$/" && $CI_PIPELINE_SOURCE != "schedule" && $RUNNER !~ /[\S]+rhel-[\S]+-(?:(?:ga)|(?:eus))[\S]+/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update to a whitelist rule. See comment above for details.

@@ -297,8 +315,9 @@ regression-no-explicit-rootfs-definition:
extends: .regression
rules:
# BLACKLIST: Skipped on fedora systems
- if: $RUNNER !~ "/^.*(fedora).*$/" && $CI_PIPELINE_SOURCE != "schedule"
- if: $RUNNER !~ "/^.*(fedora).*$/" && $CI_PIPELINE_SOURCE != "schedule" && $RUNNER !~ /[\S]+rhel-[\S]+-(?:(?:ga)|(?:eus))[\S]+/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the comment. See my comment above for details.

@thozza
Copy link
Member

thozza commented May 3, 2024

@thozza I wanted to ask you if you agree with this change. The goal here is to not run tests on GA releases on PRs but instead run them only on main and in a schedule similar to current nightly pipelines. This will make PR testing faster with minimum impact to quality if and an issue does occur on some of the GA releases we'll see it reported on main and in Slack along with nightly pipelines.

We should keep testing on PRs at least for GA releases used in our SaaS deployment. Otherwise, this LGTM.

@ondrejbudai @croissanne WDYT?

@thozza
Copy link
Member

thozza commented May 21, 2024

So, for the SaaS version, we need to keep testing the latest RHEL-9 GA release (9.4).

WRT the specific test cases, the following ones are relevant for the Saas:

@ondrejbudai @croissanne feel free to add / correct me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants