Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

enable/disable some e2e tests; rm TODOs; rework regex #4350

Merged
merged 4 commits into from Nov 19, 2020

Conversation

kolyshkin
Copy link
Collaborator

@kolyshkin kolyshkin commented Nov 6, 2020

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

TL;DR:

  • filter out some flaky tests in e2e-features (less flakes in CI 馃帀)
  • remove some (obsoleted?) elements from the skip regex in e2e
  • simplify adding tests to skip (improve maintainability 馃帀)
  1. contrib/test/int/e2e: rm obsoleted TODO

    Commit 7d2bde1 (Dec 8 2017) adds the TODO item, and excludes
    the "should.support.inline.execution.and.attach" test case.

    Now, commit 54d1971 (Jun 24 2019) removes this test from the
    exclude regexp, but forgots to remove the TODO.

  2. contrib/test/int/e2e: rework skip regex

    In its current form, it is very hard to read, maintain, or make any
    sense out of.

    Play with ansible/jinja a bit to enable much simpler way to list
    the tests to skip.

    The downside is we have to play some tricks to create a regular
    expression:

    1. join the elements of skip_tests via |
    2. replace spaces with \\s
    3. prepend square brackets and parentheses with \\

    This commit also removes some apparently obsoleted elements from the regexp.

  3. contrib/test/int/e2e-features: rework "skip" regex
    

See the previous commit for details.

  1. contrib/test/int/e2e-features: skip Serial tests

    This is mostly done to fix the flake caused by the test case

    Kubernetes e2e suite: [sig-network] IngressClass [Feature:Ingress] should not set default value if no default IngressClass [Serial]

    which fails like this:

    ingressclasses.networking.k8s.io "ingressclass1" already exists

    This test depends on a system state and thus can't be run in parallel.
    It was marked as [Serial] recently (see Mark default ingressclass tests serial, do not set default ingressclass in conformance聽kubernetes/kubernetes#93427) but I just found out
    we do not skip Serial tests in e2e-features.

    Fix this.

Which issue(s) this PR fixes:

None

Special notes for your reviewer:

Does this PR introduce a user-facing change?

None

@openshift-ci-robot openshift-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. labels Nov 6, 2020
@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 6, 2020
@kolyshkin kolyshkin changed the title [WIP]] re-enable some e2e tests; rm TODOs [WIP] re-enable some e2e tests; rm TODOs Nov 6, 2020
@codecov
Copy link

codecov bot commented Nov 6, 2020

Codecov Report

Merging #4350 (22ce1d7) into master (0fcc2a5) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #4350   +/-   ##
=======================================
  Coverage   39.29%   39.29%           
=======================================
  Files         112      112           
  Lines        8835     8835           
=======================================
  Hits         3472     3472           
  Misses       4977     4977           
  Partials      386      386           

@kolyshkin kolyshkin force-pushed the e2e-nits branch 2 times, most recently from 2e33868 to 6e738b8 Compare November 6, 2020 03:35
@kolyshkin
Copy link
Collaborator Author

OK so I managed to insert some invisible unicode charaters into e2e.yml on which Ansible choked. This is supposedly fixed now.

@kolyshkin
Copy link
Collaborator Author

Using the following machinery to see the failed tests (from downloaded e2e.log files):

$ grep -a '"failures"' ~/tmp/e2e*.log  | sed 's/^.*"failures":\["//' | sed 's/"\]}$//' | sed 's/","/\n/g' | sort | uniq -c | sort -nr

which gives me quite a lot (now with --ginkgo.skip=\[Slow\]|\[Flaky\]).

Next thing, we can add a grep to test the skip regex, e.g.

$  grep -a '"failures"' ~/tmp/e2e*.log  | sed 's/^.*"failures":\["//' | sed 's/"\]}$//' | sed 's/","/\n/g' | sort | uniq -c | sort -nr | grep -vE '\[Slow\]|\[Serial\]|\[Disruptive\]|\[Flaky\]|\[Feature|\[sig-instrumentation\] MetricsGrabber'
    101 [sig-storage] CSI Volumes [Driver: csi-hostpath] [Testpattern: Dynamic PV (default fs)] subPath should support readOnly file specified in the volumeMount [LinuxOnly]
     67 [sig-storage] PersistentVolumes-local  [Volume type: block] One pod requesting one prebound PVC should be able to mount volume and write from pod1
     55 [sig-storage] PersistentVolumes-local  [Volume type: dir-link] One pod requesting one prebound PVC should be able to mount volume and write from pod1
     55 [sig-network] EndpointSliceMirroring should mirror a custom Endpoints resource through create update and delete
     47 [sig-storage] In-tree Volumes [Driver: hostPathSymlink] [Testpattern: Inline-volume (default fs)] subPath should support existing directory
     44 [sig-storage] In-tree Volumes [Driver: local][LocalVolumeType: block] [Testpattern: Pre-provisioned PV (default fs)] subPath should support file as subpath [LinuxOnly]
     43 [sig-storage] In-tree Volumes [Driver: local][LocalVolumeType: dir] [Testpattern: Pre-provisioned PV (default fs)] subPath should support existing directories when readOnly specified in the volumeSource
     33 [sig-storage] In-tree Volumes [Driver: local][LocalVolumeType: dir-link] [Testpattern: Pre-provisioned PV (default fs)] subPath should support existing directory
     25 [sig-storage] In-tree Volumes [Driver: local][LocalVolumeType: dir-bindmounted] [Testpattern: Pre-provisioned PV (default fs)] subPath should support existing single file [LinuxOnly]
     15 [sig-api-machinery] AdmissionWebhook [Privileged:ClusterAdmin] listing validating webhooks should work [Conformance]
     11 [sig-storage] CSI Volumes [Driver: csi-hostpath] [Testpattern: Dynamic PV (block volmode)] provisioning should provision storage with pvc data source
      7 [sig-storage] PersistentVolumes-local  [Volume type: block] Two pods mounting a local volume one after the other should be able to write from pod1 and read from pod2
      7 [sig-storage] CSI Volumes [Driver: csi-hostpath] [Testpattern: Dynamic PV (block volmode)] volumes should store data

Considering some tests were disruptive, let's redo CI with this skip regex...

@kolyshkin
Copy link
Collaborator Author

Also, the way skip regex is specified is horrendous. Surely we can add some bash magic to create a regex out of array, but I want to try this first: onsi/ginkgo#735

@kolyshkin
Copy link
Collaborator Author

$ grep -a '"failures"' ~/tmp/e2e*.log  | sed 's/^.*"failures":\["//' | sed 's/"\]}$//' | sed 's/","/\n/g' | sort | uniq -c | sort -nr
    459 [sig-network] Services should have session affinity work for NodePort service [LinuxOnly] [Conformance]
    387 [sig-network] Services should be able to switch session affinity for NodePort service [LinuxOnly] [Conformance]
    328 [sig-storage] In-tree Volumes [Driver: local][LocalVolumeType: block] [Testpattern: Pre-provisioned PV (block volmode)] volumes should store data
    254 [sig-storage] PersistentVolumes-local  [Volume type: block] One pod requesting one prebound PVC should be able to mount volume and write from pod1
    237 [sig-network] Services should have session affinity timeout work for NodePort service [LinuxOnly] [Conformance]
    174 [sig-storage] CSI Volumes [Driver: csi-hostpath] [Testpattern: Dynamic PV (block volmode)] provisioning should provision storage with pvc data source
    164 [sig-storage] PersistentVolumes-local  [Volume type: block] One pod requesting one prebound PVC should be able to mount volume and read from pod1
    150 [sig-storage] PersistentVolumes-local  [Volume type: block] Two pods mounting a local volume at the same time should be able to write from pod1 and read from pod2
    102 [sig-storage] CSI Volumes [Driver: csi-hostpath] [Testpattern: Dynamic PV (block volmode)] volumes should store data
     50 [sig-storage] PersistentVolumes-local  [Volume type: block] Two pods mounting a local volume one after the other should be able to write from pod1 and read from pod2

Let's see if I can have onsi/ginkgo#736 merged and list individual tests in a sane manner (or maybe don't wait but add some bash machinery to make a regex out of list).

@kolyshkin
Copy link
Collaborator Author

Mocked up a better-looking way to specify the tests to skip. Not an Ansible expert here, so not sure if it'll work.

@kolyshkin kolyshkin force-pushed the e2e-nits branch 2 times, most recently from 79fb469 to d9536af Compare November 8, 2020 07:41
@haircommander
Copy link
Member

/retest

@fidencio
Copy link
Contributor

@kolyshkin, LGTM! Thanks for improving the readability!

@haircommander
Copy link
Member

is this ready @kolyshkin

@kolyshkin
Copy link
Collaborator Author

is this ready @kolyshkin

not yet -- sorry, I am on vacation but will try to finish

@kolyshkin kolyshkin force-pushed the e2e-nits branch 3 times, most recently from a28a521 to f838b84 Compare November 13, 2020 20:55
@kolyshkin
Copy link
Collaborator Author

wow, this finally works!

@kolyshkin
Copy link
Collaborator Author

/retest

@kolyshkin kolyshkin changed the title [WIP] re-enable some e2e tests; rm TODOs re-enable some e2e tests; rm TODOs Nov 14, 2020
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 14, 2020
@kolyshkin
Copy link
Collaborator Author

/retest

(e2e-aws is still supposed to fail atm)

@kolyshkin
Copy link
Collaborator Author

/test integration_fedora

@haircommander
Copy link
Member

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Nov 17, 2020
@kolyshkin
Copy link
Collaborator Author

/test e2e_features_fedora
/test e2e-aws

@kolyshkin
Copy link
Collaborator Author

/test e2e_crun

Commit 7d2bde1 (Dec 8 2017) adds the TODO item, and excludes
the "should.support.inline.execution.and.attach" test case.

Now, commit 54d1971 (Jun 24 2019) removes this test from the
exclude regexp, but forgots to remove the TODO.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
In its current form, it is very hard to read, maintain, or make any
sense out of.

Play with ansible/jinja a bit to enable much simpler way to list
the tests to skip.

The downside is we have to play some tricks to create a regular
expression:

 1. join the elements of skip_tests via |
 2. replace spaces with \\s
 3. prepend square brackets and parentheses with \\

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Nov 18, 2020
@kolyshkin
Copy link
Collaborator Author

/test e2e_cgroupv2
/test e2e-gcp

See the previous commit for details.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This is mostly done to fix the flake caused by the test case

> Kubernetes e2e suite: [sig-network] IngressClass [Feature:Ingress] should not set default value if no default IngressClass [Serial]

which fails like this:

> ingressclasses.networking.k8s.io "ingressclass1" already exists

This test depends on a system state and thus can't be run in parallel.
It was marked as [Serial] recently (see [1]) but I just found out
we do not skip Serial tests in e2e-features.

Fix this.

[1] kubernetes/kubernetes#93427

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
@kolyshkin kolyshkin changed the title re-enable some e2e tests; rm TODOs enable/disable some e2e tests; rm TODOs; rework regex Nov 19, 2020
@kolyshkin
Copy link
Collaborator Author

contrib/test/int/e2e-features: skip Serial tests

Some context to this: #4253 (comment)

@fidencio
Copy link
Contributor

/retest
/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Nov 19, 2020
@openshift-merge-robot
Copy link
Contributor

openshift-merge-robot commented Nov 19, 2020

@kolyshkin: The following test failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
ci/openshift-jenkins/e2e_crun_cgroupv2 22ce1d7 link /test e2e_cgroupv2

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Copy link
Member

@saschagrunert saschagrunert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kolyshkin, mrunalp, saschagrunert

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [kolyshkin,mrunalp,saschagrunert]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit 2428fcc into cri-o:master Nov 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm Indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants