Syncing latest changes from upstream main for ramen #269

Merged
merged 16 commits into from
May 16, 2024

Conversation

df-build-team

PR containing the latest commits from upstream main branch

rakeshgm and others added 16 commits May 14, 2024 13:54
add ConfigMap under dr-cluster kustomize
transformer to update label "app: ramen-dr-cluster"

Signed-off-by: rakeshgm <rakeshgm@redhat.com>
Signed-off-by: Raghavendra Talur <raghavendra.talur@gmail.com>
Signed-off-by: Raghavendra Talur <raghavendra.talur@gmail.com>
Signed-off-by: Raghavendra Talur <raghavendra.talur@gmail.com>
Signed-off-by: Raghavendra Talur <raghavendra.talur@gmail.com>
Signed-off-by: Raghavendra Talur <raghavendra.talur@gmail.com>
Signed-off-by: Raghavendra Talur <raghavendra.talur@gmail.com>
Signed-off-by: Raghavendra Talur <raghavendra.talur@gmail.com>
Signed-off-by: Raghavendra Talur <raghavendra.talur@gmail.com>
When the same cache item is fetched concurrently, for example by the
same addon on 2 clusters, or by an addon and the fetch cron job running
at the same time, one fetcher can delete the temporary file used by the
other fetcher, causing this error:

    drenv.commands.Error: Command failed:
       command: ('addons/rook-cephfs/start', 'dr1')
       exitcode: 1
       error:
          Traceback (most recent call last):
            File "/home/.../go/src/github.com/ramendr/ramen/test/addons/rook-cephfs/start", line 46, in <module>
              deploy(cluster)
            File "/home/.../go/src/github.com/ramendr/ramen/test/addons/rook-cephfs/start", line 17, in deploy
              cache.fetch(".", path)
            File "/home/.../go/src/github.com/ramendr/ramen/test/drenv/cache.py", line 28, in fetch
              os.rename(tmp, dest)
          FileNotFoundError: [Errno 2] No such file or directory: '/home/.../.cache/drenv/addons/rook-cephfs.yaml.tmp'
                             -> '/home/.../.cache/drenv/addons/rook-cephfs.yaml'

Fixed by using a temporary file per process. If we have 2 fetchers, the
last one wins, renaming its temporary file to the actual cache file.
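
A minimal sketch of the per-process temporary file idea, not the actual
drenv/cache.py code. The function name `write_cache_item` and the
`.tmp.<pid>` suffix are assumptions made for illustration. Each process
writes to its own temporary file and then renames it over the cache
file, so no fetcher can remove or rename a file another fetcher is
still using.

```python
import os


def write_cache_item(dest, data):
    """
    Write data (bytes) to dest using a temporary file private to this
    process. Hypothetical helper, shown only to illustrate the technique.
    """
    dest = os.path.abspath(dest)
    os.makedirs(os.path.dirname(dest), exist_ok=True)

    # The pid makes the temporary name unique per process, so concurrent
    # fetchers of the same item never share a temporary file.
    tmp = f"{dest}.tmp.{os.getpid()}"
    try:
        with open(tmp, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        # Rename over the cache file. With several writers the last
        # rename wins, and readers never see a partially written file.
        os.replace(tmp, dest)
    except BaseException:
        # Clean up only our own temporary file.
        try:
            os.remove(tmp)
        except FileNotFoundError:
            pass
        raise
```

The rename is atomic on POSIX systems when the temporary file and the
cache file are on the same file system, which holds here since both
live in the same directory.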

Example run with multiple fetchers:

    $ drenv clear
    2024-05-13 00:15:59,145 INFO    [main] Clearing cache
    2024-05-13 00:15:59,146 INFO    [main] Cache cleared in 0.00 seconds

    $ for i in 1 2 3 4; do (drenv fetch envs/regional-dr.yaml &); done
    2024-05-13 00:15:59,318 INFO    [rdr] Fetching
    2024-05-13 00:15:59,320 INFO    [rdr] Running addons/rook-operator/fetch
    2024-05-13 00:15:59,321 INFO    [rdr] Fetching
    2024-05-13 00:15:59,322 INFO    [rdr] Running addons/rook-cluster/fetch
    2024-05-13 00:15:59,322 INFO    [rdr] Running addons/rook-toolbox/fetch
    2024-05-13 00:15:59,323 INFO    [rdr] Running addons/rook-operator/fetch
    2024-05-13 00:15:59,323 INFO    [rdr] Running addons/rook-cephfs/fetch
    2024-05-13 00:15:59,323 INFO    [rdr] Running addons/recipe/fetch
    2024-05-13 00:15:59,323 INFO    [rdr] Running addons/csi-addons/fetch
    2024-05-13 00:15:59,325 INFO    [rdr] Running addons/rook-cluster/fetch
    2024-05-13 00:15:59,325 INFO    [rdr] Running addons/rook-toolbox/fetch
    2024-05-13 00:15:59,325 INFO    [rdr] Running addons/rook-cephfs/fetch
    2024-05-13 00:15:59,327 INFO    [rdr] Running addons/ocm-controller/fetch
    2024-05-13 00:15:59,333 INFO    [rdr] Running addons/csi-addons/fetch
    2024-05-13 00:15:59,341 INFO    [rdr] Running addons/ocm-controller/fetch
    2024-05-13 00:15:59,345 INFO    [rdr] Running addons/recipe/fetch
    2024-05-13 00:15:59,356 INFO    [rdr] Fetching
    2024-05-13 00:15:59,365 INFO    [rdr] Running addons/rook-operator/fetch
    2024-05-13 00:15:59,371 INFO    [rdr] Fetching
    2024-05-13 00:15:59,374 INFO    [rdr] Running addons/rook-operator/fetch
    2024-05-13 00:15:59,377 INFO    [rdr] Running addons/rook-cluster/fetch
    2024-05-13 00:15:59,378 INFO    [rdr] Running addons/csi-addons/fetch
    2024-05-13 00:15:59,388 INFO    [rdr] Running addons/rook-cluster/fetch
    2024-05-13 00:15:59,391 INFO    [rdr] Running addons/recipe/fetch
    2024-05-13 00:15:59,395 INFO    [rdr] Running addons/rook-cephfs/fetch
    2024-05-13 00:15:59,397 INFO    [rdr] Running addons/rook-cephfs/fetch
    2024-05-13 00:15:59,411 INFO    [rdr] Running addons/ocm-controller/fetch
    2024-05-13 00:15:59,412 INFO    [rdr] Running addons/csi-addons/fetch
    2024-05-13 00:15:59,414 INFO    [rdr] Running addons/rook-toolbox/fetch
    2024-05-13 00:15:59,418 INFO    [rdr] Running addons/recipe/fetch
    2024-05-13 00:15:59,419 INFO    [rdr] Running addons/rook-toolbox/fetch
    2024-05-13 00:15:59,450 INFO    [rdr] Running addons/ocm-controller/fetch
    2024-05-13 00:16:00,521 INFO    [rdr] addons/rook-toolbox/fetch completed in 1.20 seconds
    2024-05-13 00:16:00,638 INFO    [rdr] addons/csi-addons/fetch completed in 1.26 seconds
    2024-05-13 00:16:00,793 INFO    [rdr] addons/rook-cephfs/fetch completed in 1.47 seconds
    2024-05-13 00:16:00,804 INFO    [rdr] addons/rook-cephfs/fetch completed in 1.41 seconds
    2024-05-13 00:16:00,830 INFO    [rdr] addons/rook-toolbox/fetch completed in 1.51 seconds
    2024-05-13 00:16:00,831 INFO    [rdr] addons/csi-addons/fetch completed in 1.51 seconds
    2024-05-13 00:16:00,922 INFO    [rdr] addons/rook-cluster/fetch completed in 1.54 seconds
    2024-05-13 00:16:00,938 INFO    [rdr] addons/rook-toolbox/fetch completed in 1.52 seconds
    2024-05-13 00:16:00,987 INFO    [rdr] addons/rook-cephfs/fetch completed in 1.66 seconds
    2024-05-13 00:16:01,106 INFO    [rdr] addons/rook-toolbox/fetch completed in 1.69 seconds
    2024-05-13 00:16:01,130 INFO    [rdr] addons/rook-cluster/fetch completed in 1.81 seconds
    2024-05-13 00:16:01,191 INFO    [rdr] addons/csi-addons/fetch completed in 1.86 seconds
    2024-05-13 00:16:01,234 INFO    [rdr] addons/rook-cluster/fetch completed in 1.91 seconds
    2024-05-13 00:16:01,267 INFO    [rdr] addons/rook-cluster/fetch completed in 1.88 seconds
    2024-05-13 00:16:01,314 INFO    [rdr] addons/csi-addons/fetch completed in 1.90 seconds
    2024-05-13 00:16:01,414 INFO    [rdr] addons/rook-cephfs/fetch completed in 2.02 seconds
    2024-05-13 00:16:01,591 INFO    [rdr] addons/recipe/fetch completed in 2.25 seconds
    2024-05-13 00:16:01,597 INFO    [rdr] addons/recipe/fetch completed in 2.27 seconds
    2024-05-13 00:16:01,696 INFO    [rdr] addons/recipe/fetch completed in 2.31 seconds
    2024-05-13 00:16:01,938 INFO    [rdr] addons/recipe/fetch completed in 2.52 seconds
    2024-05-13 00:16:02,094 INFO    [rdr] addons/rook-operator/fetch completed in 2.73 seconds
    2024-05-13 00:16:02,248 INFO    [rdr] addons/rook-operator/fetch completed in 2.87 seconds
    2024-05-13 00:16:02,252 INFO    [rdr] addons/rook-operator/fetch completed in 2.93 seconds
    2024-05-13 00:16:02,321 INFO    [rdr] addons/rook-operator/fetch completed in 3.00 seconds
    2024-05-13 00:16:05,471 INFO    [rdr] addons/ocm-controller/fetch completed in 6.02 seconds
    2024-05-13 00:16:05,472 INFO    [rdr] Fetching finishied in 6.10 seconds
    2024-05-13 00:16:05,918 INFO    [rdr] addons/ocm-controller/fetch completed in 6.51 seconds
    2024-05-13 00:16:05,919 INFO    [rdr] Fetching finishied in 6.56 seconds
    2024-05-13 00:16:06,020 INFO    [rdr] addons/ocm-controller/fetch completed in 6.69 seconds
    2024-05-13 00:16:06,021 INFO    [rdr] Fetching finishied in 6.70 seconds
    2024-05-13 00:16:06,394 INFO    [rdr] addons/ocm-controller/fetch completed in 7.05 seconds
    2024-05-13 00:16:06,394 INFO    [rdr] Fetching finishied in 7.07 seconds

Fixes: RamenDR#1386
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
The csi-hostpath-driver and volumesnapshots addons start much slower
with minikube 1.33. Replacing them with rook ceph rbd storage, the
kubevirt environments start up to 1.93 times faster.

Start times before and after this change:

| env          | local before | local after | lab before | lab after |
|--------------|--------------|-------------|------------|-----------|
| rdr-kubevirt |          600 |         475 |        920 |       603 |
| kubevirt     |          270 |         230 |        603 |       312 |
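
The speedup figure can be checked from the table with a short
calculation. This is only a sketch: the dictionary layout and the
`speedup` helper are made up for illustration, and the numbers are
copied from the table above (the unit is not stated there; the ratio
does not depend on it).

```python
# Start times copied from the table: (before, after) per environment and site.
start_times = {
    "rdr-kubevirt": {"local": (600, 475), "lab": (920, 603)},
    "kubevirt": {"local": (270, 230), "lab": (603, 312)},
}


def speedup(before, after):
    """Return how many times faster the start is after the change."""
    return before / after


# Compute the ratio for every environment and site.
ratios = {
    (env, site): speedup(before, after)
    for env, sites in start_times.items()
    for site, (before, after) in sites.items()
}

# Find the case with the largest improvement.
best = max(ratios, key=ratios.get)

for (env, site), ratio in sorted(ratios.items()):
    print(f"{env:12} {site:5} {ratio:.2f}x")
print(f"best: {best[0]} ({best[1]}) {ratios[best]:.2f}x")
```

The largest ratio is kubevirt in the lab, 603 / 312, which is the 1.93
quoted in the message.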

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
It is easier to debug issues with a minimal environment. With
rook-cephfs and the required volumesnapshots minikube addon, the rook
environment is less minimal, but it is still quicker to start compared
with the full regional-dr environment.

Example run:

    $ drenv start envs/rook.yaml
    2024-05-12 21:44:00,426 INFO    [rook] Starting environment
    2024-05-12 21:44:00,483 INFO    [dr1] Starting minikube cluster
    2024-05-12 21:44:00,483 INFO    [dr2] Starting minikube cluster
    2024-05-12 21:44:38,650 INFO    [dr1] Cluster started in 38.17 seconds
    2024-05-12 21:44:39,090 INFO    [dr1/0] Running addons/rook-operator/start
    2024-05-12 21:44:39,090 INFO    [dr1/1] Running addons/csi-addons/start
    2024-05-12 21:44:59,732 INFO    [dr2] Cluster started in 59.25 seconds
    2024-05-12 21:45:00,218 INFO    [dr2/0] Running addons/rook-operator/start
    2024-05-12 21:45:00,218 INFO    [dr2/1] Running addons/csi-addons/start
    2024-05-12 21:45:08,913 INFO    [dr1/1] addons/csi-addons/start completed in 29.82 seconds
    2024-05-12 21:45:13,552 INFO    [dr1/0] addons/rook-operator/start completed in 34.46 seconds
    2024-05-12 21:45:13,552 INFO    [dr1/0] Running addons/rook-cluster/start
    2024-05-12 21:45:30,186 INFO    [dr2/1] addons/csi-addons/start completed in 29.97 seconds
    2024-05-12 21:45:41,444 INFO    [dr2/0] addons/rook-operator/start completed in 41.23 seconds
    2024-05-12 21:45:41,444 INFO    [dr2/0] Running addons/rook-cluster/start
    2024-05-12 21:46:21,806 INFO    [dr1/0] addons/rook-cluster/start completed in 68.25 seconds
    2024-05-12 21:46:21,806 INFO    [dr1/0] Running addons/rook-toolbox/start
    2024-05-12 21:46:25,669 INFO    [dr1/0] addons/rook-toolbox/start completed in 3.86 seconds
    2024-05-12 21:46:25,669 INFO    [dr1/0] Running addons/rook-pool/start
    2024-05-12 21:46:40,768 INFO    [dr1/0] addons/rook-pool/start completed in 15.10 seconds
    2024-05-12 21:46:40,768 INFO    [dr1/0] Running addons/rook-cephfs/start
    2024-05-12 21:47:01,116 INFO    [dr2/0] addons/rook-cluster/start completed in 79.67 seconds
    2024-05-12 21:47:01,116 INFO    [dr2/0] Running addons/rook-toolbox/start
    2024-05-12 21:47:01,689 INFO    [dr1/0] addons/rook-cephfs/start completed in 20.92 seconds
    2024-05-12 21:47:01,689 INFO    [dr1/0] Running addons/rook-cephfs/test
    2024-05-12 21:47:04,421 INFO    [dr2/0] addons/rook-toolbox/start completed in 3.31 seconds
    2024-05-12 21:47:04,421 INFO    [dr2/0] Running addons/rook-pool/start
    2024-05-12 21:47:08,994 INFO    [dr1/0] addons/rook-cephfs/test completed in 7.30 seconds
    2024-05-12 21:47:29,597 INFO    [dr2/0] addons/rook-pool/start completed in 25.18 seconds
    2024-05-12 21:47:29,597 INFO    [dr2/0] Running addons/rook-cephfs/start
    2024-05-12 21:47:44,236 INFO    [dr2/0] addons/rook-cephfs/start completed in 14.64 seconds
    2024-05-12 21:47:44,236 INFO    [dr2/0] Running addons/rook-cephfs/test
    2024-05-12 21:47:51,296 INFO    [dr2/0] addons/rook-cephfs/test completed in 7.06 seconds
    2024-05-12 21:47:51,296 INFO    [rook/0] Running addons/rbd-mirror/start
    2024-05-12 21:48:41,169 INFO    [rook/0] addons/rbd-mirror/start completed in 49.87 seconds
    2024-05-12 21:48:41,169 INFO    [rook/0] Running addons/rbd-mirror/test
    2024-05-12 21:48:52,317 INFO    [rook/0] addons/rbd-mirror/test completed in 11.15 seconds
    2024-05-12 21:48:52,317 INFO    [rook] Environment started in 291.89 seconds

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
We added csi-hostpath-driver as a quick temporary solution until we have
cephfs storage. Now that we have it, we can replace it and enjoy reduced
start time, in particular with minikube 1.33.

To replace csi-hostpath-driver, we have to add cephfs to the volsync
development environment. This is slower locally, but faster in the e2e
lab. For regional-dr, this is always faster, up to 1.82 times faster in
the e2e lab.

The main difference is cluster start time - minikube addons are loaded
before minikube start returns.

Before:

    2024-05-12 23:01:42,844 INFO    [dr2] Cluster started in 433.20 seconds
    2024-05-12 23:02:07,215 INFO    [dr1] Cluster started in 457.57 seconds

After:

    2024-05-12 23:18:13,386 INFO    [hub] Cluster started in 71.87 seconds
    2024-05-12 23:18:46,943 INFO    [dr2] Cluster started in 105.43 seconds

Start time before and after this change:

| env          | local before | local after | lab before | lab after |
|--------------|--------------|-------------|------------|-----------|
| regional-dr  |          636 |         426 |        780 |       427 |
| volsync      |          261 |         352 |        520 |       395 |

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Signed-off-by: Elena Gershkovich <elenage@il.ibm.com>
Signed-off-by: Raghavendra Talur <raghavendra.talur@gmail.com>
It looks like a recent change in pylint triggers this incorrect report:

    drenv/commands.py:234:28: E0606: Possibly using variable
    'input_view' before assignment (possibly-used-before-assignment)

This cannot happen since we don't register proc.stdin if input is None,
so when we reach this block input_view is assigned. However, disabling
the check risks missing a real issue in that block.

Let's change the code so pylint can understand it better. This also makes
it easier to understand for humans. The cost is negligible: 2 temporary
variables are added even when they are never used.
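
A minimal sketch of this kind of change, not the actual
drenv/commands.py code. The function `send_input` and its parameters
are hypothetical. The two variables are assigned unconditionally, so a
static checker can see they always exist before the block that uses
them, even though that block runs only when there is input to send.

```python
def send_input(write, text=None, chunk_size=4):
    """
    Send text in chunks using write(). Hypothetical example of assigning
    variables unconditionally to avoid possibly-used-before-assignment.
    """
    # Previously such variables would be assigned only "if text is not
    # None" and used later in a block reachable only in that case, which
    # the checker cannot prove. Assigning them always costs 2 unused
    # temporary variables when there is no input.
    input_view = memoryview(text.encode() if text is not None else b"")
    input_offset = 0

    # Stands in for "stdin is registered" in the real code.
    sending = text is not None

    while sending:
        chunk = input_view[input_offset:input_offset + chunk_size]
        write(bytes(chunk))
        input_offset += len(chunk)
        if input_offset >= len(input_view):
            sending = False

    return input_offset
```

For example, `send_input(sys.stdout.buffer.write, "hello")` writes the
text in chunks of up to 4 bytes, and calling it with no text writes
nothing.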

Signed-off-by: Nir Soffer <nsoffer@redhat.com>

openshift-ci bot commented May 16, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: df-build-team

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ShyamsundarR ShyamsundarR merged commit b905207 into main May 16, 2024
29 of 30 checks passed