Argo CD: PersistentVolumeClaims for volumes #438

docent-net · 2020-09-03T21:58:25Z

Is your feature request related to a problem? Please describe.

I'm trying to mount remote storage to the argocd repo-server. I can provide volumeMounts list as well as volumes list, but I can't create PVC for those volumes using helm chart.

Describe the solution you'd like

volumes.yaml could contain a definition of StorageClass name for PVC. When provided, the volume mount would be created on top of this PVC.

Describe alternatives you've considered

Alternatively, documentation could be extended with information about using PVC and manual creation of PVC.

Additional context

I can see, that similar feature is already implemented in redis-ha:

argo-helm/charts/argo-cd/charts/redis-ha/templates/redis-ha-statefulset.yaml

Line 285 in d7da8e8

Also, this request is only about repo-server, but similar could be requested for Redis and other components.

If this looks like something useful, I can provide PR.

tlvenn · 2020-10-17T00:53:24Z

I think this is more than useful and should definitely comes outside of the box.
@docent-net any chance you can get a stab at this fairly soon ?

docent-net · 2020-10-18T10:28:26Z

Sure, on it :)

docent-net · 2020-11-20T18:42:29Z

@tlvenn I need to ask a question, about this. Read through ArgoCD docs (e.g. here https://argoproj.github.io/argo-cd/operator-manual/high_availability/#argocd-repo-server) and it looked like a straightforward thing - just create volumeClaimTemplates with proper configuration for the PVC and that's all.

However, repo-server can be replicated, so here it hit me, that maybe we'd need a StatefulSet here instead of Deployment, because there is some state.. or not? Unfortunately, I'm not that familiar w/ArgoCD HA, but after going through the documentation and thinking about it I have 2 assumptions to discuss:

ArgoCD is rather a stateless app (but for the Redis, which is a cache, so its data can be lost). We don't really care about contents of the repo-server, and the only "important" directory is /tmp, where repos are cloned. "important" because the only reason we care about mounting it from the remote repo is that its size can grow and Pod disk can run out of space. In this case, when Pod is lost for some reason (node down, Pod was evicted by scheduler) underlying storage can be safely removed by k8s, as it will not be used anymore and its contents are not important. In this case, we can stay with Deployment resource as it's currently defined and basically create new PVs via volumeClaimTemplates.
We care about the state and contents of /tmp and we need to replace Deployment w/Statefulset to be sure, that replicas use the same storage as before.

After writing this whole post I think that the correct answer is nr.1, but just need to validate it w/you :) Thanks

tlvenn · 2020-11-23T21:46:15Z

Hi @docent-net , I am no expert when it comes to argo cd but my feeling is that the repo server is a cache layer and I would opt for option 1 as well.

tlvenn · 2020-11-24T05:12:06Z

For illustration purpose, not saying this a good idea, Grafana chart for example supports both approaches:

https://github.com/grafana/helm-charts/blob/main/charts/grafana/templates/deployment.yaml#L1
https://github.com/grafana/helm-charts/blob/main/charts/grafana/templates/statefulset.yaml#L1

docent-net · 2020-11-24T07:16:28Z

I'm going w/opt1 - lets face it - the data is stored in /tmp, so let's treat it as temporary :) Thanks, will get into it

github-actions · 2021-04-01T03:05:11Z

Stale issue message

tlvenn · 2021-04-09T17:11:42Z

@docent-net did you make any progress on this ?

jbartlet · 2021-04-22T15:08:12Z

I'm a little confused...I am trying to deploy ArgoCD in HA mode and use a PVC for the repo-server to avoid the disk consumption issue, described above. Of course, there is an issue with one PVC. Option 1 above mentions leave as a Deployment and use volumeClaimTemplates....but those aren't supported for Deployments, only SS. Am I missing something? Do I just convert to SS?

ramanNarasimhan77 · 2021-06-07T17:19:57Z

Can this issue be re-opened? Right now the only possible option without making changes to the helm chart is to use emptyDir

kaidobit · 2021-12-16T22:49:34Z

Please reopen this issue:

I'm using Longhorn replicated blockstorage, indeed it is my default storageclass and I would love to make use of it for argocd.
This should be the default behavior at least for non-HA, @tlvenn gave a good example for a working non-HA/HA Chart:

For illustration purpose, not saying this a good idea, Grafana chart for example supports both approaches:

https://github.com/grafana/helm-charts/blob/main/charts/grafana/templates/deployment.yaml#L1
https://github.com/grafana/helm-charts/blob/main/charts/grafana/templates/statefulset.yaml#L1

Furthermore as @docent-net already pointed out, this Chart is partially making use of the ability to set a custom storageclass:

Additional context

I can see, that similar feature is already implemented in redis-ha:

argo-helm/charts/argo-cd/charts/redis-ha/templates/redis-ha-statefulset.yaml

Line 285 in d7da8e8

{{- if .Values.persistentVolume.enabled }}

snuggie12 · 2022-05-07T04:55:49Z

https://kubernetes.io/docs/concepts/storage/_print/#generic-ephemeral-volumes is alpha in 1.19, beta in 1.21 and stable in 1.23.

pierluigilenoci · 2022-05-25T13:32:48Z

Please reopen this issue.

pierluigilenoci · 2022-05-25T13:51:17Z

@mkilchhofer could you please take a look?

mkilchhofer · 2022-05-25T17:42:12Z

@pierluigilenoci I don't understand why someone wants to configure PV on Argo CD components.

We only implement use cases which are supported by the upstream project itself (https://github.com/argoproj/argo-cd).

One thing I am okay with is to add the ability to deploy any kind of resources (extra manifests) like the bitnami charts do.

snuggie12 · 2022-05-25T17:48:28Z

@mkilchhofer quoted from the official docs:

argocd-repo-server clones repository into /tmp ( of path specified in TMPDIR env variable ). Pod might run out of disk space if have too many repository or repositories has a lot of files. To avoid this problem mount persistent volume.

The problem is they officially tell you to do something without telling you how to.

FWIW, I'm using my suggested approach of ephemeral volumes while keeping it a Deployment but I know that's not available to everyone since it's in 1.21+ as beta:

volume for tmp for repo-server

spec:
  template:
    spec:
      volumes:
      - ephemeral:
          volumeClaimTemplate:
            metadata:
              creationTimestamp: null
              labels:
                type: argocd-repo-server-tmp
            spec:
              accessModes:
              - ReadWriteOnce
              resources:
                requests:
                  storage: 10Gi
              storageClassName: standard-ssd
              volumeMode: Filesystem

mkilchhofer · 2022-05-26T12:23:15Z

ACK. That makes sense. There's already an issue upstream for that: argoproj/argo-cd#7927

pierluigilenoci · 2022-05-27T08:53:05Z

@mkilchhofer I don't have a particular fetish for PVs.

I just wish that in the presence of large caches, a situation aggravated also by some bug (argoproj/argo-cd#4002 and helm/helm#10583), the Repo Server pods are not evicted ('Pod The node had condition: [DiskPressure]. ') or, even worse, other pods are evicted in the same node (exactly what happened in my case).

In any case the idea of allowing the deployment of extra manifests is always a good idea (and if you look at my history on GitHub it is something I particularly appreciate).

Thanks for your help and your time.

pierluigilenoci · 2022-06-10T08:23:58Z

It might make sense to link the CNCF's Slack (stranded) discussion as well:
https://cloud-native.slack.com/archives/C020XM04CUW/p1653511733823829

pierluigilenoci · 2022-06-20T07:18:49Z

@docent-net any update on this?

docent-net · 2022-06-20T13:04:16Z

@pierluigilenoci sure. I've been following this discussion (as well as that mentioned on Slack and regarding StatefulSets) and I'm not sure right now.

This issue is about Deployment. My original problem was that having replica=1 for Deployment type, I wanted to have persistent storage for the git cache. Thanks to that, in case of POD replacement, my infra wouldn't be impacted with traffic/performance spike due to cache rebuilding / git repos pulling on a higher scale.

And I have implemented that, and it works on my cluster correctly. But I have only one replica.

Now - with replicas>1 there's problem of AccessModes, that might be very specific. Having it set to ReadWriteMany (and that is a requirement for this workload), could mean a huge performance drop due to underlying file system configurations.

Why do I think so? Thinking about possible scenarios, where this PVC would be used I can only think of considerable caches with many, many (possible little) files (that is how git repositories looks like, thinking .git directories, indexes etc). And FS synchronization across nodes with such huge distributed disk might be really problematic. Now think of possible issues, that could occur as a side effect of it. The user has inconsistent git directories because of some consistency problem of underlying DFS.

Having this said, I'm not sure anymore, that this PR is a good idea. StatefulSet might be a better approach here.

I can still raise a PR, but I'm just afraid, that someone someday will have a big headache due to this.

pierluigilenoci · 2022-07-05T11:41:16Z

@docent-net sorry for the delay in answering.

The answer you are looking for is Statefulset.
If you have multiple repo-server replicas you must also have multiple disks, one for each replica.
So the solution for assigning 1:1 disks is Statefulset.

Using NFS is a problem of overcomplication and poor performance.
From personal experience I can say that sometimes a cache on NFS is slower than re-loading everything from scratch over the network.

The alternative solution is to use emptyDir but it does not solve the node disk load problem.
Unless you put the disk in memory but in this case the cost increases.

The perfect solution does not exist but something needs to be done.

aroundthecode · 2022-08-22T15:05:29Z

I've one solution to propose

default deploy behaviour is the current one with emptyDir
if a different volume definition is provided in values file then it's applied as-is, without any additional implementation, leaving final user the desired approach for volume management .

repoServer:
 existingVolumes:
    tmp_dir:
      persistentVolumeClaim:
        claimName: pvc-argocd-repo-server-tmp

I've performed such changes on my forked repo for details: main...aroundthecode:argo-helm:reposerver-pvc

With such apporach user can create PV/PVC outside helm and then attach to the deploy, or leave everything as is.
It could be a quick fix allowing user to avoid diskpressure issue and going on discussing Statefulset migration

arielly-parussulo · 2022-11-08T17:26:16Z

Hello there!
Do you intend to open a PR for this solution so we can have it in the oficial Helm Chart? I am trying to migrate an internal chart that we use to the official one and this is one of the features that we would need in order to achieve that.
For now I will create a fork but I think that this config should be in the oficial Helm Chart as it is recommended by the High Availability doc.

docent-net · 2022-11-08T18:43:38Z

@aroundthecode

I've performed such changes on my forked repo for details: main...aroundthecode:argo-helm:reposerver-pvc

So, are you able to raise a PR for this, since you already have it worked out?

arielly-parussulo · 2022-11-16T15:03:47Z

I ended up creating a PR to add support to deploy argocd-repo-server as a StatefulSet with PVCs based on what we used in my company:
#1648
Feel free to add suggestions or editing it.

pierluigilenoci · 2023-04-17T11:07:54Z

@arielly-parussulo, your PR is a bit stranded...

aroundthecode · 2024-01-02T11:58:44Z

@aroundthecode

I've performed such changes on my forked repo for details: main...aroundthecode:argo-helm:reposerver-pvc

So, are you able to raise a PR for this, since you already have it worked out?

HI all, I've tried to perform an alternative PR #2410 with my approach on this, not sure if everthing is fine, please check and advise if needed

snuggie12 · 2024-01-02T17:54:45Z

@aroundthecode oddly enough I was probably about a week away from making a PR since we're about to start using the helm chart. Yours looks more comprehensive though. Thanks!

aroundthecode · 2024-01-02T19:51:14Z

@snuggie12 ( or anyone) can you please help me with the PR? it seems something is missing in the docs generation but I can't realize what.

I've added the helm-docs comments to my values.yaml section but is seems something is still missing since it's telling me the documentation is outdated.

snuggie12 · 2024-01-03T03:12:38Z

@snuggie12 ( or anyone) can you please help me with the PR? it seems something is missing in the docs generation but I can't realize what.

I've added the helm-docs comments to my values.yaml section but is seems something is still missing since it's telling me the documentation is outdated.

@aroundthecode It's running helm-docs in the check and then doing a git diff and seeing there's a change to be made. What it wants you to do is run helm-docs yourself and commit the updated README.md. According to this there should be a script which cleanly executes (it mentions a container so I think you'll need some sort of container runtime like docker or containerd/cri.) If that's an issue in your env you can always install the helm plugin yourself.

Or if you just want to cheat you can look at the diff in the GH action output and paste it in the README.md (mostly kidding as you should probably follow their instructions, but wanted to leave it as a last restort.)

docent-net added the enhancement label Sep 3, 2020

seanson added argo-cd good first issue labels Oct 2, 2020

github-actions bot added the no-issue-activity label Apr 1, 2021

github-actions bot closed this as completed Apr 9, 2021

mkilchhofer added on-hold and removed good first issue no-issue-activity labels May 25, 2022

mkilchhofer reopened this May 25, 2022

mkilchhofer changed the title ~~PersistentVolumeClaims for volumes~~ Argo CD: PersistentVolumeClaims for volumes May 25, 2022

mkilchhofer added the awaiting-upstream Is waiting for a change upstream to be completed before it can be merged. label May 26, 2022

pierluigilenoci mentioned this issue Jun 10, 2022

Deploy repo-server as StatefulSet instead of Deployment argoproj/argo-cd#7927

Open

arielly-parussulo linked a pull request Nov 21, 2022 that will close this issue

feat(argo-cd): Add Volume Pesistence support for argocd-repo-server #1648

Draft

6 tasks

mkilchhofer linked a pull request Feb 14, 2023 that will close this issue

feat(argo-cd): Add Volume Pesistence support for argocd-repo-server #1648

Draft

6 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Argo CD: PersistentVolumeClaims for volumes #438

Argo CD: PersistentVolumeClaims for volumes #438

docent-net commented Sep 3, 2020 •

edited

tlvenn commented Oct 17, 2020

docent-net commented Oct 18, 2020

docent-net commented Nov 20, 2020

tlvenn commented Nov 23, 2020

tlvenn commented Nov 24, 2020 •

edited

docent-net commented Nov 24, 2020

github-actions bot commented Apr 1, 2021

tlvenn commented Apr 9, 2021

jbartlet commented Apr 22, 2021 •

edited

ramanNarasimhan77 commented Jun 7, 2021

kaidobit commented Dec 16, 2021

snuggie12 commented May 7, 2022

pierluigilenoci commented May 25, 2022

pierluigilenoci commented May 25, 2022

mkilchhofer commented May 25, 2022

snuggie12 commented May 25, 2022

mkilchhofer commented May 26, 2022

pierluigilenoci commented May 27, 2022

pierluigilenoci commented Jun 10, 2022

pierluigilenoci commented Jun 20, 2022

docent-net commented Jun 20, 2022

pierluigilenoci commented Jul 5, 2022

aroundthecode commented Aug 22, 2022

arielly-parussulo commented Nov 8, 2022

docent-net commented Nov 8, 2022

arielly-parussulo commented Nov 16, 2022

pierluigilenoci commented Apr 17, 2023

aroundthecode commented Jan 2, 2024

snuggie12 commented Jan 2, 2024

aroundthecode commented Jan 2, 2024

snuggie12 commented Jan 3, 2024

Argo CD: PersistentVolumeClaims for volumes #438

Argo CD: PersistentVolumeClaims for volumes #438

Comments

docent-net commented Sep 3, 2020 • edited

tlvenn commented Oct 17, 2020

docent-net commented Oct 18, 2020

docent-net commented Nov 20, 2020

tlvenn commented Nov 23, 2020

tlvenn commented Nov 24, 2020 • edited

docent-net commented Nov 24, 2020

github-actions bot commented Apr 1, 2021

tlvenn commented Apr 9, 2021

jbartlet commented Apr 22, 2021 • edited

ramanNarasimhan77 commented Jun 7, 2021

kaidobit commented Dec 16, 2021

snuggie12 commented May 7, 2022

pierluigilenoci commented May 25, 2022

pierluigilenoci commented May 25, 2022

mkilchhofer commented May 25, 2022

snuggie12 commented May 25, 2022

mkilchhofer commented May 26, 2022

pierluigilenoci commented May 27, 2022

pierluigilenoci commented Jun 10, 2022

pierluigilenoci commented Jun 20, 2022

docent-net commented Jun 20, 2022

pierluigilenoci commented Jul 5, 2022

aroundthecode commented Aug 22, 2022

arielly-parussulo commented Nov 8, 2022

docent-net commented Nov 8, 2022

arielly-parussulo commented Nov 16, 2022

pierluigilenoci commented Apr 17, 2023

aroundthecode commented Jan 2, 2024

snuggie12 commented Jan 2, 2024

aroundthecode commented Jan 2, 2024

snuggie12 commented Jan 3, 2024

docent-net commented Sep 3, 2020 •

edited

tlvenn commented Nov 24, 2020 •

edited

jbartlet commented Apr 22, 2021 •

edited