proposal: workflow concurrency control #12757

Open
Joibel opened this issue Mar 7, 2024 · 1 comment
Labels: area/controller, area/spec, area/synchronization, type/feature

Comments

@Joibel
Member

Joibel commented Mar 7, 2024

Current state

Users have no built-in way of limiting simultaneous runs of identical workflows if they wish to do so.

This is because mutexes and semaphores in Argo only cap the number of concurrent runs at N; they do not limit runs at the workflow definition level. Even if they did, they would only cover the simple case where the user wants to delay execution (a queue) rather than cancel one of the runs.

User story

As an Argo user I want the ability to enable a limit for the number of concurrent runs of an identical workflow definition so I can achieve any of the following benefits:

  • Prevent data duplication/ data conflicts
  • Conserve compute resources

Examples:

  • Spark streaming job => FIFO use case: We want only 1 run of it at any given point, preferring the existing run. We want to prevent someone else from triggering the job. If another user tries to run the same job, we want it to fail to run.
  • CI job => LIFO use case: We want only 1 run of the CI pipeline at any given point, preferring the new run (often triggered by a new Git event, e.g. a commit). We want to stop the existing run if it is still in progress, since it is now outdated, and let the newly submitted workflow run.

Proposal

CronWorkflows already have concurrencyPolicy matching that of a native CronJob.

We would add a concurrencyPolicy field to the spec of a Workflow (it would also work if specified in a WorkflowTemplate spec consumed via workflowTemplateRef).

Valid policies are Allow, Forbid and Replace. These are the same concurrencyPolicies as available in CronWorkflow/CronJob.

  • Allow: Would keep the current behavior and is the default.
  • Forbid: Would prevent a new workflow from running while a matching workflow is still running; the new workflow would be stopped.
  • Replace: Would replace an older workflow with the new one, which would terminate the existing workflow.
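
The three policies can be read as a small decision function. The following is only a sketch of the semantics listed above, not the controller's implementation: the names `Decision` and `decide`, and the idea of passing in the names of already-running matching workflows, are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Policy names taken from the proposal; everything else here is hypothetical.
VALID_POLICIES = ("Allow", "Forbid", "Replace")


@dataclass
class Decision:
    """Outcome of applying a concurrency policy to a newly submitted workflow."""

    run_new: bool  # should the new workflow be allowed to run?
    terminate: list[str] = field(default_factory=list)  # running workflows to stop


def decide(policy: str, running: list[str]) -> Decision:
    """Decide what happens to a new workflow given matching running workflows."""
    if policy not in VALID_POLICIES:
        raise ValueError(f"unknown concurrencyPolicy: {policy!r}")
    if policy == "Allow" or not running:
        # Default behavior, or nothing to conflict with: the new workflow runs.
        return Decision(run_new=True)
    if policy == "Forbid":
        # Existing runs win; the new workflow is stopped.
        return Decision(run_new=False)
    # Replace: the new workflow wins; existing runs are terminated.
    return Decision(run_new=True, terminate=list(running))
```

Under this reading, the Spark streaming example would correspond to `Forbid` and the CI example to `Replace`.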

As there is otherwise no way of knowing which other workflows to group against, we would also add the optional field concurrencyMatchLabels. Without this field the concurrencyPolicy would do nothing, because no other workflows would be in the group.

concurrencyPolicy: Forbid
concurrencyMatchLabels:
  workflows.argoproj.io/cluster-workflow-template: my-ci

The match block works like label selectors in many other systems and would allow label matching to find the other workflows to which this concurrency policy applies.
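
As a rough sketch of how such label matching could select the group, assuming equality-based matching on every listed label (the proposal does not spell out the exact matching rules), with `matches` and `concurrency_group` as hypothetical helper names:

```python
from __future__ import annotations


def matches(selector: dict[str, str], labels: dict[str, str]) -> bool:
    """True if every key/value pair in the selector appears in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def concurrency_group(
    selector: dict[str, str], workflows: dict[str, dict[str, str]]
) -> list[str]:
    """Return the names of workflows whose labels satisfy the selector.

    An empty selector yields an empty group, mirroring the statement that
    without concurrencyMatchLabels the policy would do nothing.
    """
    if not selector:
        return []
    return [name for name, labels in workflows.items() if matches(selector, labels)]


# Example data: workflow names and label values are made up for illustration.
selector = {"workflows.argoproj.io/cluster-workflow-template": "my-ci"}
workflows = {
    "ci-run-1": {"workflows.argoproj.io/cluster-workflow-template": "my-ci"},
    "nightly-1": {"workflows.argoproj.io/cluster-workflow-template": "nightly"},
}
group = concurrency_group(selector, workflows)
```

Here only `ci-run-1` carries the label value from the example above, so it is the only workflow placed in the group.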

For WorkflowTemplates that are not consumed via workflowTemplateRef, this field would silently do nothing, in the same way as similar fields such as volumes.

Message from the maintainers:

Love this enhancement proposal? Give it a 👍. We prioritize the proposals with the most 👍.

@Joibel Joibel added the type/feature Feature request label Mar 7, 2024
@Joibel Joibel self-assigned this Mar 7, 2024
@Joibel Joibel added area/controller Controller issues, panics area/spec Changes to the workflow specification. labels Mar 7, 2024
@agilgur5 agilgur5 added the area/synchronization `parallelism` configurations and other synchronization label May 15, 2024
@agilgur5
Member

#13055 essentially proposed this too, but on a semaphore or mutex, which would obviate the need for selectors (although they are a k8s standard and quite powerful, they're usually used for resources that depend on others).

That wouldn't handle the case of parallelism, but semaphores are more-or-less a superset of parallelism and mutexes, so I think that could be fine.

concurrencyPolicy on a semaphore or mutex would be more straightforward to implement and would add only a single field to the spec, one that is already supported by CronWorkflows and CronJobs.
