make with_ loops configurable #12086

Closed
bcoca opened this issue Aug 25, 2015 · 90 comments
Labels
affects_2.1 This issue/PR affects Ansible v2.1 affects_2.3 This issue/PR affects Ansible v2.3 feature This issue/PR relates to a feature request. support:core This issue/PR relates to code supported by the Ansible Engineering Team.

Comments

@bcoca
Member

bcoca commented Aug 25, 2015

ISSUE TYPE

Feature Idea

COMPONENT NAME

core

ANSIBLE VERSION

2.1

CONFIGURATION
OS / ENVIRONMENT
SUMMARY
how: 
    forks: 1
    pause: 0
    squash: name
    label: "{{item.name}}"
    end: on_fail
with_items: ...
  • forks: forks within the loop to run items in parallel; default 1. This needs warnings.
  • pause: pause between loop executions, useful in throttled API scenarios. Done in 2.2.
  • squash: join all items into a list and pass it to the provided option; works like the current hardcoded options for apt, yum, etc. By default it should be None. Abandoned: reversed opinion, we should remove this feature.
  • end: when to interrupt the loop; default is 'last item'. Options: on_fail, on_success (first one)?
  • label: what to display when outputting the loop item (Feature: ability to specify custom template for item "label" in output, #13710). Done in 2.2.

Docs for the current state are at:

http://docs.ansible.com/ansible/playbooks_loops.html#loop-control
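
For reference, a minimal sketch of the two options marked done above, written with the `loop_control` keyword that this thread later settles on. The play, the `users` list, and the two-second pause are hypothetical values chosen only for illustration.

```yaml
- hosts: localhost
  gather_facts: no
  vars:
    # Hypothetical data; any list of dicts with a "name" key would do.
    users:
      - { name: alice, uid: 1001 }
      - { name: bob, uid: 1002 }
  tasks:
    - name: show each user with a throttled, labelled loop
      debug:
        msg: "{{ item.name }} has uid {{ item.uid }}"
      with_items: "{{ users }}"
      loop_control:
        pause: 2                  # seconds to wait between items
        label: "{{ item.name }}"  # printed in the output instead of the whole item
```

The `label` setting only changes what is displayed per item; the full `item` dict is still available to the task.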

STEPS TO REPRODUCE
EXPECTED RESULTS
ACTUAL RESULTS
@amenonsen
Contributor

Please let's not call it how. That's even worse to read than become: true. But the functionality under it looks great.

@bcoca
Member Author

bcoca commented Aug 27, 2015

includes fix for #10695

@mahemoff
Contributor

Excellent. In the interests of bikeshedding, maybe call it looping:.

@jimi-c jimi-c removed the P3 label Dec 7, 2015
@bcoca
Member Author

bcoca commented Dec 16, 2015

so here is a workaround for breaking a loop task after the first failure

- hosts: localhost
  vars:
    myvar:
        - 1
        - 2
        - 3
        - 4
        - 5
  tasks:
    - name: break loop after 3
      debug: msg={{item}}
      failed_when: item == 3
      register: myresults
      when: not (myresults|default({}))|failed
      with_items: "{{myvar}}"

@yikaus

yikaus commented Dec 16, 2015

@bcoca not working on my end (ansible 1.9.3, ubuntu)

TASK: [break loop after 3] ****************************************************
ok: [localhost] => (item=1) => {
"failed": false,
"failed_when_result": false,
"item": 1,
"msg": "1"
}
ok: [localhost] => (item=2) => {
"failed": false,
"failed_when_result": false,
"item": 2,
"msg": "2"
}
failed: [localhost] => (item=3) => {"failed": true, "failed_when_result": true, "item": 3, "verbose_always": true}
msg: 3
ok: [localhost] => (item=4) => {
"failed": false,
"failed_when_result": false,
"item": 4,
"msg": "4"
}
ok: [localhost] => (item=5) => {
"failed": false,
"failed_when_result": false,
"item": 5,
"msg": "5"
}

@bcoca
Member Author

bcoca commented Dec 18, 2015

ah, yes, it works as is in 2.0; in 1.9 the registration does not occur until after the loop is done, so the condition never sees the failure.
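
On later releases, where tests can no longer be written as filters (the `|failed` form was deprecated in Ansible 2.5), the same workaround might be written with the `is failed` test instead. This is only a sketch, and it assumes the per-item registration described in this comment still applies on the version in use.

```yaml
- hosts: localhost
  gather_facts: no
  vars:
    myvar: [1, 2, 3, 4, 5]
  tasks:
    - name: break loop after 3
      debug:
        msg: "{{ item }}"
      failed_when: item == 3
      register: myresults
      # myresults is undefined on the first item, hence default({}).
      # Once an item has failed, the remaining items are skipped.
      when: not (myresults | default({})) is failed
      with_items: "{{ myvar }}"
```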

@kustodian
Contributor

Both squash and forks would be awesome features which would speed up Ansible execution immensely.

I would also replace how with something like loop_details, loop_settings, loop_options, or anything similar.

@bcoca
Member Author

bcoca commented Jun 16, 2016

loop_control, already in 2.1 with the label part implemented.

squash might just go away as it is easy to just pass a list to the modules that support it:

apt: name={{listofpackages}}

and avoid the loop completely
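
A sketch of what passing the list directly might look like as a full play. `listofpackages` is the variable name from the comment above; the package names are placeholders.

```yaml
- hosts: all
  become: yes
  vars:
    listofpackages:   # placeholder package names
      - git
      - curl
  tasks:
    - name: install every package in a single apt call, no loop needed
      apt:
        name: "{{ listofpackages }}"
        state: present
```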

@ansibot ansibot added the affects_2.1 This issue/PR affects Ansible v2.1 label Sep 8, 2016
@bcoca bcoca removed the affects_2.0 This issue/PR affects Ansible v2.0 label Sep 15, 2016
@gjcarneiro

Agreed. I'm using ansible to run only local tasks. In particular, to build a dozen or so docker images. At the moment, ansible builds them serially, so it takes a lot of time and underutilises the multi-core CPU. I would like to build multiple docker images in parallel.

@bcoca
Member Author

bcoca commented Mar 12, 2018

@gjcarneiro then don't define them as data, define them as hosts and target them, then delegate_to: localhost to execute the actions in parallel

@gjcarneiro

Hah, thanks for the neat trick :) But still, even if it works (I haven't tested it), it is a rather convoluted way of running tasks in parallel.

Then again, I may be using ansible for completely different purpose than it was intended, so in a way it's my own fault :(

@bcoca
Member Author

bcoca commented Mar 12, 2018

not really convoluted, it is how Ansible is meant to use parallelization: by host, not by variable.

@gjcarneiro

Yes, I understand, it's not Ansible's fault, it makes sense. But I'm using Ansible as build system (instead of e.g. make), because Ansible is nice as build system in most ways. But, in my frame of mind, thinking as a build system, "hosts" don't make sense. A build system like "make" doesn't care about "hosts", it only cares about files and tasks. I am forcing Ansible to be used as build system, and that causes a bit of cognitive dissonance, that's all.

@bcoca
Member Author

bcoca commented Mar 12, 2018

Ansible only cares about Hosts and Tasks, consider the images you are building 'hosts' and suddenly it fits both paradigms.
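
The "images as hosts" idea could be sketched roughly as follows. The image names, the `docker_images` group, and the `images/<name>` build directories are all hypothetical; only `add_host` and `delegate_to: localhost` come from the discussion itself.

```yaml
# Play 1: turn the data (image names) into in-memory inventory hosts.
- hosts: localhost
  gather_facts: no
  vars:
    images: [web, worker, db]   # hypothetical image names
  tasks:
    - name: register each image as a host
      add_host:
        name: "{{ item }}"
        groups: docker_images
      with_items: "{{ images }}"

# Play 2: one "host" per image, so Ansible's per-host forking runs the builds in parallel.
- hosts: docker_images
  gather_facts: no
  tasks:
    - name: build the image on the control machine
      command: docker build -t {{ inventory_hostname }} images/{{ inventory_hostname }}
      delegate_to: localhost
```

How many builds actually run at once is bounded by the `forks` setting, which can be raised on the command line with `-f`.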

@isaacegglestone

isaacegglestone commented Mar 12, 2018 via email

@hryamzik
Contributor

@gjcarneiro then don't define them as data, define them as hosts and target them, then delegate_to: localhost to execute the actions in parallel

This is a very nice approach, but it doesn't seem to work inside the workaround that simulates serial=1 for rolling restarts (#12170). So an option for parallelization would add a lot more flexibility.

@bcoca
Member Author

bcoca commented Mar 12, 2018

no doubt, but it also adds a huge layer of complexity and the need to deal with concurrent actions on a single host ala hosts:all + lineinfile + delegate_to: localhost
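
To make the hazard concrete: in the pattern named here, every host in the play edits the same local file at the same time. One possible mitigation is to serialize just that task. The sketch below assumes the task-level `throttle` keyword, which exists only in newer Ansible releases than the ones discussed in this thread; the file path is hypothetical.

```yaml
- hosts: all
  gather_facts: no
  tasks:
    - name: record each host in one shared local file
      lineinfile:
        path: /tmp/seen_hosts.txt   # hypothetical path on the control machine
        line: "{{ inventory_hostname }}"
        create: yes
      delegate_to: localhost
      throttle: 1   # let only one host run this task at a time
```

This trades away the parallelism for that one task, which is the kind of cost a parallel loop option would also have to weigh.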

@isaacegglestone

isaacegglestone commented Mar 18, 2018 via email

@isaacegglestone

isaacegglestone commented Mar 18, 2018 via email

@isaacegglestone

isaacegglestone commented Mar 18, 2018 via email

@megakoresh
Contributor

megakoresh commented Apr 13, 2018

I have a use case for forks too, which would make this a lot easier. The playbook deploys a bunch of openstack instances via terraform with randomly picked floating ips. Then I iterate over the ips to check that port 22 is open on each created host. The current method to do this is with a multi-play playbook:

- hosts: localhost
  connection: local
  gather_facts: no
  tasks:
...
  - name: Run terraform
    terraform:
      plan_file: '{{tf_plan | default(omit)}}'
      project_path: '{{terraform_path}}/{{infra}}'
      state: '{{state}}'
      state_file: '{{stat_tfstate.stat.exists | ternary(stat_tfstate.stat.path, omit)}}'
      variables: '{{terraform_vars | default(omit)}}'
    register: tf_output

  - name: Add new hosts
    with_items: '{{tf_output.outputs.ip.value}}' # configured this in terraform to output a list of assigned ips.
    add_host:
      name: '{{item}}'
      groups: new_hosts
 
- hosts: new_hosts
  gather_facts: no
  connection: local
  tasks:
   - name: Wait for port 22 to become available
     wait_for:
       host: '{{ansible_host}}'
       port: 22
       state: started
       timeout: 60

This is run with: ansible-playbook -i localhost, deploy-test-clients.yml --extra-vars="infra=terraform_os_instances state=present"
This is of course a limited workaround since you don't always have a neatly-inventory-parseable list of ips to work with.

@saplla

saplla commented May 31, 2018

Since a lot of people seem to be struggling with the performance of templating files locally, maybe a specific template_local module could be created to solve this specific issue instead. At least it'd be a start... I'd have a go myself but won't have time for the foreseeable future.

30+ minutes to template 100 files that can be done in 5s with jinja is ridiculous.

@bcoca
Member Author

bcoca commented May 31, 2018

@saplla templating always happens locally, the only thing that happens remotely is copying the template and setting permissions.

@saplla

saplla commented May 31, 2018

Just to clarify, I'm talking about those users who want to template files as local tasks, e.g. to feed into other build systems, or in my case, to deploy k8s resources using kubectl.

What I mean is to offload the looping and templating to jinja via a module that is a simple wrapper. The module could take some context and the loop definition (what would normally be put into with_nested and friends) and just cut out ansible entirely for this task (perhaps the wrapper could run jinja in parallel if it speeds things up).

It could be invoked like this:

    template_parallel:
      src: "{{ item[0] }}"
      dest: "{{ tempdir }}/{{ item[1] }}-{{ item[0] | basename }}"
      context: "{{ hostvars[inventory_hostname] }}"
      nested:
      - "{{ templates.stdout_lines }}"
      - "{{ namespaces.stdout_lines }}"

The above example takes all variables defined by ansible as the context, but any dict could be passed in.

As I say, I haven't got time to work on this right now, but does the approach sound feasible @bcoca ?

@bcoca
Member Author

bcoca commented May 31, 2018

That assumes that each item is independent, that is not always the case, you can make the current item values depend on the previous ones and/or results of previous iterations, or they can just be cumulative.

Most of the time spent templating has to do with the vars, not the templates themselves. Since the vars need to be consistent, you would not gain much from parallelization unless you are willing to change behaviours in ways that would break current assumptions.

Also, templates are already parallel, by host, just not by item.

@saplla

saplla commented May 31, 2018

OK thanks for the thoughts. It'd actually be good enough for my use case and it sounds like it might be for some other people in this thread too. I'm just using ansible to load hierarchical configs and template files locally before invoking some binary that deploys them (kubectl, helm, etc). I'd be happy with a dead-simple, lightweight templating module if it was so performant it reduced templating times from minutes to seconds.

I'll try to look at this when it becomes an issue for us, unless someone beats me to it.

@wincent
Contributor

wincent commented May 31, 2018

I originally filed #10695 but seeing that this is going to take a while to come together I ended up addressing these use cases with shell scripts (eg. just say I have to do something on 50 Git repos on a single host, I use Ansible to run a single script once that does the thing 50 times). Unfortunately, this means giving up some of the stuff that you get for free with Ansible, like very granular change reporting, and you also have to implement all of the "run only if" logic yourself and be very careful about error handling, but it is probably two orders of magnitude faster. As such, even if we wind up getting a "parallel" option in the future, it might not be as fast as my custom scripts and I probably won't bother switching to it.

@bcoca
Member Author

bcoca commented May 31, 2018

@wincent a parallel loop will probably still always be slower than a shell script/dedicated program, as Ansible does much more than just 'apply the action'.

@wincent
Contributor

wincent commented May 31, 2018

@bcoca: yep, that confirms my understanding.

@nerzhul
Contributor

nerzhul commented Jun 5, 2018

@saplla k8s_raw is better than using template for this; you can inline the YAML in your inventory if needed :) (it's not the subject of this PR)
What is the current state of this? Can we expect something in 2.6, @bcoca?
I'm managing thousands of postgresql privileges on my DB clusters, and 25 minutes is painfully slow.

@saplla

saplla commented Jun 5, 2018

@nerzhul Thanks but it's not better for us. Too much magic. We need templating.

@bcoca
Member Author

bcoca commented Jun 5, 2018

@saplla you could always create a host target per template to parallelize templating as much as possible, and then use subsequent plays or delegation to deliver to the proper actual hosts.

@nerzhul
Contributor

nerzhul commented Jun 5, 2018

@bcoca a little bit hacky :)

@bcoca
Member Author

bcoca commented Jun 5, 2018

not at all, it's a LOT hacky, but it works today

@bcoca
Member Author

bcoca commented Mar 29, 2019

closing in favor of ansible/proposals#140

@bcoca bcoca closed this as completed Mar 29, 2019
@ansible ansible locked and limited conversation to collaborators Jul 25, 2019