make with_ loops configurable #12086
Comments
Please let's not call it |
includes fix for #10695 |
Excellent. In the interests of bikeshedding, maybe call it |
so here is a workaround for breaking a loop task after the first failure
|
@bcoca not working on my end (ansible 1.9.3, ubuntu) TASK: [break loop after 3] **************************************************** |
ah, yes, it will work as-is in 2.0; in 1.9 the registration does not occur until after the loop is done. |
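The workaround snippet itself was lost above; a hedged reconstruction of the usual register-and-guard pattern (the `do_step` command and `steps` list are illustrative, not from the original comment) looks like this:

```yaml
- name: break loop after the first failure (sketch; names illustrative)
  command: "/usr/local/bin/do_step {{ item }}"
  register: step_result
  ignore_errors: yes
  # run only while no previous item has failed; once one item fails the
  # condition goes false, and every later item is skipped (a skipped
  # result keeps the condition false for the rest of the loop)
  when: step_result is not defined or
        (step_result is not skipped and step_result is not failed)
  with_items: "{{ steps }}"
```

As noted above, this only works in 2.0+, where the registered variable is updated per item rather than after the whole loop.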
Both I would also replace |
squash might just go away, as it is easy to pass a list directly to the modules that support it:
and avoid the loop completely |
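The example was stripped in the quote above; it presumably resembled the now-standard pattern of handing the whole list to the module in one call (package names are illustrative):

```yaml
- name: install everything in one transaction instead of looping
  apt:
    name: ['git', 'curl', 'htop']
    state: present
```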
Agreed. I'm using ansible to run only local tasks. In particular, to build a dozen or so docker images. At the moment, ansible builds them serially, so it takes a lot of time and underutilises the multi-core CPU. I would like to build multiple docker images in parallel. |
@gjcarneiro then don't define them as data, define them as hosts and target them, then |
Hah, thanks for the neat trick :) But still, even if it works (I haven't tested it) it is a rather convoluted way of running tasks in parallel. Then again, I may be using ansible for a completely different purpose than it was intended for, so in a way it's my own fault :( |
not really convoluted, it is how Ansible is meant to use parallelization: by host, not by variable. |
Yes, I understand, it's not Ansible's fault, it makes sense. But I'm using Ansible as build system (instead of e.g. make), because Ansible is nice as build system in most ways. But, in my frame of mind, thinking as a build system, "hosts" don't make sense. A build system like "make" doesn't care about "hosts", it only cares about files and tasks. I am forcing Ansible to be used as build system, and that causes a bit of cognitive dissonance, that's all. |
Ansible only cares about Hosts and Tasks, consider the images you are building 'hosts' and suddenly it fits both paradigms. |
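A sketch of the trick described here, treating each image as an inventory "host" so Ansible's normal per-host forks parallelize the builds (the group, image names, and build command are illustrative):

```yaml
# inventory (e.g. images.ini):
# [images]
# app-image
# worker-image
# db-image

- hosts: images
  gather_facts: no
  connection: local
  tasks:
    - name: build one docker image per pseudo-host, in parallel across forks
      command: docker build -t "{{ inventory_hostname }}" "contexts/{{ inventory_hostname }}"
```

With the default of 5 forks, five images build concurrently; `-f` on the command line raises that.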
Ansible is a configuration management tool for many other things: network
devices, both real and virtual, and a huge number of cloud services such as
databases, web services such as Elastic Beanstalk, Lambda and all the
components that apply to them like IAM security components. While Ansible is
good at hosts, if you're still running mostly VMs/hosts you're basically in
early-2000s IT. Not offending anyone here; there are sometimes important
reasons for running VMs or even docker containers, but they all stem back to
legacy reasons. In fact, hosts are going to become less and less of
what it automates. IMO if we don't get parallel with_items we might as
well scrap ansible altogether.
Having said that, I'm going to think positive here and try using
delegate_to for some cloud services. I mean, I never tried executing on the 200+
cloud components that I need to this way; I guess I'll just query the list
and dump it to a hosts file in hosts format with ansible, then try
delegate_to: localhost. I'll feed back my results here. If it does work, at
least we can do a documentation pull request on how to work around
with_item loop serial issues this way. We can make sure to have a link to
it in the cloud modules sections and the sections for docker.
|
This is a very nice approach but it doesn't seem to work inside workaround for rolling restarts with |
no doubt, but it also adds a huge layer of complexity and the need to deal with concurrent actions on a single host ala |
Hrrm, so far I have created a small test using delegate_to: 127.0.0.1 for
deletion tasks, as these are also a pain at mass scale.
My playbook looks like this:
---
- hosts: "{{ DeploymentGroup }}"
tasks:
- name: remove vm and all associated resources
azure_rm_virtualmachine:
resource_group: "{{ host_vars[item]['resource_group'] }}"
name: "{{ inventory_hostname }}"
state: absent
delegate_to: 127.0.0.1
Unfortunately it still tries to connect to the machines listed in hosts to
execute the azure task azure_rm_virtualmachine.
Am I doing this correctly? It seems I'm missing something, but I did try this
previously in many different ways, so I just want to know whether you are able
to do this.
Does this actually even work? Hopefully this is just some syntax issue.
|
Okay, so disabling fact gathering fixes this issue; however, it causes
another one: host_vars no longer contains the azure dynamic inventory from
standard in.
So resource_group: "{{ host_vars[item]['resource_group'] }}" doesn't
work in the above and needs to be hard-coded to a resource group name.
|
Okay, so I have modified the playbook below to try a number of things.
1st, I tried setting delegate_facts: True in case this helped, but of course,
even based on the documentation, I didn't really expect that to work.
2nd, I set gather_facts: no and tried running setup to reduce the fact
gathering to nothing, hoping it would opt not to connect at all, but of
course, as expected, it still tried to connect to the machine.
3rd, I tried setting connection: local, but strangely it still wants to
connect remotely to the machine to gather the facts, even though it knows it
will execute the play locally. A bit annoying, but I get the logic: how
else would it know the details of the host in question without doing
this?
I can probably use the playbook to turn the machines on first and then let
ansible log in to them and gather the unneeded facts. This would be so that
I can get host_vars to work, and then delete the machines. I'd like to know
if anyone has a better solution here, as that's also a hugely time-consuming
effort when I've got a hundred or more machines and have to power them
all up just to then delete them.
So far I see using this approach, instead of a parallel with_items
solution, as having potential, but the machines in question still need to be
up and reachable if you need any kind of facts from azure_rm.py while you
do this, so there is at least one caveat there. That is, unless someone knows
how to get access to host_vars from azure that are passed via standard in
when gather_facts: no.
Actually, I of course have the same problem when I run all this using a
with_items list; however, I was hoping to avoid that workaround if I'm
going to use hosts again. The workaround is dumping the azure_rm.py output to a
json file on the command line and then loading it into a variable to get
access to the vars again.
If I look forward to my end goal here, modifying hundreds or even thousands
of serverless components in parallel, perhaps this will be okay, as I can
use things like azure_rm_functionapp_facts
<http://docs.ansible.com/ansible/latest/azure_rm_functionapp_facts_module.html>
to gather facts about them and use them in the play, in theory, although this
has yet to be tested.
I still don't have any great logic on how to do this properly to create a
documentation pull request about it, as the method so far seems largely
dependent on what you're doing, and I'm not sure I want to suggest using the
json dump hack in the documentation.
I'll wait for feedback from anyone who happens to care about this
issue to decide my next step. Meanwhile I'll use my hack to get
my immediate work done.
---
- hosts: "{{ DeploymentGroup }}"
  gather_facts: no
  tasks:
    - setup:
        gather_subset: "!all,!min"
    - name: remove vm and all associated resources
      azure_rm_virtualmachine:
        resource_group: "{{ host_vars[inventory_hostname]['resource_group'] }}"
        name: "{{ inventory_hostname }}"
        state: absent
      delegate_to: localhost
      # delegate_facts: True
|
I have a use case for forks too, which would make this a lot easier. The playbook is deploying a bunch of openstack instances via terraform with randomly picked floating ips. Then I iterate over the ips to check that port 22 is open on each created host. The current method to do this is with a multi-play playbook:

- hosts: localhost
  connection: local
  gather_facts: no
  tasks:
    ...
    - name: Run terraform
      terraform:
        plan_file: '{{tf_plan | default(omit)}}'
        project_path: '{{terraform_path}}/{{infra}}'
        state: '{{state}}'
        state_file: '{{stat_tfstate.stat.exists | ternary(stat_tfstate.stat.path, omit)}}'
        variables: '{{terraform_vars | default(omit)}}'
      register: tf_output
    - name: Add new hosts
      with_items: '{{tf_output.outputs.ip.value}}'  # configured terraform to output a list of assigned ips
      add_host:
        name: '{{item}}'
        groups: new_hosts

- hosts: new_hosts
  gather_facts: no
  connection: local
  tasks:
    - name: Wait for port 22 to become available
      wait_for:
        host: '{{ansible_host}}'
        port: 22
        state: started
        timeout: 60

This is run with: |
Since a lot of people seem to be struggling with the performance of templating files locally, maybe a specific template_local module could be created to solve this particular issue instead. At least it'd be a start... I'd have a go myself but won't have time for the foreseeable future. 30+ minutes to template 100 files that can be done in 5s with jinja is ridiculous. |
@saplla templating always happens locally, the only thing that happens remotely is copying the template and setting permissions. |
Just to clarify, I'm talking about those users who want to template files as local tasks, e.g. to feed into other build systems, or in my case, to deploy k8s resources using kubectl. What I mean is to offload the looping and templating to jinja via a module that is a simple wrapper. The module could take some context and the loop definition (what would normally be put into It could be invoked like this:
The above example takes all variables defined by ansible as the context, but any dict could be passed in. As I say, I haven't got time to work on this right now, but does the approach sound feasible @bcoca ? |
That assumes that each item is independent, which is not always the case: you can make the current item's values depend on the previous ones and/or the results of previous iterations, or they can just be cumulative. Most of the time spent templating has to do with the vars, not the templates themselves; since they need to be consistent, you would not gain much from parallelization unless you are willing to change behaviours in ways that would break current assumptions. Also, templates are already parallel by host, just not by item. |
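A minimal sketch of the cumulative case described here, where each iteration reads a fact written by the previous one (variable names are illustrative) — exactly the dependency that per-item parallelism would break:

```yaml
- name: accumulate a running total across loop iterations
  set_fact:
    running_total: "{{ (running_total | default(0) | int) + item }}"
  with_items: [1, 2, 3]

- debug:
    msg: "total is {{ running_total }}"   # 6 after the loop above
```

Running the three iterations concurrently would race on `running_total` and give a nondeterministic result.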
OK thanks for the thoughts. It'd actually be good enough for my use case and it sounds like it might be for some other people in this thread too. I'm just using ansible to load hierarchical configs and template files locally before invoking some binary that deploys them (kubectl, helm, etc). I'd be happy with a dead-simple, lightweight templating module if it was so performant it reduced templating times from minutes to seconds. I'll try to look at this when it becomes an issue for us, unless someone beats me to it. |
I originally filed #10695, but seeing that this is going to take a while to come together, I ended up addressing these use cases with shell scripts (e.g., if I have to do something on 50 Git repos on a single host, I use Ansible to run a single script once that does the thing 50 times). Unfortunately, this means giving up some of the stuff that you get for free with Ansible, like very granular change reporting, and you also have to implement all of the "run only if" logic yourself and be very careful about error handling, but it is probably two orders of magnitude faster. As such, even if we wind up getting a "parallel" option in the future, it might not be as fast as my custom scripts and I probably won't bother switching to it. |
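A sketch of the pattern described here, pushing the loop into a single script invocation while recovering a coarse version of Ansible's change reporting (the script name and marker strings are illustrative, not from the original comment):

```yaml
- name: do the thing across all 50 repos in one run
  script: update_repos.sh   # hypothetical script that loops over the repos itself
  register: repos_out
  # reconstruct coarse change/failure status from the script's own output
  changed_when: "'CHANGED' in repos_out.stdout"
  failed_when: repos_out.rc != 0 or 'FAILED' in repos_out.stdout
```

The script has to print those markers itself, which is the "implement the logic yourself" trade-off mentioned above.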
@wincent a parallel loop will probably still always be slower than a shell script/dedicated program, as Ansible does much more than just 'apply the action'. |
@bcoca: yep, that confirms my understanding. |
@saplla k8s_raw is better than using template for this, you can inline the yaml in your inventory if needed :) (it's not the subject of this PR) |
@nerzhul Thanks but it's not better for us. Too much magic. We need templating. |
@saplla you could always create a host target per template to parallelize templating as much as possible, and then use subsequent plays or delegation to deliver the results to the actual target hosts. |
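A sketch of this per-template "host" arrangement (group, file, and path names are illustrative): each template name becomes an inventory pseudo-host so rendering spreads across forks, and a later play delivers the output:

```yaml
# inventory group [templates] lists pseudo-hosts, e.g.: deployment, service, ingress

- hosts: templates
  gather_facts: no
  connection: local
  tasks:
    - name: render each template in parallel across forks
      template:
        src: "{{ inventory_hostname }}.j2"
        dest: "build/{{ inventory_hostname }}.yml"

- hosts: k8s_nodes
  tasks:
    - name: deliver the rendered files to the real targets
      copy:
        src: build/
        dest: /etc/k8s/manifests/
```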
@bcoca a little bit hacky :) |
not at all, it's a LOT hacky, but it works today |
closing in favor of ansible/proposals#140 |
ISSUE TYPE
Feature Idea
COMPONENT NAME
core
ANSIBLE VERSION
2.1
CONFIGURATION
OS / ENVIRONMENT
SUMMARY
- pause: between loop executions, useful in throttled api scenarios (Done in 2.2)
- squash: join all items into a list and pass to the provided option; works like the current hardcoded opts for apt, yum, etc. By default it should be None (abandon: reversed opinion, we should remove this feature)
- label: what to display when outputting the item loop (Feature: ability to specify custom template for item "label" in output #13710) (Done in 2.2)
- docs to current state at:
  http://docs.ansible.com/ansible/playbooks_loops.html#loop-control
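The pause and label items above shipped in Ansible 2.2 as loop_control keys; a minimal usage sketch (the url and records list are illustrative):

```yaml
- name: throttled api calls, with readable per-item output
  uri:
    url: "https://api.example.com/records/{{ item.id }}"
  with_items: "{{ records }}"
  loop_control:
    pause: 1                # seconds to wait between iterations
    label: "{{ item.id }}"  # shown in output instead of the full item dict
```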
STEPS TO REPRODUCE
EXPECTED RESULTS
ACTUAL RESULTS