Multiple inventory variable precedence warning #1142

tacatac · 2024-02-19T22:45:21Z

I believe some of the documentation on organizing inventory and variables for multiple static inventories may be lacking a warning and may therefore be slightly misleading.

I'll be specific about where I think these could be improved once I've shown the comprehension problem I ran into.

Actually the warning exists here: https://docs.ansible.com/ansible/latest/inventory_guide/intro_inventory.html#managing-inventory-variable-load-order

In my case the issue appeared as follows: when inventories are split by host environment and use the same group names with different variable values in their associated group_vars/, running ansible-playbook against all inventories resolves each variable to the value from the last inventory loaded, regardless of which inventory the host belongs to.

To reproduce:

mkdir ansibletest && cd ansibletest
mkdir -p inventory/{production,staging}/group_vars

cat > inventory/production/hosts
production_server1

[appservers]
production_server2

cat > inventory/staging/hosts
staging_server1

[appservers]
staging_server2

cat > inventory/production/group_vars/all.yml
---
env: "production"
...

cat > inventory/staging/group_vars/all.yml
---
env: "staging"
...

cat > inventory/production/group_vars/appservers.yml
---
fqdn: "production.example.org"
...

cat > inventory/staging/group_vars/appservers.yml
---
fqdn: "staging.example.org"
password: "staging"
...

tree
.
└── inventory
    ├── production
    │   ├── group_vars
    │   │   ├── all.yml
    │   │   └── appservers.yml
    │   └── hosts
    └── staging
        ├── group_vars
        │   ├── all.yml
        │   └── appservers.yml
        └── hosts

Passing both inventories changes the resolved values, depending on the order in which they are listed:

ansible-inventory -i inventory/production/hosts -i inventory/staging/hosts --host production_server2
{
    "env": "staging",
    "fqdn": "staging.example.org",
    "password": "staging"
}

ansible-inventory -i inventory/staging/hosts -i inventory/production/hosts --host production_server2
{
    "env": "production",
    "fqdn": "production.example.org",
    "password": "staging"
}
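The two outputs above are consistent with a simple model in which group variables are keyed by group name only and all inventories are merged before any host is resolved. The following is a minimal sketch of that model, written from the observed output rather than from Ansible's source code; the `INVENTORIES` structure and the `resolve_merged` function are hypothetical names used only for illustration.

```python
# Simplified model of the behaviour observed above; NOT Ansible's actual code.
# Assumption: group variables are keyed by group name only, so a later
# inventory overwrites an earlier one for the same group and variable.

INVENTORIES = {
    "production": {
        "hosts": {"production_server1": [], "production_server2": ["appservers"]},
        "group_vars": {
            "all": {"env": "production"},
            "appservers": {"fqdn": "production.example.org"},
        },
    },
    "staging": {
        "hosts": {"staging_server1": [], "staging_server2": ["appservers"]},
        "group_vars": {
            "all": {"env": "staging"},
            "appservers": {"fqdn": "staging.example.org", "password": "staging"},
        },
    },
}


def resolve_merged(load_order, host):
    """Merge every inventory first, then resolve the host's variables."""
    merged_groups = {}  # group name -> variables, shared by all inventories
    memberships = {}    # host name -> groups the host belongs to
    for name in load_order:
        inventory = INVENTORIES[name]
        for group, variables in inventory["group_vars"].items():
            merged_groups.setdefault(group, {}).update(variables)
        for hostname, groups in inventory["hosts"].items():
            memberships.setdefault(hostname, []).extend(groups)
    # "all" applies to every host; more specific groups override it.
    result = dict(merged_groups.get("all", {}))
    for group in memberships[host]:
        result.update(merged_groups.get(group, {}))
    return result


print(resolve_merged(["production", "staging"], "production_server2"))
print(resolve_merged(["staging", "production"], "production_server2"))
```

With the load order production then staging, the model returns the staging values for production_server2; with the order reversed it returns the production values plus the leftover staging `password`, matching both outputs above.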

Why use the same group names in each environment? Because the groups are equivalent and should therefore have the same name; the difference in environment is accounted for by splitting inventories. It also simplifies playbook targets.

Why apply multiple inventories at once? Granted, by splitting inventories we'd expect to use them mostly separately, but there may be some playbooks we'd like to run on more than one inventory all the same.

With regard to the two points above, here are the comments I would make on the previously mentioned documentation pages:

Alternative directory layout specifically shows this setup with group1 and group2 presumably existing in the production and staging inventories. It mentions the advantage of the layout when group and host variables don't have much in common (except their name presumably because otherwise there would be less of a reason to split inventories). It does not remind the user that precisely because the variables don't have much in common the inventories should not be targeted simultaneously.

Inventory setup examples again show splits according to environment with various shared groups (function and location-based). This is presented as a way to avoid applying to all environments and the example given sets only one inventory. However, the last sentence mentions "mixing all these setups" in order to update "all nodes" across function or location which I assume means regardless of environment given the previous sections.

In Understanding variable precedence, the priorities I would expect to apply here are 4 and 6, the inventory-related group_vars. I read them as "the group variables defined relative to the inventory where the host is to be found", not "the group variables relative to any loaded inventory, with a preference for the last loaded inventory when variable names are the same".

In other words my expectation is that, when splitting inventories, relative variables would be resolved only according to which inventory the target host is in.
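The expectation described here can be sketched in the same way. This is a minimal model of the behaviour I expected, not of what Ansible does; `GROUP_VARS`, `HOSTS` and `resolve_isolated` are hypothetical names, and the data mirrors the reproduction above.

```python
# Sketch of the EXPECTED behaviour (not what Ansible does): a host only sees
# the group_vars of the inventories in which it is actually defined.

GROUP_VARS = {
    "production": {
        "all": {"env": "production"},
        "appservers": {"fqdn": "production.example.org"},
    },
    "staging": {
        "all": {"env": "staging"},
        "appservers": {"fqdn": "staging.example.org", "password": "staging"},
    },
}

HOSTS = {
    "production": {"production_server1": [], "production_server2": ["appservers"]},
    "staging": {"staging_server1": [], "staging_server2": ["appservers"]},
}


def resolve_isolated(load_order, host):
    """Resolve variables inventory by inventory, skipping inventories
    that do not define the host."""
    result = {}
    for name in load_order:
        if host not in HOSTS[name]:
            continue  # this inventory's group_vars never reach the host
        result.update(GROUP_VARS[name].get("all", {}))
        for group in HOSTS[name][host]:
            result.update(GROUP_VARS[name].get(group, {}))
    return result


print(resolve_isolated(["production", "staging"], "production_server2"))
print(resolve_isolated(["staging", "production"], "production_server2"))
```

Under this model the listing order has no effect on production_server2, and the staging `password` never appears. A host defined in several inventories would still get a merge in which the last listed inventory wins.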

Obviously in my example the hosts are not in both inventories, but that seems to me to be the point of splitting inventories: one dimension is chosen to divide hosts (e.g. environment), and the other dimensions (e.g. function, location) can then be represented as groups in each inventory. If hosts were defined in more than one inventory then, yes, some kind of merge would make sense where the same variables apply, and letting the last listed inventory win is reasonable in that case.

Some possible workarounds:

Use unique group names, although I'd say this somewhat defeats the purpose of splitting inventories, since one would presumably put the relevant dimensions in the group name.
Use multiple inventories only if the playbook does not use conflicting variables.
Never set multiple inventories in the inventory option for ansible.cfg (as I did...).
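As a rough aid for the second workaround, the following sketch lists the group names that have group_vars in more than one inventory directory. It is a minimal helper under stated assumptions: `shared_group_vars` is a hypothetical name, it only compares file and directory names under each `group_vars/`, and it does not parse the YAML, so it reports candidates for conflict rather than actual conflicting values.

```python
from collections import defaultdict
from pathlib import Path


def shared_group_vars(inventory_dirs):
    """Return {group name: [inventory dirs]} for every group that has
    group_vars in more than one of the given inventory directories."""
    owners = defaultdict(set)
    for inventory in map(Path, inventory_dirs):
        group_vars = inventory / "group_vars"
        if not group_vars.is_dir():
            continue
        for entry in group_vars.iterdir():
            # Entries may be files (appservers.yml) or directories (appservers/).
            group = entry.name if entry.is_dir() else entry.stem
            owners[group].add(str(inventory))
    return {
        group: sorted(inventories)
        for group, inventories in owners.items()
        if len(inventories) > 1
    }


if __name__ == "__main__":
    # Paths taken from the reproduction above.
    shared = shared_group_vars(["inventory/production", "inventory/staging"])
    for group, inventories in sorted(shared.items()):
        print(f"{group}: {', '.join(inventories)}")
```

For the layout in this issue it would report both `all` and `appservers`, which is exactly the pair of groups whose values leaked between inventories.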

I'm happy to add some warnings to this effect in the documentation if the behaviour above is the expected one.

Regards,


ansible-documentation-bot · 2024-02-19T22:45:47Z

Thanks for your Ansible docs contribution! We talk about Ansible documentation on matrix at #docs:ansible.im and on libera IRC at #ansible-docs if you ever want to join us and chat about the docs! We meet there on Tuesdays (see the Ansible calendar) and welcome additions to our weekly agenda items - scroll down to find the upcoming agenda and add a comment to put something new on that agenda.

samccann · 2024-02-22T19:15:00Z

Thanks for the detail in this issue @tacatac ! I'm asking around for someone to take a look and get back to us on whether your understanding is correct.

tacatac · 2024-02-23T22:34:59Z

Thank you.

I've reread my remarks and I think I can clarify further.

First, all my comments pertain to static inventories. I've not used dynamic inventories seriously and do not know how this issue might apply to them.

Second, whenever I mention "splitting inventories" I mean splitting them into different directories so that they have different relative group_vars and host_vars subdirectories.

What appears to happen when I target multiple inventories simultaneously is that they are merged together first. I should point out I've not looked at any source code so this is purely experimental observation.

However, for me this is at odds with the fact that they are in separate directories: I would expect them to be used separately. This is from force of habit, I suppose, since using the filesystem to separate concerns is a fundamental mechanism. That is where the cognitive dissonance lies for me.

In other words I would expect all variables to be resolved for the hosts in one inventory (according to the copious precedence rules) and then for the next inventory, in a sequential way.

Again this is only when each host is only in one split inventory. Placing it in more than one is a bit of a contradiction in terms from this point of view, or at any rate a more complex use-case.

To illustrate with a further experimental observation, from the previous setup we can add the following:

mkdir inventory/staging/host_vars
cat > inventory/staging/host_vars/production_server2.yml
---
bonus: "merge"
...

This gives us the following layout, which makes no sense from a filesystem point of view (and loses information if we target only the production inventory):

tree
.
└── inventory
    ├── production
    │   ├── group_vars
    │   │   ├── all.yml
    │   │   └── appservers.yml
    │   └── hosts
    └── staging
        ├── group_vars
        │   ├── all.yml
        │   └── appservers.yml
        ├── hosts
        └── host_vars
            └── production_server2.yml

And yet targeting both inventories simultaneously picks up the host variable defined under staging:

ansible-inventory -i inventory/staging/hosts -i inventory/production/hosts --host production_server2
{
    "bonus": "merge",
    "env": "production",
    "fqdn": "production.example.org",
    "password": "staging"
}

The problem with the existing warning about inventory variable load order is that it presents inventories split into different files but all in the same directory and thus with the same relative variables subdirectories. This is not the confusing case for me.

Regards,

tacatac · 2024-02-24T09:30:25Z

To put this as a practical warning about targeting multiple inventories simultaneously:

Running:

ansible-playbook -i inventory1 -i inventory2 myplaybook

Is not the same as running:

ansible-playbook -i inventory1 myplaybook
ansible-playbook -i inventory2 myplaybook

Even if inventory1 and inventory2 are in separate directories and have separate hosts.

When targeting a single inventory, however, directory separation does work to isolate inventories.
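If a playbook does need to run against several split inventories, one way to keep them isolated is to generate one invocation per inventory. This is a minimal sketch: `per_inventory_commands` is a hypothetical helper, and `inventory1`, `inventory2` and `myplaybook` are the placeholder names used above.

```python
import shlex


def per_inventory_commands(inventories, playbook):
    """Build one ansible-playbook argv per inventory, instead of a single
    run that passes every inventory with repeated -i options."""
    return [["ansible-playbook", "-i", inventory, playbook] for inventory in inventories]


for argv in per_inventory_commands(["inventory1", "inventory2"], "myplaybook"):
    # Print the command; each argv could instead be passed to subprocess.run(argv).
    print(shlex.join(argv))
```

Each printed command loads a single inventory, so only that inventory's group_vars and host_vars are in play for that run.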

Regards,

tvo318 · 2024-02-27T17:20:17Z

There is some discussion about the variable precedence hierarchy in the table right before this section if that helps any: https://ansible.readthedocs.io/projects/awx/en/latest/userguide/job_templates.html#relaunching-job-templates

ansible-documentation-bot added the needs_triage and new_contributor labels on Feb 19, 2024.

oraNod removed the needs_triage label on Mar 13, 2024.
