GZ doesn't come online when MTU is configured for absent aggr #454

Open
DvdGiessen opened this issue Sep 26, 2023 · 5 comments

@DvdGiessen

DvdGiessen commented Sep 26, 2023

When I remove a secondary NIC from one of my SmartOS systems, the admin NIC doesn't get fully configured in the GZ. The global zone doesn't come online; however, VMs configured on that same admin NIC do show up on the network.

From reading the code (this is a headless system built from random consumer-grade parts I had lying around, so there is no IPMI to easily debug the problem in situ while the admin NIC is down), I think the cause is as follows. The NIC I removed was part of an aggr with a custom MTU set in /usbkey/config. Configuring this MTU happens before the admin NIC is initialized, and if configuring that MTU fails we exit with a fatal error.

Note that failing to create the aggr does not in itself seem to trigger an immediate exit; we only end with a fatal error when a custom MTU is configured. It would probably be nice if, when configuring some other NIC fails, the admin NIC were still fully configured so that the GZ is at least accessible over the network.

Context: This is where the aggr setup and MTU setup happens before the admin NIC configuration:

# Create aggregations
create_aggrs
# Make any mtu adjustments that may be necessary
setup_mtu
# Setup admin NIC

To fix this, we could perhaps skip setting the MTU on the aggr if creating that aggr failed, since that failure is (apparently) not in itself a fatal error. It would be as simple as moving the MTU part into the if-check above it here:

echo "Creating aggr: ${aggr} (mode=${mode}, links=${links})"
dladm create-aggr -l ${links//,/ -l } -L ${mode} ${aggr}
if [[ $? -eq 0 ]]; then
    add_active_aggr_links ${aggr} ${macs}
fi
if [[ -n "$mtu" ]]; then
    dladm set-linkprop -p mtu=${mtu} ${aggr}
    if [[ $? -ne 0 ]]; then
        echo "Failed to set mtu on aggr ${aggr} to ${mtu}"
        exit $SMF_EXIT_ERR_FATAL
    fi
fi
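
A minimal sketch of the rearrangement described above, so that the MTU is only touched once the aggr has actually been created. It is an illustration and not the actual net-physical code. The wrapper function `create_one_aggr`, the stubbed `dladm` and `add_active_aggr_links`, and the link names `ixgbe0`/`ixgbe1` are assumptions made so the snippet can run on any machine. It also uses `return` where the service method would `exit`.

```shell
#!/bin/bash
# Sketch only. SMF_EXIT_ERR_FATAL normally comes from the SMF include file;
# 95 matches the exit status seen for a fatal error.
SMF_EXIT_ERR_FATAL=95

# Hypothetical stubs: pretend the aggr's member links are absent, so every
# dladm call fails. On a real system these would be the real commands.
dladm() { return 1; }
add_active_aggr_links() { :; }

# Hypothetical wrapper around the snippet quoted above.
create_one_aggr() {
    typeset aggr=$1 links=$2 mode=$3 macs=$4 mtu=$5

    echo "Creating aggr: ${aggr} (mode=${mode}, links=${links})"
    # ${links//,/ -l } is left unquoted on purpose so it splits into
    # separate "-l <link>" arguments.
    if dladm create-aggr -l ${links//,/ -l } -L "${mode}" "${aggr}"; then
        add_active_aggr_links "${aggr}" "${macs}"
        # The MTU is only configured when the aggr exists.
        if [[ -n "$mtu" ]]; then
            if ! dladm set-linkprop -p mtu="${mtu}" "${aggr}"; then
                echo "Failed to set mtu on aggr ${aggr} to ${mtu}"
                return $SMF_EXIT_ERR_FATAL
            fi
        fi
    fi
    return 0
}

# With the failing stub, creation fails and the MTU step is skipped.
create_one_aggr aggr0 ixgbe0,ixgbe1 active "" 9000
echo "status: $?"
```

With this shape, a failed `create-aggr` is not fatal, while a failed `set-linkprop` on an aggr that was created still is.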

(And perhaps something can be said for also moving setup_mtu so that MTU failures don't impact the admin interface being brought up, though that can also be a separate change.)

@bahamat
Member

bahamat commented Sep 26, 2023

The admin nic comes up with the early-admin service using the net-early-admin service method for exactly this reason. The admin nic will already come up separately from other nics.

I would need to see your config to determine exactly what went wrong. It's possible that you're running into a bug, but it's also possible that you've specified an invalid configuration that we would never be able to make sense of, in which case failure is the only option. Again, I need to see your config file to reproduce the situation.

@DvdGiessen
Author

DvdGiessen commented Sep 26, 2023

My config file is as follows:

# cat /usbkey/config
# Note: This file must be source-able by bash

# Ethernet configuration
admin_nic=18:03:73:ad:6c:8e
admin_ip=dhcp
admin_ip6=addrconf
headnode_default_gateway=none

# Aggregated SFP+ configuration
aggr0_aggr=90:1b:e:6d:c4:82,90:1b:e:6d:c4:83
aggr0_lacp_mode=active
aggr0_mtu=9000
internal_nic=aggr0
internal_mtu=9000
internal_ip=10.255.255.2
internal_netmask=255.255.255.252

# Hostname
hostname=blackserver

# DNS setup
dns_domain=internal.dvdgiessen.nl
dns_resolvers=192.168.1.1,8.8.8.8

# NTP servers (see http://www.pool.ntp.org/zone/nl)
ntp_hosts=0.nl.pool.ntp.org,1.nl.pool.ntp.org,2.nl.pool.ntp.org,3.nl.pool.ntp.org
compute_node_ntp_hosts=dhcp

# Load SSH authorized keys
root_authorized_keys_file=authorized_keys

Both interfaces work fine; the problem occurs if I physically remove the SFP+ PCIe card and boot the machine. I'd expect the global zone to still be reachable via the built-in Ethernet.

EDIT: Clarified that the problem is the GZ does not come online.

@bahamat
Member

bahamat commented Sep 26, 2023

Ok, I'll see what I can figure out.

@DvdGiessen
Author

DvdGiessen commented Sep 26, 2023

To further clarify: If I comment out the *_mtu settings, the global zone comes online without problems.

# cat /usbkey/config
# Note: This file must be source-able by bash

# Ethernet configuration
admin_nic=18:03:73:ad:6c:8e
admin_ip=dhcp
admin_ip6=addrconf
headnode_default_gateway=none

# Aggregated SFP+ configuration
aggr0_aggr=90:1b:e:6d:c4:82,90:1b:e:6d:c4:83
aggr0_lacp_mode=active
#aggr0_mtu=9000
internal_nic=aggr0
#internal_mtu=9000
internal_ip=10.255.255.2
internal_netmask=255.255.255.252

# Hostname
hostname=blackserver

# DNS setup
dns_domain=internal.dvdgiessen.nl
dns_resolvers=192.168.1.1,8.8.8.8

# NTP servers (see http://www.pool.ntp.org/zone/nl)
ntp_hosts=0.nl.pool.ntp.org,1.nl.pool.ntp.org,2.nl.pool.ntp.org,3.nl.pool.ntp.org
compute_node_ntp_hosts=dhcp

# Load SSH authorized keys
root_authorized_keys_file=authorized_keys

With this config the built-in admin NIC works fine and the GZ comes online regardless of whether the aggr NIC is physically present in the system. Thus, it is specifically the failure to configure the MTU on a nonexistent NIC that seems to cause this.
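
As a sketch of the workaround above, the `*_mtu` assignments could be commented out with `sed` rather than by hand. The function name `comment_out_mtu` is hypothetical, and the snippet only prints the modified config to stdout; it does not touch the original file.

```shell
#!/bin/bash
# Sketch: print a copy of a config file with every active "<name>_mtu="
# assignment commented out. Lines that already start with '#' do not
# match the pattern and are left unchanged.
comment_out_mtu() {
    sed -e 's/^\([A-Za-z0-9_]*_mtu=\)/#\1/' "$1"
}
```

It could be used as `comment_out_mtu /usbkey/config > /tmp/config.new`, where `/tmp/config.new` is just an example output path, so the result can be reviewed before it replaces the real config.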

However, if these lines are not commented out (as in the config in the previous comment) AND the aggr NIC is not physically present in the machine, then the global zone does not come online (but the NIC does; other VMs on that NIC do appear on the network).

> The admin nic comes up with the early-admin service using the net-early-admin service method for exactly this reason. The admin nic will already come up separately from other nics.

Ah, I did see the early-admin service but did not assume it was activated by default since the comments mentioned it was used specifically for PXE-booting compute nodes.

After checking, I see the service is active. However, looking through the code, it seems that early-admin does not do anything when /system/boot/networking.json does not exist, and I don't think that file exists by default on my SmartOS system.
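
A small sketch for checking whether that file is present on a given system. The function name `check_boot_networking` is hypothetical; only the default path comes from the discussion above.

```shell
#!/bin/bash
# Sketch: report whether the boot-time networking file exists.
# An alternative path can be passed as the first argument.
check_boot_networking() {
    typeset boot_file=${1:-/system/boot/networking.json}
    if [[ -f "$boot_file" ]]; then
        echo "present: ${boot_file}"
    else
        echo "absent: ${boot_file}"
        return 1
    fi
}

# Check the default path; "|| true" keeps a missing file from aborting
# a script that runs under "set -e".
check_boot_networking || true
```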

@DvdGiessen
Author

I got around to taking another look at this. This time, with some helpful log output. :)

[ Oct 23 20:31:46 Executing start method ("/lib/svc/method/net-physical"). ]
[ Oct 23 20:31:46 Timeout override by svc.startd.  Using infinite timeout. ]
+ smf_configure_ip
+ /sbin/zonename -t
+ [ global '=' global -o shared '=' exclusive ]
+ return 0
+ LD_LIBRARY_PATH=/lib
+ export LD_LIBRARY_PATH
+ ADMIN_DHCP_TIMEOUT=300
+ ActiveAggrLinks=''
+ typeset -A ActiveAggrLinks
+ smf_netstrategy
+ smf_is_nonglobalzone
+ [ global '!=' global ]
+ return 1
+ /sbin/netstrategy
+ set -- ufs none none
+ [ 0 -eq 0 ]
+ [ ufs '=' nfs ]
+ _INIT_NET_STRATEGY=none
+ export _INIT_NET_STRATEGY
+ typeset -A plumbedifs
+ smf_is_globalzone
+ [ global '=' global ]
+ return 0
+ EARLY_ADMIN=''
+ [[ -f /etc/svc/volatile/.early_admin_setup ]]
+ [[ -n '' ]]
+ /usr/sbin/dladm init-phys
+ log_if_state before
== debug start: before ==
NAME           MACADDRESS         LINK           TYPE            
internal       -                  aggr0          aggr            
admin          18:03:73:ad:6c:8e  e1000g0        normal          
LINK         MEDIA                STATE      SPEED    DUPLEX   DEVICE
e1000g0      Ethernet             unknown    0        half     e1000g0
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
	inet 127.0.0.1 netmask ff000000 
lo0: flags=2002000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv6,VIRTUAL> mtu 8252 index 1
	inet6 ::1/128 

Routing Table: IPv4
  Destination            Gateway          Flags  Ref     Use     Interface 
-------------------- -------------------- ----- ----- ---------- --------- 
127.0.0.1            127.0.0.1            UH        2          0 lo0       

Routing Table: IPv6
  Destination/Mask            Gateway                   Flags Ref   Use    If   
--------------------------- --------------------------- ----- --- ------- ----- 
::1                         ::1                         UH      2       0 lo0   
== debug end: before ==
+ load_sdc_sysinfo
+ boot_file_config_enabled
+ load_sdc_config
+ load_sdc_bootparams
+ sed -e 's/,/ /g'
+ echo ''
+ create_aggrs
+ typeset links macs mode mtu
+ [[ -z '' ]]
+ return 0
+ setup_mtu
+ typeset tag oldifs val mac link curmtu
+ typeset -A mtus
+ typeset -A tagmap
+ set -o xtrace
+ oldifs=$' \t\n'
+ IFS=,
+ eval val='${CONFIG_internal_mtu}'
+ val=9000
+ eval mac='${CONFIG_internal_nic}'
+ mac=aggr0
+ [[ -z 9000 ]]
+ valid_mtu internal 9000
+ typeset tag mtu
+ tag=internal
+ mtu=9000
+ echo 9000
+ grep -E '(^[0-9]{1,5}$)'
+ mtu_is_int=9000
+ [[ -z 9000 ]]
+ (( 9000 > 65535 || 9000 < 1500 ))
+ [[ -z '' ]]
+ tagmap[aggr0]=internal
+ [[ -z '' ]]
+ mtus[aggr0]=9000
+ eval val='${CONFIG_admin_mtu}'
+ val=''
+ eval mac='${CONFIG_admin_nic}'
+ mac=18:03:73:ad:6c:8e
+ [[ -z '' ]]
+ continue
+ IFS=$' \t\n'
+ tag=internal
+ eval link='${SYSINFO_NIC_internal}'
+ link=''
+ [[ -z '' ]]
+ echo '/usbkey/config error: Missing link name for internal'
/usbkey/config error: Missing link name for internal
+ exit 95
[ Oct 23 20:31:46 Method "start" exited with status 95. ]

So it is indeed failing because it cannot set the MTU on a non-existing device.

> To fix this, we can perhaps skip trying to set the MTU on the aggr if creating that aggr failed

My assumption was a bit off. This wouldn't have helped: execution never reaches this point, because sysinfo doesn't have the aggregation and create_aggrs only acts upon aggregations that are already found by sysinfo.

Instead, it fails in setup_mtu, which loops over the tags that have an MTU defined. One of those is my internal tag, which has no link name because the link couldn't be created without the device present.

> moving setup_mtu so that MTU failures don't impact the admin interface being brought up

So a variation of this might instead be more appropriate?
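
Purely as an illustration of one such variation (not the actual net-physical code and not a tested patch), the per-tag link lookup seen in the trace could warn and skip a tag whose link is missing instead of exiting. The `SYSINFO_NIC_<tag>` naming is taken from the trace. The function name `lookup_mtu_links`, the skip behaviour, and the assumption that `SYSINFO_NIC_admin` would hold `e1000g0` (based on the link table in the debug output) are mine.

```shell
#!/bin/bash
# Sketch: resolve each tag to its link name via SYSINFO_NIC_<tag>.
# A tag without a link produces a warning on stderr and is skipped,
# where the traced code prints an error and exits with status 95.
lookup_mtu_links() {
    typeset tag var link
    for tag in "$@"; do
        var="SYSINFO_NIC_${tag}"
        link=${!var}    # bash indirect expansion; empty if unset
        if [[ -z "$link" ]]; then
            echo "/usbkey/config warning: Missing link name for ${tag}, skipping MTU" >&2
            continue
        fi
        echo "${tag}=${link}"
    done
}

# Assumed values mirroring the trace: admin has a link, internal does not.
SYSINFO_NIC_admin=e1000g0
SYSINFO_NIC_internal=''
lookup_mtu_links internal admin
```

Whether skipping is acceptable is a design question: a missing link for the admin tag should presumably stay fatal, while a missing secondary link might only deserve a warning.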
