NAS-128704 / 13.3 / Fix truecommand issues on HA in CORE 13 #13756

Merged: 6 commits, May 20, 2024


sonicaj
Member

@sonicaj sonicaj commented May 19, 2024

This PR backports some fixes that were made to SCALE but never backported to CORE. It also fixes an issue where the wireguard service started on the standby node whenever it was started on the active node. The cause was that service actions are propagated to the standby node automatically by default, which is not desirable here because wireguard should only be running on one node at a time. To address that, truecommand has been added to the list of blacklisted services that are left untouched when any service verb is called for them.
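The propagation rule described above can be sketched roughly as follows. This is a minimal illustration, not the middleware's actual code: `STANDBY_BLACKLIST`, `should_propagate`, and `nodes_for_verb` are hypothetical names, and the node labels are placeholders.

```python
from __future__ import annotations

# Hypothetical: services whose verbs (start/stop/restart/reload) must stay
# on the active node and never be mirrored to the standby node.
STANDBY_BLACKLIST = {"truecommand"}


def should_propagate(service: str, ha_propagate: bool = True) -> bool:
    """Return True if a service verb should also run on the standby node."""
    return ha_propagate and service not in STANDBY_BLACKLIST


def nodes_for_verb(service: str, verb: str, ha_propagate: bool = True) -> list[str]:
    """Return the nodes on which `verb` would be executed for `service`."""
    nodes = ["active"]  # the verb always runs locally on the active node
    if should_propagate(service, ha_propagate):
        nodes.append("standby")
    return nodes
```

With this rule, `nodes_for_verb("truecommand", "start")` touches only the active node, while a service that is not blacklisted is still mirrored to the standby.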

An edge case in the nginx config has also been handled: on failover, the wireguard interface IP was not being added to the listen directive, so a subsequent nginx config reload is warranted.
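To illustrate the nginx edge case, here is a small sketch of how the set of `listen` directives could be recomputed and compared after failover. The helper names, the port, and the example addresses are assumptions made for illustration (IPv4 only), and the real config template may look quite different.

```python
def listen_directives(ips, port=443, wg_ip=None):
    """Build nginx `listen` lines, including the wireguard IP when one is set."""
    addrs = list(ips)
    if wg_ip and wg_ip not in addrs:
        addrs.append(wg_ip)
    return [f"listen {ip}:{port};" for ip in addrs]


def needs_reload(current, desired):
    """A reload is warranted when the rendered directives differ."""
    return set(current) != set(desired)


# Hypothetical addresses: the config rendered before the wireguard
# interface was taken into account, versus the config that is wanted.
before = listen_directives(["192.0.2.10"])
after = listen_directives(["192.0.2.10"], wg_ip="10.8.0.2")
reload_required = needs_reload(before, after)
```

If the wireguard IP is missing from the rendered directives, the two lists differ and a reload would be triggered.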

This commit reduces the delay before checking whether the truecommand connection is active, once the interfaces have been set up, from 30 minutes to 30 seconds. 30 minutes was far too long, and previously the system only updated the status if truecommand.config was explicitly called. 30 seconds works well and is enough to ensure that the relevant wireguard interface is up.
This commit fixes an issue where, if the tc container is down, we immediately try to start the wireguard service, which initiates a health check that is destined to fail, so we keep repeating the stop/start cycle. A delay of 5 minutes is now added before each such attempt.
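The two timing changes above can be summarized in a short sketch. The constant and function names are hypothetical, and modelling the 5 minute delay as a minimum gap between start attempts is one interpretation of the description rather than the actual implementation.

```python
# Values taken from the PR description; names are illustrative only.
HEALTH_CHECK_DELAY = 30      # seconds after interface setup (previously 30 * 60)
RESTART_DELAY = 5 * 60       # seconds to wait before starting wireguard again


def next_action(connection_healthy: bool, seconds_since_last_start: float) -> str:
    """Decide what to do after a health check.

    Returns "none" when the connection is healthy, "wait" when a restart
    would come too soon after the previous attempt, and "restart" otherwise.
    """
    if connection_healthy:
        return "none"
    if seconds_since_last_start < RESTART_DELAY:
        return "wait"
    return "restart"
```

Spacing the attempts out this way avoids the tight stop/start loop when the remote end is down, at the cost of taking up to 5 minutes to reconnect once it comes back.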
@sonicaj sonicaj requested a review from a team May 19, 2024 11:57
@sonicaj sonicaj self-assigned this May 19, 2024
@sonicaj sonicaj marked this pull request as ready for review May 19, 2024 11:57
@bugclerk bugclerk changed the title Fix truecommand issues on HA in CORE 13 NAS-128704 / 13.3 / Fix truecommand issues on HA in CORE 13 May 19, 2024
@bugclerk
Contributor

@sonicaj
Member Author

sonicaj commented May 19, 2024

backport

@yocalebo yocalebo merged commit 148e6e6 into truenas/13.3-stable May 20, 2024
1 check passed
@yocalebo yocalebo deleted the NAS-128704 branch May 20, 2024 22:47
@bugclerk
Contributor

This PR has been merged and conversations have been locked.
If you would like to discuss more about this issue please use our forums or raise a Jira ticket.

@truenas truenas locked as resolved and limited conversation to collaborators May 20, 2024