Update azure instance state recognition #3346

Closed
djmitche opened this issue Aug 5, 2020 · 1 comment · Fixed by #3489

djmitche commented Aug 5, 2020

https://bugzilla.mozilla.org/show_bug.cgi?id=1646061 shows an instance with powerStates ['ProvisioningState/creating'], which should probably be considered transitional. Also, it seems that vmInfo.provisioningState is redundant with the power states?

Now that we have registrationTimeout and reregistrationTimeout, the state recognition can be a little more conservative about killing instances -- if it misses an instance that has failed to start up, that instance's registrationTimeout will expire and it will be terminated anyway. And if it misses an instance that has failed after startup, reregistrationTimeout will expire and it will be terminated. So the worker scan's responsibilities are:

  • in state REQUESTED
    • detect failed startup conditions and remove the worker
  • in state RUNNING
    • detect instance stopping or deallocation and remove the worker

Since the documentation on lifecycles is next to useless, let's just look for known, observed situations, and treat everything else as OK (with debug logging for anything that's not known and observed to be good).
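As a rough illustration of the approach described above, here is a minimal Python sketch (not the worker-manager implementation). The function name `should_remove` and the three status sets are hypothetical; only 'ProvisioningState/creating' comes from the bug linked above, and the other status strings are placeholder assumptions that would be replaced by whatever is actually observed.

```python
import logging

logger = logging.getLogger("worker-scan")

# Hypothetical status sets for illustration. Only 'ProvisioningState/creating'
# is taken from the issue; the rest are assumed placeholders, not an observed list.
FAILED_STARTUP = {"ProvisioningState/failed"}
STOPPED = {
    "PowerState/stopping",
    "PowerState/stopped",
    "PowerState/deallocating",
    "PowerState/deallocated",
}
KNOWN_GOOD = {
    "ProvisioningState/creating",  # transitional, per the issue
    "ProvisioningState/succeeded",
    "PowerState/starting",
    "PowerState/running",
}


def should_remove(worker_state, power_states):
    """Return True only when a known bad status applies to this worker state."""
    statuses = set(power_states)

    if worker_state == "REQUESTED":
        bad = FAILED_STARTUP  # detect failed startup conditions
    elif worker_state == "RUNNING":
        bad = STOPPED  # detect stopping or deallocation
    else:
        return False  # the scan has no responsibility in other states

    if statuses & bad:
        return True

    # Everything else is treated as OK; log what is not known to be good so
    # the sets can be extended from real observations.
    unknown = statuses - KNOWN_GOOD
    if unknown:
        logger.debug(
            "worker in state %s has unrecognized statuses %s; leaving it alone",
            worker_state,
            sorted(unknown),
        )
    return False
```

With this shape, an unrecognized status never removes a worker. In the worst case the worker is left for registrationTimeout or reregistrationTimeout to clean up, which is the conservative behaviour the comment asks for.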


djmitche commented Sep 9, 2020

Pete and I began pairing on this, and perhaps we can try again tomorrow. It will conflict with #3480, but we'll work that out.
