Postmortem: 5/8/2024 Module Installation Broke Automated Production Deploy #18092

Open
1 of 4 tasks
Tracked by #18048 ...
gracekretschmer-metrostar opened this issue May 9, 2024 · 1 comment
Assignees
Labels
Epic Issue type Needs refining Issue status

gracekretschmer-metrostar commented May 9, 2024

Background

On 5/8/2024, prod CMS went offline due to a module installation. The folder required for installing the translations of the admin UI was missing, which caused the deploy to fail and took prod CMS offline.

User Story or Problem Statement

Drupal developers need to be able to install contributed modules when necessary to add or maintain functionality that editors need to keep content up to date and complete. When a new module is enabled within Drupal, it writes a file to the docroot/sites/default/files/translations directory that contains translations of the UI elements for that module. If the directory is not present or does not have the correct permissions, an error is generated that fails the deployment.
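The failure condition described above (directory missing, or present but not writable) can be checked before a deploy enables any module. The following is a minimal sketch, not the project's actual tooling: the function name `translations_dir_problem` is hypothetical, and the path is taken from the issue text and assumed to be relative to the site checkout.

```python
import os
from typing import Optional

# Path as given in the issue; assumed to be relative to the site checkout.
TRANSLATIONS_DIR = "docroot/sites/default/files/translations"


def translations_dir_problem(path: str = TRANSLATIONS_DIR) -> Optional[str]:
    """Return a short description of the problem, or None if the directory is usable."""
    if not os.path.isdir(path):
        return f"missing directory: {path}"
    # os.access tests the permissions of the user running this script,
    # so it only gives a meaningful answer when run as the user that
    # will write the translation files.
    if not os.access(path, os.W_OK | os.X_OK):
        return f"directory not writable: {path}"
    return None


if __name__ == "__main__":
    problem = translations_dir_problem()
    if problem is not None:
        # A non-zero exit lets a deploy pipeline stop before enabling modules.
        raise SystemExit(problem)
```

Run as a pre-deploy step, a check like this would turn the outage into an early, explicit failure instead of a broken site.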

Reference Links

Affected users and stakeholders

  • CMS editors
  • Product teams working within CMS

Hypothesis

  • Ensure that the Ansible deployment creates the folder that the translation files will go into.
  • Would it make sense to have the Apache user manage all the folders/files that the CMS uses to release new modules?
  • Make CMS the owner and add the Apache user to the group.

Solution

Add scripting to the deployment process to fix the file permissions and to check if the docroot/sites/default/files/translations directory exists and create it if it does not. This could extend the deployment time but will result in a more consistent and reliable deployment process.
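The check-and-create step described above could look roughly like the following. This is a sketch under stated assumptions rather than the implemented fix: the function name `ensure_translations_dir` is hypothetical, and the mode `0o775` (owner and group may write) is an assumed value that the issue does not specify.

```python
import os

# Path as given in the issue; assumed to be relative to the site checkout.
TRANSLATIONS_DIR = "docroot/sites/default/files/translations"
# Assumption: owner and group may write. The issue does not state a mode.
DIR_MODE = 0o775


def ensure_translations_dir(path: str = TRANSLATIONS_DIR, mode: int = DIR_MODE) -> bool:
    """Create the directory if it is missing and set its mode.

    Returns True if the directory had to be created, False if it already existed.
    """
    created = not os.path.isdir(path)
    # exist_ok=True makes the call safe to repeat on every deploy.
    os.makedirs(path, exist_ok=True)
    # The mode passed to makedirs is filtered by the umask, so set it explicitly.
    # This raises PermissionError if the running user does not own the directory.
    os.chmod(path, mode)
    return created


if __name__ == "__main__":
    if ensure_translations_dir():
        print(f"created {TRANSLATIONS_DIR}")
```

Because the function is idempotent, it can run on every deploy; the added time is a single directory lookup when the folder already exists. Changing the owner or group is left out here because it depends on which accounts the servers use.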

Acceptance Criteria

  • The root cause of why prod CMS went offline on 5/8/2024 is identified.
  • A solution is implemented that prevents prod CMS from going offline during the daily deploy.
  • Opportunities to proactively prevent prod CMS from going offline in the future are identified.
  • Postmortem report is submitted to the GitHub repo.
@gracekretschmer-metrostar gracekretschmer-metrostar changed the title Postmortem: 5/8/2024 Module Installation Broke Automated Production Deploy [Draft] Postmortem: 5/8/2024 Module Installation Broke Automated Production Deploy May 9, 2024
@gracekretschmer-metrostar gracekretschmer-metrostar changed the title [Draft] Postmortem: 5/8/2024 Module Installation Broke Automated Production Deploy Postmortem: 5/8/2024 Module Installation Broke Automated Production Deploy May 22, 2024