Tugboat Next.js server can be restarted via Composer command #17209

timcosgrove · 2024-02-12T19:14:32Z

Requirements

We want a Composer command that restarts a running Next.js server on Tugboat, so that we switch the code checkout used for the Next.js for QA purposes.

Background & implementation details

This follows on from #16853. That ticket set up the infrastructure for choosing a branch for next-build and vets-website. The goal of this ticket is to make use of those branches and to restart the server.

In order to test preview functionality, the feature flag that currently manages CMS preview being switched between Content Build and Next Build will need to be enabled.

The text was updated successfully, but these errors were encountered:

alexfinnarn · 2024-02-28T21:38:23Z

This work will be done based on the #17285 PR still in review to help with unblocking. Otherwise, we have to wait until that other PR is merged. We could keep updating the referenced PR but decided to branch off it instead.

alexfinnarn · 2024-03-04T20:33:28Z

I spent some time looking over the Tugboat configuration and Tugboat documentation...probably more than I needed, but the build stages are a bit confusing and docs like https://docs.tugboatqa.com/reference/tugboat-configuration/ help to sort out the intricacies between builds with base previews, those without, and non-build related stages.

One thought I had while looking at the Tugboat config file relates to pre-built images. I know Tugboat maintains a Cypress related image at tugboatqa/cypress-included but in this repo's setup, all the Cypress related dependencies are gathered each build. Furthermore, I would think the VA could make an image on top of tugboatqa/php:8.1-apache-bookworm with the additional packages to save time in future builds.

One bad thing about building each time is that the versions of all these extra dependencies aren't versioned like we have for composer and npm dependencies. Seeing as how the Tugboat environments have been having reliability issues recently, using pre-built images and removing a lot of the package updates in the init stage would make sense to me. Then, only update the images when you need to update the base, e.g. when updating PHP versions and/or Cypress versions.

Another thing I noticed is a section where variables are echoed into a config file which makes checking out git branches harder. I wonder if we can simply set the env variables at that step?

# Old...
echo "NEXT_PUBLIC_DRUPAL_BASE_URL=https://cms-${TUGBOAT_SERVICE_TOKEN}.${TUGBOAT_SERVICE_CONFIG_DOMAIN}" >> ${TUGBOAT_ROOT}/next/envs/.env.tugboat

# New...
NEXT_PUBLIC_DRUPAL_BASE_URL=https://cms-${TUGBOAT_SERVICE_TOKEN}.${TUGBOAT_SERVICE_CONFIG_DOMAIN}

I think that would get merged into process.env without having to edit the envs file and therefore no need to stash/pop when trying to restart the next server.

Finally, getting to the actual point of this issue, I think the server can be killed and restarted in scripts/next-start.sh by first checking for a running process, killing any, and then starting the server again. Install and build commands don't interfere with the server commands and as long as the dependencies have been updated and the build without SSG enabled completes, the server should still respond to previews.

# In scripts/next-start.sh ...
cd ${ROOT}/next

# Kill any current running server.
# We can look for "/scripts/yarn/start.js" since that is what "yarn start" runs.
NEXT_SERVER_PIDS=$(ps aux | grep '[.]/scripts/yarn/start.js' | awk '{print $2}')

# In case we have multiple processes, loop through them.
for pid in ${NEXT_SERVER_PIDS}; do
    kill $pid
done

# Start the dev server. Vets-website assets need to be in place prior to this build.
yarn start

I might be missing that there are two builds, one for preview and one for checking to see the whole site builds, but I think previews are the main thing that needs to work in next-build, and this method will build then start that server.

alexfinnarn · 2024-03-05T18:34:50Z

Now, I'm getting this error on Tugboat which comes from bash -lc 'composer va:next:start' in the Tugboat config.yml file:

> # Prepare the next build server
> ! ./scripts/should-run-directly.sh || ./scripts/next-build.sh
/var/lib/tugboat/next/packages/env-loader/dist/cms-feature-flags.js:5
const removeTrailingSlash = (s) => s.endsWith('/') ? s.substring(0, s.length - 1) : s;

TypeError: Cannot read properties of undefined (reading 'endsWith')
    at removeTrailingSlash (/var/lib/tugboat/next/packages/env-loader/dist/cms-feature-flags.js:5:38)
    at getCmsFeatureFlags (/var/lib/tugboat/next/packages/env-loader/dist/cms-feature-flags.js:8:31)
    at processEnv (/var/lib/tugboat/next/packages/env-loader/dist/index.js:23:76)
    at Object.<anonymous> (/var/lib/tugboat/next/scripts/yarn/build.js:3:1)
    at Module._compile (node:internal/modules/cjs/loader:1191:14)
    at Object.Module._extensions..js (node:internal/modules/cjs/loader:1245:10)
    at Module.load (node:internal/modules/cjs/loader:1069:32)
    at Function.Module._load (node:internal/modules/cjs/loader:904:12)
    at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:81:12)
    at node:internal/main/run_main_module:22:47
Script ! ./scripts/should-run-directly.sh || ./scripts/next-build.sh handling the va:next:build event returned with error code 1

I did get this locally since I didn't copy the env file and properly set up next-build after cloning the repo. On Tugboat this should be fine with the env vars being written into the file also in config.yml but maybe the git stash/pop isn't working...so I'm rebuilding the PR attached to #17285 to see if that is successful.

One idea would be to turn the "echo variable into file" commands into a script and then call that after checking out the new code, if the problem is env vars not being picked up.

alexfinnarn · 2024-03-05T18:40:58Z

Also weird on the Tugboat previews...I added one commit to this issues draft PR that is based on another PR, but the size of the previews is way different.

Newer preview:

Older preview:

Since we are now cloning the repos multiple times and building...is this what's causing the Tugboat issues with space? Maybe we can't clone and build frontends both for next-build and content-build?

Also possible that we could look for files to delete in one of the stages before running commands...maybe some files are being cached that aren't needed?

tjheffner · 2024-03-05T18:54:00Z

so the tugboat preview size is not necessarily tied to cloning the repos, it's tied to using the base preview or not. the newer PR env is using the base preview, so it's relatively small. the older one was built without a base preview (to test changes to init:), so it has the full size of everything. here's the size of the current CMS PR main base preview:

long way of saying, yes it is contributing to tugboat issues with space, but unfortunately there's no way around it until we are able to merge that PR (which depends on the BRD PR, which depends on them bumping the AMI) 😵

alexfinnarn · 2024-03-05T19:00:07Z

Okay, that makes sense. But then this Tugboat should be deleted, right? https://tugboat.vfs.va.gov/65d690d5067b1809aade9992 It is a test of the same PR without a base preview that then was abandoned after confirming an issue...I haven't exactly been going through all the Tugboats and making sure I don't have old ones around, but this is the first time I've seen why/how this could be an issue.

And then it makes me wonder if dependabot PRs should be sequenced so they only ever have so many open. Like maybe https://github.com/department-of-veterans-affairs/va.gov-cms/blob/main/.github/dependabot.yml#L9 should be 5 instead of 10?

On a quick count, I think about half the Tugboat PRs are dependabot...and IMHO, you ought to merge one in and then wait for the whole build and tests to run before merging in others. So, reducing that limit would reduce the desire to merge many in at once and potentially cause reliability issues.

tjheffner · 2024-03-05T19:14:01Z

yeah, that first one could be deleted. the dependabot ones... usually they get moved through quicker, but I think with the other infra issues they've been lagging a bit. hopefully they get moved through this week, it's a CMS team task.

re: echoing env vars,

Another thing I noticed is a section where variables are echoed into a config file which makes checking out git branches harder. I wonder if we can simply set the env variables at that step?

Old...

echo "NEXT_PUBLIC_DRUPAL_BASE_URL=https://cms-${TUGBOAT_SERVICE_TOKEN}.${TUGBOAT_SERVICE_CONFIG_DOMAIN}" >> ${TUGBOAT_ROOT}/next/envs/.env.tugboat

New...

NEXT_PUBLIC_DRUPAL_BASE_URL=https://cms-${TUGBOAT_SERVICE_TOKEN}.${TUGBOAT_SERVICE_CONFIG_DOMAIN}

exporting directly to the ENV instead of to the .env file may work, but we have to test it. the next command is going to load envs from the .env.tugboat file based on the APP_ENV var being present (it's always present on tugboat). we do have a mechanism to pass vars as part of the commands, but that may break the script on different non-tugboat envs. so some fiddling is likely.

alexfinnarn · 2024-03-05T19:41:45Z

oh yeah, and I think I have to build this with no preview or off of the other PR's preview...which I didn't specify on the Tugboat attached to this issue...so that is rebuilding now.

I will try debugging as much as I can locally and seeing why the git stash/pop might be causing issues or if that part is fine and it is another issue with env vars.

alexfinnarn · 2024-03-05T20:24:37Z

Maybe I'll make some suggestions for tickets of Tugboat things to look into, but I wondered how the hell the previews are so big and then I saw the assets pulled in from a backup.

2024-03-05T20:11:12.636Z - 65e7718f067b1809aa330f30# /bin/sh -c curl --remote-name https://dsva-vagov-prod-cms-backup-sanitized.s3-us-gov-west-1.amazonaws.com/files/cms-prod-files-latest.tgz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
 24 63.8G   24 15.3G    0     0  75.4M      0  0:14:25  0:03:28  0:10:57 76.0M

63.8G is about two-thirds of the total size I saw for a preview without a base, and in Tugboat's own documentation they use stage_file_proxy to pull in needed files on demand in their config examples. I bet there is some networking restriction where Tugboat can't reach the production assets and therefore they all need to be backed up...or something like that; however, I've always seen junk in Drupal file directories. Maybe this files directory is clean (and I will look locally once my download finishes), but it will be the first time I've seen a clean files directory.

Another interesting benefit of using stage_file_proxy is that you could analyze the Tugboat files directory against production to see how much of a difference there is. Building on Tugboat or locally should show the currently used files whereas production could have many years of files that are no longer used but still backed up.

alexfinnarn · 2024-03-05T20:58:11Z

Well, looky here: https://nextjs.org/docs/pages/building-your-application/configuring/environment-variables#referencing-other-variables So at least in page router, I think this works now?

Nope...it's only for variables within the actual .env file you are using or a higher-level .env file. I thought maybe it would be possible to write to a .env.fake file that isn't version controlled and then read from there, but that probably won't work. Oh well, it would be nice to simply use interpolation in the .env.tugboat and have dotenv support that by default but no use looking into that option any more.

I think https://github.com/motdotla/dotenv-expand/tree/master might work but I'm not going to try and modify next-build to make that happen. It should work to stash/pop...or maybe git reset, checkout, re-echo env vars, and then start the server.

alexfinnarn · 2024-03-06T17:46:15Z

I see now in Slack that Tugboat will likely be upgraded one minor version so that might play into this issue...but I will gather some questions for discussion purposes that I have from setting up next-build locally and how that might play into updating code and restarting the server.

Should manual steps listed in next-build readme go into the next-install.sh script? I know the env vars are updated on Tugboat but not sure about vets-website part. For local setup, we'd only need to run ddev composer va:next:install and not any manual steps.
If vets-web-setup.sh is re-run in next-build-frontend.sh do the symlinks mentioned here need to be recreated? https://github.com/department-of-veterans-affairs/va.gov-cms/blob/main/.tugboat/config.yml#L207
I get an error prerendering page "/_playground/api-explorer" due to the oauth token so locally I delete that route. How do I get the proper keys? The readme says they are on AWS...
I should make a fake PR in next-build so that I can check that out and make sure the server restarts with a noticeable change between the main branch and the PR branch. What is the best process to do that?
How does the build:preview and start interact? I know restarting the server is needed and I think rebuilding as well, but not sure if there's more to preview vs. other server needs like building everything out.

alexfinnarn · 2024-03-06T19:09:52Z

Answers from chat with @tjheffner...

Might be a good idea but no need to think about now. Possible candidate for making team transition easier.
Symlinks should persist, but recreate them if they don't exist.
Is an actual bug but a fix in content release workflow. will open up a smaller PR. Can get by with secrets in .env.local and keys from Tugboat.
Try changing the layout template and adding dummy CSS. Keep as draft PR and add do not merge label.
Two build commands. Preview builds API/preview routes and server. we are only concerned with previews, not the whole build.

alexfinnarn · 2024-03-06T20:16:28Z

I thought this error was related to the symlinking so I added those commands before the vets-web-setup.sh call in next-build-frontend.sh but it still errored at the same part.

error Command failed with exit code 137.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
cp: cannot stat './script/../build/localhost/generated/vendor.entry.js': No such file or directory
error Command failed with exit code 1.

I will look into this tomorrow.

alexfinnarn · 2024-03-07T17:05:15Z

This actually fails locally for me even on the main branch even with just running scripts/vets-web-setup.sh. I get the same error as above. I thought maybe this was an issue with generated assets colliding when trying to rebuild, but I would expect the first build to succeed.

Now I will read more on the vets-website codebase and why this code is needed and fails.

# From "yarn build" in vets-website...

# Only run Webpack if the assetSource = local
if [ "${assetSource}" = "local" ]; then
    echo "Building application assets"
    yarn build:webpack $webpackArgs
   # This step fails...
    cp -v "${buildDir}generated/vendor.entry.js" "${buildDir}generated/shared-modules.entry.js"
else
    echo "Will fetch application assets from the content build script"
fi

alexfinnarn · 2024-03-07T18:47:36Z

Even if I comment out the copy step the script fails. The webpack build runs NODE_OPTIONS=--max-old-space-size=8192 webpack --config config/webpack.config.js and I see that a SIGINT 137 is when the system kills the script potentially for memory issues. Since the -max-old-space-size=8192 is set, it makes me think they added that due to webpack consuming too much memory. I bet the build fails due to memory issues and that's why the directory doesn't exist.

So, I will play around with changing that setting and looking at building this not in the DDEV container.

alexfinnarn · 2024-03-07T19:27:29Z

Success on the vets-website build in DDEV! I had to increase my memory allocation for Docker to 10GB but it worked after that. For some reason, I had 7.9GB before so maybe 8GB would have worked.

I will make a note in the AC to update any docs on Docker resources for setting up the va.gov-cms project since I had to alter mine. Not sure if 8GB is default for Docker Desktop or not.

alexfinnarn · 2024-03-07T21:00:35Z

Seems like things are working locally. I have the whole shebang running from "/admin/content/deploy/next", and the next server appears to be running with the script exiting successfully.

# From the next-build.txt log file.
> ! ./scripts/should-run-directly.sh || ./scripts/next-start.sh
Started next server with PID: 51391
> ./scripts/should-run-directly.sh || ddev composer va:next:start --
==> Done

# Looking for server start command.
alexfinnarn@va-gov-cms-web:/var/www/html$ ps -ax | grep "[.]/scripts/yarn/start.js"
51841 pts/0    Sl     0:01 /run/rosetta/rosetta /usr/local/bin/node /usr/local/bin/node --no-opt -r /proc/.reset ./scripts/yarn/start.js

Even though the pids don't match, when I killed 51841 the other process of 51391 was killed.

The thing I'm not sure how to test is exposing the port from within the PHP web container since it's not started locally on my machine. I see a vhost for storybook and maybe next-build should have one too? However, I think things are working well enough that I will start a build on Tugboat and see what happens.

alexfinnarn · 2024-03-11T15:13:11Z

After spending a while debugging Tugboat builds, I figured out an error in the Tugboat shell. Apparently, even though APP_ENV is set earlier in the next build/start scripts, the yarn command doesn't pick up that env var so it is undefined when needed. Switching to APP_ENV=${APP_ENV} yarn in those commands does get the env var picked up. I think I removed that from the code thinking it would be picked up and the code would be cleaner, but after putting it back the build/start commands picked up the right env vars.

I also found that skipping the files and tests can save at least 30 minutes, and that made me wish we had commit flags for CI so we could turn things off when we want to debug faster rather than having to comment out lines in a config file. I've used commit flags to target only certain tests being run and it was helpful.

alexfinnarn · 2024-03-11T20:53:37Z

hmm...I now have the PR working on Tugboat but I had to manually check if the next server was running and restart it. At first, I wasn't seeing a red footer like the PR I uploaded, but after manually restarting I did see what I wanted to see.

I think the issue might be with the code looking to kill the server and since that code is run in the background and via delegation scripts I'm not sure what the right processes are at this point. I saw next-route-worker that I had to kill to actually stop the next server entirely.

So, the current code might be building the next site version, updating git info, trying to kill the wrong PIDs, and then the changes don't look like they've taken effect since the server isn't actually restarted. I didn't see anything in the logs as far as errors go. I'll try rebuilding and cataloging the PIDs/processes that are up when Tugboat starts and then as the server is restarted. I was hoping to grep for a certain string but not sure that will work.

alexfinnarn · 2024-03-12T18:47:38Z

After a few more Tugboat build errors...sometimes it says it's done half-way through running test, which is a lie...I was able to have next running by default and then switch to a branch with a red footer. However, upon trying to reset the environment for another QA tester, the footer is still red and the header looks broken. I didn't see any errors in the logs.

I'm not sure if this matters since someone would probably only switch branches once per environment...and once we're past the base preview issue, they should be able to refresh the preview to try another branch.

I will look into this more, but the code seems really close to doing what is expected.

alexfinnarn · 2024-03-12T19:36:38Z

Actually, it works on Tugboat if I choose a branch and then another branch to build. Looking at the code, I don't think the current content-build process takes into account someone resetting things to the main branch as there is no checkout that happens in that case...so the code could be improved in that regard but if content-build doesn't have it, then no need to have next-build have it. Plus, I'm not sure how often, if ever, people use this feature. If people use it often maybe there is a case to keep improving but for now I think it is okay to have a dev reset the git state via Tugboat shell instead of add code if people want to switch back to the main, default git branch.

timcosgrove · 2024-03-26T17:11:20Z

still blocked by needing updated AMI for BRD

timcosgrove added Accelerated Publishing Needs refining Issue status labels Feb 12, 2024

timcosgrove assigned timcosgrove, tjheffner and alexfinnarn and unassigned timcosgrove Feb 13, 2024

This was referenced Feb 14, 2024

[AP] Sprint 104 #17243

Closed

[AP] Sprint 103 #17245

Closed

tjheffner mentioned this issue Feb 23, 2024

Reapply "VACMS-16853: next build branches ftw (#17154)" (#17261) #17285

Closed

24 tasks

timcosgrove added the Blocked Issues that are blocked on factors other than blocking issues. label Feb 26, 2024

alexfinnarn linked a pull request Mar 4, 2024 that will close this issue

Tugboat Next.js server can be restarted via Composer command #17400

Draft

19 tasks

timcosgrove mentioned this issue Mar 7, 2024

[AP] Sprint 105 #17437

Closed

alexfinnarn mentioned this issue Mar 13, 2024

Next Build and Vets Website branches can be chosen on Tugboat #16853

Closed

This was referenced Mar 28, 2024

[AP] Sprint 106 #17672

Closed

[AP] Sprint 107 #17673

Closed

alexfinnarn mentioned this issue Apr 29, 2024

Replicate CMS & Next Build communication on CMS side #16851

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Tugboat Next.js server can be restarted via Composer command #17209

Tugboat Next.js server can be restarted via Composer command #17209

timcosgrove commented Feb 12, 2024 •

edited by alexfinnarn

Acceptance criteria

alexfinnarn commented Feb 28, 2024 •

edited

alexfinnarn commented Mar 4, 2024

alexfinnarn commented Mar 5, 2024 •

edited

alexfinnarn commented Mar 5, 2024 •

edited

tjheffner commented Mar 5, 2024

alexfinnarn commented Mar 5, 2024 •

edited

tjheffner commented Mar 5, 2024

Old...

New...

alexfinnarn commented Mar 5, 2024

alexfinnarn commented Mar 5, 2024

alexfinnarn commented Mar 5, 2024

alexfinnarn commented Mar 6, 2024 •

edited

alexfinnarn commented Mar 6, 2024

alexfinnarn commented Mar 6, 2024

alexfinnarn commented Mar 7, 2024

alexfinnarn commented Mar 7, 2024

alexfinnarn commented Mar 7, 2024

alexfinnarn commented Mar 7, 2024 •

edited

alexfinnarn commented Mar 11, 2024

alexfinnarn commented Mar 11, 2024

alexfinnarn commented Mar 12, 2024

alexfinnarn commented Mar 12, 2024

timcosgrove commented Mar 26, 2024

Tugboat Next.js server can be restarted via Composer command #17209

Tugboat Next.js server can be restarted via Composer command #17209

Comments

timcosgrove commented Feb 12, 2024 • edited by alexfinnarn

Requirements

Acceptance criteria

Background & implementation details

alexfinnarn commented Feb 28, 2024 • edited

alexfinnarn commented Mar 4, 2024

alexfinnarn commented Mar 5, 2024 • edited

alexfinnarn commented Mar 5, 2024 • edited

tjheffner commented Mar 5, 2024

alexfinnarn commented Mar 5, 2024 • edited

tjheffner commented Mar 5, 2024

Old...

New...

alexfinnarn commented Mar 5, 2024

alexfinnarn commented Mar 5, 2024

alexfinnarn commented Mar 5, 2024

alexfinnarn commented Mar 6, 2024 • edited

alexfinnarn commented Mar 6, 2024

alexfinnarn commented Mar 6, 2024

alexfinnarn commented Mar 7, 2024

alexfinnarn commented Mar 7, 2024

alexfinnarn commented Mar 7, 2024

alexfinnarn commented Mar 7, 2024 • edited

alexfinnarn commented Mar 11, 2024

alexfinnarn commented Mar 11, 2024

alexfinnarn commented Mar 12, 2024

alexfinnarn commented Mar 12, 2024

timcosgrove commented Mar 26, 2024

timcosgrove commented Feb 12, 2024 •

edited by alexfinnarn

alexfinnarn commented Feb 28, 2024 •

edited

alexfinnarn commented Mar 5, 2024 •

edited

alexfinnarn commented Mar 5, 2024 •

edited

alexfinnarn commented Mar 5, 2024 •

edited

alexfinnarn commented Mar 6, 2024 •

edited

alexfinnarn commented Mar 7, 2024 •

edited