Build stuck at running jobs (image transformation) #34051

Open
2 tasks done
NickBarreto opened this issue Nov 22, 2021 · 73 comments
Labels
status: confirmed Issue with steps to reproduce the bug that’s been verified by at least one reviewer. topic: media Related to gatsby-plugin-image, or general image/media processing topics topic: source-plugins Relates to the Gatsby source plugins (e.g. -filesystem) type: bug An issue or pull request relating to a bug in Gatsby

Comments

@NickBarreto
Contributor

NickBarreto commented Nov 22, 2021

If you're coming new to this issue, please see this first: #34051 (comment)


Preliminary Checks

Description

Gatsby's build process is hanging and not completing. I suspect the issue is with Sharp, as my site has quite a few images, and I saw this brought up in a previous issue, #33557.

When I upgraded to v4 I initially had no issues. However, the next day my builds all started exceeding Netlify's maximum build time of 30 minutes.

I mentioned this problem in the thread on the other issue, as others apparently had the same problem, where run queries in workers seems to take longer than expected.

This issue is difficult to reproduce because I think it is partly to do with the scale of my site, which is moderately large and has ~1600 images. There must be something that isn't quite right in the worker process, because my builds on Netlify went from taking around 13 or 14 minutes to exceeding the build limit every time.

To try and diagnose the issue I tried a local build, which, while it took a long-ish time, did actually complete.

Since @LekoArts suggested that Gatsby Cloud's build process is better optimised for processing images, I thought I'd give that a go.

After trying out a build in Gatsby Cloud, I had no build problems at all and the whole site built with a clear cache in 7 minutes. OK, I thought, seems like the problem isn't so much with Gatsby, but in how Netlify is interacting with v4's worker process.

However, the next push I ran into the problem once again, this time in Gatsby Cloud. The bottom end of Gatsby Cloud's logs are useful, because they give me a little more information than Netlify:

17:38:38 PM:
info Total nodes: 7987, SitePage nodes: 1695 (use --verbose for breakdown)

17:38:38 PM:
success Checking for changed pages - 0.001s

17:38:38 PM:
success onPreExtractQueries - 0.000s

17:38:38 PM:
success Cleaning up stale page-data - 0.024s

17:38:38 PM:
success createPages - 1.351s

17:38:40 PM:
success extract queries from components - 1.596s

17:38:40 PM:
success write out redirect data - 0.004s

17:38:40 PM:
success onPostBootstrap - 0.046s

17:38:40 PM:
success write out requires - 0.030s

17:38:40 PM:
info bootstrap finished - 48.635s

17:39:15 PM:
warning warn - You have enabled the JIT engine which is currently in preview.

17:39:15 PM:
warning warn - Preview features are not covered by semver, may introduce breaking changes, and can change at any time.

17:39:15 PM:
warning ⠀

17:39:22 PM:
success Building production JavaScript and CSS bundles - 42.093s

17:39:24 PM:
 [webpack.cache.PackFileCacheStrategy] Serializing big strings (3319kiB) impacts deserialization performance (consider using Buffer instead and decode when needed)

17:39:24 PM:
 [webpack.cache.PackFileCacheStrategy] Serializing big strings (3319kiB) impacts deserialization performance (consider using Buffer instead and decode when needed)

17:39:24 PM:
 [webpack.cache.PackFileCacheStrategy] Serializing big strings (3319kiB) impacts deserialization performance (consider using Buffer instead and decode when needed)

17:39:59 PM:
success Building Rendering Engines - 37.719s

17:40:13 PM:
success Building HTML renderer - 13.051s

17:40:13 PM:
success Execute page configs - 0.039s

17:40:15 PM:
success Caching Webpack compilations - 0.001s

17:40:15 PM:
success Validating Rendering Engines - 2.094s

17:40:39 PM:
success run queries in workers - 23.276s - 1662/1662 71.40/s

17:45:38 PM:
warning This is just diagnostic information (enabled by GATSBY_DIAGNOSTIC_STUCK_STATUS_TIMEOUT):

17:45:38 PM:
- Activity "build" of type "hidden" is currently in state "IN_PROGRESS"

17:45:38 PM:
Gatsby is in "IN_PROGRESS" state without any updates for 300.000 seconds. Activities preventing Gatsby from transitioning to idle state:

17:45:38 PM:
Process will be terminated in 1500.000 seconds if nothing will change.

17:45:38 PM:
- Activity "Running jobs v2" of type "hidden" is currently in state "IN_PROGRESS"

18:10:38 PM:
ERROR Terminating the process (due to GATSBY_WATCHDOG_STUCK_STATUS_TIMEOUT):

18:10:38 PM:
- Activity "build" of type "hidden" is currently in state "IN_PROGRESS"

18:10:38 PM:
Gatsby is in "IN_PROGRESS" state without any updates for 1800.000 seconds. Activities preventing Gatsby from transitioning to idle state:

18:10:38 PM:
- Activity "Running jobs v2" of type "hidden" is currently in state "IN_PROGRESS"

The fact that a full, uncached build on Gatsby Cloud can run in 7 minutes suggests to me that the issue isn't one of scale, but that the worker process hangs, and only some of the time.

Is it to do with incremental builds? Maybe. I am using the preserved download cache, because as I said my site has quite a few images which are coming from a custom source plugin (which is relatively simple, and contains all the image links from AWS that are passed over to createRemoteFileNode).
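
For readers unfamiliar with this kind of setup, a source plugin that hands remote image URLs to createRemoteFileNode might look roughly like the sketch below. This is not the plugin from this site: the `RemoteImage` node type, the `url` field, and the `makeOnCreateNode` helper are hypothetical names. `createRemoteFileNode` is injected so the snippet runs without Gatsby installed; in a real gatsby-node.js it would come from gatsby-source-filesystem.

```javascript
// Hypothetical sketch: builds an onCreateNode handler that downloads one
// remote image per matching node. `createRemoteFileNode` is passed in; in a
// real gatsby-node.js it would be
//   const { createRemoteFileNode } = require("gatsby-source-filesystem");
function makeOnCreateNode(createRemoteFileNode) {
  return async function onCreateNode({ node, actions, createNodeId, getCache }) {
    // "RemoteImage" and `node.url` are assumed names, not taken from the thread.
    if (node.internal.type !== "RemoteImage" || !node.url) {
      return null;
    }
    // Download the file and create a File node linked to the source node.
    const fileNode = await createRemoteFileNode({
      url: node.url,
      parentNodeId: node.id,
      createNode: actions.createNode,
      createNodeId,
      getCache,
    });
    return fileNode;
  };
}
```

In a plugin this would be wired up as `exports.onCreateNode = makeOnCreateNode(createRemoteFileNode)`. Each downloaded file can then lead to image processing jobs, which is the stage the build appears to wait on.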

To test things out once I had the first timeout on Gatsby Cloud, I tested a manual deploy without clearing the cache. I was hoping the process would hang again so I'd know the issue was with the cache and incremental builds, but alas, it did not. The whole build was completed in 6 minutes. Strangely, the issue does appear to occur on Netlify more frequently than not, and happens more occasionally in Gatsby Cloud. It may be to do with build process resources, because I just signed up to Gatsby Cloud, and so am in the free preview of performance builds.

Are there other diagnostic tools I can use to more closely inspect the build process? How would I be able to see which process is failing or never finishing?

Reproduction Link

I can't seem to reproduce this error as it is intermittent

Steps to Reproduce

  1. Attempt to build site with gatsby build in either Netlify or Gatsby Cloud
  2. Sometimes, the build never finishes

Expected Result

gatsby build should eventually finish and build the site

Actual Result

The run queries in workers step never finishes or moves on to merge worker state; the build eventually times out and fails.

Environment

My local environment isn't really the issue; builds have failed in both Netlify and Gatsby Cloud with this problem.

However, this is my local env:

  System:
    OS: macOS Mojave 10.14.6
    CPU: (4) x64 Intel(R) Core(TM) i7-4578U CPU @ 3.00GHz
    Shell: 3.2.57 - /bin/bash
  Binaries:
    Node: 16.1.0 - /usr/local/bin/node
    npm: 8.1.4 - /usr/local/bin/npm
  Languages:
    Python: 3.9.5 - /usr/local/opt/python/libexec/bin/python
  Browsers:
    Chrome: 95.0.4638.69
    Firefox: 94.0.1
    Safari: 14.1.2
  npmPackages:
    gatsby: ^4.1.6 => 4.2.0 
    gatsby-plugin-gdpr-cookies: ^2.0.8 => 2.0.8 
    gatsby-plugin-image: ^2.1.3 => 2.2.0 
    gatsby-plugin-loadable-components-ssr: ^4.1.0 => 4.1.0 
    gatsby-plugin-local-search: ^2.0.1 => 2.0.1 
    gatsby-plugin-netlify: ^4.0.0-next.0 => 4.0.0-next.0 
    gatsby-plugin-netlify-cms: ^6.1.0 => 6.2.0 
    gatsby-plugin-postcss: ^5.1.0 => 5.2.0 
    gatsby-plugin-react-helmet: ^5.1.0 => 5.2.0 
    gatsby-plugin-sharp: ^4.1.4 => 4.2.0 
    gatsby-remark-copy-linked-files: ^5.1.0 => 5.2.0 
    gatsby-remark-images: ^6.1.4 => 6.2.0 
    gatsby-remark-relative-images: ^2.0.2 => 2.0.2 
    gatsby-remark-responsive-iframe: ^5.1.0 => 5.2.0 
    gatsby-remark-smartypants: ^5.1.0 => 5.2.0 
    gatsby-source-filesystem: ^4.1.3 => 4.2.0 
    gatsby-transformer-remark: ^5.1.4 => 5.2.0 
    gatsby-transformer-sharp: ^4.1.0 => 4.2.0 
  npmGlobalPackages:
    gatsby-cli: 4.2.0
    gatsby: 3.5.0

Config Flags

PRESERVE_FILE_DOWNLOAD_CACHE: true

@NickBarreto NickBarreto added the type: bug An issue or pull request relating to a bug in Gatsby label Nov 22, 2021
@gatsbot gatsbot bot added the status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer label Nov 22, 2021
@DennisKraaijeveld

DennisKraaijeveld commented Nov 23, 2021

I have exactly the same thing going on. I am on the Standard/Standard plan, and for the past 2 days fresh builds have taken forever. Sometimes a build will finish after 25-30 minutes.

10:40:15 AM:
info [gatsby-plugin-perf-budgets] hooking into webpack

10:40:25 AM:
warning warn - Preview features are not covered by semver, may introduce breaking changes, and can change at any time.

10:40:25 AM:
warning warn - You have enabled the JIT engine which is currently in preview.

10:40:25 AM:
warning ⠀

10:41:05 AM:
Webpack Bundle Analyzer saved report to /usr/src/app/www/public/report.html

10:41:05 AM:
success Building production JavaScript and CSS bundles - 50.349s

10:41:32 AM:
success Building HTML renderer - 27.319s

10:41:32 AM:
success Caching Webpack compilations - 0.001s

10:41:32 AM:
success Execute page configs - 0.093s

10:41:34 AM:
success run queries in workers - 1.793s - 307/307 171.24/s

10:46:34 AM:
warning This is just diagnostic information (enabled by GATSBY_DIAGNOSTIC_STUCK_STATUS_TIMEOUT):

10:46:34 AM:
- Activity "build" of type "hidden" is currently in state "IN_PROGRESS"

10:46:34 AM:
Gatsby is in "IN_PROGRESS" state without any updates for 300.000 seconds. Activities preventing Gatsby from transitioning to idle state:

10:46:34 AM:
Process will be terminated in 1500.000 seconds if nothing will change.

I wanted to open an issue, but I guess we have something similar.

I am also using createRemoteFileNode to download and optimise remote images. I will run a build without that to see what happens.

@LekoArts LekoArts added gatsby 4 and removed status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer labels Nov 23, 2021
@LekoArts
Contributor

@NickBarreto @DennisKraaijeveld can you both please post the URL to a failed build where you see this? Then we can on our side check it out.

@DennisKraaijeveld

@NickBarreto
Contributor Author

NickBarreto commented Nov 23, 2021

Sure thing.

Here's a Gatsby Cloud build that failed with this error: https://www.gatsbyjs.com/dashboard/e1ae5b97-e312-4fe3-88f2-5f4f81ac0d9d/sites/1703f5eb-05d0-46ad-8fdf-e1f7129e83b7/builds/74b00a67-fbe3-4048-9f78-8150895fc298/details

This is the exact same build, triggered manually, not clearing cache, immediately after, which built successfully in 6 minutes: https://www.gatsbyjs.com/dashboard/e1ae5b97-e312-4fe3-88f2-5f4f81ac0d9d/sites/1703f5eb-05d0-46ad-8fdf-e1f7129e83b7/builds/beb61dcb-be64-4cc5-844a-d6cd3f1a2326/details

There were no changes in the codebase between these two builds, but one failed and the other did not.

It's also not failing every time for me on Gatsby Cloud, although as I said on Netlify it is nearly every time. I suspect that may be to do with resources in the build machine.

@DennisKraaijeveld

DennisKraaijeveld commented Nov 23, 2021

@LekoArts I scanned through my builds, and the build without the remote images (onCreateNode, createSchemaCustomization) ran for 4 minutes. Might be helpful information.

EDIT: Never mind. Found a build yesterday without remote images, building forever as well with exactly the same issues:
https://www.gatsbyjs.com/dashboard/c9ba2b9c-76c6-4e2e-94b6-047574fb963f/sites/d18769be-381e-458e-b4fb-d59bbc168935/builds/af4c2d52-8f48-4fba-97b6-4c1630032b3a/details

@LekoArts LekoArts changed the title Build failure: sometimes "Run Queries in Workers" never finishes Build stuck at running jobs (image transformation) Nov 24, 2021
@LekoArts
Contributor

Thanks for providing the URLs. We've looked at the builds from @NickBarreto @DennisKraaijeveld and in summary these are the findings:

  • It's not stuck at "run queries in workers" as that step finishes
  • Sometimes (not always) it hangs on running image transformation (jobs). If you add up the times of the activities, they don't come to e.g. 20 minutes; the build only takes that long because in between it waits for communication from the outside, which it doesn't get. It retries a number of times; in some cases it then fails the build, in others it continues.
  • The PR fix(gatsby): Add back an activity for jobs #34061 won't fix it but it'll give us and the users a way of seeing this being stuck and make debugging things easier

@NickBarreto
Contributor Author

Hi @LekoArts, thanks so much for the information.

What would you advise as a next step? Watch this PR until it is merged into a release, then upgrade to that release and do a few further builds to gather more diagnostic details?

Is there any other way in which I could contribute?

@startinggravity

Following because I get this problem a lot.

@pieh
Contributor

pieh commented Nov 25, 2021

@NickBarreto (and other folks following this issue)

What would you advise as a next step? Watch this PR until it is merged into a release, then upgrade to that release and do a few further builds to gather more diagnostic details?

I did publish a "canary" (gatsby@alpha-job-progress) from the PR branch and we are running that (plus some additional debugging code on top) internally with a test site on which we are able to reproduce the issue (eventually, as it takes multiple runs to reproduce the problem). You can use that canary release yourself, but it really won't help much with unsticking builds (it will just show an additional progress bar in the logs tab).

So in short, we don't need more information from you folks (at least about being stuck on image generation in Gatsby Cloud). We can already reproduce the problem and are in the process of tracking it down. We will post an update here once we find the problem, implement a fix, and have a reasonably high level of confidence that the fix is correct (we can never be 100% sure due to the intermittent nature of the problem).

@pieh
Contributor

pieh commented Nov 25, 2021

Oh, and one more thing:

We also found that diagnostic message printing information about "activities" in progress is not always fully correct.

warning This is just diagnostic information (enabled by GATSBY_DIAGNOSTIC_STUCK_STATUS_TIMEOUT):
- Activity "build" of type "hidden" is currently in state "IN_PROGRESS"

We do see messages like this mentioning only build when in fact there is also Running jobs one (which I did chase thinking there are 2 problems of stuck builds originally). What we found in builds we were checking is that all of them were related to jobs/image processing, but some of them didn't mention jobs which is separate bug, but not root of the stuck builds issue.

@pieh
Contributor

pieh commented Nov 26, 2021

Yesterday we published new version of Gatsby Cloud build runner image with fixes, migrated our test site to use it and were monitoring behaviour overnight. We didn't see problems anymore on our test site - it did handle over 300000 jobs successfully in that time (before that fix, it would get stuck at most around 60000 jobs, but more often it was getting stuck much quicker than that)

We are rolling out this update to all sites now. Please note that migration won't happen if the site is busy (like constantly rebuilding), so a good way to give it a chance to migrate is to temporarily disable builds in Site Settings -> Builds (for ~5 minutes) and re-enable them after that.

@DennisKraaijeveld

Thanks! @everyone :)

@buzinas
Contributor

buzinas commented Dec 10, 2021

@pieh Thanks for the update. And how about local builds? I have the same problem running this gatsby build on my codespaces machine (32GB).

Gatsby Cloud is still failing for me as well:

https://www.gatsbyjs.com/dashboard/1dfaf52f-9c4f-46c4-9410-6f9966814f9d/sites/9a4191f0-68ba-4271-8b61-fc8572eaf75b/builds/9bf0ccc4-6d08-4f8d-91b1-290be83ce825/details#all

@askibinski

askibinski commented Dec 14, 2021

I have read the thread but I'm not using Gatsby Cloud.

Currently migrating from v3 to v4 and testing everything on local and now this quite often happens on gatsby build.

success run queries in workers - 14.927s - 96/96 6.43/s
success Running gatsby-plugin-sharp.IMAGE_PROCESSING jobs - 50.515s - 206/206 4.08/s
not finished Merge worker state - 0.276s

Never had issues before, and it's a moderate site, definitely nothing large. Is there a way to get more debug info about what's going on here?

Edit: played around with NODE_OPTIONS=--max_old_space_size and GATSBY_CPU_COUNT on my local 32GB laptop, but without success.

@askibinski

I got more debug information when I upgraded gatsby-cli to 4.4.0:

(...)
success Running gatsby-plugin-sharp.IMAGE_PROCESSING jobs - 41.998s - 206/206 4.91/s

ERROR 

Assertion failed: all worker queries are not dirty (worker #3)

  Error: Assertion failed: all worker queries are not dirty (worker #3)
  
  - queries.ts:391 assertCorrectWorkerState
    [quietquality_gatsby]/[gatsby]/src/redux/reducers/queries.ts:391:13
(...)

@pieh
Contributor

pieh commented Dec 17, 2021

@askibinski if you are hitting this issue locally, could you manually edit a file in your node_modules: node_modules/gatsby/dist/redux/reducers/queries.js. If you are on 4.4, the line in question should be line 439.

And instead of just

throw new Error(`Assertion failed: all worker queries are not dirty (worker #${workerId})`);

Let's add information on our state that assertion fails on:

throw new Error(`Assertion failed: all worker queries are not dirty (worker #${workerId})\n${require(`util`).inspect(queryStateChunk, { depth: Infinity })}`);

This should additionally print information like one below alongside assertion error:

{
    byNode: Map(2) {
      'Site' => Set(1) { '/using-typescript/' },
      'SiteBuildMetadata' => Set(1) { '/using-typescript/' }
    },
    byConnection: Map(0) {},
    queryNodes: Map(0) {},
    trackedQueries: Map(1) { '/using-typescript/' => { dirty: 0, running: 0 } },
    trackedComponents: Map(0) {},
    deletedQueries: Set(0) {},
    dirtyQueriesListToEmitViaWebsocket: []
  }

I currently have no idea how we end up in situation like that. Possibly something fails earlier and we swallow/ignore error? Or maybe we have some stale state?

@askibinski

@pieh
Thanks! Yeah I was going that route but your snippet really helped:

So apparently I still had an old Image (image.js) component lying around from an earlier version/iteration, which was used in one place, and that debug info showed me:

    trackedQueries: Map(6) {
      'sq--src-components-header-header-js' => { dirty: 0, running: 0 },
      'sq--src-components-meta-js' => { dirty: 0, running: 0 },
      'sq--src-components-blocks-latest-posts-js' => { dirty: 0, running: 0 },
      'sq--src-components-media-image-js' => { dirty: 4, running: 0 },
      'sq--src-components-forms-ebook-form-js' => { dirty: 0, running: 0 },
      'sq--src-components-node-body-js' => { dirty: 0, running: 0 }
    },

Adding that debug info by default might help a lot of people migrating and running into an issue like this.
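
On larger sites the inspected state can get long. A small helper along the following lines could pull out only the entries with a non-zero dirty flag. This is a sketch that assumes trackedQueries is a Map shaped like the output above; `findDirtyQueries` is a hypothetical name, not a Gatsby API.

```javascript
// Hypothetical helper: lists the query IDs whose `dirty` flag is non-zero.
// Assumes `trackedQueries` is a Map of id -> { dirty, running }, matching
// the shape printed in the debug output above.
function findDirtyQueries(trackedQueries) {
  return [...trackedQueries]
    .filter(([, state]) => state.dirty !== 0)
    .map(([id]) => id);
}

// Sample data copied from two of the entries shown above.
const sample = new Map([
  ["sq--src-components-meta-js", { dirty: 0, running: 0 }],
  ["sq--src-components-media-image-js", { dirty: 4, running: 0 }],
]);

console.log(findDirtyQueries(sample));
```

With the sample data, only the media image component is reported, which matches the component that turned out to be the culprit here.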

@bkaldor

bkaldor commented Dec 17, 2021

We're trying to upgrade to 4.4 from 3 as well and are running into this exact issue - both in Gatsby Cloud https://www.gatsbyjs.com/dashboard/e156da66-cda0-4df5-b3c0-a7fdca6bf65e/sites/43774e74-f15a-4923-b6f7-d215d0ba104b/builds/e82328f4-29ff-44bc-945e-81b886afd8f8/details#rawLogs and locally:

success Running gatsby-plugin-sharp.IMAGE_PROCESSING jobs - 211.318s - 2175/2175 10.29/s
⠋ Merge worker state

ERROR

Assertion failed: all worker queries are not dirty (worker #3)

@buzinas
Contributor

buzinas commented Dec 17, 2021

@pieh any thoughts on #34051 (comment)?

Right now I'm using a setting with [WEBP] only in order to build my website while it's in development, but we plan to go live in January, and I need the fallback images. I tried the default value ([AUTO, WEBP] if I'm not wrong), and I also tried [WEBP, PNG], [WEBP, NO_CHANGE], but had no success. They all time out after "success run queries in workers".

@askibinski

askibinski commented Dec 18, 2021

@bkaldor and others finding this issue: I guess there might be different reasons the build stops/stalls and this issue can get messy. I summarized below:

  • You might be upgrading to v4 and have some old component with a dirty query. In that case, adding the debug info from the comment above by @pieh should give you enough info to trace and fix it.
  • You might be using Gatsby Cloud and it's an image processing thing, which appears (not confirmed, see comment below) to be fixed in the comment above.
  • If it's an image processing issue on local gatsby build, try playing with the NODE_OPTIONS=--max_old_space_size and GATSBY_CPU_COUNT environment variables.
  • Also make sure you upgraded gatsby-cli to v4.
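
As a rough illustration of the third bullet, the two variables can be set for a single build from a small Node script. This is only a sketch: `buildEnv` and `runGatsbyBuild` are hypothetical helper names, and the default values (2 workers, 4096 MB) are arbitrary assumptions rather than recommendations from this thread.

```javascript
const { spawn } = require("child_process");

// Builds the environment for a `gatsby build` child process.
// The default values are illustrative assumptions, not recommendations.
function buildEnv({ cpuCount = 2, maxOldSpaceMb = 4096 } = {}) {
  return {
    ...process.env,
    GATSBY_CPU_COUNT: String(cpuCount),
    NODE_OPTIONS: `--max_old_space_size=${maxOldSpaceMb}`,
  };
}

// Starts `npx gatsby build` with the overrides applied.
// Defined only; nothing runs until this function is called.
function runGatsbyBuild(options) {
  return spawn("npx", ["gatsby", "build"], {
    env: buildEnv(options),
    stdio: "inherit",
  });
}
```

Calling `runGatsbyBuild({ cpuCount: 8 })` from the project root would start a build with eight workers. Trying a few values is probably necessary, since reports in this thread differ on whether changing them helps.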

@buzinas
Contributor

buzinas commented Dec 18, 2021

@askibinski As I said in my two last comments, the image processing on Gatsby Cloud is not fixed. I'm facing timeouts every time I use WEBP with some fallback (NO_CHANGE, AUTO, PNG or JPG).

And I'm also facing the same issue on local Gatsby build (GitHub Codespaces). I'll try to play with these environment variables you suggested in order to fix the local issue, but the Gatsby Cloud issue still persists.

@startinggravity

I resolved the problem I was having with Gatsby Cloud build fails during image processing. As I read through the comments here, it appears that my situation could be different than most, but I figured my situation might be helpful to someone with a similar problem who landed on this issue discussion.

Gatsby Cloud did not specifically say why the build was stopping, other than to give an obscure message: "Failed to validate error Error [ValidationError]: "name" is not allowed.” Eventually, I discovered three image files were being referenced in my Drupal backend's database but were missing from the files directory.

When I removed the database references, I stopped getting stuck build attempts. Oddly, I started using Gatsby a year ago and the problem didn't appear until a few weeks ago, even though the files had always been missing from my backend.

@ghost

ghost commented Jan 3, 2022

Currently migrating from v3 to v4 on local and I also get this error on gatsby build. I added the snippet that @pieh suggested but the files that were flagged as dirty didn't seem like they had anything fishy in them. As others have mentioned, this happens intermittently. Other times, I get this generic error:

There was an error in your GraphQL query:

Too many requests, retry after 20ms.

and

An error occurred during parallel query running.
Go here for troubleshooting tips: https://gatsby.dev/pqr-feedback



Error: Worker exited before finishing task

@sensedrive

DEBUG=gatsby:gatsby-plugin-sharp npm run build

If I run this command I get the error

Error: /Users/martin/PhpstormProjects/finksmart/org.bluoverda.www/.cache/page-ssr/routes/render-page.js:4577
  "gatsby:gatsby-plugin-sharp" = namespaces;
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  SyntaxError: Invalid left-hand side in assignment

It seems to me the debugging does not work, because this error message appears near the beginning of the build process, whereas the original error message appears at the very end.

@SebastianMera

SebastianMera commented Apr 19, 2022

@LekoArts I ran multiple builds over and over for the entire week, with the same results as @tyhopp. I will add the logging output below. I could find absolutely no pattern explaining which images fail the rendering engines validation or why, as the affected images were random in every build I ran.

gatsby:gatsby-plugin-sharp Start processing /Users/my-user/Desktop/projects/my-project/public/static/186be79251497704906aefc1d719d805/0ceda/GettyImages-1225946537.jpg +0ms
  gatsby:gatsby-plugin-sharp Start processing /Users/my-user/Desktop/projects/my-project/public/static/186be79251497704906aefc1d719d805/92368/GettyImages-1225946537.jpg +0ms
  gatsby:gatsby-plugin-sharp Start processing /Users/my-user/Desktop/projects/my-project/public/static/186be79251497704906aefc1d719d805/fbd6b/GettyImages-1225946537.webp +65ms
  gatsby:gatsby-plugin-sharp Start processing /Users/my-user/Desktop/projects/my-project/public/static/186be79251497704906aefc1d719d805/dd2c8/failed Validating Rendering Engines - 2.853s
 ERROR #98001  WEBPACK

Built Rendering Engines failed validation failed validation.

Please open an issue with a reproduction at https://github.com/gatsbyjs/gatsby/issues/new for more help

@NickBarreto
Contributor Author

Just had another build fail in Gatsby Cloud because gatsby-plugin-sharp never finished its jobs.

https://www.gatsbyjs.com/dashboard/e1ae5b97-e312-4fe3-88f2-5f4f81ac0d9d/sites/1703f5eb-05d0-46ad-8fdf-e1f7129e83b7/builds/c0749b26-4d13-4d65-8175-6aa4e949df2a/details#rawLogs

Not sure if it is at all helpful in diagnosing further, but the issue still persists. We usually build once a day or so to incorporate any recent changes, and I've not had builds outright fail for a while until today.

@zhanglun

zhanglun commented May 2, 2022

My Gatsby version is v4.13.1. It works well on macOS v12.3.1 but crashes in GitHub Actions. Here are the logs:

success Running gatsby-plugin-sharp.IMAGE_PROCESSING jobs - 115.850s - 168/168 1.45/s
error Assertion failed: all worker queries are not dirty (worker #1)


  Error:Assertion failed: all worker queries are not dirty (worker #1)
  
  - queries.ts:395 assertCorrectWorkerState
    [zhanglun.github.io]/[gatsby]/src/redux/reducers/queries.ts:395:13
  
  - queries.ts:228 queriesReducer
    [zhanglun.github.io]/[gatsby]/src/redux/reducers/queries.ts:228:7
  
  - redux.js:536 combination
    [zhanglun.github.io]/[redux]/lib/redux.js:536:29
  
  - redux.js:296 next
    [zhanglun.github.io]/[redux]/lib/redux.js:296:22
  
  - index.ts:72 
    [zhanglun.github.io]/[gatsby]/src/redux/index.ts:72:68
  
  - index.js:27 Object.dispatch
    [zhanglun.github.io]/[redux-thunk]/lib/index.js:27:16
  
  - pool.ts:117 mergeWorkerState
    [zhanglun.github.io]/[gatsby]/src/utils/worker/pool.ts:117:13
  
  - build.ts:305 build
    [zhanglun.github.io]/[gatsby]/src/commands/build.ts:305:11
  

not finished Merge worker state - 0.097s

@ku8ar

ku8ar commented May 5, 2022

A probable workaround is to add the env var: GATSBY_DISABLE_CACHE_PERSISTENCE=true

@wardpeet
Contributor

wardpeet commented May 6, 2022

@NickBarreto we fixed the issue you were seeing on the Cloud side.

@pragmaticpat pragmaticpat added topic: source-wordpress Related to Gatsby's integration with WordPress topic: source-drupal Related to Gatsby's integration with Drupal topic: source-contentful Related to Gatsby's integration with Contentful topic: source-shopify Related to the gatsby-source-shopify plugin topic: source-plugins Relates to the Gatsby source plugins (e.g. -filesystem) labels May 6, 2022
@LekoArts LekoArts removed topic: source-wordpress Related to Gatsby's integration with WordPress topic: source-drupal Related to Gatsby's integration with Drupal topic: source-contentful Related to Gatsby's integration with Contentful topic: source-shopify Related to the gatsby-source-shopify plugin labels May 9, 2022
@bisclever

bisclever commented May 31, 2022

I'm using gatsby v3.14.6 and I get an error when trying to use the debug env variable
DEBUG=gatsby:gatsby-plugin-sharp npm run build

This is the error:

failed Building static HTML for pages - 1.256s

ERROR #95313

Building static HTML failed

See our docs page for more info on this error: https://gatsby.dev/debug-html

5 | }, module.exports.__esModule = true, module.exports["default"] =
module.exports;
6 | return _setPrototypeOf(o, p);

7 | }
| ^
8 |
9 | module.exports = _setPrototypeOf, module.exports.__esModule = true,
module.exports["default"] = module.exports;

WebpackError: D:\Documents\Projects\OVEA\ovea-site\public\render-page.js:6373

  • setPrototypeOf.js:7
    [ovea-site]/[@babel]/runtime/helpers/setPrototypeOf.js:7:1

  • utils.js:129
    [ovea-site]/[@gatsbyjs]/reach-router/lib/utils.js:129:1

  • utils.js:73
    [ovea-site]/[@gatsbyjs]/reach-router/lib/utils.js:73:1

  • index.js:79
    [ovea-site]/[core]/[dot-object]/index.js:79:1

  • index.js:79
    [ovea-site]/[core]/[dot-object]/index.js:79:1

error Command failed with exit code 1.

Any development on solving this issue?

EDIT: My mistake. I put the variable in the wrong place. I put it in the .env file instead of the package.json script

@ui-jb

ui-jb commented Jun 7, 2022

We just started experiencing this issue yesterday. We're on Gatsby 3.14.0 using a WordPress backend.
Here's the output we've been getting:

14:20:23 PM:
success Building production JavaScript and CSS bundles - 45.820s

14:20:23 PM:
success Rewriting compilation hashes - 0.024s

14:20:38 PM:
success Writing page-data.json files to public directory - 15.325s - 381/381 24.86/s

14:21:28 PM:
success Caching JavaScript and CSS webpack compilation - 64.905s

14:26:28 PM:
warning This is just diagnostic information (enabled by GATSBY_DIAGNOSTIC_STUCK_STATUS_TIMEOUT):

14:26:28 PM:
Gatsby is in "IN_PROGRESS" state without any updates for 300.000 seconds. Activities preventing Gatsby from transitioning to idle state:

14:26:28 PM:
Process will be terminated in 1500.000 seconds if nothing will change.

14:26:28 PM:
- Activity "build" of type "hidden" is currently in state "IN_PROGRESS"

14:26:28 PM:
- Activity "Running jobs v2" of type "hidden" is currently in state "IN_PROGRESS"

14:51:28 PM:
ERROR Terminating the process (due to GATSBY_WATCHDOG_STUCK_STATUS_TIMEOUT):

14:51:28 PM:
- Activity "Running jobs v2" of type "hidden" is currently in state "IN_PROGRESS"

14:51:28 PM:
Gatsby is in "IN_PROGRESS" state without any updates for 1800.000 seconds. Activities preventing Gatsby from transitioning to idle state:

14:51:28 PM:
- Activity "build" of type "hidden" is currently in state "IN_PROGRESS"

@wbertolo

My build takes about 4h as we have 8k images and somehow the incremental build does not work. So, every build takes 4h. Is there a way to not process images at all?

@naeluh

naeluh commented Jul 22, 2022

I am getting this issue as well. Here are the build logs from Gatsby Cloud => https://www.gatsbyjs.com/dashboard/3968ecf3-ded8-4641-ad25-c4801a6f0d9c/sites/730a6654-269f-44cf-9d59-d0a78b2c1906/builds/67e06291-2707-4baa-8885-d63ae8c99aed/details?returnTo=%2Fdashboard%2F3968ecf3-ded8-4641-ad25-c4801a6f0d9c%2Fsites%2F730a6654-269f-44cf-9d59-d0a78b2c1906%2FcmsPreview thanks!

It gets to here, then hangs, and then times out:

Gatsby is in "IN_PROGRESS" state without any updates for 300.000 seconds. Activities preventing Gatsby from transitioning to idle state:

Activity "build" of type "hidden" is currently in state "IN_PROGRESS"
Activity "Building static HTML for pages" of type "progress" is currently in state "IN_PROGRESS"
Process will be terminated in 1500.000 seconds if nothing will change.

@trevj

trevj commented Aug 26, 2022

I just ran into this issue with my blog, hosted on Vercel. I found a workaround for Vercel and spotted a couple of things that might be of interest to anyone working on this issue.

Background: I'm writing a plugin that sources my Instagram posts so they can be included in my blog alongside regular markdown posts. A markdown node is created for each Instagram post; the images are downloaded by gatsby-remark-images-remote for processing with gatsby-plugin-sharp. Basically: my site has a ton of images.

I develop on macOS (M1) and didn't encounter this issue until I tried to deploy on Vercel. The error message is similar to those reported above:

success Running gatsby-plugin-sharp.IMAGE_PROCESSING jobs - 34.278s - 256/256 7.47/s

ERROR

Assertion failed: all worker queries are not dirty (worker #3)

Error: Assertion failed: all worker queries are not dirty (worker #3)

...
...
...

Scouring the comments here, I was able to reproduce locally by setting the environment variable GATSBY_CPU_COUNT=2. I soon realised that by increasing the value of this environment variable, my site would build just fine again.

Surprisingly, the same trick works on Vercel: override the default build command (on the Project Settings page, under Build & Development Settings) and the site deploys just fine:

GATSBY_CPU_COUNT=8 gatsby build

I can't find much documentation on Vercel's build environment, but I presume it's a small VM where GATSBY_CPU_COUNT defaults to 1, which would explain why I don't typically see this issue on my M1. That said, my site actually deploys pretty quickly on Vercel, so I suspect the VM isn't so tiny after all.

A couple of observations that may help debug this issue:

  • The value of GATSBY_CPU_COUNT needed to get my site to build is proportional to the number of images. Even without overriding the default I was able to deploy my site on Vercel if I simply limited the number of Instagram posts to ~10. I currently set it to 8.
  • I configure gatsby-transformer-remark to use both gatsby-remark-images and gatsby-remark-images-remote. If I disable one or the other, my site deploys just fine on Vercel regardless of how many images I include.
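
If you would rather set the variable from a Node script than in the build command, a minimal sketch could look like the following. The helper name withGatsbyCpuCount is hypothetical; the accepted values (a positive integer, "logical_cores" or "physical_cores") are what I understand Gatsby's documentation to describe, so treat that as an assumption and check the docs for your version.

```javascript
// Hypothetical helper (not a Gatsby API): returns a copy of `env` with
// GATSBY_CPU_COUNT set. Assumed accepted values: a positive integer,
// "logical_cores" or "physical_cores".
function withGatsbyCpuCount(env, cpuCount) {
  const isKeyword =
    cpuCount === "logical_cores" || cpuCount === "physical_cores";
  const isPositiveInt = Number.isInteger(cpuCount) && cpuCount > 0;
  if (!isKeyword && !isPositiveInt) {
    throw new RangeError(`Unsupported GATSBY_CPU_COUNT value: ${cpuCount}`);
  }
  // Environment variables are always strings.
  return { ...env, GATSBY_CPU_COUNT: String(cpuCount) };
}

// Usage (left commented out so the snippet has no side effects):
// const { spawnSync } = require("node:child_process");
// spawnSync("npx", ["gatsby", "build"], {
//   stdio: "inherit",
//   env: withGatsbyCpuCount(process.env, 8),
// });
```

This is equivalent to the GATSBY_CPU_COUNT=8 gatsby build command above; it only moves the setting into a script.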

@DevSide
Contributor

DevSide commented Jan 18, 2023

Same problem on 4.22.0. In my case, the error "Assertion failed: all worker queries are not dirty" happens because of an existing cache.

@engineergit

I'm stuck with the same issue, and the build has been running for the last 4+ hours.
Has anyone found a solution to this problem?
Here is the output:
extract queries from components
[ ] 0.091 s 0/2 0% Running gatsby-plugin-sharp.IMAGE_PROCESSING jobs

@cfxd

cfxd commented Feb 5, 2023

@engineergit I ran into the same issue as @stephzero1 did 👆🏻 up there using Gatsby: 4.19.2, Node: 16.14.2, npm: 8.5.0, macOS: 12.4. Turns out it's related to sharp (this one and this one).

What resolved this for me was to simply run $ brew install vips (thanks to @lovell's suggestion).

@yanchenhao57

I was troubled by this problem for a long time. Today I upgraded Gatsby from 4.x to 5.x and upgraded all the gatsby-xxx-xxx plugins used in gatsby-config.js to their latest versions, and the problem was solved 😆

@robwilkerson

Ugh, I'm not yet in a position to upgrade to v5 and I'm still seeing this error when running a build inside a Docker container:

success Building Rendering Engines - 190.084s
success Building HTML renderer - 162.631s
success Execute page configs - 0.283s
success Validating Rendering Engines - 18.172s
success Caching Webpack compilations - 0.009s


 ERROR #85928

An error occurred during parallel query running.
Go here for troubleshooting tips: https://gatsby.dev/pqr-feedback



  Error: Worker exited before finishing task

  - index.js:117 ChildProcess.<anonymous>
    [app]/[gatsby-worker]/dist/index.js:117:45

  - node:events:513 ChildProcess.emit
    node:events:513:28

  - child_process:291 Process.ChildProcess._handle.onexit
    node:internal/child_process:291:12


not finished Running gatsby-plugin-sharp.IMAGE_PROCESSING jobs - 645.111s
not finished run queries in workers - 6.742s

qemu: uncaught target signal 11 (Segmentation fault) - core dumped
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
qemu: uncaught target signal 11 (Segmentation fault) - core dumped

I've tried upping the number of CPUs (both at the container and Gatsby level), but I never seem to be able to get past it. Is there any hope at all?

@eugenehp

eugenehp commented Oct 3, 2023

Sharing our similar problem and how we solved it.

In our case, some page URLs were generated with special symbols like *, Ü, and ). The template code ran a regex filter that failed to find the data for those pages, and as a result gatsby build stalled on the last 50 pages.

We had to go into gatsby/src/worker/pool to find out which URLs were causing the stall, and then fixed it by removing all non-ASCII and regex-sensitive symbols.
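
A minimal sketch of that kind of cleanup, for illustration only: the helper names sanitizePathSegment and sanitizePagePath are hypothetical, and the exact character policy (transliterate accents, then drop remaining non-ASCII and regex metacharacters) is an assumption rather than the code we actually shipped.

```javascript
// Hypothetical helpers (names and character policy are assumptions).
// Strips accents, remaining non-ASCII characters and regex metacharacters
// from one path segment, then turns whitespace runs into hyphens.
function sanitizePathSegment(segment) {
  return segment
    .normalize("NFKD") // split "Ü" into "U" plus a combining mark
    .replace(/[\u0300-\u036f]/g, "") // drop the combining marks
    .replace(/[^\x20-\x7e]/g, "") // drop any remaining non-ASCII characters
    .replace(/[.*+?^${}()|[\]\\]/g, "") // drop regex metacharacters
    .trim()
    .replace(/\s+/g, "-");
}

// Applies the cleanup to each segment so the "/" separators survive.
function sanitizePagePath(pagePath) {
  return pagePath.split("/").map(sanitizePathSegment).join("/");
}

console.log(sanitizePagePath("/products/Ü*)/"));
```

Note that two different source URLs can collapse to the same sanitized path, and a segment made only of stripped characters becomes empty, so it is worth checking for collisions before calling createPage. If you need to keep the original characters in the URL, escaping the value before building the regex is an alternative to removing them.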

@dav92lee

dav92lee commented Oct 31, 2023

I am getting this error as well on Gatsby v5.

success run queries in workers - 30.531s - 157/157 5.14/s
success Running gatsby-plugin-sharp.IMAGE_PROCESSING jobs - 401.241s - 1229/1229 3.06/s
⠋ Merge worker state

 ERROR  UNKNOWN

Assertion failed: all worker queries are not dirty (worker #3)

I realized this file is causing the issue:

import React from "react";
import styled from "styled-components";
import { useStaticQuery, graphql } from "gatsby";

const StyledFlag = styled.img`
  margin: 0;
  width: 18px;
`;

const FlagIcon = ({ name = "" }) => {
  const { flagImages, site } = useStaticQuery(graphql`
    query {
      flagImages: allFile(filter: { relativeDirectory: { eq: "flags" } }) {
        nodes {
          publicURL
          name
        }
      }
    }
  `);
  if (name) {
    const flagIcon = flagImages.nodes.find(
      (f) => f.name === name.replace(/\s+/g, "").replace(/&/g, "and")
    );
    return <StyledFlag src={flagIcon && flagIcon.publicURL} />;
  } else {
    return <></>;
  }
};

export default FlagIcon;

However, I can't pinpoint why that would be the case. It was working on the previous version of Gatsby (v3).

@pierrenel
Contributor

Also seeing this 👀
