Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: remove turbo #8732

Merged
merged 5 commits into from
Jan 26, 2023
Merged

chore: remove turbo #8732

merged 5 commits into from
Jan 26, 2023

Conversation

benmccann
Copy link
Member

@benmccann benmccann commented Jan 26, 2023

The caching that Turbo offers does not buy us a lot. If I change any code or tests I want the tests to run and not be cached. If I just change the docs, I don't necessarily need to wait for the tests to finish running before merging, so the caching isn't much of a win.

While using Turbo does not provide much benefit for our particular use case, it does impose significant costs. The configuration is inherently brittle - it's easy to add something that's written to disk, change project configuration, etc. and forget to update the Turbo config or incorrectly update it. It's extremely difficult to debug. There have been numerous occasions where we see that it's just returning from cache rather than rerunning and it's impossible to tell why it's made the decision that it has and whether it's a misconfiguration on our part or a bug in Turbo. Even when the caching works, it can sometimes be frustrating - e.g. it becomes very difficult to tell if I've fixed a flaky test because I can't just rerun the tests since they will just return from the cache.

There was recently a proposal to setup Nx for Vite and the Vite team decided they didn't yet need the extra layer of complexity. I think that'd be a good call for us too. We don't have a build step, so we don't have much need for caching. There are a few hard things in computer science like naming things and caching. Let's remove this source of difficulty

What Vite did instead was just check for changed files. If we do need some sort of perf boost, I think that would be a simpler solution in our case. Since we don't have a build step, we don't need to retrieve the output of any build jobs, so it removes the need to talk to external services, etc. But I'm not sure we need even that extra layer of complexity

@changeset-bot
Copy link

changeset-bot bot commented Jan 26, 2023

⚠️ No Changeset found

Latest commit: d9c8e8a

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@gsoltis
Copy link

gsoltis commented Jan 26, 2023

Super bummed to hear this! The turborepo team would love a chance to chat about your experience, offer any support we can, and get any feedback you have to offer.

For what it’s worth, I know understanding what files turbo is caching and what inputs get included in the hashing algorithm is a bit of a black box right now and hard to debug. We are actively working on a feature to produce a build artifact including the information turbo used, and if you’re amenable, I’d like to see if having that information available would be sufficient to understand what turbo is doing.

Please feel free to email me at greg.soltis@vercel.com and I would be happy to set up a time.

@dominikg
Copy link
Member

turbo currently prevents us from leveraging githubs log parsing feature to improve playwright test reporting too

#7650 (comment)

instead of faking results by "replaying" past logs, a more obvious approach of skipping something outright, eg do not run test if the change is in /docs only would be an alternative.

There must not be a way that tests appear to pass where they may not have. Any logic determining to skip something needs to be as failsafe as possible and must not silently fail if files are added or scripts are changed.

@gsoltis
Copy link

gsoltis commented Jan 26, 2023

@dominikg From the linked issue, it seems like feature request is detecting the github actions environment and altering the task prefix accordingly to fit github's log line grouping? If I'm getting that correct, that seems very doable.

Re: skipping tests, would one of the --output-logs options help? For instance, new-only would not replay logs of previous test runs that were skipped.

Understanding the files consider by turbo for hashing is definitely the number 1 user pain point currently. The work I mentioned above to address it is in progress in #3431. Feedback on what you would like to see there is definitely welcome.

@Rich-Harris Rich-Harris merged commit 9cb2d0d into master Jan 26, 2023
@Rich-Harris Rich-Harris deleted the rm-turbo branch January 26, 2023 21:47
@benmccann
Copy link
Member Author

I did like new-only and we used it everywhere. I think it might be a nice default.

One of the things I'd most want is to be to print out some details such as the list of the directories that Turbo checked. We got cache hits semi-frequently when I thought we shouldn't be. Sometime we were able to figure out we'd misconfigured something and sometimes we saw strange behavior we thought was likely a bug in turbo. I think it'd be faster to find the misconfigurations to see the computed list of directories and in the cases we thought it was a bug it'd be nice to be able to have something to point at to demonstrate we had things setup correctly as I never felt confident enough we weren't just doing something wrong to actually file an issue.

Ultimately, even if everything were perfect (bug-free, debuggable, etc.), I'm not sure we'd be your ideal users. I think projects with multiple build steps depending on each other would find much more value from Turbo whereas we just distribute the source without building anything.

@gsoltis
Copy link

gsoltis commented Jan 27, 2023

@benmccann The file list you're describing is what we're currently building. Files that were hashed (and which glob they came from) and environment variables hashed, as well as output files cached as a result when a task does run.

I understand your build graph might not be super deep or complicated, our aim is that you would still benefit from caching, especially when it comes to only running necessary tests. It's clear we have some work to do to make it easily configurable in that regard.

@dominikg
Copy link
Member

@dominikg From the linked issue, it seems like feature request is detecting the github actions environment and altering the task prefix accordingly to fit github's log line grouping? If I'm getting that correct, that seems very doable.

Yes, the main problem is that unremovable prefixes alter the actual output an prevent tools from working that parse the output and don't expect the prefixes. So the minimal solution would be to just have a way to omit the prefixes. To retain the information provided, detecting environments and making use of their grouping feature is one way, there may be others.

you would still benefit from caching, especially when it comes to only running necessary tests.

I don't think there is a failsafe way to do this and risk/reward just doesn't pan out for kit in my opinion. The story might be different for larger internal monorepos where you have a bigger build graph, more control, less contributors and less dependants. One botched sveltekit release because of a false positive cache hit undoes any savings we could ever have from this and more.

@gsoltis
Copy link

gsoltis commented Jan 29, 2023

the minimal solution would be to just have a way to omit the prefixes

This is in progress

I don't think there is a failsafe way to do this

I get it, especially for something expected to gate a production release. How is your test suite when run locally or via CI for a PR? Do you think caching would be a benefit there? I'd be curious if there's a divergence between e.g. caching-enabled runs on a PR vs no-caching pre-release CI runs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants