Specifying Folders In buf generate --path Arguments Causes Massive Performance Hit #2713

Open
JesseObrien opened this issue Jan 18, 2024 · 3 comments

@JesseObrien

Hi, I've been chatting through this problem in a thread on the buf Slack, mostly with @jhump.

The problem arises when we call buf generate with a folder in the --path arguments rather than with individual files.

In simple terms:
A) buf generate --config=buf.yaml --path=/foo/bar/directory takes ~1 minute to generate .ts files for 11 .proto files nested in that directory.
B) buf generate --config=buf.yaml --path=/foo/bar/directory/file1.proto,/foo/bar/directory/file2.proto,... takes <1 second to generate .ts files for the same 11 .proto files nested in the same directory.

The root folder we're calling buf generate from is a very large monorepo with hundreds of thousands of files. If we do not recursively expand all .proto files and inject them into that one --path argument (or specify them as 11 separate --path arguments), buf generate becomes at least 60x slower.

If I can provide any more context let me know. I verified this by running buf generate a bunch of different times without expanding the files and specifying the folder to make sure it's the folder that's causing it.
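For anyone hitting the same thing, the workaround described above (expanding the folder into an explicit file list) can be sketched roughly as follows. This is only an illustration: `proto_paths_csv` is a hypothetical helper name, not part of buf, and it assumes bash and that no file name contains a comma or a newline.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of buf): print every .proto file under a
# directory as one comma-separated string for a single --path argument.
proto_paths_csv() {
  local dir="$1"
  # Sort for a stable order, then join the lines with commas.
  find "$dir" -type f -name '*.proto' | LC_ALL=C sort | paste -sd, -
}

# Usage (not executed here): pass the expanded list instead of the folder.
# buf generate --config=buf.yaml --path="$(proto_paths_csv /foo/bar/directory)"
```

The `find` call only walks the target folder, so the expansion itself stays cheap even when the repo root is huge.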

@bufdev
Member

bufdev commented Jan 18, 2024

That doesn't seem that unexpected - if buf has to search /foo/bar/directory for all relevant .proto files, that's going to take some time (and I'm certain that however buf searches for it is not as optimized as some typical bash tools are) - we can look into optimizing that path a bit, but searching a directory with 100,000+ files for 11 specific .proto files is going to take some time.

@jhump
Member

jhump commented Jan 18, 2024

@bufdev, IIRC, the foo/bar/directory folder does not have that many files. The issue is that the "input" to buf was unspecified, and thus defaulted to the current working directory. The current working directory is the root of the repo and huge. When --path indicates a file, it is fast. But it seems like --path with a directory name isn't actually looking only at that one directory but instead collecting everything in the "input" module (so scanning the huge repo root directory) and then filtering the result based on prefix match. (The above is my suspicion based on the observed behavior; I haven't gone through the implementation code yet to confirm what it's doing.)
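To make the suspicion concrete, here is a rough shell model of the two strategies. It is not buf's implementation, which has not been checked here; `scan_then_filter` and `scan_target_only` are hypothetical names, and the sketch assumes bash and a root path without a trailing slash.

```shell
#!/usr/bin/env bash
# Model of the suspected behavior: walk the whole input root, then keep
# only the paths that sit under the requested sub-directory prefix.
scan_then_filter() {
  local root="$1" sub="$2"
  find "$root" -type f -name '*.proto' | while IFS= read -r f; do
    case "$f" in
      "$root/$sub"/*) printf '%s\n' "$f" ;;
    esac
  done | LC_ALL=C sort
}

# Model of the expected behavior: walk only the requested sub-directory.
scan_target_only() {
  local root="$1" sub="$2"
  find "$root/$sub" -type f -name '*.proto' | LC_ALL=C sort
}
```

Both functions print the same file list, but the first one visits every file under the root, so its cost grows with the size of the whole repo rather than with the size of the folder named in --path.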

@bufdev
Member

bufdev commented Jan 18, 2024

That shouldn't be the case - we have optimized for that scenario, so it should only do the search on the directory specified in --path. There may be a regression - we have to play with this locally.
