bigquery: No way to get job statistics from Query #4745

Closed
jameshartig opened this issue Sep 11, 2021 · 11 comments · Fixed by #4748
Labels: api: bigquery, type: feature request

Comments

@jameshartig

Is your feature request related to a problem? Please describe.
I'd like to use Query.Read to run a query and read its rows, since it can use the fast path queries API. However, the returned iterator does not give me any statistics on the query.

Describe the solution you'd like
Can the job's Statistics be put on the iterator so we could get them for metrics/billing purposes without having to resort to using the jobs.insert API?

Describe alternatives you've considered
Alternatively, a new method could be added that only does the fast path and returns an error if it's not supported, but this seems more complicated than just adding another field/method onto the iterator.

@jameshartig added the "triage me" label Sep 11, 2021
@product-auto-label bot added the "api: bigquery" label Sep 11, 2021
@shollyman
Contributor

What stats are you interested in? A specific subset, or all?

@shollyman added the "type: feature request" label and removed the "triage me" label Sep 11, 2021
@jameshartig
Author

jameshartig commented Sep 11, 2021

> What stats are you interested in? A specific subset, or all?

Ideally as many as possible, but for now just TotalBytesProcessed and CacheHit.

I realize the response from the queries API only exposes certain fields.

@shollyman
Contributor

I wonder whether it would be sufficient to expose the RowIterator's source. With the job reference, you could then fetch the job metadata at your leisure.

@jameshartig
Author

In the case of the slow path the source would be a Job and in the fast path the source would be the QueryResponse? That sounds good to me!

@shollyman
Contributor

Yeah, both paths retain a reference to the job identifier/reference. However, a RowIterator can be instantiated from a table directly, or by a query. So we either have a job reference or a table reference.

@shollyman
Contributor

Take a look at #4748 and see if that would address your needs.

@jameshartig
Author

Yep! That should work perfectly. Thanks for such a quick turnaround! I wasn't originally aware that the queries BQ endpoint still generates a job whose status you can fetch, but I see now that it returns a JobReference. Great!

gcf-merge-on-green bot pushed a commit that referenced this issue Sep 15, 2021
RowIterator can currently come from calling Read() on a Table,
Read() on a Job, and Read() directly from a Query.  The third invocation
is used for fast path queries and doesn't return a query identifier
suitable for gathering statistics.

This PR exposes a SourceJob() method on RowIterator to address this
issue.  Users still get the benefits of optimized query execution, but
can get the reference to a Job for looking up additional statistics
should they need it.

Fixes: #4745
@jameshartig
Author

@shollyman thank you so much for the quick resolution! 🙏

@jameshartig
Author

@shollyman what's the process for getting this into a release? I saw that v0.95.0 was created for the repo, but it seems like the bigquery package is on v1.22.0 and follows a different versioning scheme?

@shollyman
Contributor

Working on that this week, actually. bigquery is its own submodule and is released independently. Keep an eye on #4738 for updates.

@shollyman
Contributor

https://github.com/googleapis/google-cloud-go/releases/tag/bigquery/v1.23.0
