Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Client API: don't read entire response into a buffer before parsing it #977

Open
bboreham opened this issue Jan 24, 2022 · 9 comments
Open

Comments

@bboreham
Copy link
Member

The API client functions all follow this pattern, where they make an http call to get back a []byte then decode the JSON:

_, body, warnings, err := h.client.DoGetFallback(ctx, u, q)
if err != nil {
return nil, warnings, err
}
var qres queryResult
return model.Value(qres.v), warnings, json.Unmarshal(body, &qres)

For larger responses, this buffer gets quite expensive (see #976).

I propose that instead we parse JSON from the response body as it comes in.
I can see that this would make handling timeouts more complicated.

@kakkoyun
Copy link
Member

Thanks for bringing it up. I'm happy to accept any changes for the API part. It is still considered experimental and we have wiggle room to experiment.
@bboreham Do you need a hand on this or do you plan to take care of it?

@bboreham
Copy link
Member Author

Right now I don't expect to start on this. It's quite a big job and not critical to what I'm doing.

There was some discussion in #763 of code-generating the API, which seems an attractive way to make such structural changes.

@importhuman
Copy link

Hey @kakkoyun, can I take this up?

This is basically changing the methods to return parsed data instead of []byte, or something else?

@bboreham
Copy link
Member Author

The current code calls h.client.DoGetFallback() which returns a []byte, via bytes.Buffer.ReadFrom().
(This call is made in multiple places, for each different API method).

My idea was to parse JSON from resp.Body, and not create the buffer.

@importhuman
Copy link

The []byte is being returned by a Do method in client.go. Did you mean changing this method to just return the response and not return []byte (and consequently wherever changes are required), or did I go too deep?

Full flow through which I reached here: DoGetFallback -> apiClientImpl.Do -> httpClient.Do

@importhuman
Copy link

importhuman commented Jun 20, 2022

Had some discussion on this thread on Slack that the aim is likely to parse resp.Body directly into JSON, would like to clarify some more points:

  • Should the unmarshalling be done as early as possible when the actual HTTP request is made, or as late as possible?
  • Should there be new methods that use this signature, or should existing ones be modified?

cc @kakkoyun @bboreham

@kakkoyun
Copy link
Member

kakkoyun commented Jun 20, 2022

@importhuman

  • Should the unmarshalling be done as early as possible when the actual HTTP request is made, or as late as possible?
  • Should there be new methods that use this signature, or should existing ones be modified?

I think we need to check the codebase for this.

If the result of this method is always consumed as JSON, we should unmarshal once and as early as possible.

In any case, we should do this by introducing a new method and marking older methods as deprecated. We can remove them with v2.
In this way, we can make sure we don't break compatibility for external consumers.

@bboreham
Copy link
Member Author

The main API is methods like:

func (h *httpAPI) Query(ctx context.Context, query string, ts time.Time) (model.Value, Warnings, error) {

this returns structured data, not JSON.

My proposal is to change how we arrive at the structured data.

No public API needs to change to achieve this.

importhuman added a commit to importhuman/client_golang that referenced this issue Jul 18, 2022
Signed-off-by: Ujjwal Goyal <importujjwal@gmail.com>
importhuman added a commit to importhuman/client_golang that referenced this issue Sep 5, 2022
Signed-off-by: Ujjwal Goyal <importujjwal@gmail.com>
@bboreham
Copy link
Member Author

I've just noticed that the code calls the JSON parser twice to read each request. Once in `apiClientImpl.Do:

if jsonErr := json.Unmarshal(body, &result); jsonErr != nil {

and once in the individual API methods like Query:

return qres.v, warnings, json.Unmarshal(body, &qres)

Then, different data types make nested calls:

err := json.Unmarshal(b, &v)

I suspect the first two could be merged without major change, but the last kind needs the code rewritten in "streaming" style to avoid building an intermediate buffer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants