Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add metadata to push payload #9694

Merged
merged 39 commits into from Jul 13, 2023
Merged

Conversation

salvacorts
Copy link
Contributor

@salvacorts salvacorts commented Jun 13, 2023

What this PR does / why we need it:
We are adding support for attaching labels to each log line. This is one of the series of the PRs broken up to make it easier to review changes.

This PR updates the push payload to send labels with each log entry optionally. The log labels are supposed to be in the same format as the stream labels. Just to put it out, here is how it would look for proto and json push payload with same data:

proto(endpoint: (/loki/api/v1/push|/api/prom/push), Content-Type: application/x-protobuf)(payload built using push.Stream):

push.Stream{
	Entries: []logproto.Entry{
			{
				Timestamp: time.Unix(0, 1688515200000000000),
				Line:      "log line",
				Labels:    `{foo="bar"}`,
			},
	},
	Labels: `{app="test"}`,
}

v1(endpoint: /loki/api/v1/push, Content-Type: application/json):

{
    "streams": [{
        "stream": {
            "app": "test"
        },
        "values": [
            ["1688515200000000000", "log line", {
                "foo": "bar"
            }]
        ]
    }]
}

legacy-json(/api/prom/push, Content-Type: application/json):

{
    "streams": [{
        "labels": "{app=\"test\"}",
        "entries": [{
                "ts": "2023-07-05T00:00:00.000000000Z",
                "line": "log line",
                "labels": "{foo=\"bar\"}"
            }]
    }]
}

Which issue(s) this PR fixes:

Special notes for your reviewer:

We may need to add more thoughtful tests.

Checklist

  • Reviewed the CONTRIBUTING.md guide (required)
  • Documentation added
  • Tests updated
  • CHANGELOG.md updated
  • Changes that require user attention or interaction to upgrade are documented in docs/sources/upgrading/_index.md
  • For Helm chart changes bump the Helm chart version in production/helm/loki/Chart.yaml and update production/helm/loki/CHANGELOG.md and production/helm/loki/README.md. Example PR

@pull-request-size pull-request-size bot added size/L and removed size/M labels Jun 13, 2023
pkg/logql/log/fmt_test.go Outdated Show resolved Hide resolved
Copy link
Member

@owen-d owen-d left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future, please add a description about the PR to help reviewers. A bit of history + the design choices taken and the implications is sufficient and makes it much easier to know what I'm looking at.

As for the PR, I think JSON content types should support json sub-structure like we do for stream labels in https://grafana.com/docs/loki/latest/api/#push-log-entries-to-loki

Basically:

{
  "streams": [
    {
      "stream": {
        "label": "value"
      },
      "values": [
          [ "<unix epoch in nanoseconds>", "<log line>", {"traceID": "foo"} ],
          [ "<unix epoch in nanoseconds>", "<log line>", {"order_id": "bar"} ]
      ]
    }
  ]
}

type PushRequest struct {
Streams []*Stream `json:"streams"`
Streams []LogProtoStream `json:"streams"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously this stored a pointer, now it stores a struct. Not sure we want this (would need to be tested independently for performance)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comes from this PR:
#1145

The only reason why I think this is a pointer is so we don't make a copy of the stream when calling NewStream which receives a ptr to loghttp.Stream
https://github.com/grafana/loki/pull/1145/files#diff-71619e1c80a73b34eade235a55d012a0ddbb3375b8d4ac89c1f4fd672145b915R34

With Sandeep's changes, we no longer use that since now we decode from json to loghttp.PushRequest to then cast it to logproto.PushRequest

Streams: *(*[]logproto.Stream)(unsafe.Pointer(&request.Streams)),

So having that as a ptr or not shouldn't make much difference.


// LogProtoStream helps with unmarshalling of each log stream for push request.
// This might look un-necessary but without it the CPU usage in benchmarks was increasing by ~25% :shrug:
type LogProtoStream struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes more sense to implement UnmarshalJSON on logproto.Stream directly -- is that feasible?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree!

Edit: I think that's not possible right-away. logproto.Stream is an alias to push.Stream provided in the package github.com/grafana/loki/pkg/push which is used in this package.

UnmarshalJSON uses the LabelsSet, so pkg/push would need to include pkg/loki which would end up in a cycling dependency.
https://github.com/grafana/loki/pull/9694/files#diff-5b050fb13a302741e2f4a781fa54987e96da67696ae36ea41aa971ef431bfeccR84

I think this Unmarshalling method make sense to live here since:

  • This pkg is the one needing to populate a logproto.Stream from a json. Moreover, this json has a format for entries (json array) that only this package should know/care about.
  • Workarounding the cycling dependency would require adding more changes than the ones required to implement it here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can bring all the parsing code to a single place when we work on the endpoint for supporting native OTLP ingestion.

@@ -18,6 +18,7 @@ func init() {
type Entry struct {
Timestamp time.Time
Line string
Labels string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making this a string feels counterintuitive compared to using map[string]string, which is what we use via the LabelSet struct used in our v1 push payloads in loghttp/push. Point being, I think we should use JSON directly (not json with embedded stringified json fields) when application/json is the Content-Type. When snappy-encoding is used, logproto can use the string variants for consistency with how it treats labels.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem I see with making this a map, and keeping as a string the labels for an entry in logproto, is that converting from loghttp.Entry to logproto.Entry is going to be way slower.

Right now (both with string labels), we just cast the pointer for the entries array of a stream since they have the same memory layout.

entries := *(*[]logproto.Entry)(unsafe.Pointer(&s.Entries))

If we decide to make one a map and another one a string, we won't be able to do that cast anymore (different memory layouts). Hence, we'll need to go entry by entry converting it from loghttp.Entry to logproto.Entry.

IMO, we should make both use the same layout.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I implemented this in f43dfab

This is what we need to do instead of the cast:

loki/pkg/loghttp/query.go

Lines 236 to 239 in f43dfab

entries := make([]logproto.Entry, len(s.Entries), len(s.Entries))
for i, e := range s.Entries {
entries[i] = e.ToProto()
}

If we decide to keep this as a string, we can revert the commit. If we decide to make logproto.Entry to use a map for labels, we can build on top of it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

according to our API docs, we use JSON object for stream labels )https://grafana.com/docs/loki/latest/api/#push-log-entries-to-loki
so, it makes sense to use JSON object for line metadata also… but you are right @salvacorts that there is no need to convert it to the string when we convert loghttp.Entry to logproto.Entry ….

Copy link
Member

@owen-d owen-d Jul 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@salvacorts You're right, making this anything but a string here will result in performance losses because this it would need to parse labels potentially on every entry coming back from subqueries. Since this happens before we filter out unnecessary entries (over limit, etc), this can quickly result in higher resource utilization, OOMs, etc.

Now, we have a few options:

  1. Keep this as a string in memory, but turn it into a map when return the json response from the http api. This will maintain performance (no conversion from protos) & correctly return a map via the json api.
  2. Use LabelSet for this field (stored as a map). This requires parsing strings -> json for potentially every log line returned from queriers (yikes!). Even when combined with (4), this would require transforming slice -> map.
  3. Use labels.Labels (stored as a slice). When combined with (4) this incurs no transformation cost and the resulting conversion to the json map for the external API is simple & relatively inexpensive.
  4. Due to the performance implications here, I've changed my opinion on making EntryAdapter.labels a string for parity with StreamAdapter.labels. Instead, I think we should use the LabelAdapter pattern to send the labels in the protos as a slice and combine it with (3) to not need to pay conversion costs.

Does this make sense?

P.S. We'll actually need to do this on the Entry and not EntryAdapter types (we only keep the adapter ones around for bench comparisons, but they're unused). Happy to talk through this more when I've got a bit more time, but we did this primarily for serialization optimizations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would vote for 3 and 4. looks like the most optimal way for us. wdyt @salvacorts ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. Implemented it in 4028758. We now use labels.Labels in both loghttp.Entry and logproto.Entry.

I wonder if we should take care of duplicated labels. I think may not (?) provided the push payload in json is an object so it should have duplicated keys (is that somehow enforced?).

Note that proto3 supports maps. In the wire, the map is the same as an array so it needs extra logic to have the map API. I ran a benchmark and looks like maps compared to arrays are ~x3 slower to marshal and ~x2 slower to unmarshal (happy to share/discuss the benchmark). Therefore I agree the way to go is using arrays in protobuf.

Copy link
Contributor

@vlad-diachenko vlad-diachenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks great )

pkg/push/push.proto Outdated Show resolved Hide resolved
integration/client/client.go Outdated Show resolved Hide resolved
vendor/github.com/grafana/loki/pkg/push/push.proto Outdated Show resolved Hide resolved
if parseError != nil {
return nil, parseError
}
if err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels weird -- is there a chance err can be non-nil but parseError is nil?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error handling of the implementation for jsonparser.ArrayEach doesn't seems correct to me. The error being passed to the callback is weird as it'll always be nil. Then, after calling the callback, the library checks for the value of the error, which will always be nil since it's passed by value to the callback. There's an issue open for this:
buger/jsonparser#255

IIUC, parseError can be nil while err can be non-nil as it can return for example a MalformedJsonError.
Having said that, if err is not nil, we should return err, not parseError.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done with 384d556

case 2: // labels
labels := make(LabelSet)
err := jsonparser.ObjectEach(value, func(key, val []byte, _ jsonparser.ValueType, _ int) error {
labels[yoloString(key)] = yoloString(val)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we need to check types for the values here to ensure they're strings?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done with 384d556

pkg/push/push.proto Outdated Show resolved Hide resolved
@@ -126,6 +143,8 @@ func (EntryEncoder) Encode(ptr unsafe.Pointer, stream *jsoniter.Stream) {
stream.WriteRaw(`"`)
stream.WriteMore()
stream.WriteStringWithHTMLEscaped(e.Line)
stream.WriteMore()
stream.WriteString(e.Labels)
Copy link
Member

@owen-d owen-d Jun 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These types are used in the external API, meaning we'd return stringified labels as a field, which we shouldn't do. We should use json semantics here. Ideally, we should also reflexively be able to encode -> decode, meaning json unmarshaling should expect a json object of non-indexed labels as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done with f43dfab

Copy link
Contributor

@vlad-diachenko vlad-diachenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment on lines 66 to 67
Name: yoloString(key),
Value: yoloString(value),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have doubts about using yoloString here.
if underlying array of value is reused , the values here will be also updated

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should remove yoloString here. I had used it earlier because the labels were anyways converted to string immediately. The references to string are now being retained longer so it is not safe to use yoloString here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed in Slack, removed.

Comment on lines 565 to 567
func yoloString(buf []byte) string {
return *((*string)(unsafe.Pointer(&buf)))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the scariest thing in the PR ) we should use it only if we check that the buffer is not updated, however, a lot of functions in stack trace mean that somebody can do some optimization to reuse the slice and it will break all our implementation. Why do we need it here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed in Slack, removed.

@salvacorts
Copy link
Contributor Author

I ran the changes in this PR for a couple of hours in a dev cluster and seems to be working fine:

Write path
image
image

Read path
image

Manually ran some queries for recent data that succeeded

Copy link
Member

@owen-d owen-d left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

address @sandeepsukhani 's comment about yolostring and LGTM

@salvacorts salvacorts merged commit aae13c3 into main Jul 13, 2023
3 checks passed
@salvacorts salvacorts deleted the salvacorts/metadata-push-payload branch July 13, 2023 07:13
salvacorts added a commit that referenced this pull request Jul 17, 2023
**What this PR does / why we need it**:

In #9694, we modified the `push` pkg
to import Prometheus as a dependency to use the `labels.Labels` type for
the entries' non-indexed labels.

Having Prometheus as a dependency is problematic since any project
importing the `push` pkg will need to import Prometheus as a result. In
fact, this is one of the reasons why the `push` pkg was extracted from
Loki in the first place (#8259).

This PR removes the dependency of Prometheus from the `push` pkg by
copying some bits of the implementation for Prometheus' `labels.Labels`.
We copy:
- The Labels struct definition
- The JSON Marshaling and Unmarshaling methods for the labels. We need
this so labels are encoded as maps instead of an array of objects.

---

**Notes for reviewers:**

- To implement the JSON Marshaling and Unmarshaling methods the `push`
pkg now depends on `golang.org/x/exp`. I think it should be fine for
projects importing the `push` pkg to also depend on `golang.org/x/exp`,
right?
salvacorts added a commit that referenced this pull request Jul 17, 2023
**What this PR does / why we need it**:
In #9694 we added support for adding
metadata labels to each entry in a push payload. In this PR we update
the following metrics to take into account the bytes needed to store
those labels:

-  `loki_distributor_bytes_received_total`
- `distributor_bytes_received`

**Which issue(s) this PR fixes**:
Fixes #<issue number>

**Special notes for your reviewer**:

**Checklist**
- [ ] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [ ] Documentation added
- [ ] Tests updated
- [ ] `CHANGELOG.md` updated
- [ ] If the change is worth mentioning in the release notes, add
`add-to-release-notes` label
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/upgrading/_index.md`
- [ ] For Helm chart changes bump the Helm chart version in
`production/helm/loki/Chart.yaml` and update
`production/helm/loki/CHANGELOG.md` and
`production/helm/loki/README.md`. [Example
PR](d10549e)

---------

Co-authored-by: Sandeep Sukhani <sandeep.d.sukhani@gmail.com>
grafanabot pushed a commit that referenced this pull request Jul 17, 2023
**What this PR does / why we need it**:

In #9694, we modified the `push` pkg
to import Prometheus as a dependency to use the `labels.Labels` type for
the entries' non-indexed labels.

Having Prometheus as a dependency is problematic since any project
importing the `push` pkg will need to import Prometheus as a result. In
fact, this is one of the reasons why the `push` pkg was extracted from
Loki in the first place (#8259).

This PR removes the dependency of Prometheus from the `push` pkg by
copying some bits of the implementation for Prometheus' `labels.Labels`.
We copy:
- The Labels struct definition
- The JSON Marshaling and Unmarshaling methods for the labels. We need
this so labels are encoded as maps instead of an array of objects.

---

**Notes for reviewers:**

- To implement the JSON Marshaling and Unmarshaling methods the `push`
pkg now depends on `golang.org/x/exp`. I think it should be fine for
projects importing the `push` pkg to also depend on `golang.org/x/exp`,
right?

(cherry picked from commit 15af77b)
vlad-diachenko pushed a commit that referenced this pull request Jul 17, 2023
Backport 15af77b from #9937

---

**What this PR does / why we need it**:

In #9694, we modified the `push` pkg
to import Prometheus as a dependency to use the `labels.Labels` type for
the entries' non-indexed labels.

Having Prometheus as a dependency is problematic since any project
importing the `push` pkg will need to import Prometheus as a result. In
fact, this is one of the reasons why the `push` pkg was extracted from
Loki in the first place (#8259).

This PR removes the dependency of Prometheus from the `push` pkg by
copying some bits of the implementation for Prometheus' `labels.Labels`.
We copy:
- The Labels struct definition
- The JSON Marshaling and Unmarshaling methods for the labels. We need
this so labels are encoded as maps instead of an array of objects.

---

**Notes for reviewers:**

- To implement the JSON Marshaling and Unmarshaling methods the `push`
pkg now depends on `golang.org/x/exp`. I think it should be fine for
projects importing the `push` pkg to also depend on `golang.org/x/exp`,
right?

Co-authored-by: Salva Corts <salva.corts@grafana.com>
sandeepsukhani pushed a commit that referenced this pull request Jul 18, 2023
**What this PR does / why we need it**:
In #9694 we support adding metadata
labels for each entry in the push payload. In this PR we take that
metadata and add it to the entries in the chunk. Supporting
serialization and deserialization of those metadata labels.

---------

Signed-off-by: Vladyslav Diachenko <vlad.diachenko@grafana.com>
Co-authored-by: Vladyslav Diachenko <vlad.diachenko@grafana.com>
salvacorts added a commit that referenced this pull request Sep 12, 2023
**What this PR does / why we need it**:
Even though the[ Loki HTTP API docs for the push endpoint][1] state that
the stream label values should be strings, we previously didn't enforce
this requirement. With #9694, we
started enforcing this requirement, and that broke some users.

In this PR we are reverting this type of assertion and adding a bunch of
tests to avoid the regression in the future.


[1]:
https://grafana.com/docs/loki/latest/reference/api/#push-log-entries-to-loki
grafanabot pushed a commit that referenced this pull request Sep 12, 2023
**What this PR does / why we need it**:
Even though the[ Loki HTTP API docs for the push endpoint][1] state that
the stream label values should be strings, we previously didn't enforce
this requirement. With #9694, we
started enforcing this requirement, and that broke some users.

In this PR we are reverting this type of assertion and adding a bunch of
tests to avoid the regression in the future.

[1]:
https://grafana.com/docs/loki/latest/reference/api/#push-log-entries-to-loki

(cherry picked from commit 5ab9515)
salvacorts added a commit that referenced this pull request Sep 12, 2023
Backport 5ab9515 from #10550

---

**What this PR does / why we need it**:
Even though the[ Loki HTTP API docs for the push endpoint][1] state that
the stream label values should be strings, we previously didn't enforce
this requirement. With #9694, we
started enforcing this requirement, and that broke some users.

In this PR we are reverting this type of assertion and adding a bunch of
tests to avoid the regression in the future.


[1]:
https://grafana.com/docs/loki/latest/reference/api/#push-log-entries-to-loki

Co-authored-by: Salva Corts <salva.corts@grafana.com>
rhnasc pushed a commit to inloco/loki that referenced this pull request Apr 12, 2024
**What this PR does / why we need it**:
Even though the[ Loki HTTP API docs for the push endpoint][1] state that
the stream label values should be strings, we previously didn't enforce
this requirement. With grafana#9694, we
started enforcing this requirement, and that broke some users.

In this PR we are reverting this type of assertion and adding a bunch of
tests to avoid the regression in the future.


[1]:
https://grafana.com/docs/loki/latest/reference/api/#push-log-entries-to-loki
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants