Update expfmt/text_parse.go to support the new UTF-8 syntax #554

Open
Tracked by #527
ywwg opened this issue Dec 14, 2023 · 13 comments
@ywwg
Contributor

ywwg commented Dec 14, 2023

text_parse is a Go implementation of a parser for the plain Prometheus text format.

@jmichalek132

I spoke with @ywwg and I would be willing to help with this one over the holidays. Could I get it assigned to me, please?

@jmichalek132

Notes from the chat:

The basic changes are:

  • The current code assumes there must be a metric name before "{". Now it is OK if a line begins with "{".
  • Currently, terms inside {} must have an operator, like = or =~. Now, if there is a quoted term without an operator, that term is the metric name. There can only be one term without an operator.
  • The left-hand side of operators inside {} can be quoted.
  • For HELP, TYPE, and UNIT lines, the metric name may be in quotes.
  • The best way to attack this is probably to create the test cases first and then fix the code so they all pass.
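
To make the list above concrete, here is a minimal Go sketch of how a parser might read the series part of a sample line (everything before the value) under the new rules. It is only an illustration and not the text_parse.go implementation: the `series` type and the `parseSeries` and `readQuoted` helpers are hypothetical names, only the `=` operator is handled, and Go's string-literal unescaping via `strconv.Unquote` is assumed to be an acceptable stand-in for the format's escaping.

```go
package main

import (
	"errors"
	"strconv"
	"strings"
)

// series is a hypothetical result type for this sketch, not an expfmt type.
type series struct {
	Name   string
	Labels map[string]string
}

// readQuoted consumes one double-quoted string from the front of s and returns
// its unescaped value plus the remaining input. Assumption: Go string-literal
// escaping is close enough to the exposition format's escaping for a sketch.
func readQuoted(s string) (val, rest string, err error) {
	if !strings.HasPrefix(s, `"`) {
		return "", s, errors.New("expected a double-quoted string")
	}
	q, err := strconv.QuotedPrefix(s)
	if err != nil {
		return "", s, err
	}
	val, err = strconv.Unquote(q)
	if err != nil {
		return "", s, err
	}
	return val, s[len(q):], nil
}

// parseSeries parses the part of a sample line that precedes the value.
func parseSeries(s string) (series, error) {
	out := series{Labels: map[string]string{}}
	open := strings.IndexByte(s, '{')
	if open < 0 {
		out.Name = strings.TrimSpace(s)
		return out, nil
	}
	out.Name = strings.TrimSpace(s[:open]) // empty when the line begins with "{"
	rest := s[open+1:]
	for {
		rest = strings.TrimSpace(rest)
		if strings.HasPrefix(rest, "}") {
			return out, nil
		}
		var lhs string
		var err error
		quoted := strings.HasPrefix(rest, `"`)
		if quoted {
			// The left-hand side of a term may now be quoted.
			if lhs, rest, err = readQuoted(rest); err != nil {
				return out, err
			}
		} else {
			end := strings.IndexAny(rest, "=,}")
			if end < 0 {
				return out, errors.New("unterminated label set")
			}
			lhs, rest = strings.TrimSpace(rest[:end]), rest[end:]
		}
		if lhs == "" {
			return out, errors.New("empty term")
		}
		rest = strings.TrimLeft(rest, " \t")
		if strings.HasPrefix(rest, "=") {
			var val string
			if val, rest, err = readQuoted(strings.TrimLeft(rest[1:], " \t")); err != nil {
				return out, err
			}
			out.Labels[lhs] = val
		} else if !quoted {
			return out, errors.New("unquoted term without an operator")
		} else if out.Name != "" {
			// Only one term without an operator is allowed.
			return out, errors.New("more than one metric name")
		} else {
			out.Name = lhs // a quoted term without an operator is the metric name
		}
		rest = strings.TrimLeft(rest, " \t")
		if strings.HasPrefix(rest, ",") {
			rest = rest[1:]
		} else if !strings.HasPrefix(rest, "}") {
			return out, errors.New("expected ',' or '}'")
		}
	}
}
```

With this sketch, `parseSeries` accepts both `{"my.noncompliant.metric", label="value"}` and the classic `metric{label="value"}` form, and rejects a label set that contains two quoted terms without an operator. These three cases are also a reasonable starting point for the test cases mentioned in the last bullet.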

@bwplotka
Member

👍🏽 Thanks!

@ywwg
Contributor Author

ywwg commented Jan 11, 2024

@jmichalek132 checking in, any progress on this?

@jmichalek132

> @jmichalek132 checking in, any progress on this?

Sorry, unfortunately not. I was sick over the holidays, but I'll take a stab at this over the weekend.

@jmichalek132

jmichalek132 commented Jan 15, 2024

I'm finally looking into this today. The first thing that is unclear to me from the proposal is how the metric name should be handled in the Prometheus metadata lines (HELP, TYPE).
I assume that if the metric name is valid under the current rules, it should keep working as is.
If it contains UTF-8 characters, it has to be quoted in the metadata.
Is this assumption correct, @ywwg?

Examples:
This should be ok:

# HELP "my.noncompliant.metric" help text
# TYPE "my.noncompliant.metric" counter
{"my.noncompliant.metric", label="value"} 1

This is not OK because of this rule from the proposal:
Escape syntax if the metric has a quote: sum(rate({"my \"quoted\" metric", region="east"}[5m])) or use single quotes in PromQL (not available in the exposition format): sum(rate({'my "quoted" metric', region="east"}[5m]))

# HELP 'my.noncompliant.metric' help text
# TYPE 'my.noncompliant.metric' counter
{'my.noncompliant.metric', label="value"} 1

This is not ok:

# HELP my.noncompliant.metric help text
# TYPE my.noncompliant.metric counter
{"my.noncompliant.metric", label="value"} 1

This is not ok either:

# HELP my.noncompliant.metric help text
# TYPE "my.noncompliant.metric" counter
{"my.noncompliant.metric", label="value"} 1

Also, when implementing all these changes, should I do it in a way that lets UTF-8 support be turned on with a flag, or just implement the changes directly?

@jmichalek132

jmichalek132 commented Jan 15, 2024

Also, when the metric name is surrounded by double quotes, I assume we need to validate that any double quote inside the metric name is properly escaped, right? Otherwise the parsing could break.
I.e. {"my "quoted\" metric", label="value"} 1

@ywwg
Contributor Author

ywwg commented Jan 17, 2024

In the other parsers, I changed them so they can just read the new format without a flag. It was not practical to create switching logic without duplicating all the parsers. Let's start by not using a flag.

> I assume if the metric name is valid it should keep working as is.
> If it has utf-8 characters it has to be quoted around in the metadata.

Yes that's correct.
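
As a rough illustration of that rule, a name only needs quotes when it falls outside the legacy metric name pattern. The sketch below assumes the legacy pattern is `[a-zA-Z_:][a-zA-Z0-9_:]*`; `needsQuoting` is a hypothetical helper name for this example, not a function in prometheus/common.

```go
package main

import "regexp"

// legacyNameRE is the assumed pre-UTF-8 metric name pattern.
var legacyNameRE = regexp.MustCompile(`^[a-zA-Z_:][a-zA-Z0-9_:]*$`)

// needsQuoting reports whether a metric name must be written in double quotes
// on HELP/TYPE lines and inside {} on sample lines. Hypothetical helper.
func needsQuoting(name string) bool {
	return !legacyNameRE.MatchString(name)
}
```

Under this assumption `http_requests_total` stays unquoted everywhere, while `my.noncompliant.metric` has to be quoted on the HELP line, the TYPE line, and the sample line alike.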

The current parser supports escaping already, so it should be fine to use that escaping inside double quotes for the TYPE and HELP lines:

# HELP "my.\"noncompliant\".metric" help text
# TYPE "my.\"noncompliant\".metric" counter
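
A small Go sketch of reading such a line, for illustration only: `splitMetadataLine` is a hypothetical helper and not the existing parser code, and it leans on `strconv.QuotedPrefix` and `strconv.Unquote`, assuming Go string-literal escaping is close enough to the format's `\\`, `\"` and `\n` escapes.

```go
package main

import (
	"errors"
	"strconv"
	"strings"
)

// splitMetadataLine splits a "# HELP ..." or "# TYPE ..." line into the
// keyword, the (possibly quoted) metric name, and the remaining text.
// Hypothetical helper for illustration.
func splitMetadataLine(line string) (string, string, string, error) {
	body := strings.TrimSpace(line)
	if !strings.HasPrefix(body, "#") {
		return "", "", "", errors.New("not a comment line")
	}
	body = strings.TrimSpace(body[1:])
	keyword, body, ok := strings.Cut(body, " ")
	if !ok {
		return "", "", "", errors.New("missing metric name")
	}
	body = strings.TrimLeft(body, " \t")
	if strings.HasPrefix(body, `"`) {
		// Quoted name: take the quoted prefix and unescape it.
		q, err := strconv.QuotedPrefix(body)
		if err != nil {
			return "", "", "", err
		}
		name, err := strconv.Unquote(q)
		if err != nil {
			return "", "", "", err
		}
		return keyword, name, strings.TrimSpace(body[len(q):]), nil
	}
	// Unquoted legacy name: it ends at the first space.
	name, rest, _ := strings.Cut(body, " ")
	return keyword, name, strings.TrimSpace(rest), nil
}
```

For the TYPE line above, this sketch yields the keyword `TYPE`, the name `my."noncompliant".metric`, and the remainder `counter`.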

ywwg changed the title from "Update expfmt/text_parse.go to support the new UTF8 syntax" to "Update expfmt/text_parse.go to support the new UTF-8 syntax" on Feb 8, 2024
ywwg added a commit to ywwg/prometheus that referenced this issue Feb 15, 2024
This adds support for the new grammar of `{"metric_name", "l1"="val"}` to promql and some of the exposition formats.
This grammar will also be valid for non-UTF-8 names.
UTF-8 names will not be considered valid unless model.NameValidationScheme is changed.

This does not update the go expfmt parser in text_parse.go, which will be addressed by prometheus/common#554.

Part of prometheus#13095

Signed-off-by: Owen Williams <owen.williams@grafana.com>
paveldroo pushed a commit to paveldroo/prometheus that referenced this issue Feb 21, 2024
aknuds1 pushed a commit to grafana/mimir-prometheus that referenced this issue Feb 26, 2024
@ywwg
Contributor Author

ywwg commented Mar 6, 2024

Checking in on this again, is there a PR attached?

@jmichalek132

Sorry for the delays. I didn't have a chance to make much progress on this; if the delays are an issue, please do reassign it to someone else. Otherwise I should be able to get time to work on it and publish a draft next weekend.

@ywwg
Contributor Author

ywwg commented Mar 11, 2024

Thanks for letting me know! Yeah, I think we need to move ahead on this more quickly, so I'll be reassigning it. I appreciate your contribution regardless.

@ywwg
Contributor Author

ywwg commented Mar 11, 2024

@fedetorres93

@fedetorres93

I'll take this issue 👍
