Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add timezones support to promql #10163

Closed
wants to merge 8 commits into from
Closed

Add timezones support to promql #10163

wants to merge 8 commits into from

Conversation

sylr
Copy link
Contributor

@sylr sylr commented Jan 13, 2022

This function aims to solve problems for those who need to account for DST in their queries.

Signed-off-by: Sylvain Rabot <sylvain@abstraction.fr>
@juliusv
Copy link
Member

juliusv commented Jan 13, 2022

Thanks! Others always have more to say about timezones than me, but I do remember a previous conversation about just this:

That second comment from Björn about the edge cases that a simple offset function wouldn't cover is especially relevant to think about.

@juliusv juliusv requested a review from beorn7 January 13, 2022 15:52
@beorn7
Copy link
Member

beorn7 commented Jan 13, 2022

Also a lot of discussion can be found on this design doc: https://docs.google.com/document/d/1xfw1Lb1GIRZB_-4iFVGkgwnpwuBemWfxYqFdBm7APsE/edit#heading=h.bupciudrwmna

tl;dr: This is not as easy as it looks.

That's just very generally about the state of this problem. I haven't looked into this PR in particular, but I'll put this on my review queue for a proper review. (Sadly, my backlog is quite significant here.)

@sylr
Copy link
Contributor Author

sylr commented Jan 13, 2022

Pasted_Image_13_01_2022_19_28

@sylr
Copy link
Contributor Author

sylr commented Jan 13, 2022

Quite frankly I fail to see what is "not that easy".

I don't claim to see all the possible timezone problems one can have but as a +5 years old daily prometheus user, give me this or any equivalent (hour(vector(time()), "Europe/Paris"), hour(vector(time("Europe/Paris"))) ... etc) and prometheus/alertmanager#2782 and you would have solved 100% of my prometheus timezone related problems.

@juliusv
Copy link
Member

juliusv commented Jan 13, 2022

@sylr Yeah, if I understand correctly, because you provide a full time_tz() function and not a pure offset function, your approach doesn't actually have the DST switch problem as described in #8871 (comment). As far as I can see, @beorn7's design doc about time handling doesn't actually say much about time zones, but there's a bit more general stuff in that PR discussion.

While I'm not the timezone guy, there are still at least some things I'd like a reviewer to take into account:

  • Do we want to introduce a PromQL function that does not work on all platforms (or alternatively, can it be made to work on all)?
  • When something is officially part of PromQL, all other projects and vendors who want to use the "Prometheus Compatible" mark will need to implement this. Currently all compatible systems are written in Go (or have a Go layer that creates comaptibility), but does timezone handling introduce any unforeseenly large burden on other implementors?
  • Time zone definitions around the world change regularly (laws around it are being passed all the time). Most people will not be fully up to date with the very latest time zone database changes. Especially given the point above, what does this mean for compatibility testing? Or do we just exempt that area from compatibility for the most part?

That said, if those concerns are at least acknowledged and deemed fine, I would be happy if timezone functionality (such as this or another viable alternative) would make it into Prometheus, so that users can stop using weird workarounds such as the one mentioned in #8871 (comment).

@sylr
Copy link
Contributor Author

sylr commented Jan 13, 2022

Do we want to introduce a PromQL function that does not work on all platforms (or alternatively, can it be made to work on all)?

Here are the possible choices I see:

  • Disallow timezone related functions on Windows
  • Allow them knowing that they are currently broken
  • Build a Windows version with the go build tag timetzdata so that windows build get working timezones but embedded in the binary so not coming from the OS.

The status quo of not having timezone support in PromQL is not an option (according to me).

When something is officially part of PromQL, all other projects and vendors who want to use the "Prometheus Compatible" mark will need to implement this. Currently all compatible systems are written in Go (or have a Go layer that creates comaptibility), but does timezone handling introduce any unforeseenly large burden on other implementors?

Honestly I don't know. However, I would be surprised to find an all-purpose language without the basic functions to do timezone diffs in its standard library.

Time zone definitions around the world change regularly (laws around it are being passed all the time). Most people will not be fully up to date with the very latest time zone database changes. Especially given the point above, what does this mean for compatibility testing? Or do we just exempt that area from compatibility for the most part?

Timezone definitions shouldn't be anyone else's responsibility but the OS.

@juliusv
Copy link
Member

juliusv commented Jan 14, 2022

Yeah, fair views. What do people thinking about adding time_tz() behind an experimental feature flag (--enable-feature=time-tz) maybe for the time being? That way we have some time to decide on the Windows and compatibility questions before making it an official part of PromQL, and at the same time if we decide on a different way of approaching this functionality, the door would still be open to that.

@sylr
Copy link
Contributor Author

sylr commented Jan 14, 2022

What do people thinking about adding time_tz() behind an experimental feature flag (--enable-feature=time-tz) maybe for the time being?

I'm all for it if it can get things moving on the matter.

Signed-off-by: Sylvain Rabot <sylvain@abstraction.fr>
@beorn7
Copy link
Member

beorn7 commented Jan 14, 2022

The discussion about timezone support has happened at many places already:

All of these demonstrate the problems, but also provide a lot of ideas and insights how to approach them.

I don't think it's helpful to start another lengthy discussion in this PR and disregard everything that has been discussed before. (Stating that "quite frankly I fail to see what is 'not that easy'" won't convince the participants of above discussions that they were all wrong all the time.)

I think the next step should be to finalize a design doc (by amending the existing one or rewriting it – the existing one was around the basic idea of providing a timezone on the command line, which hit a lot of resistance, so that approach would just be one of the possible alternatives to be presented in the design doc).

Thanks to feature flags, we can certainly experiment with features where we are unsure if they will work or if they will be useful. However, in this case, we already know about concrete problems, and various ideas have been expressed how to address them. Those ideas need to be discussed rather than disregarded. Simply implementing a feature while "failing to see the problem" isn't helping.

@beorn7
Copy link
Member

beorn7 commented Jan 14, 2022

Naming nit: time_tz is a bad name as the argument is a location, not a timezone.

time.LoadLocation returns the (current) TZ for the provided location. The TZ for "Europe/Paris" is either "CEST" or "CET", depending on the time of the year. Or in other words: If you know the timezone, you already know if you are affected by so-called "DST" or not.

The name should be something like time_for_location.

@sylr
Copy link
Contributor Author

sylr commented Jan 14, 2022

According to my tests (on MacOS) time.LoadLocation() will load anything it finds in the zoneinfo dir of your OS so time_tz("CET") and time_tz("Europe/Paris") will both work.

@beorn7
Copy link
Member

beorn7 commented Jan 14, 2022

Yeah, if I understand correctly, because you provide a full time_tz() function and not a pure offset function, your approach doesn't actually have the DST switch problem as described in #8871 (comment).

I'm afraid the approach in this PR has precisely the problems I have described in #8871 (comment), just that it obfuscates things a bit more by returning something that looks like a UNIX time but it is shifted by a certain offset. I think #8871 is actually better than this PR because it makes explicit what it does. However, both approaches suffer from the problem that they calculate the offset for the evaluation time, but in many queries, you would like to know the offset for another point in time.

Quoting from the linked comment:

It is 2021-03-28, 01:30h UTC. hour(time()) (and hour()) returns 1. timezone_offset("Europe/Berlin") returns 7200 (2 hours offset), so hour(time() + timezone_offset("Europe/Berlin")) correctly returns 3. But what if I want to know the hour of a timestamp that is one hour ago? hour(time()-3600) correctly returns 0, but hour(time() - 3600 + timezone_offset("Europe/Berlin")) wrongly returns 2. At 00:30h UTC, so-called daylight saving time hasn't kicked in yet, and it was only 01:30h in Berlin.

Reworded for this PR, it would read: "It is 2021-03-28, 01:30h UTC. hour(time()) (and hour()) returns 1. hour(time_tz("Europe/Berlin")) correctly returns 3. But what if I want to know the hour of a timestamp that is one hour ago? hour(time()-3600) correctly returns 0, but hour(time_tz("Europe/Berlin") - 3600) wrongly returns 2. At 00:30h UTC, so-called daylight saving time hasn't kicked in yet, and it was only 01:30h in Berlin."

As already discussed in that comment, a more viable solution is to provide a location to functions like hour(). Still following the above example, hour(time(), "Europe/Berlin") and hour(time() - 3600, "Europe/Berlin") would return the correct values of 3 and 1, respectively. The concern raised in that comment was that this needs modifying a whole lot of functions, but maybe that's the least bad solution for now. Back then, we were also discussing other concerns, for which we have gathered ideas in the meantime, see comment above. So overall, this circles back to summarizing all the discussion so far in a design doc and come up with a proposal that takes the concerns into account (or rejects them in an informed way).

Of course, the best case would be if someone came up with a more elegant solution that could also answer questions like "how much time has passed since midnight (or since the beginning of the month)?", but I said that before. This might be too involved for now, but I would feel much better if someone came to that conclusion who actually understands the whole discussion and the problems raised.

@roidelapluie
Copy link
Member

roidelapluie commented Jan 14, 2022

We would need to have the embedded TZ for windows (via build commands) + have a dedicated flag (for all os'es) to pass a recent tzdata file. Somehow it would be great to expose the tz version as metric.

@beorn7
Copy link
Member

beorn7 commented Jan 14, 2022

About how to provide the TZ data: The design doc already contains an interesting discussion with a conclusion in the comments. Search for "Should we provide the embedded tzdata for windows users?" and also read the comments there.

Returning a fake unix timestamp offested by the diff of TZ - UTC
is not a good way to do things.


Signed-off-by: Sylvain Rabot <sylvain@abstraction.fr>
Signed-off-by: Sylvain Rabot <sylvain@abstraction.fr>
@sylr sylr marked this pull request as draft January 14, 2022 15:32
@sylr
Copy link
Contributor Author

sylr commented Jan 14, 2022

How would you propagate the feature flag value to github.com/prometheus/prometheus/promql/parser ?

@sylr sylr changed the title Add a time_tz(tz) function to promql Add timezones support to promql Jan 14, 2022
Signed-off-by: Sylvain Rabot <sylvain@abstraction.fr>
Signed-off-by: Sylvain Rabot <sylvain@abstraction.fr>
Signed-off-by: Sylvain Rabot <sylvain@abstraction.fr>
@roidelapluie
Copy link
Member

roidelapluie commented Jan 17, 2022

Naming nit: time_tz is a bad name as the argument is a location, not a timezone.

time.LoadLocation returns the (current) TZ for the provided location. The TZ for "Europe/Paris" is either "CEST" or "CET", depending on the time of the year. Or in other words: If you know the timezone, you already know if you are affected by so-called "DST" or not.

The name should be something like time_for_location.

I disagree with this. As per wikipedia, within the tz database, a time zone is any national region where local clocks have all agreed since 1970. In general, Timezone is way more clear than location and more user-friendly. https://en.wikipedia.org/wiki/Tz_database#File_formats

@sylr sylr marked this pull request as ready for review January 17, 2022 10:15
@roidelapluie
Copy link
Member

I think it would be cleaner to have the exception handled by the engine rather than by the parser.

@beorn7
Copy link
Member

beorn7 commented Jan 19, 2022

WRT https://en.wikipedia.org/wiki/Tz_database#File_formats : Indeed, the TZ database uses locations to name timezones. However, https://pkg.go.dev/time has a usage that corresponds to what I had said. Therefore, "tz" or "timezone" is ambiguous outside of the immediate context of the TZ database. "location" is not ambiguous and I would prefer if we used that.

Signed-off-by: Sylvain Rabot <sylvain@abstraction.fr>
@roidelapluie
Copy link
Member

It seems that this is still using parser, while other experimental features like @ fail at the engine side. Additionally, engine already has a mechanism for feature enabling (see EngineOpts).

@sylr
Copy link
Contributor Author

sylr commented Jan 19, 2022

You want to overwrite the Parser's Functions from the engine ?

@roidelapluie
Copy link
Member

I think we should maybe have a second FunctionCalls in the engine that can be used to process this. The initial Functioncalls would fail if there is at tz passed.

@beorn7
Copy link
Member

beorn7 commented Jan 21, 2022

Thanks for all the work you are investing here. However, I'm concerned that we go deep into details here without having an agreed-upon design doc. We are discussing general aspects mixed up with implementation details here, which makes it hard to follow for a wider audience. Plus, a PR is just not the right form to make important design decisions.

Ideas seem to take shape here, so I guess it won't be that hard to write them down in a design doc, with their reasoning, so that the can be presented to the developer community, discussed in a more visible and more focused way, get buy-in from the community as a whole, and ideally get final agreement at a dev summit.

All of this reminds me that we should really better document our proposal process, cf. #8986

@valyala
Copy link

valyala commented Feb 4, 2023

How about much simpler solution - to add timezone_offset function, which would return the offset in seconds for the specified timezone. For example, timezone_offset("America/Los_Angeles") would return -28800, e.g. -8h. Then this function can be freely combined with other time-related functions. For example, floor((time() + timezone_offset("America/Los_Angeles")) / 3600) % 24 would return the hour at America/Los_Angeles timezone.

An additional benefit for having a separate timezone_offset() function is that it can be combined with time series values, which store timestamps. For example, prometheus_config_last_reload_success_timestamp_seconds + timezone_offset("America/los_Angeles") would return the last successful config reload time in seconds in America/Los_Angeles time zone.

This feature is already implemented in VictoriaMetrics, so you can play with it there.

Additionally, it would be great if Prometheus would support time-related constants anywhere in the query. For example, 8h would be automatically converted to 8*3600. This will simplify time-related calculations. For example, the following query would return the average rate for samples' scraping over the last hour:

sum(sum_over_time(scrape_samples_scraped[1h])) / 1h

Or, if you know needed timezone offset, you could write something like the following time() + 5h30m for Indian standard time. This feature is also supported by VictoriaMetrics, so you can play with it there.

@roidelapluie
Copy link
Member

The solution of using the timezone_offset function to calculate time in a specific time zone may not be suitable for all use cases, as it may not be compatible with metrics collected prior to Daylight Saving Time (DST) changes. This is because the timezone_offset function provides the offset for the current query time and not the metric time. As a result, if a metric was collected prior to a DST change, the calculation of time in a specific time zone using timezone_offset would be incorrect.

@juliusv
Copy link
Member

juliusv commented Feb 4, 2023

Aside: Issues with timezone_offset() have already been discussed above and in other places by @beorn7, such as #10163 (comment) and #8871 (comment).

@beorn7
Copy link
Member

beorn7 commented Feb 7, 2023

Thanks, @juliusv, for linking my relevant ramblings. :)

WRT making all durations numbers and vice versa, there is already #9138 . It's almost done, IMHO, just somebody has to shepherd it towards conclusion.

@bboreham
Copy link
Member

As said at #10163 (comment)

I think the next step should be to finalize a design doc (by amending the existing one or rewriting it

Given there has been a lot of debate and no progress, we decided at the bug-scrub meeting to close this PR until a design is agreed.

@bboreham bboreham closed this Jan 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants