[ENH]: Automatically trim the time on the x axis #28158

RomainPastureau · 2024-05-01T14:10:14Z

Problem

I would like to have the x-axis showing the timestamps of the samples of audio files. Here is a minimal example:

import numpy as np
from matplotlib import pyplot as plt
from matplotlib import dates as mdates
from scipy.io import wavfile

# Open the WAV files
audio_1 = wavfile.read("audio1.wav")
freq_audio_1 = audio_1[0]
samples_1 = audio_1[1][:, 0]  # Turn to mono

audio_2 = wavfile.read("audio2.wav")
freq_audio_2 = audio_2[0]
samples_2 = audio_2[1][:, 0]  # Turn to mono

# Create the timestamps
t_audio_1 = np.arange(0, len(samples_1)) / freq_audio_1
t_audio_2 = np.arange(0, len(samples_2)) / freq_audio_2

# We turn them into datetime
t_audio_1 = np.array(t_audio_1*1000, dtype="datetime64[ms]")
t_audio_2 = np.array(t_audio_2*1000, dtype="datetime64[ms]")

# Create the figure
fig, ax = plt.subplots(1, 2, constrained_layout=True)

# If the audio files are more than 1 hour, we format as HH:MM:SS, else just MM:SS
if len(samples_1) / freq_audio_1 >= 3600 and len(samples_2) / freq_audio_2 >= 3600:
    formatter = mdates.AutoDateFormatter(mdates.AutoDateLocator(), defaultfmt='%H:%M:%S')
else:
    formatter = mdates.AutoDateFormatter(mdates.AutoDateLocator(), defaultfmt='%M:%S')

ax[0].xaxis.set_major_formatter(formatter)
ax[1].xaxis.set_major_formatter(formatter)

ax[0].plot(t_audio_1, samples_1)
ax[1].plot(t_audio_2, samples_2)
plt.show()

Here is the output:

As you can see, the microsecond precision makes the tick labels on the x axis overlap each other.

Proposed solution

Ideally, the plot would automatically decide how many significant digits after the decimal point are necessary, depending on the zoom level (similar to the way Audacity displays timestamps).

Thank you :)


story645 · 2024-05-01T18:11:48Z

As a first pass, this could probably be implemented using FuncFormatter or zoom events, and might make a good example of dynamic label updating on zoom?

https://matplotlib.org/3.8.4/gallery/ticks/custom_ticker1.html#sphx-glr-gallery-ticks-custom-ticker1-py
+
https://matplotlib.org/stable/gallery/event_handling/zoom_window.html

Basically wondering on trade offs if a good example of how to do this would be more useful than a library function that may need a bunch of parameters to get folks what they want.

WeatherGod · 2024-05-01T18:47:04Z

I wonder if the offset label could also help? Like, if all of the tick labels have the same date portion, then stick that in the "offset" label, and put the times in the tick labels, like how we'd do for tick labels of very large numbers.


timhoffm · 2024-05-01T23:09:00Z

@WeatherGod That’s basically what ConciseDateFormatter is doing https://matplotlib.org/stable/gallery/ticks/date_concise_formatter.html.

RomainPastureau · 2024-05-02T08:38:41Z

The problem with the ConciseDateFormatter is that it will show a date on top of the time. For timestamps, I don't really want Jan 1, 1970 to be there.

To elaborate on the issue, there are actually two issues in one: make sure that the tick labels don't appear on top of each other, and reduce the tick labels to their most significant digit.

story645 · 2024-05-02T15:29:22Z

To elaborate on the issue, there are actually two issues in one: make sure that the tick labels don't appear on top of each other, and reduce the tick labels to their most significant digit.

Is rotating the tick labels an option? Otherwise it'd be a locator that chooses ticks based on label width, which means it would have to be formatter/label dependent, which I'm not sure is technically possible.
Or something like a set_powerlimits for dates?

timhoffm · 2024-05-02T20:24:53Z

Otherwise it'd be a locator that chooses ticks based on label width, which means it would have to be formatter/label dependent, which I'm not sure is technically possible

AFAIK this is not possible. And it would be tricky: Formatters need to know all label positions and from that determine the number of significant digits. If Locators on the other hand, want to decide placement on the label size, this would result in a mutual dependence loop.
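The first half of that point, a formatter that looks at every tick position before choosing a precision, can be sketched on its own. The class below is a hypothetical helper (the name `ElapsedTimeFormatter` is not a Matplotlib API) and assumes the x axis holds plain float seconds rather than dates. It relies only on `ticker.Formatter.format_ticks`, which passes the full list of positions to the formatter through `set_locs`.

```python
from matplotlib import ticker


class ElapsedTimeFormatter(ticker.Formatter):
    """Hypothetical formatter: float seconds -> [-][HH:]MM:SS[.ffffff]."""

    max_decimals = 6  # never show more than microseconds

    def _decimals_needed(self):
        # Smallest number of decimals that represents every tick position
        # to within half a microsecond.
        tol = 0.5 * 10 ** -self.max_decimals
        for d in range(self.max_decimals):
            if all(abs(round(float(v), d) - float(v)) < tol for v in self.locs):
                return d
        return self.max_decimals

    def __call__(self, x, pos=None):
        decimals = self._decimals_needed()
        show_hours = any(abs(v) >= 3600 for v in self.locs)

        # Work in integer units of 10**-decimals seconds to avoid float noise.
        scale = 10 ** decimals
        units = int(round(abs(float(x)) * scale))
        whole, frac = divmod(units, scale)
        minutes, seconds = divmod(whole, 60)

        if show_hours:
            hours, minutes = divmod(minutes, 60)
            label = f"{hours:02d}:{minutes:02d}:{seconds:02d}"
        else:
            label = f"{minutes:02d}:{seconds:02d}"
        if decimals:
            label += f".{frac:0{decimals}d}"
        sign = "-" if x < 0 and units else ""
        return sign + label


fmt = ElapsedTimeFormatter()
print(fmt.format_ticks([0.0, 0.005, 0.010]))
print(fmt.format_ticks([0, 30, 60, 90]))
```

It would be attached with `ax.xaxis.set_major_formatter(ElapsedTimeFormatter())`. The sketch covers only the formatter side; it does nothing about the reverse dependency, where the locator would have to react to label width.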

story645 · 2024-05-03T00:12:51Z

Is this time formatter from librosa kinda what you're after: https://librosa.org/doc/main/generated/librosa.display.TimeFormatter.html

rcomer · 2024-05-03T09:29:22Z

I am confused by the example. Standard python datetimes have microseconds so I tried this:

import datetime
import matplotlib.pyplot as plt

x = datetime.datetime(2021, 5, 4)
dates = [x.replace(minute=n, microsecond=n) for n in range(50)]

ax = plt.figure().add_subplot()
ax.plot(dates, range(50))

If I use the date formatter as in the OP (less than 1 hour case) I only get years:

import matplotlib.dates as mdates

formatter = mdates.AutoDateFormatter(mdates.AutoDateLocator(), defaultfmt='%M:%S')
ax.xaxis.set_major_formatter(formatter)

If instead I define the formatter using the locator instance on the axis, it goes back to what you get by default:

formatter = mdates.AutoDateFormatter(ax.xaxis.get_major_locator(), defaultfmt='%M:%S')
ax.xaxis.set_major_formatter(formatter)

What am I missing? I admit I have not really followed what the defaultfmt keyword does.

I am using mpl 3.8.2.

RomainPastureau · 2024-05-03T12:41:19Z

@story645 Rotating the ticks is not always an option, unfortunately. I would need this for two different projects, and in one of these I am plotting 8 subplots (2 horizontally, 4 vertically) in the same graph, so I need the x axis to be as compact as possible. I will look into your librosa function, though, thanks for that!

@rcomer To be fair, I am also confused between formatters and locators, and the code I have provided is mostly the result of tinkering with various solutions I found online. In any case, I did obtain the different results you are showing in your figures at various points, but none of them were what I was looking for... So I am a bit lost.

In any case, thanks to all of you for your interest in this question. After looking around and asking the question on other forums, I gather that this is a feature other people would be interested in!

jklymak · 2024-05-03T13:46:30Z

Would you consider making a self contained example that makes the problem clear? We can't reproduce your issue if we can't run your code.

story645 · 2024-05-03T15:56:03Z

To be fair I am also confused between formatters and locator

Can you elaborate on this a bit so that we can try and make the documentation clearer?

Broadly:

locators control tick position
formatters control tick label

These generally function independently, but sequentially - so locator generates positions then formatter labels those positions. That's why Tim and I don't think a locator that adjusts tick positions based on the formatter generated label would be feasible, though I'm now curious about the Librosa implementation.
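The two-step sequence can be reproduced outside a figure. The sketch below uses a `MultipleLocator` and a `FuncFormatter` purely for illustration; the 0.5 step, the 0 to 2 view interval and the "s" suffix are arbitrary choices, not anything taken from the issue.

```python
from matplotlib import ticker

# Step 1: the locator proposes tick positions for a view interval.
# (tick_values may return positions slightly outside the interval,
# so they are filtered here.)
locator = ticker.MultipleLocator(0.5)
positions = [float(t) for t in locator.tick_values(0.0, 2.0) if 0.0 <= t <= 2.0]

# Step 2: the formatter turns the positions it was given into strings.
formatter = ticker.FuncFormatter(lambda x, pos: f"{x:.1f} s")
labels = formatter.format_ticks(positions)

for position, label in zip(positions, labels):
    print(position, "->", label)
```

The formatter never feeds anything back to the locator, which is why label width cannot influence tick placement in this design.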

story645 · 2024-05-03T16:24:05Z

Also @ksunden dynamic relabeling/scale adjustment based on subsample resolution of unitized data (which thinking more, Librosa knows it has dates) might be a good example for the data-prototype (if you don't already have one 😅).

rcomer · 2024-05-03T20:02:55Z

I think I have understood now. It is not about the precision of the data, but about whether the interval between the ticks is less than a second:

import datetime
import matplotlib.pyplot as plt

x = datetime.datetime(2021, 5, 4)
dates = [x.replace(second=n) for n in range(3)]

ax = plt.figure().add_subplot()
ax.plot(dates, range(3))

The behaviour is defined by the rcParam

matplotlib/lib/matplotlib/mpl-data/matplotlibrc

Line 461 in e253aa2

#date.autoformatter.microsecond: %M:%S.%f
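As a small sketch of how that rcParam reaches the formatter: as far as I can tell, `AutoDateFormatter` copies the `date.autoformatter.*` rcParams into its `scaled` dictionary when it is constructed, so overriding the value inside `rc_context` changes the sub-second format for formatters created there. The `%H:%M:%S.%f` value is just an illustrative choice.

```python
import matplotlib as mpl
import matplotlib.dates as mdates

# Override the sub-second format only while the formatter is being built.
with mpl.rc_context({"date.autoformatter.microsecond": "%H:%M:%S.%f"}):
    formatter = mdates.AutoDateFormatter(mdates.AutoDateLocator())

# The entry keyed by "one microsecond, expressed in days" holds that format.
sub_second_format = formatter.scaled[1 / mdates.MUSECONDS_PER_DAY]
print(sub_second_format)
```

A strftime string like this always prints all six digits of `%f`, so it can change which fields appear but cannot trim trailing zeros.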

rcomer · 2024-05-03T20:22:30Z

And actually the AutoDateFormatter docstring tells us we can modify that with a function:

import datetime

import matplotlib.pyplot as plt
import matplotlib.dates as mdates

x = datetime.datetime(2021, 5, 4)
dates = [x.replace(second=n) for n in range(3)]

ax = plt.figure().add_subplot()
ax.plot(dates, range(3))

def my_format_function(x, pos=None):
    x = mdates.num2date(x)
    fmt = '%M:%S.%f'
    label = x.strftime(fmt)
    label = label.rstrip("0")
    label = label.rstrip(".")
    return label

formatter = mdates.AutoDateFormatter(ax.xaxis.get_major_locator())
formatter.scaled[1 / mdates.MUSECONDS_PER_DAY] = my_format_function
ax.xaxis.set_major_formatter(formatter)
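To see what such a function returns without opening a figure, it can be called directly on Matplotlib date numbers. This is only a quick check under the default epoch (1970-01-01); `trim_label` is a hypothetical name for a condensed version of the function above, and the millisecond offsets are arbitrary.

```python
import datetime

import matplotlib.dates as mdates


def trim_label(x, pos=None):
    """MM:SS.ffffff with trailing zeros (and a dangling '.') removed."""
    label = mdates.num2date(x).strftime("%M:%S.%f")
    return label.rstrip("0").rstrip(".")


start = datetime.datetime(1970, 1, 1)
offsets_ms = [0, 500, 1250, 2000]

# The formatter receives floats (days since the epoch), not datetime objects.
numbers = [mdates.date2num(start + datetime.timedelta(milliseconds=ms))
           for ms in offsets_ms]
labels = [trim_label(n) for n in numbers]
print(labels)
```

Because the decimal point stops the first `rstrip`, whole seconds such as `00:10` keep their final zero.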

RomainPastureau · 2024-05-06T14:42:15Z

Hi!

Thank you for all of your responses. I realize now that my question wasn't clear at the beginning, so I will provide a full and simplified description of my problem, as suggested by @jklymak

Clear description of the issues

When plotting a time series, I would like to get the timestamps on the x-axis.
The timestamps should always show at least the minutes and seconds (MM:SS).
If the time series is longer than an hour, the timestamps should have the format HH:MM:SS.
Decimals of seconds should appear if their digits are significant at the current zoom level on the figure. For example, if the ticks on the x-axis are 5 ms apart, three decimals should appear, not more.
In a similar fashion to what appears for "regular" floats on the x-axis, if I zoom on the generated figure, I would like to see the tick labels only up to the last significant digit (by default, 6 decimals always appear for time series).
The tick labels should not appear on top of each other, for readability.

First example: no formatting, numpy datetime64[us]

This is the minimal reproducible example for my use case. I am importing a WAV file, creating timestamps from its frequency, and plotting it, passing the timestamps on the x-axis.

Code

import os.path as op
import scipy.io as sio
import numpy as np
import matplotlib.pyplot as plt

# Get the WAV example file from Scipy
data_dir = op.join(op.dirname(sio.__file__), 'tests', 'data')
wav_example_file = op.join(data_dir, 'test-44100Hz-2ch-32bit-float-be.wav')

# Load the audio file
audio = sio.wavfile.read(wav_example_file)
freq_audio = audio[0]
samples = audio[1][:, 0]  # Take only the left channel

# Create the timestamps
t_audio = np.arange(0, len(samples)) / freq_audio

# Turn them into datetime
t_audio = np.array(t_audio*1000000, dtype="datetime64[us]")

# Create the figure
plt.plot(t_audio, samples)
plt.show()

Output

Remarks

As you can see, when plotted, the timestamps are on top of each other - plus, here, we do not really care about the precision after the 3rd decimal place as the ticks are spaced by 2 ms each. Ideally, the significant digits would increase when I zoom in dynamically using the mouse; however, the precision remains at 6 digits after the decimal point, no matter what.

Note that the output is exactly the same if instead of:

t_audio = np.array(t_audio*1000000, dtype="datetime64[us]")

I use Python datetime objects via:

t_audio = [datetime.datetime(1970, 1, 1, int(t // 3600) % 24, int((t // 60) % 60), int((t % 60) // 1), int((t % 1) * 1000000)) for t in t_audio]

Using Python datetime objects makes the computation much slower for longer audio files, though.
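For reference, the NumPy route can be written with an explicit rounding step, which avoids depending on how floats are cast to `datetime64`. This is a minimal sketch; the 44100 Hz rate and the five samples are assumptions for illustration only.

```python
import numpy as np

sample_rate = 44100                    # Hz, assumed for illustration
seconds = np.arange(5) / sample_rate   # float seconds of the first samples

# Round to whole microseconds, then reinterpret the integers as datetime64[us].
micros = np.round(seconds * 1_000_000).astype("int64")
timestamps = micros.astype("datetime64[us]")
print(timestamps)
```

The whole conversion stays vectorised, so its cost grows with the array size but without a Python-level loop over samples.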

Second example: no formatting, numpy timedelta64[us]

This time I am using a numpy timedelta64 object for the x-axis:

Code

import os.path as op
import scipy.io as sio
import numpy as np
import matplotlib.pyplot as plt

# Get the WAV example file from scipy
data_dir = op.join(op.dirname(sio.__file__), 'tests', 'data')
wav_example_file = op.join(data_dir, 'test-44100Hz-2ch-32bit-float-be.wav')

# Load the audio file
audio = sio.wavfile.read(wav_example_file)
freq_audio = audio[0]
samples = audio[1][:, 0]  # Take only the left channel

# Create the timestamps
t_audio = np.arange(0, len(samples)) / freq_audio

# Turn them into timedelta
t_audio = np.array(t_audio*1000000, dtype="timedelta64[us]")

# Create the figure
plt.plot(t_audio, samples)
plt.show()

Output

Remarks

This time, the time format is ignored and the x-axis only shows the microseconds. Once again, the output is the same if I use Python timedelta objects.
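One possible workaround, sketched here under the assumption that plain float seconds are acceptable on the axis, is to divide the `timedelta64` array by a one-second timedelta and label the result with a `FuncFormatter`. The data and the `mm_ss` helper are hypothetical, and the labels are limited to millisecond precision.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
from matplotlib import ticker

# Placeholder data: 3 seconds sampled every 250 ms.
durations = np.arange(0, 3_000_000, 250_000, dtype="int64").astype("timedelta64[us]")
values = np.sin(np.linspace(0, 6, durations.size))

# Dividing by a one-second timedelta yields plain float seconds.
seconds = durations / np.timedelta64(1, "s")


def mm_ss(x, pos=None):
    """Label float seconds as [-]MM:SS[.fff], trimming trailing zeros."""
    sign = "-" if x < 0 else ""
    minutes, secs = divmod(round(abs(x), 3), 60)
    label = f"{int(minutes):02d}:{secs:06.3f}"
    return sign + label.rstrip("0").rstrip(".")


fig, ax = plt.subplots()
ax.plot(seconds, values)
ax.xaxis.set_major_formatter(ticker.FuncFormatter(mm_ss))
```

Since the axis is numeric, the default locator picks round tick steps and negative times keep a real minus sign instead of wrapping around to the previous hour.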

Third example: using AutoDateFormatter

In order to get the format I want, I am now trying to use Matplotlib formatters. Here is the result using AutoDateFormatter:

Code

import os.path as op
import scipy.io as sio
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import dates as mdates

# Get the WAV example file from scipy
data_dir = op.join(op.dirname(sio.__file__), 'tests', 'data')
wav_example_file = op.join(data_dir, 'test-44100Hz-2ch-32bit-float-be.wav')

# Load the audio file
audio = sio.wavfile.read(wav_example_file)
freq_audio = audio[0]
samples = audio[1][:, 0]  # Take only the left channel

# Create the timestamps
t_audio = np.arange(0, len(samples)) / freq_audio

# Turn them into datetime
t_audio = np.array(t_audio*1000000, dtype="datetime64[us]")

# Create the figure
fig = plt.figure()
plt.plot(t_audio, samples)

# Use a formatter
formatter = mdates.AutoDateFormatter(mdates.AutoDateLocator())
fig.axes[0].xaxis.set_major_formatter(formatter)

# Plot the figure
plt.show()

Output

Remarks

Obviously, here, that's not the output we want, so let's try something else. Adding a default format (formatter = mdates.AutoDateFormatter(mdates.AutoDateLocator(), defaultfmt="%H:%M:%S")) doesn't change anything.

Fourth example: using a function (@rcomer solution)

import os.path as op
import scipy.io as sio
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import dates as mdates

def my_format_function(x, pos=None):
    x = mdates.num2date(x)
    fmt = '%M:%S.%f'
    label = x.strftime(fmt)
    label = label.rstrip("0")
    label = label.rstrip(".")
    return label

# Get the WAV example file from scipy
data_dir = op.join(op.dirname(sio.__file__), 'tests', 'data')
wav_example_file = op.join(data_dir, 'test-44100Hz-2ch-32bit-float-be.wav')

# Load the audio file
audio = sio.wavfile.read(wav_example_file)
freq_audio = audio[0]
samples = audio[1][:, 0]  # Take only the left channel

# Create the timestamps
t_audio = np.arange(0, len(samples)) / freq_audio

# Turn them into datetime
t_audio = np.array(t_audio*1000000, dtype="datetime64[us]")

# Create the figure
ax = plt.figure().add_subplot()
plt.plot(t_audio, samples)

# Use a formatter
formatter = mdates.AutoDateFormatter(ax.xaxis.get_major_locator())
formatter.scaled[1 / mdates.MUSECONDS_PER_DAY] = my_format_function
ax.xaxis.set_major_formatter(formatter)

# Plot the figure
plt.show()

Output

Remarks

Now it works! I can adapt it to show the hours depending on the length of the timestamps. The only caveat, it seems, is that the time before 00:00 is shown as 59:59 (while ideally, I would prefer -00:01), which makes sense as we are working with datetime and not timedelta.

Conclusion

So, I guess the problem for my specific use case is solved (thanks again @rcomer!). That being said, I do think that other users may have an interest in this specifically. The Matplotlib documentation focuses a lot on custom tick formats for dates, taking into account month lengths, business days, etc., which is incredibly useful when working with dates. But when working with timestamps, the solution wasn't straightforward (or maybe I missed something). If I am not the only one having that issue, maybe it would be an interesting feature to implement? It could be a new type of formatter that would take a default time format and respond dynamically to the zoom level.

Thank you @story645 for your description of Formatters and Locators, I understand it better now. Speaking of, when selecting %H:%M:%S.%f formatting instead of %M:%S.%f, sometimes the labels are shown on top of their neighbors. I believe a function that detects the maximum length of a label given the format and calculates the amount of ticks accordingly may solve my issue here.

timhoffm · 2024-05-06T15:33:45Z

Glad to see @rcomer’s suggestion solves your problem. Would an example be helpful enough for other users?

I believe a function that detects the maximum length of a label given the format and calculates the amount of ticks accordingly may solve my issue here.

As stated above, this is unfortunately not trivial. The locator first decides on the positions (and number) of ticks. Then the formatter decides how to represent them; it can only reasonably decide on the formatting once it knows all the ticks to be plotted.
You additionally want the reverse: deciding on the positions given the format. This mutual interaction is difficult to realize with the current architecture of separate locators and formatters. At best, you could have a loop that checks for overlap (overlap checking is itself somewhat involved because it depends on drawing characteristics like figure size and font size) and forces the locator to use fewer positions if an overlap is detected. In general, there will not even be a solution: you can always increase the font size or reduce the figure size enough to force an overlap of just two tick labels.
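A rough sketch of such a loop follows. It is not a Matplotlib feature; `labels_overlap` and `thin_ticks` are hypothetical helpers, the deliberately long label format is made up, and the result is only valid for the figure size and font size in effect when it runs.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen rendering
import matplotlib.pyplot as plt
from matplotlib import ticker


def labels_overlap(ax):
    """Return True if two neighbouring x tick labels overlap when drawn."""
    ax.figure.canvas.draw()  # label extents are only known after a draw
    boxes = [lab.get_window_extent() for lab in ax.get_xticklabels()
             if lab.get_text()]
    boxes.sort(key=lambda box: box.x0)
    return any(left.x1 > right.x0 for left, right in zip(boxes, boxes[1:]))


def thin_ticks(ax, start_bins=12):
    """Lower the tick budget until the labels fit; None if nothing fits."""
    for nbins in range(start_bins, 0, -1):
        ax.xaxis.set_major_locator(ticker.MaxNLocator(nbins=nbins))
        if not labels_overlap(ax):
            return nbins
    return None


fig, ax = plt.subplots(figsize=(4, 3))
ax.plot([0, 0.01], [0, 1])
ax.xaxis.set_major_formatter(
    ticker.FuncFormatter(lambda x, pos: f"00:00:{x:09.6f}"))
chosen = thin_ticks(ax)
print("tick budget that fits:", chosen)
```

Each candidate costs a full redraw, and the `None` branch reflects the case described above where no tick count avoids the overlap. The check would also have to be repeated after every resize or zoom.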

jklymak · 2024-05-07T00:59:52Z

@RomainPastureau we do not have access to your wavs, and we do not know what freq_audio is. Can you make these more reproducible?

jklymak · 2024-05-07T03:26:25Z

Irreproducibility aside, the following seems to do what you want:

fig, ax = plt.subplots()
ax.plot(t_audio, signal)
locator = mdates.AutoDateLocator()
formatter = mdates.ConciseDateFormatter(locator, show_offset=False)
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(formatter)
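A self-contained variant of the same idea, using synthetic data in place of the audio file (the 1 kHz rate, two-second duration and sine wave are assumptions made only so the snippet runs on its own):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Two seconds of synthetic "audio" at 1 kHz, with timestamps in microseconds.
n = np.arange(2000, dtype="int64")
t_audio = (n * 1000).astype("datetime64[us]")
signal = np.sin(2 * np.pi * 5 * n / 1000)

fig, ax = plt.subplots()
ax.plot(t_audio, signal)
locator = mdates.AutoDateLocator()
formatter = mdates.ConciseDateFormatter(locator, show_offset=False)
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(formatter)

fig.canvas.draw()  # tick labels are only populated after a draw
labels = [lab.get_text() for lab in ax.get_xticklabels()]
print(labels)
print(repr(formatter.get_offset()))
```

With `show_offset=False` the offset string stays empty, so the date is not written next to the axis. The exact tick labels depend on the Matplotlib version and on the view limits.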

RomainPastureau · 2024-05-07T08:12:49Z

@jklymak I put reproducible examples in this comment, using a WAV file from scipy! Sorry about my confusing first example: it was the end of the day when I wrote that first message and I hadn't thought it through 😅
I will try your example code, thank you, though I am afraid that the ConciseDateFormatter will also show a date, which I don't really want.

RomainPastureau · 2024-05-09T12:31:15Z

Replying to say that I tweaked @rcomer's solution. It seems to work, and I implemented it in a package I developed. This is the result on one of my example outputs:

Here is a snippet of the code I used:

def get_label(value, include_hour=True, include_us=True):
    """Returns a label value depending on the selected parameters."""

    neg = False
    # If negative, put positive
    if value < 0:
        neg = True
        value = abs(value)

    # If zero, set zero
    elif value == 0:
        if include_hour:
            return "00:00:00"
        else:
            return "00:00"

    # Turn to timedelta
    td_value = mdates.num2timedelta(value)

    seconds = td_value.total_seconds()
    hh = str(int(seconds // 3600)).zfill(2)
    mm = str(int((seconds // 60) % 60)).zfill(2)
    ss = str(int(seconds % 60)).zfill(2)

    # Zero-pad to six digits before trimming, so that 0.05 s gives ".05" and not ".5"
    us = str(td_value.microseconds).zfill(6).rstrip("0")

    label = ""
    if neg:
        label += "-"
    if include_hour:
        label += hh + ":"
    label += mm + ":" + ss
    if include_us and us != "":
        label += "." + us

    return label

def get_label_hh_mm_ss_no_ms(value, pos=None):
    """Returns a label value as HH:MM:SS, without any ms value."""
    return get_label(value, True, False)

def get_label_hh_mm_ss(value, pos=None):
    """Returns a label value as HH:MM:SS.ms, without any trailing zero."""
    return get_label(value, True, True)

def set_label_time_figure(ax):
    """Sets the time formatted labels on the x axes."""
    if x_format_figure == "time":
        formatter = mdates.AutoDateFormatter(ax.xaxis.get_major_locator())
        formatter.scaled[1 / mdates.MUSECONDS_PER_DAY] = get_label_hh_mm_ss
        formatter.scaled[1 / mdates.SEC_PER_DAY] = get_label_hh_mm_ss
        formatter.scaled[1 / mdates.MINUTES_PER_DAY] = get_label_hh_mm_ss_no_ms
        formatter.scaled[1 / mdates.HOURS_PER_DAY] = get_label_hh_mm_ss_no_ms
        formatter.scaled[1] = get_label_hh_mm_ss_no_ms
        formatter.scaled[mdates.DAYS_PER_MONTH] = get_label_hh_mm_ss_no_ms
        formatter.scaled[mdates.DAYS_PER_YEAR] = get_label_hh_mm_ss_no_ms
        ax.xaxis.set_major_formatter(formatter)
        return ax

    return ax

i = 0

It is probably very naive code, but it results in exactly what I needed. I just have to call ax = set_label_time_figure(ax) after each plot/subplot and it works, even if I zoom in or out. I also used timedelta objects instead of datetime objects.

I still think a proper, built-in formatter may be beneficial for other people working on time series - but at least, now, I see that it is possible.

Thank you all for your help!

RomainPastureau added the New feature label May 1, 2024

story645 added topic: date handling topic: ticks axis labels labels May 2, 2024

jklymak added the status: duplicate label May 7, 2024
