Request for rolling_top_k #16266

leoknuth · 2024-05-16T09:05:21Z

Request for a rolling_top_k expression.

Currently, I can only do this with rolling_map and a lambda, which is very slow.

data = data.with_columns(
    roll_top_k_mean = pl.col("a").rolling_map(lambda x: x.top_k(5).mean(), window_size=10)
)

What I would like to be able to write is something like:

data = data.with_columns(
    roll_top_k_mean = pl.col("a").rolling_top_k(k=5, w=10).mean()
)

This could be implemented efficiently with two heaps and one queue.
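To make the intended semantics concrete, here is a minimal pure-Python sketch of a rolling top-k mean. It is not the two-heap scheme itself: it swaps in a simpler structure, a sorted list maintained with `bisect` plus a queue for eviction, so each step costs O(w) rather than O(log w). The function name `rolling_top_k_mean` and its parameters are hypothetical, and returning `None` until the window is full is an assumption of this sketch.

```python
from bisect import bisect_left, insort
from collections import deque


def rolling_top_k_mean(values, k, window_size):
    """Mean of the k largest values in each full trailing window.

    Hypothetical helper for illustration only; positions before the
    first full window yield None (an assumption of this sketch).
    """
    queue = deque()  # window contents in arrival order, used for eviction
    ordered = []     # the same contents, kept sorted ascending
    out = []
    for value in values:
        queue.append(value)
        insort(ordered, value)
        if len(queue) > window_size:
            # Evict the oldest element from both structures.
            oldest = queue.popleft()
            del ordered[bisect_left(ordered, oldest)]
        if len(queue) < window_size:
            out.append(None)
        else:
            top = ordered[-k:]  # the k largest values in the window
            out.append(sum(top) / len(top))
    return out
```

For example, `rolling_top_k_mean([1, 5, 2, 8, 3], k=2, window_size=3)` returns `[None, None, 3.5, 6.5, 5.5]`. A native implementation would presumably replace the sorted list with the heaps mentioned above to avoid the linear-time insert and delete.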


cmdlineluser · 2024-05-16T09:16:22Z

As an aside, I think the slow rolling_map can be replaced with Expr.rolling - right?

(data
  .with_row_index()
  .with_columns(
      # period="10i" matches the window_size=10 used in the rolling_map version
      pl.col("a").top_k(5).mean().rolling(index_column="index", period="10i")
  )
)

leoknuth added the enhancement label (New feature or an improvement of an existing feature) on May 16, 2024
