Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recent update to Shiny 0.9.0 broke DataTable input #1345

Closed
pieterjanvc opened this issue Apr 30, 2024 · 4 comments · Fixed by #1377 · May be fixed by #1374
Closed

Recent update to Shiny 0.9.0 broke DataTable input #1345

pieterjanvc opened this issue Apr 30, 2024 · 4 comments · Fixed by #1377 · May be fixed by #1374
Labels
bug Something isn't working data frame Related to @render.data_frame
Milestone

Comments

@pieterjanvc
Copy link

pieterjanvc commented Apr 30, 2024

Hello,

The recent update to Shiny 0.9.0 broke the DataTable input, which was one of the updates in this new version.
At this point input.<tableID>_selected_rows() stopped working and is no longer updating any reactive function.
You can check your own demo example provided in the documentation to see it is broken.

I hope this can be patched soon :) Downgrading to version 0.8.1 reverts the issue.

Thanks!

@pieterjanvc pieterjanvc changed the title Recent update broke DataTable input Recent update to Shiny 0.9.0 broke DataTable input Apr 30, 2024
@gadenbuie
Copy link
Collaborator

gadenbuie commented Apr 30, 2024

Shiny 0.9.0 introduced many new DataTable features, which includes new syntax for retrieving the selected rows. The new syntax is much more flexible in the sense that it both allows finer grained control over what types of selections are allowed and makes it possible to retrieve the selected rows, columns or cells on the server side.

The best example currently lives in the API reference: https://shiny.posit.co/py/api/core/render.DataGrid.html. We definitely need to update the examples in the components gallery to match the new syntax.

Here's an updated version of the demo example you linked to:

from palmerpenguins import load_penguins
from shiny import render
from shiny.express import input, ui

penguins = load_penguins()

ui.h2("Palmer Penguins")


@render.ui
def rows():
    rows = penguins_df.data_view(selected=True)  # <2>
    selected = ", ".join(str(i) for i in sorted(rows.index)) if not rows.empty else "None"  # <3>
    return f"Rows selected: {selected}"


@render.data_frame
def penguins_df():
    return render.DataTable(penguins, selection_mode="rows")  # <1>

Notice that:

  1. In render.DataTable() we now use selection_mode to specify what selections types are allowed.
  2. Rather than using input.{data_frame_id}_selected_rows(), we now let you refer to the output directly, with a much richer API for interacting with the data table. Above, I used penguins_df.data_view(selected=True) to get the selected rows from the current view (which may be filtered).
  3. The return value from the .data_view() method is a complete table, which is generally more useful than the indices alone. That said, you can still use rows.index to get the indices, which were used in the example to just list which rows were selected.

@pstorozenko
Copy link

I'd like to throw my bit here.

  1. I definitely prefer .data_view(...) accessor than some random *_selected_rows() :D Thanks for working on this!
  2. However, with _selected_rows() it was possible to get the indices of selected rows easily. Now I have to rely on pandas index for this. Why it's important to have such feature? Sometimes the displayed dataframe shows just a subset of columns of some dataframe in memory. Or rows in the dataframe correspond to some other objects (in my particular case each row represents an image, but you get the abstraction). It's not a problem for now, but if one day we shiny switches to some dataframe representation without creature like pd.Index it may be.
  3. ☝🏻 The breaking changes section of 0.9.0 release mentions only selection_mode (which is actually deprecation, not breaking), but says nothing about _selected_rows() / .data_view() which made it much harder for me to trace the issue.

Cheers!

@schloerke schloerke added data frame Related to @render.data_frame and removed needs-triage labels May 8, 2024
@schloerke schloerke added this to the v0.9.1 milestone May 8, 2024
@schloerke schloerke added the bug Something isn't working label May 8, 2024
@schloerke
Copy link
Collaborator

schloerke commented May 8, 2024

@pieterjanvc

At this point input.<tableID>_selected_rows() stopped working and is no longer updating any reactive function.

Yes. It was removed without warning. Sorry 'bout that!

I'll restore it for the next release and add a clear news item about removing it in a future release.

[I'd like to have access to the selected rows]

Good news...

With a combination of

@reactive.calc
def self__input_data_view_indices() -> list[int]:
data_view_indices = self._get_session().input[
f"{self.output_id}_data_view_indices"
]()
return data_view_indices
self._input_data_view_indices = self__input_data_view_indices
and
cell_selection = self.input_cell_selection()
if cell_selection is not None and cell_selection["type"] == "row":
# Use a `set` for faster lookups
selected_row_indices_set = set(cell_selection["rows"])
# Subset the data view indices to only include the selected rows
data_view_indices = [
index
for index in data_view_indices
if index in selected_row_indices_set
]
, maybe we expose them via .data_view_indices(*, selected: bool = False)? (Similar to .data_view(*, selected: bool = False))

It could return a {rows: list[int], cols: list[int]} object containing all rows/cols being used (given selected value). We could add a type to describe how the values were selected, similar to the CellSelection type

class CellSelectionRow(TypedDict):
type: Literal["row"]
rows: ListOrTuple[int]
class CellSelectionCol(TypedDict):
type: Literal["col"]
cols: ListOrTuple[int]
class CellSelectionRect(TypedDict):
type: Literal["rect"]
# Attempt to type the list of size 2, but it is not type enforced
rows: tuple[int, int] | Annotated[list[int], 2]
cols: tuple[int, int] | Annotated[list[int], 2]
# For receiving selection info from JS:
CellSelection = Union[
CellSelectionRow,
CellSelectionCol,
CellSelectionRect,
CellSelectionNone,
]
.

For the return value, I still don't know if rows and cols should always exist (containing all values). If they always exist, it is easier to code as a user. If they do not exist, it is much clearer as to what is being subsetted but more annoying to integrate with as a user. I'm currently leaning towards always including rows and cols containing all values being used. Ex: multi row selection would return {type: "row", rows: [1,3,5], cols: [0,1,2,3,4,5]} assuming we selected rows 1, 3, and 5 and there were 6 columns.

class DataViewIndices(TypedDict):
    type: Literal["row", "col", "rect", "none"]
    rows: ListOrTuple[int]
    cols: ListOrTuple[int]

# class data_frame...
def data_view_indices(self, *, selected: bool = False) -> DataViewIndices:
    ...

@pstorozenko Glad you like .data_view()!

It's not a problem for now, but if one day we shiny switches to some dataframe representation without creature like pd.Index it may be.

Can I get a copy of your crystal ball? 😆

I'd love to move to something like https://github.com/data-apis/dataframe-api-compat , however to get typing support, their typing specification is defined deep within another repo: https://github.com/data-apis/dataframe-api/tree/main/spec/API_specification/dataframe_api .

I do not understand why the types being used within dataframe-api-compat are not shipped with the package. /rant

@schloerke
Copy link
Collaborator

Original issue is being fixed in #1377 .

Making new issue for .data_view_info() support

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment