Update index handling in PandasAdapter
#28731
Comments
This is currently outside our API scope, and I don't think we want to add code that we are not using internally. I think this is a design purpose that scikit-learn defines. We might revisit this if we come up with transformers that make independent column-wise transforms. So I'm -1 for the inclusion right now.
Makes sense. Perhaps this would be useful for the
As input but not as output. Here, I think it is more on the output side that you want to act.
Correct: those transformers take 1D input and then output a 2D object. I figure if they had access to the
Describe the workflow you want to enable
As noted in #27037, handling the index of an input container can be hairy. The solution implemented in #27044 works, but it excludes `pandas.Series` input types. I'd like to modify the logic in the `PandasAdapter.create_container` method so that it checks whether `X_original` is a `pandas.DataFrame` or a `pandas.Series`. This would allow transformers that accept 1-dimensional inputs and output 2-dimensional dataframes to persist their indices.

Describe your proposed solution
I'd like to change line 124 from this:
To this:
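As a rough illustration of the check described above, here is a minimal sketch written as a standalone helper. `pick_index` is a hypothetical name used only for this example; it is not part of scikit-learn and is not a copy of the line being changed. It only assumes that the adapter chooses an index from either the output or the original input.

```python
import pandas as pd


def pick_index(X_output, X_original):
    """Hypothetical helper: choose the index for a pandas output container."""
    # If the transformer already returned a DataFrame, keep its index.
    if isinstance(X_output, pd.DataFrame):
        return X_output.index
    # Proposed behaviour: accept a Series as well as a DataFrame input.
    if isinstance(X_original, (pd.DataFrame, pd.Series)):
        return X_original.index
    # Anything else (lists, arrays) carries no index to preserve.
    return None


# 1-dimensional input with a non-default index.
s = pd.Series(["a", "b", "a"], index=[10, 20, 30])
# 2-dimensional output of some 1-D -> 2-D transform (made-up values).
encoded = [[1, 0], [0, 1], [1, 0]]

out = pd.DataFrame(encoded, index=pick_index(encoded, s), columns=["a", "b"])
```

With the `pandas.Series` branch in place, `out` carries the index of `s`; without it, the helper would return `None` for a Series input and the output would fall back to a default `RangeIndex`.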
Describe alternatives you've considered, if relevant
User sets the index on their own:
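A minimal sketch of what this manual workaround might look like. The `one_hot` function is a made-up stand-in for any transformer that maps 1-dimensional input to a 2-dimensional array; it is an assumption for illustration, not code from scikit-learn or mlxtend.

```python
import numpy as np
import pandas as pd


def one_hot(values):
    """Stand-in for a transformer that maps 1-D input to a 2-D array."""
    categories = sorted(set(values))
    array = np.array([[v == c for c in categories] for v in values], dtype=int)
    return array, categories


s = pd.Series(["x", "y", "x"], index=["r1", "r2", "r3"])

# The transform returns a plain array, so the input index is lost here.
array, categories = one_hot(s)
df = pd.DataFrame(array, columns=categories)

# Workaround: the user restores the index by hand.
df.index = s.index
```

This works, but every caller has to remember the extra assignment after each transform.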
Additional context
I recognize most transformers in `scikit-learn` expect 2-dimensional inputs. But some packages that depend on `scikit-learn` (like `mlxtend`) have transformers that transform 1-dimensional input into 2-dimensional output. I believe this would greatly benefit them. See the newly updated `TransactionEncoder` for an example.

I'm willing to submit a PR if this is an acceptable enhancement.