ENH Improves error message for mixed types for feature names #25018
Merged: ogrisel merged 7 commits into scikit-learn:main from thomasjpfan:improve_error_message_feature_names on Nov 25, 2022
Commits
1074def ENH Imrpoves error message for mixed types for feature names (thomasjpfan)
69fd96f Merge remote-tracking branch 'upstream/main' into improve_error_messa… (thomasjpfan)
51e447f CLN Improves error message (thomasjpfan)
ed509cf CLN Use column names (thomasjpfan)
97c7da6 improve warning message (jeremiedbb)
db8d685 remove the warning part (jeremiedbb)
77d44d5 roll back (jeremiedbb)
nit:
The "all" is unnecessary, and I find it makes the message trickier to read rather than easier to understand.
(Only commenting once here, instead of on all tests and where the message comes from)
I think w/o the `all` here, it could mean that only string feature names are supported and the rest can be ignored. The two don't mean the same thing to me, and the latter doesn't seem correct.
The way I understood the feature is that if you specify a column name, it has to be a string. No other types are allowed. Is that right? If not, ignore the below.
For me "column names that are all strings" is like "food made from all organic ingredients". The "all" is a statement about the ingredients, which in this case are exclusively organic. As a single column name can't consist of two different types it seems weird to say "all strings". "All column names have to be strings" is a statement about the column names, they all have to be strings.
There's a logical fallacy in your example. Feature names are separate things, but food is one thing, so your example would be more like "food made from only organic ingredients". A better example: "we can allow a group of people who are all above 18", and to me that makes sense.
From reading the code I think the answer to my question is: no.
This means that after considering only the exception's message and the few lines of code around it I didn't understand what the problem was. Hence I don't have a good idea how to succinctly explain to a user what just happened and what they need to do to fix it.
I think this ☝️ is an improvement over what we currently have
I directly pushed the change to use Adrin's suggestion because it looks uncontroversially more clear and I think it's still the middle of the night for Thomas :)
I removed the last sentence about silencing the warning since it's not a warning actually, good catch from #25018 (comment)
I think we should tell people how to disable it though. They don't have to pass feature names, this makes it look like they have to.
I see, I put it back but without mentioning the warning. Is it fine now?
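For readers following the thread, the behaviour being debated can be made concrete with a minimal sketch. It assumes three cases: names that are all strings are kept, a mix of strings and non-strings raises a `TypeError`, and names with no strings at all are simply not stored. `check_feature_names` is a hypothetical helper written for illustration, not scikit-learn's implementation, and the message wording is only an approximation of what the thread converged on.

```python
from typing import Optional, Sequence


def check_feature_names(names: Sequence[object]) -> Optional[list]:
    """Hypothetical helper illustrating the check; not scikit-learn's code."""
    type_names = sorted({type(name).__qualname__ for name in names})

    # Mixed string and non-string names: refuse, and say how to fix it.
    if len(type_names) > 1 and "str" in type_names:
        raise TypeError(
            "Feature names are only supported if all input features have "
            f"string names, but the input has {type_names} as column name "
            "types. Convert them all to strings to have them stored and "
            "validated, or remove the column names / convert them all to a "
            "non-string type."
        )

    # All names are strings: keep them.
    if type_names == ["str"]:
        return list(names)

    # No string names at all: nothing is stored (assumed behaviour).
    return None


mixed = ["age", 1]
try:
    check_feature_names(mixed)
except TypeError as exc:
    print(f"TypeError: {exc}")

# Two ways out, matching the advice discussed in the review:
print(check_feature_names([str(name) for name in mixed]))  # all strings
print(check_feature_names(list(range(len(mixed)))))         # no strings
```

The sketch shows why the message needs to mention both remedies: passing string names is optional, so a user can either convert every name to a string or stop passing string names altogether.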