Add new Naming/InclusiveLanguage cop #9842

tjwp · 2021-05-31T23:00:53Z

This pull request adds a new cop (Naming/InclusiveLanguage) that recommends the use of inclusive language instead of problematic terms.

Following this comment, the cop is generic and can target different terms based on configuration. Suggested replacements can be configured in addition to exceptions for allowed use of a term.

The implementation is inspired by some of the ideas found in https://github.com/flexport/inclusive-code/tree/main/ruby.

Auto-correct is not implemented because too much context is required to choose an appropriate replacement suggestion and many renames would be unsafe.

bbatsov · 2021-06-01T06:50:56Z

Thanks for tackling this! Two points from me:

Shouldn't this target only identifiers? I think that extending the check to filenames is a bit excessive.
I think this cop can be used to suggest alternatives all sorts of poor identifier names, not just those related to inclusive language (e.g. confusing shorthands, commonly misspelled words and so on).

tjwp · 2021-06-01T12:58:35Z

@bbatsov Thanks for taking a look!

I was intentionally targeting entire codebases, comments included. My intent was to identify language that may be unwelcoming wherever it appears in a project. I'm okay with dropping the file path part. Terms in the file or directory name would like also be reflected in file contents.
I'm hesitant to combine the existing implementation with other examples of poor identifier names because the reason that offenses are being flagged is very different. For example a commonly misspelled word may be for execution correctness, whereas the reason for this cop is inclusiveness. The configuration would likely need to be broken out by use case, or supply a reason for the offense with each term. Also there is the implementation difference of targeting just identifiers vs all content.

What do you think about a GoodIdentifier cop in addition to this cop? It could target identifiers specifically for confusing shorthands and common misspellings but some of the implementation and configuration approach could possibly be shared with the InclusiveLanguage cop?

bbatsov · 2021-06-01T13:26:37Z

I get your point, but I'm concerned that some words like master have plenty of valid uses (e.g. Master of None, Master Boot Record, etc) and I don't want to generate lots of false positives for the users. I agree that whitelist and blacklist are universally bad, but for master I think that the context matters a lot. Just my two cents. @rubocop/rubocop-core What are your thoughts on this?

koic · 2021-06-01T14:03:09Z

In the past, I worked on #7469. What I remember as a task at that time was searching for the target term in the text files with grep. So I didn't parse AST with a static analysis tool.
I'm wondering whether a static analysis tool (e.g. RuboCop) should have the feature, as it also needs to be detected in the documents that users see. Users will see more of documentation than code.
IMHO, for this solution, text search tool is better and faster than static analysis tool. The right tool for the right job :-)

tjwp · 2021-06-01T16:14:46Z

@bbatsov To handle master without a lot of false positives, I think there a couple of different options:

Don't include that term in the default configuration. Users can create a custom configuration if they want to check for it.
Or ship with a default list of AllowedRegex values that includes examples like Master of None, Master Boot Record (what others?).
Personally, I'm more invested in having the generic capability than shipping a specific configuration.

@koic I see your point that this cop does not rely on parsing the code. The benefit that I see to including it in RuboCop is everything else that Rubcop already provides. I.e. the tooling is already setup in build pipelines, possibly across hundreds or thousands of repos for large companies. Building something on top of grep or another static analysis tool would also require defining all of the configuration conventions, writing formatters for reporting offenses, inventing mechanisms for disabling offenses and managing TODO lists. Though this doesn't use the AST it benefits from everything else RuboCop provides.

tjwp · 2021-06-04T00:16:15Z

I've removed the file path check, as a separate commit for now.

tas50 · 2021-06-08T05:10:16Z

This will be super handy in our project. Thanks for creating it @tjwp

lib/rubocop/cop/naming/inclusive_language.rb

bbatsov · 2021-06-10T05:52:02Z

Overall I'm happy with the current code of the cop, but I want to make it more configurable - e.g. to have CheckIdentifiers, CheckStrings, CheckComments flags that make its operation a bit more granular. If we go this route we can also bring back the filename check as well.

Sorry about the slow feedback here - I had a couple of busy weeks on my end.

tjwp · 2021-06-10T14:47:38Z

@bbatsov No problem! Thanks for the review. I'll start on making the requested changes.

Recommends the use of inclusive language instead of problematic terms.

Drenmi · 2021-06-19T14:42:28Z

Just my two cents. @rubocop/rubocop-core What are your thoughts on this?

The implementation is certainly a bit unorthodox, as Koichi already pointed out. I don't mind it conceptually, but matching regular expressions on every token does sound like it could significantly affect the execution time. Some benchmark around this would be nice.

I certainly think that this cop should either be 1) disabled by default, or 2) be shipped with a minimal configuration. As the author pointed out, the general capability is more important than a specific configuration. RuboCop is used by many people, and like with the other cops, we leave it to them to define their own policies.

tjwp · 2021-06-19T15:56:16Z

@Drenmi Thanks for taking a look!

The implementation is certainly a bit unorthodox, as Koichi already pointed out. I don't mind it conceptually, but matching regular expressions on every token does sound like it could significantly affect the execution time. Some benchmark around this would be nice.

A small update on this. The implementation has been updated based on feedback to look at specific types of tokens, and all of those are now individually configurable.

I certainly think that this cop should either be 1) disabled by default, or 2) be shipped with a minimal configuration. As the author pointed out, the general capability is more important than a specific configuration. RuboCop is used by many people, and like with the other cops, we leave it to them to define their own policies.

👍🏻

tjwp · 2021-06-21T20:09:43Z

I ran a very simple benchmark around this using the rubocop code base. The following 3 scenarios were tested:

InclusiveLanguage cop disabled
InclusiveLanguage cop enabled with default config
InclusiveLanguage cop enable with repo config

Each scenario was run 3 times with the cache disabled:

Scenario	Avg	Max	Min	% increase avg
Disabled	4:13.87	4:14:56	4:12.57
Enabled w/ default	4:14.86	4:16.70	4:12.79	0.4%
Enabled w/ repo	4:17.77	4:18.37	4:15.59	1.5%

My interpretation of those results is that the impact on the execution time is not large. With the current default configuration from the change, the increase was small, with the fastest execution falling within the range of the baseline executions. With the config included for the rubocop repo, the effect of the cop was larger, but still a small percentage increase for the overall execution time.

bbatsov · 2021-06-23T07:16:37Z

Great work! Thanks for working on this!

Follow up rubocop/rubocop#9842.

tjwp · 2021-06-23T11:06:22Z

Great work! Thanks for working on this!

Thank you! It's been several years since my last contribution to rubocop, but a pleasure to be back here and able to contribute!

I don't agree with forcing human language into cultural limits and therefore propose to disable the cop introduced by rubocop/rubocop#9842. My code normally offends people in a lot of different non-cultural ways already 😉 . It's fine if Rubocop complains about those. But please talk to me in person if my code offends you by cultural issues.

Follow up rubocop/rubocop#9842.

tjwp force-pushed the tjwp-naming-inclusive-language branch 2 times, most recently from a3d3a6b to 1aeba9f Compare May 31, 2021 23:15

bbatsov reviewed Jun 10, 2021

View reviewed changes

lib/rubocop/cop/naming/inclusive_language.rb Outdated Show resolved Hide resolved

bbatsov reviewed Jun 10, 2021

View reviewed changes

lib/rubocop/cop/naming/inclusive_language.rb Outdated Show resolved Hide resolved

bbatsov reviewed Jun 10, 2021

View reviewed changes

lib/rubocop/cop/naming/inclusive_language.rb Outdated Show resolved Hide resolved

tjwp added 2 commits June 15, 2021 20:35

Add new Naming/InclusiveLanguage cop

28f7e8a

Recommends the use of inclusive language instead of problematic terms.

Fix CLI option spec

c979288

tjwp force-pushed the tjwp-naming-inclusive-language branch from 64b3b44 to 50990e4 Compare June 16, 2021 00:36

Review feedback

c73b83a

tjwp force-pushed the tjwp-naming-inclusive-language branch from 50990e4 to c73b83a Compare June 19, 2021 11:41

bbatsov merged commit 609cd7f into rubocop:master Jun 23, 2021

koic added a commit to rubocop/rubocop-minitest that referenced this pull request Jun 23, 2021

Suppress Naming/InclusiveLanguage warn

b91f0c6

Follow up rubocop/rubocop#9842.

koic added a commit to rubocop/rubocop-performance that referenced this pull request Jun 23, 2021

Suppress Naming/InclusiveLanguage warn

04d0ebd

Follow up rubocop/rubocop#9842.

koic added a commit to rubocop/rubocop-rails that referenced this pull request Jun 23, 2021

Suppress Naming/InclusiveLanguage warn

1f380d2

Follow up rubocop/rubocop#9842.

schmijos mentioned this pull request Jun 29, 2021

Disable Naming/InclusiveLanguage renuo/applications-setup-guide#272

Merged

tjwp mentioned this pull request Jun 30, 2021

Enable and configure Naming/InclusiveLanguage Shopify/ruby-style-guide#288

Merged

renawatson68 added a commit to renawatson68/performance-develop-rubyonrails that referenced this pull request Sep 23, 2022

Suppress Naming/InclusiveLanguage warn

b7c7586

Follow up rubocop/rubocop#9842.

richardstewart0213 added a commit to richardstewart0213/performance-build-Performance-optimization-analysis- that referenced this pull request Nov 4, 2022

Suppress Naming/InclusiveLanguage warn

ff888d4

Follow up rubocop/rubocop#9842.

MarttiCheng added a commit to MarttiCheng/Rubocop-Performance that referenced this pull request Sep 28, 2023

Suppress Naming/InclusiveLanguage warn

48174d6

Follow up rubocop/rubocop#9842.

SerhiiMisiura added a commit to SerhiiMisiura/Rubocop-Performance that referenced this pull request Oct 5, 2023

Suppress Naming/InclusiveLanguage warn

3972752

Follow up rubocop/rubocop#9842.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add new Naming/InclusiveLanguage cop #9842

Add new Naming/InclusiveLanguage cop #9842

tjwp commented May 31, 2021

bbatsov commented Jun 1, 2021

tjwp commented Jun 1, 2021

bbatsov commented Jun 1, 2021

koic commented Jun 1, 2021

tjwp commented Jun 1, 2021

tjwp commented Jun 4, 2021

tas50 commented Jun 8, 2021

bbatsov commented Jun 10, 2021

tjwp commented Jun 10, 2021

Drenmi commented Jun 19, 2021

tjwp commented Jun 19, 2021

tjwp commented Jun 21, 2021 •

edited

bbatsov commented Jun 23, 2021

tjwp commented Jun 23, 2021

Add new Naming/InclusiveLanguage cop #9842

Add new Naming/InclusiveLanguage cop #9842

Conversation

tjwp commented May 31, 2021

bbatsov commented Jun 1, 2021

tjwp commented Jun 1, 2021

bbatsov commented Jun 1, 2021

koic commented Jun 1, 2021

tjwp commented Jun 1, 2021

tjwp commented Jun 4, 2021

tas50 commented Jun 8, 2021

bbatsov commented Jun 10, 2021

tjwp commented Jun 10, 2021

Drenmi commented Jun 19, 2021

tjwp commented Jun 19, 2021

tjwp commented Jun 21, 2021 • edited

bbatsov commented Jun 23, 2021

tjwp commented Jun 23, 2021

tjwp commented Jun 21, 2021 •

edited