casemapping: support rfc1459/rfc1459-strict #2099

Closed
ghost opened this issue Nov 8, 2023 · 2 comments

ghost commented Nov 8, 2023

This was discussed briefly on IRC, but it would be nice if ergo supported the rfc1459 and rfc1459-strict casemappings as well.

Docs: Horse Docs


Notes: InspIRCd and UnrealIRCd both also seem to default to ascii, like ergo (based on a GitHub code search).

Libera (the largest IRC network) uses rfc1459. I imagine most of the top 10 do.

chatlogs
20:31:41 <xnaas> Yes, but my question is really: can ergo support rfc1459 (and rfc1459-strict) and should one of those be the default?
20:32:07 <@slingamn> i think my answers would be "potentially, but why would we want to?" and "no"
20:32:29 <@slingamn> the behavior we implement is a de facto standard and we publish an 005 token describing it
20:32:38 <xnaas> to match what users expect if other IRC networks already do so, mostly
20:33:04 <xnaas> Can say one of my users was very shocked to see someone could take {their_nick} when they had [their_nick] already. :P
20:34:07 <@slingamn> hmm
20:34:59 <xnaas> I'm checking InspIRCd and UnrealIRCd which dwarf ergo by raw server count. I know Libera uses rfc1459.
20:35:58 <xnaas> (And Freenode before it)
20:36:01 <@slingamn> i think it's not a clear user expectation at this point
20:36:09 <@slingamn> you can open an issue...i think dan might have thoughts
20:37:10 <xnaas> looks like InspIRCd and UnrealIRCd both also default to ascii
20:37:37 <xnaas> But I'll have to check the top 10 now :P https://netsplit.de/networks/top10.php
20:37:38 -@ErgoBot- IRC Networks - Top 10 in the annual comparison
20:38:29 <@slingamn> i would guess that most large networks use rfc1459
20:38:45 <@slingamn> but i would also guess that the distinction is important to very few people
20:39:10 <@dan> I’d rather default to ascii, easier for new client authors to implement imo and more appropriate for newer networks, but ye smol distinction
20:39:27 <@slingamn> ohai dan
20:39:33 <@dan> haihai
20:40:25 <@slingamn> dan: we could open a ticket to support rfc1459 and/or rfc1459-strict, and make it incompatible with PRECIS (if you have PRECIS enabled then you must use CASEMAPPING=ascii)
20:40:37 <@slingamn> but, i don't think it's much of a priority
20:40:49 * xnaas can definitely open an issue and you can comment "PRs welcome" 😉
20:40:55 * xnaas knows how FOSS works
20:40:57 <xnaas> 😉
20:41:02 <@slingamn> it would also create some terminological confusion because the config key that controls utf8mapping is named 'casemapping', lol
20:41:11 <@dan> if you’d like to maintain it then totally open to having it ;D
20:41:16 <@dan> ehe
@slingamn slingamn added this to the 2.14 milestone Feb 23, 2024
slingamn (Member) commented

This would be a fun thing to do, in keeping with our overall "reference implementation" mandate, and pretty easy. We'd basically just need to add them here:

ergo/irc/strings.go

Lines 49 to 63 in c67835c

const (
	// "precis" is the default / zero value:
	// casefolding/validation: PRECIS + ircd restrictions (like no *)
	// confusables detection: standard skeleton algorithm
	CasemappingPRECIS Casemapping = iota
	// "ascii" is the traditional ircd behavior:
	// casefolding/validation: must be pure ASCII and follow ircd restrictions, ASCII lowercasing
	// confusables detection: none
	CasemappingASCII
	// "permissive" is an insecure mode:
	// casefolding/validation: arbitrary unicodes that follow ircd restrictions, unicode casefolding
	// confusables detection: standard skeleton algorithm (which may be ineffective
	// over the larger set of permitted identifiers)
	CasemappingPermissive
)

and then add corresponding cases to casefoldWithSetting and skeleton (precis and permissive would use the real skeletonization, ascii and the two new options would use the identity function).
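
For illustration, here is a minimal standalone sketch of what the two extra casefolding rules could look like. It is not ergo's code: the `Mapping` type, its constants, and the `Casefold` function are hypothetical names made up for this example, and it skips the validation that ergo's real casefolding performs. It assumes the usual ISUPPORT definitions, where rfc1459 folds bytes 65 to 94 (`A` through `^`) onto 97 to 126, and rfc1459-strict folds 65 to 93 (`A` through `]`) onto 97 to 125.

```go
package main

import "fmt"

// Mapping is a hypothetical stand-in for a casemapping setting;
// it is not ergo's Casemapping type.
type Mapping int

const (
	MappingASCII Mapping = iota
	MappingRFC1459
	MappingRFC1459Strict
)

// upperBound returns the last byte that counts as "uppercase"
// under the given mapping.
func upperBound(m Mapping) byte {
	switch m {
	case MappingRFC1459:
		return '^' // 94: [ \ ] ^ fold to { | } ~
	case MappingRFC1459Strict:
		return ']' // 93: [ \ ] fold to { | }, ^ is left alone
	default:
		return 'Z' // 90: plain ASCII lowercasing
	}
}

// Casefold lowercases s byte by byte under the given mapping.
// Bytes outside the uppercase range (including non-ASCII bytes)
// are passed through unchanged; no validation is done here.
func Casefold(m Mapping, s string) string {
	hi := upperBound(m)
	out := []byte(s)
	for i, c := range out {
		if c >= 'A' && c <= hi {
			out[i] = c + 32
		}
	}
	return string(out)
}

func main() {
	fmt.Println(Casefold(MappingASCII, "[Their_Nick]"))   // [their_nick]
	fmt.Println(Casefold(MappingRFC1459, "[Their_Nick]")) // {their_nick}
	fmt.Println(Casefold(MappingRFC1459, "{Their_Nick}")) // {their_nick}
}
```

Under rfc1459 the bracket and brace forms of the nick fold to the same string, which is the collision behavior described in the chat log; under ascii they stay distinct.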

slingamn (Member) commented

Oh, and the 005 token would have to be made conditional:

isupport.Add("CASEMAPPING", "ascii")
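
One way the conditional token could be sketched, as a standalone example rather than a patch: the `Casemapping` type and its first three constants are redeclared locally (mirroring the snippet quoted above, with an assumed underlying type) so the example compiles on its own. `CasemappingRFC1459`, `CasemappingRFC1459Strict`, and `casemappingToken` are hypothetical names, and the map stands in for the real ISUPPORT list. The sketch assumes the existing modes keep advertising `ascii`, as the unconditional line above does today.

```go
package main

import "fmt"

// Local redeclaration for the example only; the underlying type is an assumption.
type Casemapping uint

const (
	CasemappingPRECIS Casemapping = iota
	CasemappingASCII
	CasemappingPermissive
	CasemappingRFC1459       // hypothetical new value
	CasemappingRFC1459Strict // hypothetical new value
)

// casemappingToken picks the value to advertise in the 005 CASEMAPPING token.
// Existing modes keep "ascii"; only the two new modes change it.
func casemappingToken(cm Casemapping) string {
	switch cm {
	case CasemappingRFC1459:
		return "rfc1459"
	case CasemappingRFC1459Strict:
		return "rfc1459-strict"
	default:
		return "ascii"
	}
}

func main() {
	// Stand-in for the ISUPPORT list; in ergo this would be the
	// isupport.Add("CASEMAPPING", ...) call shown above.
	tokens := map[string]string{}
	tokens["CASEMAPPING"] = casemappingToken(CasemappingRFC1459)
	fmt.Printf("CASEMAPPING=%s\n", tokens["CASEMAPPING"]) // CASEMAPPING=rfc1459
}
```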
