Luke T. Shumaker

Python's str.lower() converts 'İ' (U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE) to 'i̇' (U+0069 LATIN SMALL LETTER I + U+0307 COMBINING DOT ABOVE), while Go's strings.ToLower() and Emacs's downcase-word convert it to just 'i' (U+0069 LATIN SMALL LETTER I).
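For anyone who wants to poke at the Python side, here's a quick sketch that prints the codepoints str.lower() produces for U+0130. The describe() helper is just something made up for this example, not a stdlib function.

```python
import unicodedata


def describe(s):
    """List each codepoint in s as 'U+XXXX NAME' (hypothetical helper)."""
    return [f"U+{ord(c):04X} {unicodedata.name(c)}" for c in s]


# U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE, lowercased by Python 3.
lowered = "\u0130".lower()

for line in describe(lowered):
    print(line)
```

On Python 3 this should list two codepoints, U+0069 followed by U+0307, which is the two-codepoint result described above.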

And I have a hard time arguing why either is wrong. I sure wish I could consult util.unicode.org, but it's down rn.

So UnicodeData.txt and SpecialCasing.txt disagree. UnicodeData.txt is supposed to leave the field blank if it wants you to defer to SpecialCasing.txt. But UnicodeData.txt says that just U+0069 is correct, while SpecialCasing.txt says that U+0307 should be included.
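To make the disagreement concrete, here's a rough sketch that parses the U+0130 entry from each file. The two lines are pasted in as string constants as I understand them from the Unicode Character Database, so treat them as an assumption and check them against the UCD version you actually care about. simple_lower() and full_lower() are hypothetical helper names.

```python
# Assumed entries for U+0130; verify against your copy of the UCD.
UNICODE_DATA_LINE = (
    "0130;LATIN CAPITAL LETTER I WITH DOT ABOVE;Lu;0;L;0049 0307;;;;N;"
    "LATIN CAPITAL LETTER I DOT;;;0069;"
)
SPECIAL_CASING_LINE = (
    "0130; 0069 0307; 0130; 0130; # LATIN CAPITAL LETTER I WITH DOT ABOVE"
)


def simple_lower(unicode_data_line):
    """Return the simple lowercase mapping (field 13) as a list of codepoints."""
    fields = unicode_data_line.split(";")
    return [int(fields[13], 16)] if fields[13] else []


def full_lower(special_casing_line):
    """Return the full lowercase mapping (field 1) as a list of codepoints."""
    data = special_casing_line.split("#", 1)[0]  # drop the trailing comment
    fields = [f.strip() for f in data.split(";")]
    return [int(cp, 16) for cp in fields[1].split()]


print([f"U+{cp:04X}" for cp in simple_lower(UNICODE_DATA_LINE)])
print([f"U+{cp:04X}" for cp in full_lower(SPECIAL_CASING_LINE)])
```

With those two lines, the first call gives just U+0069 and the second gives U+0069 U+0307, which is exactly the mismatch.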

Now, Go's implementation of strings.ToLower 100% cannot handle case-conversions that change the number of codepoints, so even if we accept that it's correct in this case, it's broken in general.
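Here's a toy model of that limitation, in Python rather than Go so it can sit next to str.lower(). It is not Go's actual code, just a sketch of what any one-codepoint-in, one-codepoint-out mapper can and can't express. The tables and function names are made up for illustration and only cover U+0130.

```python
# Hypothetical one-entry tables, just enough to show the difference.
SIMPLE_LOWER = {0x0130: 0x0069}           # codepoint -> codepoint
FULL_LOWER = {0x0130: "\u0069\u0307"}     # codepoint -> string


def lower_one_to_one(s):
    """Lowercase by mapping each codepoint to exactly one codepoint."""
    out = []
    for c in s:
        if ord(c) in SIMPLE_LOWER:
            out.append(chr(SIMPLE_LOWER[ord(c)]))
        else:
            out.append(c.lower())
    return "".join(out)


def lower_full(s):
    """Lowercase allowing one codepoint to expand into several."""
    return "".join(FULL_LOWER.get(ord(c), c.lower()) for c in s)


word = "\u0130STANBUL"
print(len(word), len(lower_one_to_one(word)), len(lower_full(word)))
```

The one-to-one version can never return more codepoints than it was given, so there's no way for it to produce the U+0069 U+0307 form, whichever file you decide is right.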