
java - Which Locale should I specify when I call String#toLowerCase?

Reposted. Author: 太空狗. Updated: 2023-10-29 22:41:25

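In Java, String#toLowerCase uses the system default Locale to decide how lowercasing is handled. If I am lowercasing some ASCII text and want to be sure it is processed as expected, which locale should I use?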

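Edit: I am mainly concerned with programming identifiers, such as table and column names in a schema. So I want English lowercasing rules to apply.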

Locale.ROOT states that it is the language/country-neutral locale for locale-sensitive operations.

Locale.ENGLISH is presumably a safe choice as well.

Best answer

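Yes, Locale.ENGLISH is a safe choice for case operations on things like programming-language identifiers and URL parts, because it involves no special casing rules and English case conversion maps every 7-bit ASCII character to a 7-bit ASCII character.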

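The same does not hold for every other locale. In Turkish, "I" and "i" are not case-converted to each other: uppercase I lowercases to dotless ı, and lowercase i uppercases to dotted İ.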
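To make the difference concrete, here is a minimal sketch that lowercases a column name under the Turkish locale, Locale.ENGLISH and Locale.ROOT. The class name, the `normalizeIdentifier` helper and the sample identifier `USER_ID` are made up for illustration and are not part of the original question or answer.

```java
import java.util.Locale;

// Illustrative class; the name and helper are hypothetical.
final class LocaleLowercaseDemo {

    // Lowercases an identifier with locale-neutral rules,
    // independent of the JVM's default Locale.
    static String normalizeIdentifier(String identifier) {
        return identifier.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");
        String column = "USER_ID"; // sample identifier (assumption)

        // Turkish rules map 'I' to dotless 'ı' (U+0131).
        System.out.println(column.toLowerCase(turkish));        // user_ıd

        // English and root rules keep ASCII within ASCII.
        System.out.println(column.toLowerCase(Locale.ENGLISH)); // user_id
        System.out.println(normalizeIdentifier(column));        // user_id
    }
}
```

Passing the locale explicitly means the result no longer depends on which machine the code runs on, which seems to be the behavior wanted for schema identifiers.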

"Dotted and dotless I" explains:

The Turkish alphabet, which is a variant of the Latin alphabet, includes two distinct versions of the letter I, one dotted and the other dotless.

In Unicode, U+0131 is a lower case letter dotless i (ı). U+0130 (İ) is capital i with dot. ISO-8859-9 has them at positions 0xFD and 0xDD respectively. In normal typography, when lower case i is combined with other diacritics, the dot is generally removed before the diacritic is added; however, Unicode still lists the equivalent combining sequences as including the dotted i, since logically it is the normal dotted i character that is being modified.

Most Unicode software uppercases ı to I and lowercases İ to i, but, unless specifically set up for Turkish, it lowercases I to i and uppercases i to I. Thus uppercasing then lowercasing, or vice versa, changes the letters.
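The round-trip effect described in the quote can be sketched in Java as follows. The class and the `roundTrip` helper are hypothetical names used only for this illustration.

```java
import java.util.Locale;

// Illustrative class; the name and helper are hypothetical.
final class DotlessIRoundTrip {

    // Uppercases and then lowercases a string under one locale.
    static String roundTrip(String s, Locale locale) {
        return s.toUpperCase(locale).toLowerCase(locale);
    }

    public static void main(String[] args) {
        String dotlessI = "\u0131"; // ı, LATIN SMALL LETTER DOTLESS I
        Locale turkish = Locale.forLanguageTag("tr");

        // Without Turkish rules: ı -> I -> i, so the letter changes.
        System.out.println(roundTrip(dotlessI, Locale.ROOT).equals("i"));    // true

        // With Turkish rules: ı -> I -> ı, so the letter survives.
        System.out.println(roundTrip(dotlessI, turkish).equals(dotlessI));   // true
    }
}
```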

The list of special-casing exceptions is maintained at http://unicode.org/Public/UNIDATA/SpecialCasing.txt

# ================================================================================

# Turkish and Azeri

# I and i-dotless; I-dot and i are case pairs in Turkish and Azeri
# The following rules handle those cases.

0130; 0069; 0130; 0130; tr; # LATIN CAPITAL LETTER I WITH DOT ABOVE
0130; 0069; 0130; 0130; az; # LATIN CAPITAL LETTER I WITH DOT ABOVE

# When lowercasing, remove dot_above in the sequence I + dot_above, which will turn into i.
# This matches the behavior of the canonically equivalent I-dot_above

0307; ; 0307; 0307; tr After_I; # COMBINING DOT ABOVE
0307; ; 0307; 0307; az After_I; # COMBINING DOT ABOVE

...
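As a rough check of how the first rule in the excerpt surfaces in Java, the sketch below lowercases U+0130 under the Turkish locale and prints the resulting code points. The class name and the `codePoints` helper are assumptions made for this example. The combining sequence I + U+0307 is printed as well, but without a stated result, since its handling is left for the reader to confirm on their own JDK.

```java
import java.util.Locale;
import java.util.stream.Collectors;

// Illustrative class; the name and helper are hypothetical.
final class SpecialCasingDemo {

    // Renders a string as space-separated U+XXXX code points.
    static String codePoints(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");

        // Rule "0130; 0069; ...; tr": İ lowercases to plain i in Turkish.
        System.out.println(codePoints("\u0130".toLowerCase(turkish))); // U+0069

        // The After_I rule concerns the sequence I + COMBINING DOT ABOVE;
        // print the result rather than assume it.
        System.out.println(codePoints("I\u0307".toLowerCase(turkish)));
    }
}
```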

For more on "java - Which Locale should I specify when I call String#toLowerCase?", see the corresponding question on Stack Overflow: https://stackoverflow.com/questions/10336730/
