gpt4 book ai didi

javascript - 如何使 toLowerCase() 和 toUpperCase() 在浏览器之间保持一致

转载 作者:行者123 更新时间:2023-11-29 10:57:55 25 4
gpt4 key购买 nike

是否有 String.toLowerCase() 和 String.toUpperCase() 的 JavaScript polyfill 实现,或者 JavaScript 中可以处理 Unicode 字符并在浏览器之间保持一致的其他方法?

背景信息

执行以下操作将在浏览器中产生不同的结果,甚至在浏览器版本之间(例如 FireFox 54 与 55):

document.write(String.fromCodePoint(223).normalize("NFKC").toLowerCase().toUpperCase().toLowerCase())

在 Firefox 55 中,它为您提供 ss,在 Firefox 54 中,它为您提供 ß

通常这没问题,像 Locales 这样的机制可以处理很多你想要的情况;但是,当您需要跨平台的一致行为时,例如与 之类的 BaaS 系统交谈它可以极大地简化您实际上是在客户端处理内部数据的交互。

最佳答案

请注意,此问题似乎只影响过时的 Firefox 版本,因此除非您明确需要支持那些旧版本,否则您可以选择完全不打扰。您示例的行为在所有现代浏览器中都是相同的(自 Firefox 发生变化以来)。这个可以验证using jsvu + eshost :

$ jsvu # Update installed JavaScript engine binaries to the latest version.

$ eshost -e '"\xDF".normalize("NFKC").toLowerCase().toUpperCase().toLowerCase()'
#### Chakra
ss

#### V8 --harmony
ss

#### JavaScriptCore
ss

#### V8
ss

#### SpiderMonkey
ss

#### xs
ss

但是你问的是如何解决这个问题,那我们继续吧。

https://tc39.github.io/ecma262/#sec-string.prototype.tolowercase 的第 4 步状态:

Let cuList be a List where the elements are the result of toLowercase(cpList), according to the Unicode Default Case Conversion algorithm.

Unicode 默认大小写转换算法section 3.13 Default Case Algorithms of the Unicode standard 中指定.

The full case mappings for Unicode characters are obtained by using the mappings from SpecialCasing.txt plus the mappings from UnicodeData.txt, excluding any of the latter mappings that would conflict. Any character that does not have a mapping in these files is considered to map to itself.

[…]

The following rules specify the default case conversion operations for Unicode strings. These rules use the full case conversion operations, Uppercase_Mapping(C), Lowercase_Mapping(C), and Titlecase_Mapping(C), as well as the context-dependent mappings based on the casing context, as specified in Table 3-17.

For a string X:

  • R1 toUppercase(X): Map each character C in X to Uppercase_Mapping(C).
  • R2 toLowercase(X): Map each character C in X to Lowercase_Mapping(C).

这是来自 SpecialCasing.txt 的示例,下面添加了我的注释:

00DF  ; 00DF   ; 0053 0073; 0053 0053;                      # LATIN SMALL LETTER SHARP S
<code>; <lower>; <title> ; <upper> ; (<condition_list>;)? # <comment>

这一行表示 U+00DF ('ß') 小写为 U+00DF (ß) 大写为 U+0053 U+0053 ( SS).

这是来自 UnicodeData.txt 的示例,下面添加了我的注释:

0041  ; LATIN CAPITAL LETTER A; Lu;0;L;;;;;N;;;; 0061   ;
<code>; <name> ; <ignore> ; <lower>; <upper>

这一行表示 U+0041 ('A') 小写为 U+0061 ('a')。它没有明确的大写映射,这意味着它自身大写。

这是来自 UnicodeData.txt 的另一个例子:

0061  ; LATIN SMALL LETTER A; Ll;0;L;;;;;N;; ;0041;        ; 0041
<code>; <name> ; <ignore> ; <lower>; <upper>

这一行表示 U+0061 ('a') 大写为 U+0041 ('A')。它没有明确的小写映射,这意味着它小写到自身。

您可以编写一个脚本来解析这两个文件,读取这些示例之后的每一行,并构建小写/大写映射。然后,您可以将这些映射转换为一个小型 JavaScript 库,该库提供符合规范的 toLowerCase/toUpperCase 功能。

这看起来工作量很大。根据 Firefox 中的旧行为以及具体发生了什么变化(?),您可能会将工作限制为 SpecialCasing.txt 中的特殊映射。 . (根据您提供的示例,我假设在 Firefox 55 中只有特殊的外壳发生了变化。)

// Instead of…
function normalize(string) {
const normalized = string.normalize('NFKC');
const lowercased = normalized.toLowerCase();
return lowercased;
}

// …one could do something like:
function lowerCaseSpecialCases(string) {
// TODO: replace all SpecialCasing.txt characters with their lowercase
// mapping.
return string.replace(/TODO/g, fn);
}
function normalize(string) {
const normalized = string.normalize('NFKC');
const fixed = lowerCaseSpecialCases(normalized); // Workaround for old Firefox 54 behavior.
const lowercased = fixed.toLowerCase();
return lowercased;
}

我编写了一个脚本来解析 SpecialCasing.txt 并生成一个 JS 库,该库实现了上面提到的 lowerCaseSpecialCases 功能(如 toLower)以及 toUpper。这是:https://gist.github.com/mathiasbynens/a37e3f3138069729aa434ea90eea4a3c根据您的具体用例,您可能根本不需要 toUpper 及其相应的正则表达式和映射。这是完整生成的库:

const reToLower = /[\u0130\u1F88-\u1F8F\u1F98-\u1F9F\u1FA8-\u1FAF\u1FBC\u1FCC\u1FFC]/g;
const toLowerMap = new Map([
['\u0130', 'i\u0307'],
['\u1F88', '\u1F80'],
['\u1F89', '\u1F81'],
['\u1F8A', '\u1F82'],
['\u1F8B', '\u1F83'],
['\u1F8C', '\u1F84'],
['\u1F8D', '\u1F85'],
['\u1F8E', '\u1F86'],
['\u1F8F', '\u1F87'],
['\u1F98', '\u1F90'],
['\u1F99', '\u1F91'],
['\u1F9A', '\u1F92'],
['\u1F9B', '\u1F93'],
['\u1F9C', '\u1F94'],
['\u1F9D', '\u1F95'],
['\u1F9E', '\u1F96'],
['\u1F9F', '\u1F97'],
['\u1FA8', '\u1FA0'],
['\u1FA9', '\u1FA1'],
['\u1FAA', '\u1FA2'],
['\u1FAB', '\u1FA3'],
['\u1FAC', '\u1FA4'],
['\u1FAD', '\u1FA5'],
['\u1FAE', '\u1FA6'],
['\u1FAF', '\u1FA7'],
['\u1FBC', '\u1FB3'],
['\u1FCC', '\u1FC3'],
['\u1FFC', '\u1FF3']
]);
const toLower = (string) => string.replace(reToLower, (match) => toLowerMap.get(match));

const reToUpper = /[\xDF\u0149\u01F0\u0390\u03B0\u0587\u1E96-\u1E9A\u1F50\u1F52\u1F54\u1F56\u1F80-\u1FAF\u1FB2-\u1FB4\u1FB6\u1FB7\u1FBC\u1FC2-\u1FC4\u1FC6\u1FC7\u1FCC\u1FD2\u1FD3\u1FD6\u1FD7\u1FE2-\u1FE4\u1FE6\u1FE7\u1FF2-\u1FF4\u1FF6\u1FF7\u1FFC\uFB00-\uFB06\uFB13-\uFB17]/g;
const toUpperMap = new Map([
['\xDF', 'SS'],
['\uFB00', 'FF'],
['\uFB01', 'FI'],
['\uFB02', 'FL'],
['\uFB03', 'FFI'],
['\uFB04', 'FFL'],
['\uFB05', 'ST'],
['\uFB06', 'ST'],
['\u0587', '\u0535\u0552'],
['\uFB13', '\u0544\u0546'],
['\uFB14', '\u0544\u0535'],
['\uFB15', '\u0544\u053B'],
['\uFB16', '\u054E\u0546'],
['\uFB17', '\u0544\u053D'],
['\u0149', '\u02BCN'],
['\u0390', '\u0399\u0308\u0301'],
['\u03B0', '\u03A5\u0308\u0301'],
['\u01F0', 'J\u030C'],
['\u1E96', 'H\u0331'],
['\u1E97', 'T\u0308'],
['\u1E98', 'W\u030A'],
['\u1E99', 'Y\u030A'],
['\u1E9A', 'A\u02BE'],
['\u1F50', '\u03A5\u0313'],
['\u1F52', '\u03A5\u0313\u0300'],
['\u1F54', '\u03A5\u0313\u0301'],
['\u1F56', '\u03A5\u0313\u0342'],
['\u1FB6', '\u0391\u0342'],
['\u1FC6', '\u0397\u0342'],
['\u1FD2', '\u0399\u0308\u0300'],
['\u1FD3', '\u0399\u0308\u0301'],
['\u1FD6', '\u0399\u0342'],
['\u1FD7', '\u0399\u0308\u0342'],
['\u1FE2', '\u03A5\u0308\u0300'],
['\u1FE3', '\u03A5\u0308\u0301'],
['\u1FE4', '\u03A1\u0313'],
['\u1FE6', '\u03A5\u0342'],
['\u1FE7', '\u03A5\u0308\u0342'],
['\u1FF6', '\u03A9\u0342'],
['\u1F80', '\u1F08\u0399'],
['\u1F81', '\u1F09\u0399'],
['\u1F82', '\u1F0A\u0399'],
['\u1F83', '\u1F0B\u0399'],
['\u1F84', '\u1F0C\u0399'],
['\u1F85', '\u1F0D\u0399'],
['\u1F86', '\u1F0E\u0399'],
['\u1F87', '\u1F0F\u0399'],
['\u1F88', '\u1F08\u0399'],
['\u1F89', '\u1F09\u0399'],
['\u1F8A', '\u1F0A\u0399'],
['\u1F8B', '\u1F0B\u0399'],
['\u1F8C', '\u1F0C\u0399'],
['\u1F8D', '\u1F0D\u0399'],
['\u1F8E', '\u1F0E\u0399'],
['\u1F8F', '\u1F0F\u0399'],
['\u1F90', '\u1F28\u0399'],
['\u1F91', '\u1F29\u0399'],
['\u1F92', '\u1F2A\u0399'],
['\u1F93', '\u1F2B\u0399'],
['\u1F94', '\u1F2C\u0399'],
['\u1F95', '\u1F2D\u0399'],
['\u1F96', '\u1F2E\u0399'],
['\u1F97', '\u1F2F\u0399'],
['\u1F98', '\u1F28\u0399'],
['\u1F99', '\u1F29\u0399'],
['\u1F9A', '\u1F2A\u0399'],
['\u1F9B', '\u1F2B\u0399'],
['\u1F9C', '\u1F2C\u0399'],
['\u1F9D', '\u1F2D\u0399'],
['\u1F9E', '\u1F2E\u0399'],
['\u1F9F', '\u1F2F\u0399'],
['\u1FA0', '\u1F68\u0399'],
['\u1FA1', '\u1F69\u0399'],
['\u1FA2', '\u1F6A\u0399'],
['\u1FA3', '\u1F6B\u0399'],
['\u1FA4', '\u1F6C\u0399'],
['\u1FA5', '\u1F6D\u0399'],
['\u1FA6', '\u1F6E\u0399'],
['\u1FA7', '\u1F6F\u0399'],
['\u1FA8', '\u1F68\u0399'],
['\u1FA9', '\u1F69\u0399'],
['\u1FAA', '\u1F6A\u0399'],
['\u1FAB', '\u1F6B\u0399'],
['\u1FAC', '\u1F6C\u0399'],
['\u1FAD', '\u1F6D\u0399'],
['\u1FAE', '\u1F6E\u0399'],
['\u1FAF', '\u1F6F\u0399'],
['\u1FB3', '\u0391\u0399'],
['\u1FBC', '\u0391\u0399'],
['\u1FC3', '\u0397\u0399'],
['\u1FCC', '\u0397\u0399'],
['\u1FF3', '\u03A9\u0399'],
['\u1FFC', '\u03A9\u0399'],
['\u1FB2', '\u1FBA\u0399'],
['\u1FB4', '\u0386\u0399'],
['\u1FC2', '\u1FCA\u0399'],
['\u1FC4', '\u0389\u0399'],
['\u1FF2', '\u1FFA\u0399'],
['\u1FF4', '\u038F\u0399'],
['\u1FB7', '\u0391\u0342\u0399'],
['\u1FC7', '\u0397\u0342\u0399'],
['\u1FF7', '\u03A9\u0342\u0399']
]);
const toUpper = (string) => string.replace(reToUpper, (match) => toUpperMap.get(match));

关于javascript - 如何使 toLowerCase() 和 toUpperCase() 在浏览器之间保持一致,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/53488013/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com