
go - Can a character span multiple runes in Go?

Repost. Author: IT王子. Updated: 2023-10-29 01:39:12

I read the following on this blog:

Even with rune slices a single character might span multiple runes, which can happen if you have characters with grave accent, for example. This complicated and ambiguous nature of "characters" is the reason why Go strings are represented as byte sequences.

Is this true? (It looks like the blog of someone who knows Go.) I tested it on my machine, and “è” is 1 rune and 2 bytes. The Go doc also seems to say otherwise.

Have you ever come across such characters (in UTF-8)? Can a character span multiple runes in Go?

Best answer

Yes, it can:

s := "é́́"
fmt.Println(s, []rune(s))

Output (try it on the Go Playground):

é́́ [101 769 769 769]

One character, 4 runes. It can be arbitrarily long...

The example is taken from The Go Blog: Text Normalization in Go.
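The asker's measurement ("è" is 1 rune and 2 bytes) and the output above do not contradict each other: the same visible character can be encoded either as one precomposed code point or as a base letter plus a combining mark. A minimal sketch of the difference, where the helper name describe is made up for illustration:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe (hypothetical helper) reports the byte length and rune count of s.
func describe(s string) (bytes, runes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	precomposed := "\u00e8" // è as the single code point U+00E8
	decomposed := "e\u0300" // e followed by COMBINING GRAVE ACCENT (U+0300)

	for _, s := range []string{precomposed, decomposed} {
		b, r := describe(s)
		fmt.Printf("%q: %d bytes, %d runes, code points %U\n", s, b, r, []rune(s))
	}
}
```

The precomposed form is 2 bytes and 1 rune, which matches what the asker saw; the decomposed form is 3 bytes and 2 runes, even though both usually render the same way.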

What is a character?

As was mentioned in the strings blog post, characters can span multiple runes. For example, an 'e' and '◌́' (acute "\u0301") can combine to form 'é' ("e\u0301" in NFD). Together these two runes are one character. The definition of a character may vary depending on the application. For normalization we will define it as a sequence of runes that starts with a starter, a rune that does not modify or combine backwards with any other rune, followed by possibly empty sequence of non-starters, that is, runes that do (typically accents). The normalization algorithm processes one character at a time.
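The starter / non-starter definition quoted above can be approximated with the standard library alone. The sketch below treats every rune in the Unicode category Mn (nonspacing mark) as a non-starter; that is a simplifying assumption, not the actual rule used by the normalization algorithm, and splitCharacters is a made-up name. Real code would more likely rely on the golang.org/x/text/unicode/norm package that the blog post describes.

```go
package main

import (
	"fmt"
	"unicode"
)

// splitCharacters (hypothetical helper) groups runes into "characters":
// a leading rune followed by any nonspacing marks (category Mn).
// This only approximates the starter/non-starter rule.
func splitCharacters(s string) []string {
	var out []string
	var cur []rune
	for _, r := range s {
		// A rune that is not a nonspacing mark begins a new character.
		if !unicode.Is(unicode.Mn, r) && len(cur) > 0 {
			out = append(out, string(cur))
			cur = cur[:0]
		}
		cur = append(cur, r)
	}
	if len(cur) > 0 {
		out = append(out, string(cur))
	}
	return out
}

func main() {
	s := "e\u0301te\u0301" // "été" written with combining acute accents
	chars := splitCharacters(s)
	fmt.Println(len([]rune(s)), "runes,", len(chars), "characters")
	for _, c := range chars {
		fmt.Printf("%q %U\n", c, []rune(c))
	}
}
```

For this input the sketch finds 3 characters in 5 runes.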

A character can be followed by any number of modifiers (modifiers may be repeated, or stacked):

Theoretically, there is no bound to the number of runes that can make up a Unicode character. In fact, there are no restrictions on the number of modifiers that can follow a character and a modifier may be repeated, or stacked. Ever seen an 'e' with three acutes? Here you go: 'é́́'. That is a perfectly valid 4-rune character according to the standard.
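To see the "no bound" claim in action, the combining accent can be appended as many times as desired. This is only an illustration; the function name stack is made up here:

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// stack (hypothetical helper) returns base followed by n copies of
// COMBINING ACUTE ACCENT (U+0301).
func stack(base string, n int) string {
	return base + strings.Repeat("\u0301", n)
}

func main() {
	for _, n := range []int{0, 1, 3, 10} {
		s := stack("e", n)
		fmt.Printf("%d accents: %d runes, %d bytes\n",
			n, utf8.RuneCountInString(s), len(s))
	}
}
```

With three accents the result is the 4-rune string from the answer, [101 769 769 769]; each additional accent adds one more rune (2 bytes in UTF-8) to what is still a single character.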

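See also: Combining character.

Edit: "Doesn't this kill the 'concept of runes'?"

Answer: It is not a concept of runes. A rune is not a character. A rune is an integer value identifying a Unicode code point. A character may be a single Unicode code point, in which case 1 character is 1 rune. Most general uses of runes fit this case, so in practice this hardly causes any headaches. It is a concept of the Unicode standard.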
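That a rune is just an integer can be checked directly: in Go, rune is an alias for int32, so a rune can be stored in an []int32 without conversion. A small sketch, with codePoints as a made-up helper name:

```go
package main

import "fmt"

// codePoints (hypothetical helper) returns the Unicode code points of s
// as plain integers.
func codePoints(s string) []int32 {
	var out []int32
	for _, r := range s { // r has type rune, an alias for int32
		out = append(out, r)
	}
	return out
}

func main() {
	fmt.Println(codePoints("\u00e9"))  // one code point: 1 character, 1 rune
	fmt.Println(codePoints("e\u0301")) // two code points rendered as one character
}
```

Ranging over a string yields code points, not characters, which is why the decomposed form produces two values.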

Regarding "go - Can a character span multiple runes in Go?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/36569018/
