gpt4 book ai didi

python - unicode.isdigit() 和 unicode.isnumeric() 的区别

转载 作者:IT老高 更新时间:2023-10-28 20:45:39 25 4
gpt4 key购买 nike

方法 unicode.isdigit() 和 unicode.isnumeric() 有什么区别?

最佳答案

Python 3 documentation比 Python 2 文档更清晰:

str.isdigit()
[...] Digits include decimal characters and digits that need special handling, such as the compatibility superscript digits. Formally, a digit is a character that has the property value Numeric_Type=Digit or Numeric_Type=Decimal.

str.isnumeric()
Numeric characters include digit characters, and all characters that have the Unicode numeric value property, e.g. U+2155, VULGAR FRACTION ONE FIFTH. Formally, numeric characters are those with the property value Numeric_Type=Digit, Numeric_Type=Decimal or Numeric_Type=Numeric.

所以 isnumeric() 额外测试 Numeric_Type=Numeric。引用 official numeric type definitions 的历史性提案:

Numeric_Type=Decimal
Characters used in a positional decimal systems, which standard base-10 radix systems with contiguous digits 0..9, and are most-significant-digit first (backingstore order). These are coextensive by definition with General_Category=Decimal_Number.

Numeric_Type=Digit Variants of positional decimal characters (Numeric_Type=Decimal) or sequences thereof. These include super/subscripts, enclosed, or decorated by the addition of characters such as parentheses, dots, or commas.

Numeric_Type=Numeric Characters with numeric value, but that are neither Decimal nor Digit.

所以任何数字字符,但不是十进制或其变体。想想分数、罗马数字、组合数字的字形,以及任何不基于十进制的编号系统。

这包括:

>>> import unicodedata
>>> for codepoint in range(2**16):
... chr = unichr(codepoint)
... if chr.isnumeric() and not chr.isdigit():
... print u'{:04x}: {} ({})'.format(codepoint, chr, unicodedata.name(chr, 'UNNAMED'))
...
00bc: ¼ (VULGAR FRACTION ONE QUARTER)
00bd: ½ (VULGAR FRACTION ONE HALF)
00be: ¾ (VULGAR FRACTION THREE QUARTERS)
09f4: ৴ (BENGALI CURRENCY NUMERATOR ONE)
09f5: ৵ (BENGALI CURRENCY NUMERATOR TWO)
09f6: ৶ (BENGALI CURRENCY NUMERATOR THREE)
09f7: ৷ (BENGALI CURRENCY NUMERATOR FOUR)
09f8: ৸ (BENGALI CURRENCY NUMERATOR ONE LESS THAN THE DENOMINATOR)
09f9: ৹ (BENGALI CURRENCY DENOMINATOR SIXTEEN)
0bf0: ௰ (TAMIL NUMBER TEN)
0bf1: ௱ (TAMIL NUMBER ONE HUNDRED)
0bf2: ௲ (TAMIL NUMBER ONE THOUSAND)
0c78: ౸ (TELUGU FRACTION DIGIT ZERO FOR ODD POWERS OF FOUR)
0c79: ౹ (TELUGU FRACTION DIGIT ONE FOR ODD POWERS OF FOUR)
0c7a: ౺ (TELUGU FRACTION DIGIT TWO FOR ODD POWERS OF FOUR)
0c7b: ౻ (TELUGU FRACTION DIGIT THREE FOR ODD POWERS OF FOUR)
0c7c: ౼ (TELUGU FRACTION DIGIT ONE FOR EVEN POWERS OF FOUR)
0c7d: ౽ (TELUGU FRACTION DIGIT TWO FOR EVEN POWERS OF FOUR)
0c7e: ౾ (TELUGU FRACTION DIGIT THREE FOR EVEN POWERS OF FOUR)
0d70: ൰ (MALAYALAM NUMBER TEN)
0d71: ൱ (MALAYALAM NUMBER ONE HUNDRED)
0d72: ൲ (MALAYALAM NUMBER ONE THOUSAND)
0d73: ൳ (MALAYALAM FRACTION ONE QUARTER)
0d74: ൴ (MALAYALAM FRACTION ONE HALF)
0d75: ൵ (MALAYALAM FRACTION THREE QUARTERS)
0f2a: ༪ (TIBETAN DIGIT HALF ONE)
0f2b: ༫ (TIBETAN DIGIT HALF TWO)
0f2c: ༬ (TIBETAN DIGIT HALF THREE)
0f2d: ༭ (TIBETAN DIGIT HALF FOUR)
0f2e: ༮ (TIBETAN DIGIT HALF FIVE)
0f2f: ༯ (TIBETAN DIGIT HALF SIX)
0f30: ༰ (TIBETAN DIGIT HALF SEVEN)
0f31: ༱ (TIBETAN DIGIT HALF EIGHT)
0f32: ༲ (TIBETAN DIGIT HALF NINE)
0f33: ༳ (TIBETAN DIGIT HALF ZERO)
1372: ፲ (ETHIOPIC NUMBER TEN)
1373: ፳ (ETHIOPIC NUMBER TWENTY)
1374: ፴ (ETHIOPIC NUMBER THIRTY)
1375: ፵ (ETHIOPIC NUMBER FORTY)
1376: ፶ (ETHIOPIC NUMBER FIFTY)
1377: ፷ (ETHIOPIC NUMBER SIXTY)
1378: ፸ (ETHIOPIC NUMBER SEVENTY)
1379: ፹ (ETHIOPIC NUMBER EIGHTY)
137a: ፺ (ETHIOPIC NUMBER NINETY)
137b: ፻ (ETHIOPIC NUMBER HUNDRED)
137c: ፼ (ETHIOPIC NUMBER TEN THOUSAND)
16ee: ᛮ (RUNIC ARLAUG SYMBOL)
16ef: ᛯ (RUNIC TVIMADUR SYMBOL)
16f0: ᛰ (RUNIC BELGTHOR SYMBOL)
17f0: ៰ (KHMER SYMBOL LEK ATTAK SON)
17f1: ៱ (KHMER SYMBOL LEK ATTAK MUOY)
17f2: ៲ (KHMER SYMBOL LEK ATTAK PII)
17f3: ៳ (KHMER SYMBOL LEK ATTAK BEI)
17f4: ៴ (KHMER SYMBOL LEK ATTAK BUON)
17f5: ៵ (KHMER SYMBOL LEK ATTAK PRAM)
17f6: ៶ (KHMER SYMBOL LEK ATTAK PRAM-MUOY)
17f7: ៷ (KHMER SYMBOL LEK ATTAK PRAM-PII)
17f8: ៸ (KHMER SYMBOL LEK ATTAK PRAM-BEI)
17f9: ៹ (KHMER SYMBOL LEK ATTAK PRAM-BUON)
2150: ⅐ (VULGAR FRACTION ONE SEVENTH)
2151: ⅑ (VULGAR FRACTION ONE NINTH)
2152: ⅒ (VULGAR FRACTION ONE TENTH)
2153: ⅓ (VULGAR FRACTION ONE THIRD)
2154: ⅔ (VULGAR FRACTION TWO THIRDS)
2155: ⅕ (VULGAR FRACTION ONE FIFTH)
2156: ⅖ (VULGAR FRACTION TWO FIFTHS)
2157: ⅗ (VULGAR FRACTION THREE FIFTHS)
2158: ⅘ (VULGAR FRACTION FOUR FIFTHS)
2159: ⅙ (VULGAR FRACTION ONE SIXTH)
215a: ⅚ (VULGAR FRACTION FIVE SIXTHS)
215b: ⅛ (VULGAR FRACTION ONE EIGHTH)
215c: ⅜ (VULGAR FRACTION THREE EIGHTHS)
215d: ⅝ (VULGAR FRACTION FIVE EIGHTHS)
215e: ⅞ (VULGAR FRACTION SEVEN EIGHTHS)
215f: ⅟ (FRACTION NUMERATOR ONE)
2160: Ⅰ (ROMAN NUMERAL ONE)
2161: Ⅱ (ROMAN NUMERAL TWO)
2162: Ⅲ (ROMAN NUMERAL THREE)
2163: Ⅳ (ROMAN NUMERAL FOUR)
2164: Ⅴ (ROMAN NUMERAL FIVE)
2165: Ⅵ (ROMAN NUMERAL SIX)
2166: Ⅶ (ROMAN NUMERAL SEVEN)
2167: Ⅷ (ROMAN NUMERAL EIGHT)
2168: Ⅸ (ROMAN NUMERAL NINE)
2169: Ⅹ (ROMAN NUMERAL TEN)
216a: Ⅺ (ROMAN NUMERAL ELEVEN)
216b: Ⅻ (ROMAN NUMERAL TWELVE)
216c: Ⅼ (ROMAN NUMERAL FIFTY)
216d: Ⅽ (ROMAN NUMERAL ONE HUNDRED)
216e: Ⅾ (ROMAN NUMERAL FIVE HUNDRED)
216f: Ⅿ (ROMAN NUMERAL ONE THOUSAND)
2170: ⅰ (SMALL ROMAN NUMERAL ONE)
2171: ⅱ (SMALL ROMAN NUMERAL TWO)
2172: ⅲ (SMALL ROMAN NUMERAL THREE)
2173: ⅳ (SMALL ROMAN NUMERAL FOUR)
2174: ⅴ (SMALL ROMAN NUMERAL FIVE)
2175: ⅵ (SMALL ROMAN NUMERAL SIX)
2176: ⅶ (SMALL ROMAN NUMERAL SEVEN)
2177: ⅷ (SMALL ROMAN NUMERAL EIGHT)
2178: ⅸ (SMALL ROMAN NUMERAL NINE)
2179: ⅹ (SMALL ROMAN NUMERAL TEN)
217a: ⅺ (SMALL ROMAN NUMERAL ELEVEN)
217b: ⅻ (SMALL ROMAN NUMERAL TWELVE)
217c: ⅼ (SMALL ROMAN NUMERAL FIFTY)
217d: ⅽ (SMALL ROMAN NUMERAL ONE HUNDRED)
217e: ⅾ (SMALL ROMAN NUMERAL FIVE HUNDRED)
217f: ⅿ (SMALL ROMAN NUMERAL ONE THOUSAND)
2180: ↀ (ROMAN NUMERAL ONE THOUSAND C D)
2181: ↁ (ROMAN NUMERAL FIVE THOUSAND)
2182: ↂ (ROMAN NUMERAL TEN THOUSAND)
2185: ↅ (ROMAN NUMERAL SIX LATE FORM)
2186: ↆ (ROMAN NUMERAL FIFTY EARLY FORM)
2187: ↇ (ROMAN NUMERAL FIFTY THOUSAND)
2188: ↈ (ROMAN NUMERAL ONE HUNDRED THOUSAND)
2189: ↉ (VULGAR FRACTION ZERO THIRDS)
2469: ⑩ (CIRCLED NUMBER TEN)
246a: ⑪ (CIRCLED NUMBER ELEVEN)
246b: ⑫ (CIRCLED NUMBER TWELVE)
246c: ⑬ (CIRCLED NUMBER THIRTEEN)
246d: ⑭ (CIRCLED NUMBER FOURTEEN)
246e: ⑮ (CIRCLED NUMBER FIFTEEN)
246f: ⑯ (CIRCLED NUMBER SIXTEEN)
2470: ⑰ (CIRCLED NUMBER SEVENTEEN)
2471: ⑱ (CIRCLED NUMBER EIGHTEEN)
2472: ⑲ (CIRCLED NUMBER NINETEEN)
2473: ⑳ (CIRCLED NUMBER TWENTY)
247d: ⑽ (PARENTHESIZED NUMBER TEN)
247e: ⑾ (PARENTHESIZED NUMBER ELEVEN)
247f: ⑿ (PARENTHESIZED NUMBER TWELVE)
2480: ⒀ (PARENTHESIZED NUMBER THIRTEEN)
2481: ⒁ (PARENTHESIZED NUMBER FOURTEEN)
2482: ⒂ (PARENTHESIZED NUMBER FIFTEEN)
2483: ⒃ (PARENTHESIZED NUMBER SIXTEEN)
2484: ⒄ (PARENTHESIZED NUMBER SEVENTEEN)
2485: ⒅ (PARENTHESIZED NUMBER EIGHTEEN)
2486: ⒆ (PARENTHESIZED NUMBER NINETEEN)
2487: ⒇ (PARENTHESIZED NUMBER TWENTY)
2491: ⒑ (NUMBER TEN FULL STOP)
2492: ⒒ (NUMBER ELEVEN FULL STOP)
2493: ⒓ (NUMBER TWELVE FULL STOP)
2494: ⒔ (NUMBER THIRTEEN FULL STOP)
2495: ⒕ (NUMBER FOURTEEN FULL STOP)
2496: ⒖ (NUMBER FIFTEEN FULL STOP)
2497: ⒗ (NUMBER SIXTEEN FULL STOP)
2498: ⒘ (NUMBER SEVENTEEN FULL STOP)
2499: ⒙ (NUMBER EIGHTEEN FULL STOP)
249a: ⒚ (NUMBER NINETEEN FULL STOP)
249b: ⒛ (NUMBER TWENTY FULL STOP)
24eb: ⓫ (NEGATIVE CIRCLED NUMBER ELEVEN)
24ec: ⓬ (NEGATIVE CIRCLED NUMBER TWELVE)
24ed: ⓭ (NEGATIVE CIRCLED NUMBER THIRTEEN)
24ee: ⓮ (NEGATIVE CIRCLED NUMBER FOURTEEN)
24ef: ⓯ (NEGATIVE CIRCLED NUMBER FIFTEEN)
24f0: ⓰ (NEGATIVE CIRCLED NUMBER SIXTEEN)
24f1: ⓱ (NEGATIVE CIRCLED NUMBER SEVENTEEN)
24f2: ⓲ (NEGATIVE CIRCLED NUMBER EIGHTEEN)
24f3: ⓳ (NEGATIVE CIRCLED NUMBER NINETEEN)
24f4: ⓴ (NEGATIVE CIRCLED NUMBER TWENTY)
24fe: ⓾ (DOUBLE CIRCLED NUMBER TEN)
277f: ❿ (DINGBAT NEGATIVE CIRCLED NUMBER TEN)
2789: ➉ (DINGBAT CIRCLED SANS-SERIF NUMBER TEN)
2793: ➓ (DINGBAT NEGATIVE CIRCLED SANS-SERIF NUMBER TEN)
2cfd: ⳽ (COPTIC FRACTION ONE HALF)
3007: 〇 (IDEOGRAPHIC NUMBER ZERO)
3021: 〡 (HANGZHOU NUMERAL ONE)
3022: 〢 (HANGZHOU NUMERAL TWO)
3023: 〣 (HANGZHOU NUMERAL THREE)
3024: 〤 (HANGZHOU NUMERAL FOUR)
3025: 〥 (HANGZHOU NUMERAL FIVE)
3026: 〦 (HANGZHOU NUMERAL SIX)
3027: 〧 (HANGZHOU NUMERAL SEVEN)
3028: 〨 (HANGZHOU NUMERAL EIGHT)
3029: 〩 (HANGZHOU NUMERAL NINE)
3038: 〸 (HANGZHOU NUMERAL TEN)
3039: 〹 (HANGZHOU NUMERAL TWENTY)
303a: 〺 (HANGZHOU NUMERAL THIRTY)
3192: ㆒ (IDEOGRAPHIC ANNOTATION ONE MARK)
3193: ㆓ (IDEOGRAPHIC ANNOTATION TWO MARK)
3194: ㆔ (IDEOGRAPHIC ANNOTATION THREE MARK)
3195: ㆕ (IDEOGRAPHIC ANNOTATION FOUR MARK)
3220: ㈠ (PARENTHESIZED IDEOGRAPH ONE)
3221: ㈡ (PARENTHESIZED IDEOGRAPH TWO)
3222: ㈢ (PARENTHESIZED IDEOGRAPH THREE)
3223: ㈣ (PARENTHESIZED IDEOGRAPH FOUR)
3224: ㈤ (PARENTHESIZED IDEOGRAPH FIVE)
3225: ㈥ (PARENTHESIZED IDEOGRAPH SIX)
3226: ㈦ (PARENTHESIZED IDEOGRAPH SEVEN)
3227: ㈧ (PARENTHESIZED IDEOGRAPH EIGHT)
3228: ㈨ (PARENTHESIZED IDEOGRAPH NINE)
3229: ㈩ (PARENTHESIZED IDEOGRAPH TEN)
3251: ㉑ (CIRCLED NUMBER TWENTY ONE)
3252: ㉒ (CIRCLED NUMBER TWENTY TWO)
3253: ㉓ (CIRCLED NUMBER TWENTY THREE)
3254: ㉔ (CIRCLED NUMBER TWENTY FOUR)
3255: ㉕ (CIRCLED NUMBER TWENTY FIVE)
3256: ㉖ (CIRCLED NUMBER TWENTY SIX)
3257: ㉗ (CIRCLED NUMBER TWENTY SEVEN)
3258: ㉘ (CIRCLED NUMBER TWENTY EIGHT)
3259: ㉙ (CIRCLED NUMBER TWENTY NINE)
325a: ㉚ (CIRCLED NUMBER THIRTY)
325b: ㉛ (CIRCLED NUMBER THIRTY ONE)
325c: ㉜ (CIRCLED NUMBER THIRTY TWO)
325d: ㉝ (CIRCLED NUMBER THIRTY THREE)
325e: ㉞ (CIRCLED NUMBER THIRTY FOUR)
325f: ㉟ (CIRCLED NUMBER THIRTY FIVE)
3280: ㊀ (CIRCLED IDEOGRAPH ONE)
3281: ㊁ (CIRCLED IDEOGRAPH TWO)
3282: ㊂ (CIRCLED IDEOGRAPH THREE)
3283: ㊃ (CIRCLED IDEOGRAPH FOUR)
3284: ㊄ (CIRCLED IDEOGRAPH FIVE)
3285: ㊅ (CIRCLED IDEOGRAPH SIX)
3286: ㊆ (CIRCLED IDEOGRAPH SEVEN)
3287: ㊇ (CIRCLED IDEOGRAPH EIGHT)
3288: ㊈ (CIRCLED IDEOGRAPH NINE)
3289: ㊉ (CIRCLED IDEOGRAPH TEN)
32b1: ㊱ (CIRCLED NUMBER THIRTY SIX)
32b2: ㊲ (CIRCLED NUMBER THIRTY SEVEN)
32b3: ㊳ (CIRCLED NUMBER THIRTY EIGHT)
32b4: ㊴ (CIRCLED NUMBER THIRTY NINE)
32b5: ㊵ (CIRCLED NUMBER FORTY)
32b6: ㊶ (CIRCLED NUMBER FORTY ONE)
32b7: ㊷ (CIRCLED NUMBER FORTY TWO)
32b8: ㊸ (CIRCLED NUMBER FORTY THREE)
32b9: ㊹ (CIRCLED NUMBER FORTY FOUR)
32ba: ㊺ (CIRCLED NUMBER FORTY FIVE)
32bb: ㊻ (CIRCLED NUMBER FORTY SIX)
32bc: ㊼ (CIRCLED NUMBER FORTY SEVEN)
32bd: ㊽ (CIRCLED NUMBER FORTY EIGHT)
32be: ㊾ (CIRCLED NUMBER FORTY NINE)
32bf: ㊿ (CIRCLED NUMBER FIFTY)
3405: 㐅 (CJK UNIFIED IDEOGRAPH-3405)
3483: 㒃 (CJK UNIFIED IDEOGRAPH-3483)
382a: 㠪 (CJK UNIFIED IDEOGRAPH-382A)
3b4d: 㭍 (CJK UNIFIED IDEOGRAPH-3B4D)
4e00: 一 (CJK UNIFIED IDEOGRAPH-4E00)
4e03: 七 (CJK UNIFIED IDEOGRAPH-4E03)
4e07: 万 (CJK UNIFIED IDEOGRAPH-4E07)
4e09: 三 (CJK UNIFIED IDEOGRAPH-4E09)
4e5d: 九 (CJK UNIFIED IDEOGRAPH-4E5D)
4e8c: 二 (CJK UNIFIED IDEOGRAPH-4E8C)
4e94: 五 (CJK UNIFIED IDEOGRAPH-4E94)
4e96: 亖 (CJK UNIFIED IDEOGRAPH-4E96)
4ebf: 亿 (CJK UNIFIED IDEOGRAPH-4EBF)
4ec0: 什 (CJK UNIFIED IDEOGRAPH-4EC0)
4edf: 仟 (CJK UNIFIED IDEOGRAPH-4EDF)
4ee8: 仨 (CJK UNIFIED IDEOGRAPH-4EE8)
4f0d: 伍 (CJK UNIFIED IDEOGRAPH-4F0D)
4f70: 佰 (CJK UNIFIED IDEOGRAPH-4F70)
5104: 億 (CJK UNIFIED IDEOGRAPH-5104)
5146: 兆 (CJK UNIFIED IDEOGRAPH-5146)
5169: 兩 (CJK UNIFIED IDEOGRAPH-5169)
516b: 八 (CJK UNIFIED IDEOGRAPH-516B)
516d: 六 (CJK UNIFIED IDEOGRAPH-516D)
5341: 十 (CJK UNIFIED IDEOGRAPH-5341)
5343: 千 (CJK UNIFIED IDEOGRAPH-5343)
5344: 卄 (CJK UNIFIED IDEOGRAPH-5344)
5345: 卅 (CJK UNIFIED IDEOGRAPH-5345)
534c: 卌 (CJK UNIFIED IDEOGRAPH-534C)
53c1: 叁 (CJK UNIFIED IDEOGRAPH-53C1)
53c2: 参 (CJK UNIFIED IDEOGRAPH-53C2)
53c3: 參 (CJK UNIFIED IDEOGRAPH-53C3)
53c4: 叄 (CJK UNIFIED IDEOGRAPH-53C4)
56db: 四 (CJK UNIFIED IDEOGRAPH-56DB)
58f1: 壱 (CJK UNIFIED IDEOGRAPH-58F1)
58f9: 壹 (CJK UNIFIED IDEOGRAPH-58F9)
5e7a: 幺 (CJK UNIFIED IDEOGRAPH-5E7A)
5efe: 廾 (CJK UNIFIED IDEOGRAPH-5EFE)
5eff: 廿 (CJK UNIFIED IDEOGRAPH-5EFF)
5f0c: 弌 (CJK UNIFIED IDEOGRAPH-5F0C)
5f0d: 弍 (CJK UNIFIED IDEOGRAPH-5F0D)
5f0e: 弎 (CJK UNIFIED IDEOGRAPH-5F0E)
5f10: 弐 (CJK UNIFIED IDEOGRAPH-5F10)
62fe: 拾 (CJK UNIFIED IDEOGRAPH-62FE)
634c: 捌 (CJK UNIFIED IDEOGRAPH-634C)
67d2: 柒 (CJK UNIFIED IDEOGRAPH-67D2)
6f06: 漆 (CJK UNIFIED IDEOGRAPH-6F06)
7396: 玖 (CJK UNIFIED IDEOGRAPH-7396)
767e: 百 (CJK UNIFIED IDEOGRAPH-767E)
8086: 肆 (CJK UNIFIED IDEOGRAPH-8086)
842c: 萬 (CJK UNIFIED IDEOGRAPH-842C)
8cae: 貮 (CJK UNIFIED IDEOGRAPH-8CAE)
8cb3: 貳 (CJK UNIFIED IDEOGRAPH-8CB3)
8d30: 贰 (CJK UNIFIED IDEOGRAPH-8D30)
9621: 阡 (CJK UNIFIED IDEOGRAPH-9621)
9646: 陆 (CJK UNIFIED IDEOGRAPH-9646)
964c: 陌 (CJK UNIFIED IDEOGRAPH-964C)
9678: 陸 (CJK UNIFIED IDEOGRAPH-9678)
96f6: 零 (CJK UNIFIED IDEOGRAPH-96F6)
a6e6: ꛦ (BAMUM LETTER MO)
a6e7: ꛧ (BAMUM LETTER MBAA)
a6e8: ꛨ (BAMUM LETTER TET)
a6e9: ꛩ (BAMUM LETTER KPA)
a6ea: ꛪ (BAMUM LETTER TEN)
a6eb: ꛫ (BAMUM LETTER NTUU)
a6ec: ꛬ (BAMUM LETTER SAMBA)
a6ed: ꛭ (BAMUM LETTER FAAMAE)
a6ee: ꛮ (BAMUM LETTER KOVUU)
a6ef: ꛯ (BAMUM LETTER KOGHOM)
a830: ꠰ (NORTH INDIC FRACTION ONE QUARTER)
a831: ꠱ (NORTH INDIC FRACTION ONE HALF)
a832: ꠲ (NORTH INDIC FRACTION THREE QUARTERS)
a833: ꠳ (NORTH INDIC FRACTION ONE SIXTEENTH)
a834: ꠴ (NORTH INDIC FRACTION ONE EIGHTH)
a835: ꠵ (NORTH INDIC FRACTION THREE SIXTEENTHS)
f96b: 參 (CJK COMPATIBILITY IDEOGRAPH-F96B)
f973: 拾 (CJK COMPATIBILITY IDEOGRAPH-F973)
f978: 兩 (CJK COMPATIBILITY IDEOGRAPH-F978)
f9b2: 零 (CJK COMPATIBILITY IDEOGRAPH-F9B2)
f9d1: 六 (CJK COMPATIBILITY IDEOGRAPH-F9D1)
f9d3: 陸 (CJK COMPATIBILITY IDEOGRAPH-F9D3)
f9fd: 什 (CJK COMPATIBILITY IDEOGRAPH-F9FD)

但是,Numeric_Type=Digit 和 Numeric_Type=Numeric 之间的区别不再被认为是有用的,并且 Numeric_Type=Digit 自 Unicode 6.3.0 起不再用于新字符。报价 Unicode Standard Annex #44 :

Starting with Unicode 6.3.0, no newly encoded numeric characters will be given Numeric_Type=Digit, nor will existing characters with Numeric_Type=Numeric be changed to Numeric_Type=Digit. The distinction between those two types is not considered useful.

因此,🄌(DINGBAT NEGATIVE CIRCLED SANS-SERIF DIGIT ZERO)和其他曾经被分配 Numeric_Type=Digit 的字符被分配了 Numeric_Type=Numeric,它们报告 False for isdigit:

>>> '🄌'.isdigit()
False

关于python - unicode.isdigit() 和 unicode.isnumeric() 的区别,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/24384852/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com