
java - Difference between the String trim() and strip() methods in Java 11

Reposted. Author: 搜寻专家. Updated: 2023-11-01 00:54:18

Among other changes, JDK 11 introduces 6 new methods for the java.lang.String class:

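  • repeat(int) - repeats the string as many times as given by the int parameter
  • lines() - uses a Spliterator to lazily provide lines from the source string
  • isBlank() - indicates whether the string is empty or contains only white space characters
  • stripLeading() - removes white space from the beginning
  • stripTrailing() - removes white space from the end
  • strip() - removes white space from both the beginning and the end of the string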
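The six methods listed above can be tried out with a short sketch like the following. It assumes a Java 11 or later runtime; the class name `NewStringMethodsDemo` and the helper `bracket` are made up for illustration and only exist to make white space visible in the output.

```java
import java.util.List;
import java.util.stream.Collectors;

class NewStringMethodsDemo {
    // Hypothetical helper: wrap a string in brackets so that
    // leading and trailing white space is visible when printed.
    static String bracket(String s) {
        return "[" + s + "]";
    }

    public static void main(String[] args) {
        // repeat(int): the string concatenated with itself n times
        System.out.println("ab".repeat(3));                 // ababab

        // lines(): a lazily populated stream, split on \n, \r or \r\n
        List<String> lines = "one\ntwo\r\nthree".lines().collect(Collectors.toList());
        System.out.println(lines);                           // [one, two, three]

        // isBlank(): true for empty or white-space-only strings
        System.out.println("".isBlank());                    // true
        System.out.println(" \t ".isBlank());                // true
        System.out.println(" x ".isBlank());                 // false

        String padded = "  hi  ";
        System.out.println(bracket(padded.stripLeading()));  // [hi  ]
        System.out.println(bracket(padded.stripTrailing())); // [  hi]
        System.out.println(bracket(padded.strip()));         // [hi]
    }
}
```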

In particular, strip() looks very similar to trim(). According to this article, the strip*() methods are designed to:

The String.strip(), String.stripLeading(), and String.stripTrailing() methods trim white space [as determined by Character.isWhiteSpace()] off either the front, back, or both front and back of the targeted String.

The String.trim() JavaDoc states:

/**
* Returns a string whose value is this string, with any leading and trailing
* whitespace removed.
* ...
*/

This is almost identical to the quote above.

Since Java 11, what exactly is the difference between String.trim() and String.strip()?

Best answer

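In short: strip() is the "Unicode-aware" evolution of trim(). That is, trim() removes only characters <= U+0020 (space), while strip() removes all Unicode white space characters as determined by Character.isWhitespace() (but not all control characters, such as \0).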
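A minimal sketch of that difference, assuming Java 11 or later. The class name `TrimVsStripDemo` and the helper `describe` are hypothetical; the sample characters (U+2003 EM SPACE, U+0000, U+00A0 NO-BREAK SPACE) were chosen here for illustration and do not come from the original answer.

```java
class TrimVsStripDemo {
    // Hypothetical helper: report the length left after trim() and after strip().
    static String describe(String s) {
        return "trim=" + s.trim().length() + " strip=" + s.strip().length();
    }

    public static void main(String[] args) {
        // U+2003 EM SPACE: Unicode white space, but its code point is > U+0020,
        // so only strip() removes it.
        System.out.println(describe("\u2003hello\u2003")); // trim=7 strip=5

        // U+0000: <= U+0020, but Character.isWhitespace('\0') is false,
        // so only trim() removes it.
        System.out.println(describe("\0hello\0"));         // trim=5 strip=7

        // U+00A0 NO-BREAK SPACE: > U+0020 and excluded by Character.isWhitespace,
        // so neither method removes it.
        System.out.println(describe("\u00A0hello\u00A0")); // trim=7 strip=7

        // Plain ASCII spaces and tabs are handled identically by both.
        System.out.println(describe(" \thello\t "));       // trim=5 strip=5
    }
}
```

The no-break space case shows that "Unicode-aware" here means "whatever Character.isWhitespace accepts", which is narrower than every character Unicode classifies as a space.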

CSR: JDK-8200378

Problem

String::trim has existed from the early days of Java, when Unicode had not fully evolved to the standard we widely use today.

The definition of space used by String::trim is any code point less than or equal to the space code point (\u0020), commonly referred to as ASCII or ISO control characters.

Unicode-aware trimming routines should use Character::isWhitespace(int).

Additionally, developers have not been able to specifically remove indentation white space or to specifically remove trailing white space.

Solution

Introduce trimming methods that are Unicode white space aware and provide additional control of leading only or trailing only.

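A common characteristic of these new methods is that they use a different (newer) definition of "white space" than older methods such as String.trim(). Bug JDK-8200373: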

The current JavaDoc for String::trim does not make it clear which definition of "space" is being used in the code. With additional trimming methods coming in the near future that use a different definition of space, clarification is imperative. String::trim uses the definition of space as any codepoint that is less than or equal to the space character codepoint (\u0020). Newer trimming methods will use the definition of (white) space as any codepoint that returns true when passed to the Character::isWhitespace predicate.
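The two definitions of "space" quoted above can be written down as predicates. The sketch below is not the JDK implementation: the class `SpaceDefinitions`, the constants `TRIM_SPACE` and `STRIP_SPACE`, and the helper `stripWith` are hypothetical names used only to show that plugging each predicate into the same trimming loop reproduces the behavior of trim() and strip() on the sample inputs.

```java
import java.util.function.IntPredicate;

class SpaceDefinitions {
    // Definition attributed to String::trim: any code point <= U+0020.
    static final IntPredicate TRIM_SPACE = cp -> cp <= ' ';

    // Definition attributed to the newer methods: Character.isWhitespace(int).
    static final IntPredicate STRIP_SPACE = Character::isWhitespace;

    // Hypothetical helper: drop leading and trailing code points
    // for which the given predicate returns true.
    static String stripWith(String s, IntPredicate isSpace) {
        int start = 0;
        int end = s.length();
        while (start < end) {
            int cp = s.codePointAt(start);
            if (!isSpace.test(cp)) break;
            start += Character.charCount(cp);
        }
        while (start < end) {
            int cp = s.codePointBefore(end);
            if (!isSpace.test(cp)) break;
            end -= Character.charCount(cp);
        }
        return s.substring(start, end);
    }

    public static void main(String[] args) {
        String[] samples = { "  a b  ", "\u2003x\u2003", "\0y\0", "", " \t\n" };
        for (String s : samples) {
            // Both comparisons are expected to print true for these samples.
            System.out.println(stripWith(s, TRIM_SPACE).equals(s.trim()));
            System.out.println(stripWith(s, STRIP_SPACE).equals(s.strip()));
        }
    }
}
```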

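The isWhitespace(char) method was added to Character in JDK 1.1, but the isWhitespace(int) method was not introduced to the Character class until JDK 1.5. The latter method (the one accepting an int parameter) was added to support supplementary characters. The Javadoc comments for the Character class define supplementary characters (typically modeled with an int-based "code point") versus BMP characters (typically modeled with a single char):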

The set of characters from U+0000 to U+FFFF is sometimes referred to as the Basic Multilingual Plane (BMP). Characters whose code points are greater than U+FFFF are called supplementary characters. The Java platform uses the UTF-16 representation in char arrays and in the String and StringBuffer classes. In this representation, supplementary characters are represented as a pair of char values ... A char value, therefore, represents Basic Multilingual Plane (BMP) code points, including the surrogate code points, or code units of the UTF-16 encoding. An int value represents all Unicode code points, including supplementary code points. ... The methods that only accept a char value cannot support supplementary characters. ... The methods that accept an int value support all Unicode characters, including supplementary characters.
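The char versus int distinction in that Javadoc excerpt can be seen with a supplementary character. The sketch below uses U+10400 (DESERET CAPITAL LETTER LONG I) and Character.isLetter purely as an illustration; the class name `SupplementaryCharDemo` and the helper `fromCodePoint` are hypothetical.

```java
class SupplementaryCharDemo {
    // Hypothetical helper: build a String holding a single code point.
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        // U+10400 lies above U+FFFF, so it is a supplementary character.
        String s = fromCodePoint(0x10400);

        // One code point, but two char values (a UTF-16 surrogate pair).
        System.out.println(s.length());                              // 2
        System.out.println(s.codePointCount(0, s.length()));         // 1

        int cp = s.codePointAt(0);
        System.out.println(Character.isSupplementaryCodePoint(cp));  // true

        // The char overload only sees the high surrogate, which is not a letter.
        System.out.println(Character.isLetter(s.charAt(0)));         // false
        // The int overload sees the whole code point.
        System.out.println(Character.isLetter(cp));                  // true
    }
}
```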

OpenJDK Changeset.


Benchmark comparison between trim() and strip(): Why is String.strip() 5 times faster than String.trim() for blank string In Java 11

Regarding "java - Difference between the String trim() and strip() methods in Java 11", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/51345212/
