
java - How to convert words to numbers in Java using icu4j

Reposted. Author: 行者123. Updated: 2023-12-02 05:10:51

I want to parse words into a number, and get an error whenever the string as a whole does not express a valid number, for example:

"Twenty two" => 22
"One hundred forty four" => 144
"Twenty bla bla" => error
"One hundred forty thousand one" => error

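I tried com.ibm.icu.text.RuleBasedNumberFormat, but its parse() method only parses the beginning of the string rather than the whole string. The javadoc says as much:
"Parses text from the beginning of the given string to produce a number. The method might not use the entire text of the given string."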

The javadoc also mentions that lenient parsing can be changed by using a special rule set together with a RuleBasedCollator, but I am struggling to get that working.

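import com.ibm.icu.text.RuleBasedNumberFormat;
import java.text.ParseException;
import java.util.Locale;

public class NumFormatter {
    public static int numberFromString(String number, Locale locale) {
        RuleBasedNumberFormat numberFormat = new RuleBasedNumberFormat(locale, RuleBasedNumberFormat.SPELLOUT);

        try {
            // parse(String) throws only when the beginning of the string cannot be parsed
            return numberFormat.parse(number).intValue();
        } catch (ParseException e) {
            return -1;
        }
    }
}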

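// JUnit 4 imports assumed; the original post does not say which JUnit version is used
import static org.junit.Assert.assertEquals;

import java.util.Locale;
import org.junit.Test;

public class NumFormatterTest {
    @Test
    public void formatNumber_fromString() {
        Locale locale = new Locale("en");
        assertEquals(22, NumFormatter.numberFromString("twenty two", locale));
        assertEquals(-1, NumFormatter.numberFromString("three blablabla ", locale)); // Not OK: it returns 3, not -1.
    }
}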

pom.xml
<dependency>
    <groupId>com.ibm.icu</groupId>
    <artifactId>icu4j</artifactId>
    <version>60.2</version>
</dependency>

Has anyone dealt with this before? Thanks in advance.

Best answer

The documentation says:
To see how these rules actually work in practice, consider the following example: Formatting 25,340 with this rule set would work like this:

<< thousand >> [the rule whose base value is 1,000 is applicable to 25,340]
twenty->> thousand >> [25,340 over 1,000 is 25. The rule for 20 applies.]
twenty-five thousand >> [25 mod 10 is 5. The rule for 5 is "five."]
twenty-five thousand << hundred >> [25,340 mod 1,000 is 340. The rule for 100 applies.]
twenty-five thousand three hundred >> [340 over 100 is 3. The rule for 3 is "three."]
twenty-five thousand three hundred forty [340 mod 100 is 40. The rule for 40 applies. Since 40 divides evenly by 10, the hyphen and substitution in the brackets are omitted.]
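
The walk-through above can be mirrored by a small hand-written recursion. This is only a simplified sketch of the idea, not ICU's rule engine: the class name SpelloutSketch, the method spell, and the supported range (0 to 999,999) are assumptions made for illustration. It shows where the hyphen in output such as "twenty-five" comes from.

```java
class SpelloutSketch {
    private static final String[] ONES = {
        "zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
        "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen",
        "seventeen", "eighteen", "nineteen"
    };
    private static final String[] TENS = {
        "", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"
    };

    /** Spells out n by picking the rule with the largest applicable base value. */
    static String spell(int n) {
        if (n < 0 || n > 999_999) {
            throw new IllegalArgumentException("supported range is 0..999999: " + n);
        }
        if (n < 20) {
            return ONES[n];
        }
        if (n < 100) {
            // Tens rule: the hyphen and the remainder are omitted when n divides evenly by 10.
            String tens = TENS[n / 10];
            return n % 10 == 0 ? tens : tens + "-" + spell(n % 10);
        }
        if (n < 1000) {
            // "<< hundred >>": quotient before the word, remainder after it.
            String head = spell(n / 100) + " hundred";
            return n % 100 == 0 ? head : head + " " + spell(n % 100);
        }
        // "<< thousand >>": same pattern with base value 1,000.
        String head = spell(n / 1000) + " thousand";
        return n % 1000 == 0 ? head : head + " " + spell(n % 1000);
    }
}
```

Calling SpelloutSketch.spell(25340) follows the same steps as the quoted example: 25,340 over 1,000 gives 25, the remainder 340 goes through the hundred rule, and the final 40 drops the hyphen because it divides evenly by 10.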

import com.ibm.icu.text.RuleBasedNumberFormat;
import java.text.ParseException;
import java.util.Locale;

public class NumberFormat {

    public static void main(String[] args) {
        Locale locale = new Locale("en");
        // Note the hyphen between the tens and the units
        int parsed = numberFromString("twenty-two", locale);
        System.out.println(parsed);
    }

    public static int numberFromString(String number, Locale locale) {
        RuleBasedNumberFormat numberFormat = new RuleBasedNumberFormat(locale, RuleBasedNumberFormat.SPELLOUT);

        try {
            return numberFormat.parse(number).intValue();
        } catch (ParseException e) {
            return -1;
        }
    }
}

You need to replace the space with a hyphen (-), as in "twenty-two".
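
The question also asks for an error when only part of the string is a number. One common way to get that, sketched here under assumptions, is the parse(String, ParsePosition) overload: after parsing, compare the position index with the length of the input. The class StrictParse and the method parseWhole are hypothetical names. To stay free of dependencies, the demo uses the JDK's java.text.NumberFormat, which parses digits rather than words. ICU's RuleBasedNumberFormat offers a parse overload that takes the same java.text.ParsePosition, so passing numberFormat::parse should work the same way, but that wiring is not tested here.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;
import java.util.OptionalInt;
import java.util.function.BiFunction;

class StrictParse {
    /**
     * Runs the given parser on the trimmed text and accepts the result only
     * when the parser consumed every character of it.
     */
    static OptionalInt parseWhole(String text, BiFunction<String, ParsePosition, Number> parser) {
        String trimmed = text.trim();
        ParsePosition pos = new ParsePosition(0);
        Number value = parser.apply(trimmed, pos);
        boolean consumedAll = !trimmed.isEmpty() && pos.getIndex() == trimmed.length();
        if (value == null || !consumedAll) {
            return OptionalInt.empty(); // nothing parsed, or trailing text left over
        }
        return OptionalInt.of(value.intValue());
    }

    public static void main(String[] args) {
        // Stand-in parser: JDK integer format. With icu4j on the classpath, the assumption is
        // that new RuleBasedNumberFormat(locale, RuleBasedNumberFormat.SPELLOUT)::parse fits here.
        NumberFormat digits = NumberFormat.getIntegerInstance(Locale.ENGLISH);
        System.out.println(parseWhole("144", digits::parse));
        System.out.println(parseWhole("3 blablabla", digits::parse));
    }
}
```

Returning OptionalInt instead of -1 avoids confusing a failed parse with a legitimately parsed negative number.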

Regarding "java - How to convert words to numbers in Java using icu4j", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/56326959/
