java - Convert a string containing Turkish characters to lowercase

Reposted. Author: 行者123. Updated: 2023-11-29 09:43:20

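I want to convert a string containing Turkish characters to lowercase and map the Turkish characters to their English equivalents, i.e. "İĞŞÇ" -> "igsc".

When I use toLowerCase(new Locale("en", "US")), it converts, for example, İ to an i that still carries a dot (i followed by the combining dot above, U+0307).

How can I solve this? (I am using Java 7.)

Thanks.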

Best answer

You can

1) First, remove the accent marks:

The following is quoted from this question:

Is there a way to get rid of accents and convert a whole string to regular letters? :

Use java.text.Normalizer to handle this for you.

string = Normalizer.normalize(string, Normalizer.Form.NFD);

This will separate all of the accent marks from the characters. Then, you just need to compare each character against being a letter and throw out the ones that aren't.

string = string.replaceAll("[^\\p{ASCII}]", "");

If your text is in Unicode, you should use this instead:

string = string.replaceAll("\\p{M}", "");

For Unicode, \P{M} (uppercase) matches the base glyph and \p{M} (lowercase) matches each accent.
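To make the difference between the two replaceAll patterns concrete, here is a minimal sketch. The class and method names (AccentStripDemo, stripToAscii, stripMarks) are made up for illustration and do not appear in the quoted answer. Both methods decompose with NFD first. One then drops everything outside ASCII, the other drops only combining marks.

```java
import java.text.Normalizer;

// Hypothetical demo class; the names are illustrative, not from the quoted answer.
final class AccentStripDemo {

    // NFD-decompose, then delete every non-ASCII char.
    // Letters that have no decomposition (e.g. dotless ı, U+0131) are deleted entirely.
    static String stripToAscii(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("[^\\p{ASCII}]", "");
    }

    // NFD-decompose, then delete only combining marks (\p{M}).
    // Letters that have no decomposition are kept unchanged.
    static String stripMarks(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "");
    }

    public static void main(String[] args) {
        String upper = "\u0130\u011E\u015E\u00C7"; // İĞŞÇ
        System.out.println(stripToAscii(upper));
        System.out.println(stripMarks(upper));
        System.out.println(stripToAscii("\u0131").isEmpty()); // dotless ı is dropped
        System.out.println(stripMarks("\u0131").equals("\u0131")); // dotless ı is kept
    }
}
```

For the question's input both variants give the same base letters. They differ only for characters such as the dotless ı, which NFD does not decompose.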

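2) Then convert the remaining String to lowercase. Passing an explicit locale such as java.util.Locale.ROOT keeps the result independent of the JVM's default locale (under a Turkish default locale, I would lowercase to the dotless ı):

string = string.toLowerCase(Locale.ROOT);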
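The two steps can be combined into one helper. This is a sketch under stated assumptions, not the answer author's code. The class and method names (TurkishFolding, toAsciiLower) are hypothetical. The explicit mapping of the dotless ı (U+0131) to i is an added assumption about the desired output, since NFD does not decompose that letter. Locale.ROOT is used so the lowercase step does not depend on the default locale.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper; the name and the ı -> i mapping are assumptions for illustration.
final class TurkishFolding {

    static String toAsciiLower(String input) {
        // Step 1: split accented letters into base letter + combining mark(s).
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // Step 2: drop the combining marks, keeping the base letters.
        String stripped = decomposed.replaceAll("\\p{M}", "");
        // Assumption: dotless ı (U+0131) should become a plain i; NFD leaves it untouched.
        String mapped = stripped.replace('\u0131', 'i');
        // Step 3: lowercase with a fixed locale so "I" never turns into dotless ı.
        return mapped.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(toAsciiLower("\u0130\u011E\u015E\u00C7")); // input is İĞŞÇ
    }
}
```

Stripping the marks before lowercasing matters here: once İ has been reduced to a plain I, lowercasing can no longer produce the stray combining dot described in the question.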

Regarding "java - Convert a string containing Turkish characters to lowercase", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/35597603/
