
php - Multibyte-safe counting of distinct characters in a string

Repost. Author: 行者123. Last updated: 2023-12-01 14:02:24

I want to find a smart and efficient way to count how many distinct alphabetic characters a string contains. Examples:

$str = "APPLE";
echo char_count($str); // should print 4, because APPLE has 4 distinct chars: 'A', 'P', 'L' and 'E'

$str = "BOB AND BOB"; // should return 5 ('B', 'O', 'A', 'N', 'D').

$str = 'PLÁTANO'; // should return 7 ('P', 'L', 'Á', 'T', 'A', 'N', 'O')

It should support UTF-8 strings!

Best answer

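If you are dealing with UTF-8 (which you really should consider, IMHO), none of the posted solutions (using strlen, str_split or count_chars) will work, because they all treat one byte as one character, which is not true for UTF-8: any non-ASCII character is encoded as two to four bytes.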
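To make the byte-versus-character difference concrete, here is a minimal sketch using the question's 'PLÁTANO' example. The variable names are illustrative only, and the "\u{C1}" escape (which assumes PHP 7.0 or later) is used so that 'Á' is guaranteed to be the precomposed code point U+00C1 regardless of how the file is saved.

```php
<?php

// 'PLÁTANO': 7 characters, but 'Á' (U+00C1) occupies 2 bytes in UTF-8
$word = "PL\u{C1}TANO";

// Byte-oriented view: strlen() and str_split() operate on bytes
$byte_length = strlen($word);           // 8
$byte_pieces = count(str_split($word)); // 8, 'Á' is cut into two invalid fragments

// Character-oriented view: with the /u modifier PCRE splits between code points
$characters  = preg_split('//u', $word, -1, PREG_SPLIT_NO_EMPTY);
$char_length = count($characters);      // 7

echo "bytes: $byte_length, characters: $char_length\n";
```

Any distinct-character count built on the byte-oriented functions would therefore treat the two halves of 'Á' as two separate "characters".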

<?php

$treat_spaces_as_chars = true;
// contains hälöwrd and a space, being 8 distinct characters (7 without the space)
$string = "hällö wörld";
// remove whitespace if we don't want to count it
if (!$treat_spaces_as_chars) {
    $string = preg_replace('/\s+/u', '', $string);
}
// split into characters (not bytes, like str_split() would)
$characters = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
// throw out the duplicates
$unique_characters = array_unique($characters);
// count what's left
$number_of_characters = count($unique_characters);
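The steps above could be wrapped into the char_count() function the question asks for. This is only a sketch: the function signature, the default of not counting whitespace (chosen so that "BOB AND BOB" yields 5 as in the question), and the UTF-8 validity check are assumptions, not part of the original answer.

```php
<?php

/**
 * Hypothetical helper: counts the distinct characters of a UTF-8 string.
 * Whitespace is ignored unless $treat_spaces_as_chars is true.
 */
function char_count(string $string, bool $treat_spaces_as_chars = false): int
{
    // preg_match() with /u returns false when the subject is not valid UTF-8
    if (preg_match('//u', $string) !== 1) {
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }
    if (!$treat_spaces_as_chars) {
        $string = preg_replace('/\s+/u', '', $string);
    }
    // split into characters, drop duplicates, count what is left
    $characters = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
    return count(array_unique($characters));
}

echo char_count('APPLE'), "\n";         // 4
echo char_count('BOB AND BOB'), "\n";   // 5
echo char_count("PL\u{C1}TANO"), "\n";  // 7 ("\u{C1}" is 'Á', PHP 7.0+)
```

Note that the comparison is case-sensitive and works on code points, so 'a' and 'A' count as two characters, and a letter written as base character plus combining accent counts as two as well.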

If you want to discard all non-word characters:
<?php

$ignore_non_word_characters = true;
// contains hälöwrd and π (pi), which is treated as a word character (Greek letter)
$string = "h,ä*+l•π‘°’lö wörld";
// remove non-word characters (punctuation, symbols, whitespace) if we don't want to count them
if ($ignore_non_word_characters) {
    $string = preg_replace('/\W+/u', '', $string);
}
// split into characters (not bytes, like str_split() would)
$characters = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
// throw out the duplicates
$unique_characters = array_unique($characters);
// count what's left
$number_of_characters = count($unique_characters);

var_dump($characters, $unique_characters, $number_of_characters);
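One caveat with \W: word characters also include digits and the underscore, so those would still be counted. Since the question asks about alphabetic characters, a variant that keeps only letters might look like the sketch below. It swaps in the Unicode property \p{L} and preg_match_all() in place of the split-and-filter approach; the function name count_distinct_letters() is hypothetical.

```php
<?php

/**
 * Hypothetical helper: counts the distinct Unicode letters in a UTF-8 string,
 * ignoring digits, underscores, punctuation and whitespace.
 */
function count_distinct_letters(string $string): int
{
    // \p{L} matches any Unicode letter; /u switches PCRE to UTF-8 mode.
    // preg_match_all() returns false on failure, e.g. invalid UTF-8 input.
    if (preg_match_all('/\p{L}/u', $string, $matches) === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }
    // $matches[0] holds every matched letter, one array element per character
    return count(array_unique($matches[0]));
}

echo count_distinct_letters('BOB_AND_BOB 2024'), "\n"; // 5 (B, O, A, N, D)
```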

Regarding php - multibyte-safe counting of distinct characters in a string, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/7730059/
