
c - Is the strcasecmp algorithm flawed?

Reposted. Author: 行者123. Updated: 2023-12-03 07:27:52

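I was trying to reimplement the strcasecmp function in C, and I noticed what looks like an inconsistency in how it compares characters.

From man strcmp: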

The strcmp() function compares the two strings s1 and s2. The locale is not taken into account (for a locale-aware comparison, see strcoll(3)). It returns an integer less than, equal to, or greater than zero if s1 is found, respectively, to be less than, to match, or be greater than s2.



From man strcasecmp:

The strcasecmp() function performs a byte-by-byte comparison of the strings s1 and s2, ignoring the case of the characters. It returns an integer less than, equal to, or greater than zero if s1 is found, respectively, to be less than, to match, or be greater than s2.


int strcmp(const char *s1, const char *s2);
int strcasecmp(const char *s1, const char *s2);

Given this information, I don't understand the result of the following code:

#include <stdio.h>
#include <string.h>
#include <strings.h>  /* POSIX declares strcasecmp() here */

int main(void)
{
    // ASCII values
    // 'A' = 65
    // '_' = 95
    // 'a' = 97

    printf("%i\n", strcmp("A", "_"));
    printf("%i\n", strcmp("a", "_"));
    printf("%i\n", strcasecmp("A", "_"));
    printf("%i\n", strcasecmp("a", "_"));
    return 0;
}


Output:

-1  # "A" is less than "_"
1   # "a" is more than "_"
2   # "A" is more than "_" with strcasecmp ???
2   # "a" is more than "_" with strcasecmp

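It appears that if the current character in s1 is a letter, it is always converted to lowercase, regardless of whether the current character in s2 is a letter or not.

Can someone explain this behaviour? Shouldn't the first and third lines be identical?

Thank you in advance!

PS:
I am using gcc 9.2.0 on Manjaro.
Also, when I compile with the -fno-builtin flag, I get this instead: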

-30
2
2
2

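I guess that is because the program is no longer using gcc's optimized built-in functions, but the question remains.

Best answer

The behavior is correct.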

Per the POSIX str[n]casecmp() specification:

When the LC_CTYPE category of the locale being used is from the POSIX locale, these functions shall behave as if the strings had been converted to lowercase and then a byte comparison performed. Otherwise, the results are unspecified.



That is also part of the NOTES section of the Linux man page:

The POSIX.1-2008 standard says of these functions:

When the LC_CTYPE category of the locale being used is from the POSIX locale, these functions shall behave as if the strings had been converted to lowercase and then a byte comparison performed. Otherwise, the results are unspecified.



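Why?

As @HansOlsson pointed out in his answer, doing case-insensitive comparisons only between letters, and letting all other comparisons keep their "natural" results as in strcmp(), would break sorting.

If 'A' == 'a' (the definition of a case-insensitive comparison), then '_' > 'A' and '_' < 'a' (the "natural" results in the ASCII character set) cannot both be true.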

Regarding "c - Is the strcasecmp algorithm flawed?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/60342445/
