I'm new to the C language. The following code snippet is from a book I've been reading.
    struct S {
        int i; double d; char c;
    };

    int main(void) {
        unsigned char bad_buff[sizeof(struct S)];
        _Alignas(struct S) unsigned char good_buff[sizeof(struct S)];
        struct S *bad_s_ptr = (struct S *)bad_buff;   // wrong pointer alignment
        struct S *good_s_ptr = (struct S *)good_buff; // correct pointer alignment
    }
The author states that bad_buff may have incorrect alignment for member-access expressions. I have difficulty understanding this statement. What can go wrong if we use bad_buff?
Is this really from a book? Both methods are bad and incorrect.
It appears to be from Effective C: An Introduction to Professional C Programming - Robert C. Seacord. Not a word about strict pointer aliasing. This is embarrassing, the book clearly didn't go through peer review by C experts. Another book to add to the black list...
The correct answer is: you don't. And if you are a C beginner, you are in deep water, since this is a rather advanced topic. My rule of thumb is that a beginner should never use the cast operator for any purpose, since it comes with so many pitfalls.
There are two problems going from a raw character byte array to a struct type:
First, there is alignment, which the code addresses somewhat. But _Alignas only ensures that the buffer's start address is suitably aligned for the struct. It does nothing about the alignment of individual struct members or any padding the struct may contain internally, and that layout is not portable.
Second, accessing a character array through a pointer produced by this kind of dirty cast is undefined behavior in the C language, meaning anything can happen, including incorrect code getting generated, hardware exceptions getting thrown, etc. Or it might just seem to work fine for now/this time/forever/until next full moon. See "What is the strict aliasing rule?"
(As it happens, you can go from a struct to a character byte array safely, but not the other way around.)
There's no quick fix to address both of these issues. If you are aware that a struct has padding bytes and where they are, then you could do something like:
    typedef union
    {
        some_struct s;
        unsigned char buf [sizeof(some_struct)];
    } some_union;
This solves the strict aliasing problem and the initial alignment, but it doesn't solve the issue of the struct potentially containing padding bytes between its members. For that you'd need some non-standard means like #pragma pack(1), which in turn is problematic: padding is there for a reason, and if you pack the struct you might not be able to access certain members as intended.
Structs are overall non-portable and unsuitable for close-to-the-metal constructs such as data communication protocols or register maps. The only truly portable way is to write a serialization/deserialization function which copies individual struct members in and out one by one.
Regarding your first point, if the pointer is correctly aligned for the structure as a whole, then the addresses of all the members must be correctly aligned too as long as all the members have a fundamental alignment and do not have an alignment specifier specifying a non-fundamental alignment.
@IanAbbott Yes but there will be padding bytes in order to guarantee that. And if the raw byte stream does not have corresponding dummy bytes matching that padding, then type punning between them will not work.
Later on you address the problem of alignment of structure members with externally defined protocol fields (i.e. serialization/deserialization). Perhaps in the first point you are discussing the possible non-alignment of structure members with some specific indices in the char array that are determined by some protocol, but as written, it reads like you are discussing the possible non-alignment of structure members relative to a pointer that is correctly aligned for the structure itself.
@IanAbbott Well, _Alignas doesn't do anything for individual struct members since their alignment is already handled automatically by the compiler (since C89, so nothing to do with C11). The problem with the OP's book is that they drop _Alignas as some sort of magic feature which will fix all problems in the code and we get the impression that if we only add this keyword, we can suddenly do wild & crazy type punning with dirty casts.