GCC inline assembly side effects

Reposted. Author: 行者123. Updated: 2023-12-02 05:04:14

Can someone explain to me (in other words) the following section of the GCC doc?

Here is a fictitious sum of squares instruction, that takes two pointers to floating point values in memory and produces a floating point register output. Notice that x, and y both appear twice in the asm parameters, once to specify memory accessed, and once to specify a base register used by the asm. You won’t normally be wasting a register by doing this as GCC can use the same register for both purposes. However, it would be foolish to use both %1 and %3 for x in this asm and expect them to be the same. In fact, %3 may well not be a register. It might be a symbolic memory reference to the object pointed to by x.

    asm ("sumsq %0, %1, %2"
         : "+f" (result)
         : "r" (x), "r" (y), "m" (*x), "m" (*y));

Here is a fictitious *z++ = *x++ * *y++ instruction. Notice that the x, y and z pointer registers must be specified as input/output because the asm modifies them.

    asm ("vecmul %0, %1, %2"
         : "+r" (z), "+r" (x), "+r" (y), "=m" (*z)
         : "m" (*x), "m" (*y));

In the first example, what is the point of listing *x and *y among the input operands? The same document states:

In particular, there is no way to specify that input operands get modified without also specifying them as output operands.

In the second example, why is the input operands section used at all? None of its operands is referenced in the assembly statement anyway.

As a bonus, how can the following example from this SO post be changed so that the volatile keyword is not needed?

    void swap_2 (int *a, int *b)
    {
        int tmp0, tmp1;

        __asm__ volatile (
            "movl (%0), %k2\n\t" /* %2 (tmp0) = (*a) */
            "movl (%1), %k3\n\t" /* %3 (tmp1) = (*b) */
            "cmpl %k3, %k2\n\t"
            "jle %=f\n\t" /* if (%2 <= %3) (at&t!) */
            "movl %k3, (%0)\n\t"
            "movl %k2, (%1)\n\t"
            "%=:\n\t"

            : "+r" (a), "+r" (b), "=r" (tmp0), "=r" (tmp1) :
            : "memory" /* "cc" */ );
    }

Thanks in advance. I have been struggling with this for two days.

Best answer

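In the first example, *x and *y must be listed as input operands so that GCC knows the result of the instruction depends on them. Otherwise, GCC is free to move the stores to *x and *y past the inline assembly fragment (or drop them entirely), and the fragment would then read uninitialized memory. This can be seen by compiling the following example: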

    double
    f (void)
    {
        double result;
        double a = 5;
        double b = 7;
        double *x = &a;
        double *y = &b;
        asm ("sumsq %0, %1, %2"
             : "+X" (result)
             : "r" (x), "r" (y) /*, "m" (*x), "m" (*y)*/);
        return result;
    }

The result is:

    f:
            leaq -16(%rsp), %rax
            leaq -8(%rsp), %rdx
            pxor %xmm0, %xmm0
    #APP
    # 8 "t.c" 1
            sumsq %xmm0, %rax, %rdx
    # 0 "" 2
    #NO_APP
            ret

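The two leaq instructions merely set the registers to point into the uninitialized red zone on the stack. The assignments to a and b are gone.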
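The same dependency can be tried with real instructions instead of the fictitious sumsq. What follows is only a minimal sketch, assuming GCC or Clang on x86 or x86-64; the function name sumsq_int is hypothetical, and integers stand in for floating-point values to keep the template short. The "m" (*x) and "m" (*y) operands never appear in the template; they are listed only so that the compiler knows the asm reads those objects. On other targets the sketch falls back to plain C.

```c
/* Hypothetical integer analogue of the fictitious sumsq instruction.
   Assumes GCC or Clang; the asm path is x86 / x86-64 only. */
int sumsq_int(const int *x, const int *y)
{
    int result;
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    int tmp;
    __asm__ ("movl (%[x]), %[res]\n\t"    /* res = *x          */
             "imull %[res], %[res]\n\t"   /* res = res * res   */
             "movl (%[y]), %[tmp]\n\t"    /* tmp = *y          */
             "imull %[tmp], %[tmp]\n\t"   /* tmp = tmp * tmp   */
             "addl %[tmp], %[res]"        /* res += tmp        */
             /* Early-clobber: res is written before y is read. */
             : [res] "=&r" (result), [tmp] "=&r" (tmp)
             /* x and y supply the base registers; *x and *y declare
                the memory that the template reads through them. */
             : [x] "r" (x), [y] "r" (y), "m" (*x), "m" (*y)
             : "cc");
#else
    /* Portable fallback for other compilers and targets. */
    result = *x * *x + *y * *y;
#endif
    return result;
}
```

Calling sumsq_int (&a, &b) with a = 3 and b = 4 should yield 25. With the two "m" operands removed, the compiler would be entitled to treat the stores to a and b as dead, as in the listing above.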

The same applies to the second example.
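A sketch of the second example with real instructions may make this more concrete. It assumes GCC or Clang on x86 or x86-64 and a 4-byte int; the name vecmul_step and the tmp scratch operand are hypothetical additions, not part of the GCC documentation. The pointers are "+r" because the template advances them, "=m" (*z) declares the store, and the unreferenced "m" (*x), "m" (*y) inputs declare the loads.

```c
/* Hypothetical analogue of *z++ = *x++ * *y++ for int elements.
   Assumes GCC or Clang; the asm path is x86 / x86-64 only. */
void vecmul_step(int **zp, const int **xp, const int **yp)
{
    int *z = *zp;
    const int *x = *xp;
    const int *y = *yp;
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    int tmp;
    __asm__ ("movl (%[x]), %[tmp]\n\t"    /* tmp = *x           */
             "imull (%[y]), %[tmp]\n\t"   /* tmp *= *y          */
             "movl %[tmp], (%[z])\n\t"    /* *z = tmp           */
             "add $4, %[x]\n\t"           /* x++ (4-byte int)   */
             "add $4, %[y]\n\t"           /* y++                */
             "add $4, %[z]"               /* z++                */
             /* The pointers are modified, so they are in/out.
                "=m" (*z) tells the compiler that *z is written. */
             : [z] "+r" (z), [x] "+r" (x), [y] "+r" (y),
               [tmp] "=&r" (tmp), "=m" (*z)
             /* Not referenced in the template: these only declare
                that the asm reads *x and *y. */
             : "m" (*x), "m" (*y)
             : "cc");
#else
    /* Portable fallback for other compilers and targets. */
    *z++ = *x++ * *y++;
#endif
    *zp = z;
    *xp = x;
    *yp = y;
}
```

Each call multiplies one pair of elements and advances all three pointers by one element, so calling it twice over two-element arrays fills both output slots.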

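I think you could use the same trick to get rid of volatile. But I don't think it is actually necessary here, because there is already a "memory" clobber, which tells GCC that memory is read or written by the inline assembly.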
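Applying that trick to swap_2 could look roughly like the sketch below. It is an untested-in-production illustration, assuming GCC or Clang on x86 or x86-64; the name swap_2_sketch is hypothetical. *a and *b become "+m" operands, so the compiler sees exactly which objects are read and written, and neither volatile nor the "memory" clobber is used. The temporaries are early-clobber because they are written before all memory operands have been read.

```c
/* Hypothetical rewrite of swap_2 without volatile or a "memory" clobber.
   Assumes GCC or Clang; the asm path is x86 / x86-64 only. */
void swap_2_sketch(int *a, int *b)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    int tmp0, tmp1;
    __asm__ ("movl %[ma], %[t0]\n\t"   /* t0 = *a                 */
             "movl %[mb], %[t1]\n\t"   /* t1 = *b                 */
             "cmpl %[t1], %[t0]\n\t"
             "jle %=f\n\t"             /* skip if t0 <= t1 (AT&T) */
             "movl %[t1], %[ma]\n\t"   /* *a = t1                 */
             "movl %[t0], %[mb]\n\t"   /* *b = t0                 */
             "%=:"
             /* *a and *b are read and possibly written. */
             : [ma] "+m" (*a), [mb] "+m" (*b),
               [t0] "=&r" (tmp0), [t1] "=&r" (tmp1)
             :
             : "cc");
#else
    /* Portable fallback for other compilers and targets. */
    if (*a > *b) {
        int t = *a;
        *a = *b;
        *b = t;
    }
#endif
}
```

Like the original, this exchanges the two values only when *a is greater than *b. Because the memory operands are outputs, the compiler keeps the asm whenever *a or *b is used afterwards, and it may still remove the statement if neither value is ever read again.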

This question about GCC inline assembly side effects comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/46942936/
