Homemade 'fabs' function in C and x86 Assembly

Reposted by: bug小助手 | Updated: 2023-10-28 11:42:31

I'm trying to write, in GNU C, an fabs function that returns the absolute value of a 32-bit float. I have three different versions, called fabs1, fabs2, and fabs3:

#include <math.h>
#include <stdio.h>

typedef union
{
    float v;
    struct
    {
        unsigned int mantissa : 23;
        unsigned int exponent : 8;
        unsigned int negative : 1;
    } b;
} components;

float fabs1(float f)
{
    return f >= 0.0 ? f : -f;
}

float fabs2(float f)
{
    components c;

    c.v = f;
    c.b.negative = 0;

    return c.v;
}

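float fabs3(float f)
{
    double aux = f;
    unsigned short cw;

    /* Reset the x87 FPU, clear the precision- and rounding-control fields of
       the control word (bits 8-11), select 53-bit precision, then take the
       absolute value of aux on the FPU stack. */
    __asm__
    (
        "finit;\
        fstcw %[cw];\
        andw $0xf0ff, %[cw];\
        orw $0x0200, %[cw];\
        fldcw %[cw];\
        fldl %[aux];\
        fabs;\
        fstpl %[aux];"
        : [aux] "+m" (aux), [cw] "=m" (cw)
    );

    return aux;
}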

int main(void)
{
    printf("fabs(-189.55f) = %f\n", fabs(-189.55f));

    printf("fabs1(-189.55f) = %f\n", fabs1(-189.55f));
    printf("fabs2(-189.55f) = %f\n", fabs2(-189.55f));
    printf("fabs3(-189.55f) = %f\n", fabs3(-189.55f));

    return 0;
}

There are three different functions: one using a simple conditional, one a bit more complicated using a union, and a final one using x86 assembly. I am compiling it in 32-bit Cygwin with:

C:/Developer/Cygwin/bin/i686-w64-mingw32-gcc -masm=att -I.. -std=c99 -o main.exe main.c

I'm running it on Windows 11 and the results are:

fabs(-189.55f) = 189.550000
fabs1(-189.55f) = 189.550003
fabs2(-189.55f) = 189.550003
fabs3(-189.55f) = 189.550003

But they should really be:

fabs(-189.55f) = 189.550000
fabs1(-189.55f) = 189.550000
fabs2(-189.55f) = 189.550000
fabs3(-189.55f) = 189.550000

Can you spot the difference? How do I get rid of the extra 0.000003 in all three cases?

Comments

Can you explain what "It works fine in 64 bits, but not in 32 bits" really means? Error? Incorrect result? Give us the details.

ia64 refers to Itanium, which I suspect is not what you have. The common 64-bit desktop architecture is called x86-64 or amd64 (or x64 by Microsoft). But since you're compiling as 32-bit code, you're not using that either; this is just x86.

I think the fabs version might be able to promote the value directly to double, since fabs is defined with a double argument and return value, and then you get the nearest double to -189.55, which is much closer. I'd have to double-check C's rules for floating-point literals. I suspect that if you use fabsf instead, you will get the same result as the other versions.

You didn't say what the problem is.

I'm not sure why you mess with the rounding mode. fabs just clears the sign bit; it should not cause rounding.

Recommended answer

The basic issue causing the 3 to appear is that the number 189.55 can't be represented exactly in a float -- the closest float value is 189.5500030517578125 (0x1.7b199ap+7 in hex), which prints as 189.550003 when shown with 6 digits after the decimal point.

The compiler is permitted to do operations at higher precision, so when you use fabs (which may be a builtin and returns a double), you may get the value 189.55000000000001136868377216160297393798828125 (the closest you can get with double precision -- 0x1.7b1999999999ap+7 in hex), but all your handwritten functions return the float value of 189.5500030517578125.

To get rid of the 3s you can:

change everything to double precision

change the output to 5 digits after the decimal point (%.5f in the format)

However, neither fixes the fundamental problem that IEEE binary floating-point numbers cannot exactly represent most base-10 fractions, so there will always be some rounding and imprecision.

Comments

Every call to some fabs variant in OP’s code passes -189.55f as argument. This should always be a float of the same value, and the fact one of the calls is to fabs and that fabs has a double parameter and double return type should be irrelevant. -189.55f should produce a float value before it is passed to fabs, and passing it to fabs should not change that value. Peter Cordes already identified the problem, a compiler defect.

@EricPostpischil The precision of float is just a minimum -- the compiler is always permitted to evaluate things at higher precision if it wants to, but it is not required to. But that is irrelevant to the OP's question of why the 3 appears (and how to get rid of it), which is due to using float precision.

The C standard explicitly states that all instances of a floating-point literal of the same form (exactly the same characters in the source text) must convert to the same value. C 2018 6.4.4.2 5: “… All floating constants of the same source form shall convert to the same internal format with the same value.”

The compiler is permitted to do operations at higher precision - Yes, but 189.55f means the starting point for any operations must still be a float that can actually have existed, as Eric says. That literal has a value of type float. Gaining precision beyond that requires breaking into the parsing of the float and parsing it as a double instead, which as Eric says is not what the standard says should happen. godbolt.org/z/n53nsKhW6 shows that as args to printf, (float)189.55f has the expected low zeros in the mantissa but plain 189.55f doesn't.

@PeterCordes: Ugh. Lousy excuse. I expect the standard’s reason for allowing excess precision is for run-time performance—use one FMA instead of two instructions, use double-precision instructions on a processor without single-precision instructions, etc. Interpreting the standard’s latitude on computation as applying to translating constants is questionable. I would expect most experienced floating-point programmers to think that 189.55f refers to a specific value, so the GCC behavior will be surprising to them.
