Assembly code for sin(x) using a Taylor expansion

Reposted by: 行者123. Updated: 2023-12-04 01:33:06

On x86 Linux, how can sin(x) be implemented in assembly code using a Taylor expansion?

Best answer

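You didn't say which CPU architecture, so I'm assuming x86.

The simplest (and possibly least efficient) way is to write the formula in RPN, which maps almost directly onto FPU instructions.

Example:

Algebraic formula: x - (x^3/3!) + (x^5/5!)

RPN: x x x * x * 3 2 * / - x x * x * x * x * 5 4 * 3 * 2 * / +

Which becomes: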

fld x
fld x
fld x
fmul
fld x
fmul
fild [const_3]
fild [const_2]
fmul
fdiv
fsub
fld x
fld x
fmul 
fld x
fmul
fld x
fmul
fld x
fmul
fild [const_5]
fild [const_4]
fmul
fild [const_3]
fmul
fild [const_2]
fmul
fdiv
fadd
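
The RPN-to-stack mapping above can also be mimicked in C, which gives a quick way to check the instruction order by hand. The following is only a sketch: `rpn_stack`, `rpn_push`, `rpn_pop`, `rpn_apply` and `sin_rpn` are hypothetical names introduced here for illustration, and the array is merely a stand-in for the x87 register stack.

```c
#include <assert.h>
#include <stddef.h>

/* Toy operand stack standing in for the x87 register stack.
   All names here are illustrative; none come from the original answer. */
#define STACK_MAX 8

typedef struct {
    double slot[STACK_MAX];
    size_t depth;
} rpn_stack;

static void rpn_push(rpn_stack *s, double v) {
    assert(s->depth < STACK_MAX);
    s->slot[s->depth++] = v;
}

static double rpn_pop(rpn_stack *s) {
    assert(s->depth > 0);
    return s->slot[--s->depth];
}

/* Binary operator: pop b (top), pop a (just below), push "a op b". */
static void rpn_apply(rpn_stack *s, char op) {
    double b = rpn_pop(s);
    double a = rpn_pop(s);
    switch (op) {
    case '*': rpn_push(s, a * b); break;
    case '/': rpn_push(s, a / b); break;
    case '+': rpn_push(s, a + b); break;
    case '-': rpn_push(s, a - b); break;
    default:  assert(0 && "unknown operator");
    }
}

/* Evaluate x - x^3/3! + x^5/5! in the same order as the RPN string. */
double sin_rpn(double x) {
    rpn_stack s = { {0}, 0 };

    rpn_push(&s, x);                                  /* x                 */

    rpn_push(&s, x); rpn_push(&s, x);                 /* x x *             */
    rpn_apply(&s, '*');
    rpn_push(&s, x); rpn_apply(&s, '*');              /* x *      -> x^3   */
    rpn_push(&s, 3); rpn_push(&s, 2);                 /* 3 2 *    -> 3!    */
    rpn_apply(&s, '*');
    rpn_apply(&s, '/');                               /* /                 */
    rpn_apply(&s, '-');                               /* -                 */

    rpn_push(&s, x); rpn_push(&s, x);                 /* x x *             */
    rpn_apply(&s, '*');
    rpn_push(&s, x); rpn_apply(&s, '*');              /* x *               */
    rpn_push(&s, x); rpn_apply(&s, '*');              /* x *               */
    rpn_push(&s, x); rpn_apply(&s, '*');              /* x *      -> x^5   */
    rpn_push(&s, 5); rpn_push(&s, 4);                 /* 5 4 *             */
    rpn_apply(&s, '*');
    rpn_push(&s, 3); rpn_apply(&s, '*');              /* 3 *               */
    rpn_push(&s, 2); rpn_apply(&s, '*');              /* 2 *      -> 5!    */
    rpn_apply(&s, '/');                               /* /                 */
    rpn_apply(&s, '+');                               /* +                 */

    return rpn_pop(&s);
}
```

Each push corresponds to a load (`fld` / `fild`) and each operator to the matching arithmetic instruction in the listing. The stack never holds more than four values, which fits comfortably in the eight x87 registers.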

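There are some obvious optimization strategies:

- Instead of computing x, x*x*x, x*x*x*x*x etc. from scratch for each term, keep a "running product" and multiply it by x*x each time.
- Instead of computing the factorial from scratch for each term, keep a "running product" in the same way.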
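
As a rough illustration of the two running products, here is a minimal C sketch. The function name `taylor_sin` and the meaning of `terms` (the number of terms added after the leading x) are chosen to mirror the assembly routine in this answer; the variable names are assumptions made for readability.

```c
#include <assert.h>

/* Sketch of the running-product idea.
   terms = number of series terms applied after the leading x. */
double taylor_sin(double x, int terms) {
    assert(terms >= 0);

    double sum = x;         /* first term of the series            */
    double power = x;       /* running product: x, x^3, x^5, ...   */
    double factorial = 1.0; /* running product: 1!, 3!, 5!, ...    */
    double index = 1.0;     /* factorial index: 1, 3, 5, ...       */

    for (int k = 0; k < terms; ++k) {
        /* advance the factorial by two steps: (n)! -> (n+2)! */
        index += 1.0; factorial *= index;
        index += 1.0; factorial *= index;

        /* advance the power by two steps: x^n -> x^(n+2) */
        power *= x * x;

        double term = power / factorial;

        /* signs alternate, starting with a subtraction for x^3/3! */
        if (k % 2 == 0)
            sum -= term;
        else
            sum += term;
    }
    return sum;
}
```

Calling `taylor_sin(0.5, 1)` evaluates 0.5 - 0.5^3/3!, so each extra term costs only a few multiplications and one division instead of a full power and factorial computation.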

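Here is some commented code for the x86 FPU. The comment after each FPU instruction shows the stack state after that instruction has executed, with the top of the stack (st0) on the left, e.g.: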

fldz ; 0
fld1 ; 1, 0

--snip--

bits 32

section .text

extern printf
extern atof
extern atoi
extern puts
global main

taylor_sin:
  push eax
  push ecx

  ; input :
  ;  st(0) = x, value to approximate sin(x) of
  ;  [esp+12] = number of taylor series terms

  ; variables we'll use :
  ; s = sum of all terms (final result)
  ; x = value we want to take the sin of
  ; fi = factorial index (1, 3, 5, 7, ...)
  ; fc = factorial current (1, 6, 120, 5040, ...)
  ; n = numerator of term (x, x^3, x^5, x^7, ...)

  ; setup state for each iteration (term)
  fldz ; s x
  fxch st1 ; x s
  fld1 ; fi x s
  fld1 ; fc fi x s
  fld st2 ; n fc fi x s

  ; first term
  fld st1 ; fc n fc fi x s
  fdivr st0,st1 ; r n fc fi x s
  faddp st5,st0 ; n fc fi x s

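  ; loop through each remaining term
  ; note: ecx must be >= 1 here; with ecx = 0 the LOOP instruction below
  ; decrements first and would wrap around instead of exiting
  mov ecx,[esp+12] ; number of terms after the first
  xor eax,eax ; zero add/sub counter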

loop_term:
  ; calculate next odd factorial
  fld1 ; 1 n fc fi x s
  faddp st3 ; n fc fi x s
  fld st2 ; fi n fc fi x s
  fmulp st2,st0 ; n fc fi x s
  fld1 ; 1 n fc fi x s
  faddp st3 ; n fc fi x s
  fld st2 ; fi n fc fi x s
  fmulp st2,st0 ; n fc fi x s

  ; calculate next odd power of x
  fmul st0,st3 ; n*x fc fi x s
  fmul st0,st3 ; n*x*x fc fi x s

  ; divide power by factorial
  fld st1 ; fc n fc fi x s
  fdivr st0,st1 ; r n fc fi x s

  ; check if we need to add or subtract this term
  test eax,1
  jnz odd_term
  fsubp st5,st0 ; n fc fi x s
  jmp skip
odd_term:
  ; accumulate result
  faddp st5,st0 ; n fc fi x s
skip:
  inc eax ; increment add/sub counter
  loop loop_term

  ; unstack work variables
  fstp st0
  fstp st0
  fstp st0
  fstp st0

  ; result is in st(0)

  pop ecx
  pop eax

  ret

main:

  ; check if we have 2 command-line args
  mov eax, [esp+4]
  cmp eax, 3
  jnz error

  ; get arg 1 - value to calc sin of
  ; (edx is used as scratch because ebx is callee-saved in the i386 ABI)
  mov edx, [esp+8]
  push dword [edx+4]
  call atof
  add esp, 4

  ; get arg 2 - number of taylor series terms
  mov edx, [esp+8]
  push dword [edx+8]
  call atoi
  add esp, 4

  ; do the taylor series approximation
  push eax
  call taylor_sin
  add esp, 4

  ; output result
  sub esp, 8
  fstp qword [esp]
  push format
  call printf
  add esp,12

  ; return to libc
  xor eax,eax
  ret

error:
  push error_message
  call puts
  add esp,4
  mov eax,1
  ret

section .data

error_message: db "syntax: <x> <terms>",0
format: db "%0.10f",10,0

Running the program:

$ ./taylor-sine 0.5 1
0.4791666667
$ ./taylor-sine 0.5 5
0.4794255386
$ echo "s(0.5)"|bc -l
.47942553860420300027
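
The runs above can be sanity-checked against the standard Taylor remainder bound. In the sketch below, n stands for the `<terms>` argument (the number of terms after the leading x) and S_n is the resulting partial sum; both symbols are introduced here and do not appear in the original answer.

```latex
S_n(x) = \sum_{k=0}^{n} (-1)^k \frac{x^{2k+1}}{(2k+1)!},
\qquad
\left| \sin x - S_n(x) \right| \le \frac{|x|^{2n+3}}{(2n+3)!}
```

For x = 0.5 and n = 1 the bound is 0.5^5/5! ≈ 0.00026042, and the observed difference 0.4794255386 - 0.4791666667 ≈ 0.00025887 lies inside it. For n = 5 the bound is 0.5^13/13! ≈ 2e-14, which is consistent with the second run agreeing with `bc` in all ten printed digits.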

For more on assembly code for sin(x) using a Taylor expansion, see the similar question on Stack Overflow: https://stackoverflow.com/questions/1252929/
