c - Why does Knuth use this clunky decrement?

Reposted by: 行者123 Updated: 2023-12-02 16:06:38

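I'm looking at some of Prof. Don Knuth's code, written in CWEB and converted to C. A specific example is dlx1.w, available from Knuth's website.

At one stage, the .len value of the struct nd[cc] is decremented, and it is done in a clunky way: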

  o,t=nd[cc].len-1;
  o,nd[cc].len=t;

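(This is a Knuth-specific question, so maybe you already know that "o" is a preprocessor macro that increments "mems", a running total of the work done, measured in accesses to 64-bit words.) The value left in "t" is not used for anything else. (The example here is on line 665 of dlx1.w, or line 193 of dlx1.c after ctangle.)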
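
The mem-counting idiom described above can be sketched roughly as follows. This is an illustration only, not Knuth's source: the macro body `mems++`, the counter's type, the array size `ND_SIZE`, the one-field `struct node`, and the helper name `decrement_len_two_step` are all assumptions made for the example.

```c
#include <assert.h>

#define ND_SIZE 4                 /* assumed array size, for illustration */

/* Assumed: a global running total of accesses to 64-bit words. */
unsigned long long mems = 0;

/* Assumed definition: "o" charges one mem. The comma operator lets the
   real statement follow it on the same line. */
#define o mems++

/* Hypothetical stand-in for the node array in the question. */
struct node { int len; };
struct node nd[ND_SIZE];

/* The two-step decrement from the question. */
void decrement_len_two_step(int cc) {
    int t;
    assert(cc >= 0 && cc < ND_SIZE);
    o, t = nd[cc].len - 1;        /* load of nd[cc].len: one mem  */
    o, nd[cc].len = t;            /* store to nd[cc].len: one mem */
}
```

Under these assumptions each call adds exactly two to `mems`, one per annotated line, which makes the accounting easy to audit by eye.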

My question is: why does Knuth write it this way, rather than

nd[cc].len--;

which he does actually use elsewhere (line 551 of dlx1.w):

oo,nd[k].len--,nd[k].aux=i-1;

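("oo" is a similar macro that increments "mems" twice, but there is some subtlety here, because .len and .aux are stored in the same 64-bit word. Assigning values to both S.len and S.aux would normally count as only one increment of mems.)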
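
To make the "same 64-bit word" remark concrete, here is a small layout check. The struct below is a hypothetical 16-byte node with four 32-bit fields; the names `up` and `down`, the field order, and the helper `len_and_aux_share_word` are assumptions for the example. The size and the offset of `len` were chosen to agree with the `sal rax, 4` and `nd+8` that appear in the compiler output quoted later in the answer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node layout: two octabytes, each holding two 32-bit fields. */
struct node {
    int32_t up, down;   /* first 64-bit word  (byte offsets 0 and 4)  */
    int32_t len, aux;   /* second 64-bit word (byte offsets 8 and 12) */
};

/* Index, within the struct, of the 64-bit word containing a byte offset. */
#define OCTA_INDEX(offset) ((offset) / 8)

/* Returns 1 if len and aux fall in the same 64-bit word, 0 otherwise. */
int len_and_aux_share_word(void) {
    assert(sizeof(struct node) == 16);   /* the layout assumed above */
    return OCTA_INDEX(offsetof(struct node, len)) ==
           OCTA_INDEX(offsetof(struct node, aux));
}
```

With this layout, a store that updates both fields touches a single octabyte, which is why it would be charged as one mem.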

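My only theory is that a decrement consists of two memory accesses: first a lookup, then an assignment. (Is that correct?) Writing it this way would then be a reminder of the two steps. That would be unusually verbose for Knuth, but maybe it is an instinctive aide-memoire rather than didacticism.

For what it's worth, I searched the CWEB documentation without finding an answer. My question probably has more to do with Knuth's standard practices, which I am picking up bit by bit. I would be interested in any resource that lays out (and perhaps critiques) these practices as a whole, but for now let's focus on why Knuth writes it this way.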

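Best answer

A preliminary remark: with Knuth-style literate programming (that is, when reading WEB or CWEB programs), the "real" program as Knuth conceives it is neither the "source" .w file nor the generated (tangled) .c file, but the typeset (woven) output. The source .w file is best thought of as a means of producing it (and, of course, the .c source that is fed to the compiler). (If you don't have cweave and TeX handy: I have typeset some of these programs here; this program, DLX1, is here.)

So in this case, I would describe the location in the code as module 25 of DLX1, or subroutine "cover".

Anyway, to return to the actual question: note that this program (DLX1) is one of those written for The Art of Computer Programming. Because reporting a program's running time in "seconds" or "minutes" loses meaning from year to year, he reports it in "mems" plus "oops", a figure dominated by the mems, i.e. the number of memory accesses to 64-bit words (usually). So the book contains statements like "this program finds the answer to this problem in 3.5 gigamems of running time". Further, these statements are meant to be fundamentally about the program or algorithm itself, not about the specific code generated by a specific version of a compiler for certain hardware. (Ideally, when the details are very important, he writes the program in MMIX or MMIXAL and analyses its operation on the MMIX hardware, but this is rare.) Counting the mems (to be reported as above) is the purpose of inserting the o and oo instructions into the program. Note that it is more important to get this right for "inner loop" instructions that are executed many times, such as everything in the subroutine cover in this case.

This is elaborated in Section 1.3.1′ (part of Fascicle 1):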

Timing. […] The running time of a program depends not only on the clock rate but also on the number of functional units that can be active simultaneously and the degree to which they are pipelined; it depends on the techniques used to prefetch instructions before they are executed; it depends on the size of the random-access memory that is used to give the illusion of 2⁶⁴ virtual bytes; and it depends on the sizes and allocation strategies of caches and other buffers, etc., etc.

For practical purposes, the running time of an MMIX program can often be estimated satisfactorily by assigning a fixed cost to each operation, based on the approximate running time that would be obtained on a high-performance machine with lots of main memory; so that’s what we will do. Each operation will be assumed to take an integer number of υ, where υ (pronounced “oops”) is a unit that represents the clock cycle time in a pipelined implementation. Although the value of υ decreases as technology improves, we always keep up with the latest advances because we measure time in units of υ, not in nanoseconds. The running time in our estimates will also be assumed to depend on the number of memory references or mems that a program uses; this is the number of load and store instructions. For example, we will assume that each LDO (load octa) instruction costs µ + υ, where µ is the average cost of a memory reference. The total running time of a program might be reported as, say, 35µ+ 1000υ, meaning “35 mems plus 1000 oops.” The ratio µ/υ has been increasing steadily for many years; nobody knows for sure whether this trend will continue, but experience has shown that µ and υ deserve to be considered independently.
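
As a worked illustration of the cost model in the quoted passage, the estimate "35µ + 1000υ" can be evaluated once values for µ and υ are chosen. The function name `estimated_time` and the ratio µ = 10υ used in the note below are purely hypothetical; the passage itself gives no numeric ratio.

```c
#include <assert.h>

/* Estimate in the style of the quoted passage:
   total = mems * mu + oops * upsilon.
   mu and upsilon are supplied by the caller in any common time unit;
   no particular values are implied by the source. */
double estimated_time(unsigned long long mems, unsigned long long oops,
                      double mu, double upsilon) {
    assert(mu >= 0.0 && upsilon >= 0.0);
    return (double)mems * mu + (double)oops * upsilon;
}
```

For example, with the assumed values µ = 10 and υ = 1, `estimated_time(35, 1000, 10.0, 1.0)` gives 35·10 + 1000·1 = 1350 units of υ. Keeping µ and υ as separate parameters mirrors the passage's point that the two deserve to be considered independently.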

And of course he understands how this differs from reality:

Even though we will often use the assumptions of Table 1 for seat-of-the-pants estimates of running time, we must remember that the actual running time might be quite sensitive to the ordering of instructions. For example, integer division might cost only one cycle if we can find 60 other things to do between the time we issue the command and the time we need the result. Several LDB (load byte) instructions might need to reference memory only once, if they refer to the same octabyte. Yet the result of a load command is usually not ready for use in the immediately following instruction. Experience has shown that some algorithms work well with cache memory, and others do not; therefore µ is not really constant. Even the location of instructions in memory can have a significant effect on performance, because some instructions can be fetched together with others. […] Only the meta-simulator can be trusted to give reliable information about a program’s actual behavior in practice; but such results can be difficult to interpret, because infinitely many configurations are possible. That’s why we often resort to the much simpler estimates of Table 1.

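Finally, we can use Godbolt's Compiler Explorer to look at the code a typical compiler generates for this. (Ideally we would look at MMIX instructions, but since we can't do that, let's settle for the default there, which seems to be x86-64 gcc 8.2.) I removed all the o's and oo's.

For this version of the code: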

  /*o*/ t = nd[cc].len - 1;
  /*o*/ nd[cc].len = t;

the code generated for the first line is:

  movsx rax, r13d
  sal rax, 4
  add rax, OFFSET FLAT:nd+8
  mov eax, DWORD PTR [rax]
  lea r14d, [rax-1]

and for the second line:

  movsx rax, r13d
  sal rax, 4
  add rax, OFFSET FLAT:nd+8
  mov DWORD PTR [rax], r14d

For this version of the code:

  /*o ?*/ nd[cc].len --;

the generated code is:

  movsx rax, r13d
  sal rax, 4
  add rax, OFFSET FLAT:nd+8
  mov eax, DWORD PTR [rax]
  lea edx, [rax-1]
  movsx rax, r13d
  sal rax, 4
  add rax, OFFSET FLAT:nd+8
  mov DWORD PTR [rax], edx

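As you can see (even without knowing much x86-64 assembly), this is simply the concatenation of the code generated in the former case (except that it uses register edx instead of r14d), so writing the decrement on one line has not saved any mems. In particular, it would be incorrect to count it as a single mem, especially in something like cover, which is called a huge number of times in this algorithm (dancing links for exact cover).

So the version Knuth wrote is correct for its goal of counting mems. He could also have written oo,nd[cc].len--; (counting two mems), as you observed, but that might look like a bug at first glance. (By the way, in the example from your question, oo,nd[k].len--,nd[k].aux=i-1;, the two mems come from the load and the store in --, not from two stores.)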
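
The accounting in the last paragraph can be sketched like this. As before, this is a minimal sketch under stated assumptions rather than Knuth's code: the macro bodies, the 16-byte `struct node` with fields `up` and `down`, the array size, and the helper names `dec_one_line` and `dec_and_set_aux` are hypothetical.

```c
#include <assert.h>

#define ND_SIZE 4                      /* assumed array size */

unsigned long long mems = 0;           /* assumed global mem counter */
#define oo mems += 2                   /* assumed: charge two mems   */

/* Hypothetical layout in which len and aux share one 64-bit word. */
struct node { int up, down, len, aux; };
struct node nd[ND_SIZE];

/* One-line decrement: still one load plus one store, so two mems. */
void dec_one_line(int cc) {
    assert(cc >= 0 && cc < ND_SIZE);
    oo, nd[cc].len--;
}

/* The statement from the question: the load and the store of len are
   the two mems; the store to aux lands in the same word as len, so
   it is not charged separately. */
void dec_and_set_aux(int k, int i) {
    assert(k >= 0 && k < ND_SIZE);
    oo, nd[k].len--, nd[k].aux = i - 1;
}
```

Both helpers charge two mems per call under these assumptions, the same total as the two-line form in the question, so the choice between the forms affects readability of the accounting rather than the count.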

Regarding "c - Why does Knuth use this clunky decrement?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/53979547/
