- c - 在位数组中找到第一个零
- linux - Unix 显示有关匹配两种模式之一的文件的信息
- 正则表达式替换多个文件
- linux - 隐藏来自 xtrace 的命令
我正在尝试在 java 上实现 Fast Inverse Square Root 以加速 vector 规范化。但是,当我在 Java 中实现单精度版本时,一开始我获得的速度与 1F/(float)Math.sqrt()
大致相同,然后迅速下降到一半。这很有趣,因为虽然 Math.sqrt 使用(我假设)本地方法,但这涉及浮点除法,我听说这真的很慢。我计算数字的代码如下:
public static float fastInverseSquareRoot(float x){
float xHalf = 0.5F * x;
int temp = Float.floatToRawIntBits(x);
temp = 0x5F3759DF - (temp >> 1);
float newX = Float.intBitsToFloat(temp);
newX = newX * (1.5F - xHalf * newX * newX);
return newX;
}
使用我编写的一个短程序,每迭代 1600 万次,然后汇总结果,并重复,我得到如下结果:
1F / Math.sqrt() took 65209490 nanoseconds.
Fast Inverse Square Root took 65456128 nanoseconds.
Fast Inverse Square Root was 0.378224 percent slower than 1F / Math.sqrt()
1F / Math.sqrt() took 64131293 nanoseconds.
Fast Inverse Square Root took 26214534 nanoseconds.
Fast Inverse Square Root was 59.123647 percent faster than 1F / Math.sqrt()
1F / Math.sqrt() took 27312205 nanoseconds.
Fast Inverse Square Root took 56234714 nanoseconds.
Fast Inverse Square Root was 105.895914 percent slower than 1F / Math.sqrt()
1F / Math.sqrt() took 26493281 nanoseconds.
Fast Inverse Square Root took 56004783 nanoseconds.
Fast Inverse Square Root was 111.392402 percent slower than 1F / Math.sqrt()
我一直得到两者速度大致相同的数字,然后进行迭代,其中快速反平方根节省了 1F/Math.sqrt()
所需时间的大约 60%,随后通过几次迭代,快速反平方根运行的时间大约是控制的两倍。我很困惑为什么 FISR 会从 Same -> 快 60% -> 慢 100%,而且每次我运行我的程序时都会发生这种情况。
编辑:以上数据是我在eclipse中运行的。当我用 javac/java
运行程序时,我得到了完全不同的数据:
1F / Math.sqrt() took 57870498 nanoseconds.
Fast Inverse Square Root took 88206794 nanoseconds.
Fast Inverse Square Root was 52.421004 percent slower than 1F / Math.sqrt()
1F / Math.sqrt() took 54982400 nanoseconds.
Fast Inverse Square Root took 83777562 nanoseconds.
Fast Inverse Square Root was 52.371599 percent slower than 1F / Math.sqrt()
1F / Math.sqrt() took 21115822 nanoseconds.
Fast Inverse Square Root took 76705152 nanoseconds.
Fast Inverse Square Root was 263.259133 percent slower than 1F / Math.sqrt()
1F / Math.sqrt() took 20159210 nanoseconds.
Fast Inverse Square Root took 80745616 nanoseconds.
Fast Inverse Square Root was 300.539585 percent slower than 1F / Math.sqrt()
1F / Math.sqrt() took 21814675 nanoseconds.
Fast Inverse Square Root took 85261648 nanoseconds.
Fast Inverse Square Root was 290.845374 percent slower than 1F / Math.sqrt()
EDIT2: 几次响应后,速度似乎在几次迭代后稳定下来,但它稳定到的数字非常不稳定。有人知道为什么吗?
这是我的代码(不是很简洁,但这是全部):
public class FastInverseSquareRootTest {
public static FastInverseSquareRootTest conductTest() {
float result = 0F;
long startTime, endTime, midTime;
startTime = System.nanoTime();
for (float x = 1F; x < 4_000_000F; x += 0.25F) {
result = 1F / (float) Math.sqrt(x);
}
midTime = System.nanoTime();
for (float x = 1F; x < 4_000_000F; x += 0.25F) {
result = fastInverseSquareRoot(x);
}
endTime = System.nanoTime();
return new FastInverseSquareRootTest(midTime - startTime, endTime
- midTime);
}
public static float fastInverseSquareRoot(float x) {
float xHalf = 0.5F * x;
int temp = Float.floatToRawIntBits(x);
temp = 0x5F3759DF - (temp >> 1);
float newX = Float.intBitsToFloat(temp);
newX = newX * (1.5F - xHalf * newX * newX);
return newX;
}
public static void main(String[] args) throws Exception {
for (int i = 0; i < 7; i++) {
System.out.println(conductTest().toString());
}
}
private long controlDiff;
private long experimentalDiff;
private double percentError;
public FastInverseSquareRootTest(long controlDiff, long experimentalDiff) {
this.experimentalDiff = experimentalDiff;
this.controlDiff = controlDiff;
this.percentError = 100D * (experimentalDiff - controlDiff)
/ controlDiff;
}
@Override
public String toString() {
StringBuilder sb = new StringBuilder();
sb.append(String.format("1F / Math.sqrt() took %d nanoseconds.%n",
controlDiff));
sb.append(String.format(
"Fast Inverse Square Root took %d nanoseconds.%n",
experimentalDiff));
sb.append(String
.format("Fast Inverse Square Root was %f percent %s than 1F / Math.sqrt()%n",
Math.abs(percentError), percentError > 0D ? "slower"
: "faster"));
return sb.toString();
}
}
最佳答案
JIT 优化器似乎已经放弃了对 Math.sqrt
的调用。
用你未修改的代码,我得到了
1F / Math.sqrt() took 65358495 nanoseconds.
Fast Inverse Square Root took 77152791 nanoseconds.
Fast Inverse Square Root was 18,045544 percent slower than 1F / Math.sqrt()
1F / Math.sqrt() took 52872498 nanoseconds.
Fast Inverse Square Root took 75242075 nanoseconds.
Fast Inverse Square Root was 42,308531 percent slower than 1F / Math.sqrt()
1F / Math.sqrt() took 23386359 nanoseconds.
Fast Inverse Square Root took 73532080 nanoseconds.
Fast Inverse Square Root was 214,422951 percent slower than 1F / Math.sqrt()
1F / Math.sqrt() took 23790209 nanoseconds.
Fast Inverse Square Root took 76254902 nanoseconds.
Fast Inverse Square Root was 220,530610 percent slower than 1F / Math.sqrt()
1F / Math.sqrt() took 23885467 nanoseconds.
Fast Inverse Square Root took 74869636 nanoseconds.
Fast Inverse Square Root was 213,452678 percent slower than 1F / Math.sqrt()
1F / Math.sqrt() took 23473514 nanoseconds.
Fast Inverse Square Root took 73063699 nanoseconds.
Fast Inverse Square Root was 211,260168 percent slower than 1F / Math.sqrt()
1F / Math.sqrt() took 23738564 nanoseconds.
Fast Inverse Square Root took 71917013 nanoseconds.
Fast Inverse Square Root was 202,954353 percent slower than 1F / Math.sqrt()
fastInverseSquareRoot
的运行时间一直较慢,并且其运行时间都在同一个球场,而 Math.sqrt
调用速度大大加快。
更改代码以便无法避免对 Math.sqrt
的调用,
for (float x = 1F; x < 4_000_000F; x += 0.25F) {
result += 1F / (float) Math.sqrt(x);
}
midTime = System.nanoTime();
for (float x = 1F; x < 4_000_000F; x += 0.25F) {
result -= fastInverseSquareRoot(x);
}
endTime = System.nanoTime();
if (result == 0) System.out.println("Wow!");
我得到了
1F / Math.sqrt() took 184884684 nanoseconds.
Fast Inverse Square Root took 85298761 nanoseconds.
Fast Inverse Square Root was 53,863804 percent faster than 1F / Math.sqrt()
1F / Math.sqrt() took 182183542 nanoseconds.
Fast Inverse Square Root took 83040574 nanoseconds.
Fast Inverse Square Root was 54,419278 percent faster than 1F / Math.sqrt()
1F / Math.sqrt() took 165269658 nanoseconds.
Fast Inverse Square Root took 81922280 nanoseconds.
Fast Inverse Square Root was 50,431143 percent faster than 1F / Math.sqrt()
1F / Math.sqrt() took 163272877 nanoseconds.
Fast Inverse Square Root took 81906141 nanoseconds.
Fast Inverse Square Root was 49,834815 percent faster than 1F / Math.sqrt()
1F / Math.sqrt() took 165314846 nanoseconds.
Fast Inverse Square Root took 81124465 nanoseconds.
Fast Inverse Square Root was 50,927296 percent faster than 1F / Math.sqrt()
1F / Math.sqrt() took 164079534 nanoseconds.
Fast Inverse Square Root took 80453629 nanoseconds.
Fast Inverse Square Root was 50,966689 percent faster than 1F / Math.sqrt()
1F / Math.sqrt() took 162350821 nanoseconds.
Fast Inverse Square Root took 79854355 nanoseconds.
Fast Inverse Square Root was 50,813704 percent faster than 1F / Math.sqrt()
很多 Math.sqrt
的时间慢了很多,而 fastInverseSqrt
的时间只稍微慢了一点(现在它必须在每次迭代中做一个减法).
关于java - 为什么快速平方根反比在 Java 上如此奇怪和缓慢?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/16551140/
我正在编写一个具有以下签名的 Java 方法。 void Logger(Method method, Object[] args); 如果一个方法(例如 ABC() )调用此方法 Logger,它应该
我是 Java 新手。 我的问题是我的 Java 程序找不到我试图用作的图像文件一个 JButton。 (目前这段代码什么也没做,因为我只是得到了想要的外观第一的)。这是我的主课 代码: packag
好的,今天我在接受采访,我已经编写 Java 代码多年了。采访中说“Java 垃圾收集是一个棘手的问题,我有几个 friend 一直在努力弄清楚。你在这方面做得怎么样?”。她是想骗我吗?还是我的一生都
我的 friend 给了我一个谜语让我解开。它是这样的: There are 100 people. Each one of them, in his turn, does the following
如果我将使用 Java 5 代码的应用程序编译成字节码,生成的 .class 文件是否能够在 Java 1.4 下运行? 如果后者可以工作并且我正在尝试在我的 Java 1.4 应用程序中使用 Jav
有关于why Java doesn't support unsigned types的问题以及一些关于处理无符号类型的问题。我做了一些搜索,似乎 Scala 也不支持无符号数据类型。限制是Java和S
我只是想知道在一个 java 版本中生成的字节码是否可以在其他 java 版本上运行 最佳答案 通常,字节码无需修改即可在 较新 版本的 Java 上运行。它不会在旧版本上运行,除非您使用特殊参数 (
我有一个关于在命令提示符下执行 java 程序的基本问题。 在某些机器上我们需要指定 -cp 。 (类路径)同时执行java程序 (test为java文件名与.class文件存在于同一目录下) jav
我已经阅读 StackOverflow 有一段时间了,现在我才鼓起勇气提出问题。我今年 20 岁,目前在我的家乡(罗马尼亚克卢日-纳波卡)就读 IT 大学。足以介绍:D。 基本上,我有一家提供簿记应用
我有 public JSONObject parseXML(String xml) { JSONObject jsonObject = XML.toJSONObject(xml); r
我已经在 Java 中实现了带有动态类型的简单解释语言。不幸的是我遇到了以下问题。测试时如下代码: def main() { def ks = Map[[1, 2]].keySet()
一直提示输入 1 到 10 的数字 - 结果应将 st、rd、th 和 nd 添加到数字中。编写一个程序,提示用户输入 1 到 10 之间的任意整数,然后以序数形式显示该整数并附加后缀。 public
我有这个 DownloadFile.java 并按预期下载该文件: import java.io.*; import java.net.URL; public class DownloadFile {
我想在 GUI 上添加延迟。我放置了 2 个 for 循环,然后重新绘制了一个标签,但这 2 个 for 循环一个接一个地执行,并且标签被重新绘制到最后一个。 我能做什么? for(int i=0;
我正在对对象 Student 的列表项进行一些测试,但是我更喜欢在 java 类对象中创建硬编码列表,然后从那里提取数据,而不是连接到数据库并在结果集中选择记录。然而,自从我这样做以来已经很长时间了,
我知道对象创建分为三个部分: 声明 实例化 初始化 classA{} classB extends classA{} classA obj = new classB(1,1); 实例化 它必须使用
我有兴趣使用 GPRS 构建车辆跟踪系统。但是,我有一些问题要问以前做过此操作的人: GPRS 是最好的技术吗?人们意识到任何问题吗? 我计划使用 Java/Java EE - 有更好的技术吗? 如果
我可以通过递归方法反转数组,例如:数组={1,2,3,4,5} 数组结果={5,4,3,2,1}但我的结果是相同的数组,我不知道为什么,请帮助我。 public class Recursion { p
有这样的标准方式吗? 包括 Java源代码-测试代码- Ant 或 Maven联合单元持续集成(可能是巡航控制)ClearCase 版本控制工具部署到应用服务器 最后我希望有一个自动构建和集成环境。
我什至不知道这是否可能,我非常怀疑它是否可能,但如果可以,您能告诉我怎么做吗?我只是想知道如何从打印机打印一些文本。 有什么想法吗? 最佳答案 这里有更简单的事情。 import javax.swin
我是一名优秀的程序员,十分优秀!