arrays - When does the C2 JIT compiler trigger the Java loop predication optimization?

Reposted. Author: 行者123. Updated: 2023-12-03 16:52:31

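I am trying to understand the native code generated from a Java loop. The native code should have been optimized by the C2 compiler, but in my simple example some optimizations seem to be missing.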

This is the Java method, which I wrote based on the minimal example at https://wiki.openjdk.java.net/display/HotSpot/LoopPredication:

    public static byte[] myLoop(int init, int limit, int stride, int scale, int offset, byte value, byte[] array) {
        for (int i = init; i < limit; i += stride) {   // line 105 in the disassembly below
            array[scale * i + offset] = value;         // line 106 in the disassembly below
        }
        return array;
    }

These are the arguments passed to the Java 8 HotSpot VM to force C2 compilation:

-server
-XX:-TieredCompilation
-XX:CompileThreshold=5
-XX:+UnlockDiagnosticVMOptions
-XX:+PrintAssembly
-XX:-UseCompressedOops
-XX:+LogCompilation
-XX:+TraceClassLoading
-XX:+UseLoopPredicate
-XX:+RangeCheckElimination

This is the amd64 native code generated by C2 ('myLoop' was called at least 10000 times):

  # {method} {0x00007fcb5088ef38} 'myLoop' '(IIIIIB[B)[B' in 'MyClass'                                                                                                                                                                                                                                                                                      
# parm0: rsi = int
# parm1: rdx = int
# parm2: rcx = int
# parm3: r8 = int
# parm4: r9 = int
# parm5: rdi = byte
# parm6: [sp+0x40] = '[B' (sp of caller)
0x00007fcd44ee9fe0: mov %eax,0xfffffffffffec000(%rsp)
0x00007fcd44ee9fe7: push %rbp
0x00007fcd44ee9fe8: sub $0x30,%rsp ;*synchronization entry
; - MyClass::myLoop@-1 (line 105)

0x00007fcd44ee9fec: cmp %edx,%esi
0x00007fcd44ee9fee: jnl 0x7fcd44eea04a ;*if_icmplt
; - MyClass::myLoop@27 (line 105)

0x00007fcd44ee9ff0: mov 0x40(%rsp),%rax
0x00007fcd44ee9ff5: mov 0x10(%rax),%r10d ;*bastore
; - MyClass::myLoop@17 (line 106)
; implicit exception: dispatches to 0x00007fcd44eea051
0x00007fcd44ee9ff9: nopl 0x0(%rax) ;*aload
; - MyClass::myLoop@6 (line 106)

0x00007fcd44eea000: mov %esi,%ebx
0x00007fcd44eea002: imull %r8d,%ebx
0x00007fcd44eea006: add %r9d,%ebx ;*iadd
; - MyClass::myLoop@14 (line 106)

0x00007fcd44eea009: cmp %r10d,%ebx
0x00007fcd44eea00c: jnb 0x7fcd44eea02e ;*bastore
; - MyClass::myLoop@17 (line 106)

0x00007fcd44eea00e: add %ecx,%esi ;*iadd
; - MyClass::myLoop@21 (line 105)

0x00007fcd44eea010: movsxd %ebx,%r11
0x00007fcd44eea013: mov %dil,0x18(%rax,%r11) ; OopMap{rax=Oop off=56}
;*if_icmplt
; - MyClass::myLoop@27 (line 105)

0x00007fcd44eea018: test %eax,0xa025fe2(%rip) ; {poll}
0x00007fcd44eea01e: cmp %edx,%esi
0x00007fcd44eea020: jl 0x7fcd44eea000 ;*synchronization entry
; - MyClass::myLoop@-1 (line 105)

0x00007fcd44eea022: add $0x30,%rsp
0x00007fcd44eea026: pop %rbp
0x00007fcd44eea027: test %eax,0xa025fd3(%rip) ; {poll_return}
0x00007fcd44eea02d: retq
0x00007fcd44eea02e: movabs $0x7fcca3c810a8,%rsi ; {oop(a 'java/lang/ArrayIndexOutOfBoundsException')}
0x00007fcd44eea038: movq $0x0,0x18(%rsi) ;*bastore
; - MyClass::myLoop@17 (line 106)

0x00007fcd44eea040: add $0x30,%rsp
0x00007fcd44eea044: pop %rbp
0x00007fcd44eea045: jmpq 0x7fcd44e529a0 ; {runtime_call}
0x00007fcd44eea04a: mov 0x40(%rsp),%rax
0x00007fcd44eea04f: jmp 0x7fcd44eea022
0x00007fcd44eea051: mov %edx,%ebp
0x00007fcd44eea053: mov %ecx,0x40(%rsp)
0x00007fcd44eea057: mov %r8d,0x44(%rsp)
0x00007fcd44eea05c: mov %r9d,(%rsp)
0x00007fcd44eea060: mov %edi,0x4(%rsp)
0x00007fcd44eea064: mov %rax,0x8(%rsp)
0x00007fcd44eea069: mov %esi,0x10(%rsp)
0x00007fcd44eea06d: mov $0xffffff86,%esi
0x00007fcd44eea072: nop
0x00007fcd44eea073: callq 0x7fcd44dea1a0 ; OopMap{[8]=Oop off=152}
;*aload
; - MyClass::myLoop@6 (line 106)
; {runtime_call}
0x00007fcd44eea078: callq 0x7fcd4dc47c50 ;*aload
; - MyClass::myLoop@6 (line 106)
; {runtime_call}
0x00007fcd44eea07d: hlt
0x00007fcd44eea07e: hlt
0x00007fcd44eea07f: hlt

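According to https://wiki.openjdk.java.net/display/HotSpot/LoopPredication, an optimization called "array range check elimination" removes the array range check inside the loop and adds a loop predicate before the loop instead. C2 does not seem to have applied this optimization to 'myLoop'. The loop's backward jump is at 0x7fcd44eea020 and targets 0x7fcd44eea000. Inside the loop, the range check is still performed at 0x7fcd44eea009-0x7fcd44eea00c.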

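  1. Why is the check still inside the loop?
  2. Why was the loop predication optimization not applied?
  3. How can I force all optimizations to be applied?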

Best answer

The explanation is on the same page:

From the above example, the requirements to perform loop predication for array range check elimination are that init, limit, offset and array a are loop invariants, and stride and scale are compile time constants.

In your example, scale and stride are not compile-time constants, so the optimization is not applied.
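To make the quoted requirement concrete, here is a hand-written sketch of the idea behind loop predication. It is not C2's actual transformation, only an illustration at the source level. The names PredicationSketch, fill, STRIDE and SCALE are hypothetical, and STRIDE and SCALE are fixed constants precisely because the wiki page asks for that. Because the index scale * i + offset then changes monotonically with i, checking the first and last index once before the loop covers every iteration.

```java
final class PredicationSketch {
    // Hypothetical constants standing in for "stride and scale are compile time constants".
    static final int STRIDE = 1;
    static final int SCALE = 2;

    static byte[] fill(int init, int limit, int offset, byte value, byte[] array) {
        if (init >= limit) {
            return array; // zero iterations, nothing to check
        }
        // Last value of i the loop actually reaches; long arithmetic avoids int overflow.
        long last = init + ((long) limit - 1 - init) / STRIDE * STRIDE;
        long firstIndex = (long) SCALE * init + offset;
        long lastIndex = (long) SCALE * last + offset;

        // The "predicate": with STRIDE > 0 and SCALE > 0 the index only grows,
        // so validating both ends once validates every iteration.
        // Assumption of this sketch: we throw up front. A JIT would instead
        // fall back to unoptimized code when its predicate fails.
        if (firstIndex < 0 || lastIndex >= array.length) {
            throw new ArrayIndexOutOfBoundsException(
                    "index range [" + firstIndex + ", " + lastIndex + "] exceeds length " + array.length);
        }

        for (int i = init; i < limit; i += STRIDE) {
            // Every index here is already known to be in range.
            array[SCALE * i + offset] = value;
        }
        return array;
    }
}
```

With a variable stride or scale, the last index and the direction in which the index moves cannot be derived this simply, which is consistent with the requirement quoted above.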

However, if you call this method with constant arguments, HotSpot will be able to eliminate the range check thanks to inlining and constant propagation.
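A minimal sketch of such a call site, assuming a hypothetical wrapper named fillUnitStride in a hypothetical class ConstantArgsDemo: stride and scale are passed as literals, so they can become compile-time constants once myLoop is inlined into the wrapper. Whether C2 actually inlines the call and drops the check depends on the VM version and its inlining heuristics, so the PrintAssembly output for the wrapper is what should be inspected to confirm it.

```java
final class ConstantArgsDemo {
    // Same loop as in the question.
    public static byte[] myLoop(int init, int limit, int stride, int scale,
                                int offset, byte value, byte[] array) {
        for (int i = init; i < limit; i += stride) {
            array[scale * i + offset] = value;
        }
        return array;
    }

    // Hypothetical wrapper: stride = 1 and scale = 1 are literals at the call site.
    public static byte[] fillUnitStride(int init, int limit, int offset,
                                        byte value, byte[] array) {
        return myLoop(init, limit, 1, 1, offset, value, array);
    }

    public static void main(String[] args) {
        byte[] array = new byte[1024];
        // Call the wrapper often enough that it, and not only myLoop, gets compiled.
        for (int n = 0; n < 20_000; n++) {
            fillUnitStride(0, array.length, 0, (byte) n, array);
        }
        System.out.println(array[0] + " " + array[array.length - 1]);
    }
}
```

Run it with the same diagnostic flags as in the question and look at the code generated for fillUnitStride rather than for the standalone myLoop, since the standalone version still receives stride and scale as ordinary parameters.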

Related question on Stack Overflow: https://stackoverflow.com/questions/45373242/
