Storing a struct as a blob of data breaks with some optimization passes

Reposted. Author: bug小助手. Updated: 2023-10-25 14:11:59

I'm writing a compiler with LLVM as its backend. Recently I turned on optimizations and saw my programs break in strange ways. I managed to boil things down to a minimal program and a set of optimization passes that reproduce the problem. Here's the code:


define i1 @main() {
entry:
; Allocate space on the stack for a record
%record = alloca { i8, i64, float }, align 8

; Store 1.0 in the third field of the record
%value = getelementptr inbounds { i8, i64, float }, { i8, i64, float }* %record, i32 0, i32 2
store float 1.000000e+00, float* %value, align 4

; Cast the record's location as a [3 x i64] blob, and load it
%tmpptr = bitcast { i8, i64, float }* %record to [3 x i64]*
%tmp = load [3 x i64], [3 x i64]* %tmpptr, align 4

; Store that blob on the stack
%blob = alloca [3 x i64], align 8
store [3 x i64] %tmp, [3 x i64]* %blob, align 4

; Load `value` in blob (the third field)
%record2 = bitcast [3 x i64]* %blob to { i8, i64, float }*
%value2ptr = getelementptr inbounds { i8, i64, float }, { i8, i64, float }* %record2, i32 0, i32 2
%value2 = load float, float* %value2ptr, align 4

; Check that value is `1.0`
%eq = fcmp oeq float %value2, 1.000000e+00

; Without optimization, returns 1 (true). With optimization, returns 0 (false)
ret i1 %eq
}

And here are the commands I run to execute it, without and with optimizations:


$ cat program.ll | opt | llc --filetype=obj -o /tmp/program.o && clang /tmp/program.o -o a.out && ./a.out
$ echo $?
1
$ cat program.ll | opt --instcombine --gvn | llc --filetype=obj -o /tmp/program.o && clang /tmp/program.o -o a.out && ./a.out
$ echo $?
0

Also, here's the version of LLVM I use:


$ llc --version
LLVM (http://llvm.org/):
LLVM version 14.0.6
Optimized build.
Default target: x86_64-unknown-linux-gnu
Host CPU: tigerlake

...

It seems like the part that breaks is loading then storing the blob of data on the stack, then reading the value from there. If I only do two bitcasts in a row, without store/load, the problem vanishes.


Note that I left the first two fields of the record undefined for terseness, but if you do write data to them, the problem remains. Also, if you remove either the i8 or the i64 field, the problem disappears.


I also noticed that if I manually pad the structure as { i8, i8, i16, i32, i64, float, i32 }, the problem also disappears. I don't understand why, however, because I computed the size of the array to be [3 x i64] based on the store size of the struct, which LLVM tells me is 24 bytes on my machine.
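
One way to check where a given tool thinks the float lives is to ask it for the field offset and the struct size directly. The sketch below uses the usual "GEP from null" idiom; the names `%record_t`, `@float_offset` and `@record_size` are made up for illustration and are not part of the original program.

```llvm
; Hypothetical helper module, written in the typed-pointer syntax of LLVM 14.
%record_t = type { i8, i64, float }

; Byte offset of the third field (the float) within the struct.
define i64 @float_offset() {
entry:
  %p = getelementptr %record_t, %record_t* null, i32 0, i32 2
  %off = ptrtoint float* %p to i64
  ret i64 %off
}

; Allocation size of the struct: the address of element 1 of a
; hypothetical array of records that starts at address 0.
define i64 @record_size() {
entry:
  %p = getelementptr %record_t, %record_t* null, i32 1
  %size = ptrtoint %record_t* %p to i64
  ret i64 %size
}
```

Running `opt --instcombine -S` on this file should fold both functions to constants, computed with whatever data layout `opt` has in effect for the module. Under an x86_64 layout, where `i64` is 8-byte aligned, I would expect 16 and 24, which matches a `[3 x i64]` blob with the float in its third element. If `opt` prints a different pair of numbers, then `opt` and the code generator disagree about the struct layout.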


Looking at the IR generated by the optimization passes, it seems to store the float value in the 2nd i64 slot of the array instead of the 3rd. I cannot understand why. I imagine something in this code is undefined behavior, but I have no idea what.
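
A possible explanation that fits these symptoms, offered as a hypothesis rather than a confirmed diagnosis: `program.ll` has no `target datalayout` line, so `opt` falls back to LLVM's default data layout, in which `i64` has only 4-byte ABI alignment (`i64:32:64` in the LangRef). Under that layout the struct is laid out as i8 at offset 0, i64 at offset 4 and float at offset 12, which falls inside the 2nd i64 of the blob. `llc` then lowers the module for x86_64, where the float sits at offset 16. It would also explain why the manually padded struct behaves: its layout is the same under both sets of rules. The sketch below pins the layout in the module. The datalayout string is what I would expect for x86_64 Linux with LLVM 14, but treat it as an assumption and copy the real one from your own toolchain.

```llvm
; Assumed values for x86_64 Linux on LLVM 14; verify against your toolchain.
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define i1 @main() {
entry:
  ; Same program as in the question, condensed.
  %record = alloca { i8, i64, float }, align 8
  %value = getelementptr inbounds { i8, i64, float }, { i8, i64, float }* %record, i32 0, i32 2
  store float 1.000000e+00, float* %value, align 4

  ; Round-trip the record through a [3 x i64] blob.
  %tmpptr = bitcast { i8, i64, float }* %record to [3 x i64]*
  %tmp = load [3 x i64], [3 x i64]* %tmpptr, align 4
  %blob = alloca [3 x i64], align 8
  store [3 x i64] %tmp, [3 x i64]* %blob, align 4

  ; Read the float back out of the blob and compare it with 1.0.
  %record2 = bitcast [3 x i64]* %blob to { i8, i64, float }*
  %value2ptr = getelementptr inbounds { i8, i64, float }, { i8, i64, float }* %record2, i32 0, i32 2
  %value2 = load float, float* %value2ptr, align 4
  %eq = fcmp oeq float %value2, 1.000000e+00
  ret i1 %eq
}
```

In a compiler that builds the module through the C++ API, the usual equivalent is to call `Module::setDataLayout` with the result of `TargetMachine::createDataLayout()`, and `Module::setTargetTriple`, before running any passes.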


Comments

Haven't the time to really look at this, but in general, running the module verifier points to almost all such problems. It leaves the solution as an exercise for the readers, though ;)

Thanks for the tip! Unfortunately, running the module verifier yields no results :/

Shocking. Try -print-after-all to see which pass does the bad change, perhaps? I'll try to have a look when I'm back from vacation on Tuesday. No promises.

Thank you. It seems like the --instcombine pass is causing issues, essentially by inferring that the 3rd i64 in the array is unused, and that the float is in the 2nd i64. I don't understand why, which troubles me. For my own purposes, I managed to fix this in my compiler by replacing the store with a memcpy. I'm still however really curious to know what the problem might be, if you end up finding the time for this.
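
For readers who want to try the same workaround, here is a minimal sketch of what replacing the aggregate load/store with a memcpy might look like. It is a guess at the shape of the change, not the asker's actual code. The hard-coded 24 is the store size reported in the question, and the syntax assumes LLVM 14's typed pointers.

```llvm
; Sketch only: the question's program, with the [3 x i64] load/store pair
; replaced by a raw 24-byte copy.
declare void @llvm.memcpy.p0i8.p0i8.i64(i8* noalias nocapture writeonly, i8* noalias nocapture readonly, i64, i1 immarg)

define i1 @main() {
entry:
  %record = alloca { i8, i64, float }, align 8
  %value = getelementptr inbounds { i8, i64, float }, { i8, i64, float }* %record, i32 0, i32 2
  store float 1.000000e+00, float* %value, align 4

  ; Copy the record's bytes into the blob without going through an
  ; aggregate-typed SSA value.
  %blob = alloca [3 x i64], align 8
  %src = bitcast { i8, i64, float }* %record to i8*
  %dst = bitcast [3 x i64]* %blob to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 %dst, i8* align 8 %src, i64 24, i1 false)

  ; Read the float back out of the blob, as before.
  %record2 = bitcast [3 x i64]* %blob to { i8, i64, float }*
  %value2ptr = getelementptr inbounds { i8, i64, float }, { i8, i64, float }* %record2, i32 0, i32 2
  %value2 = load float, float* %value2ptr, align 4
  %eq = fcmp oeq float %value2, 1.000000e+00
  ret i1 %eq
}
```

The copy is expressed in bytes rather than as three i64 elements, so the optimizer has no per-element view of the blob to reason about.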
