python - Executing Numba-generated assembly

Reposted · Author: 行者123 · Updated: 2023-12-03 17:04:32

Through an odd turn of events I ended up in the following predicament: I used the Python code below to write Numba-generated assembly to a file:

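from numba import jit

@jit(nopython=True, nogil=True)
def six():
    return 6

six()  # trigger compilation; inspect_asm() is empty until the function has been compiled
with open("six.asm", "w") as f:
    for k, v in six.inspect_asm().items():
        f.write(v)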

The assembly was written to the file successfully, but I can't figure out how to execute it. I tried the following:

$ as -o six.o six.asm
$ ld six.o -o six.bin
$ chmod +x six.bin
$ ./six.bin

However, the link step fails with the following:
ld: warning: cannot find entry symbol _start; defaulting to 00000000004000f0
six.o: In function `cpython::__main__::six$241':
<string>:(.text+0x20): undefined reference to `PyArg_UnpackTuple'
<string>:(.text+0x47): undefined reference to `PyEval_SaveThread'
<string>:(.text+0x53): undefined reference to `PyEval_RestoreThread'
<string>:(.text+0x62): undefined reference to `PyLong_FromLongLong'
<string>:(.text+0x74): undefined reference to `PyExc_RuntimeError'
<string>:(.text+0x88): undefined reference to `PyErr_SetString'

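I suspect that Numba and/or the Python standard library need to be dynamically linked against the generated object file for it to run, but I'm not sure how that is done (if it can be done at all).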

I also tried the following, writing the intermediate LLVM code to a file instead of the assembly:

with open("six.ll", "w") as f:
    for k, v in six.inspect_llvm().items():
        f.write(v)

and then

$ lli six.ll

But this fails as well, with the following error:
'main' function not found in module.

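Update:

It turns out there is a utility that finds the relevant flags to pass to the ld command in order to dynamically link the Python standard library.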

$ python3-config --ldflags

which returns
-L/Users/rayan/anaconda3/lib/python3.7/config-3.7m-darwin -lpython3.7m -ldl -framework CoreFoundation 
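
For reference, roughly the same information can be read from within Python through the standard `sysconfig` module. The sketch below is only an approximation of `python3-config --ldflags`: the helper name `python_link_flags` is made up for illustration, and it assumes a POSIX-style CPython build where the `LIBDIR`, `LDVERSION`, `LIBS` and `SYSLIBS` config variables are populated (platform extras such as `-framework CoreFoundation` are not reproduced).

```python
import sysconfig


def python_link_flags():
    """Approximate `python3-config --ldflags` (hypothetical helper, POSIX builds assumed)."""
    flags = []

    # Directory that holds libpython, e.g. /usr/lib
    libdir = sysconfig.get_config_var("LIBDIR")
    if libdir:
        flags.append("-L" + libdir)

    # LDVERSION carries ABI suffixes such as "3.7m"; fall back to plain VERSION
    ldversion = sysconfig.get_config_var("LDVERSION") or sysconfig.get_config_var("VERSION")
    if ldversion:
        flags.append("-lpython" + ldversion)

    # Extra system libraries the interpreter itself was linked with (e.g. -ldl)
    for key in ("LIBS", "SYSLIBS"):
        extra = sysconfig.get_config_var(key)
        if extra:
            flags.extend(extra.split())

    return flags


if __name__ == "__main__":
    print(" ".join(python_link_flags()))
```

The output depends on the interpreter that runs the script, so it should be produced with the same Python that Numba is installed in.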

Running the following commands again, this time with the correct flags:

$ as -o six.o six.asm
$ ld six.o -o six.bin -L/Users/rayan/anaconda3/lib/python3.7/config-3.7m-darwin -lpython3.7m -ldl -framework CoreFoundation
$ chmod +x six.bin
$ ./six.bin

I now get
ld: warning: No version-min specified on command line
ld: entry point (_main) undefined. for inferred architecture x86_64

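I tried adding a _main label to the assembly file, but that did not seem to do anything. Any ideas on how to define the entry point?

Update 2:

Here is the assembly code, in case it is useful. The target function appears to be the one labeled _ZN8__main__7six$241E: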
    .text
.file "<string>"
.globl _ZN8__main__7six$241E
.p2align 4, 0x90
.type _ZN8__main__7six$241E,@function
_ZN8__main__7six$241E:
movq $6, (%rdi)
xorl %eax, %eax
retq
.Lfunc_end0:
.size _ZN8__main__7six$241E, .Lfunc_end0-_ZN8__main__7six$241E

.globl _ZN7cpython8__main__7six$241E
.p2align 4, 0x90
.type _ZN7cpython8__main__7six$241E,@function
_ZN7cpython8__main__7six$241E:
.cfi_startproc
pushq %rax
.cfi_def_cfa_offset 16
movq %rsi, %rdi
movabsq $.const.six, %rsi
movabsq $PyArg_UnpackTuple, %r8
xorl %edx, %edx
xorl %ecx, %ecx
xorl %eax, %eax
callq *%r8
testl %eax, %eax
je .LBB1_3
movabsq $_ZN08NumbaEnv8__main__7six$241E, %rax
cmpq $0, (%rax)
je .LBB1_2
movabsq $PyEval_SaveThread, %rax
callq *%rax
movabsq $PyEval_RestoreThread, %rcx
movq %rax, %rdi
callq *%rcx
movabsq $PyLong_FromLongLong, %rax
movl $6, %edi
popq %rcx
.cfi_def_cfa_offset 8
jmpq *%rax
.LBB1_2:
.cfi_def_cfa_offset 16
movabsq $PyExc_RuntimeError, %rdi
movabsq $".const.missing Environment", %rsi
movabsq $PyErr_SetString, %rax
callq *%rax
.LBB1_3:
xorl %eax, %eax
popq %rcx
.cfi_def_cfa_offset 8
retq
.Lfunc_end1:
.size _ZN7cpython8__main__7six$241E, .Lfunc_end1-_ZN7cpython8__main__7six$241E
.cfi_endproc

.globl cfunc._ZN8__main__7six$241E
.p2align 4, 0x90
.type cfunc._ZN8__main__7six$241E,@function
cfunc._ZN8__main__7six$241E:
movl $6, %eax
retq
.Lfunc_end2:
.size cfunc._ZN8__main__7six$241E, .Lfunc_end2-cfunc._ZN8__main__7six$241E

.type _ZN08NumbaEnv8__main__7six$241E,@object
.comm _ZN08NumbaEnv8__main__7six$241E,8,8
.type .const.six,@object
.section .rodata,"a",@progbits
.const.six:
.asciz "six"
.size .const.six, 4

.type ".const.missing Environment",@object
.p2align 4
".const.missing Environment":
.asciz "missing Environment"
.size ".const.missing Environment", 20


.section ".note.GNU-stack","",@progbits

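Best answer

After browsing [PyData.Numba]: Numba docs, plus some debugging and trial and error, I reached a conclusion: it seems you have strayed from the path of your quest (as was pointed out in the comments).
Numba converts Python code (functions) to machine code (for the obvious reason: speed). It does everything on the fly (converting, building, inserting into the running process); the programmer only needs to decorate the function with e.g. @numba.jit ([PyData.Numba]: Just-in-Time compilation).
The behavior you are encountering is correct. The Dispatcher object (the result of decorating the six function) only generates (assembly) code for the function itself (there is no main there, because the code executes in the current process, under the Python interpreter's main function). So it is normal for the linker to complain that there is no main symbol. It is like writing a C file that contains only: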

int six() {
    return 6;
}
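To get things working, you have to:
  1. Build the .asm file into an .o (object) file (done)
  2. Include the .o file from #1 in a library, which can be:
     • Static
     • Dynamic
     The library is then linked into the (final) executable. This step is optional, since you could use the .o file directly
  3. Build another file that defines main (and calls six, which I assume is the whole purpose) into an .o file. As I'm not very comfortable with assembly, I wrote it in C
  4. Link the two entities (from #2 (or #1) and #3) together

    As an alternative, you could take a look at [PyData.Numba]: Compiling code ahead of time, but bear in mind that it generates a Python (extension) module.
    Back to the current problem. I tested on Ubuntu 18.04 64bit.
    code00.py: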
    #!/usr/bin/env python

    import sys
    import math
    import numba


    @numba.jit(nopython=True, nogil=True)
    def six():
        return 6


    def main(*argv):
        six()  # Call the function(s), otherwise `inspect_asm()` would return empty dict
        speed_funcs = [
            (six, numba.int32()),
        ]
        for func, _ in speed_funcs:
            file_name_asm = "numba_{0:s}_{1:s}_{2:03d}_{3:02d}{4:02d}{5:02d}.asm".format(func.__name__, sys.platform, int(round(math.log2(sys.maxsize))) + 1, *sys.version_info[:3])
            asm = func.inspect_asm()
            print("Writing to {0:s}:".format(file_name_asm))
            with open(file_name_asm, "wb") as fout:
                for k, v in asm.items():
                    print(" {0:}".format(k))
                    fout.write(v.encode())


    if __name__ == "__main__":
        print("Python {0:s} {1:d}bit on {2:s}\n".format(" ".join(item.strip() for item in sys.version.split("\n")), 64 if sys.maxsize > 0x100000000 else 32, sys.platform))
        main(*sys.argv[1:])
        print("\nDone.")
    main00.c:
    #include <stdio.h>
    #include <dlfcn.h>

    //#define SYMBOL_SIX "_ZN8__main__7six$241E"
    #define SYMBOL_SIX "cfunc._ZN8__main__7six$241E"

    typedef int (*SixFuncPtr)();

    int main() {
        void *pMod = dlopen("./libnumba_six_linux.so", RTLD_LAZY);
        if (!pMod) {
            printf("Error (%s) loading module\n", dlerror());
            return -1;
        }
        SixFuncPtr pSixFunc = dlsym(pMod, SYMBOL_SIX);
        if (!pSixFunc)
        {
            printf("Error (%s) loading function\n", dlerror());
            dlclose(pMod);
            return -2;
        }
        printf("six() returned: %d\n", (*pSixFunc)());
        dlclose(pMod);
        return 0;
    }
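
For comparison, the same dlopen / dlsym lookup can be sketched from Python with `ctypes`, where `getattr` accepts any symbol string (dots and `$` included). This is an untested sketch against the library from this post: the helper `call_int_function` is hypothetical, the path and symbol name are the ones used above, and whether the .so loads in a given interpreter depends on its libpython3.7m dependency being resolvable.

```python
import ctypes
import os

# Path and symbol name as used in this post; the helper below is hypothetical.
LIB_PATH = "./libnumba_six_linux.so"
SYMBOL_SIX = "cfunc._ZN8__main__7six$241E"


def call_int_function(lib_path, symbol, *args):
    """Load a shared library, look up `symbol` and call it as a function returning int."""
    lib = ctypes.CDLL(lib_path)   # lib_path=None opens the current process on POSIX systems
    func = getattr(lib, symbol)   # plain string lookup, so the name need not be a valid identifier
    func.restype = ctypes.c_int
    return func(*args)


if __name__ == "__main__":
    if os.path.exists(LIB_PATH):
        print("six() returned:", call_int_function(LIB_PATH, SYMBOL_SIX))
    else:
        print("{0:s} not found, build the library first".format(LIB_PATH))
```

Passing `None` as the library path and a libc name such as `abs` is a quick way to check the helper on a POSIX system without building anything.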
    build.sh:
    CC=gcc

    LIB_BASE_NAME=numba_six_linux

    FLAG_LD_LIB_NUMBALINUX="-Wl,-L. -Wl,-l${LIB_BASE_NAME}"
    FLAG_LD_LIB_PYTHON="-Wl,-L/usr/lib/python3.7/config-3.7m-x86_64-linux-gnu -Wl,-lpython3.7m"

    rm -f *.asm *.o *.a *.so *.exe

    echo Generate .asm
    python3 code00.py

    echo Assemble
    as -o ${LIB_BASE_NAME}.o ${LIB_BASE_NAME}_064_030705.asm

    echo Link library
    LIB_NUMBA="./lib${LIB_BASE_NAME}.so"
    #ar -scr ${LIB_NUMBA} ${LIB_BASE_NAME}.o
    ${CC} -o ${LIB_NUMBA} -shared ${LIB_BASE_NAME}.o ${FLAG_LD_LIB_PYTHON}

    echo Dump library contents
    nm -S ${LIB_NUMBA}
    #objdump -t ${LIB_NUMBA}

    echo Compile and link executable
    ${CC} -o main00.exe main00.c -ldl

    echo Exit script
    Output:
    (py_venv_pc064_03.07.05_test0) [cfati@cfati-ubtu-18-064-00:~/Work/Dev/StackOverflow/q061678226]> ~/sopr.sh
    *** Set shorter prompt to better fit when pasted in StackOverflow (or other) pages ***

    [064bit prompt]>
    [064bit prompt]> ls
    build.sh code00.py main00.c
    [064bit prompt]>
    [064bit prompt]> ./build.sh
    Generate .asm
    Python 3.7.5 (default, Nov 7 2019, 10:50:52) [GCC 8.3.0] 64bit on linux

    Writing to numba_six_linux_064_030705.asm:
    ()

    Done.
    Assemble
    Link library
    Dump library contents
    0000000000201020 B __bss_start
    00000000000008b0 0000000000000006 T cfunc._ZN8__main__7six$241E
    0000000000201020 0000000000000001 b completed.7698
    00000000000008e0 0000000000000014 r .const.missing Environment
    00000000000008d0 0000000000000004 r .const.six
    w __cxa_finalize
    0000000000000730 t deregister_tm_clones
    00000000000007c0 t __do_global_dtors_aux
    0000000000200e58 t __do_global_dtors_aux_fini_array_entry
    0000000000201018 d __dso_handle
    0000000000200e60 d _DYNAMIC
    0000000000201020 D _edata
    0000000000201030 B _end
    00000000000008b8 T _fini
    0000000000000800 t frame_dummy
    0000000000200e50 t __frame_dummy_init_array_entry
    0000000000000990 r __FRAME_END__
    0000000000201000 d _GLOBAL_OFFSET_TABLE_
    w __gmon_start__
    00000000000008f4 r __GNU_EH_FRAME_HDR
    00000000000006f0 T _init
    w _ITM_deregisterTMCloneTable
    w _ITM_registerTMCloneTable
    U PyArg_UnpackTuple
    U PyErr_SetString
    U PyEval_RestoreThread
    U PyEval_SaveThread
    U PyExc_RuntimeError
    U PyLong_FromLongLong
    0000000000000770 t register_tm_clones
    0000000000201020 d __TMC_END__
    0000000000201028 0000000000000008 B _ZN08NumbaEnv8__main__7six$241E
    0000000000000820 0000000000000086 T _ZN7cpython8__main__7six$241E
    0000000000000810 000000000000000a T _ZN8__main__7six$241E
    Compile and link executable
    Exit script
    [064bit prompt]>
    [064bit prompt]> ls
    build.sh code00.py libnumba_six_linux.so main00.c main00.exe numba_six_linux_064_030705.asm numba_six_linux.o
    [064bit prompt]>
    [064bit prompt]> # Run the executable
    [064bit prompt]>
    [064bit prompt]> ./main00.exe
    six() returned: 6
    [064bit prompt]>
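
The `nm -S` listing above is easier to reason about once it is split by symbol type: `T` entries are functions defined in the library, `U` entries are undefined and must come from somewhere else (here, libpython). A minimal sketch of that split follows; the helper names are made up, and the sample text is a shortened copy of the listing above.

```python
# A few lines copied from the `nm -S` output in this post.
NM_SAMPLE = """\
0000000000201020 B __bss_start
00000000000008b0 0000000000000006 T cfunc._ZN8__main__7six$241E
00000000000008e0 0000000000000014 r .const.missing Environment
w __cxa_finalize
U PyArg_UnpackTuple
U PyLong_FromLongLong
0000000000201028 0000000000000008 B _ZN08NumbaEnv8__main__7six$241E
0000000000000820 0000000000000086 T _ZN7cpython8__main__7six$241E
0000000000000810 000000000000000a T _ZN8__main__7six$241E
"""


def parse_nm(text):
    """Return {symbol_name: type_letter} for `nm`-style lines (hypothetical helper)."""
    symbols = {}
    for line in text.splitlines():
        fields = line.split()
        # The type is the first one-character field; hex address/size columns come before it.
        type_idx = next((i for i, f in enumerate(fields) if len(f) == 1), None)
        if type_idx is None or type_idx + 1 >= len(fields):
            continue
        # Symbol names may contain spaces (".const.missing Environment"), so rejoin the rest.
        symbols[" ".join(fields[type_idx + 1:])] = fields[type_idx]
    return symbols


def names_of_type(symbols, kind):
    """Sorted names whose nm type letter equals `kind` (e.g. "T" or "U")."""
    return sorted(name for name, letter in symbols.items() if letter == kind)


if __name__ == "__main__":
    syms = parse_nm(NM_SAMPLE)
    print("defined functions:", names_of_type(syms, "T"))
    print("undefined (needed from other libraries):", names_of_type(syms, "U"))
```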

    Also posting (because it matters) numba_six_linux_064_030705.asm:
        .text
    .file "<string>"
    .globl _ZN8__main__7six$241E
    .p2align 4, 0x90
    .type _ZN8__main__7six$241E,@function
    _ZN8__main__7six$241E:
    movq $6, (%rdi)
    xorl %eax, %eax
    retq
    .Lfunc_end0:
    .size _ZN8__main__7six$241E, .Lfunc_end0-_ZN8__main__7six$241E

    .globl _ZN7cpython8__main__7six$241E
    .p2align 4, 0x90
    .type _ZN7cpython8__main__7six$241E,@function
    _ZN7cpython8__main__7six$241E:
    .cfi_startproc
    pushq %rax
    .cfi_def_cfa_offset 16
    movq %rsi, %rdi
    movabsq $.const.six, %rsi
    movabsq $PyArg_UnpackTuple, %r8
    xorl %edx, %edx
    xorl %ecx, %ecx
    xorl %eax, %eax
    callq *%r8
    testl %eax, %eax
    je .LBB1_3
    movabsq $_ZN08NumbaEnv8__main__7six$241E, %rax
    cmpq $0, (%rax)
    je .LBB1_2
    movabsq $PyEval_SaveThread, %rax
    callq *%rax
    movabsq $PyEval_RestoreThread, %rcx
    movq %rax, %rdi
    callq *%rcx
    movabsq $PyLong_FromLongLong, %rax
    movl $6, %edi
    popq %rcx
    .cfi_def_cfa_offset 8
    jmpq *%rax
    .LBB1_2:
    .cfi_def_cfa_offset 16
    movabsq $PyExc_RuntimeError, %rdi
    movabsq $".const.missing Environment", %rsi
    movabsq $PyErr_SetString, %rax
    callq *%rax
    .LBB1_3:
    xorl %eax, %eax
    popq %rcx
    .cfi_def_cfa_offset 8
    retq
    .Lfunc_end1:
    .size _ZN7cpython8__main__7six$241E, .Lfunc_end1-_ZN7cpython8__main__7six$241E
    .cfi_endproc

    .globl cfunc._ZN8__main__7six$241E
    .p2align 4, 0x90
    .type cfunc._ZN8__main__7six$241E,@function
    cfunc._ZN8__main__7six$241E:
    movl $6, %eax
    retq
    .Lfunc_end2:
    .size cfunc._ZN8__main__7six$241E, .Lfunc_end2-cfunc._ZN8__main__7six$241E

    .type _ZN08NumbaEnv8__main__7six$241E,@object
    .comm _ZN08NumbaEnv8__main__7six$241E,8,8
    .type .const.six,@object
    .section .rodata,"a",@progbits
    .const.six:
    .asciz "six"
    .size .const.six, 4

    .type ".const.missing Environment",@object
    .p2align 4
    ".const.missing Environment":
    .asciz "missing Environment"
    .size ".const.missing Environment", 20


    .section ".note.GNU-stack","",@progbits
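    Notes:
  • numba_six_linux_064_030705.asm (and everything derived from it) contains the code for the six function. Actually, there are a bunch of symbols (on OSX, you can also use the native otool -T), such as:
    1. cfunc._ZN8__main__7six$241E - the (C) function itself
    2. _ZN7cpython8__main__7six$241E - the Python wrapper:
       2.1. Performs the C <=> Python conversions (via Python API functions such as PyArg_UnpackTuple)
       2.2. Because of 2.1, it needs (depends on) libpython3.7m
       2.3. As a consequence, nopython=True has no effect in this case
    Also, the main part of these symbols does not refer to an executable entry point (a main function), but to the top-level namespace of a Python module (__main__). After all, this code is supposed to be run from Python
  • Because the name of the plain C function contains a dot (.), I couldn't call it directly from C (it is not a valid identifier), so I had to load the .so and look up the function manually (dlopen / dlsym), which takes more code than simply calling the function. I didn't try it, but I think the following (manual) changes to the generated .asm file would simplify things:
    • Renaming the plain C function (to something like __six, or any other valid C identifier that doesn't clash with another (explicit or internal) name) in the .asm file before assembling it would make the function directly callable from C
    • Removing the Python wrapper (#2) would also get rid of #2.2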
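
To make the naming in the notes above concrete: the symbols in this listing follow the length-prefixed nested-name pattern `_ZN<len><part>...E`, so they can be split mechanically. The sketch below handles only the symbols shown in this post; the function name is made up, and other Numba versions may mangle names differently.

```python
def split_numba_symbol(symbol):
    """Split `[prefix]_ZN<len><part>...E` into (prefix, [parts]) (hypothetical helper)."""
    prefix, marker, body = symbol.partition("_ZN")
    if not marker:
        raise ValueError("no _ZN marker in {0!r}".format(symbol))
    parts = []
    pos = 0
    while pos < len(body) and body[pos] != "E":
        # Read the decimal length prefix (it may have a leading zero, as in "08NumbaEnv").
        start = pos
        while pos < len(body) and body[pos].isdigit():
            pos += 1
        if pos == start:
            raise ValueError("expected a length prefix at offset {0:d}".format(pos))
        length = int(body[start:pos])
        parts.append(body[pos:pos + length])
        pos += length
    return prefix, parts


if __name__ == "__main__":
    for name in (
        "_ZN8__main__7six$241E",
        "_ZN7cpython8__main__7six$241E",
        "cfunc._ZN8__main__7six$241E",
        "_ZN08NumbaEnv8__main__7six$241E",
    ):
        print(name, "->", split_numba_symbol(name))
```

In each case the `__main__` component is the Python module name, which is why it has nothing to do with the `main` entry point the linker asks for.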


  • Update #0
    Thanks to @PeterCordes, who shared the exact piece of information I was missing ([GNU.GCC]: Controlling Names Used in Assembler Code), here is a much simpler version.
    main01.c:
    #include <stdio.h>

    extern int six() asm ("cfunc._ZN8__main__7six$241E");

    int main() {
        printf("six() returned: %d\n", six());
    }
    Output:
    [064bit prompt]> # Resume from previous point + main01.c
    [064bit prompt]>
    [064bit prompt]> ls
    build.sh code00.py libnumba_six_linux.so main00.c main00.exe main01.c numba_six_linux_064_030705.asm numba_six_linux.o
    [064bit prompt]>
    [064bit prompt]> ar -scr libnumba_six_linux.a numba_six_linux.o
    [064bit prompt]>
    [064bit prompt]> gcc -o main01.exe main01.c ./libnumba_six_linux.a -Wl,-L/usr/lib/python3.7/config-3.7m-x86_64-linux-gnu -Wl,-lpython3.7m
    [064bit prompt]>
    [064bit prompt]> ls
    build.sh code00.py libnumba_six_linux.a libnumba_six_linux.so main00.c main00.exe main01.c main01.exe numba_six_linux_064_030705.asm numba_six_linux.o
    [064bit prompt]>
    [064bit prompt]> ./main01.exe
    six() returned: 6
    [064bit prompt]>

    Regarding python - Executing Numba-generated assembly, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/61678226/
