
c - ELF binary analysis, static vs. dynamic: how does the memory mapping of the assembly code / instructions change?

Reposted. Author: 行者123. Updated: 2023-12-01 13:16:33

./hello is a simple echo program written in C.
According to the file header reported by objdump,

$ objdump -f ./hello

./hello: file format elf32-i386
architecture: i386, flags 0x00000150:
HAS_SYMS, DYNAMIC, D_PAGED
start address 0x00000430

the start address of ./hello is 0x430.

Now load this binary in gdb.

(gdb) file ./hello
Reading symbols from ./hello...(no debugging symbols found)...done.
(gdb) x/x _start
0x430 <_start>: 0x895eed31
(gdb) break _start
Breakpoint 1 at 0x430
(gdb) run
Starting program: /1/vooks/cdac/ditiss/proj/binaries/temp/hello

Breakpoint 1, 0x00400430 in _start ()
(gdb) x/x _start
0x400430 <_start>: 0x895eed31
(gdb)

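In the output above, before the breakpoint is hit and the binary is running, the address of _start is 0x430. Once the program is running, the address becomes 0x400430.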

$ readelf -l ./hello | grep LOAD

LOAD 0x000000 0x00000000 0x00000000 0x007b4 0x007b4 R E 0x1000
LOAD 0x000eec 0x00001eec 0x00001eec 0x00130 0x00134 RW 0x1000

How does this mapping happen?

Please help.

Best answer

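Basically, after linking, the ELF file format gives the loader all the information it needs to load the program into memory and run it.

Every piece of code and data is placed at an offset within a segment (data segment, text segment, and so on), and a particular function or global variable is accessed by adding the appropriate offset to the segment's start address.

The ELF file format also includes a program header table: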

An executable or shared object file's program header table is an array of structures, each describing a segment or other information that the system needs to prepare the program for execution. An object file segment contains one or more sections, as described in "Segment Contents".

The operating system's loader then uses these structures to load the image into memory. The structure:

typedef struct {
    Elf32_Word p_type;
    Elf32_Off  p_offset;
    Elf32_Addr p_vaddr;
    Elf32_Addr p_paddr;
    Elf32_Word p_filesz;
    Elf32_Word p_memsz;
    Elf32_Word p_flags;
    Elf32_Word p_align;
} Elf32_Phdr;

Note the following fields:

p_vaddr

The virtual address at which the first byte of the segment resides in memory

p_offset

The offset from the beginning of the file at which the first byte of the segment resides.

p_type

The kind of segment this array element describes or how to interpret the array element's information. Type values and their meanings are specified in Table 7-35.

In Table 7-35, note PT_LOAD:

Specifies a loadable segment, described by p_filesz and p_memsz. The bytes from the file are mapped to the beginning of the memory segment. If the segment's memory size (p_memsz) is larger than the file size (p_filesz), the extra bytes are defined to hold the value 0 and to follow the segment's initialized area. The file size can not be larger than the memory size. Loadable segment entries in the program header table appear in ascending order, sorted on the p_vaddr member.

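So, by looking at these fields (and others), the loader can locate the segments in the ELF file (each segment may contain several sections) and load the PT_LOAD ones into memory at the given virtual address.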

Now, can the virtual address of an ELF file's segment change at run time (load time)? Yes:

The virtual addresses in the program headers might not represent the actual virtual addresses of the program's memory image. See "Program Loading (Processor-Specific)".

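So the program headers describe the segments the OS loader will load into memory (loadable segments, which contain the loadable sections), but the virtual addresses at which the loader places them may differ from the addresses recorded in the ELF file.

How?

To understand this, first read about the Base Address: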

Executable and shared object files have a base address, which is the lowest virtual address associated with the memory image of the program's object file. One use of the base address is to relocate the memory image of the program during dynamic linking.

An executable or shared object file's base address is calculated during execution from three values: the memory load address, the maximum page size, and the lowest virtual address of a program's loadable segment. The virtual addresses in the program headers might not represent the actual virtual addresses of the program's memory image. See "Program Loading (Processor-Specific)".

In practice this is done with:

position-independent code. This code enables a segment's virtual address change from one process to another, without invalidating execution behavior.

Though the system chooses virtual addresses for individual processes, it maintains the relative positions of the segments. Because position-independent code uses relative addressing between segments, the difference between virtual addresses in memory must match the difference between virtual addresses in the file.

So, by using relative addressing (PIE, position-independent executable), the actual location can differ from the addresses in the ELF file.

From Peter Cordes's answer:

0x400000 is the Linux default base address for loading PIE executables with ASLR disabled (like GDB does by default).

So in your particular case (a PIE executable on Linux), the loader picks this base address.

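Of course, position independence is only an option. A program can also be compiled without it, in which case absolute addressing is used, and the segment addresses in the ELF file cannot differ from the memory addresses at which the segments are actually loaded: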

Executable file segments typically contain absolute code. For the process to execute correctly, the segments must reside at the virtual addresses used to create the executable file. The system uses the p_vaddr values unchanged as virtual addresses.

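I suggest you look at the Linux implementation of ELF image loading (here), as well as these two SO threads (here, here).

The quoted paragraphs are taken from the Oracle ELF documentation (here, here).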

The original question, "c - ELF binary analysis, static vs. dynamic: how does the memory mapping of the assembly code / instructions change?", is on Stack Overflow: https://stackoverflow.com/questions/54295129/
