c - ELF binary analysis, static vs. dynamic: how does the memory mapping of assembly code / instructions change?

Reposted by: 行者123. Updated: 2023-12-01 13:16:33

./hello is a simple echo program written in C.
According to the file header reported by objdump:

$ objdump -f ./hello

./hello:     file format elf32-i386
architecture: i386, flags 0x00000150:
HAS_SYMS, DYNAMIC, D_PAGED
start address 0x00000430

the start address of ./hello is 0x430.

Now load the binary in gdb:

(gdb) file ./hello
Reading symbols from ./hello...(no debugging symbols found)...done.
(gdb) x/x _start
0x430 <_start>: 0x895eed31
(gdb) break _start
Breakpoint 1 at 0x430
(gdb) run
Starting program: /1/vooks/cdac/ditiss/proj/binaries/temp/hello 

Breakpoint 1, 0x00400430 in _start ()
(gdb) x/x _start
0x400430 <_start>:  0x895eed31
(gdb)

In the output above, before the breakpoint is hit and the binary is running, _start is at 0x430; once the program is running, the address becomes 0x400430.

$ readelf -l ./hello | grep LOAD

 LOAD           0x000000 0x00000000 0x00000000 0x007b4 0x007b4 R E 0x1000
 LOAD           0x000eec 0x00001eec 0x00001eec 0x00130 0x00134 RW  0x1000

How does this mapping happen?

Please help.

Best answer

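Basically, after linking, the ELF file format gives the loader all the information it needs to load the program into memory and run it.

Each piece of code and data is placed at an offset within a section (the data section, the text section, and so on), and a particular function or global variable is accessed by adding the appropriate offset to that section's start address.

The ELF file format also includes a program header table: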

An executable or shared object file's program header table is an array of structures, each describing a segment or other information that the system needs to prepare the program for execution. An object file segment contains one or more sections, as described in "Segment Contents".

The operating system's loader then uses these structures to load the image into memory. The structure is:

typedef struct {
        Elf32_Word      p_type;
        Elf32_Off       p_offset;
        Elf32_Addr      p_vaddr;
        Elf32_Addr      p_paddr;
        Elf32_Word      p_filesz;
        Elf32_Word      p_memsz;
        Elf32_Word      p_flags;
        Elf32_Word      p_align;
} Elf32_Phdr;

Note the following fields:

p_vaddr

The virtual address at which the first byte of the segment resides in memory

p_offset

The offset from the beginning of the file at which the first byte of the segment resides.

and p_type

The kind of segment this array element describes or how to interpret the array element's information. Type values and their meanings are specified in Table 7-35.

In Table 7-35, note PT_LOAD:

Specifies a loadable segment, described by p_filesz and p_memsz. The bytes from the file are mapped to the beginning of the memory segment. If the segment's memory size (p_memsz) is larger than the file size (p_filesz), the extra bytes are defined to hold the value 0 and to follow the segment's initialized area. The file size can not be larger than the memory size. Loadable segment entries in the program header table appear in ascending order, sorted on the p_vaddr member.

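So by examining these fields (and others), the loader can locate the segments in the ELF file (each segment can contain several sections) and load them (PT_LOAD) into memory at the given virtual addresses.

Now, can the virtual addresses of an ELF file's segments change at run time (load time)? Yes: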
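As a rough illustration of what the loader looks at, here is a minimal sketch that walks a program header table held in memory and reports its PT_LOAD entries. The table `hello_phdrs` is typed in by hand from the two LOAD lines of the question's readelf output, and the function names are made up for this example; a real loader reads the table from the file at `e_phoff` and then maps the segments, which this sketch does not do.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdio.h>

/* The two LOAD lines from the question's readelf output, typed in by hand.
 * Hypothetical name; a real loader reads these entries from the file. */
const Elf32_Phdr hello_phdrs[2] = {
    { .p_type = PT_LOAD, .p_offset = 0x000000, .p_vaddr = 0x00000000,
      .p_paddr = 0x00000000, .p_filesz = 0x007b4, .p_memsz = 0x007b4,
      .p_flags = PF_R | PF_X, .p_align = 0x1000 },
    { .p_type = PT_LOAD, .p_offset = 0x000eec, .p_vaddr = 0x00001eec,
      .p_paddr = 0x00001eec, .p_filesz = 0x00130, .p_memsz = 0x00134,
      .p_flags = PF_R | PF_W, .p_align = 0x1000 },
};

/* Bytes that must be zero-filled after the file-backed part of a segment. */
Elf32_Word zero_fill_bytes(const Elf32_Phdr *ph)
{
    assert(ph != NULL && ph->p_filesz <= ph->p_memsz);
    return ph->p_memsz - ph->p_filesz;
}

/* Print each PT_LOAD entry and return how many were found. */
size_t print_load_segments(const Elf32_Phdr *phdr, size_t phnum)
{
    size_t loads = 0;
    for (size_t i = 0; i < phnum; i++) {
        if (phdr[i].p_type != PT_LOAD)
            continue; /* only PT_LOAD entries describe loadable segments */
        printf("LOAD offset=0x%06x vaddr=0x%08x filesz=0x%05x memsz=0x%05x zero-fill=%u\n",
               (unsigned)phdr[i].p_offset, (unsigned)phdr[i].p_vaddr,
               (unsigned)phdr[i].p_filesz, (unsigned)phdr[i].p_memsz,
               (unsigned)zero_fill_bytes(&phdr[i]));
        loads++;
    }
    return loads;
}
```

Calling `print_load_segments(hello_phdrs, 2)` from a `main` lists both segments. For the second one, p_memsz (0x134) exceeds p_filesz (0x130) by 4 bytes, which are the zero-filled bytes the PT_LOAD description mentions.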

The virtual addresses in the program headers might not represent the actual virtual addresses of the program's memory image. See "Program Loading (Processor-Specific)".

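So the program headers describe the segments that the OS loader will load into memory (loadable segments, which contain loadable sections), but the virtual addresses at which the loader places them may differ from the addresses in the ELF file.

How?

To understand this, first read about the Base Address: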
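One way to observe this from inside a running process on Linux with glibc is `dl_iterate_phdr`, which reports each loaded object's base (`dlpi_addr`) together with its program headers. The sketch below is only an illustration, not part of the original answer: `show_object` and `count_loaded_objects` are names invented here, and the addresses it prints depend on how the binary was built and whether ASLR is enabled.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <link.h>
#include <stdio.h>

/* Callback run once per loaded object (the main program comes first). */
static int show_object(struct dl_phdr_info *info, size_t size, void *data)
{
    (void)size;
    int *objects = data;
    assert(objects != NULL);
    for (int i = 0; i < info->dlpi_phnum; i++) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
        if (ph->p_type != PT_LOAD)
            continue;
        /* Run-time address = object's base + p_vaddr from the file. */
        printf("%s: file vaddr 0x%lx -> run-time 0x%lx\n",
               info->dlpi_name[0] ? info->dlpi_name : "(main program)",
               (unsigned long)ph->p_vaddr,
               (unsigned long)(info->dlpi_addr + ph->p_vaddr));
    }
    (*objects)++;
    return 0; /* returning zero continues the iteration */
}

/* Hypothetical helper: print all PT_LOAD mappings, return object count. */
int count_loaded_objects(void)
{
    int objects = 0;
    dl_iterate_phdr(show_object, &objects);
    return objects;
}
```

For a PIE build the main program's base should be nonzero, while for a non-PIE build it is expected to be 0, so the file and run-time addresses coincide.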

Executable and shared object files have a base address, which is the lowest virtual address associated with the memory image of the program's object file. One use of the base address is to relocate the memory image of the program during dynamic linking.

An executable or shared object file's base address is calculated during execution from three values: the memory load address, the maximum page size, and the lowest virtual address of a program's loadable segment. The virtual addresses in the program headers might not represent the actual virtual addresses of the program's memory image. See "Program Loading (Processor-Specific)".
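The quote names three inputs but not the formula. A minimal sketch of one common reading follows: truncate both the memory load address and the lowest PT_LOAD p_vaddr to a page boundary, then take the difference. Treat the exact formula as an assumption to check against the linked documentation; the function names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Round an address down to a multiple of page_size (a power of two). */
uint32_t page_truncate(uint32_t addr, uint32_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return addr & ~(page_size - 1);
}

/* Assumed formula: truncated load address minus truncated lowest p_vaddr. */
uint32_t base_address(uint32_t load_addr, uint32_t lowest_vaddr,
                      uint32_t page_size)
{
    return page_truncate(load_addr, page_size)
         - page_truncate(lowest_vaddr, page_size);
}
```

With the question's values (first LOAD segment at p_vaddr 0, alignment 0x1000, image observed at 0x400000), `base_address(0x400000, 0x0, 0x1000)` evaluates to 0x400000.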

So the practice is as follows:

position-independent code. This code enables a segment's virtual address change from one process to another, without invalidating execution behavior.

Though the system chooses virtual addresses for individual processes, it maintains the relative positions of the segments. Because position-independent code uses relative addressing between segments, the difference between virtual addresses in memory must match the difference between virtual addresses in the file.

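Therefore, by using relative addressing (PIE, a position-independent executable), the actual location may differ from the address in the ELF file.

From Peter Cordes's answer: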

0x400000 is the Linux default base address for loading PIE executables with ASLR disabled (like GDB does by default).

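So in your specific case (a PIE executable on Linux), the loader chooses this base address.

Of course, position independence is only an option. A program can also be compiled without it, in which case absolute addressing is used and there can be no difference between the segment addresses in the ELF file and the memory addresses at which the segments are actually loaded: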
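The arithmetic behind the gdb output can be written out as a small sketch. It assumes the loader simply adds one base to every address in the file, with 0x400000 taken from the quote above and 0x430 and 0x1eec taken from the objdump and readelf output in the question; `runtime_address` is a name made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Run-time address of something whose address in the ELF file is file_vaddr,
 * assuming the whole image was shifted by load_base. */
uint32_t runtime_address(uint32_t load_base, uint32_t file_vaddr)
{
    assert(file_vaddr <= UINT32_MAX - load_base); /* no 32-bit wrap-around */
    return load_base + file_vaddr;
}
```

So `runtime_address(0x400000, 0x430)` gives 0x400430, the address gdb shows for _start after `run`, and the second LOAD segment would start at `runtime_address(0x400000, 0x1eec)`, that is 0x401eec. The distance between the two segments stays 0x1eec, as in the file.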

Executable file segments typically contain absolute code. For the process to execute correctly, the segments must reside at the virtual addresses used to create the executable file. The system uses the p_vaddr values unchanged as virtual addresses.

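I suggest looking at the Linux implementation of ELF image loading here, and at these two SO threads here and here.

The quoted paragraphs are taken from the Oracle ELF documentation (here and here).

Regarding "c - ELF binary analysis, static vs. dynamic: how does the memory mapping of assembly code / instructions change?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/54295129/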
