Can I mmap a file with a length greater than the file size?

Reposted by: 行者123. Updated: 2023-11-30 14:34:16

void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);

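I don't understand how mmap works when the MAP_PRIVATE flag is used. Can I pass mmap a length greater than the size of the file fd? After doing so, can I write and read the memory that exceeds the size of the file but is within length?

I'm writing some code that computes the MD5 of a file. Rather than using the standard library stream functions, I decided to write functions that operate on the data only as a void* and a size_t len. Previously I used malloc and copied the file into malloc'ed memory before working on it, but that turned out to be quite slow for large files, and rather silly once I discovered mmap.

The problem I'm dealing with is that, before the MD5 of any data is computed, some padding and length information is appended to the data that will be hashed. With the earlier malloc solution, I simply computed how much data needed to be appended, then called realloc and wrote it. Now I precompute how much data needs to be appended and pass the increased length to mmap. On small files this works fine, but on large files, trying to write to addresses beyond the file size causes a segfault.

This is what I'm trying to do: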

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#include <sys/mman.h>
#include <sys/stat.h>


// The Data + length struct
struct data{
        void* s;
        size_t len;
};

//mmap on opened file descriptor into a data struct
struct data* data_ffile(int fd)
{
        struct data* ret = malloc(sizeof(struct data));

        //Get the length of the file
        struct stat desc;
        fstat(fd, &desc);
        ret->len = (size_t)desc.st_size;

        //Calculate the length after appending
        size_t new_len =  ret->len + 1;
        if((new_len % 64) > 56)
                new_len += (64 * 2) - (new_len % 64);
        else if((new_len % 64) <= 56)
                new_len += 64 - (new_len % 64);

        //Map the file with the increased length
        ret->s = mmap(NULL, new_len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE, fd, 0);

        if(ret->s == MAP_FAILED) exit(-1);

        return ret;
}

//Append a character to the mmap'ed data
void data_addchar(struct data* w, unsigned char c)
{
        ((char*)w->s)[w->len++] = c;
        return;
}

void md5_append(struct data* md)
{
        data_addchar(md, 0x80);

        while((md->len % 64) != 56){
                data_addchar(md, (char)0);
        }
}

int main(int argc, char** argv)
{
        int fd = open(argv[1], O_RDONLY);
        struct data* in = data_ffile(fd);
        close(fd);

        md5_append(in);
}

Do I have a fundamental misunderstanding of mmap?

Best answer

Can I pass a length greater than the size of file fd to mmap? After doing so, can I write and read the memory that exceeds the size of the file but is within length?

This is all documented in the mmap POSIX specification:

The system shall always zero-fill any partial page at the end of an object. Further, the system shall never write out any modified portions of the last page of an object which are beyond its end. References within the address range starting at pa and continuing for len bytes to whole pages following the end of an object shall result in delivery of a SIGBUS signal.
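As a rough illustration of the rule quoted above, the bytes past end-of-file that can still be touched are simply the remainder of the last page. The helper name slack_after_eof is hypothetical (it is not part of any library), and the sketch assumes the mapping starts at file offset 0 and that the caller supplies the page size, for example from sysconf(_SC_PAGESIZE).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of bytes past end-of-file that still lie
 * inside the last (partial) page of a mapping that starts at offset 0.
 * Those bytes are zero-filled and may be accessed; references to whole
 * pages after them are the ones POSIX says deliver SIGBUS. */
size_t slack_after_eof(size_t file_size, size_t page_size)
{
    assert(page_size > 0);

    size_t rem = file_size % page_size;

    /* A file that ends exactly on a page boundary (or is empty)
     * has no partial page and therefore no slack at all. */
    return rem == 0 ? 0 : page_size - rem;
}
```

With 4096-byte pages, a 100-byte file leaves 3996 usable bytes after its end, while a 4096-byte file leaves none. MD5 padding needs between 9 and 72 bytes, so whether it happens to fit probably depends on the file size modulo the page size rather than on the file being large or small.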

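So yes, you can mmap a length greater than the size of the file, but accessing any page beyond the end of the file, other than the last (possibly partial) page, results in SIGBUS.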
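One way to avoid writing past the end of the mapping at all is to map the file at exactly its own size, hash the whole 64-byte blocks straight from the mapping, and build the padded final block(s) in a small separate buffer. The sketch below shows only that last step. The function name md5_pad_tail and the 128-byte tail buffer are assumptions made for illustration, not something from the question or the answer; the padding layout (a 0x80 byte, zeros, then the 64-bit little-endian bit length) is the standard MD5 one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the trailing partial block of data into tail,
 * apply MD5 padding there, and return how many bytes of tail are in use
 * (64 or 128). The mapped file itself is never written to. */
size_t md5_pad_tail(const unsigned char *data, size_t len,
                    unsigned char tail[128])
{
    assert(tail != NULL);
    assert(data != NULL || len == 0);

    size_t rem = len % 64;                    /* bytes in the last partial block */
    size_t tail_len = (rem < 56) ? 64 : 128;  /* need room for 0x80 + 8 length bytes */
    uint64_t bit_len = (uint64_t)len * 8;     /* MD5 stores the length in bits */

    memset(tail, 0, 128);
    if (rem > 0)
        memcpy(tail, data + (len - rem), rem);
    tail[rem] = 0x80;                         /* mandatory first padding byte */

    /* Message length in bits, little-endian, in the last 8 bytes. */
    for (int i = 0; i < 8; i++)
        tail[tail_len - 8 + i] = (unsigned char)(bit_len >> (8 * i));

    return tail_len;
}
```

A caller would then process the first len - (len % 64) bytes directly from the mapping (which can be PROT_READ with a length of st_size), followed by the returned number of bytes from tail. This keeps the benefit of not copying the whole file while staying inside the mapped object.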

For "Can I mmap a file with a length greater than the file size?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59024621/
