
linux - Minimizing copies when writing large data to a socket

Reposted. Author: IT王子. Updated: 2023-10-29 00:14:13

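I am writing an application server that processes images (large data). I am trying to minimize copies when sending image data back to the client. The processed images I need to send are in buffers obtained from jemalloc. The ways I have thought of for sending the data back to the client are:

1) A simple write call.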

// Allocate buffer buf.
// Store image data in this buffer.
write(socket, buf, len);
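
A single write call may transfer fewer bytes than requested, which matters for large buffers such as images. The following is a minimal sketch of option (1) that loops until everything is written; `write_all` is a hypothetical helper name, and the sketch assumes a blocking descriptor.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write exactly len bytes from buf to fd.
 * Assumes fd is in blocking mode. Returns 0 on success, -1 on error
 * (errno is left as set by write). */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    assert(buf != NULL || len == 0);
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing anything; retry */
            return -1;
        }
        p += n;                 /* skip the bytes the kernel accepted */
        len -= (size_t)n;
    }
    return 0;
}
```

Usage would look like `write_all(socket, buf, len)` in place of the bare write call above.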

2) I obtain the buffer through mmap instead of jemalloc, though I assume jemalloc already creates its buffers using mmap. I then make a simple call to write.

buf = mmap(file, len);  // Imagine proper options.
// Store image data in this buffer.
write(socket, buf, len);

3) I obtain the buffer through mmap as before. I then use sendfile to send the data:

buf = mmap(in_fd, len);  // Imagine proper options.
// Store image data in this buffer.
int rc;
rc = sendfile(out_fd, in_fd, &offset, count);  // in_fd is the descriptor that was mapped above.
// Deal with rc.
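
To make option (3) concrete, here is a hedged sketch on Linux with the "proper options" spelled out as one possible choice: a temporary file is mapped with MAP_SHARED, the data is stored through the mapping, and sendfile then sends from the file descriptor. The names `sendfile_all` and `mmap_sendfile_demo`, the /tmp path, and the use of a socketpair as a stand-in for the client connection are all assumptions made for illustration.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send the first count bytes of in_fd to out_fd. */
int sendfile_all(int out_fd, int in_fd, size_t count)
{
    off_t offset = 0;           /* sendfile advances this as it sends */

    while (count > 0) {
        ssize_t n = sendfile(out_fd, in_fd, &offset, count);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            return -1;          /* file is shorter than count */
        count -= (size_t)n;
    }
    return 0;
}

/* Returns 0 if the peer received the bytes stored through the mapping. */
int mmap_sendfile_demo(void)
{
    static const char image[] = "image bytes";  /* stand-in for image data */
    size_t len = sizeof image - 1;
    char path[] = "/tmp/sendfile_demo_XXXXXX";
    char out[sizeof image] = {0};
    int sv[2], ok = -1;

    int in_fd = mkstemp(path);
    if (in_fd < 0)
        return -1;
    unlink(path);               /* file disappears once in_fd is closed */

    if (ftruncate(in_fd, (off_t)len) == 0 &&
        socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == 0) {
        void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                         MAP_SHARED, in_fd, 0);
        if (buf != MAP_FAILED) {
            memcpy(buf, image, len);    /* store image data in the buffer */
            if (sendfile_all(sv[0], in_fd, len) == 0 &&
                read(sv[1], out, len) == (ssize_t)len &&
                memcmp(out, image, len) == 0)
                ok = 0;
            munmap(buf, len);
        }
        close(sv[0]);
        close(sv[1]);
    }
    close(in_fd);
    return ok;
}
```

Note that sendfile takes the file descriptor, not the mapped address; the mapping is only the way the data gets into the file's pages. This sketch does not address when the mapping may be safely overwritten after sendfile returns.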

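It seems that (1) and (2) will probably do the same thing, given that jemalloc probably allocates memory through mmap in the first place. I am not sure about (3), however. Will this really bring any benefit? Figure 4 of this article on zero-copy methods on Linux suggests that a further copy can be avoided by using sendfile: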

no data is copied into the socket buffer. Instead, only descriptors with information about the whereabouts and length of the data are appended to the socket buffer. The DMA engine passes data directly from the kernel buffer to the protocol engine, thus eliminating the remaining final copy.

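If everything works out, this seems like a win. However, I don't know whether my mmaped buffer counts as a kernel buffer. I also don't know when it is safe to reuse this buffer. Since the fd and length are the only things appended to the socket buffer, I assume the kernel actually writes this data to the socket asynchronously. If it does, what does the return from sendfile signify? How would I know when I can reuse this buffer?

So my questions are:

  1. What is the fastest way to write large buffers (images, in my case) to a socket? The images are held in memory.
  2. Is it a good idea to call sendfile on a mapped file? If so, what are the pitfalls? Does it even bring any gain?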

Best answer

It seems my suspicions were correct. I got my information from this article. Quoting from it:

Also these network write system calls, including sendfile, might and in many cases do return before the data sent over TCP by the method call has been acknowledged. These methods return as soon as all data is written into the socket buffers (sk buff) and is pushed to the TCP write queue, the TCP engine can manage alone from that point on. In other words at the time sendfile returns the last TCP send window is not actually sent to the remote host but queued. In cases where scatter-gather DMA is supported there is no seperate buffer which holds these bytes, rather the buffers(sk buffs) just hold pointers to the pages of OS buffer cache, where the contents of file is located. This might lead to a race condition if we modify the content of the file corresponding to the data in the last TCP send window as soon as sendfile is returned. As a result TCP engine may send newly written data to the remote host instead of what we originally intended to send.

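Assuming that a buffer from a mapped file is even considered "DMA-able", there seems to be no way to know when it is safe to reuse it without an explicit acknowledgement (over the network) from the actual client. I might have to stick with simple write calls and incur the extra copy. There is a paper (also referenced in the article) with more details.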
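
The reason the plain write call avoids this problem is the extra copy itself: by the time write returns, the bytes it accepted have been copied into the kernel's socket buffer, so the user buffer can be overwritten. A small sketch of that behavior follows; the function name `write_then_reuse_demo` is hypothetical, and a local socketpair stands in for the client connection.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 0 if the peer still receives the original bytes even though
 * the sender overwrote its buffer right after write returned. */
int write_then_reuse_demo(void)
{
    char buf[] = "original";
    size_t len = sizeof buf - 1;
    char out[sizeof buf] = {0};
    int sv[2], ok = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    if (write(sv[0], buf, len) == (ssize_t)len) {
        memset(buf, 'X', len);  /* reuse the buffer immediately */
        if (read(sv[1], out, len) == (ssize_t)len &&
            memcmp(out, "original", len) == 0)
            ok = 0;             /* the kernel held its own copy */
    }
    close(sv[0]);
    close(sv[1]);
    return ok;
}
```

This only shows that the accepted bytes were copied; a short write would still require sending the remainder before the buffer is reused.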

Edit: This article on the splice call also shows the problem. Quoting it:

Be aware, when splicing data from a mmap'ed buffer to a network socket, it is not possible to say when all data has been sent. Even if splice() returns, the network stack may not have sent all data yet. So reusing the buffer may overwrite unsent data.

Regarding "linux - Minimizing copies when writing large data to a socket", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/20008707/
