protocol-buffers - Slow gRPC serialization on large datasets

Reposted by: 行者123 · Updated: 2023-12-05 04:35:38

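I know Google states that protobufs are not designed for large messages (i.e. greater than 1 MB), but I am trying to use gRPC to stream a dataset of tens of megabytes, and some people seem to say it is OK, or at least OK with some splitting...

However, when I try to send an array this way (repeated uint32), it takes about 20 seconds on the same local machine.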

#proto
service PAS {
  // analyze single file
  rpc getPhotonRecords (PhotonRecordsRequest) returns (PhotonRecordsReply) {}
}

message PhotonRecordsRequest {
  string fileName = 1;
}

message PhotonRecordsReply {
  repeated uint32 PhotonRecords = 1;
}

where PhotonRecordsReply needs to hold roughly 10 million uint32 values...

Does anyone know how to speed this up? Or which technology would be more appropriate?
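As background on the payload size: assuming the .proto file uses proto3 syntax, a repeated uint32 field is encoded as packed varints, where each value takes 1 to 5 bytes depending on its magnitude. The sketch below estimates that size in plain Python. `varint_size` and `packed_payload_size` are hypothetical helper names, not part of the protobuf API, and the estimate says nothing about where the 20 seconds are actually spent.

```python
def varint_size(value: int) -> int:
    """Number of bytes a protobuf varint needs for a non-negative integer."""
    if value < 0:
        raise ValueError("uint32 values must be non-negative")
    size = 1
    while value >= 0x80:  # each byte carries 7 payload bits
        value >>= 7
        size += 1
    return size


def packed_payload_size(values) -> int:
    """Bytes used by the values themselves in a packed repeated field
    (the field tag and length prefix are not counted)."""
    return sum(varint_size(v) for v in values)


if __name__ == "__main__":
    small = [1] * 1000            # tiny values: 1 byte each
    large = [0xFFFFFFFF] * 1000   # values >= 2**28: 5 bytes each
    print(packed_payload_size(small))
    print(packed_payload_size(large))
```

Under these assumptions, 10 million values spread across the full 32-bit range would come to roughly 50 MB of payload, since most of them need 5 bytes.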

So I think I have implemented streaming based on the comments and answers given, but it still takes the same amount of time:

#proto
service PAS {
  // analyze single file
  rpc getPhotonRecords (PhotonRecordsRequest) returns (stream PhotonRecordsReply) {}
}

class PAS_GRPC(pas_pb2_grpc.PASServicer):

    def getPhotonRecords(self, request: pas_pb2.PhotonRecordsRequest, _context):
        raw_data_bytes = flb_tools.read_data_bytes(request.fileName)
        data = flb_tools.reshape_flb_data(raw_data_bytes)
        chunk_size = 1024
        len_data = len(data)
        # step through the data one chunk at a time; the final slice is
        # simply shorter when len_data is not a multiple of chunk_size
        for index in range(0, len_data, chunk_size):
            yield pas_pb2.PhotonRecordsReply(PhotonRecords=data[index:index + chunk_size])
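To see how the chunk size translates into the number of stream messages, here is a minimal sketch of the same slicing logic outside gRPC. `iter_chunks` and `count_messages` are hypothetical helpers introduced for illustration, and whether a larger chunk size actually helps in this case would need to be measured.

```python
from typing import Iterator, Sequence


def iter_chunks(data: Sequence[int], chunk_size: int) -> Iterator[Sequence[int]]:
    """Yield consecutive slices of at most chunk_size items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


def count_messages(total_items: int, chunk_size: int) -> int:
    """Number of stream messages needed to carry total_items values."""
    return -(-total_items // chunk_size)  # ceiling division


if __name__ == "__main__":
    data = list(range(10))
    print([list(chunk) for chunk in iter_chunks(data, 4)])
    # Message counts for 10 million values at two chunk sizes.
    print(count_messages(10_000_000, 1024))
    print(count_messages(10_000_000, 65536))
```

With 1024 values per message, 10 million records are split into 9766 messages; with 65536 values per message, into 153.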

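Minimal reproduction: GitHub example

Best answer

Changing it to use a stream should help. The transfer took under 2 seconds for me. Note that this was on localhost without SSL. I threw this code together; I did run it and it worked. I am not sure what happens if, for example, the file size is not a multiple of 4 bytes. Also, the bytes are read in Java's default byte order (big-endian for a ByteBuffer).

My 10 MB file was created like this.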

dd if=/dev/random  of=my_10mb_file bs=1024 count=10240
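Where dd is not available, a comparable test file can be produced from Python. This is a rough equivalent only: it assumes `os.urandom` is an acceptable substitute for reading /dev/random, and `write_random_file` is a hypothetical helper name. The demo writes into a temporary directory so that it leaves nothing behind.

```python
import os
import tempfile


def write_random_file(path: str, block_size: int = 1024, count: int = 10240) -> int:
    """Write count blocks of block_size random bytes; return total bytes written."""
    total = 0
    with open(path, "wb") as f:
        for _ in range(count):
            total += f.write(os.urandom(block_size))
    return total


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "my_10mb_file")
        written = write_random_file(target)
        print(written, os.path.getsize(target))
```

The defaults mirror bs=1024 count=10240, which is 10,485,760 bytes.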

Here is the service definition. The only thing I added here is streaming of the response.

service PAS {
  // analyze single file
  rpc getPhotonRecords (PhotonRecordsRequest) returns (stream PhotonRecordsReply) {}
}

Here is the server implementation.

public class PhotonsServerImpl extends PASImplBase {

  @Override
  public void getPhotonRecords(PhotonRecordsRequest request, StreamObserver<PhotonRecordsReply> responseObserver) {
    log.info("inside getPhotonRecords");
    
    // open the file, I suggest using java.nio API for the fastest read times.
    Path file = Paths.get(request.getFileName());
    try (FileChannel fileChannel = FileChannel.open(file, StandardOpenOption.READ)) {

      int blockSize = 1024 * 4;
      ByteBuffer byteBuffer = ByteBuffer.allocate(blockSize);
      boolean done = false;
      while (!done) {
        PhotonRecordsReply.Builder response = PhotonRecordsReply.newBuilder();
        // read up to blockSize bytes (1024 ints) from the file.
        byteBuffer.clear();
        int read = fileChannel.read(byteBuffer);
        if (read < blockSize) {
          done = true;
        }
        // write to the response.
        byteBuffer.flip();
        for (int index = 0; index < read / 4; index++) {
          response.addPhotonRecords(byteBuffer.getInt());
        }
        // send the response
        responseObserver.onNext(response.build());
      }
    } catch (Exception e) {
      log.error("", e);
      responseObserver.onError(
          Status.INTERNAL.withDescription(e.getMessage()).asRuntimeException());
      // onError already terminates the stream, so skip onCompleted below.
      return;
    }
    responseObserver.onCompleted();
    log.info("exit getPhotonRecords");

  }
}
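The two caveats the answer raises (byte order, and a file size that is not a multiple of 4) can be illustrated outside Java. The Python sketch below mirrors the big-endian decoding with the standard `struct` module. `decode_uint32_be` is a hypothetical helper, and silently dropping the trailing bytes is an assumption that matches the `read / 4` loop above rather than a recommendation. Note also that Java's `getInt()` returns a signed int, whereas the `I` format here yields the unsigned value with the same bit pattern.

```python
import struct


def decode_uint32_be(block: bytes) -> list:
    """Decode a byte block as big-endian unsigned 32-bit integers.

    Trailing bytes that do not form a complete 4-byte group are ignored.
    """
    count = len(block) // 4
    return list(struct.unpack(f">{count}I", block[:count * 4]))


if __name__ == "__main__":
    block = bytes([0x00, 0x00, 0x00, 0x01,   # 1 in big-endian
                   0x00, 0x00, 0x01, 0x00,   # 256 in big-endian
                   0xAB, 0xCD])              # incomplete group, dropped
    print(decode_uint32_be(block))
```

A reader on a little-endian pipeline would decode the same bytes differently, so both sides need to agree on the order when the file is produced by another tool.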

The client just logs the size of each received array.

public long getPhotonRecords(ManagedChannel channel) {
  if (log.isInfoEnabled())
    log.info("Enter - getPhotonRecords ");

  PASGrpc.PASBlockingStub photonClient = PASGrpc.newBlockingStub(channel);

  PhotonRecordsRequest request = PhotonRecordsRequest.newBuilder().setFileName("/udata/jdrummond/logs/my_10mb_file").build();

  photonClient.getPhotonRecords(request).forEachRemaining(photonRecordsReply -> {
    log.info("got this many photons: {}", photonRecordsReply.getPhotonRecordsCount());
  });

  return 0;
}
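For readers working from the Python side of the question, a comparable client loop might look like the sketch below. In gRPC Python, calling a server-streaming method on a stub returns an iterator of responses, so the consuming code is a plain for loop. To keep the example runnable without the generated pas_pb2 modules, `FakeReply` stands in for PhotonRecordsReply; it and `count_photon_records` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class FakeReply:
    """Stand-in for the generated PhotonRecordsReply message (assumption)."""
    PhotonRecords: List[int] = field(default_factory=list)


def count_photon_records(replies: Iterable[FakeReply]) -> int:
    """Consume a stream of replies and return the total number of records."""
    total = 0
    for reply in replies:
        total += len(reply.PhotonRecords)
    return total


if __name__ == "__main__":
    # With a real channel this iterable would come from something like
    # stub.getPhotonRecords(request); here it is simulated.
    stream = (FakeReply(list(range(n))) for n in (1024, 1024, 512))
    print(count_photon_records(stream))
```

Unlike the Java client above, which always returns 0, this variant returns the running total so the caller can check that every record arrived.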

Regarding protocol-buffers - Slow gRPC serialization on large datasets, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/70993553/
