c# - Improve binary serialization performance for a large list of structs


Reposted by: IT王子 | Updated: 2023-10-29 04:39:30

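I have a struct that holds a 3D coordinate as 3 ints. In a test, I put together a List<> of 1 million random points and then used binary serialization to a memory stream.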

The memory stream comes out at about 21 MB, which seems very inefficient: 1,000,000 points * 3 coordinates * 4 bytes should need only about 11 MB.
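The estimate above can be checked with a few lines of arithmetic. This is only a sketch; `PayloadEstimate`, `RawBytes` and `ToMegabytes` are illustrative names that do not appear in the question's code.

```csharp
using System;

public static class PayloadEstimate
{
    // Bytes needed to store `count` points as three 32-bit ints each,
    // with no framing or type metadata.
    public static long RawBytes(long count)
    {
        return count * 3 * sizeof(int);
    }

    // Converts a byte count to MB in the 1024 * 1024 sense.
    public static double ToMegabytes(long bytes)
    {
        return bytes / (1024.0 * 1024.0);
    }
}
```

`PayloadEstimate.RawBytes(1000000)` gives 12,000,000 bytes, or about 11.4 MB, so the 21 MB stream is a little under twice the size of the raw payload.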

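It also takes about 3 seconds on my test rig.

Any ideas for improving performance and/or size?

(I don't have to keep the ISerializable interface if that helps; I could write directly to the memory stream.)

Edit - based on the answers below, I put together a serialization showdown comparing BinaryFormatter, 'Raw' BinaryWriter and Protobuf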

using System;
using System.Text;
using System.Collections.Generic;
using System.Linq;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;
using System.IO;
using ProtoBuf;

namespace asp_heatmap.test
{
    [Serializable()] // For .NET BinaryFormatter
    [ProtoContract] // For Protobuf
    public class Coordinates : ISerializable
    {
        [Serializable()]
        [ProtoContract]
        public struct CoOrd
        {
            public CoOrd(int x, int y, int z)
            {
                this.x = x;
                this.y = y;
                this.z = z;
            }
            [ProtoMember(1)]            
            public int x;
            [ProtoMember(2)]
            public int y;
            [ProtoMember(3)]
            public int z;
        }

        internal Coordinates()
        {
        }

        [ProtoMember(1)]
        public List<CoOrd> Coords = new List<CoOrd>();

        public void SetupTestArray()
        {
            Random r = new Random();
            for (int i = 0; i < 1000000; i++)
            {
                Coords.Add(new CoOrd(r.Next(), r.Next(), r.Next()));
            }
        }

        #region Using Framework Binary Formatter Serialization

        void ISerializable.GetObjectData(SerializationInfo info, StreamingContext context)
        {
            info.AddValue("Coords", this.Coords);
        }

        internal Coordinates(SerializationInfo info, StreamingContext context)
        {
            this.Coords = (List<CoOrd>)info.GetValue("Coords", typeof(List<CoOrd>));
        }

        #endregion

        #region 'Raw' Binary Writer serialization

        public MemoryStream RawSerializeToStream()
        {
            MemoryStream stream = new MemoryStream(Coords.Count * 3 * 4 + 4);
            BinaryWriter writer = new BinaryWriter(stream);
            writer.Write(Coords.Count);
            foreach (CoOrd point in Coords)
            {
                writer.Write(point.x);
                writer.Write(point.y);
                writer.Write(point.z);
            }
            return stream;
        }

        public Coordinates(MemoryStream stream)
        {
            using (BinaryReader reader = new BinaryReader(stream))
            {
                int count = reader.ReadInt32();
                Coords = new List<CoOrd>(count);
                for (int i = 0; i < count; i++)                
                {
                    Coords.Add(new CoOrd(reader.ReadInt32(),reader.ReadInt32(),reader.ReadInt32()));
                }
            }        
        }
        #endregion
    }

    [TestClass]
    public class SerializationTest
    {
        [TestMethod]
        public void TestBinaryFormatter()
        {
            Coordinates c = new Coordinates();
            c.SetupTestArray();

            // Serialize to memory stream
            MemoryStream mStream = new MemoryStream();
            BinaryFormatter bformatter = new BinaryFormatter();
            bformatter.Serialize(mStream, c);
            Console.WriteLine("Length : {0}", mStream.Length);

            // Now Deserialize
            mStream.Position = 0;
            Coordinates c2 = (Coordinates)bformatter.Deserialize(mStream);
            Console.Write(c2.Coords.Count);

            mStream.Close();
        }

        [TestMethod]
        public void TestBinaryWriter()
        {
            Coordinates c = new Coordinates();
            c.SetupTestArray();

            MemoryStream mStream = c.RawSerializeToStream();
            Console.WriteLine("Length : {0}", mStream.Length);

            // Now Deserialize
            mStream.Position = 0;
            Coordinates c2 = new Coordinates(mStream);
            Console.Write(c2.Coords.Count);
        }

        [TestMethod]
        public void TestProtoBufV2()
        {
            Coordinates c = new Coordinates();
            c.SetupTestArray();

            MemoryStream mStream = new MemoryStream();
            ProtoBuf.Serializer.Serialize(mStream,c);
            Console.WriteLine("Length : {0}", mStream.Length);

            mStream.Position = 0;
            Coordinates c2 = ProtoBuf.Serializer.Deserialize<Coordinates>(mStream);
            Console.Write(c2.Coords.Count);
        }
    }
}

Results (note: PB v2.0.0.423 beta)

                Serialize | Ser + Deserialize    | Size
-----------------------------------------------------------          
BinaryFormatter    2.89s  |      26.00s !!!      | 21.0 MB
ProtoBuf v2        0.52s  |       0.83s          | 18.7 MB
Raw BinaryWriter   0.27s  |       0.36s          | 11.4 MB

Obviously this only looks at speed and size, and does not take any other factors into account.
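One plausible reading of the size column: the raw writer stores every int in exactly 4 bytes, whereas Protocol Buffers encodes ints as base-128 varints, which need 5 bytes for values of 2^28 and above. Because `r.Next()` is spread over the whole non-negative int range, most of the test values fall in that bracket. The sketch below estimates the average size per point under an assumed wire layout (one tag byte and one length byte per point, plus one tag byte per field). That layout is an assumption about how protobuf-net writes a `List<CoOrd>`, not something measured here, and `VarintEstimate` is a hypothetical helper.

```csharp
using System;

public static class VarintEstimate
{
    // Number of bytes a base-128 varint needs for a non-negative value.
    public static int VarintSize(uint value)
    {
        int size = 1;
        while (value >= 0x80)
        {
            value >>= 7;
            size++;
        }
        return size;
    }

    // Expected varint size for a value drawn uniformly from [0, int.MaxValue).
    public static double AverageVarintSizeForRandomInt()
    {
        double total = 0;
        double range = int.MaxValue;
        long lower = 0;
        for (int bytes = 1; bytes <= 5; bytes++)
        {
            // Values below 2^(7 * bytes) fit in `bytes` bytes.
            long upper = Math.Min(1L << (7 * bytes), (long)int.MaxValue);
            total += bytes * ((upper - lower) / range);
            lower = upper;
        }
        return total;
    }

    // Assumed layout per point: 1 tag + 1 length + 3 * (1 tag + varint).
    public static double EstimatedBytesPerPoint()
    {
        return 2 + 3 * (1 + AverageVarintSizeForRandomInt());
    }
}
```

Under these assumptions the estimate is about 19.6 bytes per point, or roughly 18.7 MB for a million points, which lines up with the ProtoBuf row. With small coordinate values (below 128) the same layout would need only 8 bytes per point, less than the fixed 12 bytes of the raw writer.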

Best answer

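Binary serialization with BinaryFormatter includes type information in the bytes it generates. This takes up additional space. It is useful, for example, when you don't know what structure of data to expect at the other end.

In your case, you know what format the data has at both ends, and it doesn't sound like that will change. So you can write simple encode and decode methods. Your CoOrd type also no longer needs to be serializable.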

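I would use System.IO.BinaryReader and System.IO.BinaryWriter, then loop over the CoOrd instances and read/write their x, y and z values to the stream. Note that BinaryWriter.Write(int) always writes exactly 4 bytes, so this gives about 11.4 MB for a million points, as the measurements in the question show. Getting below that requires a variable-length integer encoding, which only helps if many of your numbers are small (for example, below 0x7F or 0x7FFF).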
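Since `BinaryWriter.Write(int)` emits a fixed four bytes, dropping below 12 bytes per point needs a variable-length encoding layered on top. The sketch below hand-rolls a 7-bit (base-128) encoding for non-negative ints; `SevenBit` and its methods are illustrative names, not part of the answer. It pays off only when many values are small: values below 0x80 take one byte and values below 0x4000 take two, but values of 2^28 and above take five.

```csharp
using System;
using System.IO;

public static class SevenBit
{
    // Writes a non-negative int using 7 payload bits per byte; the high bit
    // of each byte signals that another byte follows.
    public static void Write(BinaryWriter writer, int value)
    {
        if (value < 0) throw new ArgumentOutOfRangeException(nameof(value));
        uint remaining = (uint)value;
        while (remaining >= 0x80)
        {
            writer.Write((byte)((remaining & 0x7F) | 0x80));
            remaining >>= 7;
        }
        writer.Write((byte)remaining);
    }

    // Reads a value previously written by Write.
    public static int Read(BinaryReader reader)
    {
        int result = 0;
        int shift = 0;
        while (true)
        {
            byte b = reader.ReadByte();
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) return result;
            shift += 7;
            if (shift > 28) throw new FormatException("Too many bytes for a 32-bit value.");
        }
    }
}
```

For uniformly random values like those in the question's test, most ints would need five bytes this way, so fixed-width writes are the better fit there.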

Something like this:

using (var writer = new BinaryWriter(stream)) {
    // write the number of items so we know how many to read out
    writer.Write(points.Count);
    // write three ints per point
    foreach (var point in points) {
        // CoOrd in the question exposes lowercase public fields x, y, z
        writer.Write(point.x);
        writer.Write(point.y);
        writer.Write(point.z);
    }
}

To read from the stream:

List<CoOrd> points;
using (var reader = new BinaryReader(stream)) {
    var count = reader.ReadInt32();
    points = new List<CoOrd>(count);
    for (int i = 0; i < count; i++) {
        var x = reader.ReadInt32();
        var y = reader.ReadInt32();
        var z = reader.ReadInt32();
        points.Add(new CoOrd(x, y, z));
    }
}
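The write and read snippets can be combined into a round trip. This is a minimal sketch, assuming a `CoOrd` struct shaped like the one in the question (redeclared here so the example is self-contained). `PointCodec`, `Encode` and `Decode` are hypothetical names. The `leaveOpen` argument keeps the MemoryStream usable after the writer is disposed, since disposing a BinaryWriter otherwise closes the underlying stream.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public struct CoOrd
{
    public int x, y, z;
    public CoOrd(int x, int y, int z) { this.x = x; this.y = y; this.z = z; }
}

public static class PointCodec
{
    // Encodes a 4-byte count prefix followed by three 32-bit ints per point.
    public static byte[] Encode(List<CoOrd> points)
    {
        using (var stream = new MemoryStream(4 + points.Count * 12))
        {
            using (var writer = new BinaryWriter(stream, Encoding.UTF8, true)) // leaveOpen: true
            {
                writer.Write(points.Count);
                foreach (var point in points)
                {
                    writer.Write(point.x);
                    writer.Write(point.y);
                    writer.Write(point.z);
                }
            }
            return stream.ToArray();
        }
    }

    // Decodes the layout produced by Encode.
    public static List<CoOrd> Decode(byte[] data)
    {
        using (var reader = new BinaryReader(new MemoryStream(data)))
        {
            int count = reader.ReadInt32();
            var points = new List<CoOrd>(count);
            for (int i = 0; i < count; i++)
            {
                int x = reader.ReadInt32();
                int y = reader.ReadInt32();
                int z = reader.ReadInt32();
                points.Add(new CoOrd(x, y, z));
            }
            return points;
        }
    }
}
```

Encoding n points yields 4 + 12n bytes. For a million points that is 12,000,004 bytes, consistent with the 11.4 MB measured for the raw writer in the question.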

This post on improving binary serialization performance for a large list of structs is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/6478579/
