Is Int64 aligned on 4 bytes or 8 bytes on a 32-bit architecture?


Reposted by: 我是一只小鸟. Updated: 2023-07-18 14:34:05

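As the standard on which .NET is built, the CLI Spec (ECMA-335) describes the alignment rules for primitive types in the passage quoted below. Our reading of it is that 8-byte data types (int64, unsigned int64, and float64) are aligned on either 4 or 8 bytes depending on the machine's instruction set architecture. More specifically, they should be aligned on 4 bytes on x86 and on 8 bytes on x64.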

Built-in data types shall be properly aligned, which is defined as follows:

1-byte, 2-byte, and 4-byte data is properly aligned when it is stored at a 1-byte, 2-byte, or 4-byte boundary, respectively.

8-byte data is properly aligned when it is stored on the same boundary required by the underlying hardware for atomic access to a native int.

Thus, int16 and unsigned int16 start on even address; int32, unsigned int32, and float32 start on an address divisible by 4; and int64, unsigned int64, and float64 start on an address divisible by 4 or 8, depending upon the target architecture. The native size types (native int, native unsigned int, and &) are always naturally aligned (4 bytes or 8 bytes, depending on the architecture). When generated externally, these should also be aligned to their natural size, although portable code can use 8-byte alignment to guarantee architecture independence. It is strongly recommended that float64 be aligned on an 8-byte boundary, even when the size of native int is 32 bits.

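We verify this claim with a simple console program. To simulate a 32-bit platform on a 64-bit machine, we modify the .csproj file as follows, setting the PlatformTarget property to x86 (the default is Any CPU).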

                          
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net7.0</TargetFramework>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
    <AllowUnsafeBlocks>True</AllowUnsafeBlocks>
    <PlatformTarget>x86</PlatformTarget>
  </PropertyGroup>
</Project>

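In the demo program we define a record struct named Foobar with two members, of types byte and ulong (unsigned int64). We set them to byte.MaxValue (FF) and ulong.MaxValue (FF-FF-FF-FF-FF-FF-FF-FF) respectively, and print the struct's in-memory bytes. To confirm that the environment matches what the CLI Spec describes, we also print Environment.Is64BitProcess (whether the current process is 64-bit), the size of ulong in bytes (to confirm that it is "8-byte data"), and IntPtr.Size (to confirm that native int is 4 bytes wide and therefore aligned on a 4-byte boundary).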

                          
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

unsafe
{
    // Copy the raw bytes of the struct into a managed array and dump them.
    var bytes = new byte[sizeof(Foobar)];
    var foobar = new Foobar(byte.MaxValue, ulong.MaxValue);
    Marshal.Copy(new nint(Unsafe.AsPointer(ref foobar)), bytes, 0, bytes.Length);
    Console.WriteLine(BitConverter.ToString(bytes));
    Console.WriteLine($"Environment.Is64BitProcess = {Environment.Is64BitProcess}");
    Console.WriteLine($"sizeof(ulong) = {sizeof(ulong)}");
    Console.WriteLine($"IntPtr.Size = {IntPtr.Size}");
}

public record struct Foobar(byte Foo, ulong Bar);

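The program's output shows that the environment matches the 32-bit architecture described in the CLI Spec, yet the ulong field Bar is aligned on 8 bytes rather than 4. With 4-byte alignment the dump would be FF-00-00-00-FF-FF-FF-FF-FF-FF-FF-FF (12 bytes). If Foobar itself were additionally padded so that its size is a multiple of 8, the dump would be FF-00-00-00-FF-FF-FF-FF-FF-FF-FF-FF-00-00-00-00 (16 bytes).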
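To make the two candidate layouts concrete, here is a minimal sketch that compares the default layout with an explicitly packed one. The type names LayoutProbe, DefaultPacked, and Pack4 are hypothetical and not part of the original demo. Pack = 4 on StructLayoutAttribute caps every field's alignment at 4 bytes, which is what the 4-byte reading of the spec would predict for x86. The sketch uses Unsafe.ByteOffset and Unsafe.SizeOf, so it needs no unsafe block.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

public static class LayoutProbe
{
    // Same fields as Foobar, with the default (sequential) struct layout.
    public struct DefaultPacked
    {
        public byte Foo;
        public ulong Bar;
    }

    // Hypothetical variant: Pack = 4 caps each field's alignment at 4 bytes.
    [StructLayout(LayoutKind.Sequential, Pack = 4)]
    public struct Pack4
    {
        public byte Foo;
        public ulong Bar;
    }

    // Returns (offset of Bar from the start of the struct, total struct size).
    public static (int BarOffset, int Size) DescribeDefault()
    {
        var s = default(DefaultPacked);
        int offset = (int)Unsafe.ByteOffset(ref s.Foo, ref Unsafe.As<ulong, byte>(ref s.Bar));
        return (offset, Unsafe.SizeOf<DefaultPacked>());
    }

    public static (int BarOffset, int Size) DescribePack4()
    {
        var s = default(Pack4);
        int offset = (int)Unsafe.ByteOffset(ref s.Foo, ref Unsafe.As<ulong, byte>(ref s.Bar));
        return (offset, Unsafe.SizeOf<Pack4>());
    }

    public static void Main()
    {
        Console.WriteLine($"IntPtr.Size = {IntPtr.Size}");
        Console.WriteLine($"default layout: {DescribeDefault()}");
        Console.WriteLine($"Pack = 4:       {DescribePack4()}");
    }
}
```

I would expect the default layout to report an offset of 8 and a size of 16, and the packed variant an offset of 4 and a size of 12, matching the 16-byte and 12-byte dumps discussed above. Treat that as something to confirm by running the sketch with PlatformTarget set to both x86 and x64.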

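We have not yet found an authoritative answer to this question. Have I misread the CLI Spec, or is the test program flawed? I would appreciate pointers from anyone familiar with the topic. So far, searching has turned up the following related discussions: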

Memory alignment on a 32-bit Intel processor

The usual rule of thumb (straight from Intels and AMD's optimization manuals) is that every data type should be aligned by its own size. An int32 should be aligned on a 32-bit boundary, an int64 on a 64-bit boundary, and so on. A char will fit just fine anywhere.

Another rule of thumb is, of course "the compiler has been told about alignment requirements". You don't need to worry about it because the compiler knows to add the right padding and offsets to allow efficient access to data.

Why is the default alignment for `int64_t` 8 byte on 32 bit x86 architecture?

Interesting point: If you only ever load it as two halves into 32bit GP registers, then 4B alignment means those operations will happen with their natural alignment.

However, it's probably best if both halves of the variable are in the same cache line, since almost all accesses will read / write both halves. Aligning to the natural alignment of the whole thing takes care of that, even ignoring the other reasons below.

32bit x86 can load 64bit integers in a single 64bit-load using MMX or SSE2 movq. Handling 64bit add/sub/shift/ and bitwise booleans using vector instructions is more efficient (single instruction), as long as you don't need immediate constants or mul or div. The vector instructions with 64b elements are still available in 32b mode.

Atomic 64bit compare-and-exchange is also available in 32bit mode (lock CMPXCHG8B m64 works just like 64bit mode's lock CMPXCHG16B m128, using two implicit registers (edx:eax)). IDK what kind of penalty it has for crossing a cache-line boundary.

Modern x86 CPUs have essentially no penalty for misaligned loads/stores unless they cross cache-line boundaries, which is why I'm only saying that, and not saying that misaligned 64b would be bad in general. See the links in the x86 wiki, esp. Agner Fog's guides.

Why is the "alignment" the same on 32-bit and 64-bit systems?

MSVC targeting 32-bit x86 gives __int64 a minimum alignment of 4, but its default struct-packing rules align types within structs to min(8, sizeof(T)) relative to the start of the struct. (For non-aggregate types only). That's not a direct quote, that's my paraphrase of the MSVC docs link from @P.W's answer, based on what MSVC seems to actually do. (I suspect the "whichever is less" in the text is supposed to be outside the parens, but maybe they're making a different point about the interaction on the pragma and the command-line option?)
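The min(pack, sizeof(T)) rule paraphrased in the last quote can be worked through by hand. The sketch below is only a simplified model of that rule for primitive fields, not the actual algorithm used by MSVC or the CLR. PackedLayout and Compute are hypothetical names introduced for illustration.

```csharp
using System;

public static class PackedLayout
{
    // Computes field offsets and the total size for fields of the given sizes,
    // aligning each field to min(pack, size) relative to the struct start.
    // Simplified model: primitive fields only, sizes are powers of two.
    public static (int[] Offsets, int Size) Compute(int[] fieldSizes, int pack)
    {
        var offsets = new int[fieldSizes.Length];
        int offset = 0;
        int structAlign = 1;
        for (int i = 0; i < fieldSizes.Length; i++)
        {
            int align = Math.Min(pack, fieldSizes[i]);
            offset = RoundUp(offset, align);   // insert padding before the field
            offsets[i] = offset;
            offset += fieldSizes[i];
            structAlign = Math.Max(structAlign, align);
        }
        // Trailing padding so the size is a multiple of the largest field alignment.
        return (offsets, RoundUp(offset, structAlign));
    }

    private static int RoundUp(int value, int align) => (value + align - 1) / align * align;

    public static void Main()
    {
        foreach (int pack in new[] { 4, 8 })
        {
            // A byte followed by a ulong, as in Foobar.
            var (offsets, size) = Compute(new[] { 1, 8 }, pack);
            Console.WriteLine($"pack={pack}: offsets=[{string.Join(", ", offsets)}], size={size}");
        }
    }
}
```

For a byte followed by a ulong, this model yields offsets [0, 4] and size 12 with a packing of 4, and offsets [0, 8] and size 16 with a packing of 8. The second case is the 16-byte layout observed for Foobar on x86.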

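I ran the following supplementary experiments, which show that a standalone 8-byte integer (a long local variable in the code below) is indeed aligned as the CLI Spec describes. Could it be that an 8-byte data type follows different alignment rules when it stands alone and when it is a field of a composite type (struct/class)?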

x64: the following assertion always holds.

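using System.Diagnostics;
using System.Runtime.CompilerServices;

var random = new Random();
unsafe
{
    long v = random.NextInt64();
    // The address of the local is a multiple of 8.
    Debug.Assert(new IntPtr(Unsafe.AsPointer(ref v)).ToInt64() % 8 == 0);
}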

x86: the following assertion also always holds.

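using System.Diagnostics;
using System.Runtime.CompilerServices;

var random = new Random();
unsafe
{
    long v = random.NextInt64();
    // The address of the local is a multiple of 4.
    Debug.Assert(new IntPtr(Unsafe.AsPointer(ref v)).ToInt32() % 4 == 0);
}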

x86: the following assertion is not guaranteed to hold.

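using System.Diagnostics;
using System.Runtime.CompilerServices;

var random = new Random();
unsafe
{
    long v = random.NextInt64();
    // May fail: on x86 the local is only guaranteed to be 4-byte aligned.
    Debug.Assert(new IntPtr(Unsafe.AsPointer(ref v)).ToInt32() % 8 == 0);
}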
