Word Addressing in a Single Processor Cache

Reposted by: bug小助手 | Updated: 2023-10-28 11:19:25

I've recently come across this description of a single processor cache saying that it's

"Word addressed (addresses are left shifted by 2 by adding “00” to end of address inside the processor, this implies that it can address 2^32*4 = 16GBytes of memory"

I understand that word addressing means each consecutive address holds a word of data, whereas byte addressing holds a byte of data at each address. I further understand that left-shifting an address by 2 multiplies it by 4, so the results are always multiples of 4. But doesn't that imply that each address holds a byte of data, so this is in fact byte addressing rather than word addressing, with processor logic that only accesses a word at a time regardless of how the memory is organized?

So far I am confused about whether "word addressed" just means we have logic in place to access a word at a time, or whether the actual memory is organized so that each address holds a whole word rather than a byte.

More replies

I forgot to mention that the description also says this single-processor cache has a 32-bit address space.

Is this a word-addressed cache in a system that uses byte-addressable memory? If the whole system is word-addressable, those 2 low bits of the address don't exist anywhere, and the minimum load/store size is a whole word.

By those 2 low bits of addressing not existing anywhere you mean they're both set to 0s right since we're in a 32 bit address space anyway?

No, I mean there wouldn't be a place to have zeros. If the whole system (not just some cache) was word-addressable, the system could address 2^32 words, but not 2^34 separate bytes. The memory system wouldn't involve bytes at all, only 32-bit chunks of data. If software wanted to use shifts and stuff to pack / unpack octets of bits into words, that's fine but it wouldn't involve memory addresses. (And humans could think about that as accessing "bytes" of a word, but in terms of the memory system there'd be no such thing as a byte, and definitely not with each having its own address)

"word addressing means that each consecutive address holds a word of data versus byte addressing which holds a byte of data at each address" This is correct and standard usage for the terms word addressable vs. byte addressable. See also en.wikipedia.org/wiki/Word_addressing, en.wikipedia.org/wiki/Byte_addressing

Recommended answer

It would help to know where this description came from but nevertheless, perhaps the following helps illuminate what might be going on.

When we talk about byte- versus word-addressable memories (caches, RAM, SSDs, or otherwise) we are referring to the minimum width of data that can be referenced by an address. Regardless of the addressable width of a single location, the memory will still hold bytes, grouped together in 4s for words, or 8s for double-words, etc. Just the same as a byte is always made of bits (in modern systems: typically 8 bits per byte, but that's not always the case.)

For the following examples, let's assume we have 8 address bits. (Real systems usually have between 16 and 54 address bits, though. None that I know of have full 64-bit addresses.)

With a byte-addressable memory, 8 address bits allow us to reference 2^8 locations. To get the size of memory, we multiply by the size (in bytes) of each addressed location. That gives us 2^8 * 1 byte = 256 bytes.

With a word-addressable memory, 8 address bits allow us to reference the same number of locations (2^8). However, the size (in bytes) of each addressed location is now 4 bytes. That gives us 2^8 * 4 = 1024 bytes (1 KiB).
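The arithmetic in these two cases can be checked with a small sketch. The function name `addressable_bytes` and its parameters are made up for illustration; the last call applies the same formula to the 32-bit, 4-byte-word case quoted in the question.

```python
def addressable_bytes(address_bits, bytes_per_location):
    """Capacity = number of distinct addresses * size of each addressed location."""
    return (1 << address_bits) * bytes_per_location

# 8 address bits, byte-addressable: one byte per address
print(addressable_bytes(8, 1))            # 256
# 8 address bits, word-addressable, assuming 4-byte words
print(addressable_bytes(8, 4))            # 1024
# 32 address bits with 4-byte words, as in the question (result in GiB)
print(addressable_bytes(32, 4) // 2**30)  # 16
```

The number of address bits is the same in both 8-bit cases; only the size of the thing each address names changes.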

Now, the confusion starts to arise when we look at where the address bits are coming from.

Let's say I have 2 instruction sets (ISAs). Both have load/store instructions that take an address to memory. In one ISA, the instructions use byte-addressing. In the other, the instructions use word-addressing.

ld r1, 0x12

Looking at the byte index in memory that such a load instruction accesses, what would we expect in the two ISAs?

In the byte-addressed ISA, it's easy: the byte address is 0x12 (decimal: 18.)

In the word-addressed ISA, it's different. The 0x12 is an address that refers to a word in memory, rather than an individual byte. To get the index in memory as a number of bytes, we multiply by 4 (bytes per word.) That is to say, we shift the address left by 2. So 0x12 becomes 0x48 (decimal: 72.)
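As a minimal sketch of that conversion, assuming 4-byte words (the constant and function names here are hypothetical, not taken from any real ISA):

```python
BYTES_PER_WORD = 4   # assumption: 32-bit words
WORD_SHIFT = 2       # log2(BYTES_PER_WORD)

def word_to_byte_address(word_address):
    """Multiply by 4, i.e. append two zero bits to the word address."""
    return word_address << WORD_SHIFT

def byte_to_word_address(byte_address):
    """Split a byte address into (word address, byte offset within the word)."""
    return byte_address >> WORD_SHIFT, byte_address & (BYTES_PER_WORD - 1)

print(hex(word_to_byte_address(0x12)))  # 0x48
print(byte_to_word_address(0x12))       # (4, 2)
```

The second call shows the reverse view: byte address 0x12 is byte 2 of word 4. A word address carries no such offset, which is why the two low bits never need to be stored.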

Okay, so an Instruction Set Architecture (ISA) can define how a number is interpreted - whether it's a word-address or byte-address.

Additionally, when it comes to caches, hardware doesn't behave in the neat way the ISA presents to software engineers.

For example, modern caches may present a dword-addressed, word-addressed or byte-addressed interface to the processor core (or other variations.) In a word-addressed scenario, to store an individual byte, the cache controller must first load the whole word in which the byte resides into a temporary internal buffer, update it with the byte being stored, then store that word back into the cache's SRAM array. This is known as a read-modify-write (RMW.)
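The read-modify-write sequence can be sketched in software. This is only an illustration of the steps, not how any particular cache controller is built: the SRAM array is modelled as a Python list of 32-bit words, little-endian byte lanes are assumed, and `store_byte_rmw` is a made-up name.

```python
def store_byte_rmw(words, byte_address, value):
    """Store one byte into a word-organised array via read-modify-write."""
    word_index = byte_address >> 2           # which word holds the byte
    shift = (byte_address & 0b11) * 8        # assumption: little-endian byte lanes
    word = words[word_index]                 # read the whole word
    word &= ~(0xFF << shift) & 0xFFFFFFFF    # modify: clear the target byte
    word |= (value & 0xFF) << shift          # modify: insert the new byte
    words[word_index] = word                 # write the whole word back

sram = [0x11223344, 0xAABBCCDD]
store_byte_rmw(sram, 5, 0xEE)
print([hex(w) for w in sram])  # ['0x11223344', '0xaabbeedd']
```

Only the word containing the byte is touched, but the whole of that word is read and rewritten.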

See "Are there any modern CPUs where a cached byte store is actually slower than a word store?" for some real examples, such as ARM Cortex-A15, where ARM's manual explains that updating the ECC (Error Correction Code) data for the 32-bit chunk is part of the reason an RMW is needed for isolated byte or 16-bit stores. (ARM is a byte-addressable ISA, unlike the one described in the question.)

RMW is relatively straightforward for a single byte. It's more complex for, say, a half-word (i.e. 2 byte) load/store on a byte-addressable memory. This is because the 2 bytes may cross a natural word boundary. Thus, it may be that two whole words must be loaded/stored to access a single half-word.
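To make the boundary case concrete, here is a small sketch that lists which words an access touches. It assumes 4-byte words, and `words_touched` is a hypothetical helper name.

```python
def words_touched(byte_address, size, bytes_per_word=4):
    """Return the word indices covered by `size` bytes starting at `byte_address`."""
    first = byte_address // bytes_per_word
    last = (byte_address + size - 1) // bytes_per_word
    return list(range(first, last + 1))

print(words_touched(0x12, 2))  # [4]     bytes 18..19 sit inside word 4
print(words_touched(0x13, 2))  # [4, 5]  bytes 19..20 straddle words 4 and 5
```

The second access is the awkward one: a 2-byte store there would need both words to be read, modified, and written back.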

With word-addressable memory, the half-word will (only ever) be the bottom 2 bytes of a naturally aligned word, making such a 2-word RMW impossible/unnecessary. However, it would also be impossible to directly perform a half-word load/store on the upper half-word of any given word, since word-addressing doesn't allow for referencing that upper half-word directly. (Some ISAs offer separate load/store instructions to access the different positions within a word - such as load half-word/load-byte instructions that take a word address and an index within the word.)

Going back to the cache: on the other side of the cache (the interface to main memory or other layers of cache), neither byte- nor word-addressing is used. Instead, row addresses are used, where a row may be 64, 128 or more bytes in size.
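The same idea of dropping low address bits applies at row granularity. A rough sketch, assuming 64-byte rows (the constants and the function name are illustrative only):

```python
LINE_BYTES = 64                            # assumption: 64-byte rows
OFFSET_BITS = LINE_BYTES.bit_length() - 1  # 6 bits select a byte within a row

def split_line_address(byte_address):
    """Split a byte address into (row address, byte offset within the row)."""
    return byte_address >> OFFSET_BITS, byte_address & (LINE_BYTES - 1)

print(split_line_address(0x1234))  # (72, 52)
```

Only the row address needs to travel to the next level of the hierarchy; the 6 offset bits matter only inside the cache.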

So, what's the point of all this? At the end of the day, it all comes down to the physical hardware resources. If you know that something is word-addressed, you don't need to store the bottom 2 bits of the address because (if they were needed at all) they are guaranteed to always be zero. We can save silicon by not storing/communicating more bits than necessary. We can also simplify hardware logic by knowing certain scenarios are impossible (such as the split half-word load/store described earlier). Knowing they're impossible cases means we don't need logic (i.e. hardware) to handle them.

Additionally, if you consider the bits needed to encode an instruction (in either of the earlier 2 ISA examples), by using word-addressing, the ISA can access (/address) more memory without needing more encoding bits per instruction.

Word-addressing also offers other practical benefits. Ask yourself, why don't we have "bit addressing"? Most memory cannot be accessed on a bit-by-bit or byte-by-byte basis. At best, a whole word or whole row must be loaded/stored. So word-addressing saves hardware the effort of masking off the bottom two address bits, loading a word, then masking&shifting the loaded word to make it align to the address.
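For comparison, this is roughly the work a byte load implies when the underlying storage is organised in words. It is a software sketch only, with memory modelled as a list of 32-bit words, little-endian byte lanes assumed, and `load_byte` a made-up name.

```python
def load_byte(words, byte_address):
    """Emulate a byte load on top of word-wide storage."""
    word_address = byte_address >> 2      # mask off (drop) the bottom two address bits
    lane = byte_address & 0b11            # remember which byte of the word was asked for
    word = words[word_address]            # load the whole word
    return (word >> (lane * 8)) & 0xFF    # shift and mask to pick out that byte

memory = [0x11223344, 0xAABBCCDD]
print(hex(load_byte(memory, 0)))  # 0x44
print(hex(load_byte(memory, 6)))  # 0xbb
```

With pure word addressing, only the plain word load in the middle remains; the steps around it disappear.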

For loads/stores, the issue of word- versus byte-addressing also overlaps with the issue of (natural) alignment of the address. But that's a topic for another question.

More replies

AFAIK, all ISAs that provide byte stores at all make them thread-safe. If an RMW cycle to update the containing word in cache is necessary, it won't visibly step on stores from another core (or DMA from a device). So the RMW cycle has to be inside the cache, holding on to MESI ownership of the cache line for the duration, not just something the core does which the cache isn't aware of. (Unless you're on a single-core system with DMA that isn't cache-coherent.) See Can modern x86 hardware not store a single byte to memory? (which covers non-x86 as well)

Agreed in so far as it's ownership of the word in cache (and thus across all caches) that facilitates the behaviour. Where that logic actually sits in hardware is variable and matters little. Apologies, my answer was aimed at illustrating the work involved, rather than an in-depth description of the optimised operation of a byte store by RMW.

Yeah, I did upvote it since it walks through the relevant concepts. But I worry that simplified explanations can give rise to misconceptions about more "advanced" topics (or worse, support existing ones). Sometimes it's possible to choose phrasing that's still simple but avoids implying anything technically incorrect even without bringing up the other thing (like atomicity and thread-safety for the surrounding bytes).

So I might say something like "the <s>core</s> cache controller must first load or access the containing word...", to imply that this happens inside the cache controller's logic, not like data load/store operations from the core (which could be separated from each other by the length of the store buffer, on CPUs that have one.) On a simple CPU without a store buffer, the mental picture readers get is essentially the same, and if they're not thinking about thread-safety then that complication isn't brought up at all. But if they do, the phrasing guides them the right way.

I made an edit with my suggestion. Feel free to roll back or edit further if you have a different idea you like better.
