- html - 出于某种原因,IE8 对我的 Sass 文件中继承的 html5 CSS 不友好?
- JMeter 在响应断言中使用 span 标签的问题
- html - 在 :hover and :active? 上具有不同效果的 CSS 动画
- html - 相对于居中的 html 内容固定的 CSS 重复背景?
我被告知并从英特尔的手册中读到可以将指令写入内存,但指令预取队列已经获取过时的指令并将执行那些旧指令。我没有成功地观察到这种行为。我的方法如下。
英特尔软件开发手册第 11.6 节指出
A write to a memory location in a code segment that is currently cached in the processor causes the associated cache line (or lines) to be invalidated. This check is based on the physical address of the instruction. In addition, the P6 family and Pentium processors check whether a write to a code segment may modify an instruction that has been prefetched for execution. If the write affects a prefetched instruction, the prefetch queue is invalidated. This latter check is based on the linear address of the instruction.
int fd = open("code_area", O_RDWR | O_CREAT, S_IRWXU | S_IRWXG | S_IRWXO);
assert(fd>=0);
write(fd, zeros, 0x1000);
uint8_t *a1 = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE | PROT_EXEC,
MAP_FILE | MAP_SHARED, fd, 0);
uint8_t *a2 = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE | PROT_EXEC,
MAP_FILE | MAP_SHARED, fd, 0);
assert(a1 != a2);
fun:
push %rbp
mov %rsp, %rbp
xorq %rax, %rax # Return value 0
# A far jump simulated with a far return
# Push the current code segment %cs, then the address we want to far jump to
xorq %rsi, %rsi
mov %cs, %rsi
pushq %rsi
leaq copy(%rip), %r15
pushq %r15
lretq
copy:
# Overwrite the two nops below with `inc %eax'. We will notice the change if the
# return value is 1, not zero. The passed in pointer at %rdi points to the same physical
# memory location of fun_ins, but the linear addresses will be different.
movw $0xc0ff, (%rdi)
fun_ins:
nop # Two NOPs gives enough space for the inc %eax (opcode FF C0)
nop
pop %rbp
ret
fun_end:
nop
a1
调用函数,但我传递了一个指向
a2
的指针作为代码修改的目标。
#define DIFF(a, b) ((long)(b) - (long)(a))
long sz = DIFF(fun, fun_end);
memcpy(a1, fun, sz);
void *tochange = DIFF(fun, fun_ins);
int val = ((int (*)(void*))a1)(tochange);
最佳答案
我想,你应该检查 MACHINE_CLEARS.SMC
CPU 的性能计数器(MACHINE_CLEARS
事件的一部分)(它在 Sandy Bridge 1 中可用,用于您的 Air powerbook;也可以在您的 Xeon 上可用,即 Nehalem 2 - 搜索“smc” )。您可以使用 oprofile
, perf
或英特尔的 Vtune
找到它的值(value):
http://software.intel.com/sites/products/documentation/doclib/iss/2013/amplifier/lin/ug_docs/GUID-F0FD7660-58B5-4B5D-AA9A-E1AF21DDCA0E.htm
Machine Clears
Metric Description
Certain events require the entire pipeline to be cleared and restarted from just after the last retired instruction. This metric measures three such events: memory ordering violations, self-modifying code, and certain loads to illegal address ranges.
Possible Issues
A significant portion of execution time is spent handling machine clears. Examine the MACHINE_CLEARS events to determine the specific cause.
MACHINE_CLEARS Event Code: 0xC3 SMC Mask: 0x04
Self-modifying code (SMC) detected.
Number of self-modifying-code machine clears detected.
This event fires when self-modifying code is detected. This can be typically used by folks who do binary editing to force it to take certain path (e.g. hackers). This event counts the number of times that a program writes to a code section. Self-modifying code causes a severe penalty in all Intel 64 and IA-32 processors. The modified cache line is written back to the L2 and LLC caches. Also, the instructions would need to be re-loaded hence causing performance penalty.
mov
;你的 nops 已经在筹备中。但是 SMC 将在 mov 退休时提高,它将杀死所有正在筹备中的东西,包括 nops。
The penalty for executing a piece of code immediately after modifying it is approximately 19 clocks for P1, 31 for PMMX, and 150-300 for PPro, P2, P3, PM. The P4 will purge the entire trace cache after self-modifying code. The 80486 and earlier processors require a jump between the modifying and the modified code in order to flush the code cache. ...
Self-modifying code is not considered good programming practice. It should be used only if the gain in speed is substantial and the modified code is executed so many times that the advantage outweighs the penalties for using self-modifying code.
Placing writable data in the code segment might be impossible to distinguish from self-modifying code. Writable data in the code segment might suffer the same performance penalty as self-modifying code.
Software should avoid writing to a code page in the same 1-KByte subpage that is being executed or fetching code in the same 2-KByte subpage of that is being written. In addition, sharing a page containing directly or speculatively executed code with another processor as a data page can trigger an SMC condition that causes the entire pipeline of the machine and the trace cache to be cleared. This is due to the self-modifying code condition.
I'm a little confused about the manual mentioning P6 and Pentium processors
11.6 SELF-MODIFYING CODE A write to a memory location in a code segment that is currently cached in the processor causes the associated cache line (or lines) to be invalidated. This check is based on the physical address of the instruction. In addition, the P6 family and Pentium processors check whether a write to a code segment may modify an instruction that has been prefetched for execution. If the write affects a prefetched instruction, the prefetch queue is invalidated. This latter check is based on the linear address of the instruction. For the Pentium 4 and Intel Xeon processors, a write or a snoop of an instruction in a code segment, where the target instruction is already decoded and resident in the trace cache, invalidates the entire trace cache. The latter behavior means that programs that self-modify code can cause severe degradation of performance when run on the Pentium 4 and Intel Xeon processors.
In practice, the check on linear addresses should not create compatibility problems among IA-32 processors. Applications that include self-modifying code use the same linear address for modifying and fetching the instruction.
Systems software, such as a debugger, that might possibly modify an instruction using a different linear address than that used to fetch the instruction, will execute a serializing operation, such as a CPUID instruction, before the modified instruction is executed, which will automatically resynchronize the instruction cache and prefetch queue. (See Section 8.1.3, “Handling Self- and Cross-Modifying Code,” for more information about the use of self-modifying code.)
For Intel486 processors, a write to an instruction in the cache will modify it in both the cache and memory, but if the instruction was prefetched before the write, the old version of the instruction could be the one executed. To prevent the old instruction from being executed, flush the instruction prefetch unit by coding a jump instruction immediately after any write that modifies an instruction
Self modifying code is detected using a translation lookaside buffer .. [which] has physical page addresses stored therein over which snoops can be performed using the physical memory address of a store into memory. ... To provide finer granularity than a page of addresses, FINE HIT bits are included with each entry in the cache associating information in the cache to portions of a page within memory.
Therefore snoops, triggered by store instructions into memory, can perform SMC detection by comparing the physical address of all instructions stored within the instruction cache with the address of all instructions stored within the associated page or pages of memory. If there is an address match, it indicates that a memory location was modified. In the case of an address match, indicating an SMC condition, the instruction cache and instruction pipeline are flushed by the retirement unit and new instructions are fetched from memory for storage into the instruction cache.
Because snoops for SMC detection are physical and the ITLB ordinarily accepts as an input a linear address to translate into a physical address, the ITLB is additionally formed as a content-addressable memory on the physical addresses and includes an additional input comparison port (referred to as a snoop port or reverse translation port)
关于c - 使用自修改代码观察 x86 上的陈旧指令提取,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/17395557/
我想知道是否有一种方法可以重复记录而不进行排序?有时候,我想保持原始顺序,只想删除重复的记录。 是否可以? 顺便说一句,以下是我所知道的有关重复记录的信息,这些记录最终会进行排序。 1。 proc s
我想更新我的 Activity 中依赖于另一个列表的数据的列表。这两个数据列表都是从我的 View 模型的 Activity 中观察到的。从第一个列表获取数据后,我需要在此列表上运行 for 循环以获
我无法理解这个问题。我怎样才能等待 i==2 完成然后再继续其他 i 的操作? class Observable { constructor() { this.observer
我正在观察这样的 Ember Data RecordArray: myArray: function() { return MyRecord.find(); }.property(), isDir
我想在动画开始时观察 strokeEnd 键路径。但是它不起作用,我哪里出错了? - (void)addAnimation { // do animation CABasicAnima
是否可以在 Algorand 中观看某个交易,就像在以太坊中观看某个事件一样? 最佳答案 官方 algod 和 indexer API 目前不支持在 Algorand 上观看交易/事件。 您可以通过使
我有一个可以拖放到其他 View 之上的 View (可以说是类别)。为了检测我在哪个类别 View 之上,我将它们的帧存储在一个帧数组中,这发生在它们不可见叠加层的 onAppear 中。 (这基于
是否可以将观察者添加到可见性更改(即调用 show() 和 hide())时触发的 DOM 元素?谢谢! 最佳答案 如果您想观察任何对 .show() 或 .hide() 的调用,并且可以访问 jQu
我对保存在 NSUserdefaults 中的特定键的值变化感兴趣。然而,我所拥有的并不适合我。 observeValueForKeyPath 不会被触发。 更新:我想我已经发现了这个问题。如果我使用
我正在寻找在 UITableView 顶部实现捏入/捏出,我已经研究了几种方法,包括这个: Similar question 但是,虽然我可以创建一个 UIViewTouch 对象并将其覆盖到我的 U
我有一个在界面中公开的可变数组。我还公开了数组访问器来修改数组。如果数组内发生任何修改,我将不得不使用 KVO 重置并重新计算一些数据。为了支持 KVO,我使用 array accessors如下图:
当 NSPopupButton 发生变化时如何获得方法调用?谢谢! 最佳答案 您只需添加一个操作方法,就像使用 NSButton 或任何其他控件一样。 关于iphone - 观察 NSPopupBut
我正在尝试让键值观察适用于 NSMutableArray。下面是被观察类 MyObservee 的 .h 文件: @interface MyObservee : NSObject { @pri
我很难理解让 Node.js 进程(异步)运行但仍然触发“退出”状态,以便在 CPU 处理完成后我可以做更多事情。 例如,我有一个 Google 地方信息抓取工具,可以在所有可用的 CPU 上高效地分
我正在尝试编写行为类似于kubectl get pods --watch . 这样,每次 pod 的状态发生变化时,我都会被触发。 我创建了一个 go项目(在集群中运行)并添加以下代码: podsWa
我有这个代码: 当时我需要触发Javascript方法或具有给定 id 的 div 隐藏或显示,这将在屏幕调整大小期间发生(因此 u k-hidden-small ),这可以
我想使用 Couchbase,但我想在一些类似于 RethinkDB 的方式实现更改跟踪。 似乎有很多方法可以将更改从 Couchbase 服务器推送给我。 DCP 点击 XDCR 哪一个是正确的选择
虽然 MutationObserver 允许监视 HTMLElement 属性的显式大小更改,但它似乎没有一种方法/配置允许我监视其大小的隐式更改,这些更改是由浏览器。 这是一个例子: const o
我有一个 auto-carousel 指令,它循环访问链接元素的子元素。 但是,子级尚未加载到 DOM 中,因为它们的 ng-if 表达式尚未解析。 如何确保父指令知道其 DOM 树已发生更改?
有没有办法观察 AngularJS 指令中函数表达式的值变化?我有以下 HTML 和 JavaScript,模板中 {{editable()}} 的插值显示该值计算为 true,而检查 Chrome
我是一名优秀的程序员,十分优秀!