
multithreading - OpenCL host memory visibility with clEnqueueMapBuffer and 'querying whether the command has finished'

Reposted. Author: 行者123. Updated: 2023-12-03 13:03:39

The OpenCL 1.1 standard says (5.2.3):

If blocking_map is CL_FALSE i.e. map operation is non-blocking, the pointer to the mapped region returned by clEnqueueMapBuffer cannot be used until the map command has completed. The event argument returns an event object which can be used to query the execution status of the map command. When the map command is completed, the application can access the contents of the mapped region using the pointer returned by clEnqueueMapBuffer.



However, section 5.9 (immediately after Table 5.15) contains the following statement:

Using clGetEventInfo to determine if a command identified by event has finished execution (i.e. CL_EVENT_COMMAND_EXECUTION_STATUS returns CL_COMPLETE) is not a synchronization point. There are no guarantees that the memory objects being modified by command associated with event will be visible to other enqueued commands.



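Q1: So I wonder whether there is any other way to "query the execution status of the map command", and whether memory consistency is guaranteed (in this case, for the host) when the query returns "CL_COMPLETE"?
Q2: Am I missing something?
Q3: What is the typical OpenCL idiom for this situation?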

Best answer

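1 - Enqueue a barrier and take the event from that command to get both visibility and fine-grained synchronization with the host.

Waiting on that event by querying it in a while loop uses more CPU, but at least it gives good granularity.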

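2 - Events are for fine-grained control. Barriers are for waiting and visibility.

For example, clWaitForEvents gives you both and uses less CPU than querying, but its granularity is coarser than querying.

On the device side, just use a network of events to build a dependency graph between queues.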

3 - There is no typical idiom. Choose whichever best fits your problem.

A similar question about "multithreading - OpenCL host memory visibility with clEnqueueMapBuffer and 'querying whether the command has finished'" can be found on Stack Overflow: https://stackoverflow.com/questions/42760740/
