iOS Metal: The fastest way to read read-only data?

Situation: In a Metal kernel function, every thread in a threadgroup reads exactly the same value at a time. Kernel pseudocode:

kernel void foo(device   int2*   ranges,  
                constant float3& readonlyBuffer,  
                device   float*  results,  
                uint lno [[ threadgroup_position_in_grid ]])  
{  
  float acc = 0.0;  

  for(int i=ranges[lno].x; i<ranges[lno].y; i++) {  
    // each thread in threadgroup processes the same value from the buffer  
    acc += process( readonlyBuffer[i] );  
  }  

  results[...] = acc;  
}

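Problem: In an attempt to optimize buffer reads, I changed the address space qualifier of readonlyBuffer from device to constant. This had zero impact on kernel performance, even though the Apple documentation suggests otherwise: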

The constant address space is optimized for multiple instances executing a graphics or kernel function accessing the same location in the buffer.

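Questions:

How can I improve memory read time for the constant buffer?
Can I move the buffer (or at least part of it) into on-chip cache, along the lines of Constant Buffer Preloading (page 24)?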

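Best answer

In your example code, indexing into readonlyBuffer will generate a compiler error, because it is declared as a reference (constant float3&) rather than as a pointer or an array.

Assuming readonlyBuffer is declared as a pointer instead, the compiler does not statically know its size and cannot move the data into the constant memory space.

If readonlyBuffer is small (you only have 4KB of constant memory to work with), put it into a struct as follows: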

struct ReadonlyBuffer {
    float3 values[MAX_BUFFER_SIZE];
};

Then do:

kernel void foo(device   int2*   ranges,  
                constant ReadonlyBuffer& readonlyBuffer,  
                device   float*  results,  
                uint lno [[ threadgroup_position_in_grid ]])
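
When choosing a value for MAX_BUFFER_SIZE, it can help to check the struct's size against the 4KB budget on the host. The following is a minimal C++ sketch, not Metal code. Float3 is a hypothetical host-side stand-in that mirrors the 16-byte size and alignment of Metal's float3, and kConstantBudgetBytes simply encodes the 4KB figure quoted above, which should be treated as an assumption for your target device.

```cpp
#include <cstddef>

// Hypothetical host-side stand-in for Metal's float3, which occupies
// 16 bytes (three floats plus padding) and is 16-byte aligned.
struct alignas(16) Float3 {
    float x, y, z;
};

// Budget taken from the answer above (4KB); an assumption, not a queried limit.
constexpr std::size_t kConstantBudgetBytes = 4 * 1024;

// Illustrative choice: the largest element count that still fits the budget.
constexpr std::size_t MAX_BUFFER_SIZE = kConstantBudgetBytes / sizeof(Float3);

// Host-side mirror of the struct passed to the kernel by constant reference.
struct ReadonlyBuffer {
    Float3 values[MAX_BUFFER_SIZE];
};

// Compile-time checks: fail the build if the layout assumptions do not hold.
static_assert(sizeof(Float3) == 16, "Float3 must match Metal float3 size");
static_assert(sizeof(ReadonlyBuffer) <= kConstantBudgetBytes,
              "ReadonlyBuffer exceeds the assumed constant memory budget");
```

Under these assumptions the struct holds at most 256 float3 values. If the real data is larger than that, it will not fit in the budget as a single struct.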

Finally, run a GPU trace ("Capture GPU Frame") and make sure you do not get the following error:

The Compiler was not able to Preload your Buffer. Kernel Function, Buffer Index: 1.

For more information about buffer preloading, see: https://developer.apple.com/videos/play/wwdc2016/606/?time=408

Source: this question on Stack Overflow: https://stackoverflow.com/questions/39401417/
