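I'm new to Vulkan and am working through the nvpro sample vk_mini_path_tracer. I pass a counter variable, initialized to 0, to my compute shader as an SSBO. I can increment the value in the shader and read it back on the CPU. However, the value is lower than I expect, and it comes back as a different number on each run (roughly 30-60). I can't figure out how to synchronize the buffer value; I think it may have to do with how the workgroups launched by vkCmdDispatch run in parallel.
Is there a way to increment this counter so that the change is reflected in all other shader invocations? Where do I need to set up this synchronization: in the shader code or on the CPU side?
I looked into GLSL's memoryBarrierBuffer() and other memory barrier concepts, but I can't tell whether they apply to a single compute shader dispatched across workgroups.
render.cpp: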
const uint64_t rayCount = 0;
nvvk::Buffer genRayCount;
...
VkCommandBuffer uploadCmdBuffer = AllocateAndBeginOneTimeCommandBuffer(context, cmdPool);
// We get these buffers' device addresses, and use them as storage buffers and build inputs.
const VkBufferUsageFlags usage = VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT | VK_BUFFER_USAGE_STORAGE_BUFFER_BIT;
const VkMemoryPropertyFlags memUsage = VK_MEMORY_PROPERTY_HOST_CACHED_BIT
                                       | VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
                                       | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT;
genRayCount = allocator.createBuffer(uploadCmdBuffer, sizeof(uint64_t), &rayCount, usage, memUsage);
... // other bindings
descriptorSetContainer.addBinding(3, VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, 1, VK_SHADER_STAGE_COMPUTE_BIT);
// descriptor sets
std::array<VkWriteDescriptorSet, 4> writeDescriptorSets;
...
// binding 3 for ray gen counter
VkDescriptorBufferInfo genRayCountDescriptorBufferInfo{};
genRayCountDescriptorBufferInfo.buffer = genRayCount.buffer;
genRayCountDescriptorBufferInfo.range = genRayCountSizeBytes;
writeDescriptorSets[3] = descriptorSetContainer.makeWrite(0, 2, &genRayCountDescriptorBufferInfo);
...
vkCmdDispatch(cmdBuffer, (uint32_t(cam.width) + workgroup_width - 1) / workgroup_width,
              (uint32_t(cam.height) + workgroup_height - 1) / workgroup_height, 1);
...
raytrace.comp.glsl:
...
layout(binding = 3, set = 0) buffer generatedRayCount
{
  uint genRayCount;
};
...
void main()
{
  // The resolution of the buffer, which in this case is a hardcoded vector
  // of 2 unsigned integers:
  const uvec2 resolution = uvec2(cam.render_width, cam.render_height);
  // Get the coordinates of the pixel for this invocation:
  //
  // .-------.-> x
  // |       |
  // |       |
  // '-------'
  // v
  // y
  uvec2 pixel = gl_GlobalInvocationID.xy;
  // If the pixel is outside of the image, don't do anything:
  if((pixel.x >= resolution.x) || (pixel.y >= resolution.y))
  {
    return;
  }
  // Get the index of this invocation in the buffer:
  uint linearIndex = resolution.x * pixel.y + pixel.x;
  uint frame_size = resolution.x * resolution.y;
  uint count = genRayCount;
  if(rayBuffer[linearIndex].isIntersected) {
    //... generate a 2nd ray
    // increment ray counter
    genRayCount += 1;
  }
  memoryBarrierBuffer();
  const vec3 pixelColor = (rayBuffer[linearIndex].isIntersected) ? vec3(0.9) :
                          vec3(float(pixel.x) / resolution.x,
                               float(count) / resolution.x, float(pixel.y) / resolution.y);
  // Write the color to the buffer.
  imageData[linearIndex] += pixelColor;
}
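Best answer
You probably want to use an atomic add here: in the shader, replace genRayCount += 1; with atomicAdd(genRayCount, 1);.
The += operation is a non-atomic read-modify-write, so invocations that run concurrently race on it and the result is undefined. Say you have 500 threads labeled 0-499, and the counter currently holds 10. Thread 32 reads 10, and thread 411 also reads 10. Both then try to write 11 to your variable, so the count comes out too low. You can also have a thread overwrite a higher count with a lower one, along with all the other kinds of multithreaded data corruption.
You need to synchronize your threads for this to work. memoryBarrierBuffer() does not do that on its own: it only orders memory accesses and does not make the read, add, and write a single indivisible step.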
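The lost update described above can be reproduced on the CPU. The following is a minimal C++ sketch, not Vulkan code: std::atomic's fetch_add stands in for GLSL's atomicAdd, and the function names countWithAtomicAdd and replayLostUpdate are hypothetical, introduced only for this illustration. The second function replays the thread 32 / thread 411 interleaving step by step, so its result does not depend on timing.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical helper: every thread increments a shared counter with an
// atomic read-modify-write, the CPU analogue of atomicAdd(genRayCount, 1).
uint32_t countWithAtomicAdd(uint32_t numThreads, uint32_t incrementsPerThread)
{
  std::atomic<uint32_t> counter{0};
  std::vector<std::thread> threads;
  threads.reserve(numThreads);
  for(uint32_t t = 0; t < numThreads; ++t)
  {
    threads.emplace_back([&counter, incrementsPerThread] {
      for(uint32_t i = 0; i < incrementsPerThread; ++i)
      {
        // Read, add, and write happen as one indivisible step.
        counter.fetch_add(1, std::memory_order_relaxed);
      }
    });
  }
  for(std::thread& th : threads)
  {
    th.join();
  }
  return counter.load();
}

// Hypothetical helper: replays, in a single thread, the interleaving in which
// two invocations both read the counter before either one writes it back.
uint32_t replayLostUpdate(uint32_t start)
{
  uint32_t counter = start;
  const uint32_t readByThread32  = counter;  // thread 32 reads the old value
  const uint32_t readByThread411 = counter;  // thread 411 reads the same old value
  counter = readByThread32 + 1;              // thread 32 writes old + 1
  counter = readByThread411 + 1;             // thread 411 writes old + 1 again
  return counter;                            // one of the two increments is lost
}
```

Calling countWithAtomicAdd(8, 10000) returns exactly 80000 every time, whereas replayLostUpdate(10) returns 11 rather than 12, which is the same undercount the shader shows with a plain +=.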