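I need your help today! I have started using compute shaders for a very simple use case: I have a depth camera, and I want to compute the bounding box of an object close to the camera.
But I have too many pixels to process, so I want to use GPGPU, a compute shader, and parallelization to compute it.
I currently have a problem: when I run my program, I get the same min and max coordinates. So I think all my groups and threads are writing to my StructuredBuffers at the same time.
Do you know how to fix this?
Here is the code of my compute shader: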
#pragma kernel ComputeBoundingBox
//We define the size of a group in the x, y and z directions, y and z direction will just be one (because 1D array is used for depthData)
#define thread_group_size_x 1024
#define thread_group_size_y 1
#define thread_group_size_z 1
//Size of the depthData picture
#define width 512;
#define height 424;
//DataBuffer = depthData of the camera
//minBuffer, maxBuffer, array of size 3 with min/max x, y and z
//mask = image area to process
RWStructuredBuffer<float> dataBuffer;
globallycoherent RWStructuredBuffer<float> minBuffer;
globallycoherent RWStructuredBuffer<float> maxBuffer;
RWStructuredBuffer<float> mask;
float xValue = 0, yValue = 0, zValue = 0;
[numthreads(thread_group_size_x, thread_group_size_y, thread_group_size_z)]
void ComputeBoundingBox(uint3 id : SV_DispatchThreadID)
{
    //xValue and yValue = [X,Y] index in 2D
    //zValue = depthValue of [X,Y] index
    xValue = (id.x + 1) % width;
    yValue = (id.x + 1) / width;
    zValue = dataBuffer[id.x];
    if (mask[id.x] > 0.49)
    {
        if (zValue > 500 && zValue < 1500)
        {
            if (xValue < minBuffer[0])
                minBuffer[0] = xValue;
            else if (xValue > maxBuffer[0])
                maxBuffer[0] = xValue;
            if (yValue < minBuffer[1])
                minBuffer[1] = yValue;
            else if (yValue > maxBuffer[1])
                maxBuffer[1] = yValue;
            if (zValue < minBuffer[2])
                minBuffer[2] = zValue;
            else if (zValue > maxBuffer[2])
                maxBuffer[2] = zValue;
        }
    }
}
Here is the part of the code that calls the compute shader:
void RunShader()
{
    dataBuffer.SetData(depthDataFloat);
    minDataBuffer.SetData(reinitialiseMinBuffer);
    maxDataBuffer.SetData(reinitialiseMaxBuffer);
    maskBuffer.SetData(mask);
    computeShader.SetBuffer(_kernel, "dataBuffer", dataBuffer);
    computeShader.SetBuffer(_kernel, "minBuffer", minDataBuffer);
    computeShader.SetBuffer(_kernel, "maxBuffer", maxDataBuffer);
    computeShader.SetBuffer(_kernel, "mask", maskBuffer);
    computeShader.Dispatch(_kernel, 212, 1, 1);
}
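Best answer
In your case you do not handle data races, so multiple threads can write to the same location at the same time.
To make sure your writes are atomic, you need to use the Interlocked functions. These only work on integer types such as uint, but in your case (assuming the depth data is always > 0), comparing the binary representations of the floats gives the same result as comparing their values.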
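The bit-pattern claim can be checked on the CPU. The following is a minimal sketch in plain C#, not part of the original answer; the `FloatBits` class and its method names are hypothetical helpers introduced only for illustration. `AsUint` mirrors what HLSL's `asuint` does to a float.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class FloatBits
{
    // Reinterpret a float's IEEE 754 bit pattern as an unsigned integer
    // (the CPU-side counterpart of HLSL's asuint).
    public static uint AsUint(float value)
    {
        return BitConverter.ToUInt32(BitConverter.GetBytes(value), 0);
    }

    // True when comparing the raw bit patterns orders a and b
    // the same way as comparing their float values.
    public static bool SameOrder(float a, float b)
    {
        return (a < b) == (AsUint(a) < AsUint(b));
    }
}
```

`FloatBits.SameOrder(500f, 1500f)` returns true, while `FloatBits.SameOrder(-1f, 1f)` returns false: the sign bit makes a negative float look like a very large unsigned integer, which is why the trick relies on the depth values being non-negative.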
Here is the modified shader:
#pragma kernel ComputeBoundingBox
#define thread_group_size_x 1024
#define thread_group_size_y 1
#define thread_group_size_z 1
//Size of the depthData picture (no trailing semicolon, so the macros can be used inside expressions)
#define width 512
#define height 424
//DataBuffer = depthData of the camera
//minBuffer, maxBuffer, array of size 3 with min/max x, y and z, stored as float bit patterns
//mask = image area to process
StructuredBuffer<float> dataBuffer;
//The Interlocked functions only accept int/uint destinations, so these are declared as uint
RWStructuredBuffer<uint> minBuffer;
RWStructuredBuffer<uint> maxBuffer;
StructuredBuffer<float> mask;
[numthreads(thread_group_size_x, thread_group_size_y, thread_group_size_z)]
void ComputeBoundingBox(uint3 id : SV_DispatchThreadID)
{
    //depth = depthValue of [X,Y] index
    float depth = dataBuffer[id.x];
    if (mask[id.x] > 0.49)
    {
        //Test the range on the float value, not on its bit pattern
        if (depth > 500 && depth < 1500)
        {
            //xValue and yValue = [X,Y] index in 2D, converted to float bit patterns
            //Non-negative floats order the same way as their bit patterns
            uint xValue = asuint((float)((id.x + 1) % width));
            uint yValue = asuint((float)((id.x + 1) / width));
            uint zValue = asuint(depth);
            uint oldValue;
            InterlockedMin(minBuffer[0], xValue, oldValue);
            InterlockedMax(maxBuffer[0], xValue, oldValue);
            InterlockedMin(minBuffer[1], yValue, oldValue);
            InterlockedMax(maxBuffer[1], yValue, oldValue);
            InterlockedMin(minBuffer[2], zValue, oldValue);
            InterlockedMax(maxBuffer[2], zValue, oldValue);
        }
    }
}
I also declared dataBuffer and mask as StructuredBuffer (you only read from them, and binding them as read-only is generally faster).
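To check what the shader produces, it can help to compute the same bounding box on the CPU and compare. The following is a minimal sketch under stated assumptions, not part of the original answer: `BoundingBoxReference` is a hypothetical helper, and it takes the mask threshold (0.49), the depth range (500 to 1500), and the `+ 1` index offset directly from the shader in the question.

```csharp
using System;

// Hypothetical CPU reference, for validating the GPU result only.
public static class BoundingBoxReference
{
    // Returns {minX, minY, minZ, maxX, maxY, maxZ},
    // or null when no pixel passes the mask and depth tests.
    public static float[] Compute(float[] depth, float[] mask, int width)
    {
        float[] box = null;
        for (int i = 0; i < depth.Length; i++)
        {
            float z = depth[i];
            // Same tests as the shader: mask above 0.49, depth strictly inside (500, 1500).
            if (mask[i] <= 0.49f || z <= 500f || z >= 1500f)
                continue;

            // Same index mapping as the shader, including its +1 offset.
            float x = (i + 1) % width;
            float y = (i + 1) / width;

            if (box == null)
                box = new[] { x, y, z, x, y, z };

            box[0] = Math.Min(box[0], x);
            box[1] = Math.Min(box[1], y);
            box[2] = Math.Min(box[2], z);
            box[3] = Math.Max(box[3], x);
            box[4] = Math.Max(box[4], y);
            box[5] = Math.Max(box[5], z);
        }
        return box;
    }
}
```

With the arrays from the question, a call would look like `BoundingBoxReference.Compute(depthDataFloat, mask, 512)`. It is sequential, so it is only meant for occasional comparison against the GPU output, not for per-frame use.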
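You also need to make sure the min/max buffers are cleared to suitable values first (that is, before this shader is dispatched).
This can be done with a simple compute shader (dispatched with a single thread):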
#pragma kernel ClearBuffers
//Declared as uint so the sentinel bit patterns are stored exactly as written
RWStructuredBuffer<uint> minBuffer;
RWStructuredBuffer<uint> maxBuffer;
[numthreads(1, 1, 1)]
void ClearBuffers(uint3 id : SV_DispatchThreadID)
{
    //Any value passed to InterlockedMin is <= maxUint, any value passed to InterlockedMax is >= minUint
    uint maxUint = 0xffffffff;
    uint minUint = 0;
    minBuffer[0] = maxUint;
    minBuffer[1] = maxUint;
    minBuffer[2] = maxUint;
    maxBuffer[0] = minUint;
    maxBuffer[1] = minUint;
    maxBuffer[2] = minUint;
}
Note that the uint/float aliasing works in this case, so you do not need to perform any conversion.
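If the buffers are read back as raw `uint` values instead, the bits can be turned into floats on the CPU. The following is a minimal sketch in plain C#, not part of the original answer; `MinMaxBits` and its members are hypothetical names. It assumes the entry being decoded was written as a float bit pattern (as the depth component is, through `asuint`), and that the min buffer was cleared to `0xffffffff` as described above.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class MinMaxBits
{
    // Clear value assumed for the min buffer (see the clear step above).
    public const uint MinSentinel = 0xffffffffu;

    // Reinterpret raw bits as a float (the CPU-side counterpart of HLSL's asfloat).
    public static float AsFloat(uint bits)
    {
        return BitConverter.ToSingle(BitConverter.GetBytes(bits), 0);
    }

    // If the min entry still holds the sentinel, no thread wrote to it,
    // so no pixel passed the mask and depth tests.
    public static bool IsEmpty(uint rawMin)
    {
        return rawMin == MinSentinel;
    }
}
```

In Unity, the raw values could be fetched with something like `minDataBuffer.GetData(rawMin)` where `rawMin` is a `uint[3]` (the array name is an assumption; `minDataBuffer` is the buffer from the question). Checking `IsEmpty` first matters because the sentinel `0xffffffff` reinterpreted as a float is NaN, not a usable coordinate.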
This question about Unity compute shader synchronization (unity-game-engine) comes from Stack Overflow: https://stackoverflow.com/questions/44760027/