我写了这个简单的代码:
public static class Bezier
{
private static Vector2 GetInterpolatedPoint(Vector2 a, Vector2 b, float t)
{
return new Vector2(a.X * (1f - t) + b.X * t, a.Y * (1f - t) + b.Y * t);
}
public static Vector2 GetPoint(IList<Vector2> points, float t)
{
int initialCount = points.Count;
Vector2[] buf;
while (points.Count > 2)
{
buf = new Vector2[points.Count-1];
for (int i = 0; i < points.Count - 1; i++)
{
buf[i] = GetInterpolatedPoint(points[i], points[i + 1], t);
}
points = buf;
}
return GetInterpolatedPoint(points[0], points[1], t);
}
}
输入:
- 100 个随机 2D 点的列表
- 0.0001f 步(10000 次迭代)
结果:
- 以 ~1200 毫秒计算。
对于我的项目来说,它太慢了。我怎样才能改善这个结果?我需要有关此代码改进的任何提示或其他更有效的计算方法。
最佳答案
使用最新版本的 C# 可以显着加快速度。
您可以通过几种方式做到这一点:
- 使用
Span<T>
和stackalloc
以避免堆分配。 - 重复使用该跨度,以便只需要一次内存分配。这需要将值的计数维护为一个单独的变量。
- 指定
Vector2
getInterpolatedPoint()
的参数作为in
.
将它们放在一起(并将方法重命名为具有 Opt
后缀):
public static class Bezier
{
static Vector2 getInterpolatedPointOpt(in Vector2 a, in Vector2 b, float t)
{
return new Vector2(a.X * (1f - t) + b.X * t, a.Y * (1f - t) + b.Y * t);
}
public static Vector2 GetPointOpt(IList<Vector2> points, float t)
{
int count = points.Count;
Span<Vector2> array = stackalloc Vector2[count]; // Not suitable for large arrays.
for (int i = 0; i < count - 1; i++)
{
array[i] = getInterpolatedPointOpt(points[i], points[i + 1], t);
}
while (--count > 2)
{
for (int i = 0; i < count - 1; i++)
{
array[i] = getInterpolatedPointOpt(array[i], array[i + 1], t);
}
}
return getInterpolatedPointOpt(array[0], array[1], t);
}
}
第一次迭代的单独循环使用原始数据,其余迭代的循环使用跨度。这避免了在插入原始数据之前复制原始数据。
我写了下面的测试代码,但是你应该使用 Benchmark.Net 进行适当的计时(这个测试代码假设原始未优化的方法仍在类中):
Random rng = new Random(123445);
Vector2[] buff = new Vector2[10000];
var sw = new Stopwatch();
for (int i = 0; i < 10; ++i)
{
for (int j = 0; j < buff.Length; ++j)
{
buff[j].X = rng.NextSingle();
buff[j].Y = rng.NextSingle();
}
sw.Restart();
Vector2 result1 = Bezier.GetPointOpt(buff, 0.0001f);
Console.WriteLine("GetPointOpt() took " + sw.Elapsed.TotalMilliseconds);
sw.Restart();
Vector2 result2 = Bezier.GetPoint(buff, 0.0001f);
Console.WriteLine("GetPoint() took " + sw.Elapsed.TotalMilliseconds);
Console.WriteLine(result1);
Console.WriteLine(result2);
Console.WriteLine();
}
输出(在 Release模式下运行):
GetPointOpt() took 107.5646
GetPoint() took 631.597
<0.49709368, 0.55904883>
<0.49709368, 0.55904883>
GetPointOpt() took 95.9445
GetPoint() took 502.4843
<0.5745254, 0.48695806>
<0.5745254, 0.48695806>
GetPointOpt() took 99.6847
GetPoint() took 507.7098
<0.59854025, 0.4798301>
<0.59854025, 0.4798301>
GetPointOpt() took 94.5894
GetPoint() took 480.0481
<0.4043915, 0.3434864>
<0.4043915, 0.3434864>
GetPointOpt() took 96.8213
GetPoint() took 481.8792
<0.4802326, 0.5464641>
<0.4802326, 0.5464641>
GetPointOpt() took 92.3702
GetPoint() took 478.834
<0.64281195, 0.72084826>
<0.64281195, 0.72084826>
GetPointOpt() took 109.5258
GetPoint() took 507.9398
<0.5327367, 0.6595588>
<0.5327367, 0.6595588>
GetPointOpt() took 92.3229
GetPoint() took 474.3246
<0.5084992, 0.57937586>
<0.5084992, 0.57937586>
GetPointOpt() took 107.6864
GetPoint() took 479.8337
<0.25628412, 0.6602504>
<0.25628412, 0.6602504>
GetPointOpt() took 98.2489
GetPoint() took 500.7994
<0.71759677, 0.50711>
<0.71759677, 0.50711>
此优化代码的运行速度大约快了五倍。
附录1:Benchmark.NET测试代码:
public class UnderTest
{
public UnderTest()
{
for (int j = 0; j < _buff.Length; ++j)
{
_buff[j].X = _rng.NextSingle();
_buff[j].Y = _rng.NextSingle();
}
}
[Benchmark(Baseline = true)]
public void Unoptimised()
{
Bezier.GetPoint(_buff, 0.0001f);
}
[Benchmark]
public void Optimised()
{
Bezier.GetPointOpt(_buff, 0.0001f);
}
readonly Random _rng = new Random(123445);
readonly Vector2[] _buff = new Vector2[10000];
}
结果:
| Method | Mean | Error | StdDev | Median | Ratio | RatioSD |
|------------ |---------:|---------:|---------:|---------:|------:|--------:|
| Unoptimised | 459.6 ms | 11.03 ms | 32.35 ms | 451.0 ms | 1.00 | 0.00 |
| Optimised | 106.8 ms | 2.13 ms | 3.79 ms | 106.5 ms | 0.22 | 0.02 |
附录 2:编写 getInterpolatedPoint()
内联方法。
看起来像这样:
public static Vector2 GetPointOpt(IList<Vector2> points, float t)
{
int count = points.Count;
Span<Vector2> array = stackalloc Vector2[count]; // Not suitable for large arrays.
for (int i = 0; i < count - 1; i++)
{
array[i].X = points[i].X * (1f - t) + points[i + 1].X * t;
array[i].Y = points[i].Y * (1f - t) + points[i + 1].Y * t;
}
while (--count > 2)
{
for (int i = 0; i < count - 1; i++)
{
array[i].X = array[i].X * (1f - t) + array[i + 1].X * t;
array[i].Y = array[i].Y * (1f - t) + array[i + 1].Y * t;
}
}
return new Vector2(
array[0].X * (1f - t) + array[1].X * t,
array[0].Y * (1f - t) + array[1].Y * t);
}
但是,编写此内联的改进非常有限,这可能说明了 JIT 编译器的有效性,它可能已经内联了 getInterpolatedPoint()
方法。
结果:
| Method | Mean | Error | StdDev | Ratio |
|------------ |----------:|---------:|----------:|------:|
| Unoptimised | 437.18 ms | 8.635 ms | 13.186 ms | 1.00 |
| Optimised | 77.90 ms | 1.546 ms | 3.735 ms | 0.18 |
比率是 0.18 而不是 0.22 - 比调用 getInterpolatedPoint()
快了大约 20% .
关于c# - 我的贝塞尔曲线代码太慢了。我该如何解决?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/74041759/