c++ - Optimized FFT and math for an AT91SAM9 ARM processor Linux userspace program

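I am developing C/C++ software for an embedded Linux system based on Atmel's AT91SAM9G20 processor. I need to compute FFTs quickly in a Linux userspace program using fixed-point (or floating-point) math. I understand that assembler might be the way to implement this, and that an extra -mcpu switch may be required when compiling with gcc. What would be the best approach for this implementation? Are there any good book references or optimized FOSS libraries available?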

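I have to implement some algorithms that apply small FFTs (e.g. 1024 points) many times, and I wonder whether a library such as kissfft would work well for that too. I am also interested in long FFT lengths, so FFTW, as suggested in the answer below, would work well there.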

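In addition to this question, I would also like to know how integer division is handled in ARM9 Linux userspace programs. If I divide two integers (e.g. 25/4), is the division done using soft float? I also need to implement some heavy number-crunching algorithms, and I wonder whether fixed-point math would be better than floating-point math here, and how the gcc compiler really handles things.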
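For reference, the integer-division and fixed-point points can be illustrated with a small sketch. Integer division in C and C++ is pure integer arithmetic, so 25/4 gives 6 and no floating point is involved. The ARM926EJ-S core in the AT91SAM9G20 has no hardware divide instruction, so for divisors not known at compile time gcc typically emits a call to an integer runtime helper (`__aeabi_idiv` under the EABI) rather than using soft float. The chip also has no FPU, which is why fixed point is usually worth considering there. The names `int_div`, `q15_t` and `q15_mul` below are hypothetical helpers for illustration, not part of any library.

```cpp
#include <cassert>
#include <cstdint>

// Integer division truncates toward zero (C++11 and later); the result
// of 25 / 4 is 6 and the remainder 25 % 4 is 1. No float math is used.
inline int int_div(int a, int b) {
    assert(b != 0);  // division by zero is undefined behaviour
    return a / b;
}

// Q15 fixed point (hypothetical helper type): the stored integer raw
// represents the real value raw / 32768, so the range is [-1, 1).
using q15_t = std::int16_t;

// Multiply two Q15 values using only integer operations.
inline q15_t q15_mul(q15_t a, q15_t b) {
    // 16 x 16 -> 32 bit product, which is in Q30 format.
    std::int32_t p = static_cast<std::int32_t>(a) * static_cast<std::int32_t>(b);
    p += 1 << 14;  // add half an LSB so the shift rounds to nearest
    p >>= 15;      // back to Q15 (arithmetic shift on gcc for negative values)
    if (p > 32767) {
        p = 32767; // saturate: only (-1) * (-1) can overflow
    }
    return static_cast<q15_t>(p);
}
```

With these definitions, `int_div(25, 4)` yields 6, and `q15_mul(16384, 16384)` (0.5 times 0.5) yields 8192, which is 0.25 in Q15. Whether fixed point actually beats soft float for a given algorithm is best confirmed by measuring on the target.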

Best answer

FFTW contains CPU-specific optimizations (it can also do compile-time/run-time CPU profiling).

Version 3.3.1 introduces support for the ARM Neon extensions

http://www.fftw.org/#features

From the FAQ, Question 4.2: Why is FFTW so fast?

This is a complex question, and there is no simple answer. In fact, the authors do not fully know the answer, either. In addition to many small performance hacks throughout FFTW, there are three general reasons for FFTW's speed.

FFTW uses a variety of FFT algorithms and implementation styles that can be arbitrarily composed to adapt itself to a machine. See Q4.1 `How does FFTW work?'.

FFTW uses a code generator to produce highly-optimized routines for computing small transforms.

FFTW uses explicit divide-and-conquer to take advantage of the memory hierarchy.
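To make the divide-and-conquer idea concrete, here is a minimal textbook sketch of a recursive radix-2 FFT. It is not FFTW's code and is far simpler than what FFTW does; it only shows how a transform of length n splits into two transforms of length n/2 that are then combined. The function name `fft_recursive` is hypothetical, the length is assumed to be a power of two, and `double` is used for clarity only (on an FPU-less ARM9 a fixed-point variant would be the more realistic choice).

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Recursive radix-2 decimation-in-time FFT (textbook Cooley-Tukey).
// Assumption: x.size() is a power of two.
std::vector<cplx> fft_recursive(const std::vector<cplx>& x) {
    const std::size_t n = x.size();
    assert(n != 0 && (n & (n - 1)) == 0);
    if (n == 1) {
        return x;  // a 1-point transform is the sample itself
    }

    // Divide: split into even-indexed and odd-indexed samples.
    std::vector<cplx> even(n / 2), odd(n / 2);
    for (std::size_t i = 0; i < n / 2; ++i) {
        even[i] = x[2 * i];
        odd[i] = x[2 * i + 1];
    }

    // Conquer: transform each half recursively.
    const std::vector<cplx> e = fft_recursive(even);
    const std::vector<cplx> o = fft_recursive(odd);

    // Combine: one butterfly per output pair, using twiddle factor
    // exp(-2*pi*i*k/n).
    const double pi = std::acos(-1.0);
    std::vector<cplx> out(n);
    for (std::size_t k = 0; k < n / 2; ++k) {
        const double angle =
            -2.0 * pi * static_cast<double>(k) / static_cast<double>(n);
        const cplx t = std::polar(1.0, angle) * o[k];
        out[k] = e[k] + t;
        out[k + n / 2] = e[k] - t;
    }
    return out;
}
```

As a quick check, a unit impulse of length 8 transforms to eight bins that all equal 1. A production implementation would avoid the per-call allocations and recomputed twiddle factors shown here, which matters when a 1024-point transform is applied many times.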

Source: the original question on Stack Overflow, https://stackoverflow.com/questions/9874763/
