swift - Noise when playing sounds using AVAudioSourceNode

Tags: swift avfoundation watchos

I'm using TinySoundFont to play SF2 files on watchOS. I want to play the raw audio generated by the framework in real time (meaning tsf_note_on is called as soon as the corresponding button is pressed, and tsf_render_short is called as soon as new data is needed). I'm using an AVAudioSourceNode to achieve this.

Although the sound renders fine when I render it to a file, playback through the AVAudioSourceNode is really noisy. (Based on the answer from Rob Napier, this may be because I'm ignoring the timeStamp parameter - I'm looking for a solution to that as well.) What is causing this issue, and how can I fix it?

I'm looking for a solution that renders the audio in real time rather than precomputing it, because I also want to handle looping sounds correctly.

You can download a sample GitHub project here.

ContentView.swift

import SwiftUI
import AVFoundation

struct ContentView: View {
    @ObservedObject var settings = Settings.shared

    init() {
        settings.prepare()
    }

    var body: some View {
        Button("Play Sound") {
            Settings.shared.playSound()

            if !settings.engine.isRunning {
                do {
                    try settings.engine.start()
                } catch {
                    print(error)
                }
            }
        }
    }
}

Settings.swift

import SwiftUI
import AVFoundation

class Settings: ObservableObject {
    static let shared = Settings()

    var engine: AVAudioEngine!
    var sourceNode: AVAudioSourceNode!

    var tinySoundFont: OpaquePointer!

    func prepare() {
        let soundFontPath = Bundle.main.path(forResource: "GMGSx", ofType: "sf2")
        tinySoundFont = tsf_load_filename(soundFontPath)
        tsf_set_output(tinySoundFont, TSF_MONO, 44100, 0)

        setUpSound()
    }

    func setUpSound() {
        if let engine = engine,
           let sourceNode = sourceNode {
            engine.detach(sourceNode)
        }

        engine = .init()

        let mixerNode = engine.mainMixerNode

        let audioFormat = AVAudioFormat(
            commonFormat: .pcmFormatInt16,
            sampleRate: 44100,
            channels: 1,
            interleaved: false
        )

        guard let audioFormat = audioFormat else {
            return
        }

        sourceNode = AVAudioSourceNode(format: audioFormat) { silence, timeStamp, frameCount, audioBufferList in
            guard let data = self.getSound(length: Int(frameCount)) else {
                return 1
            }

            let ablPointer = UnsafeMutableAudioBufferListPointer(audioBufferList)

            data.withUnsafeBytes { (intPointer: UnsafePointer<Int16>) in
                for index in 0 ..< Int(frameCount) {
                    let value = intPointer[index]

                    // Set the same value on all channels (due to the inputFormat, there's only one channel though).
                    for buffer in ablPointer {
                        let buf: UnsafeMutableBufferPointer<Int16> = UnsafeMutableBufferPointer(buffer)
                        buf[index] = value
                    }
                }
            }

            return noErr
        }

        engine.attach(sourceNode)
        engine.connect(sourceNode, to: mixerNode, format: audioFormat)

        do {
            try AVAudioSession.sharedInstance().setCategory(.playback)
        } catch {
            print(error)
        }
    }

    func playSound() {
        tsf_note_on(tinySoundFont, 0, 60, 1)
    }

    func getSound(length: Int) -> Data? {
        let array = [Int16]()
        var storage = UnsafeMutablePointer<Int16>.allocate(capacity: length)
        storage.initialize(from: array, count: length)

        tsf_render_short(tinySoundFont, storage, Int32(length), 0)
        let data = Data(bytes: storage, count: length)

        storage.deallocate()

        return data
    }
}

Best answer

The AVAudioSourceNode initializer takes a render block. In the mode you're using (live playback), this is a real-time callback, so you have a very tight deadline to fill the requested block with data and return it so it can be played. You do not have a lot of time for computation. You definitely do not have time to access the file system.

In your block, you recompute the entire WAV on every render cycle, then write it to disk, then read it back from disk, and then fill the requested block. You ignore the requested timestamp and always fill the buffer starting at sample zero. That mismatch is what causes the buzzing. The fact that you're doing all of this too slowly is probably what causes the pitch drop.

Depending on the size of your files, the simplest way to do this is to decode everything into memory first, and then fill each buffer from the requested timestamp and length. Your C code already appears to generate PCM data, so there's no need to convert it to a WAV file; it already seems to be in the right format.
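A minimal sketch of that approach, for illustration (the name preloadedSamples and the zero-based timestamp handling are assumptions, not part of the original project): decode everything up front, then let the render block do nothing but copy memory, using the callback's timestamp to choose the starting sample instead of always starting at zero.

import AVFoundation

// Hypothetical: all samples decoded ahead of time (e.g. via tsf_render_float),
// so the real-time callback never renders or touches the file system.
let preloadedSamples: [Float] = []

let format = AVAudioFormat(standardFormatWithSampleRate: 44100, channels: 1)!

let sourceNode = AVAudioSourceNode(format: format) { _, timeStamp, frameCount, audioBufferList in
    let ablPointer = UnsafeMutableAudioBufferListPointer(audioBufferList)
    // Use the requested timestamp rather than refilling from sample zero.
    // (In practice you would offset this by the mSampleTime of the first callback.)
    let startFrame = Int(timeStamp.pointee.mSampleTime)
    for buffer in ablPointer {
        let buf = UnsafeMutableBufferPointer<Float>(buffer)
        for frame in 0 ..< Int(frameCount) {
            let index = startFrame + frame
            // Copy from memory only, padding with silence past the end.
            buf[frame] = index < preloadedSamples.count ? preloadedSamples[index] : 0
        }
    }
    return noErr
}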

Apple provides a very good sample project for a signal generator that you should use as a starting point. Download it and make sure it works as expected, then swap in your SF2 code. You may also find this video helpful: What's New in AVAudioEngine.


The simplest tool to use here is probably AVAudioPlayerNode. Your SoundFontHelper makes things much more complicated, so I removed it and call TSF directly from Swift. To do that, create a file called tsf.c like this:

#define TSF_IMPLEMENTATION
#include "tsf.h"

And add this to BridgingHeader.h:

#import "tsf.h"

Simplify ContentView to:

import SwiftUI

struct ContentView: View {
    @ObservedObject var settings = Settings.shared

    init() {
        // You'll want error handling here.
        try! settings.prepare()
    }

    var body: some View {
        Button("Play Sound") {
            settings.play()
        }
    }
}

All that remains is the new version of Settings, which is the heart of it:

import SwiftUI
import AVFoundation

class Settings: ObservableObject {
    static let shared = Settings()

    var engine = AVAudioEngine()
    let playerNode = AVAudioPlayerNode()
    var tsf: OpaquePointer
    var outputFormat = AVAudioFormat()

    init() {
        let soundFontPath = Bundle.main.path(forResource: "GMGSx", ofType: "sf2")
        tsf = tsf_load_filename(soundFontPath)

        engine.attach(playerNode)
        engine.connect(playerNode, to: engine.mainMixerNode, format: nil)

        updateOutputFormat()
    }

    // For simplicity, this object assumes the outputFormat does not change during its lifetime.
    // It's important to watch for route changes, and recreate this object if they occur. For details, see:
    // https://developer.apple.com/documentation/avfaudio/avaudiosession/responding_to_audio_session_route_changes
    func updateOutputFormat() {
        outputFormat = engine.mainMixerNode.outputFormat(forBus: 0)
    }

    func prepare() throws {
        // Start the engine
        try AVAudioSession.sharedInstance().setCategory(.playback)
        try engine.start()
        playerNode.play()

        updateOutputFormat()

        // Configure TSF. The only important thing here is the sample rate, which can be different on different hardware.
        // Core Audio has a defined format of "deinterleaved 32-bit floating point."
        tsf_set_output(tsf,
                       TSF_STEREO_UNWEAVED,            // mode
                       Int32(outputFormat.sampleRate), // sampleRate
                       0)                              // gain
    }

    func play() {
        tsf_note_on(tsf,
                    0,   // preset_index
                    60,  // key (middle C)
                    1.0) // velocity

        // These tones have a long falloff, so you want a lot of source data. This is 10s.
        let frameCount = 10 * Int(outputFormat.sampleRate)

        // Create a buffer for the samples
        let buffer = AVAudioPCMBuffer(pcmFormat: outputFormat, frameCapacity: AVAudioFrameCount(frameCount))!
        buffer.frameLength = buffer.frameCapacity

        // Render the samples. Do not mix. This buffer has been extended to
        // the needed size by the assignment to `frameLength` above. The call to
        // `assumingMemoryBound` is known to be correct because the format is Float32.
        let ptr = buffer.audioBufferList.pointee.mBuffers.mData?.assumingMemoryBound(to: Float.self)
        tsf_render_float(tsf,
                         ptr,                // buffer
                         Int32(frameCount),  // samples
                         0)                  // mixing (do not mix)

        // All done. Play the buffer, interrupting whatever is currently playing
        playerNode.scheduleBuffer(buffer, at: nil, options: .interrupts)
    }
}
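As the comment in updateOutputFormat() notes, the output format can change when the audio route changes. A sketch of watching for that, assuming it's acceptable to simply refresh the cached format (recreating the whole object, as the comment suggests, is the more robust option):

import AVFoundation

// Re-read the mixer's output format whenever the audio route changes.
let routeObserver = NotificationCenter.default.addObserver(
    forName: AVAudioSession.routeChangeNotification,
    object: AVAudioSession.sharedInstance(),
    queue: .main
) { _ in
    Settings.shared.updateOutputFormat()
}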

You can find the full version in my fork. You can also look at the first commit, which shows another approach that keeps SoundFontHelper and converts its output, but it's much simpler to just render the audio correctly in the first place.

Original Stack Overflow question: https://stackoverflow.com/questions/74997076/
