iphone - OS X/iOS - Sample rate conversion for a buffer using AudioConverterFillComplexBuffer

Tags: iphone macos core-audio sample-rate audio-converter

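I'm writing a CoreAudio backend for an audio library called XAL. Input buffers can have various sample rates. I'm using a single audio unit for output. The idea is to convert the buffers and mix them before sending them to the audio unit.

Everything works as long as the input buffer has the same properties (sample rate, channel count, etc.) as the output audio unit. Hence, the mixing part works.

However, I'm stuck on sample rate and channel count conversion. From what I've figured out, this is easiest to do with the Audio Converter Services API. I've managed to construct a converter; the idea is that the output format is the same as the output unit's format, but possibly adjusted for the purposes of the converter.

The audio converter is constructed successfully, but when I call AudioConverterFillComplexBuffer(), I get output status error -50.

I'd love to get another set of eyes on this code. The problem is probably somewhere below AudioConverterNew(). The variable stream contains the incoming (and outgoing) buffer data, and streamSize contains the size in bytes of the incoming (and outgoing) buffer data.

What did I do wrong?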

void CoreAudio_AudioManager::_convertStream(Buffer* buffer, unsigned char** stream, int *streamSize)
{
    if (buffer->getBitsPerSample() != unitDescription.mBitsPerChannel || 
        buffer->getChannels() != unitDescription.mChannelsPerFrame || 
        buffer->getSamplingRate() != unitDescription.mSampleRate)
    {
        printf("INPUT STREAM SIZE: %d\n", *streamSize);
        // describe the input format's description
        AudioStreamBasicDescription inputDescription;
        memset(&inputDescription, 0, sizeof(inputDescription));
        inputDescription.mFormatID = kAudioFormatLinearPCM;
        inputDescription.mFormatFlags = kLinearPCMFormatFlagIsPacked | kLinearPCMFormatFlagIsSignedInteger;
        inputDescription.mChannelsPerFrame = buffer->getChannels();
        inputDescription.mSampleRate = buffer->getSamplingRate();
        inputDescription.mBitsPerChannel = buffer->getBitsPerSample();
        inputDescription.mBytesPerFrame = (inputDescription.mBitsPerChannel * inputDescription.mChannelsPerFrame) / 8;
        inputDescription.mFramesPerPacket = 1; //*streamSize / inputDescription.mBytesPerFrame;
        inputDescription.mBytesPerPacket = inputDescription.mBytesPerFrame * inputDescription.mFramesPerPacket;
        printf("INPUT : %lu bytes per packet for sample rate %g, channels %d\n", inputDescription.mBytesPerPacket, inputDescription.mSampleRate, inputDescription.mChannelsPerFrame);

        // copy conversion output format's description from the
        // output audio unit's description.
        // then adjust framesPerPacket to match the input we'll be passing.

        // framecount of our input stream is based on the input bytecount.
        // output stream will have same number of frames, but different
        // number of bytes.
        AudioStreamBasicDescription outputDescription = unitDescription;
        outputDescription.mFramesPerPacket = 1; //inputDescription.mFramesPerPacket;
        outputDescription.mBytesPerPacket = outputDescription.mBytesPerFrame * outputDescription.mFramesPerPacket;
        printf("OUTPUT : %lu bytes per packet for sample rate %g, channels %d\n", outputDescription.mBytesPerPacket, outputDescription.mSampleRate, outputDescription.mChannelsPerFrame);

        // create an audio converter
        AudioConverterRef audioConverter;
        OSStatus acCreationResult = AudioConverterNew(&inputDescription, &outputDescription, &audioConverter);
        printf("Created audio converter %p (status: %d)\n", audioConverter, acCreationResult);
        if(!audioConverter)
        {
            // bail out
            free(*stream);
            *streamSize = 0;
            *stream = (unsigned char*)malloc(0);
            return;
        }

        // calculate number of bytes required for output of input stream.
        // allocate buffer of adequate size.
        UInt32 outputBytes = outputDescription.mBytesPerPacket * (*streamSize / inputDescription.mBytesPerFrame); // outputDescription.mFramesPerPacket * outputDescription.mBytesPerFrame;
        unsigned char *outputBuffer = (unsigned char*)malloc(outputBytes);
        memset(outputBuffer, 0, outputBytes);
        printf("OUTPUT BYTES : %d\n", outputBytes);

        // describe input data we'll pass into converter
        AudioBuffer inputBuffer;
        inputBuffer.mNumberChannels = inputDescription.mChannelsPerFrame;
        inputBuffer.mDataByteSize = *streamSize;
        inputBuffer.mData = *stream;

        // describe output data buffers into which we can receive data.
        AudioBufferList outputBufferList;
        outputBufferList.mNumberBuffers = 1;
        outputBufferList.mBuffers[0].mNumberChannels = outputDescription.mChannelsPerFrame;
        outputBufferList.mBuffers[0].mDataByteSize = outputBytes;
        outputBufferList.mBuffers[0].mData = outputBuffer;

        // set output data packet size
        UInt32 outputDataPacketSize = outputDescription.mBytesPerPacket;

        // convert
        OSStatus result = AudioConverterFillComplexBuffer(audioConverter, /* AudioConverterRef inAudioConverter */
                                                          CoreAudio_AudioManager::_converterComplexInputDataProc, /* AudioConverterComplexInputDataProc inInputDataProc */
                                                          &inputBuffer, /* void *inInputDataProcUserData */
                                                          &outputDataPacketSize, /* UInt32 *ioOutputDataPacketSize */
                                                          &outputBufferList, /* AudioBufferList *outOutputData */
                                                          NULL /* AudioStreamPacketDescription *outPacketDescription */
                                                          );
        printf("Result: %d wheee\n", result);

        // change "stream" to describe our output buffer.
        // even if an error occurred, we'd rather have silence than unconverted audio.
        free(*stream);
        *stream = outputBuffer;
        *streamSize = outputBytes;

        // dispose of the audio converter
        AudioConverterDispose(audioConverter);
    }
}


OSStatus CoreAudio_AudioManager::_converterComplexInputDataProc(AudioConverterRef inAudioConverter,
                                                                UInt32* ioNumberDataPackets,
                                                                AudioBufferList* ioData,
                                                                AudioStreamPacketDescription** ioDataPacketDescription,
                                                                void* inUserData)
{
    printf("Converter\n");
    if(*ioNumberDataPackets != 1)
    {
        xal::log("_converterComplexInputDataProc cannot provide input data; invalid number of packets requested");
        *ioNumberDataPackets = 0;
        ioData->mNumberBuffers = 0;
        return -50;
    }

    *ioNumberDataPackets = 1;
    ioData->mNumberBuffers = 1;
    ioData->mBuffers[0] = *(AudioBuffer*)inUserData;

    *ioDataPacketDescription = NULL;

    return 0;
}
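
A side note on reading the status value: -50 is the classic paramErr code, meaning an invalid parameter was passed. Many other Core Audio status values are four-character codes, so a small formatter can make log output easier to interpret. The following is a minimal sketch; `formatStatus` is a hypothetical helper name, and `std::int32_t` stands in for `OSStatus` so the snippet compiles without Apple headers.

```cpp
#include <cctype>
#include <cstdint>
#include <string>

// Hypothetical helper: render a status as 'abcd' when all four bytes are
// printable characters, otherwise as a plain decimal number.
std::string formatStatus(std::int32_t status)
{
    const std::uint32_t bits = static_cast<std::uint32_t>(status);
    char code[4];
    for (int i = 0; i < 4; ++i)
    {
        // Extract bytes from most significant to least significant.
        code[i] = static_cast<char>((bits >> (8 * (3 - i))) & 0xFF);
        if (!std::isprint(static_cast<unsigned char>(code[i])))
        {
            return std::to_string(status);
        }
    }
    return "'" + std::string(code, 4) + "'";
}
```

With this helper, a value such as -50 is printed as a decimal number, while a status whose bytes spell a four-character code is printed in its readable form.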

Best answer

Working code for Core Audio sample rate conversion and channel count conversion using Audio Converter Services (now available as part of the BSD-licensed XAL audio library):

void CoreAudio_AudioManager::_convertStream(Buffer* buffer, unsigned char** stream, int *streamSize)
{
    if (buffer->getBitsPerSample() != unitDescription.mBitsPerChannel || 
        buffer->getChannels() != unitDescription.mChannelsPerFrame || 
        buffer->getSamplingRate() != unitDescription.mSampleRate)
    {
        // describe the input format's description
        AudioStreamBasicDescription inputDescription;
        memset(&inputDescription, 0, sizeof(inputDescription));
        inputDescription.mFormatID = kAudioFormatLinearPCM;
        inputDescription.mFormatFlags = kLinearPCMFormatFlagIsPacked | kLinearPCMFormatFlagIsSignedInteger;
        inputDescription.mChannelsPerFrame = buffer->getChannels();
        inputDescription.mSampleRate = buffer->getSamplingRate();
        inputDescription.mBitsPerChannel = buffer->getBitsPerSample();
        inputDescription.mBytesPerFrame = (inputDescription.mBitsPerChannel * inputDescription.mChannelsPerFrame) / 8;
        inputDescription.mFramesPerPacket = 1; //*streamSize / inputDescription.mBytesPerFrame;
        inputDescription.mBytesPerPacket = inputDescription.mBytesPerFrame * inputDescription.mFramesPerPacket;

        // copy conversion output format's description from the
        // output audio unit's description.
        // then adjust framesPerPacket to match the input we'll be passing.

        // framecount of our input stream is based on the input bytecount.
        // output stream will have same number of frames, but different
        // number of bytes.
        AudioStreamBasicDescription outputDescription = unitDescription;
        outputDescription.mFramesPerPacket = 1; //inputDescription.mFramesPerPacket;
        outputDescription.mBytesPerPacket = outputDescription.mBytesPerFrame * outputDescription.mFramesPerPacket;

        // create an audio converter
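        // initialize to NULL so the failure check below is well-defined
        AudioConverterRef audioConverter = NULL;
        OSStatus acCreationResult = AudioConverterNew(&inputDescription, &outputDescription, &audioConverter);
        if(acCreationResult != noErr || !audioConverter)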
        {
            // bail out
            free(*stream);
            *streamSize = 0;
            *stream = (unsigned char*)malloc(0);
            return;
        }

        // calculate number of bytes required for output of input stream.
        // allocate buffer of adequate size.
        UInt32 outputBytes = outputDescription.mBytesPerPacket * (*streamSize / inputDescription.mBytesPerPacket); // outputDescription.mFramesPerPacket * outputDescription.mBytesPerFrame;
        unsigned char *outputBuffer = (unsigned char*)malloc(outputBytes);
        memset(outputBuffer, 0, outputBytes);

        // describe input data we'll pass into converter
        AudioBuffer inputBuffer;
        inputBuffer.mNumberChannels = inputDescription.mChannelsPerFrame;
        inputBuffer.mDataByteSize = *streamSize;
        inputBuffer.mData = *stream;

        // describe output data buffers into which we can receive data.
        AudioBufferList outputBufferList;
        outputBufferList.mNumberBuffers = 1;
        outputBufferList.mBuffers[0].mNumberChannels = outputDescription.mChannelsPerFrame;
        outputBufferList.mBuffers[0].mDataByteSize = outputBytes;
        outputBufferList.mBuffers[0].mData = outputBuffer;

        // set output data packet size
        UInt32 outputDataPacketSize = outputBytes / outputDescription.mBytesPerPacket;

        // fill class members with data that we'll pass into
        // the InputDataProc
        _converter_currentBuffer = &inputBuffer;
        _converter_currentInputDescription = inputDescription;

        // convert
        OSStatus result = AudioConverterFillComplexBuffer(audioConverter, /* AudioConverterRef inAudioConverter */
                                                          CoreAudio_AudioManager::_converterComplexInputDataProc, /* AudioConverterComplexInputDataProc inInputDataProc */
                                                          this, /* void *inInputDataProcUserData */
                                                          &outputDataPacketSize, /* UInt32 *ioOutputDataPacketSize */
                                                          &outputBufferList, /* AudioBufferList *outOutputData */
                                                          NULL /* AudioStreamPacketDescription *outPacketDescription */
                                                          );

        // change "stream" to describe our output buffer.
        // even if an error occurred, we'd rather have silence than unconverted audio.
        free(*stream);
        *stream = outputBuffer;
        *streamSize = outputBytes;

        // dispose of the audio converter
        AudioConverterDispose(audioConverter);
    }
}


OSStatus CoreAudio_AudioManager::_converterComplexInputDataProc(AudioConverterRef inAudioConverter,
                                                                UInt32* ioNumberDataPackets,
                                                                AudioBufferList* ioData,
                                                                AudioStreamPacketDescription** ioDataPacketDescription,
                                                                void* inUserData)
{
    if(ioDataPacketDescription)
    {
        xal::log("_converterComplexInputDataProc cannot provide input data; it doesn't know how to provide packet descriptions");
        *ioDataPacketDescription = NULL;
        *ioNumberDataPackets = 0;
        ioData->mNumberBuffers = 0;
        return 501;
    }

    CoreAudio_AudioManager *self = (CoreAudio_AudioManager*)inUserData;

    ioData->mNumberBuffers = 1;
    ioData->mBuffers[0] = *(self->_converter_currentBuffer);

    *ioNumberDataPackets = ioData->mBuffers[0].mDataByteSize / self->_converter_currentInputDescription.mBytesPerPacket;
    return 0;
}

In the header, as part of the CoreAudio_AudioManager class, these are the relevant instance variables:

    AudioStreamBasicDescription unitDescription;
    AudioBuffer *_converter_currentBuffer;
    AudioStreamBasicDescription _converter_currentInputDescription;
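
One thing worth noting about the callback above: it hands over the entire buffer every time it is invoked. If the converter were to call it more than once during a single AudioConverterFillComplexBuffer() call, the same data would be supplied again. A common way to guard against that is to track how much input has been consumed and to report zero packets once nothing is left. The sketch below shows only that bookkeeping; `InputCursor`, `InputSlice` and `takePackets` are hypothetical names that do not appear in XAL, and plain structs stand in for the Core Audio types so the snippet compiles on its own.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical bookkeeping for one input buffer of constant-size packets.
struct InputCursor
{
    const unsigned char* data;     // start of the unread bytes
    std::uint32_t bytesRemaining;  // number of unread bytes
    std::uint32_t bytesPerPacket;  // must be non-zero
};

// Hypothetical description of the slice handed out on one callback invocation.
struct InputSlice
{
    const unsigned char* data;
    std::uint32_t byteSize;
    std::uint32_t packets;
};

// Hands out at most `requestedPackets` whole packets and advances the cursor.
// Once the buffer is exhausted, the returned slice has packets == 0.
InputSlice takePackets(InputCursor& cursor, std::uint32_t requestedPackets)
{
    const std::uint32_t available = cursor.bytesRemaining / cursor.bytesPerPacket;
    const std::uint32_t packets = std::min(requestedPackets, available);
    const std::uint32_t bytes = packets * cursor.bytesPerPacket;

    InputSlice slice = { cursor.data, bytes, packets };
    cursor.data += bytes;
    cursor.bytesRemaining -= bytes;
    return slice;
}
```

In a real input callback, the slice's pointer and byte size would presumably be copied into `ioData->mBuffers[0]` and its packet count into `*ioNumberDataPackets`. This is an assumption about how one might wire it up, not something taken from the XAL source.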

Looking at this a few months later, I realized that I never documented the changes.

If you're interested in what changed:

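  • Take a look at the callback function CoreAudio_AudioManager::_converterComplexInputDataProc
  • The number of packets must be specified correctly in ioNumberDataPackets
  • This required introducing new instance variables to hold both the buffer (previously passed as inUserData) and the input description (used to calculate the number of packets to be fed into Core Audio's converter)
  • The calculation of the "output" packets (those fed into the converter) is based on the amount of data our callback has received and the number of bytes per packet in the input format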
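
The arithmetic behind these points can be sketched in isolation. Both counts are expressed in packets rather than bytes: the capacity passed to AudioConverterFillComplexBuffer() is the output buffer size divided by the output bytes per packet, and the count reported by the callback is the input buffer size divided by the input bytes per packet. `PacketCounts` and `computePacketCounts` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the two packet counts discussed above.
struct PacketCounts
{
    std::uint32_t outputCapacity;  // value for ioOutputDataPacketSize
    std::uint32_t inputAvailable;  // value for *ioNumberDataPackets in the callback
};

// Converts byte sizes into packet counts for constant-size (PCM) packets.
PacketCounts computePacketCounts(std::uint32_t outputBytes,
                                 std::uint32_t outputBytesPerPacket,
                                 std::uint32_t inputBytes,
                                 std::uint32_t inputBytesPerPacket)
{
    assert(outputBytesPerPacket > 0 && inputBytesPerPacket > 0);
    PacketCounts counts;
    counts.outputCapacity = outputBytes / outputBytesPerPacket;
    counts.inputAvailable = inputBytes / inputBytesPerPacket;
    return counts;
}
```

By contrast, the code in the question passed the bytes-per-packet value itself as the packet count, and its callback rejected any request that was not for exactly one packet.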

Hopefully this edit will help future readers (myself included)!
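
One further point a reader may want to check: the comments in `_convertStream` assume that the output has the same number of frames as the input. That holds when only the bit depth or channel count changes, but when the sample rate changes, the frame count would be expected to scale with the ratio of the two rates. The sketch below estimates the output size under that assumption. `PcmFormat` and `estimateOutputBytes` are hypothetical names, and the struct is a simplified stand-in for AudioStreamBasicDescription so the snippet compiles without Apple headers.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical, simplified stand-in for the two format fields used here.
struct PcmFormat
{
    double sampleRate;            // frames per second
    std::uint32_t bytesPerFrame;  // bytes for one frame across all channels
};

// Estimates the bytes needed to hold the converted audio, scaling the frame
// count by the output/input sample rate ratio and rounding up.
std::uint32_t estimateOutputBytes(std::uint32_t inputBytes,
                                  const PcmFormat& in,
                                  const PcmFormat& out)
{
    assert(in.sampleRate > 0.0 && in.bytesPerFrame > 0);
    const std::uint32_t inputFrames = inputBytes / in.bytesPerFrame;
    const double ratio = out.sampleRate / in.sampleRate;
    const std::uint32_t outputFrames =
        static_cast<std::uint32_t>(std::ceil(inputFrames * ratio));
    return outputFrames * out.bytesPerFrame;
}
```

For instance, 2205 frames of 22050 Hz mono 16-bit audio converted to 44100 Hz stereo 16-bit would need room for 4410 frames of 4 bytes each. A real converter may also hold back or emit a few extra frames because of its internal filtering, so treating this figure as an estimate rather than an exact size seems prudent.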

Regarding iphone - OS X/iOS - Sample rate conversion for a buffer using AudioConverterFillComplexBuffer, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/6610958/
