delphi - Stutter while rendering a DirectShow filter graph even though the output file is "smooth"

Tags: delphi filter directshow audio-streaming dspack

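I have a DirectShow application written in Delphi 6 using the DSPACK component library. I have two filter graphs that cooperate with each other.

The primary filter graph has this structure:

  1. Capture Filter with a buffer size of 100 ms.
  2. (connected to) a Sample Grabber Filter.

The "secondary" filter graph has this structure:

  1. Custom Push Source Filter that accepts audio directly into an audio buffer storehouse that it manages.
  2. (connected to) a Render Filter.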

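The Push Source Filter uses an Event to control the delivery of audio. Its FillBuffer() call waits on the Event. The Event is signaled when new audio data is added to the buffer.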

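When I run the filter graphs, I hear tiny "gaps" in the audio. Usually I associate that condition with improperly constructed audio buffers that are not filled in or that have "gaps" in them. But as a test I added a Tee Filter and connected a WAV Dest Filter followed by a File Writer Filter. When I examine the output WAV file, it is perfectly smooth and contiguous. In other words, the gaps I hear from the speaker are not evident in the output file.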

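This suggests that although the audio from the Capture Filter is propagating successfully, the delivery of the audio buffers is being disturbed periodically. The "gaps" I hear do not occur 10 times a second but more like 2 or 3 times a second, sometimes with short periods of no gaps at all. So it is not happening on every buffer, or I would hear gaps 10 times a second.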

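My first guess would be a locking problem, but I have a 150 ms timeout set on the Event, and an exception is raised if it expires. No exceptions are being thrown. I also have a 40 ms timeout set on every critical section used in the application, and none of those are triggering either. I checked my OutputDebugString() dumps, and the times between the non-signaled (blocked) and signaled (unblocked) occurrences show a fairly constant pattern alternating between 94 ms and 140 ms. In other words, the FillBuffer() call in my Push Source Filter stays blocked for 94 ms, then for 140 ms, and then the pattern repeats. Note that the durations drift a little but are quite regular. Given the vagaries of Windows thread switching, that pattern seems consistent with a thread waiting on the Capture Filter to dump its audio buffer into the Push Source Filter at 100 ms intervals.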

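Since I believe I am double-buffering in my Push Source Filter, I would expect that as long as none of the locking mechanisms take a combined 200 ms or more, I should not be interrupting the audio stream. But I cannot think of anything other than a locking problem that would cause these symptoms. I have included the code from the DecideBufferSize() method of my Push Source Filter below in case I am doing something wrong. Although it is somewhat lengthy, I have also included the FillBuffer() call below to show how I generate timestamps, in case that is having an effect.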

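What else could be causing my audio stream to the Render Filter to stutter even though all the audio buffers are being delivered intact?

Question: Do I have to implement double-buffering myself? I assumed the DirectShow render filters do that for you, otherwise the other filter graphs I have created without my custom Push Source Filter would not have worked properly. But perhaps, since I am creating another lock/unlock situation in the filter graph, I need to add my own layer of double buffering? I would of course like to avoid that because of the additional latency, so if there is another fix for my situation I would like to know.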

function TPushSourcePinBase_wavaudio.DecideBufferSize(Allocator: IMemAllocator; Properties: PAllocatorProperties): HRESULT;
var
    // pvi: PVIDEOINFOHEADER;
    errMsg: string;
    Actual: ALLOCATOR_PROPERTIES;
    sampleSize, numBytesPerBuffer: integer;
    // ourOwnerFilter: TPushSourceFilterBase_wavaudio;
begin
    if (Allocator = nil) or (Properties = nil) then
    begin
        Result := E_POINTER;
        // =========================== EXIT POINT ==============
        Exit;
    end; // if (Allocator = nil) or (Properties = nil) then

    FFilter.StateLock.Lock;
    try
        // Allocate enough space for the desired amount of milliseconds
        //  we want to buffer (approximately).
        numBytesPerBuffer := Trunc((FOurOwnerFilter.WaveFormatEx.nAvgBytesPerSec / 1000) * FBufferLatencyMS);

        // Get the size of one sample in bytes so the buffer can be aligned to it.
        sampleSize := bytesPerSample(FOurOwnerFilter.WaveFormatEx);

        // Round it down to the nearest increment of sample size.
        numBytesPerBuffer := (numBytesPerBuffer div sampleSize) * sampleSize;

        if gDebug then OutputDebugString(PChar(
            '(TPushSourcePinBase_wavaudio.DecideBufferSize) Resulting buffer size for audio is: ' + IntToStr(numBytesPerBuffer)
        ));

        // Sanity check on the buffer size.
        if numBytesPerBuffer < 1 then
        begin
            errMsg := '(TPushSourcePinBase_wavaudio.DecideBufferSize) The calculated number of bytes per buffer is zero or less.';

            if gDebug then OutputDebugString(PChar(errMsg));
            MessageBox(0, PChar(errMsg), 'PushSource Play Audio File filter error', MB_ICONERROR or MB_OK);

            Result := E_FAIL;
            // =========================== EXIT POINT ==============
            Exit;
        end;

        // --------------- Do the buffer allocation -----------------

        // Ensure a minimum number of buffers
        if (Properties.cBuffers = 0) then
            Properties.cBuffers := 2;

        Properties.cbBuffer := numBytesPerBuffer;

        Result := Allocator.SetProperties(Properties^, Actual);

        if Failed(Result) then
            // =========================== EXIT POINT ==============
            Exit;

        // Is this allocator unsuitable?
        if (Actual.cbBuffer < Properties.cbBuffer) then
            Result := E_FAIL
        else
            Result := S_OK;

    finally
        FFilter.StateLock.UnLock;
    end; // try()
end;

// *******************************************************


// This is where we provide the audio data.
function TPushSourcePinBase_wavaudio.FillBuffer(Sample: IMediaSample): HResult;
    // Given a Wave Format and a Byte count, convert the Byte count
    //  to a REFERENCE_TIME value.
    function byteCountToReferenceTime(waveFormat: TWaveFormat; numBytes: LongInt): REFERENCE_TIME;
    var
        durationInSeconds: Extended;
    begin
        if waveFormat.nAvgBytesPerSec <= 0 then
            raise Exception.Create('(TPushSourcePinBase_wavaudio.FillBuffer::byteCountToReferenceTime) Invalid average bytes per second value found in the wave format parameter: ' + IntToStr(waveFormat.nAvgBytesPerSec));

        // Calculate the duration in seconds given the audio format and the
        //  number of bytes requested.
        durationInSeconds := numBytes / waveFormat.nAvgBytesPerSec;

        // Convert it to increments of 100ns since that is the unit value
        //  for DirectShow timestamps (REFERENCE_TIME).
        Result :=
            Trunc(durationInSeconds * REFTIME_ONE_SECOND);
    end;

    // ---------------------------------------------------------------

    function min(v1, v2: DWord): DWord;
    begin
        if v1 <= v2 then
            Result := v1
        else
            Result := v2;
    end;

    // ---------------------------------------------------------------

var
    pData: PByte;
    cbData: Longint;
    pwfx: PWaveFormat;
    aryOutOfDataIDs: TDynamicStringArray;
    intfAudFiltNotify: IAudioFilterNotification;
    i: integer;
    errMsg: string;
    bIsShuttingDown: boolean;

    // MSDN: The REFERENCE_TIME data type defines the units for reference times
    //  in DirectShow. Each unit of reference time is 100 nanoseconds.
    Start, Stop: REFERENCE_TIME;
    durationInRefTime, ofsInRefTime: REFERENCE_TIME;
    wfOutputPin: TWaveFormat;

    aryDebug: TDynamicByteArray;
begin
    aryDebug := nil;

    if (Sample = nil) then
    begin
        Result := E_POINTER;
        // =========================== EXIT POINT ==============
        Exit;
    end; // if (Sample = nil) then

    // Quick lock to get sample size.
    FSharedState.Lock;
    try
        cbData := Sample.GetSize;
    finally
        // Don't want to have our filter state locked when calling
        //  isEnoughDataOrBlock() since that call can block.
        FSharedState.UnLock;
    end; // try

    aryOutOfDataIDs := nil;
    intfAudFiltNotify := nil;

    // This call will BLOCK until have enough data to satisfy the request
    //  or the buffer storage collection is freed.
    if FOurOwnerFilter.bufferStorageCollection.isEnoughDataOrBlock(cbData, bIsShuttingDown) then
    begin
        // If we are shutting down, just exit with S_FALSE as the return to
        //   tell the caller we are done streaming.
        if bIsShuttingDown then
        begin
            Result := S_FALSE;

            // =========================== EXIT POINT ==============
            exit;
        end; // if bIsShuttingDown then

        // Re-acquire the filter state lock.
        FSharedState.Lock;

        try
            // Get the data and return it.

            // Access the sample's data buffer
            cbData := Sample.GetSize;
            Sample.GetPointer(pData);

            // Make sure this format matches the media type we are supporting.
            pwfx := AMMediaType.pbFormat;       // This is the format that our Output pin is set to.
            wfOutputPin := waveFormatExToWaveFormat(FOurOwnerFilter.waveFormatEx);

            if not isEqualWaveFormat(pwfx^, wfOutputPin) then
            begin
                Result := E_FAIL;

                errMsg :=
                    '(TPushSourcePinBase_wavaudio.FillBuffer) The wave format of the incoming media sample does not match ours.'
                    + CRLF
                    + ' > Incoming sample: ' + waveFormatToString(pwfx^)
                    + CRLF
                    + ' > Our output pin: ' + waveFormatToString(wfOutputPin);
                OutputDebugString(PChar(errMsg));

                postComponentLogMessage_error(errMsg, FOurOwnerFilter.FFilterName);

                MessageBox(0, PChar(errMsg), 'PushSource Play Audio File filter error', MB_ICONERROR or MB_OK);

                Result := E_FAIL;

                // =========================== EXIT POINT ==============
                exit;
            end; // if not isEqualWaveFormat(pwfx^, wfOutputPin) then

            // Convert the Byte index into the WAV data array into a reference
            //  time value in order to offset the start and end timestamps.
            ofsInRefTime := byteCountToReferenceTime(pwfx^, FWaveByteNdx);

            // Convert the number of bytes requested to a reference time value.
            durationInRefTime := byteCountToReferenceTime(pwfx^, cbData);

            // Now I can calculate the timestamps that will govern the playback
            //  rate.
            Start := ofsInRefTime;
            Stop := Start + durationInRefTime;

            {
            OutputDebugString(PChar(
                '(TPushSourcePinBase_wavaudio.FillBuffer) Wave byte index, start time, stop time: '
                + IntToStr(FWaveByteNdx)
                + ', '
                + IntToStr(Start)
                + ', '
                + IntToStr(Stop)
            ));
            }

            Sample.SetTime(@Start, @Stop);

            // Set TRUE on every sample for uncompressed frames
            Sample.SetSyncPoint(True);

            // Check that we're still using audio
            Assert(IsEqualGUID(AMMediaType.formattype, FORMAT_WaveFormatEx));

{
// Debugging.
FillChar(pData^, cbData, 0);
SetLength(aryDebug, cbData);
if not FOurOwnerFilter.bufferStorageCollection.mixData(@aryDebug[0], cbData, aryOutOfDataIDs) then
}
            // Grab the requested number of bytes from the audio data.
            if not FOurOwnerFilter.bufferStorageCollection.mixData(pData, cbData, aryOutOfDataIDs) then
            begin
                // We should not have had any partial copies since we
                //  called isEnoughDataOrBlock(), which is not supposed to
                //  return TRUE unless there is enough data.
                Result := E_FAIL;

                errMsg := '(TPushSourcePinBase_wavaudio.FillBuffer) The mix-data call returned FALSE despite our waiting for sufficient data from all participating buffer channels.';
                OutputDebugString(PChar(errMsg));

                postComponentLogMessage_error(errMsg, FOurOwnerFilter.FFilterName);

                MessageBox(0, PChar(errMsg), 'PushSource Play Audio File filter error', MB_ICONERROR or MB_OK);

                Result := E_FAIL;

                // =========================== EXIT POINT ==============
                exit;
            end; // if not FOurOwnerFilter.bufferStorageCollection.mixData(pData, cbData, aryOutOfDataIDs) then

            // ------------- OUT OF DATA NOTIFICATIONS -----------------

            {
                WARNING:  TBufferStorageCollection automatically posts
                AudioFilterNotification messages to any buffer storage
                that has a IRequestStep user data interface attached to
                it!.
            }

            if FOurOwnerFilter.wndNotify > 0 then
            begin
                // ----- Post Audio Notification to Filter level notify handle ---
                if Length(aryOutOfDataIDs) > 0 then
                begin
                    for i := Low(aryOutOfDataIDs) to High(aryOutOfDataIDs) do
                    begin
                        // Create a notification and post it.
                        intfAudFiltNotify := TAudioFilterNotification.Create(aryOutOfDataIDs[i], afnOutOfData);

                        // ourOwnerFilter.intfNotifyRequestStep.triggerResult(intfAudFiltNotify);
                        PostMessageWithUserDataIntf(FOurOwnerFilter.wndNotify, WM_PUSH_SOURCE_FILTER_NOTIFY, intfAudFiltNotify);
                    end; // for()
                end; // if Length(aryOutOfDataIDs) > 0 then
            end; // if FOurOwnerFilter.wndNotify > 0 then

            // Advance the Wave Byte index by the number of bytes requested.
            Inc(FWaveByteNdx, cbData);

            Result := S_OK;
        finally
            FSharedState.UnLock;
        end; // try
    end
    else
    begin
        // Tell DirectShow to stop streaming with us.  Something has
        //  gone seriously wrong with the audio streams feeding us.
        errMsg := '(TPushSourcePinBase_wavaudio.FillBuffer) Time-out occurred while waiting for sufficient data to accumulate in our audio buffer channels.';
        OutputDebugString(PChar(errMsg));

        postComponentLogMessage_error(errMsg, FFilter.filterName);
        MessageBox(0, PChar(errMsg), 'PushSource Play Audio File filter error', MB_ICONERROR or MB_OK);

        Result := E_FAIL;
    end;
end;

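Best answer

First of all, to troubleshoot audio output you want to check the renderer properties. The Advanced tab gives you those, and you can also query them programmatically through the IAMAudioRendererStats interface. Anything that differs from the properties you see during file playback should be treated as a warning about the correctness of your streaming.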

Advanced Audio Renderer Properties
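
The programmatic route could look roughly like the sketch below. It assumes the DSPack header translation (unit DirectShow9) declares IAMAudioRendererStats.GetStatParam with two `out` DWORD parameters, and it defines its own STAT_* constants from the values of the Windows SDK enumeration _AM_AUDIO_RENDERER_STAT_PARAM. The procedure name DumpAudioRendererStats is hypothetical, and the renderer reference has to come from your own graph.

```pascal
uses
  Windows, SysUtils, DirectShow9; // DirectShow9: DSPack header translation (assumed)

const
  // Local names for values of the SDK enum _AM_AUDIO_RENDERER_STAT_PARAM.
  STAT_BREAK_COUNT     = 1; // cumulative number of breaks in the audio stream
  STAT_SILENCE_DUR     = 3; // accumulated silence caused by timestamp gaps
  STAT_DISCONTINUITIES = 5; // discontinuities seen since the renderer started

// Hypothetical helper: dumps a few renderer statistics to the debug output.
procedure DumpAudioRendererStats(const Renderer: IBaseFilter);
var
  Stats: IAMAudioRendererStats;
  P1, P2: DWORD;
begin
  if Failed(Renderer.QueryInterface(IAMAudioRendererStats, Stats)) then
  begin
    OutputDebugString('Renderer does not expose IAMAudioRendererStats.');
    Exit;
  end;

  if Succeeded(Stats.GetStatParam(STAT_BREAK_COUNT, P1, P2)) then
    OutputDebugString(PChar('Audio breaks: ' + IntToStr(P1)));

  if Succeeded(Stats.GetStatParam(STAT_SILENCE_DUR, P1, P2)) then
    OutputDebugString(PChar('Silence from timestamp gaps: ' + IntToStr(P1)));

  if Succeeded(Stats.GetStatParam(STAT_DISCONTINUITIES, P1, P2)) then
    OutputDebugString(PChar('Discontinuities: ' + IntToStr(P1)));
end;
```

Calling this once a second while the graph runs, first for plain file playback and then for the push source graph, gives two sets of numbers to compare.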

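Because the audio property pages of the stock filters are not as rock solid as those of the DirectShow video filters, you may need a trick to pop this one up. In your application, while streaming is active, use OleCreatePropertyFrame to show the filter properties directly from your code, on the GUI thread (for example, in response to pressing some temporary button).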
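
A minimal sketch of such a temporary button handler's helper, assuming the ActiveX unit that ships with Delphi and the DSPack unit DirectShow9 for IBaseFilter. The function name ShowRendererProperties and its parameters are hypothetical; call it from the GUI thread while the graph is running.

```pascal
uses
  Windows, ActiveX, DirectShow9; // DirectShow9: DSPack header translation (assumed)

// Hypothetical helper: shows the property pages of a filter in a modal frame.
function ShowRendererProperties(Owner: HWND; const Renderer: IBaseFilter): HResult;
var
  SpecifyPages: ISpecifyPropertyPages;
  Pages: TCAGUID;
  Unk: IUnknown;
begin
  // The filter must advertise its property pages.
  Result := Renderer.QueryInterface(ISpecifyPropertyPages, SpecifyPages);
  if Failed(Result) then
    Exit;

  Result := SpecifyPages.GetPages(Pages);
  if Failed(Result) then
    Exit;

  try
    Unk := Renderer;
    // One object, all of its pages, default locale.
    Result := OleCreatePropertyFrame(Owner, 0, 0, 'Audio Renderer',
      1, @Unk, Pages.cElems, Pages.pElems, 0, 0, nil);
  finally
    // GetPages allocates the CLSID array with the COM task allocator.
    CoTaskMemFree(Pages.pElems);
  end;
end;
```

DSPack also ships its own helper for showing filter property pages, which may be a simpler choice if it works for the audio renderer in your setup.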

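As for typical causes of playback problems, I would check the following:

  • You do not time-stamp the samples, so they are played back at the pace at which they are pushed, and sometimes you push a sample later than the moment the previous sample finished playing
  • Your time stamps look correct, but they lie behind the current playback time, so the renderer sees the samples as late, perhaps only partially

Both scenarios should be reflected in the Advanced tab data.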
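
To check the second scenario in code, you can compare each sample's start time with the stream time at the moment FillBuffer() delivers it. The sketch below is a self-contained console program that only illustrates the arithmetic. The names BytesToRefTime and LatenessOf are hypothetical, and the stream time is a made-up number here; in a real filter it would come from the graph's reference clock.

```pascal
program TimestampCheck;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
{$ASSERTIONS ON}

uses
  SysUtils;

const
  // One second expressed in 100 ns REFERENCE_TIME units.
  REFTIME_PER_SECOND = 10000000;

// Convert a byte count to 100 ns units using integer math only,
// which avoids floating-point truncation drift over long streams.
function BytesToRefTime(NumBytes, AvgBytesPerSec: Int64): Int64;
begin
  if AvgBytesPerSec <= 0 then
    raise Exception.Create('AvgBytesPerSec must be positive');
  Result := (NumBytes * REFTIME_PER_SECOND) div AvgBytesPerSec;
end;

// How far (in 100 ns units) a sample's start time lies behind the
// current stream time. Returns 0 when the sample is on time or early.
function LatenessOf(SampleStart, StreamTimeNow: Int64): Int64;
begin
  if StreamTimeNow > SampleStart then
    Result := StreamTimeNow - SampleStart
  else
    Result := 0;
end;

var
  StartTime, StopTime: Int64;
begin
  // 16-bit stereo at 44.1 kHz is 176400 bytes/s, so a 100 ms buffer
  // holds 17640 bytes. Timestamps for the fourth buffer:
  StartTime := BytesToRefTime(3 * 17640, 176400);
  StopTime := StartTime + BytesToRefTime(17640, 176400);
  Assert(StartTime = 3000000);
  Assert(StopTime = 4000000);

  // With a hypothetical stream time of 320 ms the sample is 20 ms late.
  Assert(LatenessOf(StartTime, 3200000) = 200000);

  Writeln('Start: ', StartTime, '  Stop: ', StopTime);
end.
```

If logging this value inside FillBuffer() shows samples that are regularly late, adding a fixed offset to Start and Stop gives the renderer some headroom, at the cost of that much extra latency.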

Regarding "delphi - Stutter while rendering a DirectShow filter graph even though the output file is "smooth"", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/8331012/
