My code successfully extracts all NAL units of an H.264 stream that is packed into an AVI file. I can also parse the SPS, the PPS, and NAL units of type 1 and 5. I then extract a whole GOP (group of pictures): starting with the SPS and PPS, followed by the IDR NAL unit, and ending with the last non-IDR NAL unit before the next SPS.
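(For reference, the NAL unit type is the low five bits of the NAL header byte right after the start code, spec 7.3.1. The helper below is a simplified illustrative sketch, not my actual parser:)
// Illustrative sketch, not the actual parser: the NAL header byte holds
// forbidden_zero_bit (1 bit), nal_ref_idc (2 bits) and nal_unit_type (5 bits).
func nalUnitType(of nalu: [UInt8]) -> UInt8? {
    guard let header = nalu.first else { return nil }
    return header & 0x1F   // 7 = SPS, 8 = PPS, 5 = IDR slice, 1 = non-IDR slice
}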
Then I reorder the NAL units according to section 8.2 of the spec to get the correct PicOrderCnt, which I pass on as the presentation timestamp (PTS). So I have one array with the NAL units of the GOP and a second array with the PTS of each NAL unit.
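The reordering relies on the PicOrderCnt derivation for pic_order_cnt_type == 0 (spec 8.2.1). A minimal sketch of the lsb/msb wrap handling; the helper name and the simplified state handling are mine, not the spec's pseudocode:
// Sketch of PicOrderCnt for pic_order_cnt_type == 0 (spec 8.2.1).
// maxLsb = 2^(log2_max_pic_order_cnt_lsb_minus4 + 4), taken from the SPS.
func picOrderCnt(lsb: Int, prevLsb: Int, prevMsb: Int, maxLsb: Int) -> (poc: Int, msb: Int) {
    let msb: Int
    if lsb < prevLsb && (prevLsb - lsb) >= maxLsb / 2 {
        msb = prevMsb + maxLsb      // lsb wrapped around upwards
    } else if lsb > prevLsb && (lsb - prevLsb) > maxLsb / 2 {
        msb = prevMsb - maxLsb      // lsb wrapped around downwards
    } else {
        msb = prevMsb
    }
    return (msb + lsb, msb)
}
This is consistent with the debug output below, where MaxPicOrderCntLsb appears to be 64: pic_order_cnt_lsb 30 with PicOrderCnt 94 means the msb has advanced to 64.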
Then I replace the start codes with the AVCC format (the length of the NAL unit).
After that I pass the SPS and PPS, without start codes, into CMVideoFormatDescriptionCreateFromH264ParameterSets.
Then I put all NAL units (as a [UInt8] array) into a CMSampleBuffer, using the PicOrderCnt for the CMSampleTimingInfo.
My code then successfully decodes the video frames with VTDecompressionSessionDecodeFrame.
Unfortunately, some GOPs don't work: for some frames I get the error kVTVideoDecoderBadDataErr, and I can't explain why.
For example: my current group starts at DTS (decoding timestamp) 770, which is a keyframe. This is my debug output:
DTS: 770 | PTS: 771 | NAL-Type 5: frame_num: 0 // slice_type: 7 // pic_order_cnt_lsb: 0 | PicOrderCnt: 0
DTS: 771 | PTS: 773 | NAL-Type 1: frame_num: 1 // slice_type: 5 // pic_order_cnt_lsb: 4 | PicOrderCnt: 4
DTS: 772 | PTS: 772 | NAL-Type 1: frame_num: 2 // slice_type: 6 // pic_order_cnt_lsb: 2 | PicOrderCnt: 2
DTS: 773 | PTS: 776 | NAL-Type 1: frame_num: 2 // slice_type: 5 // pic_order_cnt_lsb: 10 | PicOrderCnt: 10
DTS: 774 | PTS: 775 | NAL-Type 1: frame_num: 3 // slice_type: 6 // pic_order_cnt_lsb: 8 | PicOrderCnt: 8
DTS: 775 | PTS: 774 | NAL-Type 1: frame_num: 4 // slice_type: 6 // pic_order_cnt_lsb: 6 | PicOrderCnt: 6
DTS: 776 | PTS: 779 | NAL-Type 1: frame_num: 4 // slice_type: 5 // pic_order_cnt_lsb: 16 | PicOrderCnt: 16
...
DTS: 815 | PTS: 818 | NAL-Type 1: frame_num: 14 // slice_type: 5 // pic_order_cnt_lsb: 30 | PicOrderCnt: 94
DTS: 816 | PTS: 817 | NAL-Type 1: frame_num: 15 // slice_type: 6 // pic_order_cnt_lsb: 28 | PicOrderCnt: 92
DTS: 817 | PTS: 816 | NAL-Type 1: frame_num: 0 // slice_type: 6 // pic_order_cnt_lsb: 26 | PicOrderCnt: 90
DTS: 818 | PTS: 821 | NAL-Type 1: frame_num: 0 // slice_type: 5 // pic_order_cnt_lsb: 36 | PicOrderCnt: 100
DTS: 819 | PTS: 820 | NAL-Type 1: frame_num: 1 // slice_type: 6 // pic_order_cnt_lsb: 34 | PicOrderCnt: 98
DTS: 820 | PTS: 819 | NAL-Type 1: frame_num: 2 // slice_type: 6 // pic_order_cnt_lsb: 32 | PicOrderCnt: 96
DTS: 821 | PTS: 824 | NAL-Type 1: frame_num: 2 // slice_type: 5 // pic_order_cnt_lsb: 42 | PicOrderCnt: 106
DTS: 822 | PTS: 823 | NAL-Type 1: frame_num: 3 // slice_type: 6 // pic_order_cnt_lsb: 40 | PicOrderCnt: 104
DTS: 823 | PTS: 822 | NAL-Type 1: frame_num: 4 // slice_type: 6 // pic_order_cnt_lsb: 38 | PicOrderCnt: 102
DTS: 824 | PTS: 827 | NAL-Type 1: frame_num: 4 // slice_type: 5 // pic_order_cnt_lsb: 48 | PicOrderCnt: 112
DTS: 825 | PTS: 826 | NAL-Type 1: frame_num: 5 // slice_type: 6 // pic_order_cnt_lsb: 46 | PicOrderCnt: 110
I get the errors starting at PTS 819.
Here is my code:
func decodeGroup(_ group: AviH264Analyzer.GOP, fps: Double) {
    DispatchQueue(label: "decode").async {
        let sps = group.spsNAL.bytesWithoutStartCode
        let pps = group.ppsNAL.bytesWithoutStartCode

        // Build the format description from the SPS/PPS parameter sets.
        var formatDesc: CMVideoFormatDescription?
        var status = sps.withUnsafeBufferPointer { spsPtr in
            pps.withUnsafeBufferPointer { ppsPtr in
                let paramSet = [spsPtr.baseAddress!, ppsPtr.baseAddress!]
                let paramSizes = [sps.count, pps.count]
                return paramSet.withUnsafeBufferPointer { paramSetPtr in
                    paramSizes.withUnsafeBufferPointer { paramSizesPtr in
                        CMVideoFormatDescriptionCreateFromH264ParameterSets(allocator: nil,
                                                                            parameterSetCount: 2,
                                                                            parameterSetPointers: paramSetPtr.baseAddress!,
                                                                            parameterSetSizes: paramSizesPtr.baseAddress!,
                                                                            nalUnitHeaderLength: 4,
                                                                            formatDescriptionOut: &formatDesc)
                    }
                }
            }
        }

        // Output callback: convert each decoded image buffer into an NSImage
        // and store it in the shared video buffer, keyed by frame number.
        var callback = VTDecompressionOutputCallbackRecord()
        callback.decompressionOutputCallback = { (_, _, status, _, imageBuffer, pts, _) in
            if let imageBuffer {
                let ciImage = CIImage(cvImageBuffer: imageBuffer)
                if let cgImage = CIContext().createCGImage(ciImage, from: ciImage.extent) {
                    let rep = NSBitmapImageRep(cgImage: cgImage)
                    if let imgData = rep.representation(using: .png, properties: [:]),
                       let nsImage = NSImage(data: imgData) {
                        let frameNumber = Int(pts.value) - 1
                        if !VideoBuffer.shared.buffer.map({ $0.frameNumber }).contains(frameNumber) {
                            VideoBuffer.shared.buffer.append(VideoFrame(frameNumber: frameNumber, image: nsImage))
                        }
                    }
                }
            }
        }

        let decoderParameters = NSMutableDictionary()
        let destinationPixelBufferAttributes = NSMutableDictionary()
        destinationPixelBufferAttributes.setValue(
            NSNumber(value: kCVPixelFormatType_32ARGB),
            forKey: kCVPixelBufferPixelFormatTypeKey as String
        )

        var decompressionSession: VTDecompressionSession?
        status = VTDecompressionSessionCreate(allocator: kCFAllocatorDefault,
                                              formatDescription: formatDesc!,
                                              decoderSpecification: decoderParameters,
                                              imageBufferAttributes: destinationPixelBufferAttributes,
                                              outputCallback: &callback,
                                              decompressionSessionOut: &decompressionSession)
        if status != noErr {
            handleStatus(status)
        } else {
            print("DecompressionSession successfully created")
        }

        let nalus = group.nalus
        self.decodeNALUnits(nalus: nalus,
                            order: group.order,
                            fps: fps,
                            formatDesc: formatDesc!,
                            decompressionSession: decompressionSession!)
    }
}
func decodeNALUnits(nalus: [PictureNAL], order: [Int], fps: Double, formatDesc: CMVideoFormatDescription, decompressionSession: VTDecompressionSession) {
    // Convert every NAL unit to AVCC format: a 4-byte big-endian length
    // prefix instead of the Annex B start code.
    var videoData = [UInt8]()
    var sampleSizeArray = [Int]()
    for nalu in nalus {
        var bytes = nalu.bytesWithoutStartCode
        // the length of the NALU
        var bigLen = CFSwapInt32HostToBig(UInt32(bytes.count))
        bytes.insert(contentsOf: withUnsafeBytes(of: &bigLen, { Array($0) }), at: 0)
        videoData += bytes
        sampleSizeArray.append(bytes.count)
    }

    var blockBuffer: CMBlockBuffer?
    let count = videoData.count
    var status = videoData.withUnsafeMutableBufferPointer { bufferPointer in
        return CMBlockBufferCreateWithMemoryBlock(allocator: kCFAllocatorDefault,
                                                  memoryBlock: bufferPointer.baseAddress!,
                                                  blockLength: count,
                                                  blockAllocator: kCFAllocatorNull,
                                                  customBlockSource: nil,
                                                  offsetToData: 0,
                                                  dataLength: count,
                                                  flags: 0,
                                                  blockBufferOut: &blockBuffer)
    }
    if status != noErr {
        handleStatus(status)
    } else {
        print("CMBlockBufferCreateWithMemoryBlock success")
    }

    // One CMSampleTimingInfo per NAL unit; the PicOrderCnt serves as the PTS.
    let frameDuration = CMTimeMake(value: 1, timescale: Int32(fps))
    var timing = [CMSampleTimingInfo]()
    for i in 0..<nalus.count {
        let pts = order[i]
        let presentationTime = CMTimeMake(value: Int64(pts), timescale: Int32(fps))
        let timingInfo = CMSampleTimingInfo(duration: frameDuration,
                                            presentationTimeStamp: presentationTime,
                                            decodeTimeStamp: CMTime.invalid)
        timing.append(timingInfo)
    }

    var sampleBuffer: CMSampleBuffer?
    status = CMSampleBufferCreateReady(allocator: kCFAllocatorDefault,
                                       dataBuffer: blockBuffer,
                                       formatDescription: formatDesc,
                                       sampleCount: sampleSizeArray.count,
                                       sampleTimingEntryCount: timing.count,
                                       sampleTimingArray: &timing,
                                       sampleSizeEntryCount: sampleSizeArray.count,
                                       sampleSizeArray: sampleSizeArray,
                                       sampleBufferOut: &sampleBuffer)
    if status != noErr {
        handleStatus(status)
    } else {
        print("CMSampleBufferCreateReady success")
    }
    guard let buffer = sampleBuffer else {
        print("Could not unwrap sampleBuffer!")
        return
    }

    var outputBuffer: CVPixelBuffer?
    status = VTDecompressionSessionDecodeFrame(decompressionSession,
                                               sampleBuffer: buffer,
                                               flags: [._EnableAsynchronousDecompression, ._EnableTemporalProcessing],
                                               frameRefcon: &outputBuffer,
                                               infoFlagsOut: nil)
    if status != noErr {
        print(status)
        handleStatus(status)
    } else {
        print("VTDecompressionSessionDecodeFrame success")
    }
}
Best Answer
I can hardly believe it... I found the solution by comparing the NAL units from my own parser with the NAL units parsed by libav. The only difference I could find was that my parser removed the emulation prevention bytes (section 7.3.1 of the H264 spec). After some trial and error, this is my solution:
The SPS and PPS have to be passed into CMVideoFormatDescriptionCreateFromH264ParameterSets without emulation prevention bytes (the 0x03 removed from 0x000003) and without start codes (0x000001 or 0x00000001).
The VCL NAL units have to be passed into VTDecompressionSessionDecodeFrame in AVCC format and with their emulation prevention bytes intact.
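A minimal sketch of the de-escaping, assuming the input bytes start right after the start code; the helper name withoutEmulationPrevention is mine, not from my actual parser:
// Drop each emulation prevention byte: 0x00 0x00 0x03 -> 0x00 0x00 (spec 7.3.1).
// Illustrative helper; apply it to parameter sets only.
func withoutEmulationPrevention(_ nal: [UInt8]) -> [UInt8] {
    var out = [UInt8]()
    var zeroRun = 0
    for byte in nal {
        if zeroRun >= 2 && byte == 0x03 {
            zeroRun = 0        // the 0x03 is dropped and ends the zero run
            continue
        }
        zeroRun = (byte == 0x00) ? zeroRun + 1 : 0
        out.append(byte)
    }
    return out
}
So only the SPS and PPS get de-escaped before CMVideoFormatDescriptionCreateFromH264ParameterSets; the VCL NAL units keep their emulation prevention bytes and only get the 4-byte length prefix.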
Original question on Stack Overflow: h.264 - kVTVideoDecoderBadDataErr when using VTDecompressionSessionDecodeFrame with H264 NAL units: https://stackoverflow.com/questions/76281273/