c++ - Converting RGB8 to NV12 with libav/ffmpeg

Tags: c++ ffmpeg libav nv12-nv21

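I am trying to convert an input RGB8 image to NV12 using libav, but sws_scale raises a read access violation. I must have the planes or the stride wrong, but I cannot see why.
At this point I think I would benefit from a fresh pair of eyes. What am I missing?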


void convertRGB2NV12(unsigned char *rgb_in, width, height) {
 struct SwsContext* sws_context = nullptr;
 const int in_linesize[1] = {3 * width}; // RGB stride
 int out_linesize[2] = {width, width}; // NV12 stride

 // NV12 data is separated in two
 // planes, one for the intensity (Y) and another one for
 // the colours(UV) interleaved, both with
 // the same width as the frame but the UV plane with
 // half of its height.
 uint8_t* out_planes[2];
 out_planes[0] = new uint8_t[width * height];
 out_planes[1] = new uint8_t[width * height/2];

 sws_context = sws_getCachedContext(sws_context, width, height,
                                    AV_PIX_FMT_RGB8, width, height,
                                    AV_PIX_FMT_NV12, 0, 0, 0, 0);
 sws_scale(sws_context, (const uint8_t* const*)rgb_in, in_linesize,
           0, height, out_planes, out_linesize);
// (.....)
}

Best answer

There are two main issues:

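  • Replace AV_PIX_FMT_RGB8 with AV_PIX_FMT_RGB24. AV_PIX_FMT_RGB8 is a packed format with one byte per pixel, whereas the input here (stride 3 * width) has three bytes per pixel.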
  • rgb_in should be "wrapped" in an array of pointers:
     const uint8_t* in_planes[1] = {rgb_in};
    
     sws_scale(sws_context, in_planes, ...)
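The reason the original cast crashes: sws_scale expects its source argument to be an array of plane pointers and dereferences the first element to find the pixel data. Casting rgb_in itself to const uint8_t* const* makes the first pixel bytes get interpreted as an address. The sketch below imitates that access pattern without libswscale; firstByteOfPlane0 and readThroughWrapper are hypothetical stand-ins for illustration, not FFmpeg functions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for how a function such as sws_scale reads its
// source argument: an array of plane pointers, not the pixel data itself.
inline uint8_t firstByteOfPlane0(const uint8_t* const planes[])
{
    assert(planes != nullptr && planes[0] != nullptr);
    return planes[0][0]; // follow the plane pointer, then read the pixel byte
}

// Wraps a packed RGB buffer in a one-element plane array and reads it back.
inline uint8_t readThroughWrapper(const uint8_t* rgb_in)
{
    const uint8_t* in_planes[1] = {rgb_in}; // packed RGB24 has a single plane
    return firstByteOfPlane0(in_planes);
    // Passing (const uint8_t* const*)rgb_in instead would treat the first
    // pixel bytes as a pointer, which is undefined behaviour.
}
```

With the wrapper, the callee reaches the pixel data through in_planes[0], which is what the corrected sws_scale call relies on.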
    

  • Testing:
    Use the FFmpeg command-line tool to create a binary input file in RGB24 pixel format:
    ffmpeg -y -f lavfi -i testsrc=size=192x108:rate=1 -vcodec rawvideo -pix_fmt rgb24 -frames 1 -f rawvideo rgb_image.bin
    
    Read the input image (C-style file I/O):
    const int width = 192;
    const int height = 108;
    unsigned char* rgb_in = new uint8_t[width * height * 3];
    
    FILE* f = fopen("rgb_image.bin", "rb");
    fread(rgb_in, 1, width * height * 3, f);
    fclose(f);
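The snippet above assumes the file exists and holds a full frame. A slightly more defensive variant might look like the sketch below; readRgb24Frame is a hypothetical helper name, and returning an empty vector on failure is just one possible convention.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Reads one raw RGB24 frame. Returns an empty vector if the file cannot be
// opened or holds fewer than width * height * 3 bytes.
inline std::vector<uint8_t> readRgb24Frame(const char* path, int width, int height)
{
    const std::size_t expected = static_cast<std::size_t>(width) * height * 3;
    std::vector<uint8_t> frame(expected);

    FILE* f = std::fopen(path, "rb");
    if (f == nullptr)
        return {};

    const std::size_t got = std::fread(frame.data(), 1, expected, f);
    std::fclose(f);

    if (got != expected)
        return {}; // short read: wrong dimensions or truncated file
    return frame;
}
```

The resulting buffer can be passed on as frame.data(), and the vector releases the memory on its own.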
    

    Execute convertRGB2NV12(rgb_in, width, height);.
    Before the end of the function, add temporary code that writes the output to a binary file:
    FILE* f = fopen("nv12_image.bin", "wb");
    fwrite(out_planes[0], 1, width * height, f);
    fwrite(out_planes[1], 1, width * height/2, f);
    fclose(f);
    

    Convert nv12_image.bin to a PNG image file, treating it as grayscale input (for viewing the result):
    ffmpeg -y -f rawvideo -s 192x162 -pix_fmt gray -i nv12_image.bin -pix_fmt rgb24 nv12_image.png
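The 192x162 size in the command above follows from the NV12 layout: a full-resolution Y plane of height rows, followed by an interleaved UV plane of height / 2 rows with the same row width, so the whole buffer viewed as 8-bit grayscale is height * 3 / 2 = 162 rows tall. A small sketch of that arithmetic, assuming even dimensions and tightly packed rows (Nv12Layout and nv12Layout are illustrative names, not part of FFmpeg):

```cpp
#include <cassert>
#include <cstddef>

// Sizes of a tightly packed NV12 frame (no row padding).
struct Nv12Layout
{
    std::size_t y_size;  // bytes in the Y plane: width * height
    std::size_t uv_size; // bytes in the interleaved UV plane: width * height / 2
    int gray_rows;       // rows when the whole buffer is viewed as grayscale
};

inline Nv12Layout nv12Layout(int width, int height)
{
    // Assumption: even dimensions, so the 2x2 chroma subsampling divides evenly.
    assert(width > 0 && height > 0);
    assert(width % 2 == 0 && height % 2 == 0);

    Nv12Layout layout;
    layout.y_size = static_cast<std::size_t>(width) * height;
    layout.uv_size = layout.y_size / 2;
    layout.gray_rows = height * 3 / 2;
    return layout;
}
```

For the 192x108 test image this gives a Y plane of 20736 bytes and a UV plane of 10368 bytes, matching the two fwrite calls above.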
    

    Complete code example:
    #include <stdio.h>
    #include <string.h>
    #include <stdint.h>
    
    extern "C"
    {
    #include <libswscale/swscale.h>
    }
    
    
    void convertRGB2NV12(const unsigned char *rgb_in, int width, int height)
    {
        struct SwsContext* sws_context = nullptr;
        const int in_linesize[1] = {3 * width}; // RGB stride
        const int out_linesize[2] = {width, width}; // NV12 stride
    
        // NV12 data is separated in two
        // planes, one for the intensity (Y) and another one for
        // the colours(UV) interleaved, both with
        // the same width as the frame but the UV plane with
        // half of its height.
        uint8_t* out_planes[2];
        out_planes[0] = new uint8_t[width * height];
        out_planes[1] = new uint8_t[width * height/2];
    
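        sws_context = sws_getCachedContext(sws_context, width, height,
                                        AV_PIX_FMT_RGB24, width, height,
                                        AV_PIX_FMT_NV12, SWS_BILINEAR, nullptr, nullptr, nullptr);
    
        //sws_getCachedContext returns NULL on failure
        if (sws_context == nullptr)
        {
            printf("Error: sws_getCachedContext failed\n");
            delete[] out_planes[0];
            delete[] out_planes[1];
            return;
        }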
    
        const uint8_t* in_planes[1] = {rgb_in};
    
        int response = sws_scale(sws_context, in_planes, in_linesize,
                                 0, height, out_planes, out_linesize);
    
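        if (response < 0)
        {
            printf("Error: sws_scale response = %d\n", response);
            //Release the buffers and the context before the early return
            delete[] out_planes[0];
            delete[] out_planes[1];
            sws_freeContext(sws_context);
            return;
        }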
    
    // (.....)
    
        //Write NV12 output image to binary file (for testing)
        ////////////////////////////////////////////////////////////////////////////
        FILE* f = fopen("nv12_image.bin", "wb");
        if (f != nullptr)
        {
            fwrite(out_planes[0], 1, width * height, f);   //Y plane
            fwrite(out_planes[1], 1, width * height/2, f); //Interleaved UV plane
            fclose(f);
        }
        ////////////////////////////////////////////////////////////////////////////
    
    
        delete[] out_planes[0];
        delete[] out_planes[1];
    
        sws_freeContext(sws_context);
    }
    
    
    
    int main()
    {
        //Use ffmpeg for building raw RGB image (used as input).
        //ffmpeg -y -f lavfi -i testsrc=size=192x108:rate=1 -vcodec rawvideo -pix_fmt rgb24 -frames 1 -f rawvideo rgb_image.bin
        
        const int width = 192;
        const int height = 108;
        unsigned char* rgb_in = new uint8_t[width * height * 3];
    
        //Read input image from binary file (for testing)
        ////////////////////////////////////////////////////////////////////////////
        FILE* f = fopen("rgb_image.bin", "rb");
        if (f == nullptr)
        {
            printf("Error: cannot open rgb_image.bin\n");
            delete[] rgb_in;
            return 1;
        }
        fread(rgb_in, 1, width * height * 3, f);
        fclose(f);
        ////////////////////////////////////////////////////////////////////////////
    
    
        convertRGB2NV12(rgb_in, width, height);
    
        delete[] rgb_in;
    
        return 0;
    }
    

    Input (RGB): [image]
    Output (NV12 displayed as grayscale): [image]

    Convert NV12 back to RGB:
    ffmpeg -y -f rawvideo -s 192x108 -pix_fmt nv12 -i nv12_image.bin -pix_fmt rgb24 rgb_output_image.png
    
    Result: [image]

    Source: Stack Overflow, https://stackoverflow.com/questions/69438388/
