FFmpeg Audio/Video Development from Beginner to Expert: Extracting Audio with FFmpeg

article 2025/2/19 6:19:45

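Contents

- Extracting an audio stream with FFmpeg
  - 1. Including the FFmpeg headers and declaring C linkage
  - 2. The main function and argument handling
  - 3. Opening the input file
  - 4. Reading the file information
  - 5. Finding the audio stream
  - 6. Allocating the output format context
  - 7. Guessing the output format
  - 8. Creating the new audio stream
  - 9. Opening the output file
  - 10. Writing the file header
  - 11. Reading and writing the audio data
  - 12. Writing the file trailer and releasing resources
- Running the program
- Notes
- Complete audio extraction code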

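Extracting an audio stream with FFmpeg

1. Including the FFmpeg headers and declaring C linkage

The FFmpeg headers must be included before the library can be used. FFmpeg is a C library, so in C++ the includes are wrapped in an extern "C" block, which gives the declarations C linkage and lets the linker find the symbols.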

#ifdef __cplusplus
extern "C" {
#endif
#include <libavformat/avformat.h>
#include <libavutil/avutil.h>
#include <libavutil/log.h>
#ifdef __cplusplus
}
#endif

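2. The main function and argument handling

The program entry point validates the command-line arguments. argv[0] is the program name, so an input path plus an output path means argc must be at least 3.

int main(int argc, char *argv[]) {
    // Check the arguments: program name + input path + output path
    if (argc < 3) {
        av_log(nullptr, AV_LOG_ERROR, "usage: %s <input> <output>\n", argv[0]);
        exit(-1);
    }
    // Input and output file paths
    char *src = argv[1];
    char *dst = argv[2];
    // ...
}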
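
The argument check can also be pulled into a small helper that is easy to test on its own. The following is a minimal sketch in plain C++ with no FFmpeg dependency; the names Paths and parse_paths are hypothetical and are not part of the original program.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper type: the two paths the tool needs.
struct Paths {
    std::string src;  // input media file
    std::string dst;  // output audio file
};

// Returns true and fills `out` when both paths are present.
// argv[0] is the program name, so two paths require argc >= 3.
bool parse_paths(int argc, const char *const argv[], Paths &out) {
    if (argc < 3 || argv[1] == nullptr || argv[2] == nullptr) {
        return false;
    }
    out.src = argv[1];
    out.dst = argv[2];
    return true;
}
```

In main this would be called as parse_paths(argc, argv, paths), printing the usage message when it returns false.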

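3. Opening the input file

avformat_open_input opens the input file.
It is the FFmpeg function that opens a media file or stream and reads its header. It is one of the basic building blocks for handling media in FFmpeg and is usually the first call made before demuxing and decoding audio or video streams.

Function prototype

int avformat_open_input(AVFormatContext **ps, const char *url, const AVInputFormat *fmt, AVDictionary **options);

(The fmt parameter is const-qualified since FFmpeg 5.0; older releases declare it as AVInputFormat *fmt.)

Parameters
1. AVFormatContext **ps:
  - A pointer to an AVFormatContext pointer. AVFormatContext is the structure that holds the information about the input media: its streams, container format, duration, and so on.
  - *ps may be NULL, which is the usual case: the function then allocates the context itself. A context pre-allocated with avformat_alloc_context() is only needed when fields must be set before opening, for example custom I/O. A caller-supplied context is freed if the call fails.
  - On success, *ps points to a populated AVFormatContext.
2. const char *url:
  - The path or URL of the media to open. It can be a local file path or the URL of a network stream (HTTP, RTSP, and so on).
3. const AVInputFormat *fmt:
  - The input format to force. If NULL, FFmpeg detects the format automatically.
  - A specific input format can be looked up with av_find_input_format.
4. AVDictionary **options:
  - A pointer to a dictionary of options for the format context and demuxer, such as buffer sizes or timeouts. On return, the dictionary contains the options that were not recognized.
  - Pass NULL if there are no extra options.

Return value

The function returns an integer:
- 0 on success.
- A negative error code on failure. av_strerror converts the code into a readable message.

Example code

The following short example shows how to open a media file with avformat_open_input: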

#include <stdio.h>
#include <libavformat/avformat.h>

int main() {
    AVFormatContext *formatContext = NULL;
    const char *filename = "input.mp4"; // input file name
    int ret;

    // No registration call is needed: av_register_all() is deprecated
    // since FFmpeg 4.0 and was removed in FFmpeg 5.0.

    // Open the input file
    ret = avformat_open_input(&formatContext, filename, NULL, NULL);
    if (ret < 0) {
        char errbuf[128];
        av_strerror(ret, errbuf, sizeof(errbuf));
        fprintf(stderr, "Could not open source file %s: %s\n", filename, errbuf);
        return ret;
    }

    // Print the file information
    av_dump_format(formatContext, 0, filename, 0);

    // Close the input file
    avformat_close_input(&formatContext);

    return 0;
}

avformat_open_input is the key function for opening media in FFmpeg. With the right arguments it opens audio and video files in many formats and exposes their container-level information. In the extraction program the call looks like this:

ret = avformat_open_input(&pFmtCtx, src, nullptr, nullptr);
if (ret < 0) {
    av_log(nullptr, AV_LOG_ERROR, "failed to open the input file\n");
    exit(-1);
}

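4. Reading the file information

avformat_find_stream_info reads the stream information of the media file.
It reads a portion of the file to fill in details about each stream (audio, video, subtitles, and so on). It is normally called right after the input has been opened successfully.

Function prototype

int avformat_find_stream_info(AVFormatContext *ic, AVDictionary **options);

Parameters
1. AVFormatContext *ic:
  - The AVFormatContext that was filled in by avformat_open_input. It holds the information about the opened media: the number of streams, the type of each stream, the codec parameters, and so on.
  - The media file must have been opened successfully, and the AVFormatContext properly initialized, before this function is called.
2. AVDictionary **options:
  - If non-NULL, an array of ic->nb_streams dictionary pointers, where the i-th dictionary holds options for the codec of the i-th stream.
  - Pass NULL if there are no extra options.

Return value

The function returns an integer:
- A value >= 0 on success.
- A negative error code on failure. av_strerror converts the code into a readable message.

The main job of avformat_find_stream_info is to fill in the stream information inside the AVFormatContext so that later processing and decoding can rely on it.

if ((ret = avformat_find_stream_info(pFmtCtx, nullptr)) < 0) {
    av_log(nullptr, AV_LOG_ERROR, "failed to read the stream information\n");
    exit(-1);
}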

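5. Finding the audio stream

Walk all streams and record the index of the first audio stream. idx is initialized to -1, so a negative value after the loop means the file has no audio stream.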

for (int i = 0; i < pFmtCtx->nb_streams; ++i) {
    if (pFmtCtx->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_AUDIO) {
        idx = i;
        break;
    }
}
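
The search logic, including the "not found" case, can be sketched without FFmpeg. The code below is an illustration only: MediaType is a stand-in for AVMediaType and first_audio_index is a hypothetical name. FFmpeg itself also offers av_find_best_stream for selecting a stream of a given type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for AVMediaType; the real enum lives in libavutil.
enum class MediaType { Video, Audio, Subtitle };

// Returns the index of the first audio stream, or -1 when there is none.
int first_audio_index(const std::vector<MediaType> &streams) {
    for (std::size_t i = 0; i < streams.size(); ++i) {
        if (streams[i] == MediaType::Audio) {
            return static_cast<int>(i);
        }
    }
    return -1;  // sentinel: the caller must check for a negative index
}
```

Returning -1 as a sentinel mirrors the idx variable of the program, which is checked with idx < 0 before the stream is used.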

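6. Allocating the output format context

avformat_alloc_context allocates the format context for the output file.

Parameters
- None.

Return value
- A pointer to the newly allocated AVFormatContext.
- NULL if the allocation fails.

Summary

Purpose: allocates an AVFormatContext and initializes it with default values. The structure holds the container-level state of a media file.
Usage: here it is called before the output file is set up, so that the output format, the stream, and the I/O context can be attached to it in the following steps.

oFmtCtx = avformat_alloc_context();
if (!oFmtCtx) {
    av_log(nullptr, AV_LOG_ERROR, "failed to allocate the output format context\n");
    goto _ERROR;
}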

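7. Guessing the output format

av_guess_format guesses the format of the output file.
It looks up a registered output format (muxer) from a short name, a file name, or a MIME type. It is the usual way to pick a muxer from the extension of the destination file.

const AVOutputFormat *av_guess_format(const char *short_name, const char *filename, const char *mime_type);

(The return type is const-qualified since FFmpeg 5.0; older releases return AVOutputFormat *.)

Purpose

By file extension: guesses the format from the extension of the given file name.
By MIME type: if a MIME type is given, it is matched as well.
Result: returns a pointer to the matching AVOutputFormat, or NULL if no format matches.

Parameters

const char *short_name: a short format name (such as "mp4" or "avi") that is matched directly against the registered formats.
const char *filename: a file name, usually with an extension, from which the format is inferred.
const char *mime_type: a MIME type string (such as "video/mp4") that is matched against the registered formats.

Usage

Before the output file is written, av_guess_format selects a suitable output format, which determines the muxer used in the following steps. The result must be checked, because an unknown extension yields NULL.

oFmtCtx->oformat = av_guess_format(nullptr, dst, nullptr);
if (!oFmtCtx->oformat) {
    av_log(nullptr, AV_LOG_ERROR, "could not guess the output format\n");
    goto _ERROR;
}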

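8. Creating the new audio stream

Create a new audio stream in the output file and copy the parameters of the input audio stream into it.

The avformat_new_stream function:

AVStream *avformat_new_stream(AVFormatContext *s, const AVCodec *c);

Parameters
1. AVFormatContext *s:
  - The format context to which the new stream is added.
2. const AVCodec *c:
  - An optional codec. Passing NULL does not select a default codec; it is the usual value when packets are copied without re-encoding, because the codec parameters are filled in afterwards through codecpar.

Return value
- A pointer to the newly created AVStream, or NULL if creation fails.

Summary

avformat_new_stream adds a new stream to the given format context. The stream's parameters are then set by copying codecpar from the input stream. Setting codec_tag to 0 lets the muxer choose a tag that is valid for the output container.

inStream = pFmtCtx->streams[idx];
outStream = avformat_new_stream(oFmtCtx, nullptr);
if (!outStream) {
    av_log(nullptr, AV_LOG_ERROR, "failed to create the output stream\n");
    goto _ERROR;
}
avcodec_parameters_copy(outStream->codecpar, inStream->codecpar);
outStream->codecpar->codec_tag = 0;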

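9. Opening the output file

avio_open2 opens the output file for writing.
It creates and initializes an AVIOContext for the resource named by a URL (a local file, a network address, and so on). It accepts an interrupt callback and an options dictionary to configure how the resource is opened.

int avio_open2(AVIOContext **s, const char *url, int flags, const AVIOInterruptCB *int_cb, AVDictionary **options);

Purpose

Opens an I/O resource: opens the resource identified by the URL, which can be a file path or a network address.
Options: extra options can be passed in a dictionary to customize how the resource is opened.

Parameters
1. AVIOContext **s:
  - A pointer to an AVIOContext pointer that receives the opened I/O context. On success, *s points to an initialized AVIOContext.
2. const char *url:
  - The URL of the resource to open, for example a file path or a network address.
3. int flags:
  - Flags that select the access mode. Common values:
    - AVIO_FLAG_READ: open read-only.
    - AVIO_FLAG_WRITE: open for writing.
    - AVIO_FLAG_READ_WRITE: open for reading and writing.
4. const AVIOInterruptCB *int_cb:
  - An optional interrupt callback used at the protocol level to abort blocking operations. Pass NULL if it is not needed.
5. AVDictionary **options:
  - A pointer to a dictionary of protocol options, such as timeouts or buffer sizes. Pass NULL if there are no extra options.

Return value

The function returns an integer:
- A value >= 0 on success.
- A negative error code on failure.

Summary
avio_open2 is the key function for opening an I/O resource. It supports several access modes and configuration options and works for reading and writing both files and network streams.

ret = avio_open2(&oFmtCtx->pb, dst, AVIO_FLAG_WRITE, nullptr, nullptr);
if (ret < 0) {
    av_log(nullptr, AV_LOG_ERROR, "failed to open the output file\n");
    goto _ERROR;
}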

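10. Writing the file header

avformat_write_header writes the file header.

Purpose

Writes the header: writes the header information that a player or demuxer needs in order to parse the data that follows.
Initializes the muxer: prepares the muxer for writing, based on the output format and the streams set in the AVFormatContext.

Parameters
1. AVFormatContext *s:
  - The output format context to write to. It must be fully set up before the call: the output format, the I/O context, and the streams must already be in place.
2. AVDictionary **options:
  - A pointer to a dictionary of extra muxer options. Pass NULL if there are no extra options.

ret = avformat_write_header(oFmtCtx, nullptr);
if (ret < 0) {
    av_log(nullptr, AV_LOG_ERROR, "failed to write the file header\n");
    goto _ERROR;
}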

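11. Reading and writing the audio data

Packets are read from the input one at a time. Packets that belong to the audio stream have their timestamps rescaled from the input stream's time base to the output stream's time base, are re-indexed to the single output stream, and are written to the output file. All other packets are discarded.

while (av_read_frame(pFmtCtx, &pkt) >= 0) {
    if (pkt.stream_index == idx) {
        // Rescale the timestamps from the input to the output time base
        pkt.pts = av_rescale_q_rnd(pkt.pts, inStream->time_base, outStream->time_base,
                                   (AVRounding)(AV_ROUND_NEAR_INF | AV_ROUND_PASS_MINMAX));
        pkt.dts = pkt.pts;
        pkt.duration = av_rescale_q(pkt.duration, inStream->time_base, outStream->time_base);
        // The output file has a single stream, so its index is 0
        pkt.stream_index = 0;
        pkt.pos = -1;
        // Write the packet to the output file
        av_interleaved_write_frame(oFmtCtx, &pkt);
    }
    av_packet_unref(&pkt);
}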
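
To make the timestamp conversion concrete, here is a simplified model of the arithmetic in plain C++. It is a sketch, not FFmpeg's implementation: Rational stands in for AVRational, rescale_near_inf is a hypothetical name, and the code assumes positive time bases and intermediate products that fit in 64 bits, whereas the real av_rescale_q_rnd also guards against overflow.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Stand-in for AVRational: the fraction num/den.
struct Rational {
    int num;
    int den;
};

// Simplified model of rescaling with round-to-nearest (ties away from zero)
// and pass-through of the minimum and maximum 64-bit values.
std::int64_t rescale_near_inf(std::int64_t a, Rational from, Rational to) {
    // Sentinel values such as "no timestamp" pass through unchanged.
    if (a == std::numeric_limits<std::int64_t>::min() ||
        a == std::numeric_limits<std::int64_t>::max()) {
        return a;
    }
    // a * from / to == a * (from.num * to.den) / (from.den * to.num)
    const std::int64_t b = static_cast<std::int64_t>(from.num) * to.den;
    const std::int64_t c = static_cast<std::int64_t>(to.num) * from.den;
    if (a < 0) {
        return -rescale_near_inf(-a, from, to);  // round symmetrically
    }
    return (a * b + c / 2) / c;  // add half the divisor, then truncate
}
```

For example, 1024 ticks in a 1/44100 time base are about 23.22 ms, so rescaling to a 1/1000 time base gives 23.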

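12. Writing the file trailer and releasing resources

Write the file trailer, close the files, and release all allocated resources.

av_write_trailer(oFmtCtx);

_ERROR:
    // Release resources. The success path falls through to here as well,
    // so every resource is released in this one place only.
    if (pFmtCtx) {
        // Closes the input file, frees the context, and sets pFmtCtx to nullptr
        avformat_close_input(&pFmtCtx);
    }
    if (oFmtCtx) {
        if (oFmtCtx->pb) {
            avio_closep(&oFmtCtx->pb);
        }
        avformat_free_context(oFmtCtx);
        oFmtCtx = nullptr;
    }
}

Note that the two contexts are released differently. The input context was opened with avformat_open_input and is therefore closed with avformat_close_input. The output context was allocated with avformat_alloc_context and must be freed with avformat_free_context, after its I/O context has been closed. Because the code after av_write_trailer falls through into the _ERROR label, freeing oFmtCtx both before and inside the label would free it twice. The program still has room for improvement: the early failures call exit directly, and some return values are not checked.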
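
An alternative to the goto-based cleanup in C++ is to tie each resource to an owning object, so that it is released exactly once on every exit path. The sketch below shows the idea with std::unique_ptr and a custom deleter. It uses a stand-in type rather than the real FFmpeg structures: FakeContext, FakeContextDeleter, and process are hypothetical names, and in a real program the deleter would call the matching FFmpeg release function instead.

```cpp
#include <cassert>
#include <memory>

// Stand-in for an FFmpeg context; `freed` counts how often cleanup ran.
struct FakeContext {
    int *freed;
};

// Deleter that plays the role of the release function in this sketch.
struct FakeContextDeleter {
    void operator()(FakeContext *ctx) const {
        ++*ctx->freed;
        delete ctx;
    }
};

using ContextPtr = std::unique_ptr<FakeContext, FakeContextDeleter>;

// Simulates a processing step that may bail out early.
// The context is released exactly once on either path.
bool process(bool fail_early, int &freed) {
    ContextPtr ctx(new FakeContext{&freed});
    if (fail_early) {
        return false;  // early exit: the deleter still runs
    }
    return true;       // normal exit: the deleter runs here
}
```

With this pattern there is no shared cleanup label, which removes the risk of freeing the same context twice.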

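Running the program

The program takes at least two arguments: the input file path and the output file path. For example:

./my_ffmpeg_tool input.mp4 output.aac

The audio packets are copied without re-encoding, so the output container must support the codec of the input audio stream. For example, AAC audio from an MP4 file can be written to an .aac file.

Notes

- Make sure the FFmpeg development libraries are installed and can be linked.
- Check the error messages printed by the program when debugging.
- The program needs read permission for the input file and write permission for the output location.

Complete audio extraction code

CMakeLists.txt: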

cmake_minimum_required(VERSION 3.27)
project(FFmpeg_exercise)
set(CMAKE_CXX_STANDARD 14)

# FFmpeg installation prefix
set(FFMPEG_INSTALL_DIR "/usr/local/ffmpeg")

# Add the FFmpeg headers to the include path
include_directories(${FFMPEG_INSTALL_DIR}/include)

# FFmpeg libraries to link (adjust as needed)
set(FFMPEG_LIBS "avcodec;avformat;avutil") # semicolon-separated list

# Locate each FFmpeg library
foreach(FFMPEG_LIB ${FFMPEG_LIBS})
    find_library(${FFMPEG_LIB}_LIBRARY NAMES ${FFMPEG_LIB}
            PATHS ${FFMPEG_INSTALL_DIR}/lib NO_DEFAULT_PATH)
    list(APPEND FFMPEG_LIBRARIES ${${FFMPEG_LIB}_LIBRARY})
endforeach()

add_executable(FFmpeg_exercise # main.cpp
        extra_audic.cpp)
# Link the FFmpeg libraries
target_link_libraries(FFmpeg_exercise ${FFMPEG_LIBRARIES})

extra_audic.cpp:

//
// Created by 陈伟峰 on 2024/6/22.
//
#ifdef __cplusplus
extern "C" {
#endif
// FFmpeg headers
#include <libavformat/avformat.h>
#include <libavutil/avutil.h>
#include <libavutil/log.h>
#ifdef __cplusplus
}
#endif
#include <cstdlib> // exit

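int main(int argc, char *argv[]) {
    int ret = -1;
    int idx = -1;
    // 1. Command-line arguments and the state shared with the cleanup code
    char *src {nullptr};
    char *dst {nullptr};
    AVFormatContext *pFmtCtx {nullptr};
    AVFormatContext *oFmtCtx {nullptr};

    AVStream *inStream {nullptr};
    AVStream *outStream {nullptr};
    AVPacket pkt {};

    // Set the log level
    av_log_set_level(AV_LOG_DEBUG);

    if (argc < 3) {
        av_log(nullptr, AV_LOG_ERROR, "usage: %s <input> <output>\n", argv[0]);
        exit(-1);
    }

    src = argv[1];
    dst = argv[2];

    // 2. Open the input media file
    ret = avformat_open_input(&pFmtCtx, src, nullptr, nullptr);
    if (ret < 0) {
        av_log(nullptr, AV_LOG_ERROR, "avformat_open_input failed\n");
        exit(-1);
    }

    // 3. Read the stream information
    if ((ret = avformat_find_stream_info(pFmtCtx, nullptr)) < 0) {
        av_log(nullptr, AV_LOG_ERROR, "avformat_find_stream_info failed\n");
        exit(-1);
    }

    // 4. Walk all streams and pick the first audio stream
    for (unsigned int i = 0; i < pFmtCtx->nb_streams; ++i) {
        if (pFmtCtx->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_AUDIO) {
            idx = (int)i;
            av_log(nullptr, AV_LOG_INFO, "found audio stream at index %d\n", idx);
            break;
        }
    }

    if (idx < 0) {
        av_log(nullptr, AV_LOG_ERROR, "can not find audio stream\n");
        exit(-1);
    }

    // 5. Allocate the output format context
    oFmtCtx = avformat_alloc_context();
    if (!oFmtCtx) {
        av_log(nullptr, AV_LOG_ERROR, "avformat_alloc_context failed\n");
        ret = AVERROR(ENOMEM);
        goto _ERROR;
    }

    // 6. Guess the output format (muxer) from the destination file name
    oFmtCtx->oformat = av_guess_format(nullptr, dst, nullptr);
    if (!oFmtCtx->oformat) {
        av_log(nullptr, AV_LOG_ERROR, "av_guess_format failed\n");
        ret = AVERROR_MUXER_NOT_FOUND;
        goto _ERROR;
    }

    // 7. Create the audio stream in the output and copy the codec parameters
    outStream = avformat_new_stream(oFmtCtx, nullptr);
    if (!outStream) {
        av_log(nullptr, AV_LOG_ERROR, "avformat_new_stream failed\n");
        ret = AVERROR(ENOMEM);
        goto _ERROR;
    }
    inStream = pFmtCtx->streams[idx];
    ret = avcodec_parameters_copy(outStream->codecpar, inStream->codecpar);
    if (ret < 0) {
        av_log(nullptr, AV_LOG_ERROR, "avcodec_parameters_copy failed\n");
        goto _ERROR;
    }
    // Let the muxer choose a codec tag that is valid for the output container
    outStream->codecpar->codec_tag = 0;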
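    // 8. Open the output file and attach it to the output context
    ret = avio_open2(&oFmtCtx->pb, dst, AVIO_FLAG_WRITE, nullptr, nullptr);
    if (ret < 0) {
        av_log(nullptr, AV_LOG_ERROR, "avio_open2 failed\n");
        goto _ERROR;
    }
    // 9. Write the file header
    ret = avformat_write_header(oFmtCtx, nullptr);
    if (ret < 0) {
        // av_err2str expands to a C compound literal, which g++ rejects in
        // C++ code, so av_strerror is used with a local buffer instead
        char errbuf[AV_ERROR_MAX_STRING_SIZE] = {0};
        av_strerror(ret, errbuf, sizeof(errbuf));
        av_log(nullptr, AV_LOG_ERROR, "avformat_write_header failed: %s\n", errbuf);
        goto _ERROR;
    }
    // 10. Read packets from the input and copy the audio packets
    while (av_read_frame(pFmtCtx, &pkt) >= 0) {
        if (pkt.stream_index == idx) {
            // Rescale the timestamps from the input to the output time base
            pkt.pts = av_rescale_q_rnd(pkt.pts, inStream->time_base, outStream->time_base,
                                       (AVRounding)(AV_ROUND_NEAR_INF | AV_ROUND_PASS_MINMAX));
            pkt.dts = pkt.pts;
            pkt.duration = av_rescale_q(pkt.duration, inStream->time_base, outStream->time_base);
            // The output file has a single stream, so its index is 0
            pkt.stream_index = 0;
            pkt.pos = -1;
            // Write the packet to the output file
            if (av_interleaved_write_frame(oFmtCtx, &pkt) < 0) {
                av_log(nullptr, AV_LOG_ERROR, "av_interleaved_write_frame failed\n");
            }
        }
        av_packet_unref(&pkt);
    }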

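    // 11. Write the file trailer
    ret = av_write_trailer(oFmtCtx);
    if (ret < 0) {
        av_log(nullptr, AV_LOG_ERROR, "av_write_trailer failed\n");
    }

_ERROR:
    // 12. Release resources. The success path falls through to here as well,
    //     so every resource is released in this one place only.
    if (pFmtCtx) {
        // Contexts opened with avformat_open_input are closed with
        // avformat_close_input, which also sets pFmtCtx to nullptr
        avformat_close_input(&pFmtCtx);
    }
    if (oFmtCtx) {
        if (oFmtCtx->pb) {
            avio_closep(&oFmtCtx->pb);
        }
        avformat_free_context(oFmtCtx);
        oFmtCtx = nullptr;
    }
    return ret < 0 ? 1 : 0;
}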