libaom Source Code Analysis: the aomenc.c File
libaom
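libaom is an open-source video codec library developed by AOMedia (the Alliance for Open Media). It is the reference implementation of the AV1 video compression format and is widely used in production systems. libaom supports features such as scalable video coding (SVC) and real-time communication (RTC) optimizations, and it is updated regularly to improve compression efficiency and encoding speed.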
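Some key features of libaom:
- Multiple spatial and temporal layers: the aom_svc_layer_id_t struct carries the spatial-layer and temporal-layer IDs, so a video can be encoded at several resolutions and frame rates.
- Encoding parameter configuration: structs such as aom_svc_params_t configure encoding parameters, including the number of spatial layers, the number of temporal layers, quantizers, and scaling factors.
- Basic encoding parameters: the aom_codec_enc_cfg_t struct configures the encoder's basic parameters, such as usage, timebase, encoding pass, and frame resizing.
- Multi-pass encoding: single-pass, two-pass, and three-pass encoding are supported (the --passes option accepts 1, 2, or 3), to trade encoding time against efficiency and quality.
- Frame super-resolution: the rc_superres_mode enum value selects the super-resolution mode and thereby controls the upscaling process.
- Keyframe placement: the kf_mode enum value determines whether keyframes are placed automatically.
- SVC encoding parameters: SVC-specific parameters such as the number of layers, quantizers, and scaling factors can be configured.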
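To make the layer bullets above more concrete, here is a minimal, self-contained sketch of how per-layer scaling factors translate into per-layer resolutions, and how a frame index can map to a temporal layer. It does not include libaom headers: demo_svc_params is a simplified, hypothetical stand-in whose field names merely follow the parameters listed above, and the dyadic three-layer temporal pattern is an assumption chosen for illustration, not necessarily what any particular libaom example uses.

```c
#include <assert.h>
#include <stdio.h>

#define DEMO_MAX_SPATIAL_LAYERS 4

/* Hypothetical, simplified stand-in for SVC parameters (not the libaom type). */
typedef struct {
  int number_spatial_layers;
  int number_temporal_layers;
  int scaling_factor_num[DEMO_MAX_SPATIAL_LAYERS];
  int scaling_factor_den[DEMO_MAX_SPATIAL_LAYERS];
} demo_svc_params;

/* Example setup: three spatial layers at 1/4, 1/2 and full resolution,
 * and three temporal layers. */
demo_svc_params demo_default_params(void) {
  demo_svc_params p = { 3, 3, { 1, 1, 1, 0 }, { 4, 2, 1, 0 } };
  return p;
}

/* Scale one full-resolution dimension (width or height) to a spatial layer
 * using that layer's num/den scaling factor. */
int demo_layer_dim(const demo_svc_params *p, int spatial_layer_id,
                   int full_dim) {
  assert(spatial_layer_id >= 0 &&
         spatial_layer_id < p->number_spatial_layers);
  assert(p->scaling_factor_den[spatial_layer_id] > 0);
  return full_dim * p->scaling_factor_num[spatial_layer_id] /
         p->scaling_factor_den[spatial_layer_id];
}

/* Assumed dyadic pattern for three temporal layers (period of 4 frames):
 * frames 0, 4, 8, ... -> layer 0; frames 2, 6, ... -> layer 1;
 * all odd frames -> layer 2. */
int demo_temporal_layer(int frame_index) {
  if (frame_index % 4 == 0) return 0;
  if (frame_index % 2 == 0) return 1;
  return 2;
}

/* Print the resolution of every spatial layer for a given input size. */
void demo_print_layers(const demo_svc_params *p, int width, int height) {
  for (int s = 0; s < p->number_spatial_layers; ++s) {
    printf("spatial layer %d: %dx%d\n", s, demo_layer_dim(p, s, width),
           demo_layer_dim(p, s, height));
  }
}
```

Calling demo_print_layers with the default parameters and a 1280x720 input lists 320x180, 640x360, and 1280x720 for the three spatial layers. Decoding only temporal layer 0 of the assumed pattern yields a quarter of the full frame rate, and layers 0 and 1 together yield half.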
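libaom is typically updated about every three months. Recent updates include support for an SVC frame-dropping mode, new build configurations that reduce binary size, and significant compression-efficiency gains for RTC screen content. libaom also provides a real-time encoding mode and optimizations for different quality-control strategies.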
Overview diagram of the libaom source project
aomenc.c
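- Purpose: the demo application in the libaom project that runs the complete encoding process
- File location: libaom/apps/aomenc.c
Function relationships
Command-line usage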
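- Running ./aomenc --help in a terminal prints the output below, which lists the encoding parameters exposed by the aomenc demo. It contains no SVC-related options; to experiment with SVC encoding in libaom, see the file examples/svc_encoder_rtc.cc.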
Usage: ./aomenc <options> -o dst_filename src_filename
Options:
--help Show usage options and exit
-c <arg>, --cfg=<arg> Config file to use
-D, --debug Debug mode (makes output deterministic)
-o <arg>, --output=<arg> Output filename
--codec=<arg> Codec to use
-p <arg>, --passes=<arg> Number of passes (1/2/3)
--pass=<arg> Pass to execute (1/2/3)
--fpf=<arg> First pass statistics file name
--limit=<arg> Stop encoding after n input frames
--skip=<arg> Skip the first n input frames
--good Use Good Quality Deadline
--rt Use Realtime Quality Deadline
--allintra Use all intra mode
-q, --quiet Do not print encode progress
-v, --verbose Show encoder parameters
--psnr=<arg> Show PSNR in status line (0: Disable PSNR status line display, 1: PSNR calculated using input bit-depth (default), 2: PSNR calculated using stream bit-depth); takes default option when arguments are not specified
--webm Output WebM (default when WebM IO is enabled)
--ivf Output IVF
--obu Output OBU
--q-hist=<arg> Show quantizer histogram (n-buckets)
--rate-hist=<arg> Show rate histogram (n-buckets)
--disable-warnings Disable warnings about potentially incorrect encode settings
-y, --disable-warning-prompt Display warnings, but do not prompt user to continue
--test-decode=<arg> Test encode/decode mismatch
off, fatal, warn
Encoder Global Options:
--nv12 Input file is NV12
--yv12 Input file is YV12
--i420 Input file is I420 (default)
--i422 Input file is I422
--i444 Input file is I444
-u <arg>, --usage=<arg> Usage profile number to use (0: good, 1: rt, 2: allintra)
-t <arg>, --threads=<arg> Max number of threads to use
--profile=<arg> Bitstream profile number to use
-w <arg>, --width=<arg> Frame width
-h <arg>, --height=<arg> Frame height
--forced_max_frame_width=<arg>
Maximum frame width value to force
--forced_max_frame_height=<arg>
Maximum frame height value to force
--stereo-mode=<arg> Stereo 3D video format
mono, left-right, bottom-top, top-bottom, right-left
--timebase=<arg> Output timestamp precision (fractional seconds)
--fps=<arg> Stream frame rate (rate/scale)
--global-error-resilient=<arg>
Enable global error resiliency features
-b <arg>, --bit-depth=<arg> Bit depth for codec
8, 10, 12
--input-bit-depth=<arg> Bit depth of input
--lag-in-frames=<arg> Max number of frames to lag
--large-scale-tile=<arg> Large scale tile coding (0: off (default), 1: on (ivf output only))
--monochrome Monochrome video (no chroma planes)
--full-still-picture-hdr Use full header for still picture
--use-16bit-internal Force use of 16-bit pipeline
--annexb=<arg> Save as Annex-B
Rate Control Options:
--drop-frame=<arg> Temporal resampling threshold (buf %)
--resize-mode=<arg> Frame resize mode (0: off (default), 1: fixed, 2: random, 3: dynamic)
--resize-denominator=<arg> Frame resize denominator
--resize-kf-denominator=<arg>
Frame resize keyframe denominator
--superres-mode=<arg> Frame super-resolution mode (0: disabled (default), 1: fixed, 2: random, 3: qthresh, 4: auto)
--superres-denominator=<arg>
Frame super-resolution denominator
--superres-kf-denominator=<arg>
Frame super-resolution keyframe denominator
--superres-qthresh=<arg> Frame super-resolution qindex threshold
--superres-kf-qthresh=<arg> Frame super-resolution keyframe qindex threshold
--end-usage=<arg> Rate control mode
vbr, cbr, cq, q
--target-bitrate=<arg> Bitrate (kbps)
--min-q=<arg> Minimum (best) quantizer
--max-q=<arg> Maximum (worst) quantizer
--undershoot-pct=<arg> Datarate undershoot (min) target (%)
--overshoot-pct=<arg> Datarate overshoot (max) target (%)
--buf-sz=<arg> Client buffer size (ms)
--buf-initial-sz=<arg> Client initial buffer size (ms)
--buf-optimal-sz=<arg> Client optimal buffer size (ms)
--bias-pct=<arg> CBR/VBR bias (0=CBR, 100=VBR)
--minsection-pct=<arg> GOP min bitrate (% of target)
--maxsection-pct=<arg> GOP max bitrate (% of target)
Keyframe Placement Options:
--enable-fwd-kf=<arg> Enable forward reference keyframes
--kf-min-dist=<arg> Minimum keyframe interval (frames)
--kf-max-dist=<arg> Maximum keyframe interval (frames)
--disable-kf Disable keyframe placement
--sframe-dist=<arg> S-Frame interval (frames)
--sframe-mode=<arg> S-Frame insertion mode (1..2)
AV1 Specific Options:
--cpu-used=<arg> Speed setting (0..6 in good mode, 5..11 in realtime mode, 0..9 in all intra mode)
--auto-alt-ref=<arg> Enable automatic alt reference frames
--sharpness=<arg> Bias towards block sharpness in rate-distortion optimization of transform coefficients (0..7), default is 0
--static-thresh=<arg> Motion detection threshold
--row-mt=<arg> Enable row based multi-threading (0: off, 1: on (default))
--fp-mt=<arg> Enable frame parallel multi-threading (0: off (default), 1: on)
--tile-columns=<arg> Number of tile columns to use, log2
--tile-rows=<arg> Number of tile rows to use, log2
--enable-tpl-model=<arg> RDO based on frame temporal dependency (0: off, 1: backward source based); required for deltaq mode
--enable-keyframe-filtering=<arg>
Apply temporal filtering on key frame (0: no filter, 1: filter without overlay (default), 2: filter with overlay - experimental, may break random access in players)
--arnr-maxframes=<arg> AltRef max frames (0..15)
--arnr-strength=<arg> AltRef filter strength (0..6)
--tune=<arg> Distortion metric tuned with
psnr, ssim, vmaf_with_preprocessing, vmaf_without_preprocessing, vmaf, vmaf_neg, butteraugli, vmaf_saliency_map
--cq-level=<arg> Constant/Constrained Quality level
--max-intra-rate=<arg> Max I-frame bitrate (pct)
--max-inter-rate=<arg> Max P-frame bitrate (pct)
--gf-cbr-boost=<arg> Boost for Golden Frame in CBR mode (pct)
--lossless=<arg> Lossless mode (0: false (default), 1: true)
--enable-cdef=<arg> Enable the constrained directional enhancement filter (0: false, 1: true (default), 2: disable for non-reference frames)
--enable-restoration=<arg> Enable the loop restoration filter (0: false (default in realtime mode), 1: true (default in non-realtime mode))
--enable-rect-partitions=<arg>
Enable rectangular partitions (0: false, 1: true (default))
--enable-ab-partitions=<arg>
Enable ab partitions (0: false, 1: true (default))
--enable-1to4-partitions=<arg>
Enable 1:4 and 4:1 partitions (0: false, 1: true (default))
--min-partition-size=<arg> Set min partition size (4:4x4, 8:8x8, 16:16x16, 32:32x32, 64:64x64, 128:128x128); with 4k+ resolutions or higher speed settings, min partition size will have a minimum of 8
--max-partition-size=<arg> Set max partition size (4:4x4, 8:8x8, 16:16x16, 32:32x32, 64:64x64, 128:128x128)
--enable-dual-filter=<arg> Enable dual filter (0: false, 1: true (default))
--enable-chroma-deltaq=<arg>
Enable chroma delta quant (0: false (default), 1: true)
--enable-intra-edge-filter=<arg>
Enable intra edge filtering (0: false, 1: true (default))
--enable-order-hint=<arg> Enable order hint (0: false, 1: true (default))
--enable-tx64=<arg> Enable 64-pt transform (0: false, 1: true (default))
--enable-flip-idtx=<arg> Enable extended transform type (0: false, 1: true (default)) including FLIPADST_DCT, DCT_FLIPADST, FLIPADST_FLIPADST, ADST_FLIPADST, FLIPADST_ADST, IDTX, V_DCT, H_DCT, V_ADST, H_ADST, V_FLIPADST, H_FLIPADST
--enable-rect-tx=<arg> Enable rectangular transform (0: false, 1: true (default))
--enable-dist-wtd-comp=<arg>
Enable distance-weighted compound (0: false, 1: true (default))
--enable-masked-comp=<arg> Enable masked (wedge/diff-wtd) compound (0: false, 1: true (default))
--enable-onesided-comp=<arg>
Enable one sided compound (0: false, 1: true (default))
--enable-interintra-comp=<arg>
Enable interintra compound (0: false, 1: true (default))
--enable-smooth-interintra=<arg>
Enable smooth interintra mode (0: false, 1: true (default))
--enable-diff-wtd-comp=<arg>
Enable difference-weighted compound (0: false, 1: true (default))
--enable-interinter-wedge=<arg>
Enable interinter wedge compound (0: false, 1: true (default))
--enable-interintra-wedge=<arg>
Enable interintra wedge compound (0: false, 1: true (default))
--enable-global-motion=<arg>
Enable global motion (0: false, 1: true (default))
--enable-warped-motion=<arg>
Enable local warped motion (0: false, 1: true (default))
--enable-filter-intra=<arg> Enable filter intra prediction mode (0: false, 1: true (default))
--enable-smooth-intra=<arg> Enable smooth intra prediction modes (0: false, 1: true (default))
--enable-paeth-intra=<arg> Enable Paeth intra prediction mode (0: false, 1: true (default))
--enable-cfl-intra=<arg> Enable chroma from luma intra prediction mode (0: false, 1: true (default))
--enable-diagonal-intra=<arg>
Enable diagonal (D45 to D203) intra prediction modes, which are a subset of directional modes; has no effect if enable-directional-intra is 0 (0: false, 1: true (default))
--force-video-mode=<arg> Force video mode even for a single frame (0: false (default), 1: true)
--enable-obmc=<arg> Enable OBMC (0: false, 1: true (default))
--enable-overlay=<arg> Enable coding overlay frames (0: false, 1: true (default))
--enable-palette=<arg> Enable palette prediction mode (0: false, 1: true (default))
--enable-intrabc=<arg> Enable intra block copy prediction mode (0: false, 1: true (default))
--enable-angle-delta=<arg> Enable intra angle delta (0: false, 1: true (default))
--disable-trellis-quant=<arg>
Disable trellis optimization of quantized coefficients (0: false 1: true 2: true for rd search 3: true for estimate yrd search (default))
--enable-qm=<arg> Enable quantisation matrices (0: false (default), 1: true)
--qm-min=<arg> Min quant matrix flatness (0..15), default is 8
--qm-max=<arg> Max quant matrix flatness (0..15), default is 15
--reduced-tx-type-set=<arg> Use reduced set of transform types
--use-intra-dct-only=<arg> Use DCT only for INTRA modes
--use-inter-dct-only=