
libaom Source Code Analysis: AV1 Non-Directional Intra Prediction Modes

article 2025/2/21 3:29:37

How the AV1 non-directional intra prediction modes work

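The non-directional intra prediction modes are DC_PRED, SMOOTH_V, SMOOTH_H, SMOOTH, and PAETH (named SMOOTH_V_PRED, SMOOTH_H_PRED, SMOOTH_PRED, and PAETH_PRED in libaom).
DC_PRED mode: similar to the DC mode of intra prediction in H.264 and H.265, it generates the prediction samples of the current block by averaging the reconstructed samples of the neighboring blocks above and to the left.
SMOOTH modes: there are three variants, SMOOTH_V (vertical smoothing), SMOOTH_H (horizontal smoothing), and SMOOTH (vertical plus horizontal smoothing). SMOOTH_V and SMOOTH_H generate the prediction by quadratic interpolation along the vertical and horizontal direction respectively, while SMOOTH uses the average of the interpolation results along the two directions. The samples fed into the interpolation are the reconstructed samples of the top and left neighboring blocks, together with samples on the right and bottom boundaries that are estimated from those top and left samples.
PAETH mode: each sample is predicted from its top (T), left (L), and top-left (TL) reference samples. Of these three reference samples, the one whose value is closest to (T + L - TL) is chosen as the prediction. The mode is named after Alan W. Paeth, whose predictor is also used as one of the filter types in PNG image coding.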
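The PAETH selection rule described above can be sketched in a few lines of C. This is a minimal illustration and not the libaom implementation: the names `sketch_paeth_pick` and `sketch_paeth_block` are hypothetical, the top-left sample is passed in explicitly, and the tie-breaking order (left, then top, then top-left) is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: picks whichever of left, top, top_left is closest to
 * base = top + left - top_left. Checking left first, then top, is an
 * assumption of this sketch about how ties are broken. */
static uint8_t sketch_paeth_pick(uint8_t left, uint8_t top, uint8_t top_left) {
  const int base = (int)top + (int)left - (int)top_left;
  const int d_left = abs(base - left);
  const int d_top = abs(base - top);
  const int d_top_left = abs(base - top_left);
  if (d_left <= d_top && d_left <= d_top_left) return left;
  if (d_top <= d_top_left) return top;
  return top_left;
}

/* Fills a bw x bh block. For pixel (r, c) the references are
 * T = above[c], L = left[r], TL = top_left. */
static void sketch_paeth_block(uint8_t *dst, ptrdiff_t stride, int bw, int bh,
                               const uint8_t *above, const uint8_t *left,
                               uint8_t top_left) {
  assert(bw > 0 && bh > 0);
  for (int r = 0; r < bh; ++r) {
    for (int c = 0; c < bw; ++c) {
      dst[c] = sketch_paeth_pick(left[r], above[c], top_left);
    }
    dst += stride;
  }
}
```

For example, with L = 10, T = 20, and TL = 10 the base value is 20 + 10 - 10 = 20, which T matches exactly, so T is selected.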

Analysis of the related libaom source code

[Figure: logical relationships between the functions]
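dc_predictor_rect function

This function predicts the pixels of a rectangular block from the average of the samples above and to the left. It sums the bw samples of above and the bh samples of left, adds (bw + bh) >> 1 for rounding, and divides by the sample count through divide_using_multiply_shift, which takes a shift (shift1), a multiplier, and a second shift (DC_SHIFT2) instead of performing an integer division. The resulting DC value is then written to every pixel of the block with memset. This simple approach is the usual DC prediction found in image and video coding.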

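static INLINE void dc_predictor_rect(uint8_t *dst, ptrdiff_t stride, int bw,
                                     int bh, const uint8_t *above,
                                     const uint8_t *left, int shift1,
                                     int multiplier) {
  int sum = 0;

  // Sum the bw samples above the block and the bh samples to its left.
  for (int i = 0; i < bw; i++) {
    sum += above[i];
  }
  for (int i = 0; i < bh; i++) {
    sum += left[i];
  }

  // Rounded average over bw + bh samples, computed with shifts and a
  // multiplication instead of an integer division.
  const int expected_dc = divide_using_multiply_shift(
      sum + ((bw + bh) >> 1), shift1, multiplier, DC_SHIFT2);
  assert(expected_dc < (1 << 8));  // must fit in an 8-bit sample

  // Fill every row of the block with the DC value.
  for (int r = 0; r < bh; r++) {
    memset(dst, expected_dc, bw);
    dst += stride;
  }
}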
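The listing above calls divide_using_multiply_shift without showing it. The sketch below illustrates how a division by 12 (a 4x8 block) or by 20 (a 4x16 block) can be done with a shift and a fixed-point multiplication. The helper body and the constants are assumptions recalled from libaom, not taken from the excerpt, so they carry a `sketch_` / `SKETCH_` prefix and should be checked against the actual source.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed constants: recalled from libaom, not shown in the excerpt. */
#define SKETCH_DC_SHIFT2 16
#define SKETCH_DC_MULTIPLIER_1X2 0x5556 /* roughly 2^16 / 3 */
#define SKETCH_DC_MULTIPLIER_1X4 0x3334 /* roughly 2^16 / 5 */

/* Assumed shape of divide_using_multiply_shift: shift out the power-of-two
 * factor first, then multiply by a fixed-point reciprocal. */
static int sketch_divide_using_multiply_shift(int num, int shift1,
                                              int multiplier, int shift2) {
  const int interm = num >> shift1;
  return (interm * multiplier) >> shift2;
}

/* DC value for a bw x bh block whose sample count bw + bh equals
 * (1 << shift1) * 3 or (1 << shift1) * 5. */
static int sketch_dc_value(const uint8_t *above, const uint8_t *left, int bw,
                           int bh, int shift1, int multiplier) {
  assert(bw > 0 && bh > 0);
  int sum = 0;
  for (int i = 0; i < bw; i++) sum += above[i];
  for (int i = 0; i < bh; i++) sum += left[i];
  return sketch_divide_using_multiply_shift(sum + ((bw + bh) >> 1), shift1,
                                            multiplier, SKETCH_DC_SHIFT2);
}
```

For a 4x8 block the 12 neighboring samples factor as 4 * 3, so under these assumptions shift1 = 2 removes the factor 4 and the multiplier approximates the remaining division by 3.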

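smooth_v_predictor function

This function predicts along the vertical direction: each pixel is a weighted average of the sample directly above its column (above[c]) and an estimate of the sample below the block, for which the bottom-left sample (the last element of the left array, left[bh - 1]) is used.
sm_weights points to the weights for a block of height bh inside the smooth_weights table, and scale, the fixed-point scale of those weights, is 2 to the power SMOOTH_WEIGHT_LOG2_SCALE.
The two outer for loops visit every pixel of the block; the inner for loop then runs over the two (weight, pixel) pairs and accumulates their products into this_pred, the prediction for the current pixel. The weight depends only on the row r.
divide_round divides this_pred by scale with rounding to produce the final prediction value.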

static INLINE void smooth_v_predictor(uint8_t *dst, ptrdiff_t stride, int bw,
                                      int bh, const uint8_t *above,
                                      const uint8_t *left) {
  const uint8_t below_pred = left[bh - 1];  // estimated by bottom-left pixel
  const uint8_t *const sm_weights = smooth_weights + bh - 4;
  // scale = 2^SMOOTH_WEIGHT_LOG2_SCALE
  const int log2_scale = SMOOTH_WEIGHT_LOG2_SCALE;
  const uint16_t scale = (1 << SMOOTH_WEIGHT_LOG2_SCALE);
  sm_weights_sanity_checks(sm_weights, sm_weights, scale,
                           log2_scale + sizeof(*dst));

  int r;
  for (r = 0; r < bh; r++) {
    int c;
    for (c = 0; c < bw; ++c) {
      const uint8_t pixels[] = { above[c], below_pred };
      const uint8_t weights[] = { sm_weights[r], scale - sm_weights[r] };
      uint32_t this_pred = 0;
      assert(scale >= sm_weights[r]);
      int i;
      for (i = 0; i < 2; ++i) {
        this_pred += weights[i] * pixels[i];
      }
      dst[c] = divide_round(this_pred, log2_scale);
    }
    dst += stride;
  }
}
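A worked 4x4 case makes the vertical weighting concrete. The sketch below is not the libaom function: the weights { 255, 149, 85, 64 } for a block size of 4 and the value 8 for SMOOTH_WEIGHT_LOG2_SCALE are assumptions of this example (neither appears in the listing above), and `sketch_divide_round` is a hypothetical stand-in for divide_round.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_LOG2_SCALE 8 /* assumed value of SMOOTH_WEIGHT_LOG2_SCALE */

/* Assumed weights for a block size of 4. */
static const uint8_t sketch_weights_4[4] = { 255, 149, 85, 64 };

/* Hypothetical stand-in for divide_round: divide by 2^bits with rounding. */
static uint8_t sketch_divide_round(uint32_t value, int bits) {
  return (uint8_t)((value + (1u << (bits - 1))) >> bits);
}

static void sketch_smooth_v_4x4(uint8_t *dst, ptrdiff_t stride,
                                const uint8_t *above, const uint8_t *left) {
  const uint32_t scale = 1u << SKETCH_LOG2_SCALE;
  const uint8_t below_pred = left[3]; /* bottom-left sample */
  for (int r = 0; r < 4; ++r) {
    const uint32_t w = sketch_weights_4[r]; /* weight depends on the row */
    assert(scale >= w);
    for (int c = 0; c < 4; ++c) {
      const uint32_t sum = w * above[c] + (scale - w) * below_pred;
      dst[c] = sketch_divide_round(sum, SKETCH_LOG2_SCALE);
    }
    dst += stride;
  }
}
```

With every above sample equal to 100 and left[3] equal to 20, each column of the result under these assumptions is 100, 67, 47, 40 from top to bottom: the prediction starts at the top reference and fades toward the bottom-left estimate.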

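smooth_h_predictor function

This mirrors the vertical case but predicts along the horizontal direction: each pixel is a weighted average of the sample to the left of its row (left[r]) and an estimate of the sample to the right of the block, for which the top-right sample (the last element of the above array, above[bw - 1]) is used. The weights are selected by the block width bw and indexed by the column c.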

static INLINE void smooth_h_predictor(uint8_t *dst, ptrdiff_t stride, int bw,
                                      int bh, const uint8_t *above,
                                      const uint8_t *left) {
  const uint8_t right_pred = above[bw - 1];  // estimated by top-right pixel
  const uint8_t *const sm_weights = smooth_weights + bw - 4;
  // scale = 2^SMOOTH_WEIGHT_LOG2_SCALE
  const int log2_scale = SMOOTH_WEIGHT_LOG2_SCALE;
  const uint16_t scale = (1 << SMOOTH_WEIGHT_LOG2_SCALE);
  sm_weights_sanity_checks(sm_weights, sm_weights, scale,
                           log2_scale + sizeof(*dst));

  int r;
  for (r = 0; r < bh; r++) {
    int c;
    for (c = 0; c < bw; ++c) {
      const uint8_t pixels[] = { left[r], right_pred };
      const uint8_t weights[] = { sm_weights[c], scale - sm_weights[c] };
      uint32_t this_pred = 0;
      assert(scale >= sm_weights[c]);
      int i;
      for (i = 0; i < 2; ++i) {
        this_pred += weights[i] * pixels[i];
      }
      dst[c] = divide_round(this_pred, log2_scale);
    }
    dst += stride;
  }
}
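The expression smooth_weights + bw - 4 is consistent with a table that stores the weights for sizes 4, 8, 16, and so on back to back, so that the weights for size n start at index n - 4. The sketch below illustrates that layout for sizes 4 and 8 only and applies it to the horizontal predictor. The table contents and the value 8 for SMOOTH_WEIGHT_LOG2_SCALE are assumptions of this example, and `sketch_smooth_h` is a hypothetical simplification rather than the libaom code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_LOG2_SCALE 8 /* assumed value of SMOOTH_WEIGHT_LOG2_SCALE */

/* Assumed table contents: weights for size 4 followed by weights for size 8. */
static const uint8_t sketch_smooth_weights[12] = {
  255, 149, 85, 64,                   /* size 4, starts at index 4 - 4 = 0 */
  255, 197, 146, 105, 73, 50, 37, 32, /* size 8, starts at index 8 - 4 = 4 */
};

static void sketch_smooth_h(uint8_t *dst, ptrdiff_t stride, int bw, int bh,
                            const uint8_t *above, const uint8_t *left) {
  assert(bw == 4 || bw == 8); /* only the sizes present in the sketch table */
  const uint8_t right_pred = above[bw - 1]; /* top-right sample */
  const uint8_t *const w = sketch_smooth_weights + bw - 4;
  const uint32_t scale = 1u << SKETCH_LOG2_SCALE;
  for (int r = 0; r < bh; ++r) {
    for (int c = 0; c < bw; ++c) {
      /* The weight depends on the column, the left sample on the row. */
      const uint32_t sum = w[c] * left[r] + (scale - w[c]) * right_pred;
      dst[c] = (uint8_t)((sum + (scale >> 1)) >> SKETCH_LOG2_SCALE);
    }
    dst += stride;
  }
}
```

For an 8x4 block with every left sample equal to 100 and above[7] equal to 20, each row under these assumptions reads 100, 82, 66, 53, 43, 36, 32, 30 from left to right.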

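smooth_predictor function

This function resembles the horizontal and vertical smooth predictors but combines the information of both directions. It merges the behavior of smooth_v_predictor and smooth_h_predictor and predicts each pixel from its horizontal and vertical neighbors. The last element of the left array, left[bh - 1], serves as the estimate of the sample below the block, and the last element of the above array, above[bw - 1], serves as the estimate of the sample to its right.
Unlike the two one-directional predictors, it considers the vertical neighbors (above and below) as well as the horizontal neighbors (left and right), so it works with four pixel values and four weights where the other two functions use two of each. Because the four weights sum to 2 * scale, log2_scale is 1 + SMOOTH_WEIGHT_LOG2_SCALE, and divide_round therefore yields the rounded average of the vertical and horizontal interpolations.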

static INLINE void smooth_predictor(uint8_t *dst, ptrdiff_t stride, int bw,
                                    int bh, const uint8_t *above,
                                    const uint8_t *left) {
  const uint8_t below_pred = left[bh - 1];   // estimated by bottom-left pixel
  const uint8_t right_pred = above[bw - 1];  // estimated by top-right pixel
  const uint8_t *const sm_weights_w = smooth_weights + bw - 4;
  const uint8_t *const sm_weights_h = smooth_weights + bh - 4;
  // scale = 2 * 2^SMOOTH_WEIGHT_LOG2_SCALE
  const int log2_scale = 1 + SMOOTH_WEIGHT_LOG2_SCALE;
  const uint16_t scale = (1 << SMOOTH_WEIGHT_LOG2_SCALE);
  sm_weights_sanity_checks(sm_weights_w, sm_weights_h, scale,
                           log2_scale + sizeof(*dst));
  int r;
  for (r = 0; r < bh; ++r) {
    int c;
    for (c = 0; c < bw; ++c) {
      const uint8_t pixels[] = { above[c], below_pred, left[r], right_pred };
      const uint8_t weights[] = { sm_weights_h[r], scale - sm_weights_h[r],
                                  sm_weights_w[c], scale - sm_weights_w[c] };
      uint32_t this_pred = 0;
      int i;
      assert(scale >= sm_weights_h[r] && scale >= sm_weights_w[c]);
      for (i = 0; i < 4; ++i) {
        this_pred += weights[i] * pixels[i];
      }
      dst[c] = divide_round(this_pred, log2_scale);
    }
    dst += stride;
  }
}
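The two-direction blend can be checked on a small 4x4 case. As before, this is a sketch under stated assumptions rather than the libaom code: the weights { 255, 149, 85, 64 } for size 4 and the value 8 for SMOOTH_WEIGHT_LOG2_SCALE are assumed, and `sketch_smooth_4x4` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_LOG2_SCALE 8 /* assumed value of SMOOTH_WEIGHT_LOG2_SCALE */

/* Assumed weights for a block size of 4, used for both rows and columns. */
static const uint8_t sketch_weights_4[4] = { 255, 149, 85, 64 };

static void sketch_smooth_4x4(uint8_t *dst, ptrdiff_t stride,
                              const uint8_t *above, const uint8_t *left) {
  assert(dst && above && left);
  const uint32_t scale = 1u << SKETCH_LOG2_SCALE;
  const uint8_t below_pred = left[3];  /* bottom-left sample */
  const uint8_t right_pred = above[3]; /* top-right sample */
  for (int r = 0; r < 4; ++r) {
    const uint32_t wv = sketch_weights_4[r]; /* vertical weight, by row */
    for (int c = 0; c < 4; ++c) {
      const uint32_t wh = sketch_weights_4[c]; /* horizontal weight, by column */
      const uint32_t vert = wv * above[c] + (scale - wv) * below_pred;
      const uint32_t horz = wh * left[r] + (scale - wh) * right_pred;
      /* The weights sum to 2 * scale, so shift by one extra bit; the
       * rounding offset 1 << (SKETCH_LOG2_SCALE + 1 - 1) equals scale. */
      dst[c] = (uint8_t)((vert + horz + scale) >> (SKETCH_LOG2_SCALE + 1));
    }
    dst += stride;
  }
}
```

With every above sample equal to 100 and every left sample equal to 20, the corners of the predicted block under these assumptions are 60 (top-left), 90 (top-right), 30 (bottom-left), and 60 (bottom-right), which shows the prediction leaning toward above near the top-right and toward left near the bottom-left.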