
Row and Column Interpretations of Matrix-Vector Multiplication (Bilingual: Chinese and English)

This article is a set of study notes on *Introduction to Applied Linear Algebra: Vectors, Matrices, and Least Squares* (Boyd & Vandenberghe), available at https://web.stanford.edu/~boyd/vmls/.

Row and Column Interpretations of Matrix-Vector Multiplication

Matrix-vector multiplication is a fundamental operation in linear algebra and a mathematical tool used throughout machine learning, data science, and engineering. This article explains the row and column perspectives of matrix-vector multiplication in detail, with worked examples that illustrate the underlying mathematics.


1. What Is Matrix-Vector Multiplication?

The product of a matrix $A$ and a vector $x$ is written as:
$$y = Ax$$
where:

  • the matrix $A \in \mathbb{R}^{m \times n}$ is an $m \times n$ matrix;
  • the vector $x \in \mathbb{R}^n$ is an $n$-dimensional column vector;
  • the result $y \in \mathbb{R}^m$ is an $m$-dimensional column vector.

Matrix-vector multiplication can be understood from two perspectives: the row view and the column view. The next two sections introduce each interpretation in turn.
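As a quick illustration, here is a minimal sketch (assuming NumPy is available) that computes $y = Ax$ for the $3 \times 3$ example used later in this article:

```python
import numpy as np

# The 3x3 matrix and 3-vector used in the examples below.
A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
x = np.array([1, 2, 3])

# NumPy computes the matrix-vector product directly.
y = A @ x
print(y)  # [14 32 50]
```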


2. The Row Perspective

From the row perspective, matrix-vector multiplication computes the inner product of the vector $x$ with each row of the matrix. Specifically:

Write the $i$-th row of $A$ as $b_i^T$:
$$A = \begin{bmatrix} b_1^T \\ b_2^T \\ \vdots \\ b_m^T \end{bmatrix},$$
where each $b_i^T \in \mathbb{R}^n$ is a row of $A$ (the transpose indicates a row vector).

For $y = Ax$, the $i$-th entry $y_i$ of the result vector is the inner product of the $i$-th row of $A$ with $x$:
$$y_i = b_i^T x, \quad i = 1, 2, \dots, m.$$

Reading the formula

  • $b_i^T x$ is the inner product of the $i$-th row of $A$ with the vector $x$;
  • each of these inner products becomes one entry of $y$.

Example:

Suppose:
$$A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}, \quad x = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}.$$

To compute $y = Ax$ from the row perspective:

  1. Take the first row $b_1^T = [1, 2, 3]$ and compute the inner product $y_1 = b_1^T x = 1 \cdot 1 + 2 \cdot 2 + 3 \cdot 3 = 14$;
  2. Take the second row $b_2^T = [4, 5, 6]$ and compute the inner product $y_2 = b_2^T x = 4 \cdot 1 + 5 \cdot 2 + 6 \cdot 3 = 32$;
  3. Take the third row $b_3^T = [7, 8, 9]$ and compute the inner product $y_3 = b_3^T x = 7 \cdot 1 + 8 \cdot 2 + 9 \cdot 3 = 50$.

The final result is:
$$y = \begin{bmatrix} 14 \\ 32 \\ 50 \end{bmatrix}.$$
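The same row-by-row computation can be written out explicitly; this is a minimal sketch assuming NumPy:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
x = np.array([1, 2, 3])

# Row view: each entry y_i is the inner product of row i of A with x.
y = np.array([A[i, :] @ x for i in range(A.shape[0])])
print(y)  # [14 32 50]
```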


3. The Column Perspective

From the column perspective, matrix-vector multiplication forms a linear combination of the columns of $A$, weighted by the entries of $x$. Specifically:

Write the $k$-th column of $A$ as $a_k$:
$$A = [a_1, a_2, \dots, a_n],$$
where each $a_k \in \mathbb{R}^m$ is a column of $A$.

The matrix-vector product $y = Ax$ can then be written as
$$y = x_1 a_1 + x_2 a_2 + \cdots + x_n a_n,$$
that is, the result vector $y$ is a linear combination of the columns of $A$, with coefficients given by the entries of $x$.

Reading the formula

  • $x_k a_k$ scales the $k$-th column of $A$ by the $k$-th entry $x_k$ of $x$;
  • summing the scaled columns gives the result vector $y$.

Example:

Using the same matrix and vector:
$$A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}, \quad x = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}.$$

From the column perspective:

  1. The first column of $A$ is $a_1 = [1, 4, 7]^T$ with weight $x_1 = 1$, so its contribution is $1 \cdot a_1 = [1, 4, 7]^T$;
  2. The second column of $A$ is $a_2 = [2, 5, 8]^T$ with weight $x_2 = 2$, so its contribution is $2 \cdot a_2 = [4, 10, 16]^T$;
  3. The third column of $A$ is $a_3 = [3, 6, 9]^T$ with weight $x_3 = 3$, so its contribution is $3 \cdot a_3 = [9, 18, 27]^T$.

The final result is the sum of the scaled columns:

$$y = 1 \cdot a_1 + 2 \cdot a_2 + 3 \cdot a_3 = \begin{bmatrix} 1 + 4 + 9 \\ 4 + 10 + 18 \\ 7 + 16 + 27 \end{bmatrix} = \begin{bmatrix} 14 \\ 32 \\ 50 \end{bmatrix}.$$
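Equivalently, the column view builds $y$ by scaling and summing columns; a minimal sketch assuming NumPy:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
x = np.array([1, 2, 3])

# Column view: y is the linear combination x_1*a_1 + x_2*a_2 + x_3*a_3.
y = sum(x[k] * A[:, k] for k in range(A.shape[1]))
print(y)  # [14 32 50]
```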


4. How the Row and Column Views Relate
  1. Row view: best for understanding how each output entry $y_i$ is computed, namely as the inner product of a row of the matrix with the vector.
  2. Column view: best for understanding the result vector $y$ as a linear combination of the columns of the matrix.

In practice, choose whichever perspective fits the problem at hand:

  • the row view is typically used when computing or implementing algorithms;
  • the column view is typically used when analyzing results or interpreting the geometry.

5. Conclusion

Matrix-vector multiplication is one of the most important operations in linear algebra, and viewing it from both the row and column perspectives gives a deeper grasp of how it is computed and what it means. Whether you start from row inner products or from linear combinations of columns, the two views reveal complementary facets of the same operation in both theory and applications.

English Version

Matrix-Vector Multiplication: Row and Column Interpretations

Matrix-vector multiplication is a fundamental operation in linear algebra and is widely used in fields like machine learning, data science, and engineering. This article explains the row and column perspectives of matrix-vector multiplication in detail, with practical examples to help clarify the underlying mathematics.


1. What is Matrix-Vector Multiplication?

The product of a matrix $A$ and a vector $x$ is written as:
$$y = Ax$$
where:

  • $A \in \mathbb{R}^{m \times n}$ is an $m \times n$ matrix;
  • $x \in \mathbb{R}^n$ is an $n$-dimensional column vector;
  • $y \in \mathbb{R}^m$ is the resulting $m$-dimensional column vector.

Matrix-vector multiplication can be understood from two perspectives:

  1. The row view: Treat the result as the dot products of $x$ with each row of $A$.
  2. The column view: Treat the result as a linear combination of the columns of $A$.
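As in the first half of the article, a minimal NumPy sketch makes the dimensions concrete (the sizes here are illustrative):

```python
import numpy as np

m, n = 4, 3
A = np.arange(1, m * n + 1).reshape(m, n)  # a 4x3 matrix
x = np.array([1.0, 2.0, 3.0])              # an n-vector

y = A @ x                                  # y is an m-vector
print(A.shape, x.shape, y.shape)           # (4, 3) (3,) (4,)
```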

2. Row Perspective

In the row perspective, matrix-vector multiplication involves taking the dot product of the vector $x$ with each row of the matrix $A$. Let $b_i^T$ represent the $i$-th row of $A$, so:
$$A = \begin{bmatrix} b_1^T \\ b_2^T \\ \vdots \\ b_m^T \end{bmatrix},$$
where each $b_i^T \in \mathbb{R}^n$ is a row vector.

The $i$-th entry of the result vector $y$ is:
$$y_i = b_i^T x, \quad i = 1, 2, \dots, m.$$

This means:

  • Each entry $y_i$ is the dot product of $x$ with the $i$-th row of $A$.
  • The result vector $y$ consists of $m$ such dot products, one for each row.

Example (Row Perspective)

Suppose:
$$A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}, \quad x = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}.$$

To compute $y = Ax$:

  1. Take the first row $b_1^T = [1, 2, 3]$ and compute the dot product with $x$:
    $y_1 = 1 \cdot 1 + 2 \cdot 2 + 3 \cdot 3 = 14.$
  2. Take the second row $b_2^T = [4, 5, 6]$ and compute the dot product:
    $y_2 = 4 \cdot 1 + 5 \cdot 2 + 6 \cdot 3 = 32.$
  3. Take the third row $b_3^T = [7, 8, 9]$ and compute the dot product:
    $y_3 = 7 \cdot 1 + 8 \cdot 2 + 9 \cdot 3 = 50.$

Thus:
$$y = \begin{bmatrix} 14 \\ 32 \\ 50 \end{bmatrix}.$$
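For completeness, here is the same row-by-row computation written in plain Python (a sketch, with no library dependencies):

```python
A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
x = [1, 2, 3]

# Row view: y[i] is the dot product of row i of A with x.
y = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]
print(y)  # [14, 32, 50]
```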


3. Column Perspective

In the column perspective, matrix-vector multiplication can be interpreted as a linear combination of the columns of $A$, with the elements of $x$ serving as the coefficients of the combination. Let $a_k$ represent the $k$-th column of $A$, so:
$$A = [a_1, a_2, \dots, a_n],$$
where each $a_k \in \mathbb{R}^m$ is a column vector.

The matrix-vector product $y = Ax$ can be written as:
$$y = x_1 a_1 + x_2 a_2 + \cdots + x_n a_n.$$

This means:

  • $x_k a_k$ scales the $k$-th column $a_k$ of $A$ by the $k$-th entry $x_k$ of the vector $x$.
  • The result $y$ is the sum of these scaled columns.

Example (Column Perspective)

Using the same matrix and vector:
$$A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}, \quad x = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}.$$

From the column perspective:

  1. The first column of $A$ is $a_1 = [1, 4, 7]^T$, scaled by $x_1 = 1$:
    $1 \cdot a_1 = [1, 4, 7]^T.$
  2. The second column of $A$ is $a_2 = [2, 5, 8]^T$, scaled by $x_2 = 2$:
    $2 \cdot a_2 = [4, 10, 16]^T.$
  3. The third column of $A$ is $a_3 = [3, 6, 9]^T$, scaled by $x_3 = 3$:
    $3 \cdot a_3 = [9, 18, 27]^T.$

Add the scaled columns:
$$y = 1 \cdot a_1 + 2 \cdot a_2 + 3 \cdot a_3 = \begin{bmatrix} 1 + 4 + 9 \\ 4 + 10 + 18 \\ 7 + 16 + 27 \end{bmatrix} = \begin{bmatrix} 14 \\ 32 \\ 50 \end{bmatrix}.$$
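And here is the column view in plain Python, accumulating the scaled columns (again just a sketch):

```python
A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
x = [1, 2, 3]

m, n = len(A), len(A[0])
# Column view: add x[k] times column k of A into the result.
y = [0] * m
for k in range(n):
    for i in range(m):
        y[i] += x[k] * A[i][k]
print(y)  # [14, 32, 50]
```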


4. Connection Between the Two Perspectives
  • Row Perspective: Computes the entries of the result vector $y$ one at a time, using dot products between the rows of $A$ and the vector $x$. This perspective is often used in implementation and numerical computation.
  • Column Perspective: Views the result vector $y$ as a linear combination of the columns of $A$, scaled by the entries of $x$. This perspective is useful for understanding geometric interpretations and applications like data transformations.

The two perspectives are mathematically equivalent and simply offer different ways to interpret the same operation.
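A quick numerical check of this equivalence (a sketch assuming NumPy):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
x = np.array([1, 2, 3])

row_view = np.array([A[i, :] @ x for i in range(A.shape[0])])
col_view = sum(x[k] * A[:, k] for k in range(A.shape[1]))

# All three formulations produce the same vector.
print(np.array_equal(row_view, A @ x), np.array_equal(col_view, A @ x))  # True True
```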


5. Practical Applications
  1. Row Perspective in Machine Learning:

    • Useful when processing datasets where rows represent individual samples and columns represent features.
    • Example: In a neural network, $A$ can represent the weights and $x$ the input features; the output $y$ contains the weighted sum computed by each neuron (see the sketch after this list).
  2. Column Perspective in Data Transformation:

    • Common in computer graphics and signal processing, where each column represents a basis vector, and the result is a transformation of $x$ into a new coordinate system.
    • Example: Principal Component Analysis (PCA) involves projecting data onto principal components (columns).
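Picking up the neural-network example from item 1, the pre-activation output of a single dense (fully connected) layer is exactly a matrix-vector product; a minimal sketch assuming NumPy, with made-up layer sizes and random weights:

```python
import numpy as np

rng = np.random.default_rng(0)

n_features, n_neurons = 4, 3
W = rng.standard_normal((n_neurons, n_features))  # hypothetical weight matrix
x = rng.standard_normal(n_features)               # hypothetical input features

# Row view: each entry of y is one neuron's weighted sum of the inputs.
# Column view: y is a combination of W's columns, weighted by the inputs.
y = W @ x
print(y.shape)  # (3,)
```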

6. Conclusion

Matrix-vector multiplication is a versatile operation that can be understood from two complementary perspectives:

  • The row perspective, which emphasizes the dot product computation for each entry of the result vector.
  • The column perspective, which highlights the linear combination of matrix columns.

By understanding both views, you can better analyze and apply this operation in various fields like machine learning, data analysis, and linear systems.

Postscript

Completed in Shanghai at 13:01 on December 20, 2024, with the assistance of GPT-4o.

